Portability — running off COSMA

haloemu was developed on COSMA, but prediction is portable: the trained artifacts ship inside the package and every registered property predicts with only PyPI dependencies. Training, validation, and re-measurement remain COSMA-bound because they need the halocat loaders and the simulation data.

What runs where

| Task | Off-COSMA? | Needs |
| --- | --- | --- |
| Predict any registered property | ✅ yes | PyPI deps + the in-package artifacts |
| Train / re-train | ❌ no | halocat loaders + the mg_glam PkTables (COSMA data) |
| Validation / evidence harnesses | ❌ no | halocat (and freyja for the optional hmf cross-check) |

"Every registered property" means all of hmf, pk_mm, xi_mm, b_cum, b_diff, xi_hh, xi_hh_smallr, r_ab, and vel_m10 … vel_c22, for both LCDM and fRn1. This is verified by predicting with halocat and freyja made unimportable (simulating a non-COSMA machine).

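One way to repeat the "unimportable" check described above, even on a machine where halocat or freyja happen to be installed, is to block their import for the duration of a test. The sketch below is an assumption about how such a check could be written, not the project's own harness; `block_imports` and `is_importable` are hypothetical helpers. It relies on the standard Python behaviour that a `None` entry in `sys.modules` makes the import raise `ModuleNotFoundError`.

```python
import contextlib
import importlib
import sys


@contextlib.contextmanager
def block_imports(*names):
    """Make the named top-level packages unimportable inside the with-block.

    Hypothetical helper, not part of haloemu.
    """
    # Set aside anything already imported from the blocked packages.
    saved = {
        mod: sys.modules.pop(mod)
        for mod in list(sys.modules)
        if mod.split(".")[0] in names
    }
    # A None entry in sys.modules makes `import name` raise ModuleNotFoundError.
    for name in names:
        sys.modules[name] = None
    try:
        yield
    finally:
        # Remove the None markers and restore whatever was imported before.
        for name in names:
            sys.modules.pop(name, None)
        sys.modules.update(saved)


def is_importable(name):
    """Return True if `name` can be imported right now."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


with block_imports("halocat", "freyja"):
    # Inside the block both packages behave as if they were not installed.
    assert not is_importable("halocat")
    assert not is_importable("freyja")
    # A prediction call (for example reg.predict(...)) would go here.
```

Any prediction that succeeds inside the `with` block cannot have depended on either package.
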
Dependencies

Prediction needs only packages on PyPI, declared in pyproject.toml:

numpy  scipy  h5py  jax  tinygp  scikit-learn  colossus   (matplotlib only for plots)
  • colossus provides the EH98 linear-theory baseline and the peak-height σ(M), so it is required for pk_mm, xi_mm, b_cum, b_diff, and xi_hh.
  • hmf, r_ab, and vel_* predict without even colossus.
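
Before predicting on a new machine, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not a haloemu feature: `missing_packages` is a hypothetical helper, and the mapping reflects the fact that the scikit-learn distribution is imported as `sklearn`.

```python
from importlib.util import find_spec

# PyPI distribution name -> top-level import name.
REQUIRED = {
    "numpy": "numpy",
    "scipy": "scipy",
    "h5py": "h5py",
    "jax": "jax",
    "tinygp": "tinygp",
    "scikit-learn": "sklearn",
    "colossus": "colossus",
}


def missing_packages(required=REQUIRED):
    """Return the sorted distribution names whose import module cannot be found."""
    return sorted(dist for dist, module in required.items() if find_spec(module) is None)


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all prediction dependencies are importable")
```

If only hmf, r_ab, or vel_* are needed, colossus could be dropped from the mapping, in line with the note above.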

The non-PyPI packages halocat (data loaders + cosmology table) and freyja (an external HMF emulator) are not needed for prediction:

  • halocat imports are confined to training/assembly functions and are lazy.
  • freyja was removed from the b_diff inversion: n̄ now comes from the in-suite hmf emulator (a gated swap — in-suite vs freyja RMS 0.63 %, median 0.16 % across the 64-model design). freyja survives only as an optional cross-check (source="freyja" in bias_differential.nbar, and the hmf_validation.py comparison).
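
The lazy-import pattern mentioned above can be sketched as follows. This illustrates the general technique only; `require_optional` and `train_property` are hypothetical names and do not describe haloemu's actual functions.

```python
import importlib


def require_optional(name, purpose):
    """Import an optional package on first use, with a clear error if it is absent."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"{name} is needed for {purpose} but is not installed; "
            "prediction does not require it."
        ) from exc


def train_property(prop):
    """Hypothetical training entry point."""
    # halocat is imported only when training is actually requested, so
    # importing the module that defines this function never touches it.
    halocat = require_optional("halocat", f"training {prop}")
    # ... the real training code would use the halocat loaders here ...
    return halocat
```

Because the import happens inside the function body, a machine without halocat only sees an error if it calls a training function, never at `import` time.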

Install off-COSMA

git clone <halobias repo>
pip install -e halobias        # pulls the PyPI deps; ships the artifacts

Then, from Python:

import os
os.environ.setdefault("JAX_PLATFORMS", "cpu")   # set before jax is first imported
from haloemu import get_registry
reg = get_registry()
reg.predict("b_diff", "LCDM", 0.25, [0.31, 0.677, 0.967, 0.83])   # works, no halocat/freyja

Artifacts are bundled

haloemu/artifacts/** and manifest.json are declared as package-data, so a normal install carries the ~6 MB of trained pickles. No data download is needed to predict.
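
To confirm that an install really carries the artifacts, the bundled files can be listed through `importlib.resources`. This is a sketch under assumptions: `bundled_files` is a hypothetical helper, and the `"artifacts"` subdirectory name is taken from the path quoted above.

```python
from importlib.resources import files


def bundled_files(package, subdir):
    """Return {relative_path: size_in_bytes} for files shipped under package/subdir."""
    sizes = {}

    def walk(node, prefix):
        # Recurse through the resource tree, recording the size of each file.
        for entry in node.iterdir():
            rel = f"{prefix}/{entry.name}"
            if entry.is_dir():
                walk(entry, rel)
            else:
                sizes[rel] = len(entry.read_bytes())

    walk(files(package).joinpath(subdir), subdir)
    return sizes


# Usage, assuming haloemu is installed:
#   sizes = bundled_files("haloemu", "artifacts")
#   print(f"{len(sizes)} files, {sum(sizes.values()) / 1e6:.1f} MB")
```

A total far below the expected ~6 MB would suggest the package-data declaration was not picked up by the install.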

Caveats

  • Cross-version pickle risk. The artifacts are pickled objects that depend on the class definitions and array formats of numpy / scikit-learn / tinygp / jax. Unpickling on very different library versions can fail or warn; if you hit this, align versions with the ones used to train (or re-train on COSMA and re-ship). Consider pinning these in a downstream environment.
  • Redshift / property coverage is unchanged by portability: z = 0.25 for all properties, z = 0.00 for the matter sector only. See Caveats.
  • Hardcoded COSMA paths (MG_GLAM_ROOT, the FREYJA_ROOT default) live only in training/cross-check code paths and never run during prediction.
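
For the cross-version pickle caveat, a small check of the installed versions against the ones used for training can catch mismatches before unpickling. The sketch below uses only `importlib.metadata`. The helper names are hypothetical, and where the training-time versions come from (a pinned requirements file, a note from whoever trained the artifacts) is an assumption left to the reader.

```python
from importlib.metadata import PackageNotFoundError, version

# Distributions whose class definitions or array formats the pickles depend on.
PICKLE_SENSITIVE = ("numpy", "scikit-learn", "tinygp", "jax")


def installed_versions(dists=PICKLE_SENSITIVE):
    """Return {distribution: installed version, or None if not installed}."""
    found = {}
    for dist in dists:
        try:
            found[dist] = version(dist)
        except PackageNotFoundError:
            found[dist] = None
    return found


def version_mismatches(expected, found):
    """Return {distribution: (expected, found)} for every package that differs."""
    return {
        dist: (want, found.get(dist))
        for dist, want in expected.items()
        if found.get(dist) != want
    }


# Usage, with `expected` filled in from the training environment:
#   expected = {"numpy": "<version used for training>"}
#   print(version_mismatches(expected, installed_versions()))
```

An empty result means the listed packages match exactly. A non-empty one does not guarantee a failure, but it points to the first thing to align if unpickling warns or fails.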