Portability: running off COSMA
haloemu was developed on COSMA, but prediction is portable: the trained
artifacts ship inside the package and every registered property predicts with
only PyPI dependencies. Training, validation, and re-measurement remain
COSMA-bound because they need the halocat loaders and the simulation data.
What runs where
| Task | Off-COSMA? | Needs |
|---|---|---|
| Predict any registered property | ✅ yes | PyPI deps + the in-package artifacts |
| Train / re-train | ❌ no | halocat loaders + the mg_glam PkTables (COSMA data) |
| Validation / evidence harnesses | ❌ no | halocat (and freyja for the optional hmf cross-check) |
"Every registered property" means all of hmf, pk_mm, xi_mm, b_cum,
b_diff, xi_hh, xi_hh_smallr, r_ab, and vel_m10 … vel_c22, for both
LCDM and fRn1. This is verified by predicting with halocat and freyja
made unimportable (simulating a non-COSMA machine).
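A check of this kind can be approximated on any machine by making the two packages unimportable for the duration of a test. The sketch below is a minimal illustration, not the project's actual harness: `blocked_imports` is a hypothetical helper name, and it relies only on the standard Python behaviour that a `None` entry in `sys.modules` makes the corresponding import raise `ImportError`.

```python
import sys
from contextlib import contextmanager


@contextmanager
def blocked_imports(*names):
    """Temporarily make the named top-level modules unimportable."""
    missing = object()  # sentinel: the module was not in sys.modules before
    saved = {name: sys.modules.get(name, missing) for name in names}
    try:
        for name in names:
            # A None entry makes `import name` raise ImportError.
            sys.modules[name] = None
        yield
    finally:
        # Restore sys.modules exactly as it was found.
        for name, previous in saved.items():
            if previous is missing:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = previous


# Intended use (assumes haloemu is installed; not executed here):
#
#     with blocked_imports("halocat", "freyja"):
#         from haloemu import get_registry
#         get_registry().predict("hmf", "LCDM", 0.25, [0.31, 0.677, 0.967, 0.83])
```

Submodules that were already imported before the block is entered stay cached, so the check is most trustworthy in a fresh interpreter.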
Dependencies
Prediction needs only packages on PyPI, declared in pyproject.toml:
colossus provides the EH98 linear-theory baseline and the peak-height σ(M), so
it is required for pk_mm, xi_mm, b_cum, b_diff, and xi_hh. hmf, r_ab, and
vel_* predict without even colossus.
The non-PyPI packages halocat (data loaders + cosmology table) and freyja
(an external HMF emulator) are not needed for prediction:
- halocat imports are confined to training/assembly functions and are lazy.
- freyja was removed from the b_diff inversion: n̄ now comes from the in-suite
  hmf emulator (a gated swap; in-suite vs freyja RMS 0.63 %, median 0.16 %
  across the 64-model design). freyja survives only as an optional cross-check
  (source="freyja" in bias_differential.nbar, and the hmf_validation.py
  comparison).
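The lazy-import pattern described in the first bullet can be sketched as follows. This is an illustration of the general technique, not haloemu's code: `require_optional` and `train_property` are hypothetical names, and the only assumption is that the optional package is imported inside the function that needs it rather than at module import time.

```python
import importlib


def require_optional(module_name, purpose):
    """Import an optional dependency at call time, with a readable error if absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is needed for {purpose} and could not be imported; "
            "prediction does not require it."
        ) from exc


def train_property(name):
    """Hypothetical training entry point: halocat is only touched when called."""
    halocat = require_optional("halocat", "training")
    return halocat, name
```

Because the import happens inside `train_property`, importing the surrounding module succeeds on a machine without halocat, and the error only appears if training is actually attempted.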
Install off-COSMA
```python
import os

# Default JAX to the CPU backend unless the caller has already chosen one.
os.environ.setdefault("JAX_PLATFORMS", "cpu")

from haloemu import get_registry

reg = get_registry()
reg.predict("b_diff", "LCDM", 0.25, [0.31, 0.677, 0.967, 0.83])  # works, no halocat/freyja
```
Artifacts are bundled
haloemu/artifacts/** and manifest.json are declared as package-data,
so a normal install carries the ~6 MB of trained pickles. No data download is
needed to predict.
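One way to confirm that an installed copy actually carries the bundled files is to walk the package's resources with the standard library. The sketch below is generic and makes assumptions: `bundled_size_bytes` is a hypothetical helper, and the `"artifacts"` sub-directory name is taken from the `haloemu/artifacts/**` path quoted above rather than from inspecting an install.

```python
from importlib import resources


def bundled_size_bytes(package, subdir=None):
    """Total size in bytes of the files shipped inside an installed package."""
    root = resources.files(package)  # requires Python 3.9+
    if subdir is not None:
        root = root.joinpath(subdir)
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_dir():
            stack.extend(node.iterdir())
        elif node.is_file():
            total += len(node.read_bytes())
    return total


# Intended use (assumes haloemu is installed; not executed here):
#
#     size_mb = bundled_size_bytes("haloemu", "artifacts") / 1e6
```

A result of zero would suggest the package-data declaration did not take effect for that install.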
Caveats
- Cross-version pickle risk. The artifacts are pickled objects that depend on
  the class definitions and array formats of numpy/scikit-learn/tinygp/jax.
  Unpickling on very different library versions can fail or warn; if you hit
  this, align versions with the ones used to train (or re-train on COSMA and
  re-ship). Consider pinning these in a downstream environment.
- Redshift / property coverage is unchanged by portability: z = 0.25 for all
  properties, z = 0.00 for the matter sector only. See Caveats.
- Hardcoded COSMA paths (MG_GLAM_ROOT, the FREYJA_ROOT default) live only in
  training/cross-check code paths and never run during prediction.
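For the pickle caveat, a downstream environment can compare its installed library versions against the ones used at training time before loading anything. The sketch below is a minimal illustration with hypothetical helper names; the training-time versions are not stated in this document, so the caller has to supply them.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


# Intended use: `expected` must be filled in by the caller with the versions
# used to train the artifacts (not given here).
#
#     expected = {"numpy": ..., "scikit-learn": ..., "tinygp": ..., "jax": ...}
#     print(version_mismatches(expected, installed_versions(expected)))
```

An exact-match comparison is deliberately strict; a looser rule (for example, matching only major and minor versions) may be enough in practice, but that is a judgment call for the environment owner.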