Halo catalogues
A halo catalogue holds the per-halo state vectors for one
(gravity, redshift, imodel, ibox) realisation. The pipeline keeps two
on-disk representations of it:
- the upstream plain-text CatshortV.*.DAT files (read-only, never modified), and
- a per-realisation HDF5 mirror halo.hdf5 written under OUTPUT_ROOT.
Downstream stages always read from halo.hdf5; the .DAT is parsed
exactly once, when the HDF5 mirror is missing.
Output layout
halocat.config.get_output_dir returns the output directory for a
realisation; halo.hdf5 sits directly inside it:
from halocat import config as C
out_dir = C.get_output_dir("LCDM", 0.25, imodel=1, ibox=1)
halo_path = f"{out_dir}/halo.hdf5"
Reading the catalogue
read_halo_hdf5 returns a plain
dict[str, np.ndarray] keyed by CATALOGUE_COLUMNS:
from halocat.io import read_halo_hdf5
from halocat import config as C
data = read_halo_hdf5(halo_path)
print(len(data[C.MASS_COLUMN]), "haloes")
print("columns:", list(data))
The standard columns are:
| Column | Description | Units |
|---|---|---|
| x, y, z | comoving position | Mpc/h |
| vx, vy, vz | peculiar velocity | km/s |
| Mtot | total halo mass (the mass column) | M⊙/h |
| Mbound | bound-particle mass | M⊙/h |
| Rvir | virial radius | Mpc/h |
| Vrms, Vcirc | velocity dispersion / circular velocity | km/s |
| Cvir, Lambda, Xoff, ... | shape & spin diagnostics | |
MASS_COLUMN (= "Mtot") is the column
used by all downstream measurements.
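As a minimal sketch of working with the returned dict, the block below applies a mass cut and stacks positions and velocities into (N, 3) arrays. The data dict is a small synthetic stand-in that reuses the column names from the table above (the values are made up, not catalogue output), and select_by_mass and stack_columns are hypothetical helpers, not part of halocat.

```python
import numpy as np

MASS_COLUMN = "Mtot"  # mirrors the value the docs give for config.MASS_COLUMN


def select_by_mass(data, m_min, m_max=np.inf):
    """Return a new dict holding only haloes with m_min <= Mtot < m_max."""
    mass = data[MASS_COLUMN]
    keep = (mass >= m_min) & (mass < m_max)
    # Apply the same boolean mask to every column so rows stay aligned.
    return {name: col[keep] for name, col in data.items()}


def stack_columns(data, names):
    """Stack 1-D columns into an (N, len(names)) array."""
    return np.column_stack([data[name] for name in names])


# Synthetic stand-in for the dict returned by read_halo_hdf5 (made-up values).
data = {
    "x": np.array([10.0, 20.0, 30.0, 40.0]),
    "y": np.array([1.0, 2.0, 3.0, 4.0]),
    "z": np.array([5.0, 6.0, 7.0, 8.0]),
    "vx": np.array([100.0, -50.0, 0.0, 25.0]),
    "vy": np.array([0.0, 10.0, -10.0, 5.0]),
    "vz": np.array([-20.0, 30.0, 40.0, -5.0]),
    "Mtot": np.array([5e11, 2e12, 8e12, 3e13]),
}

cut = select_by_mass(data, m_min=1e12, m_max=1e13)
pos = stack_columns(cut, ["x", "y", "z"])     # positions, Mpc/h
vel = stack_columns(cut, ["vx", "vy", "vz"])  # velocities, km/s
print(pos.shape, vel.shape)
```

Masking every column with the same boolean array keeps positions, velocities and masses row-aligned after the cut, which is why the helper returns a whole new dict rather than a filtered mass array alone.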
File-level attributes
halo.hdf5 carries a small set of file-level attributes describing the
realisation:
import h5py
with h5py.File(halo_path, "r") as f:
    print(dict(f.attrs))
# {'box_size': 1024.0, 'gravity': 'LCDM', 'ibox': 1, 'imodel': 1,
# 'redshift': 0.25, 'snapnum': 137,
# 'source_dat': '/cosma8/.../CatshortV.0137.DAT'}
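One possible use for these attributes is a sanity check that a file on disk really belongs to the realisation you asked for. The following is a minimal sketch under that assumption: check_realisation is a hypothetical helper (not part of halocat), and it takes a plain dict such as dict(f.attrs) so that it can be shown without opening a file.

```python
import math


def check_realisation(attrs, gravity, redshift, imodel, ibox, z_tol=1e-6):
    """Return a list of mismatch messages; an empty list means the file matches."""
    problems = []
    if attrs.get("gravity") != gravity:
        problems.append(
            f"gravity: file has {attrs.get('gravity')!r}, wanted {gravity!r}"
        )
    # Redshift is a float, so compare with a tolerance rather than ==.
    z_file = attrs.get("redshift")
    if z_file is None or not math.isclose(float(z_file), redshift, abs_tol=z_tol):
        problems.append(f"redshift: file has {z_file!r}, wanted {redshift!r}")
    for key, wanted in (("imodel", imodel), ("ibox", ibox)):
        found = attrs.get(key)
        if found is None or int(found) != wanted:
            problems.append(f"{key}: file has {found!r}, wanted {wanted!r}")
    return problems


# Attribute values copied from the example output above.
attrs = {"box_size": 1024.0, "gravity": "LCDM", "ibox": 1, "imodel": 1,
         "redshift": 0.25, "snapnum": 137}

ok = check_realisation(attrs, "LCDM", 0.25, imodel=1, ibox=1)
bad = check_realisation(attrs, "LCDM", 0.25, imodel=0, ibox=1)
print(ok)
print(bad)
```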
Auto-reformat from .DAT
If halo.hdf5 is missing, run_single
reformats it on demand:
from halocat.pipeline import run_single
status = run_single(
    "LCDM", 0.25, imodel=1, ibox=1,
    do_halo=True, do_hmf=False, do_tpcf=False, do_vel=False,
)
assert status["ok"]
The same path is taken transparently by
HMFLoader.get,
XiHHLoader.get, and
XiHHLoader.measure_pair
when they need halo data and the HDF5 mirror is not yet on disk.
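The parse-once behaviour can be pictured as a small check-then-build pattern. This is only a sketch of the idea, not halocat's implementation: load_with_mirror and its three callables are hypothetical names, and a JSON file stands in for the HDF5 mirror so that the example needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_with_mirror(mirror_path, parse_source, write_mirror, read_mirror):
    """Read the mirror if it exists; otherwise parse the source once and write it."""
    mirror_path = Path(mirror_path)
    if not mirror_path.exists():
        data = parse_source()              # expensive step, e.g. parsing the .DAT
        mirror_path.parent.mkdir(parents=True, exist_ok=True)
        write_mirror(mirror_path, data)    # create the mirror for later calls
    # Always return what is on disk, so every caller sees the same data.
    return read_mirror(mirror_path)


parse_calls = []  # records how many times the "source" was parsed


def fake_parse():
    parse_calls.append(1)
    return {"Mtot": [1.0e12, 2.0e12]}  # made-up values


def write_json(path, data):
    path.write_text(json.dumps(data))


def read_json(path):
    return json.loads(path.read_text())


with tempfile.TemporaryDirectory() as tmp:
    mirror = Path(tmp) / "realisation" / "halo.json"
    first = load_with_mirror(mirror, fake_parse, write_json, read_json)
    second = load_with_mirror(mirror, fake_parse, write_json, read_json)

print(len(parse_calls))  # the source was parsed on the first call only
```

Returning the on-disk copy even on the first call means every caller sees data that has been through the same write and read path, whichever call happened to build the mirror.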
Fiducial-cosmology realisations
Three fiducial cosmology runs are exposed through the same loaders by
sentinel imodel values:
- ("LCDM", imodel=0): DESI_MGx100/GR (ΛCDM fiducial)
- ("fRn1", imodel=0): F5n1 (|f_R0| = 1e-5)
- ("fRn1", imodel=-1): F6n1 (|f_R0| = 1e-6)
Each has 100 boxes (ibox=1..100) and 27 snapshots. See
Configuration → Fiducial cosmology runs
for the full table of redshifts and source paths.
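from halocat.pipeline import run_single
# HMFLoader and XiHHLoader are assumed to be imported already.
status = run_single("LCDM", 0.25, imodel=0, ibox=1)  # writes halo.hdf5 etc.
hmf = HMFLoader().get("LCDM", 0.25, imodel=0, ibox=1)
xi = XiHHLoader().get("fRn1", 0.25, imodel=-1, ibox=1)  # F6n1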
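The sentinel convention amounts to a lookup keyed on (gravity, imodel). A minimal sketch follows, with the labels taken from the list above; FIDUCIAL_RUNS, fiducial_label and iter_fiducial_boxes are hypothetical names used only for illustration and are not halocat's API.

```python
# (gravity, imodel) pairs that select fiducial runs, with the labels listed above.
FIDUCIAL_RUNS = {
    ("LCDM", 0): "DESI_MGx100/GR",
    ("fRn1", 0): "F5n1",
    ("fRn1", -1): "F6n1",
}
N_BOXES = 100  # ibox runs from 1 to 100 inclusive


def fiducial_label(gravity, imodel):
    """Return the fiducial run label, or None if the pair is not a sentinel."""
    return FIDUCIAL_RUNS.get((gravity, imodel))


def iter_fiducial_boxes(gravity, imodel):
    """Yield (gravity, imodel, ibox) for every box of a fiducial run."""
    if fiducial_label(gravity, imodel) is None:
        raise ValueError(f"({gravity!r}, imodel={imodel}) is not a fiducial run")
    for ibox in range(1, N_BOXES + 1):
        yield gravity, imodel, ibox


boxes = list(iter_fiducial_boxes("fRn1", -1))
print(fiducial_label("fRn1", -1), len(boxes), boxes[0], boxes[-1])
```

Each yielded tuple could then be passed to run_single or a loader's get call in the same argument order used in the example above.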
Worked example
scripts/example_load_halo.py is a minimal CLI walkthrough — it loads a
realisation, prints attributes / column summary, demonstrates a mass cut,
and constructs (N, 3) position/velocity arrays: