halocat
halocat is a small Python package for processing GLAM halo catalogues from
the DEGRACE-pilot simulation suite. It turns the raw plain-text
CatshortV.*.DAT catalogues into per-realisation HDF5 mirrors and computes
three derived statistics on top of them:
- the halo mass function (HMF),
- the halo–halo two-point correlation function xi_hh, and
- pairwise velocity moments via the pairvel/PairVel.jl wrapper.
Every measurement is load-or-measure: existing outputs are read straight from disk, and missing ones are produced on demand by the pipeline.
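The load-or-measure pattern can be sketched generically as below. This is a minimal illustration, not halocat's implementation: the helper name `load_or_measure`, the `.npy` cache format, and the `measure` callback are all assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import Callable

import numpy as np


def load_or_measure(path: Path, measure: Callable[[], np.ndarray]) -> np.ndarray:
    """Return the array cached at `path`, measuring and saving it if absent.

    Hypothetical helper: halocat's real loaders may use a different
    file format and signature.
    """
    if path.exists():
        # Existing output: read it straight from disk.
        return np.load(path)
    # Missing output: produce it on demand, then cache it for next time.
    result = measure()
    path.parent.mkdir(parents=True, exist_ok=True)
    np.save(path, result)
    return result


calls = []  # records how many times the "measurement" actually runs


def fake_measurement() -> np.ndarray:
    """Stand-in for an expensive statistic such as the HMF."""
    calls.append(1)
    return np.arange(5, dtype=float)


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "LCDM" / "hmf.npy"
    first = load_or_measure(cache, fake_measurement)   # measures and writes
    second = load_or_measure(cache, fake_measurement)  # reads from disk
```

In this sketch the second call never invokes the measurement, which is the behaviour the sentence above describes for existing outputs.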
What it gives you
- halocat.io.read_halo_hdf5: read a reformatted halo catalogue as a dict[str, np.ndarray] keyed by halocat.config.CATALOGUE_COLUMNS.
- halocat.HMFLoader / halocat.XiHHLoader: load-or-measure data loaders returning typed records (HMFRecord, XiHHRecord) and supporting sub-grid stacking via get_grid(...).
- halocat.XiHHLoader.measure_pair: xi_hh(r | bin1, bin2) for an arbitrary pair of finite-width log-mass bins, returned as an in-memory XiHHPairRecord (never written to disk).
- halocat.pipeline.run_single / run_all: orchestrate the full pipeline for one realisation or the full sub-grid.
- A halocat console script and scripts/ drivers for batch use on COSMA.
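To make "a pair of finite-width log-mass bins" concrete, the sketch below selects two halo samples from a catalogue held as a dict of arrays. It is illustrative only: the column name `log10M`, the helper `select_log_mass_bin`, and the half-open interval convention are assumptions, not taken from halocat.

```python
import numpy as np


def select_log_mass_bin(log10_mass, lo, hi):
    """Boolean mask for haloes with lo <= log10(M) < hi.

    The half-open interval is an assumption made for this example.
    """
    log10_mass = np.asarray(log10_mass)
    return (log10_mass >= lo) & (log10_mass < hi)


# Toy catalogue in a dict-of-arrays layout; "log10M" and "x" are
# hypothetical column names.
catalogue = {
    "log10M": np.array([12.9, 13.1, 13.25, 13.5, 13.8, 14.0]),
    "x": np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
}

mask1 = select_log_mass_bin(catalogue["log10M"], 13.0, 13.3)
mask2 = select_log_mass_bin(catalogue["log10M"], 13.7, 14.0)

# Apply each mask to every column so the samples stay row-aligned.
sample1 = {name: col[mask1] for name, col in catalogue.items()}
sample2 = {name: col[mask2] for name, col in catalogue.items()}
```

A cross-correlation between the two bins would then be measured between `sample1` and `sample2`.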
At a glance
from halocat import config as C, HMFLoader, XiHHLoader
from halocat.io import read_halo_hdf5
# 1. Load a halo catalogue
halo_path = f"{C.get_output_dir('LCDM', 0.25, 1, 1)}/halo.hdf5"
data = read_halo_hdf5(halo_path)
# 2. Load (or measure) the HMF and the static-grid xi_hh
hmf = HMFLoader().get("LCDM", 0.25, imodel=1, ibox=1)
xi = XiHHLoader().get("LCDM", 0.25, imodel=1, ibox=1)
# 3. Custom mass-bin pair, in-memory only
rec = XiHHLoader().measure_pair(
    "LCDM", 0.25, 1, 1,
    log10M1=(13.0, 13.3), log10M2=(13.7, 14.0),
)
print(rec.r.shape, rec.xi.shape)
Where to next
- Installation: set up the cosemu environment and install halocat in editable mode.
- Quick start: walk through one full realisation end to end.
- User guide: task-oriented pages for each data product.
- API reference: auto-generated from the source docstrings.
Convention reminders
r centres for xi_hh
Each bin centre is always the arithmetic mean of that bin's two edges in
r_edges. The package never relies on pycorr.sepavg, because it returns NaN
for empty bins and varies between realisations, which would break
sub-grid stacking.
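The convention can be illustrated with plain NumPy. The edges and pair separations below are made-up values, and the pair-averaged separation is computed by hand to mimic the behaviour described above rather than by calling pycorr.

```python
import numpy as np

# Illustrative bin edges (not halocat's actual binning).
r_edges = np.array([0.0, 1.0, 2.0, 4.0])

# Arithmetic mean of each bin's two edges: fixed for a given r_edges,
# so identical for every realisation.
r_centres = 0.5 * (r_edges[:-1] + r_edges[1:])

# For contrast, a pair-averaged separation depends on which pairs land in
# each bin. These toy separations leave the middle bin empty.
pair_seps = np.array([0.2, 0.4, 3.0])
counts, _ = np.histogram(pair_seps, bins=r_edges)
sums, _ = np.histogram(pair_seps, bins=r_edges, weights=pair_seps)
with np.errstate(invalid="ignore", divide="ignore"):
    r_pair_avg = sums / counts  # 0/0 in the empty bin gives NaN

print(r_centres)
print(r_pair_avg)
```

Because `r_centres` is a function of the edges alone, arrays measured on different realisations share one r axis and can be stacked directly.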
Source .DAT files are read-only
The pipeline never modifies CatshortV.*.DAT. The reformatter writes
a fresh HDF5 mirror under halocat.config.OUTPUT_ROOT and downstream
stages only read from there.