Catalogue I/O

Catalogue loading and saving helpers live in halocat/io.py.

Supported input formats

BDM CatshortV

The existing repository scripts read BDM halo catalogues in a text format with an 8-line header. HaloCat.from_file(..., format="bdm") reproduces that convention.

Default column mapping (zero-based column indices):

  • x: column 0
  • y: column 1
  • z: column 2
  • vx: column 3
  • vy: column 4
  • vz: column 5
  • mass: column 7

You can override the mapping with field_map. The example below reads mass from column 9 instead of the default column 7:

cat = HaloCat.from_file(
    "catalogue.dat",
    format="bdm",
    field_map={"x": 0, "y": 1, "z": 2, "mass": 9},
    skiprows=8,
)
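For orientation, the header-skipping and column-selection convention can be sketched with plain NumPy. This is a minimal illustration, not the loader in halocat/io.py: the sample rows and the header text are made up, and it assumes whitespace-separated numeric columns.

```python
import io

import numpy as np

# Made-up stand-in for a BDM text catalogue: 8 header lines, then numeric rows.
header = "".join(f"header line {i}\n" for i in range(8))
rows = (
    "1.0 2.0 3.0 10.0 20.0 30.0 0.5 1.0e12\n"
    "4.0 5.0 6.0 40.0 50.0 60.0 0.7 2.0e12\n"
)
text = header + rows

# Default mapping from this section: field name -> zero-based column index.
field_map = {"x": 0, "y": 1, "z": 2, "vx": 3, "vy": 4, "vz": 5, "mass": 7}

# skiprows=8 drops the header; ndmin=2 keeps the table 2-D even for one row.
table = np.loadtxt(io.StringIO(text), skiprows=8, ndmin=2)

# Slice one column per field.
fields = {name: table[:, col] for name, col in field_map.items()}
```

Changing an entry in field_map (for example "mass": 9) only changes which column is sliced; the header handling stays the same.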

HDF5

cat = HaloCat.from_file("halo_catalogue.hdf5", format="hdf5")
cat = HaloCat.from_file("halo_catalogue.hdf5", format="hdf5", group="box1")

The loader reads one dataset per field from the selected group, or from the file root when no group is given. When field_map is provided for HDF5 input, the mapping values should be dataset names rather than column indices.

Metadata may be stored directly in HDF5 attributes or packed into a metadata_json attribute.
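The two metadata layouts can be combined along these lines. The sketch works on a plain dict standing in for an HDF5 attribute mapping, so it runs without h5py. The helper name merge_hdf5_metadata is hypothetical, and the rule that packed values win over same-named plain attributes is an assumption, not documented behavior.

```python
import json


def merge_hdf5_metadata(attrs):
    """Combine plain attributes with an optional packed metadata_json attribute.

    Hypothetical helper: `attrs` is any mapping of attribute name -> value.
    """
    # Start from every attribute except the packed JSON blob itself.
    metadata = {key: value for key, value in attrs.items() if key != "metadata_json"}
    packed = attrs.get("metadata_json")
    if packed is not None:
        # Assumption: values inside metadata_json override plain attributes.
        metadata.update(json.loads(packed))
    return metadata


# Stand-in for the attributes of an HDF5 group or file root.
attrs = {
    "boxsize": 1000.0,
    "metadata_json": json.dumps({"redshift": 0.5, "periodic": True}),
}
metadata = merge_hdf5_metadata(attrs)
```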

NPZ

cat = HaloCat.from_file("halo_catalogue.npz")

NPZ archives are a lightweight interchange format for storing the catalogue arrays and metadata together.
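One way to keep arrays and metadata in a single NPZ archive is to store the metadata as a JSON string next to the field arrays. The key name metadata_json and the field names below are assumptions made for this sketch; the layout that HaloCat.save(...) actually writes may differ.

```python
import io
import json

import numpy as np

metadata = {"boxsize": 1000.0, "redshift": 0.5}

# Write field arrays plus a JSON-encoded metadata string into one archive.
buffer = io.BytesIO()  # in-memory stand-in for a file on disk
np.savez(
    buffer,
    x=np.array([1.0, 4.0]),
    mass=np.array([1.0e12, 2.0e12]),
    metadata_json=np.array(json.dumps(metadata)),
)

# Read the archive back and decode the metadata.
buffer.seek(0)
with np.load(buffer) as archive:
    mass = archive["mass"]
    restored = json.loads(archive["metadata_json"].item())
```

Storing metadata as a string avoids pickled objects, so the archive loads with NumPy's default allow_pickle=False.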

NPY structured arrays

cat = HaloCat.from_file("halo_catalogue.npy")

.npy input expects a structured array with named columns.
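As a reminder of what a structured array with named columns looks like, the sketch below builds one, writes it in .npy format, and reads it back. The field names and values are illustrative only; which names the loader requires is not specified here.

```python
import io

import numpy as np

# One named, typed column per catalogue field (illustrative selection).
dtype = [("x", "f8"), ("y", "f8"), ("z", "f8"), ("mass", "f8")]
halos = np.array(
    [(1.0, 2.0, 3.0, 1.0e12), (4.0, 5.0, 6.0, 2.0e12)],
    dtype=dtype,
)

# Round-trip through the .npy format using an in-memory buffer.
buffer = io.BytesIO()
np.save(buffer, halos)
buffer.seek(0)
loaded = np.load(buffer)

# Columns are addressed by name rather than by position.
masses = loaded["mass"]
```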

CSV and named text tables

cat = HaloCat.from_file("halo_catalogue.csv", field_map={"mass": "m200c"})
cat = HaloCat.from_file("halo_catalogue.txt", format="text")

Delimited text input expects a header row with column names. For named-column formats, field_map values should be source column names rather than integer indices.
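The name-based mapping can be illustrated with the standard csv module. This is a sketch of the idea, not the loader's code: the sample table is made up, and it assumes that fields missing from field_map are read from a column with the same name.

```python
import csv
import io

# Made-up CSV with a header row; the mass column is called m200c.
text = "x,y,z,m200c\n1.0,2.0,3.0,1.0e12\n4.0,5.0,6.0,2.0e12\n"

# field_map values are source column names, not integer indices.
field_map = {"mass": "m200c"}

rows = list(csv.DictReader(io.StringIO(text)))

# Assumption: unmapped fields use their own name as the source column.
wanted = {"x": "x", "y": "y", "z": "z", **field_map}
fields = {
    name: [float(row[source]) for row in rows]
    for name, source in wanted.items()
}
```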

Saving

HaloCat.save(...) currently writes the following formats:

  • .npz
  • .hdf5 / .h5

Examples:

cat.save("halo_catalogue.npz", overwrite=True)
cat.save("halo_catalogue.hdf5", overwrite=True)

Metadata behavior

When loading from a file, HaloCat.from_file(...) uses stored metadata for:

  • boxsize
  • redshift
  • periodic
  • cosmology
  • units

Explicit keyword arguments passed to from_file(...) take precedence over the values found on disk.
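The precedence rule amounts to a dictionary merge in which explicit arguments are applied last. The helper below is hypothetical and treats None as "not passed", which is an assumption about how unset keywords are represented.

```python
def resolve_metadata(stored, **overrides):
    """Merge stored metadata with explicit keyword arguments (hypothetical helper)."""
    resolved = dict(stored)  # copy so the on-disk values are not mutated
    # Assumption: None means the caller did not pass that keyword.
    resolved.update(
        {key: value for key, value in overrides.items() if value is not None}
    )
    return resolved


stored = {"boxsize": 1000.0, "redshift": 0.0, "periodic": True}

# redshift is overridden; boxsize=None falls back to the stored value.
resolved = resolve_metadata(stored, redshift=0.5, boxsize=None)
```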

Low-level helpers

halocat.io also exposes:

  • infer_catalog_format(path, format=None) to resolve a file format from a suffix or explicit override
  • load_catalog_file(...) to load raw field dictionaries plus metadata
  • save_catalog_file(...) to save field dictionaries without constructing a HaloCat
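Suffix-based format resolution with an explicit override can be sketched as follows. The function name infer_format_sketch, the suffix table, and the format strings "npz", "npy", and "csv" are assumptions for illustration; only "bdm", "hdf5", and "text" appear as format values in this document, and the real infer_catalog_format may behave differently.

```python
from pathlib import Path

# Assumed suffix table; the real helper may recognise more or fewer suffixes.
SUFFIX_TO_FORMAT = {
    ".npz": "npz",
    ".npy": "npy",
    ".hdf5": "hdf5",
    ".h5": "hdf5",
    ".csv": "csv",
}


def infer_format_sketch(path, format=None):
    """Return the explicit format if given, else look up the file suffix."""
    if format is not None:
        return format.lower()
    suffix = Path(path).suffix.lower()
    if suffix not in SUFFIX_TO_FORMAT:
        raise ValueError(f"cannot infer catalogue format from suffix {suffix!r}")
    return SUFFIX_TO_FORMAT[suffix]
```

With this table, a file such as catalogue.dat has no recognised suffix, so the format has to be given explicitly, as in the BDM example earlier in this document.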