Getting Started

Installation

For a basic editable install:

pip install -e .

To include the common local development extras:

pip install -e ".[dev]"

Optional extras can also be installed separately:

  • .[io] for HDF5 I/O and pandas conversion
  • .[plot] for plotting helpers
  • .[docs] for MkDocs
  • .[test] for pytest
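
To see which optional extras are usable in the current environment, a small check along these lines may help. It is only a sketch: the helper name available_extras is hypothetical, and the module names h5py and matplotlib are assumptions about what the io and plot extras install (pandas, mkdocs, and pytest come from the list above).

```python
from importlib.util import find_spec

# Extra name -> top-level modules it is assumed to provide.
# h5py and matplotlib are guesses, not read from the package metadata.
ASSUMED_EXTRA_MODULES = {
    "io": ["h5py", "pandas"],
    "plot": ["matplotlib"],
    "docs": ["mkdocs"],
    "test": ["pytest"],
}


def available_extras(extra_modules=ASSUMED_EXTRA_MODULES):
    """Return {extra: bool}; True when every listed module can be found."""
    return {
        extra: all(find_spec(name) is not None for name in modules)
        for extra, modules in extra_modules.items()
    }


print(available_extras())
```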

Create a catalogue

You can construct a catalogue directly from arrays:

import numpy as np

from halocat import HaloCat

cat = HaloCat(
    x=np.array([0.0, 1.0, 2.0]),
    y=np.array([0.0, 0.0, 0.0]),
    z=np.array([0.0, 0.0, 0.0]),
    vx=np.array([100.0, 120.0, 140.0]),
    vy=np.zeros(3),
    vz=np.zeros(3),
    mass=np.array([1.0e12, 2.0e12, 5.0e12]),
    boxsize=100.0,
    redshift=0.5,
)

Vector aliases are also supported when your positions or velocities already live in (N, 3) arrays:

cat = HaloCat(
    positions=np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
    velocities=np.array([[100.0, 0.0, 0.0], [120.0, 0.0, 0.0]]),
    mass=np.array([1.0e12, 2.0e12]),
    boxsize=100.0,
)
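
If your data are stored as separate component arrays but you prefer the vector form (or the other way round), plain NumPy converts between the two layouts. The sketch below uses only NumPy and assumes nothing about HaloCat internals.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
y = np.zeros(3)
z = np.zeros(3)

# Stack per-axis arrays into one (N, 3) array, one row per halo.
positions = np.column_stack([x, y, z])
print(positions.shape)

# Transposing gives the three component arrays back.
x2, y2, z2 = positions.T
```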

You can also load a catalogue from disk:

cat = HaloCat.from_file(
    "CatshortV.0001.0001.DAT",
    format="bdm",
    boxsize=1024.0,
    redshift=0.5,
)

Format inference works for .npz, .npy, .csv, .txt, .dat, .h5, and .hdf5 paths.
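
As a rough illustration, a check against the listed extensions could look like the sketch below. The helper has_inferable_suffix is hypothetical and not part of halocat, and lower-casing the suffix is an assumption: the library may treat case differently.

```python
from pathlib import Path

# Extensions listed above as supported by format inference.
INFERABLE_SUFFIXES = {".npz", ".npy", ".csv", ".txt", ".dat", ".h5", ".hdf5"}


def has_inferable_suffix(path):
    """Hypothetical helper: True if the path ends in a listed extension.

    Only the final suffix is examined, and it is compared in lower case
    (an assumption of this sketch).
    """
    return Path(path).suffix.lower() in INFERABLE_SUFFIXES


print(has_inferable_suffix("halo_catalogue.hdf5"))
print(has_inferable_suffix("halo_catalogue.fits"))
```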

Common operations

Selections:

massive = cat.select_by_mass(1.0e12, 1.0e14)
region = cat.select_region(x=(0.0, 50.0), y=(0.0, 50.0), z=(0.0, 50.0))
subset = cat.select([True, False, True])  # boolean mask; needs one entry per halo in cat
selected_ids = cat.select_by_field("halo_id", values=[101, 205, 333])
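
Conceptually, each of these selections corresponds to a boolean mask over the catalogue. The sketch below shows equivalent masks in plain NumPy. It is not halocat's implementation, and treating both bounds as inclusive is an assumption.

```python
import numpy as np

mass = np.array([5.0e11, 1.0e12, 3.0e13, 2.0e14])
x = np.array([10.0, 60.0, 20.0, 40.0])
halo_id = np.array([101, 150, 205, 400])

# Mass window; inclusive bounds are an assumption of this sketch.
mass_mask = (mass >= 1.0e12) & (mass <= 1.0e14)

# Range along one axis, analogous to x=(0.0, 50.0) in select_region.
x_mask = (x >= 0.0) & (x <= 50.0)

# Masks combine with a logical AND before being applied.
combined = mass_mask & x_mask
print(mass[combined])

# Membership test, analogous to select_by_field(..., values=[...]).
id_mask = np.isin(halo_id, [101, 205, 333])
print(halo_id[id_mask])
```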

Field access:

positions = cat.get_positions()
velocities = cat.get_velocities()
masses = cat.get_field("mass")

cat.add_field("log10mass", np.log10(masses))

Metadata and inspection:

print(cat.field_names)
print(cat.size)
print(cat.summary())

Catalogues can be saved as .npz or as HDF5 (.hdf5 or .h5); HDF5 I/O is provided by the .[io] extra:

cat.save("halo_catalogue.npz", overwrite=True)
cat.save("halo_catalogue.hdf5", overwrite=True)

Measurements

hmf = cat.measure.hmf(log10mass_edges=np.arange(12.0, 15.1, 0.1))
xi = cat.measure.xi(np.geomspace(0.5, 50.0, 20), chunk_size=2048)
vel = cat.measure.pairwise_velocity_moments(np.geomspace(0.5, 20.0, 12))
spin_vs_mass = cat.measure.binned_statistic("spin", np.logspace(12.0, 15.0, 8))

Convenience aliases on HaloCat are also available:

hmf = cat.measure_dndlog10M(log10mass_edges=np.arange(12.0, 15.1, 0.1))
xi = cat.measure_2PCF(np.geomspace(0.5, 50.0, 20))
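
For orientation, the quantity suggested by the name measure_dndlog10M is the number density of haloes per unit log10(mass). A minimal NumPy sketch of that estimator follows. The function dn_dlog10m is hypothetical, it assumes a cubic box of side boxsize, and halocat's own normalisation and return type may differ.

```python
import numpy as np


def dn_dlog10m(mass, boxsize, log10mass_edges):
    """Halo number density per unit log10(mass) in each bin."""
    counts, _ = np.histogram(np.log10(mass), bins=log10mass_edges)
    volume = boxsize ** 3               # cubic box assumed
    widths = np.diff(log10mass_edges)   # bin widths in dex
    return counts / (volume * widths)


mass = np.array([1.5e12, 2.0e12, 5.0e12, 3.0e13])
edges = np.linspace(12.0, 14.0, 3)      # two bins, each 1 dex wide
print(dn_dlog10m(mass, boxsize=100.0, log10mass_edges=edges))
```

np.linspace is used for the edges here because it fixes the number of bins exactly, whereas np.arange with a fractional step can be sensitive to floating-point rounding at the upper end.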

Pair-counting measurements require boxsize and assume periodic-box distances. In the current implementation, the largest separation edge must satisfy r_max <= 0.5 * min(boxsize).
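
To make the periodic-box assumption concrete, here is a minimal sketch of a minimum-image separation and of the r_max check. Both function names are hypothetical and the code is illustrative only, not halocat's pair counter. The limit reflects the fact that a wrapped separation along any axis cannot exceed half the box length.

```python
import numpy as np


def minimum_image_separation(p1, p2, boxsize):
    """Distance between two points in a periodic box (minimum image)."""
    box = np.broadcast_to(np.asarray(boxsize, dtype=float), (3,))
    delta = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    # Wrap each component into [-box/2, box/2].
    delta -= box * np.round(delta / box)
    return np.sqrt(np.sum(delta ** 2))


def check_r_max(r_edges, boxsize):
    """Raise if the largest edge exceeds half the smallest box side."""
    r_max = np.max(r_edges)
    limit = 0.5 * np.min(boxsize)
    if r_max > limit:
        raise ValueError(f"r_max={r_max} exceeds 0.5 * min(boxsize)={limit}")


# Points near opposite faces are close neighbours through the boundary.
print(minimum_image_separation([1.0, 0.0, 0.0], [99.0, 0.0, 0.0], 100.0))
check_r_max(np.geomspace(0.5, 20.0, 12), boxsize=100.0)
```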

DEGRACE workflow

If you are working with the DEGRACE-pilot datasets, the typical sequence is:

  1. Reformat a raw BDM halo catalogue with reformat_degrace_pilot_halo_catalogue(...).
  2. Optionally copy the corresponding matter power spectrum with copy_degrace_pilot_pk_mm(...).
  3. Measure and save the halo mass function with measure_degrace_pilot_hmf(...).
  4. Load or plot the saved products with load_degrace_hmf(...), load_degrace_pk_mm(...), plot_degrace_hmf_models(...), or plot_degrace_pk_mm_models(...).