Suite design

The simulation design

The DEGRACE-pilot design comprises 64 cosmologies sampling the 4-parameter ΛCDM space (Omega_m, h, n_s, S_8), each run as five boxes that differ only in their initial-condition seed. The f(R) (n = 1) twin campaign re-runs the same 64 cosmologies with a fifth parameter, logf_R0 ∈ [−7, −4], reusing the matched ΛCDM seeds. Snapshots are at z = 0.25 and z = 0.00.

Two 100-box fiducial suites at the GR fiducial background, F5n1 (logf_R0 = −5) and F6n1 (logf_R0 = −6), provide independent-seed out-of-sample tests that are not limited by the 5-box sampling noise floor.
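The bookkeeping implied by this design can be sketched as follows. This is only an illustration of the counts and of the seed matching between the ΛCDM and f(R) twins: the `Run` record, the `enumerate_runs` and `matched_pairs` helpers, the campaign labels and the integer seed indices are all hypothetical and are not taken from the DEGRACE pipeline.

```python
from dataclasses import dataclass
from itertools import product

N_COSMOLOGIES = 64                  # design points, as stated in the text
N_SEEDS = 5                         # boxes (initial-condition seeds) per cosmology
SNAPSHOT_REDSHIFTS = (0.25, 0.00)   # snapshot redshifts, as stated in the text


@dataclass(frozen=True)
class Run:
    """One snapshot of one box (hypothetical record layout)."""
    campaign: str    # "gr" or "fRn1" (illustrative labels)
    cosmology: int   # design-point index, 0..63
    seed: int        # initial-condition seed index, 0..4
    redshift: float


def enumerate_runs(campaign):
    """List every (cosmology, seed, snapshot) combination for one campaign."""
    combos = product(range(N_COSMOLOGIES), range(N_SEEDS), SNAPSHOT_REDSHIFTS)
    return [Run(campaign, c, s, z) for c, s, z in combos]


def matched_pairs(gr_runs, fr_runs):
    """Pair each f(R) run with the GR run sharing cosmology, seed and redshift."""
    index = {(r.cosmology, r.seed, r.redshift): r for r in gr_runs}
    return [(index[(r.cosmology, r.seed, r.redshift)], r) for r in fr_runs]


gr_runs = enumerate_runs("gr")
fr_runs = enumerate_runs("fRn1")
pairs = matched_pairs(gr_runs, fr_runs)
print(len(gr_runs), len(pairs))
```

Each campaign therefore contributes 64 × 5 × 2 = 640 snapshot-level outputs, and every f(R) output has exactly one seed-matched ΛCDM partner.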

The plugin abstraction

Each property is a plugin that implements one of two abstract base classes (ABCs), which the registry treats uniformly:

TrainedProperty — assemble() builds a training bundle from halocat loaders, build() fits the artifact, validate() produces the accuracy block. Its train_depends are upstream data products.
DerivedProperty — no weights; derive(deps, theta) combines upstream registry artifacts (depends_on, optional_depends) at predict time.

The CLI verbs train, predict, and validate dispatch through these classes. A derived property cannot be trained: running haloemu train b_diff, for example, reports exactly that.
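A minimal sketch of how two such ABCs and a registry could fit together is shown below. The class and method names follow the text, but every signature, the `REGISTRY` dictionary, the `register` decorator, the `train` and `predict` helpers and the two toy properties are assumptions made for illustration; the real haloemu interfaces may differ.

```python
from abc import ABC, abstractmethod


class Property(ABC):
    """Common base so the registry can hold both kinds (assumed design)."""
    key: str


class TrainedProperty(Property):
    train_depends: tuple = ()   # upstream data products

    @abstractmethod
    def assemble(self, loaders):
        """Build a training bundle from loaders."""

    @abstractmethod
    def build(self, bundle):
        """Fit and return the artifact."""

    @abstractmethod
    def validate(self, artifact, bundle):
        """Return the accuracy block."""


class DerivedProperty(Property):
    depends_on: tuple = ()
    optional_depends: tuple = ()

    @abstractmethod
    def derive(self, deps, theta):
        """Combine upstream artifacts at predict time; no weights of its own."""


REGISTRY = {}


def register(cls):
    REGISTRY[cls.key] = cls()
    return cls


@register
class ToyMean(TrainedProperty):          # hypothetical trained property
    key = "toy_mean"

    def assemble(self, loaders):
        return [value for loader in loaders for value in loader()]

    def build(self, bundle):
        return {"mean": sum(bundle) / len(bundle)}

    def validate(self, artifact, bundle):
        return {"max_abs_residual": max(abs(v - artifact["mean"]) for v in bundle)}


@register
class ToyDouble(DerivedProperty):        # hypothetical derived property
    key = "toy_double"
    depends_on = ("toy_mean",)

    def derive(self, deps, theta):
        return 2.0 * deps["toy_mean"]["mean"] * theta


def train(key, loaders):
    prop = REGISTRY[key]
    if isinstance(prop, DerivedProperty):
        raise TypeError(f"{key!r} is a derived property and cannot be trained")
    bundle = prop.assemble(loaders)
    artifact = prop.build(bundle)
    return artifact, prop.validate(artifact, bundle)


def predict(key, artifacts, theta):
    prop = REGISTRY[key]
    if isinstance(prop, DerivedProperty):
        deps = {name: artifacts[name] for name in prop.depends_on}
        return prop.derive(deps, theta)
    raise NotImplementedError("trained-property prediction is omitted from this sketch")


artifact, accuracy = train("toy_mean", [lambda: [1.0, 2.0, 3.0]])
value = predict("toy_double", {"toy_mean": artifact}, theta=0.5)
```

In this sketch, calling `train("toy_double", [])` raises a `TypeError`, which mirrors the behaviour described for `haloemu train b_diff` without claiming to reproduce its actual message.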

Core engine

haloemu/core/ holds the property-agnostic machinery, written once:

gp.py — the tinygp Gaussian-process wrapper (kernel, mean, restarts).
emulator.py — PCAGPEmulator (weighted PCA + per-component GPs) and PerBinGPEmulator, including the per-target log/arcsinh transforms.
loo.py — leave-one-out machinery and metrics, including the corner-excluded interior LOO (radius mode) that is the headline gate.
boost.py (under properties/) — the seed-paired MGBoostEmulator.
reduce.py, grids.py, bundle.py, util.py — reductions, grids, the training-bundle container, and shared helpers.
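The corner-excluded interior LOO mentioned for loo.py can be illustrated with a toy version. The definition used here is an assumption, not the one in loo.py: design points are taken to be rescaled to the unit hypercube, a point counts as interior when its distance from the cube centre is at most `r_max` times the centre-to-corner distance, and excluded points still contribute to training but are never scored. An inverse-distance-weighted interpolator stands in for the real emulator.

```python
import math


def normalised_radius(point):
    """Distance from the cube centre as a fraction of the centre-to-corner distance."""
    centre_distance = math.sqrt(sum((x - 0.5) ** 2 for x in point))
    return centre_distance / (0.5 * math.sqrt(len(point)))


def idw_predict(train_x, train_y, x, power=2.0):
    """Inverse-distance-weighted stand-in for the real emulator."""
    numerator = denominator = 0.0
    for xi, yi in zip(train_x, train_y):
        dist = math.dist(xi, x)
        if dist == 0.0:
            return yi
        weight = dist ** -power
        numerator += weight * yi
        denominator += weight
    return numerator / denominator


def interior_loo(xs, ys, r_max=0.8):
    """Leave-one-out errors, scored only on points with normalised radius <= r_max."""
    errors = {}
    for i, (x, y) in enumerate(zip(xs, ys)):
        if normalised_radius(x) > r_max:
            continue  # corner point: kept in every training set, never scored
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors[i] = idw_predict(train_x, train_y, x) - y
    return errors


# Toy 2-D design: four corners plus three interior points, with y = x0 + x1.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
      (0.5, 0.5), (0.25, 0.75), (0.75, 0.25)]
ys = [x0 + x1 for x0, x1 in xs]
errors = interior_loo(xs, ys)
print(sorted(errors))
```

The four corners sit at normalised radius 1 and are skipped, so only the three interior points are scored. The motivation, under the assumption above, is that a held-out corner forces the emulator to extrapolate, which says little about accuracy inside the design.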

Conventions

Kernel/mean: expsquared / constant performed best across the suite. A full kernel × mean × n_components × n_restarts ablation (ablation_sweep.py) confirmed that matern ranges from worse to catastrophic and that a linear mean helps no deployed artifact.
n_restarts = 10 suite-wide. This is not mere hygiene: the f(R) hmf boost was acutely under-converged at the old n_restarts = 3 (the so-called "screened-tail weakness"); raising n_restarts fixed it with zero new simulations.
Accuracy is gated on the independent-seed fiducials, not just interior LOO, so the 5-box sampling floor does not flatter the numbers.
One artifact per key; the manifest records accuracy, dependency pins, and code version per artifact.
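Why the number of restarts matters can be seen in a toy multi-start optimisation. The sketch below is not the GP hyperparameter fit in gp.py: the objective is an arbitrary one-dimensional function with several local minima standing in for a negative log-likelihood, the local optimiser is plain gradient descent, and all names (`objective`, `local_minimise`, `multi_start`) are hypothetical.

```python
import math
import random


def objective(x):
    """Toy loss with several local minima (stand-in for a negative log-likelihood)."""
    return math.sin(3.0 * x) + 0.1 * x * x


def gradient(x):
    """Analytic derivative of the toy loss."""
    return 3.0 * math.cos(3.0 * x) + 0.2 * x


def local_minimise(x0, step=0.02, n_steps=2000):
    """Plain gradient descent; it settles in whichever basin x0 starts in."""
    x = x0
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x, objective(x)


def multi_start(n_restarts, seed=0, lo=-6.0, hi=6.0):
    """Return the best (x, loss) over n_restarts random starting points."""
    rng = random.Random(seed)
    starts = [rng.uniform(lo, hi) for _ in range(n_restarts)]
    return min((local_minimise(x0) for x0 in starts), key=lambda result: result[1])


x3, loss3 = multi_start(3)
x10, loss10 = multi_start(10)
print(loss3, loss10)
```

Because both calls use the same seed, the three starting points of the first run are a subset of the ten used in the second, so the ten-restart loss can never be worse. The same data yield a better or equal optimum purely from extra optimiser effort, which is the sense in which raising n_restarts can fix under-convergence without new simulations.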

Phases

The suite was built in shippable phases, each gated before the next:

Phase	Properties
0	`b_cum`, `b_diff` (evolved from the legacy `emulator_b_eff` / `b_diff`)
1	`hmf`
2	`pk_mm`, `xi_mm`
3	`xi_hh`, `xi_hh_smallr`, `r_ab`
4	`vel_m10 … vel_c22`
5	the f(R) (`fRn1`) suite across the above

For the full design rationale see SUITE_DESIGN.md and DOCUMENTATION.md in the repository; for every gate number see the methods paper doc/emulation_methods.pdf.