Accuracy & gates

Headline accuracy (z = 0.25)

| Property | ΛCDM | f(R) |
| --- | --- | --- |
| `hmf` | interior LOO 0.49 % (median at SEM floor) | fiducial OOS 0.3–0.5 %, screened tail resolved |
| `pk_mm` | 0.3–0.8 % | composed boost 0.5–0.9 % |
| `xi_mm` | masked RMS ≈ 0.49 %, BAO χ_rms ≈ 0.46 | composed boost 0.5–0.9 % |
| `xi_hh` (full scale) | 1–2 % to 60 Mpc/h | 2.0–2.6 % to 60 Mpc/h |
| `b_cum` | ≈ 0.5 % | direct 5-param GP |
| `vel_*` | interior LOO χ ≈ 0.4–0.7 (at SEM floor) | χ ≈ 1 (low/3rd), 1.4–2.3 (high even) |

Live per-artifact numbers are available from `reg.entry(prop, gravity, z)["accuracy"]` or on the Registered artifacts page. Every number, with its derivation, is in the methods paper, doc/emulation_methods.pdf.

How accuracy is measured

Interior corner-excluded LOO: leave-one-out over the design with the hull corners excluded (radius mode), so the metric reflects interpolation rather than extrapolation to the design edges.
Fiducial OOS: out-of-sample prediction of the 100-box independent-seed fiducials (F5n1, F6n1; DESI-GR for ΛCDM). These are sharp tests because they are not limited by the 5-box sampling noise floor that flatters interior LOO.
Fractional vs χ: fractional RMS is used where the quantity is sign-definite; χ = residual / SEM is used where it crosses zero (ξ near the BAO, all velocity moments). χ < 1 means the error is at or below the simulation noise floor.
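The two error measures above can be sketched in a few lines of NumPy. This is a minimal illustration, not the suite's own code: the function names `fractional_rms` and `chi_rms`, the sign convention (prediction minus truth), and the toy numbers are all assumptions made for this example.

```python
import numpy as np


def fractional_rms(pred, truth):
    """RMS of (pred - truth) / truth.

    Only meaningful when `truth` is sign-definite (never crosses zero).
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    frac = (pred - truth) / truth
    return float(np.sqrt(np.mean(frac**2)))


def chi_rms(pred, truth, sem):
    """RMS of chi = (pred - truth) / SEM.

    Stays well-defined when `truth` crosses zero, because the residual
    is scaled by the standard error of the mean, not by the signal.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    sem = np.asarray(sem, dtype=float)
    chi = (pred - truth) / sem
    return float(np.sqrt(np.mean(chi**2)))


# Toy numbers, invented purely for illustration.
truth = np.array([100.0, 50.0, 20.0])
pred = np.array([100.5, 49.8, 20.1])
sem = np.array([1.0, 0.4, 0.2])

frac_err = fractional_rms(pred, truth)  # about 0.0047, i.e. roughly 0.47 %
chi_err = chi_rms(pred, truth, sem)     # 0.5, i.e. below the noise floor
print(f"fractional RMS = {frac_err:.4%}, chi_rms = {chi_err:.2f}")
```

In this toy case every residual is half of its SEM, so χ_rms is 0.5 even though the fractional errors differ from bin to bin.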

What the audit established (2026-06-16)

A suite-wide kernel × mean × n_components × n_restarts ablation, gated on the independent-seed fiducials, confirmed that the expsquared kernel with a constant mean is optimal everywhere and that most artifacts sit at the noise floor. Three improvements were deployed; the remaining settings were shown to be already optimal:

ΛCDM hmf n_components 3→4 — interior LOO 0.65 %→0.49 %.
f(R) hmf n_restarts 3→10 — the "screened-tail weakness" was optimizer under-convergence, not design sampling: F6n1 fiducial χ 95→6.6, tail max 27 %→0.67 %. Closed with zero new sims.
f(R) xi_hh_smallr jitter floor (jitter_floor_frac = 0.05) — regularises under-fit high PCA modes; small-r ξ_hh bands ~6 % better.

n_restarts = 10 is now the suite-wide default as convergence insurance.
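Why extra restarts act as convergence insurance can be illustrated with a toy multi-start hyperparameter fit. The sketch below is not the suite's training code: it uses a hand-written squared-exponential kernel with a zero mean, a fixed noise variance, SciPy's L-BFGS-B optimizer, and invented 1-D data, and the names `neg_log_marginal_likelihood` and `fit_with_restarts` are hypothetical. It only shows that keeping the best of several random starts can never do worse than keeping the best of the first few.

```python
import numpy as np
from scipy.optimize import minimize


def neg_log_marginal_likelihood(log_params, x, y, noise_var=0.01):
    """Negative GP log marginal likelihood, squared-exponential kernel, zero mean."""
    log_amp, log_len = log_params
    amp2 = np.exp(2.0 * log_amp)
    ell = np.exp(log_len)
    d2 = (x[:, None] - x[None, :]) ** 2
    K = amp2 * np.exp(-0.5 * d2 / ell**2) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y via two triangular systems
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * log det K equals the sum of log of the Cholesky diagonal
    return (0.5 * y @ alpha
            + np.sum(np.log(np.diag(L)))
            + 0.5 * len(x) * np.log(2.0 * np.pi))


def fit_with_restarts(x, y, n_restarts, seed=0):
    """Run L-BFGS-B from n_restarts random starts and keep the lowest objective."""
    rng = np.random.default_rng(seed)
    bounds = [(-3.0, 3.0), (-3.0, 3.0)]  # bounds on log amplitude, log length
    best = None
    for _ in range(n_restarts):
        start = rng.uniform(-3.0, 3.0, size=2)
        res = minimize(neg_log_marginal_likelihood, start, args=(x, y),
                       method="L-BFGS-B", bounds=bounds)
        if best is None or res.fun < best.fun:
            best = res
    return best


# Toy data, invented for illustration only.
data_rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 25)
y = np.sin(x) + 0.1 * data_rng.normal(size=x.size)

fit3 = fit_with_restarts(x, y, n_restarts=3)
fit10 = fit_with_restarts(x, y, n_restarts=10)
print(f"best objective, 3 restarts:  {fit3.fun:.4f}")
print(f"best objective, 10 restarts: {fit10.fun:.4f}")
```

Because both calls share a seed, the first three starting points are identical, so the 10-restart result is at least as good as the 3-restart one. Whether the extra starts actually find a better optimum depends on how multimodal the likelihood surface is.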

Rejected routes (recorded; do not re-implement)

Seed-paired boosts for f(R) b_cum and xi_hh_smallr (worse at the fiducial / noisier target) — both stay direct.
r-space / Hankel-hybrid anchoring of xi_mm on large scales (degrades the BAO).
A linear mean for the f(R) hmf boost (trades F5n1 for F6n1; restarts is the real fix).
matern kernel for any trained artifact (worse-to-catastrophic on these smooth targets).
Velocity standardized moments and r-binning/r_max trimming (no deployable gain; see Velocity).

Open items

Frozen-seed large-scale offset in ξ / ξ_hh (both gravities) — pk_mm is closed via the low-k anchor; ξ remains open. See Caveats.
f(R) high even velocity moments plateau at χ ≈ 1.4–2.3, a design-sampling limit that is addressable only with new simulations (see f(R) gravity).