
Accuracy & gates

Headline accuracy (z = 0.25)

| Property | ΛCDM | f(R) |
|---|---|---|
| hmf | interior LOO 0.49 % (median at SEM floor) | fiducial OOS 0.3–0.5 %, screened tail resolved |
| pk_mm | 0.3–0.8 % | composed boost 0.5–0.9 % |
| xi_mm | masked RMS ≈ 0.49 %, BAO χ_rms ≈ 0.46 | composed boost 0.5–0.9 % |
| xi_hh (full scale) | 1–2 % to 60 Mpc/h | 2.0–2.6 % to 60 Mpc/h |
| b_cum | ≈ 0.5 % | direct 5-param GP |
| vel_* | interior LOO χ ≈ 0.4–0.7 (at SEM floor) | χ ≈ 1 (low/3rd), 1.4–2.3 (high even) |

Live per-artifact numbers are available from reg.entry(prop, gravity, z)["accuracy"] or under Registered artifacts. Every number, with its derivation, is in the methods paper, doc/emulation_methods.pdf.

How accuracy is measured

  • Interior corner-excluded LOO — leave-one-out over the design with the hull corners excluded (radius mode), so the metric reflects interpolation, not extrapolation to the design edges.
  • Fiducial OOS — predicting the 100-box independent-seed fiducials (F5n1, F6n1; DESI-GR for ΛCDM). These are sharp: not limited by the 5-box sampling noise floor that flatters interior LOO.
  • Fractional vs χ — fractional RMS where the quantity is sign-definite; χ = residual / SEM where it crosses zero (ξ near the BAO, all velocity moments). χ < 1 means at or below the simulation noise floor.
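
The two metrics and the corner exclusion can be sketched in a few lines of NumPy. This is a minimal illustration, not the suite's implementation: the function names, the `radius` default, and the reading of "radius mode" as a cut on normalised distance from the design centre are all assumptions made for the example.

```python
import numpy as np

def interior_mask(design, radius=0.8):
    """Flag design points that count as 'interior'.

    Assumption: 'radius mode' is modelled as a cut on the distance from the
    design centre after scaling every parameter to [-1, 1], normalised so
    that a hull corner sits at distance 1.
    """
    design = np.asarray(design, dtype=float)
    lo, hi = design.min(axis=0), design.max(axis=0)
    unit = 2.0 * (design - lo) / (hi - lo) - 1.0
    dist = np.linalg.norm(unit, axis=1) / np.sqrt(design.shape[1])
    return dist <= radius

def fractional_rms(pred, truth):
    """RMS of pred/truth - 1; only meaningful for sign-definite quantities."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((pred / truth - 1.0) ** 2)))

def chi_rms(pred, truth, sem):
    """RMS of residual / SEM; safe where the quantity crosses zero."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    sem = np.asarray(sem, dtype=float)
    return float(np.sqrt(np.mean(((pred - truth) / sem) ** 2)))

# Toy 2-parameter design: four hull corners plus the centre point.
design = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]]
keep = interior_mask(design)  # corners are dropped, the centre is kept
```

Leave-one-out scores would then be accumulated only over the points where `keep` is true, with `fractional_rms` or `chi_rms` chosen per quantity as described above.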

What the audit established (2026-06-16)

A suite-wide kernel × mean × n_components × n_restarts ablation, gated on the independent-seed fiducials, confirmed that the expsquared kernel with a constant mean is optimal everywhere and that most artifacts sit at the noise floor. The audit deployed three improvements and showed that the remaining settings were already optimal:

  • ΛCDM hmf n_components 3→4 — interior LOO 0.65 %→0.49 %.
  • f(R) hmf n_restarts 3→10 — the "screened-tail weakness" was optimizer under-convergence, not design sampling: F6n1 fiducial χ 95→6.6, tail max 27 %→0.67 %. Closed with zero new sims.
  • f(R) xi_hh_smallr jitter floor (jitter_floor_frac = 0.05) — regularises under-fit high PCA modes; small-r ξ_hh bands ~6 % better.
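
To illustrate why n_components matters, the sketch below measures the reconstruction error of a truncated PCA basis on synthetic smooth curves. It assumes the artifacts compress their targets with a mean-subtracted SVD, which the text implies but does not spell out; the curves, the helper name `pca_truncation_rms`, and all sizes are invented for the example. It covers only the truncation part of the error budget, not the GP interpolation error that the LOO figures also include.

```python
import numpy as np

def pca_truncation_rms(curves, n_components):
    """RMS error from rebuilding curves with the leading PCA modes only."""
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    centred = curves - mean
    # Rows of vt are the PCA modes; u * s are the per-curve coefficients.
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    k = n_components
    recon = (u[:, :k] * s[:k]) @ vt[:k] + mean
    return float(np.sqrt(np.mean((recon - curves) ** 2)))

# Synthetic stand-in for a training set: 20 smooth curves built from
# 5 sine modes with rapidly decaying amplitude (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
basis = np.stack([np.sin((j + 1) * np.pi * x) / (j + 1) ** 2 for j in range(5)])
curves = rng.normal(size=(20, 5)) @ basis

errors = {k: pca_truncation_rms(curves, k) for k in (3, 4, 5)}
```

Adding a component can never increase the truncation error, but each extra mode is also one more coefficient for the GP to fit from noisy data, which is why the choice has to be gated on held-out accuracy rather than on truncation error alone.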

n_restarts = 10 is now the suite-wide default as convergence insurance.
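
The effect of n_restarts can be seen on a toy problem. The sketch below is not the suite's optimizer: it uses plain gradient descent on a made-up one-dimensional multimodal surface standing in for a GP hyperparameter objective, and every name in it is hypothetical. It shows only the general mechanism, that each restart is a local fit from a random start and the best result is kept.

```python
import numpy as np

def objective(x):
    # Toy multimodal surface standing in for a negative log marginal likelihood.
    return np.sin(3.0 * x) + 0.1 * x ** 2

def gradient(x):
    return 3.0 * np.cos(3.0 * x) + 0.2 * x

def local_minimise(x0, step=0.05, n_steps=500):
    """Fixed-step gradient descent; converges to the nearest local minimum."""
    x = float(x0)
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x, float(objective(x))

def best_of_restarts(n_restarts, seed=0, lo=-6.0, hi=6.0):
    """Run one local fit per random start and keep the lowest objective."""
    rng = np.random.default_rng(seed)
    # Scalar draws, so a larger n_restarts reuses the earlier starts.
    starts = [rng.uniform(lo, hi) for _ in range(n_restarts)]
    results = [local_minimise(x0) for x0 in starts]
    return min(results, key=lambda r: r[1])

x3, f3 = best_of_restarts(3)
x10, f10 = best_of_restarts(10)
```

Because the ten starts include the first three, `f10` can only match or improve on `f3`. The cost is linear in n_restarts, which is the trade being accepted when it is raised as insurance.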

Rejected routes (recorded; do not re-implement)

  • Seed-paired boosts for f(R) b_cum and xi_hh_smallr (worse at the fiducial / noisier target) — both stay direct.
  • r-space / Hankel-hybrid anchoring of xi_mm on large scales (degrades the BAO).
  • A linear mean for the f(R) hmf boost (trades F5n1 for F6n1; raising n_restarts is the real fix).
  • matern kernel for any trained artifact (worse-to-catastrophic on these smooth targets).
  • Velocity standardized moments and r-binning/r_max trimming (no deployable gain; see Velocity).

Open items

  • Frozen-seed large-scale offset in ξ / ξ_hh (both gravities) — pk_mm is closed via the low-k anchor; ξ remains open. See Caveats.
  • f(R) high even velocity moments plateau at χ ≈ 1.4–2.3 — a design-sampling limit addressable with new simulations (f(R) gravity).