Objective
Design and lock the first-stage cheap screen of the basin-finder cascade. The cheap screen must rank-correlate with the 6D hamilton_loglik_cap_mc benchmark well enough to forward true elites to the medium-tier refiner, while running fast enough that screening tens of thousands of feasible candidates is wall-clock-tractable.
Setup
- Model / parameterization: 54-D constrained MSRE in phi-space.
- Evaluator: 11 candidate F1 screens implemented in 02_implementation/src/f1_candidates.jl (3D Hamilton, 3D Hamilton+BC yields, 6D variants, MC pricing variants F1-2/3/4 at varying mc_R).
- Comparator: F1-0 (3D Hamilton + MC bond yield lookup, ~1895 ms/eval) and the frozen 6D benchmark hamilton_loglik_cap_mc(mc_R=1000, mc_H_burn=150) on a 100-point stratified subset of the Exp 01 feasible archive.
- Acceptance rule: top-K recall matches the baseline, per-eval time is under the 100 ms F1 cost cap, and the screen is deterministic.
Procedure
- Score all 11 candidates plus F1-0 on the 100-point stratified panel. Full-panel scoring (2,516 points) was infeasible inside the 8-hour budget: the benchmark alone was estimated at 22.8 hours.
- Compute Spearman, Kendall tau, top-K recall (K=5,10,15,20,25), and recall/second per candidate.
- Independent referee2 audit re-derives all 12 reported metrics from the raw CSV tables and runs 1000-bootstrap adversarial checks.
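The rank metrics named in the procedure can be sketched in a few lines. This is a minimal, standard-library illustration and not the project's implementation: the function names, the toy scores, and the choice of Kendall tau-a (no tie correction) are assumptions, and higher scores are assumed to be better, as for a log-likelihood. Recall/second is left out because its exact definition is not stated here.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values 1..n in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(screen, bench):
    """Spearman rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(screen), average_ranks(bench))

def kendall_tau_a(screen, bench):
    """Kendall tau-a: (concordant - discordant) / total pairs, O(n^2)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(screen)), 2):
        s = (screen[i] - screen[j]) * (bench[i] - bench[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(screen) * (len(screen) - 1) / 2
    return (concordant - discordant) / n_pairs

def top_k_indices(scores, k):
    """Indices of the k highest scores (higher is better)."""
    return set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])

def top_k_recall(screen, bench, k):
    """Fraction of the benchmark's top-k points that the screen also ranks top-k."""
    return len(top_k_indices(screen, k) & top_k_indices(bench, k)) / k

# Toy scores, purely illustrative (not data from the experiment).
bench = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0]
screen = [9.5, 9.9, 7.0, 8.0, 1.0, 2.0]
print(spearman(screen, bench), kendall_tau_a(screen, bench), top_k_recall(screen, bench, 3))
```

In the toy data the screen swaps each adjacent pair, so the global correlation stays high while top-3 recall already drops to 2/3. That is the kind of gap between global and tail agreement that the top-K metrics are meant to expose.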
Results
- F1-0b (3D Hamilton-Kim + BC Riccati yields): Spearman 0.9596, top-25 recall 0.96, 0.9 ms/eval, deterministic — ~1550x speedup vs F1-0 with no ranking penalty.
- F1-1 (6D body with income) has identical top-K recall to F1-0/F1-0b at panel scale: income adds no detectable F1 ranking signal.
- All MC-based F1 candidates (F1-2/3/4 family): 914-6500 ms/eval — exceed the 100 ms F1 cost cap.
- Quintile-1 (top 20%) Spearman degrades to 0.38-0.41 for all viable candidates: fine-grained ordering among elites is noisy regardless of screen choice.
- Audit: referee2 reproduces all 12 metrics bit-identically. The fragility audit flags one open caveat: F1-0 vs F1-1 top-10 overlap drops to 20% on the full 2,516-point panel, where the benchmark could not be scored, so it is unknown which screen is closer to the benchmark at that scale.
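The audit's bootstrap procedure is not described in this entry. One plausible form of such a check, sketched here under that assumption, resamples panel points with replacement and recomputes top-K recall to get a percentile interval. The function names, the seed handling, and the 95% interval are hypothetical choices for illustration.

```python
import random

def top_k_indices(scores, k):
    """Indices of the k highest scores; ties are broken by position (stable sort)."""
    return set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])

def top_k_recall(screen, bench, k):
    """Fraction of the benchmark's top-k points that the screen also ranks top-k."""
    return len(top_k_indices(screen, k) & top_k_indices(bench, k)) / k

def bootstrap_recall_interval(screen, bench, k, n_boot=1000, seed=0):
    """Percentile 95% interval for top-k recall under point resampling.

    Each replicate draws len(screen) panel points with replacement and
    recomputes recall on the resampled pairs. Duplicated points create ties,
    which both rankings break in the same positional order.
    """
    rng = random.Random(seed)  # fixed seed keeps the check deterministic
    n = len(screen)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(top_k_recall([screen[i] for i in idx],
                                  [bench[i] for i in idx], k))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Synthetic scores, purely illustrative (not data from the experiment).
data_rng = random.Random(1)
bench = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
screen = [b + data_rng.gauss(0.0, 0.3) for b in bench]
lo, hi = bootstrap_recall_interval(screen, bench, k=25)
print(lo, hi)
```

A wide interval on a 100-point panel would be a warning that a single recall figure says little about behaviour on a larger archive.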
Analysis
The 1550x speedup comes entirely from eliminating the per-theta MC bond-yield lookup rebuild, which accounts for 99.95% of F1-0’s per-evaluation cost; the 3D Hamilton-Kim filter itself is ~1 ms. The result confirms the cascade-architecture hypothesis (a cheap deterministic filter is enough to forward true elites), but the rank-correlation evidence within the elite tail is weak: quintile-1 Spearman is only 0.38-0.41 for every viable candidate. F1-0b is locked as the F1 stage; the F1-2 family is forwarded to Exp 03 as natural F2 candidates.
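The cost argument can be checked with simple arithmetic on the timings quoted in this entry. The sketch below is illustrative only: the helper names are hypothetical, and the 50,000-candidate archive size is an assumed stand-in for "tens of thousands", not a figure from the experiment.

```python
F1_COST_CAP_MS = 100.0  # per-eval cap from the acceptance rule

# Per-eval timings quoted in this report, in ms. The MC entry uses the
# lower end of the reported 914-6500 ms range.
TIMINGS_MS = {"F1-0": 1895.0, "F1-0b": 0.9, "F1-2/3/4 (fastest)": 914.0}

HYPOTHETICAL_ARCHIVE_SIZE = 50_000  # assumption, for illustration only

def passes_cost_cap(ms_per_eval, cap_ms=F1_COST_CAP_MS):
    """True if a screen's per-eval time is strictly under the cap."""
    return ms_per_eval < cap_ms

def wall_clock_hours(n_candidates, ms_per_eval):
    """Serial wall-clock time, in hours, to score n_candidates."""
    return n_candidates * ms_per_eval / 1000.0 / 3600.0

def implied_seconds_per_eval(total_hours, n_points):
    """Per-eval cost implied by a total runtime estimate."""
    return total_hours * 3600.0 / n_points

for name, ms in TIMINGS_MS.items():
    hours = wall_clock_hours(HYPOTHETICAL_ARCHIVE_SIZE, ms)
    print(f"{name}: under cap={passes_cost_cap(ms)}, "
          f"{hours:.4f} h for {HYPOTHETICAL_ARCHIVE_SIZE} candidates")

# The 22.8-hour full-panel estimate over 2,516 points implies the
# benchmark's per-eval cost in seconds.
print(f"benchmark: {implied_seconds_per_eval(22.8, 2516):.1f} s/eval")
```

On these figures only F1-0b clears the 100 ms cap. The 22.8-hour estimate implies roughly 33 s per benchmark evaluation, which is why the benchmark was scored only on the 100-point panel.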
Claim updates
- state-dependent-transitions-improve-yield-fit: tested_by, strength moderate. The Hamilton-Kim filter at the F1 stage shows that 3D macro + analytical regime-switching yields are sufficient to rank parameter vectors, which is evidence that the regime-switching yield-model channel carries first-order screening information.
- yield-curve-information-sharpens-identification-monetary: tested_by, strength moderate. F1-0b adds yield curve information and matches the 6D benchmark ranking; F1-1 (with income) adds nothing further, isolating the yield curve as the operative information source at the F1 stage.
Follow-up
- Open: extreme-tail divergence between F1-0b and F1-1 on full panel — propose a targeted ~50-point adversarial study.
- Promotes the F1-2 family to Exp 03 as the F2 candidate set.
- See also: basin-finder-03-f2-medium-refiner, basin-finder-complete-program-summary.