Basin Finder 02: F1 Cheap-Screen Design (3D Hamilton-Kim + BC Riccati)

Objective

Design and lock the first-stage cheap screen of the basin-finder cascade. The cheap screen must rank-correlate with the 6D hamilton_loglik_cap_mc benchmark well enough to forward true elites to the medium-tier refiner, while running fast enough that screening tens of thousands of feasible candidates is wall-clock-tractable.

Setup

Model / parameterization: 54-D constrained MSRE in phi-space.
Evaluator: 11 candidate F1 screens implemented in 02_implementation/src/f1_candidates.jl (3D Hamilton, 3D Hamilton+BC yields, 6D variants, MC pricing variants F1-2/3/4 at varying mc_R).
Comparator: F1-0 (3D Hamilton + MC bond yield lookup, ~1895 ms/eval) and the frozen 6D benchmark hamilton_loglik_cap_mc (mc_R=1000, mc_H_burn=150) on a 100-point stratified subset of the Exp 01 feasible archive.
Acceptance rule: top-K recall must match the baseline, total per-eval time must stay under the 100 ms F1 cost cap, and the screen must be deterministic.
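The acceptance rule can be read as a three-part gate. The following is a minimal Python sketch of that reading, not the project's Julia code: the names `ScreenReport`, `is_deterministic`, and `passes_f1_gate` are hypothetical, "match the baseline" is interpreted as "no worse than the baseline at every reported K" (an assumption), and the only numbers used are the top-25 recall and timings reported in this note.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

F1_COST_CAP_MS = 100.0  # F1 cost cap stated in the acceptance rule


@dataclass
class ScreenReport:
    """Summary of one screen on the panel (hypothetical container)."""
    name: str
    topk_recall: Dict[int, float]  # K -> top-K recall against the benchmark
    ms_per_eval: float
    deterministic: bool


def is_deterministic(screen: Callable[[float], float], thetas: Sequence[float]) -> bool:
    """Evaluate every point twice and require identical outputs."""
    return all(screen(t) == screen(t) for t in thetas)


def passes_f1_gate(candidate: ScreenReport, baseline: ScreenReport, tol: float = 0.0) -> bool:
    """Apply the three acceptance conditions.

    Assumption: "match the baseline" means recall >= baseline recall - tol
    at every K the baseline reports.
    """
    recall_ok = all(
        candidate.topk_recall[k] >= baseline.topk_recall[k] - tol
        for k in baseline.topk_recall
    )
    cost_ok = candidate.ms_per_eval < F1_COST_CAP_MS
    return recall_ok and cost_ok and candidate.deterministic


# Values taken from this note: F1-0b shows no ranking penalty against F1-0,
# so both carry the reported top-25 recall of 0.96.
f1_0 = ScreenReport("F1-0", {25: 0.96}, ms_per_eval=1895.0, deterministic=True)
f1_0b = ScreenReport("F1-0b", {25: 0.96}, ms_per_eval=0.9, deterministic=True)

print(passes_f1_gate(f1_0b, f1_0))  # recall matches and cost is under the cap
print(passes_f1_gate(f1_0, f1_0))   # the baseline itself breaks the cost cap
```

Keeping the three conditions separate makes it easy to report which one a rejected candidate failed.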

Procedure

Score all 11 candidates plus F1-0 on the 100-point stratified panel. Full-panel scoring (2,516 points) was infeasible inside the 8-hour budget (estimated 22.8 hours for the benchmark).
Compute Spearman, Kendall tau, top-K recall (K=5,10,15,20,25), and recall/second per candidate.
Independent referee2 audit re-derives all 12 reported metrics from the raw CSV tables and runs 1000-bootstrap adversarial checks.
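For reference, the ranking metrics named in the procedure can be sketched in plain Python as below. This is an illustrative stand-alone version, not the project's implementation. It assumes higher scores are better (as for a log-likelihood), uses the tau-a variant of Kendall's tau, and defines recall/second as recall divided by per-eval time in seconds. All three choices are assumptions, since the note does not spell them out.

```python
from itertools import combinations


def average_ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as zero."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)


def top_k_recall(screen, bench, k):
    """Fraction of the benchmark's top-k points that the screen also ranks top-k."""
    def top(v):
        return set(sorted(range(len(v)), key=lambda i: v[i], reverse=True)[:k])
    return len(top(screen) & top(bench)) / k


def recall_per_second(recall, ms_per_eval):
    """Assumed definition: recall divided by per-eval time in seconds."""
    return recall / (ms_per_eval / 1000.0)


# Toy scores (synthetic, not from the experiment): one adjacent swap.
bench = [5.0, 4.0, 3.0, 2.0, 1.0]
screen = [5.0, 3.0, 4.0, 2.0, 1.0]
print(spearman(screen, bench), kendall_tau_a(screen, bench))
print([top_k_recall(screen, bench, k) for k in (2, 3)])
```

On the toy data a single swap near the top leaves the global correlations high while top-2 recall falls to one half, which is why the procedure reports top-K recall alongside Spearman and Kendall tau.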

Results

F1-0b (3D Hamilton-Kim + BC Riccati yields): Spearman 0.9596, top-25 recall 0.96, 0.9 ms/eval, deterministic — ~1550x speedup vs F1-0 with no ranking penalty.
F1-1 (6D body with income) has identical top-K recall to F1-0/F1-0b at panel scale: income adds no detectable F1 ranking signal.
All MC-based F1 candidates (F1-2/3/4 family): 914-6500 ms/eval, exceeding the 100 ms F1 cost cap.
Quintile-1 (top 20%) Spearman degrades to 0.38-0.41 for all viable candidates: fine-grained ordering among elites is noisy regardless of screen choice.
Audit: referee2 reproduces all 12 metrics bit-identically. The fragility audit flags one open caveat: F1-0 vs F1-1 top-10 overlap drops to 20% on the full 2,516-point panel, where the benchmark could not be scored.
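The note does not describe how the referee's bootstrap checks were built. As one plausible shape for such a check, the sketch below resamples the panel with replacement 1,000 times and reports a percentile interval for top-K recall. The function names, the seed, and the 95% interval are all assumptions, and the data is synthetic.

```python
import random


def top_k_recall(screen, bench, k):
    """Fraction of the benchmark's top-k positions that the screen also ranks top-k."""
    def top(v):
        return set(sorted(range(len(v)), key=lambda i: v[i], reverse=True)[:k])
    return len(top(screen) & top(bench)) / k


def bootstrap_recall_interval(screen, bench, k, n_boot=1000, seed=0):
    """Percentile 95% interval for top-k recall under panel resampling.

    Resampling with replacement duplicates some points; duplicates tie in
    both score lists and are ordered by position in both, so they are
    treated consistently.
    """
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    n = len(bench)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            top_k_recall([screen[i] for i in idx], [bench[i] for i in idx], k)
        )
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Synthetic 100-point panel: the screen is the benchmark plus small noise.
gen = random.Random(1)
bench = [gen.gauss(0.0, 1.0) for _ in range(100)]
screen = [b + gen.gauss(0.0, 0.2) for b in bench]

low, high = bootstrap_recall_interval(screen, bench, k=25)
print(f"top-25 recall 95% interval: [{low:.2f}, {high:.2f}]")
```

A wide interval on a 100-point panel would signal that recall differences between candidates are within resampling noise.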

Analysis

The 1550x speedup comes entirely from eliminating the per-theta MC bond-yield lookup rebuild, which accounts for 99.95% of F1-0’s per-evaluation cost; the 3D Hamilton-Kim filter itself is ~1 ms. The result supports the cascade-architecture hypothesis (a cheap deterministic filter is enough to forward true elites), but rank-correlation evidence within the top quintile is weak (Spearman 0.38-0.41). F1-0b is locked as the F1 stage; the F1-2 family is forwarded to Exp 03 as natural F2 candidates.
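As a quick check on the cost share, take the ~1895 ms F1-0 per-eval time from Setup and the ~1 ms filter cost quoted above. The symbols $t_{\text{F1-0}}$ and $t_{\text{filter}}$ are notation introduced here for illustration.

```latex
\[
\frac{t_{\text{F1-0}} - t_{\text{filter}}}{t_{\text{F1-0}}}
\approx \frac{1895\,\text{ms} - 1\,\text{ms}}{1895\,\text{ms}}
\approx 0.9995
\]
```

This agrees with the stated 99.95% share attributed to the MC bond-yield lookup rebuild.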

Claim updates

state-dependent-transitions-improve-yield-fit: tested_by, strength moderate. The Hamilton-Kim filter at the F1 stage shows that 3D macro + analytical regime-switching yields are sufficient to rank parameter vectors, which is evidence that the regime-switching yield-model channel carries first-order screening information.
yield-curve-information-sharpens-identification-monetary: tested_by, strength moderate. F1-0b adds yield-curve information and closely tracks the 6D benchmark ranking (Spearman 0.9596); F1-1 (with income) adds nothing further, isolating the yield curve as the operative information source at the F1 stage.

Follow-up

Open: extreme-tail divergence between F1-0b and F1-1 on the full panel. Proposed: a targeted ~50-point adversarial study.
Promotes the F1-2 family to Exp 03 as the F2 candidate set.
See also: basin-finder-03-f2-medium-refiner, basin-finder-complete-program-summary.
