Problem
Between 2026-04-11 and 2026-04-12, three experiments closed (Exp 10, 11, 12). All three were originally intended to deliver operational improvements for the production pipeline; none of them did. Before the pipeline is frozen, it needs to be stated clearly what the three closures did establish, what they removed from the option menu, and what changed relative to the 2026-04-10 design-brief-global-optimization-pipeline.
Key idea
The final-day sprint removed alternatives rather than opening better ones. The production conclusion is now more conservative but more stable: keep the hybrid Hamilton-Kim + MC cap-rate objective, absorbing NBC, the 54-D constrained parameterization, and BOBYQA multi-start + paired high-R verification; do not ship adaptive splice logic, strict θ-only tail caching, or Exp 11 curvature output for restart scaling. The single cheap follow-up still worth doing before freeze is false-basin classification.
The sprint also strengthened the interpretation that the main unresolved
approximation issue is the tail operator itself (N3: η_H · r/(1-r)
with a single global r), not splice timing, not θ-only caching, and not
curvature-based restart scaling.
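As a reading aid, the N3 expression can be viewed as the closed form of a geometric series, since r/(1-r) equals the sum of r^k over k ≥ 1 whenever 0 ≤ r < 1. The sketch below is a minimal illustration under that reading; the function names and all numbers are hypothetical and are not taken from the pipeline. It also shows why a single global r cannot reproduce two different per-state decay rates at once, which is the shape of the concern, not evidence for it.

```python
def geometric_tail(eta_h: float, r: float) -> float:
    """Closed form eta_H * r / (1 - r) = eta_H * sum_{k>=1} r**k, for 0 <= r < 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("geometric tail needs 0 <= r < 1")
    return eta_h * r / (1.0 - r)


def summed_tail(eta_h: float, r: float, n_terms: int) -> float:
    """Explicit partial sum of the same series, for comparison with the closed form."""
    return eta_h * sum(r ** k for k in range(1, n_terms + 1))


# Illustrative values only (hypothetical, not from Exp 09-12).
eta_h = 2.0
state_rates = [0.5, 0.8]                      # hypothetical state-dependent decay rates
global_r = sum(state_rates) / len(state_rates)

per_state = [geometric_tail(eta_h, r) for r in state_rates]
single = geometric_tail(eta_h, global_r)

print("per-state tails:", per_state)
print("single-global-r tail:", single)
```

Because r/(1-r) is strongly nonlinear near r = 1, the single-r tail falls between the per-state tails and matches neither.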
Method
For each of Exp 10 / 11 / 12, the synthesis asks:
- What was decided?
- What was tested?
- What did we learn?
- What is the pipeline implication?
Section 2 then cuts across the three experiments and compares them with the
pre-sprint synthesis (design-brief-global-optimization-pipeline): what changed
materially, what did not change, and what new open questions appeared.
Results
Exp 10 (strict θ-only hybrid tail cache) → KILL_NEGATIVE_LEVERAGE. The
strict θ-only operator cache inside the hybrid evaluator is bit-identical to
the fixed-burn baseline, preserves ranking, and gives no measurable wall-time
gain (mean speedup 1.0006, median 0.9956). The dominant cost remains in
seed-dependent objects (decay-table build, Q-table sweep). A latent
cache-contract hazard is logged: MCDynamicsCache is not actually a safe
θ-only object if differs across compatible
expansions; any future cache layer must key on something stronger than the
current hash. See exp-10-hybrid-tail-cache-theta-only.
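One generic way to make a cache key "stronger" is to hash every input the cached object actually depends on, including dtype and shape, rather than θ alone. The sketch below is an assumption-laden illustration: `cache_key` is a hypothetical helper, not part of MCDynamicsCache, and the second array is only a stand-in for whatever expansion-dependent quantity the hazard refers to.

```python
import hashlib

import numpy as np


def cache_key(*arrays: np.ndarray, tag: str = "") -> str:
    """Content hash over every array the cached object depends on (hypothetical helper)."""
    h = hashlib.sha256()
    h.update(tag.encode("utf-8"))
    for a in arrays:
        a = np.ascontiguousarray(a)
        # Include dtype and shape so reinterpretations of the same bytes do not collide.
        h.update(str(a.dtype).encode("utf-8"))
        h.update(str(a.shape).encode("utf-8"))
        h.update(a.tobytes())
    return h.hexdigest()


theta = np.arange(3.0)             # stand-in for the parameter vector
expansion_a = np.zeros(2)          # hypothetical expansion-dependent input
expansion_b = np.ones(2)

# Same theta, different expansion input: the keys differ, so no stale hit.
print(cache_key(theta, expansion_a) == cache_key(theta, expansion_b))
```

The design choice is to fail safe: a key that is too specific costs cache misses, whereas a key that is too weak silently returns an object built for different inputs.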
Exp 11 (block-relative curvature at incumbent) → DIAGNOSTIC_ONLY +
passes_with_major_fragility. Block-relative deterministic curvature at
ep06c_polished on eval_nll_hamilton gives top-2 stiff directions in the
Monetary means block that are stencil-robust, but the Auditor’s central-FD
pipeline revealed a spec-level forward-FD diagonal stencil bug: the formula
H[i,i] = 2(f_+ − f_0)/h² is the gradient-free stencil valid only at a
critical point, and ep06c_polished is polished on eval_nll_sim not
eval_nll_hamilton, so the contamination term reaches
in the Monetary means block. Ranks 3–5 are
sign-flipped under central FD; rank-5 block dominance drops 1.000 → 0.396.
rotation_scaling.json is not fit for restart preconditioning. See
exp-11-curvature-incumbent-scaled-diagnostic.
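The stencil issue follows from a Taylor expansion: f(x + h·e_i) = f_0 + h·g_i + (h²/2)·H[i,i] + O(h³), so 2(f_+ − f_0)/h² = H[i,i] + 2·g_i/h + O(h). The extra 2·g_i/h term vanishes only where the gradient is zero, while the central stencil (f_+ − 2f_0 + f_−)/h² cancels it everywhere. The sketch below demonstrates this on a toy quadratic; the objective, evaluation points, and step size are hypothetical and unrelated to eval_nll_hamilton.

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 6.0]])   # true Hessian of the toy objective
b = np.array([1.0, -1.0])


def f(x: np.ndarray) -> float:
    """Toy quadratic: gradient is A @ x - b, Hessian is A."""
    return 0.5 * x @ A @ x - b @ x


def diag_forward(f, x: np.ndarray, h: float) -> np.ndarray:
    """Forward stencil 2(f_+ - f_0)/h^2; correct only where the gradient is zero."""
    f0 = f(x)
    out = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        out[i] = 2.0 * (f(x + e) - f0) / h**2
    return out


def diag_central(f, x: np.ndarray, h: float) -> np.ndarray:
    """Central stencil (f_+ - 2 f_0 + f_-)/h^2; first-order terms cancel."""
    f0 = f(x)
    out = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        out[i] = (f(x + e) - 2.0 * f0 + f(x - e)) / h**2
    return out


h = 1e-2
x_off = np.array([0.3, 0.2])       # not a critical point of f
x_star = np.linalg.solve(A, b)     # critical point of f

print("true diagonal:        ", np.diag(A))
print("forward, off-critical:", diag_forward(f, x_off, h))
print("central, off-critical:", diag_central(f, x_off, h))
print("forward, at critical: ", diag_forward(f, x_star, h))
```

Away from the critical point the forward estimate can even change sign, because the 2·g_i/h term grows as h shrinks; this is the same mechanism as an incumbent polished on one objective being evaluated on another.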
Exp 12 (per-period integer online onset detection) → KILL_DETECTOR_FAILS.
Phase A Stage-0 oracle sweep fires PROCEED_TO_PHASE_B at , but the
Auditor finds the PROCEED is seed-conditional (2/6 fresh seeds KILL G3, and
the Builder’s seed is the max of 6) and -conditional (G3 collapses 11.4×
from to , ~4× faster than the MC-noise-floor
shrinkage, gate flips PROCEED → KILL between and ). The
pre-registered R3 mid-Phase-B checkpoint fails C1 on both seed schedules at
(O1/G09 median ratio 1.107 / 1.080, i.e. O1 is 8–11%
worse than G09 on pricing error), even though O1 is ~10% faster, 10×
smoother on chatter, and 10× lower on fallback rate. Root cause: the
pre-registered calibration objective is ~180× dominated by the
oracle-alignment term, implicitly turning O1 into an oracle-matcher rather
than an -minimizer. See
exp-12-online-asymptotic-onset-detection.
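A cheap guard against the root cause above is to report the relative magnitude of each term in a mixed objective before calibrating against it. The sketch below is a minimal illustration: the function names and term labels are hypothetical, and the term magnitudes are placeholders chosen only to mirror the ~180× ratio, not measured values. The second helper converts a median error ratio into a percentage, as in the 1.107 / 1.080 comparison.

```python
def dominance_ratio(terms: dict[str, float]) -> float:
    """Largest term magnitude divided by the sum of all other term magnitudes."""
    mags = sorted((abs(v) for v in terms.values()), reverse=True)
    rest = sum(mags[1:])
    if rest == 0.0:
        raise ValueError("need at least two non-zero terms to compare")
    return mags[0] / rest


def pct_worse(median_ratio: float) -> float:
    """Convert an error ratio (candidate / comparator) to percent worse."""
    return (median_ratio - 1.0) * 100.0


# Placeholder magnitudes (hypothetical), chosen to mirror the reported ~180x ratio.
objective_terms = {"oracle_alignment": 180.0, "pricing_error": 1.0}
print("dominance ratio:", dominance_ratio(objective_terms))

# Median ratios quoted in the text for the two seed schedules.
for ratio in (1.107, 1.080):
    print(f"ratio {ratio}: {pct_worse(ratio):.1f}% worse")
```

A ratio far above 1 signals that the optimizer is effectively minimizing only the dominant term, whatever the objective was intended to balance.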
Material changes vs pre-sprint synthesis.
- Tail misspecification is now the leading unresolved approximation hypothesis. Exp 09 showed splice timing matters; Exp 12 shows finer onset detection does not rescue the approximation where it matters.
- Strict θ-only caching is no longer a live production idea.
- Curvature-based restart scaling is no longer a live production idea (until a clean central-FD follow-up 11bis run).
- Four new gotchas logged from Exp 12 alone: oracle-distance penalty silently dominating a mixed calibration objective (180× ratio); chatter/flip-rate metrics must be well-defined on the comparator arm first; gate thresholds set within ~1σ of seed-wise noise floor flip on seed resample; fine-grid argmin exploits finite- MC noise against a coarse-grid argmin.
- Three gotchas from Exp 11: gradient-contaminated FD stencil, cross-objective non-criticality, same-stencil “verifier independence” is tautological.
Limitations
- Single-author, single-day snapshot. Not yet cross-reviewed by a librarian or auditor session; the synthesis is a “what we know now” memo, not a referee report.
- Does not resolve the 154-nat MLE−DGP gap. Still outside scope.
- N3 operator-floor hypothesis is the leading interpretation, not a claim. Exp 12 does not prove the tail operator is the bottleneck; it rules out onset detection as a rescue.
Open questions
- Does the geometric tail operator η_H · r/(1-r) with a single global r under-fit the state-dependent decay geometry? Testing this is operator-floor null N3, not yet opened.
- Should false-basin classification (label-symmetry / gap-of-gaps) be run pre-freeze, and if so under what decision rule?
- Is there a clean central-FD curvature rerun (Exp 11b) that would restore rotation_scaling.json to operational preconditioning?
My take
The value of this memo is exactly that it removes rather than adds. Three experiments that all had plausible upsides at the time of the design brief closed without producing any, and the memo is the one place where the removals are written down before they fade into undocumented tribal knowledge. It is the natural companion to design-brief-global-optimization-pipeline — the brief gives the pre-sprint menu; this memo gives the post-sprint cuts.
Related
- design-brief-global-optimization-pipeline — the pre-sprint synthesis this memo updates.
- exp-10-hybrid-tail-cache-theta-only — the KILL_NEGATIVE_LEVERAGE result.
- exp-11-curvature-incumbent-scaled-diagnostic — the DIAGNOSTIC_ONLY + fragility result.
- exp-12-online-asymptotic-onset-detection — the KILL_DETECTOR_FAILS result.
- exp-09-adaptive-tail-splice-mc-pricing — the splice-timing predecessor that Exp 12 attempted to improve upon.
- david-leather, jacob-sagi — project PIs.