A 54-Dimensional Constrained Stochastic Optimization Benchmark from Regime-Switching Asset Pricing

Problem

The project has a concrete, externally-verifiable optimization problem that has resisted five fundamentally different strategies over a decade: quasi-random grid search with a closed-form likelihood approximation, MCMC with absorbing boundaries, neural-network surrogates, temporal-difference learning on the pricing recursion, and value-function iteration on a discretized state space. This note presents the problem in a form that the optimization research community can engage with without needing to understand the economics: a 54-dimensional constrained stochastic maximum likelihood instance arising from a Markov-switching rational-expectations model of commercial real estate prices.

Key idea

The CRE estimation task is recast as a benchmark instance for derivative-free, constrained, stochastic optimization research. The note gives an operational “interface” that a black-box solver sees (box-bounded $θ \in \mathbb{R}^{54}$, seven implicit feasibility constraints each with a continuous margin function, a noisy objective with three fidelity levels, CRN support), together with the empirical record of twelve systematic experiments and a characterization of what has and has not worked.

Method

Decision variable. $θ \in \mathbb{R}^{54}$ with simple box bounds $ℓ_{i} \leq θ_{i} \leq u_{i}$; the reduction from 55 raw parameters to 54 is via the $m_{g}$ IS-stationarity constraint, which gave a 70,350× gradient-conditioning improvement (see coupled-riccati-recursions-regime-switching-asset).
Objective. Noisy oracle $\hat{f} (θ; ω)$ returning an estimate of $- L (θ)$ under a Rao-Blackwellized particle filter over a quarterly panel of $T = 119$ observations; piecewise-smooth under CRN.
Noise taxonomy. Fresh $ω$ per call → std ≈ 217 nats; CRN with per-eval MC rebuild → std ≈ 2 nats; CRN with cached MC lookup → deterministic but biased (+5269 nats under a fixed lookup).
Three fidelity levels. $F_{1}$ (Hamilton-Kim + BC Riccati, 0.9 ms, rank correlation $ρ = 0.96$ with the benchmark objective), $F_{2}$ (Hamilton + MC cap rates $R = 100$, 2.2 s, $ρ = 0.996$), and the full evaluation (per-eval MC rebuild, $R = 100$ paths, horizon $H = 150$, 5.2 s on 8 threads).
Seven implicit constraints with continuous margins $m_{k}(θ)$: box constraints ($m_{1}$), determinant barrier ($m_{2}$, smooth algebraic surface), ordering inequalities ($m_{3}$, linear half-spaces), RE fixed-point existence / spectral radius ($m_{4}$, combining Cho–Moreno forward-iteration convergence with Kronecker-product spectral radius gaps; see forward-method-rational-expectations, forward-convergence-condition), risk-neutral stability ($m_{5}$; at the current best point $m_{5} \approx 0.005$, within 0.5% of the boundary), coupled Riccati convergence / no-bubble condition ($m_{6}$, log-price-growth rate; see no-bubble-condition, spectral-radius-test-markov-pricing-operator), and a long-run variance positive-definiteness condition ($m_{7}$).
Available surrogates. A GBT feasibility classifier with AUC 0.997 at 0.01 ms per query (130× faster than the Boolean oracle), plus the three objective-fidelity levels above.
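The interface described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the class `BlackBoxProblem`, the helper `toy_nll`, the single ball-shaped margin, and the sign convention "margin > 0 means feasible" are all assumptions made for the example. The toy objective stands in for the particle-filter oracle only to show how passing the same seed (CRN) makes the noise cancel when two points are compared.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


@dataclass
class BlackBoxProblem:
    """Hypothetical solver-facing view: box bounds, margins, noisy oracle."""
    lower: Vector
    upper: Vector
    # Assumed convention: m_k(theta) > 0 means constraint k is satisfied.
    margins: Sequence[Callable[[Vector], float]]
    noisy_nll: Callable[[Vector, random.Random], float]

    def min_margin(self, theta: Vector) -> float:
        # Smallest margin = signed "distance" to the nearest constraint.
        return min(m(theta) for m in self.margins)

    def is_feasible(self, theta: Vector) -> bool:
        in_box = all(lo <= t <= hi
                     for lo, t, hi in zip(self.lower, theta, self.upper))
        return in_box and self.min_margin(theta) > 0.0

    def evaluate(self, theta: Vector, seed: Optional[int] = None) -> float:
        # seed=None draws fresh randomness; a fixed seed gives common random numbers.
        rng = random.Random(seed)
        return self.noisy_nll(theta, rng)


def toy_nll(theta: Vector, rng: random.Random) -> float:
    # Stand-in objective (NOT the RBPF likelihood): quadratic plus MC noise.
    signal = sum(t * t for t in theta)
    noise = sum(rng.gauss(0.0, 1.0) for _ in range(100)) / 100
    return signal + noise


problem = BlackBoxProblem(
    lower=[-1.0] * 54,
    upper=[1.0] * 54,
    margins=[lambda th: 0.5 - sum(t * t for t in th)],  # toy margin
    noisy_nll=toy_nll,
)

theta_a = [0.05] * 54
theta_b = [0.06] * 54
# Under CRN the same noise draw enters both evaluations and cancels.
crn_diff = problem.evaluate(theta_b, seed=7) - problem.evaluate(theta_a, seed=7)
```

In this toy setting `crn_diff` recovers the noiseless difference between the two points, which is the property a derivative-free solver relies on when it compares candidates under CRN.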

Results

Feasible volume fraction: $\approx 3\%$ of the prior box.
Feasible at Euclidean distance $r = 0.001$ from the current best known point: 62%. At $r = 0.01$: 1.9%. At $r = 0.2$: 0% (of 10,000 draws). The feasible set is locally dense near good points but has a very small global measure.
225 multi-start runs yielded 225 distinct terminal basins (no detected convergence to a common point), indicating either severe multi-modality or a flat manifold of near-optima.
NLL at the DGP parameter value on the simulated panel: $-5{,}386$.
Best NLL achieved at time of writing: $-5{,}020$ (this note reflects the state before the recent arc; as of 2026-04-12 the incumbent is ep06c_polished at $-5{,}381.59 \pm 0.15$ SEM at $R = 10^{5}$, closing most of that gap; see exp-07-optarch-10-restart-allocation-incumbent).
BOBYQA dominates for noisy MC objectives in 54-D; L-BFGS fails (gradient cost prohibitive under per-eval MC rebuild); Nelder-Mead stagnates; MCMC rejects too often at the feasibility boundary.
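The feasibility-versus-radius figures above are the kind of quantity a short sampling routine produces. The sketch below is one plausible way to estimate them, assuming draws are uniform on the sphere of radius $r$ around a center point; the note does not state the exact sampling scheme, and the function name and the toy cube-shaped feasible set are hypothetical.

```python
import math
import random
from typing import Callable, Sequence


def feasible_fraction_at_radius(
    center: Sequence[float],
    r: float,
    is_feasible: Callable[[Sequence[float]], bool],
    n_draws: int = 2000,
    seed: int = 0,
) -> float:
    """Share of points at Euclidean distance r from `center` that are feasible."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # A normalized Gaussian vector is uniformly distributed on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in center]
        norm = math.sqrt(sum(d * d for d in direction))
        point = [c + r * d / norm for c, d in zip(center, direction)]
        hits += bool(is_feasible(point))
    return hits / n_draws


# Toy feasible set (an assumption, not the benchmark's): a small cube in 54-D.
def in_small_cube(x: Sequence[float]) -> bool:
    return max(abs(v) for v in x) < 0.01


origin = [0.0] * 54
frac_near = feasible_fraction_at_radius(origin, 0.005, in_small_cube)
frac_mid = feasible_fraction_at_radius(origin, 0.03, in_small_cube)
frac_far = feasible_fraction_at_radius(origin, 0.2, in_small_cube)
```

Even for this simple toy set the fraction falls from 1 near the center to 0 at $r = 0.2$, the same qualitative pattern as the benchmark's locally dense but globally tiny feasible region.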

Limitations

Benchmark reference only. The note is not itself a research claim about the CRE model or its identification; its role is to expose the problem to external reasoners who may have better tools than the ones tried to date.
Sample-specific optimum. On the simulated panel, the sample MLE sits 154 nats below the DGP parameter value at $R = 10^{5}$; the optimum is panel-specific, not DGP-specific. The benchmark treats the DGP NLL as a reference point, not as a claim about the global minimum.
Noise taxonomy is protocol-dependent. Standard deviations are reported at the current best point; they change across the feasible set, and they are qualitatively different under cached lookup versus per-eval MC rebuild.
$F_{1}$ at 0.96 rank correlation is not trustworthy for final polishing. The Hamilton filter ignores cap rates and therefore produces a 1572-nat systematic gap vs the RBPF objective, so it is useful for screening only.
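The "screening only" role of a biased low-fidelity objective can be illustrated with a small sketch. A constant offset leaves the ranking of candidates unchanged, so a cheap level can still shortlist points for the expensive objective even when its absolute values are far off. Everything below is hypothetical: the synthetic NLL values, the noise scale of 20, the shortlist size, and the helper names are chosen for illustration, with only the 1572-nat offset taken from the text.

```python
import random
from typing import Callable, List, Sequence, Tuple


def ranks(values: Sequence[float]) -> List[int]:
    # Rank 0 = smallest value; assumes no ties.
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    # Classic rank-correlation formula, valid when there are no ties.
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def screen_then_refine(
    candidates: Sequence[int],
    cheap: Callable[[int], float],
    expensive: Callable[[int], float],
    keep: int,
) -> Tuple[int, List[int]]:
    # Shortlist by the cheap fidelity, then pick the best by the expensive one.
    shortlist = sorted(candidates, key=cheap)[:keep]
    return min(shortlist, key=expensive), shortlist


rng = random.Random(0)
# Synthetic stand-ins: "true" NLL values and a biased, noisy cheap level.
true_nll = [rng.uniform(-5400.0, -5000.0) for _ in range(200)]
offset_only = [v + 1572.0 for v in true_nll]
cheap_nll = [v + 1572.0 + rng.gauss(0.0, 20.0) for v in true_nll]

rho_offset = spearman(true_nll, offset_only)   # pure offset keeps the ranking
rho_noisy = spearman(true_nll, cheap_nll)      # noise degrades it
best, shortlist = screen_then_refine(
    range(200), cheap_nll.__getitem__, true_nll.__getitem__, keep=20
)
```

The expensive objective is called only on the shortlist, which is where the cost saving comes from. The same sketch also shows the limit: once candidates differ by less than the cheap level's noise, the shortlist ordering carries no information and final polishing has to use the full objective.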

Open questions

Can a derivative-free optimizer exploit the known structure of $m_{4}$ and $m_{6}$ (spectral-radius-of-Kronecker-product sublevel sets) rather than treating feasibility as a black box?
Is there a structured multi-fidelity scheme that avoids the rank-correlation collapse at the ~5-nat scale where the $F_{2}$ and full objective stop agreeing?
What is the right way to handle the 154-nat displacement between the sample MLE and the DGP? Is it finite-sample overfit ($T = 119$, $n_{θ} = 54$), weakly identified nuisance directions absorbing sample noise, or a residual objective-DGP mismatch?
Does the label-symmetry / gap-of-gaps idea reliably classify a candidate terminal basin as a false basin vs a real local optimum?
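For scale on the 154-nat displacement question above, a rough reference is the textbook likelihood-ratio asymptotics (Wilks' theorem). This is only a back-of-envelope yardstick: it assumes a correctly specified, well-identified model with the true parameter in the interior of the feasible set and a sample large relative to $n_{θ}$, none of which is established here. Writing $\ell$ for the log-likelihood, $\hat{θ}$ for the sample MLE and $θ_{0}$ for the DGP value:

$$
2\,\bigl[\ell(\hat{θ}) - \ell(θ_{0})\bigr] \;\xrightarrow{d}\; \chi^{2}_{54},
\qquad
\mathbb{E}\bigl[\ell(\hat{θ}) - \ell(θ_{0})\bigr] \approx \tfrac{54}{2} = 27 \text{ nats},
\qquad
\operatorname{sd} \approx \tfrac{\sqrt{2 \cdot 54}}{2} \approx 5.2 \text{ nats}.
$$

Under these assumptions ordinary finite-sample overfit would account for roughly 27 nats, so a 154-nat gap sits about $(154 - 27)/5.2 \approx 24$ standard deviations above the reference. That suggests overfit in the regular-asymptotics sense is unlikely to explain the whole displacement, although with $T = 119$, a near-binding $m_{5}$, and a Monte Carlo objective the approximation itself may be poor.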

My take

This is the paper form of the project’s central optimization pain point, packaged as an invitation. It consolidates the empirical lessons of the twelve recent experiments — most of them ingested as wiki experiments under this init (see exp-11-curvature-incumbent-scaled-diagnostic, exp-12-online-asymptotic-onset-detection, exp-10-hybrid-tail-cache-theta-only, exp-09-adaptive-tail-splice-mc-pricing, exp-07-optarch-10-restart-allocation-incumbent, basin-finder-complete-program-summary, sim-recovery-experiment-local-global) — into a single self-contained benchmark specification. The companion synthesis document design-brief-global-optimization-pipeline is a longer internal brief for the same audience, with more CRE-model context.

It is also the natural target of the global-optimization-pipeline-msre-cre research idea: once the pipeline is formalized, this note is the artifact against which it is benchmarked.

Related

leather-sagi-markov-switching-cre-asset-pricing — the underlying model.
riccati-equations-leather-sagi — the analytic pricing derivation that defines $m_{6}$ (coupled Riccati convergence).
cho-moreno-2010-forward-method-rational-expectations — defines the forward iteration underlying $m_{4}$.
costa-fragoso-marques-mjls-textbook — the MJLS spectral radius theory underlying $m_{4}$ and $m_{5}$.
design-brief-global-optimization-pipeline — companion synthesis for the same external audience.
global-optimization-pipeline-msre-cre — the active research idea this benchmark is written to support.
exponential-quadratic-asset-pricing-factors — the pricing kernel structure.
markov-switching-rational-expectations-cre-pricing — the model class.
david-leather, jacob-sagi — authors.
