Definition

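A Switching Kalman Filter (SKF), also called a Switching State-Space Model, Switching Linear Dynamical System, Jump Markov Linear System, or Conditional Gaussian State-Space Model, is a state-space model in which a discrete latent mode variable $s_t \in \{1, \dots, M\}$, itself a Markov chain with transition matrix $Z$, selects which set of linear-Gaussian dynamics and/or observation matrices is active at time $t$:

$$x_t = A_{s_t} x_{t-1} + w_t, \qquad w_t \sim \mathcal{N}(0, Q_{s_t})$$
$$y_t = C_{s_t} x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R_{s_t})$$
$$P(s_t = j \mid s_{t-1} = i) = Z_{ij}$$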

Conditional on a regime path $s_{1:T}$, the model is exactly a (time-varying) linear-Gaussian state-space model, so the continuous state is conditionally Kalman-filterable. The full filtering posterior $p(x_t \mid y_{1:t})$ is a mixture of $M^t$ Gaussians, with one component per regime history $s_{1:t}$.
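
The generative process described above can be sketched for a scalar state. This is a minimal illustration, not code from any particular library: the function name `simulate_skf`, the zero initial state, and all parameter values are assumptions chosen for the example.

```python
import random

def simulate_skf(T, Z, pi, a, q, c, r, seed=0):
    """Sample modes, states and observations from a scalar SKF.

    Z[i][j] = P(s_t = j | s_{t-1} = i); pi is the initial mode distribution;
    a, q, c, r are per-mode lists (dynamics coefficient, process variance,
    observation gain, observation variance).
    """
    rng = random.Random(seed)
    n_modes = len(pi)
    modes, states, obs = [], [], []
    x = 0.0  # assumed initial state (illustrative choice)
    for t in range(T):
        # The mode follows a Markov chain: pi at t = 0, then a row of Z.
        weights = pi if t == 0 else Z[modes[-1]]
        s = rng.choices(range(n_modes), weights=weights)[0]
        # Given the mode, the state and observation are linear-Gaussian.
        x = a[s] * x + rng.gauss(0.0, q[s] ** 0.5)
        y = c[s] * x + rng.gauss(0.0, r[s] ** 0.5)
        modes.append(s)
        states.append(x)
        obs.append(y)
    return modes, states, obs

# Made-up two-mode example: a quiet persistent mode and a noisy one.
modes, states, obs = simulate_skf(
    T=50,
    Z=[[0.95, 0.05], [0.10, 0.90]],
    pi=[0.5, 0.5],
    a=[0.99, 0.50], q=[0.01, 1.00],
    c=[1.0, 1.0], r=[0.1, 0.1],
)
```

Once `modes` is fixed, `states` and `obs` follow an ordinary time-varying linear-Gaussian model, which is why a standard Kalman filter applies conditional on the regime path.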

Intuition

An SKF is what you build when (a) you know your dynamics or observation model are piecewise linear, governed by a discrete mode (e.g. tracking a manoeuvring aircraft, where horizontal motion and vertical motion are different sub-models), or (b) you want to model non-Gaussian noise as a mixture of Gaussians (e.g. robust regression, sensor failure detection), or (c) you want to model regime-dependent macroeconomic dynamics in which monetary-policy or volatility states alter the linear law of motion. The SKF gives you the expressive power of a hidden Markov model layered on top of a Kalman filter, while keeping closed-form Gaussian updates conditional on a regime path.

Formal notation

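  • $s_t \in \{1, \dots, M\}$: discrete mode at time $t$
  • $Z$: $M \times M$ transition matrix, $Z_{ij} = P(s_t = j \mid s_{t-1} = i)$
  • $\pi$: initial regime distribution
  • $A_i, Q_i, C_i, R_i$: regime-$i$ dynamics, process noise, observation, and observation-noise matrices
  • $x_t$: continuous latent state
  • $y_t$: observation
  • Filtering posterior: $p(x_t \mid y_{1:t})$, exactly a mixture of $M^t$ Gaussians (see exponential-belief-state-growth)
  • $\gamma_t(i) = P(s_t = i \mid y_{1:T})$: smoothed regime weight (E-step output)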
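
As a worked form of the filtering posterior, the decomposition over regime histories can be written out as below. The symbols are assumptions for this sketch: $s_t$ is the mode, $x_t$ the state, $y_t$ the observation, $Z$ the transition matrix, and $\mu_t^{(s_{1:t})}$, $P_t^{(s_{1:t})}$ denote the mean and covariance produced by an ordinary Kalman filter run along the history $s_{1:t}$.

```latex
p(x_t \mid y_{1:t})
  = \sum_{s_{1:t}} P(s_{1:t} \mid y_{1:t})\,
    \mathcal{N}\!\left(x_t;\ \mu_t^{(s_{1:t})},\ P_t^{(s_{1:t})}\right),
\qquad
P(s_{1:t} \mid y_{1:t}) \propto
  P(s_{1:t-1} \mid y_{1:t-1})\; Z_{s_{t-1} s_t}\;
  p(y_t \mid y_{1:t-1}, s_{1:t}).
```

With $M$ modes the sum has $M^t$ terms. At $t = 1$ the transition factor is replaced by the initial regime probability, and the last factor is the Kalman innovation likelihood under the history $s_{1:t}$.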

Variants

  • Switching dynamics only: $s_t$ selects $A_{s_t}, Q_{s_t}$; $C, R$ are fixed
  • Switching observations only: $s_t$ selects $C_{s_t}, R_{s_t}$; $A, Q$ are fixed; useful for modelling outliers and sensor failure as a mixture
  • Both switching: the fully general SKF; the dynamics and observation switches may share a Markov chain or have separate ones
  • Switching AR (SAR) model: the continuous state is built from past observations with no observation noise, so $x_t$ is observed directly; the only hidden variable is the discrete $s_t$, and exact inference is tractable (no approximation needed)
  • Data-association SKF ([Ghahramani-Hinton 1996b]): the switch selects which sub-process is read out into $y_t$; models data-association ambiguity
  • Multiple-Hypothesis Tracking ([Bar-Shalom-Fortmann 1988]): a selection approximation that keeps only the $K$ highest-probability regime histories
  • Compound regime SKF: two or more independent Markov chains over $s_t = (s_t^{(1)}, s_t^{(2)}, \dots)$; the effective regime space is the Cartesian product of their states (e.g. monetary-policy chain × wage-rigidity chain in the CRE asset pricing model)
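
For the compound-regime variant, independence of the chains means the joint transition matrix over the Cartesian-product regime space is the Kronecker product of the individual transition matrices. The sketch below illustrates this with two made-up 2-state chains; the names `policy` and `wage` and their values are hypothetical and are not taken from the CRE model.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists.

    Joint state (i, k) is indexed as i * len(B) + k, so entry
    [(i, k)][(j, l)] equals A[i][j] * B[k][l].
    """
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in A
        for row_b in B
    ]

# Hypothetical transition matrices for two independent chains.
policy = [[0.9, 0.1], [0.2, 0.8]]
wage = [[0.7, 0.3], [0.4, 0.6]]

joint = kron(policy, wage)  # 4 x 4 matrix over the product regime space
```

The joint matrix is again row-stochastic, so any single-chain SKF routine can be run unchanged on the product space, at the cost of a regime count that multiplies across chains.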

Comparison

  • vs Kalman filter: adds a discrete regime; conditional on the regime path the Kalman update is unchanged. The marginal posterior can be multimodal in $x_t$ even when each component is unimodal.
  • vs HMM: adds a continuous state; the emission model is now a regime-conditional Kalman update rather than a finite categorical likelihood.
  • vs Dynamic Bayesian Network with all-discrete state: the SKF retains closed-form Gaussian updates per regime, avoiding the discretisation cost of large continuous state spaces.
  • vs Particle filter for the same model: particle filters are selection approximations in the SKF taxonomy; collapsing approximations (gpb-imm-collapsing-filters) are the deterministic alternative.
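
The core operation behind collapsing approximations is moment matching: a Gaussian mixture is replaced by the single Gaussian with the same mean and variance. A minimal scalar sketch follows; the function name `collapse` and the example numbers are illustrative assumptions.

```python
def collapse(weights, means, variances):
    """Moment-match a scalar Gaussian mixture with a single Gaussian."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise the mixture weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # Total variance = within-component variance + spread of the means.
    var = sum(
        wi * (vi + (mi - mean) ** 2)
        for wi, mi, vi in zip(w, means, variances)
    )
    return mean, var

# A bimodal mixture with well-separated components at -1 and +1.
mean, var = collapse([0.5, 0.5], [-1.0, 1.0], [0.25, 0.25])
# mean == 0.0 and var == 1.25
```

The collapsed Gaussian sits between the two modes and is wider than either component, which shows both why the true posterior can be multimodal and what information a collapsing filter gives up.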

When to use

  • Piecewise-linear dynamics with a known discrete mode structure (manoeuvring target tracking, regime-switching macro models)
  • Non-Gaussian noise modelled as a mixture of Gaussians (robust regression, outlier-tolerant filtering, fault detection)
  • Macroeconomic and financial time-series models with monetary-policy or volatility regimes (e.g. the CRE asset pricing model in this workspace, Hamilton 1989, Kim 1994, Bansal-Zhou 2002)
  • Any conditional-Gaussian state-space model where you want a Rao-Blackwellised representation: marginalise the continuous state analytically per regime path, then approximate or sample only over regime histories
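
The mixture-noise use case can be illustrated with a single measurement update in which only the observation variance switches by mode. This is a sketch under stated assumptions: scalar state, unit observation gain, and made-up variances and prior mode probabilities; `robust_update` is a hypothetical helper name.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def robust_update(mu_pred, p_pred, y, r_modes, prior):
    """One measurement update with a mode-dependent observation variance.

    Returns the posterior mode probabilities and, for each mode, the
    Kalman-updated (mean, variance) of the state.
    """
    post, updates = [], []
    for r, w in zip(r_modes, prior):
        innov_var = p_pred + r          # innovation variance under this mode
        gain = p_pred / innov_var       # Kalman gain (observation gain = 1)
        updates.append((mu_pred + gain * (y - mu_pred), (1.0 - gain) * p_pred))
        post.append(w * gauss_pdf(y, mu_pred, innov_var))
    norm = sum(post)
    return [p / norm for p in post], updates

# Mode 0 = nominal sensor, mode 1 = outlier / failed sensor (assumed values).
probs, updates = robust_update(
    mu_pred=0.0, p_pred=1.0, y=8.0,
    r_modes=[0.1, 100.0], prior=[0.95, 0.05],
)
```

For this surprising observation nearly all posterior mass moves to the high-variance mode, whose Kalman gain is small, so the state estimate barely moves. That is the mechanism by which the mixture model tolerates outliers.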

Known limitations

  • Exact inference is intractable. The exact filtering posterior at time $t$ is a mixture of $M^t$ Gaussians (exponential-belief-state-growth); every practical SKF algorithm has to truncate or approximate this mixture.
  • Local minima in EM. EM-based SKF learning faces $M^T$ candidate segmentations of a length-$T$ sequence and is notorious for converging to poor local optima (deterministic annealing helps but does not solve the problem).
  • Mode-mismatch sensitivity. SKFs assume the number of modes and the parameter tying scheme are known; misspecification can cause silent drift.
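
The intractability of exact inference can be seen by writing the exact filter as a brute-force enumeration over regime histories. The sketch below assumes a scalar state with unit observation gain and made-up parameters; `exact_filter` is a hypothetical name, and the code is only meant to show the growth, not to be used on real data.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def exact_filter(ys, Z, pi, a, q, r, mu0=0.0, p0=1.0):
    """Exact SKF filtering by enumerating every regime history."""
    # Each hypothesis is (last mode, weight, state mean, state variance).
    hyps = [(None, 1.0, mu0, p0)]
    sizes = []
    for y in ys:
        new_hyps = []
        for last, w, mu, p in hyps:
            for s in range(len(pi)):
                trans = pi[s] if last is None else Z[last][s]
                mu_pred = a[s] * mu                 # Kalman predict
                p_pred = a[s] ** 2 * p + q[s]
                innov_var = p_pred + r[s]
                gain = p_pred / innov_var
                lik = gauss_pdf(y, mu_pred, innov_var)
                new_hyps.append((
                    s,
                    w * trans * lik,                # history weight update
                    mu_pred + gain * (y - mu_pred), # Kalman update
                    (1.0 - gain) * p_pred,
                ))
        total = sum(h[1] for h in new_hyps)
        hyps = [(s, w / total, mu, p) for s, w, mu, p in new_hyps]
        sizes.append(len(hyps))
    return hyps, sizes

ys = [0.1, 0.3, -0.2, 0.5, 1.4, 1.1, 0.2, 0.0, -0.3, 0.1]
hyps, sizes = exact_filter(
    ys, Z=[[0.95, 0.05], [0.10, 0.90]], pi=[0.5, 0.5],
    a=[0.99, 0.50], q=[0.01, 1.00], r=[0.1, 0.1],
)
print(sizes)  # [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
```

With two modes the hypothesis count doubles at every step, so ten observations already require 1024 Kalman filters. Collapsing and selection approximations exist to cap this count.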

Open problems

  • Optimal trade-off between collapsing order and computational cost (gpb-imm-collapsing-filters)
  • Tight error bounds on collapsing approximations beyond Boyen-Koller
  • Online learning of SKF parameters with regime change-point detection

Key papers

My understanding

The SKF is the right modelling choice whenever you have (a) a continuous state with linear-Gaussian dynamics conditional on a discrete mode and (b) a small enough mode space ($M$ in the single digits to low tens) that mixing or collapsing over regime histories is tractable. The Rao-Blackwellised particle filter pattern, in which particles carry only the regime history and the continuous state is exact-Kalman conditional on each particle, is the selection-class SKF approximation in Murphy’s taxonomy and is the inference engine used by the CRE asset-pricing model in this workspace (SimMdlPrices/src/rbpf.jl).
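
The Rao-Blackwellised pattern can be sketched generically as follows. This is not the code in SimMdlPrices/src/rbpf.jl, whose contents are not shown here; it is a minimal bootstrap-style illustration assuming a scalar state, unit observation gain, multinomial resampling at every step, and made-up parameters. The name `rbpf_step` is hypothetical.

```python
import math
import random

def gauss_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def rbpf_step(particles, y, Z, a, q, r, rng):
    """One Rao-Blackwellised particle filter step for a scalar SKF.

    Each particle is (mode, mean, variance): the mode is sampled, while the
    continuous state is tracked exactly by a Kalman filter given that mode.
    """
    proposed, weights = [], []
    for mode, mu, p in particles:
        s = rng.choices(range(len(Z)), weights=Z[mode])[0]  # sample next mode
        mu_pred = a[s] * mu                                 # Kalman predict
        p_pred = a[s] ** 2 * p + q[s]
        innov_var = p_pred + r[s]
        gain = p_pred / innov_var
        weights.append(gauss_pdf(y, mu_pred, innov_var))    # predictive likelihood
        proposed.append((s, mu_pred + gain * (y - mu_pred), (1.0 - gain) * p_pred))
    # Multinomial resampling in proportion to the weights.
    return rng.choices(proposed, weights=weights, k=len(proposed))

rng = random.Random(0)
Z = [[0.95, 0.05], [0.10, 0.90]]
a, q, r = [0.99, 0.50], [0.01, 1.00], [0.1, 0.1]
particles = [(rng.randrange(2), 0.0, 1.0) for _ in range(200)]
for y in [0.2, 0.1, 0.3, 2.5, 2.2, 0.4]:
    particles = rbpf_step(particles, y, Z, a, q, r, rng)

# Filtered mode probabilities and state mean estimated from the particles.
mode_probs = [sum(1 for m, _, _ in particles if m == j) / len(particles) for j in range(2)]
state_mean = sum(mu for _, mu, _ in particles) / len(particles)
```

Sampling only the discrete mode keeps the particle space small, and the per-particle Kalman filter removes the Monte Carlo variance that sampling the continuous state would add.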