Definition
A Switching Kalman Filter (SKF), also called a Switching State-Space Model, Switching Linear Dynamical System, Jump Markov Linear System, or Conditional Gaussian State-Space Model, is a state-space model in which a discrete latent mode variable $s_t \in \{1, \dots, M\}$ (itself a Markov chain with transition matrix $Z$) selects which set of linear-Gaussian dynamics and/or observation matrices is active at time $t$:

$$x_t = A_{s_t} x_{t-1} + v_t, \qquad v_t \sim \mathcal{N}(0, Q_{s_t})$$
$$y_t = C_{s_t} x_t + w_t, \qquad w_t \sim \mathcal{N}(0, R_{s_t})$$
$$P(s_t = j \mid s_{t-1} = i) = Z_{ij}, \qquad P(s_1 = i) = \pi_i$$

Conditional on a regime path $s_{1:t}$, the model is exactly a (time-varying) linear-Gaussian state-space model, so the continuous state $x_t$ is conditionally Kalman-filterable. The full filtering posterior $p(x_t \mid y_{1:t})$ is a mixture of $M^t$ Gaussians, with one component per regime history.
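To make the "conditionally Kalman-filterable" claim concrete, here is a minimal scalar sketch: once the regime path is fixed, filtering is an ordinary Kalman recursion whose parameters change from step to step. The parameter values, the observations, and the names `kalman_step`, `filter_given_path`, and `PARAMS` are illustrative assumptions, not taken from any referenced implementation.

```python
import math

def kalman_step(mean, var, y, a, q, c, r):
    """One scalar Kalman predict/update under the parameters (a, q, c, r)."""
    # Predict through x_t = a * x_{t-1} + v_t, with v_t ~ N(0, q).
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update with y_t = c * x_t + w_t, with w_t ~ N(0, r).
    innov = y - c * pred_mean
    innov_var = c * c * pred_var + r
    gain = pred_var * c / innov_var
    new_mean = pred_mean + gain * innov
    new_var = (1.0 - gain * c) * pred_var
    # Log-likelihood of y_t given the past observations and this regime.
    loglik = -0.5 * (math.log(2.0 * math.pi * innov_var) + innov * innov / innov_var)
    return new_mean, new_var, loglik

def filter_given_path(ys, path, params, mean0=0.0, var0=1.0):
    """Kalman-filter ys when the regime path is known (one regime per step)."""
    mean, var, total = mean0, var0, 0.0
    posterior = []
    for y, s in zip(ys, path):
        mean, var, ll = kalman_step(mean, var, y, *params[s])
        total += ll
        posterior.append((mean, var))
    return posterior, total

# Hypothetical two-regime parameters: (a, q, c, r) for each regime.
PARAMS = {0: (0.9, 0.1, 1.0, 0.5), 1: (0.5, 1.0, 1.0, 0.5)}
ys = [0.3, 0.1, 2.0, 1.5]   # made-up observations
path = [0, 0, 1, 1]         # an assumed, known regime path
posterior, loglik = filter_given_path(ys, path, PARAMS)
```

The returned `loglik` is the likelihood of the data under that one path; weighting paths by such likelihoods is what produces the mixture over regime histories.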
Intuition
An SKF is what you build when (a) your dynamics are piecewise linear, with a discrete mode selecting the active linear sub-model (e.g. tracking a manoeuvring aircraft, where horizontal motion and vertical motion are different sub-models), or (b) you want to model non-Gaussian noise as a mixture of Gaussians (e.g. robust regression, sensor failure detection), or (c) you want to model regime-dependent macroeconomic dynamics where monetary-policy or volatility states alter the linear law of motion. The SKF gives you the expressive power of a hidden Markov model layered on top of a Kalman filter, while keeping closed-form Gaussian updates conditional on a regime path.
Formal notation
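- $s_t \in \{1, \dots, M\}$: discrete mode at time $t$
- $Z$: $M \times M$ transition matrix, $Z_{ij} = P(s_t = j \mid s_{t-1} = i)$
- $\pi$: initial regime distribution
- $A_i, Q_i, C_i, R_i$: regime-$i$ dynamics, process noise, observation, and observation-noise matrices
- $x_t$: continuous latent state
- $y_t$: observation
- Filtering posterior: $p(x_t \mid y_{1:t})$ — exactly a mixture of $M^t$ Gaussians (see exponential-belief-state-growth)
- $W_t(i) = P(s_t = i \mid y_{1:T})$: smoothed regime weight (E-step output)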
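Written out, the exact filtering posterior is a sum over regime histories. Here $s_{1:t}$ is a regime history, $M$ is the number of regimes, and $\mu_{t|t}(\cdot)$, $\Sigma_{t|t}(\cdot)$ are labels introduced here (an assumption of this sketch) for the Kalman-filtered mean and covariance along one history:

$$p(x_t \mid y_{1:t}) = \sum_{s_{1:t}} P(s_{1:t} \mid y_{1:t}) \, \mathcal{N}\big(x_t;\ \mu_{t|t}(s_{1:t}),\ \Sigma_{t|t}(s_{1:t})\big)$$

The sum has $M^t$ terms. Even with $M = 2$, at $t = 30$ that is $2^{30} \approx 1.07 \times 10^9$ components.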
Variants
- Switching dynamics only — $s_t$ selects $A, Q$; $C, R$ are fixed
- Switching observations only — $s_t$ selects $C, R$; useful for modelling outliers and sensor failure as a mixture
- Both switching — fully general SKF; the dynamics and observation switches may share or have separate Markov chains
- Switching AR (SAR) model — $y_t = x_t$, so $x_t$ is observed directly; the only hidden variable is the discrete $s_t$, and exact inference is tractable (no approximation needed)
- Data-association SKF ([Ghahramani-Hinton 1996b]) — the switch selects which sub-process is read out into $y_t$; models data-association ambiguity
- Multiple-Hypothesis Tracking ([Bar-Shalom-Fortmann 1988]) — selection approximation that keeps only the highest-probability regime histories
- Compound regime SKF — two or more independent Markov chains jointly define $s_t$; the effective regime space is the Cartesian product of their states (e.g. monetary-policy chain × wage-rigidity chain in the CRE asset pricing model)
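For the compound-regime variant, independence of the chains means the joint transition matrix over the Cartesian-product regime space is the Kronecker product of the individual transition matrices. A minimal sketch, with made-up transition probabilities and hypothetical names (`kron`, `policy`, `rigidity`):

```python
def kron(p, q):
    """Kronecker product of two row-stochastic matrices (nested lists).

    Joint state (i, k) maps to index i * len(q) + k, so that
    result[(i, k)][(j, l)] = p[i][j] * q[k][l].
    """
    return [
        [p_ij * q_kl for p_ij in p_row for q_kl in q_row]
        for p_row in p
        for q_row in q
    ]

# Hypothetical two-state chains (values are illustrative only).
policy = [[0.95, 0.05],
          [0.10, 0.90]]
rigidity = [[0.8, 0.2],
            [0.3, 0.7]]

# 4 x 4 transition matrix over the joint (policy, rigidity) regime.
joint = kron(policy, rigidity)
```

The joint matrix has $M_1 M_2$ states, so the regime count that inference has to handle grows multiplicatively with each extra chain.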
Comparison
- vs Kalman filter — adds a discrete regime; conditional on the regime path the Kalman update is unchanged. The marginal posterior is multimodal in $x_t$ even when each component is unimodal.
- vs HMM — adds a continuous state; the emission is no longer a finite categorical table but a linear-Gaussian likelihood evaluated through the Kalman update.
- vs Dynamic Bayesian Network with all-discrete state — the SKF retains closed-form Gaussian updates per regime, avoiding the discretisation cost of large continuous state spaces.
- vs Particle filter for the same model — particle filters are selection approximations in the SKF taxonomy; collapsing approximations (gpb-imm-collapsing-filters) are the deterministic alternative.
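The collapsing idea can be sketched as a GPB1-style step for a scalar model: run one Kalman update per candidate regime from a single shared Gaussian, reweight the regimes by their likelihoods, then moment-match the resulting mixture back to one Gaussian. This is an illustrative sketch under assumed parameter values; the function names are hypothetical.

```python
import math

def kalman_step(mean, var, y, a, q, c, r):
    """Scalar Kalman predict/update; returns (mean, var, log-likelihood)."""
    pred_mean, pred_var = a * mean, a * a * var + q
    innov, innov_var = y - c * pred_mean, c * c * pred_var + r
    gain = pred_var * c / innov_var
    loglik = -0.5 * (math.log(2.0 * math.pi * innov_var) + innov * innov / innov_var)
    return pred_mean + gain * innov, (1.0 - gain * c) * pred_var, loglik

def collapse(weights, means, variances):
    """Moment-match a Gaussian mixture to a single Gaussian."""
    mean = sum(w * m for w, m in zip(weights, means))
    # Total variance = within-component variance + spread of component means.
    var = sum(w * (v + (m - mean) ** 2) for w, m, v in zip(weights, means, variances))
    return mean, var

def gpb1_step(mean, var, probs, y, params, trans):
    """One GPB1-style step: M Kalman updates, reweight, collapse to one Gaussian."""
    n = len(params)
    # Predicted regime probabilities: sum_i probs[i] * trans[i][j].
    prior = [sum(probs[i] * trans[i][j] for i in range(n)) for j in range(n)]
    results = [kalman_step(mean, var, y, *params[j]) for j in range(n)]
    unnorm = [prior[j] * math.exp(results[j][2]) for j in range(n)]
    total = sum(unnorm)
    post = [u / total for u in unnorm]
    new_mean, new_var = collapse(post, [r[0] for r in results], [r[1] for r in results])
    return new_mean, new_var, post

# Hypothetical two-regime setup: (a, q, c, r) per regime.
PARAMS = [(0.9, 0.1, 1.0, 0.5), (0.5, 1.0, 1.0, 0.5)]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
mean, var, probs = gpb1_step(0.0, 1.0, [0.5, 0.5], 2.0, PARAMS, TRANS)
```

The cost per step is M Kalman updates regardless of how long the sequence is, which is the trade made against the exact, exponentially growing mixture.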
When to use
- Piecewise-linear dynamics with a known discrete mode structure (manoeuvring target tracking, regime-switching macro models)
- Non-Gaussian noise modelled as a mixture of Gaussians (robust regression, outlier-tolerant filtering, fault detection)
- Macroeconomic and financial time-series models with monetary-policy or volatility regimes (e.g. the CRE asset pricing model in this workspace, Hamilton 1989, Kim 1994, Bansal-Zhou 2002)
- Any conditional-Gaussian state-space model where you want a Rao-Blackwellised representation: marginalise the continuous state analytically per regime path, then approximate or sample only over regime histories
Known limitations
- Exact inference is intractable. The exact filtering posterior is a mixture of $M^t$ Gaussians (exponential-belief-state-growth); every practical SKF algorithm approximates this growth.
- Local minima in EM. EM-based SKF learning has $M^T$ candidate segmentations and is notorious for converging to poor local optima (deterministic annealing helps but does not solve the problem).
- Mode-mismatch sensitivity. SKFs assume the number of modes and the parameter tying scheme are known; misspecification can cause silent drift.
Open problems
- Optimal trade-off between collapsing order and computational cost (gpb-imm-collapsing-filters)
- Tight error bounds on collapsing approximations beyond Boyen-Koller
- Online learning of SKF parameters with regime change-point detection
Key papers
- murphy-1998-switching-kalman-filters — unified treatment of SKF inference (GPB1, GPB2, IMM, variational, MHT) and EM-based learning
My understanding
The SKF is the right modelling choice whenever you have (a) a continuous
state with linear-Gaussian dynamics conditional on a discrete mode and (b) a
small enough mode space ($M$ in the single digits to low tens) that mixing or
collapsing over regime histories is tractable. The Rao-Blackwellised particle
filter pattern — particles carry only the regime history, and the continuous
state is exact-Kalman conditional on each particle — is the selection-class
SKF approximation in Murphy’s taxonomy and is the inference engine used by the
CRE asset-pricing model in this workspace (SimMdlPrices/src/rbpf.jl).
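
The pattern can be sketched for a scalar model as follows. This is an illustrative Python sketch, not the code in SimMdlPrices/src/rbpf.jl: the bootstrap proposal (sampling regimes from the transition matrix), resampling at every step, and all parameter values and names are assumptions.

```python
import math
import random

def kalman_step(mean, var, y, a, q, c, r):
    """Scalar Kalman predict/update; returns (mean, var, log-likelihood)."""
    pred_mean, pred_var = a * mean, a * a * var + q
    innov, innov_var = y - c * pred_mean, c * c * pred_var + r
    gain = pred_var * c / innov_var
    loglik = -0.5 * (math.log(2.0 * math.pi * innov_var) + innov * innov / innov_var)
    return pred_mean + gain * innov, (1.0 - gain * c) * pred_var, loglik

def rbpf_regime_probs(ys, params, trans, init, n_particles=200, seed=0):
    """Estimate filtered regime probabilities P(s_t | y_1:t) for each t."""
    rng = random.Random(seed)
    modes = list(range(len(init)))
    # Each particle holds (regime, Kalman mean, Kalman variance).
    particles = [(None, 0.0, 1.0)] * n_particles
    history = []
    for y in ys:
        moved, logliks = [], []
        for s_prev, mean, var in particles:
            # Sample only the discrete regime; the first step uses init.
            row = init if s_prev is None else trans[s_prev]
            s = rng.choices(modes, weights=row)[0]
            # The continuous state is handled exactly by the Kalman update.
            mean, var, ll = kalman_step(mean, var, y, *params[s])
            moved.append((s, mean, var))
            logliks.append(ll)
        # Normalise weights in a numerically stable way.
        top = max(logliks)
        unnorm = [math.exp(ll - top) for ll in logliks]
        total = sum(unnorm)
        weights = [u / total for u in unnorm]
        history.append([
            sum(w for (s, _, _), w in zip(moved, weights) if s == mode)
            for mode in modes
        ])
        # Multinomial resampling at every step (a simplifying assumption).
        particles = rng.choices(moved, weights=weights, k=n_particles)
    return history

# Hypothetical two-regime setup: (a, q, c, r) per regime.
PARAMS = [(0.9, 0.1, 1.0, 0.5), (0.5, 1.0, 1.0, 0.5)]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
INIT = [0.5, 0.5]
ys = [0.3, 0.1, 2.0, 1.5, -1.0]  # made-up observations
regime_probs = rbpf_regime_probs(ys, PARAMS, TRANS, INIT)
```

Each particle stores only a regime label plus a Kalman mean and variance, so the sampling error is confined to the discrete regime path.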