A closed-form KL upper bound enables anomaly-free greedy pairwise Gaussian mixture reduction

Statement

For Gaussian mixture reduction, ranking candidate component-pair merges by the Runnalls closed-form upper bound on the KL divergence between the original two-component sub-mixture and its moment-matched single-Gaussian replacement yields a greedy pairwise reduction algorithm that (i) is computable in closed-form at the same O(N²) per-round asymptotic cost as the older Salmond joining and Williams ISE criteria, (ii) is anomaly-free on the standard pathological examples used to criticize those older criteria, and (iii) produces reductions of consistently better quality (lower true KL gap to the unreduced mixture) at matched component budgets.

Evidence summary

The single source (runnalls-kl-gaussian-mixture-reduction, 2007) presents the KL upper bound

B(i, j) = ½ [(w_i + w_j) log det P_ij − w_i log det P_i − w_j log det P_j]

derives it analytically for the moment-matched merge of two Gaussian components, walks through the canonical Salmond and Williams failure cases to show that the bound ranks the “right” pair as cheapest in each case, and reports synthetic high-dimensional experiments where the resulting greedy reduction has a smaller Monte-Carlo true-KL gap than Salmond / Williams at the same M.

Conditions and scope

Holds for greedy pairwise reduction. Higher-order rebalancings (three-way merges, EM refits, full optimization) are out of scope.
All components must be positive-definite. Rank-deficient covariances require regularization before the bound is well-defined.
Anomaly-freeness is established on the standard Salmond / Williams test cases, not as a universal guarantee.
The bound is an upper bound, not the exact KL — in regimes where the two candidate components have very different covariance shapes the bound is loose and the greedy choice can in principle be sub-optimal.

Counter-evidence

None recorded in the wiki yet. Open question whether modern variational or Wasserstein-based GMR methods strictly dominate the KL-bound greedy choice on high-dimensional or strongly multimodal mixtures.

Linked ideas

(None yet.)

Open questions

Is there a tightening of the bound (e.g. second-order) that retains the closed form?
Can the bound serve as a data-driven stopping rule for choosing M?
Does the dominance over Salmond / Williams persist on degenerate / near-rank-deficient mixtures?

LeatherSagiKnowledgebase

Explorer