TU Munich | Volkswagen Group (United States) | Feb 26, 2026 | arXiv:2602.23050

Latent Matters: Learning Deep State-Space Models

Alexej Klushyn, Richard Kurle, Maximilian Soelch, Botond Cseke, Patrick van der Smagt

AI Summary

The paper shows that maximizing the evidence lower bound (ELBO) when training deep state-space models (DSSMs) does not guarantee that the model learns the underlying dynamics. As a remedy, the authors propose a constrained optimization framework for training DSSMs. Building on it, they introduce the extended Kalman VAE (EKVAE), which combines amortized variational inference with Bayesian filtering/smoothing and achieves better prediction accuracy and system identification than RNN-based DSSMs.

Key Contribution

Forget RNNs for modeling dynamics: a Kalman filter-VAE hybrid can learn state-space representations that disentangle static and dynamic features, outperforming previous models in prediction accuracy and system identification.

Abstract

Deep state-space models (DSSMs) enable temporal predictions by learning the underlying dynamics of observed sequence data. They are often trained by maximising the evidence lower bound. However, as we show, this does not ensure the model actually learns the underlying dynamics. We therefore propose a constrained optimisation framework as a general approach for training DSSMs. Building upon this, we introduce the extended Kalman VAE (EKVAE), which combines amortised variational inference with classic Bayesian filtering/smoothing to model dynamics more accurately than RNN-based DSSMs. Our results show that the constrained optimisation framework significantly improves system identification and prediction accuracy on the example of established state-of-the-art DSSMs. The EKVAE outperforms previous models w.r.t. prediction accuracy, achieves remarkable results in identifying dynamical systems, and can furthermore successfully learn state-space representations where static and dynamic features are disentangled.
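For readers unfamiliar with the "classic Bayesian filtering" the abstract refers to, the following is a minimal sketch of one predict/update cycle of a standard linear-Gaussian Kalman filter. It is not the authors' implementation: the function name `kalman_step` and the matrix names `A`, `Q`, `H`, `R` are assumptions chosen for illustration, and the sketch covers only the linear case, whereas an extended Kalman filter would linearise nonlinear dynamics around the current estimate.

```python
import numpy as np

def kalman_step(mean, cov, obs, A, Q, H, R):
    """One predict/update cycle of a linear-Gaussian Kalman filter.

    mean, cov : current state belief N(mean, cov)
    obs       : new observation
    A, Q      : transition matrix and process-noise covariance (assumed names)
    H, R      : emission matrix and observation-noise covariance (assumed names)
    """
    # Predict: propagate the belief through the linear dynamics.
    mean_pred = A @ mean
    cov_pred = A @ cov @ A.T + Q

    # Update: correct the prediction using the new observation.
    innovation = obs - H @ mean_pred
    S = H @ cov_pred @ H.T + R               # innovation covariance
    K = np.linalg.solve(S, H @ cov_pred).T   # Kalman gain: cov_pred H^T S^{-1}
    mean_new = mean_pred + K @ innovation
    cov_new = (np.eye(len(mean)) - K @ H) @ cov_pred
    return mean_new, cov_new

# Usage: a 1-D state with unit prior variance and unit observation noise.
m, P = kalman_step(
    mean=np.array([0.0]), cov=np.eye(1), obs=np.array([2.0]),
    A=np.eye(1), Q=np.zeros((1, 1)), H=np.eye(1), R=np.eye(1),
)
```

In this toy setting the prior and the observation are equally uncertain, so the updated mean lands halfway between them and the posterior variance is halved.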

Architecture Design (Transformers, SSMs, MoE) | Training Efficiency & Optimization | World Models & Planning

Citation Metrics

Citations: 48

Influential citations: 3

References: 30

Year: 2026

Venue: Neural Information Processing Systems
