Mar 11, 2026 · arXiv:2603.10572

Safety-critical Control Under Partial Observability: Reach-Avoid POMDP meets Belief Space Control

AI Summary

This paper introduces a layered control architecture for reach-avoid POMDPs that decouples goal reaching, information gathering, and safety. Information gathering is formalized as a Lyapunov convergence problem in belief space using Belief Control Lyapunov Functions (BCLFs) learned via reinforcement learning, while safety is ensured using Belief Control Barrier Functions (BCBFs) with conformal prediction for probabilistic guarantees. The resulting control synthesis uses lightweight quadratic programs, enabling real-time performance with high-dimensional, non-Gaussian beliefs, demonstrated in simulation and on a space-robotics platform.
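As a rough illustration of how conformal prediction can yield a probabilistic bound, the sketch below computes a standard split-conformal quantile from calibration scores. It is a generic textbook construction, not the paper's procedure: the function name `conformal_quantile`, the meaning of the scores (for example, belief prediction errors), and the miscoverage level `delta` are all assumptions made for this example.

```python
import math
import numpy as np


def conformal_quantile(scores, delta):
    """Split-conformal quantile of calibration scores (hypothetical helper).

    For n exchangeable calibration scores, a fresh score falls at or below
    the returned value with probability at least 1 - delta.
    """
    sorted_scores = np.sort(np.asarray(scores, dtype=float))
    n = sorted_scores.size
    # Finite-sample corrected rank: ceil((n + 1) * (1 - delta)).
    rank = math.ceil((n + 1) * (1.0 - delta))
    if rank > n:
        # Too few calibration samples for this confidence level.
        return math.inf
    return float(sorted_scores[rank - 1])


# Illustrative calibration scores (made-up values, not data from the paper).
calibration_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
margin = conformal_quantile(calibration_scores, delta=0.5)
```

One plausible use, again an assumption rather than something the summary states, is to shrink a safety margin by this quantile so that the nominal safety condition holds with probability at least `1 - delta`.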

Key Contribution

The paper achieves real-time safety-critical robot control in partially observable environments by decoupling goal reaching, information gathering, and safety into modular, certificate-based components that operate directly in belief space.

Abstract

Partially Observable Markov Decision Processes (POMDPs) provide a principled framework for robot decision-making under uncertainty. Solving reach-avoid POMDPs, however, requires coordinating three distinct behaviors: goal reaching, safety, and active information gathering to reduce uncertainty. Existing online POMDP solvers attempt to address all three within a single belief tree search, but this unified approach struggles with the conflicting time scales inherent to these objectives. We propose a layered, certificate-based control architecture that operates directly in belief space, decoupling goal reaching, information gathering, and safety into modular components. We introduce Belief Control Lyapunov Functions (BCLFs) that formalize information gathering as a Lyapunov convergence problem in belief space, and show how they can be learned via reinforcement learning. For safety, we develop Belief Control Barrier Functions (BCBFs) that leverage conformal prediction to provide probabilistic safety guarantees over finite horizons. The resulting control synthesis reduces to lightweight quadratic programs solvable in real time, even for non-Gaussian belief representations with dimension $>10^4$. Experiments in simulation and on a space-robotics platform demonstrate real-time performance and improved safety and task success compared to state-of-the-art constrained POMDP solvers.
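To make the "lightweight quadratic program" idea concrete, here is a minimal sketch of a generic barrier-function safety filter with a single affine constraint, which admits a closed-form solution. It is not the paper's BCBF formulation: the control-affine belief dynamics, the helper names `cbf_constraint` and `safety_filter`, and the symbols `h`, `lf_h`, `lg_h`, and `alpha` are assumptions chosen for illustration.

```python
import numpy as np


def cbf_constraint(h, lf_h, lg_h, alpha):
    """Build (a, b) for the constraint a @ u >= b (hypothetical helper).

    Assumes control-affine dynamics, so the barrier condition reads
    lf_h + lg_h @ u + alpha * h >= 0, where h >= 0 marks the safe set.
    """
    a = np.asarray(lg_h, dtype=float)
    b = -(lf_h + alpha * h)
    return a, b


def safety_filter(u_nom, a, b):
    """Solve min ||u - u_nom||^2 subject to a @ u >= b in closed form."""
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(a, dtype=float)
    slack = a @ u_nom - b
    if slack >= 0.0:
        # The nominal control already satisfies the constraint.
        return u_nom
    norm_sq = a @ a
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b > 0")
    # Project the nominal control onto the constraint boundary.
    return u_nom - (slack / norm_sq) * a


# Illustrative numbers only (not taken from the paper).
a, b = cbf_constraint(h=0.1, lf_h=-1.0, lg_h=[0.0, 1.0], alpha=1.0)
u_safe = safety_filter(u_nom=[0.5, 0.0], a=a, b=b)
```

With one constraint the program reduces to a projection, so its cost is linear in the control dimension regardless of how large the belief representation is. With several constraints (for example, a Lyapunov-type and a barrier-type condition together), a general QP solver would be needed instead.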

Robotics & Embodied AI · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 85

Year: 2026

Venue: N/A
