Edinburgh, Mar 1, 2026, arXiv:2603.01292

Integrating LTL Constraints into PPO for Safe Reinforcement Learning

Maifang Zhang, Qian Zuo, Vaishak Belle, Fengxiang He

AI Summary

The paper introduces Proximal Policy Optimization with Linear Temporal Logic Constraints (PPO-LTL), a novel safe RL framework that incorporates LTL-specified safety constraints into PPO. By monitoring LTL constraint violations using limit-deterministic Büchi automata and translating them into penalty signals via a logic-to-cost mechanism, the method guides policy optimization through a Lagrangian scheme. Experiments in Zones and CARLA environments demonstrate that PPO-LTL effectively reduces safety violations while maintaining competitive performance compared to existing methods.

Key Contribution

Instead of relying on hand-engineered reward shaping, PPO-LTL lets you specify complex safety requirements as LTL formulas and automatically penalizes violations during RL training.

Abstract

This paper proposes Proximal Policy Optimization with Linear Temporal Logic Constraints (PPO-LTL), a framework that integrates safety constraints written in LTL into PPO for safe reinforcement learning. LTL constraints offer rigorous representations of complex safety requirements, such as the regulations that are widespread in robotics, and enable systematic monitoring of those requirements. Violations of LTL constraints are monitored by limit-deterministic Büchi automata and then translated by a logic-to-cost mechanism into penalty signals. These signals then guide policy optimization via a Lagrangian scheme. Extensive experiments on the Zones and CARLA environments show that PPO-LTL can consistently reduce safety violations while maintaining competitive performance relative to state-of-the-art methods. The code is at https://github.com/EVIEHub/PPO-LTL.
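The pipeline the abstract describes (monitor, logic-to-cost, Lagrangian penalty) can be illustrated with a minimal sketch. This is not the authors' implementation: the limit-deterministic Büchi automaton is replaced by a tiny hand-written monitor for the single safety formula G !unsafe, the PPO update itself is omitted, and every name here (`AvoidMonitor`, `penalized_reward`, `update_multiplier`, `run_episode`, the `budget` and `lr` parameters) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AvoidMonitor:
    """Hand-written monitor for the LTL safety formula G !unsafe.

    State 0 means no violation so far; state 1 is an absorbing
    "violated" sink. This stands in for an automaton produced by a
    real LTL-to-automaton translator (an assumption of this sketch).
    """
    state: int = 0

    def step(self, labels: set) -> float:
        """Advance on the atomic propositions true at this step; return a cost."""
        if self.state == 0 and "unsafe" in labels:
            self.state = 1
            return 1.0  # cost is charged once, on entering the sink
        return 0.0


def penalized_reward(reward: float, cost: float, lam: float) -> float:
    """Lagrangian-style shaping: task reward minus weighted violation cost."""
    return reward - lam * cost


def update_multiplier(lam: float, episode_cost: float, budget: float,
                      lr: float = 0.1) -> float:
    """Dual ascent on the multiplier, projected so it stays non-negative."""
    return max(0.0, lam + lr * (episode_cost - budget))


def run_episode(trace, lam: float):
    """Replay a list of (reward, labels) steps through a fresh monitor."""
    monitor = AvoidMonitor()
    total_reward = 0.0
    total_cost = 0.0
    for reward, labels in trace:
        cost = monitor.step(labels)
        total_reward += penalized_reward(reward, cost, lam)
        total_cost += cost
    return total_reward, total_cost


if __name__ == "__main__":
    # Three steps; the agent enters the unsafe region at the second step.
    trace = [(1.0, set()), (1.0, {"unsafe"}), (1.0, {"unsafe"})]
    lam = 0.5
    shaped_return, episode_cost = run_episode(trace, lam)
    lam = update_multiplier(lam, episode_cost, budget=0.0)
    print(shaped_return, episode_cost, lam)
```

In a full training loop, the shaped reward would feed the PPO advantage estimate and the multiplier would be updated between policy updates. Charging the cost only once per violation is a design choice of this sketch; the paper's logic-to-cost mechanism may assign costs differently.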

Constitutional AI & AI Ethics; RLHF & Preference Learning; Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
