Feb 17, 2026 · arXiv:2602.15817

Solving Parameter-Robust Avoid Problems with Unknown Feasibility using Reinforcement Learning

Oswin So, Eric Yang Yu, Songyuan Zhang, Matthew Cleaveland, Mitchell Black, Chuchu Fan

AI Summary

This paper addresses the challenge of applying reinforcement learning (RL) to parameter-robust reachability (avoid) problems in which the feasibility of the set of initial conditions is unknown. The authors introduce Feasibility-Guided Exploration (FGE), which simultaneously identifies a feasible subset of initial conditions and learns a safe policy over that subset. Experiments in the MuJoCo and Kinetix simulators show that FGE achieves over 50% more coverage than the best existing method on challenging initial conditions.

Key Contribution

Forget hand-engineering initial conditions for robust RL: this method *learns* which conditions are feasible while simultaneously training a safe policy.

Abstract

Recent advances in deep reinforcement learning (RL) have achieved strong results on high-dimensional control tasks, but applying RL to reachability problems raises a fundamental mismatch: reachability seeks to maximize the set of states from which a system remains safe indefinitely, while RL optimizes expected returns over a user-specified distribution. This mismatch can result in policies that perform poorly on low-probability states that are still within the safe set. A natural alternative is to frame the problem as a robust optimization over a set of initial conditions that specify the initial state, dynamics and safe set, but whether this problem has a solution depends on the feasibility of the specified set, which is unknown a priori. We propose Feasibility-Guided Exploration (FGE), a method that simultaneously identifies a subset of feasible initial conditions under which a safe policy exists, and learns a policy to solve the reachability problem over this set of initial conditions. Empirical results demonstrate that FGE learns policies with over 50% more coverage than the best existing method for challenging initial conditions across tasks in the MuJoCo simulator and the Kinetix simulator with pixel observations.
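The feasibility issue described in the abstract can be illustrated with a toy example. The sketch below is not the FGE algorithm and is not taken from the paper; the scalar dynamics, the proportional controller, the parameter grid, and every name in it (`is_safe`, `drifts`, `candidate_gains`) are assumptions chosen only to show why a robust objective over the full parameter set can have no solution, and how "coverage" (read here as the fraction of parameters on which a policy stays safe) can still be compared once attention is restricted to parameters known to be feasible.

```python
import numpy as np

def is_safe(gain, drift, horizon=200):
    """Roll out the toy system x_{t+1} = x_t + drift + u_t, starting at x_0 = 0,
    with the bounded control u_t = clip(-gain * x_t, -1, 1).
    The rollout counts as safe if |x_t| <= 1 at every step."""
    x = 0.0
    for _ in range(horizon):
        u = float(np.clip(-gain * x, -1.0, 1.0))
        x = x + drift + u
        if abs(x) > 1.0:
            return False
    return True

# Hypothetical parameter set: a constant drift in (0, 2). Because |u| <= 1,
# no controller can keep the state bounded once drift > 1, but a learner
# would not know this threshold in advance.
drifts = np.linspace(0.05, 1.95, 20)
candidate_gains = [0.5, 1.0]  # two hand-picked policies standing in for learned ones

safe = {g: np.array([is_safe(g, d) for d in drifts]) for g in candidate_gains}

# Coverage: fraction of parameters on which each policy stays safe.
coverage = {g: float(s.mean()) for g, s in safe.items()}

# Robust objective over the FULL set: fails for every policy here.
robust_full = {g: bool(s.all()) for g, s in safe.items()}

# Parameters shown feasible so far: at least one candidate policy was safe.
feasible = np.logical_or.reduce(list(safe.values()))

# Robust objective restricted to the feasible subset: now distinguishes policies.
robust_feasible = {g: bool(s[feasible].all()) for g, s in safe.items()}

print("coverage:", coverage)
print("robust over full set:", robust_full)
print("robust over feasible subset:", robust_feasible)
```

In this toy setting, the worst case over the full set is unsafe for both policies, so it cannot rank them, while the restricted objective and the coverage values can. The feasible subset here is found by brute-force evaluation on a grid, which is only workable because the example is one-dimensional.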

Robotics & Embodied AI · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
