This paper introduces SafeGate, a neurosymbolic architecture that filters unsafe natural language commands for LLM-controlled robots before execution, using structured safety properties extracted from the commands and a deterministic decision gate. To further ensure safety, Task Safety Contracts decompose accepted commands into invariants, guards, and abort conditions, enforced via Z3 SMT solving, to prevent unsafe state transitions during execution. Experiments across simulation and real-world robot tasks demonstrate that SafeGate reduces the acceptance of defective commands while maintaining high acceptance of benign tasks, outperforming existing LLM-based safety frameworks.
LLM-controlled robots can be made significantly safer by filtering unsafe natural language commands *before* they're executed, preventing downstream errors.
Large Language Models (LLMs) are increasingly used to convert task commands into robot-executable code. However, this pipeline lacks validation gates to detect unsafe and defective commands before they are translated into robot code. Furthermore, even commands that appear safe at the outset can produce unsafe state transitions during execution in the absence of continuous constraint monitoring. In this research, we introduce SafeGate, a neurosymbolic safety architecture that prevents unsafe natural language task commands from reaching robot execution. Drawing on the ISO 13482 safety standard, SafeGate extracts structured safety-relevant properties from natural language commands and applies a deterministic decision gate to authorize or reject execution. In addition, we introduce Task Safety Contracts, which decompose commands that pass through the gate into invariants, guards, and abort conditions to prevent unsafe state transitions during execution. We further incorporate Z3 SMT solving to enforce the constraints derived from the Task Safety Contracts. We evaluate SafeGate against existing LLM-based robot safety frameworks and baseline LLMs across 230 benchmark tasks, 30 AI2-THOR simulation scenarios, and real-world robot experiments. Results show that SafeGate significantly reduces the acceptance of defective commands while maintaining high acceptance of benign tasks, demonstrating the importance of pre-execution safety gates for LLM-controlled robot systems.
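
The two-stage idea in the abstract, a deterministic gate followed by a contract checked during execution, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The property names (`involves_human_contact`, `uses_hazardous_object`, `target_is_specified`), the gate rules, the state keys, and the `check_step` method are all hypothetical, and plain Python predicates stand in for the Z3 SMT constraints that the paper uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]
Predicate = Callable[[State], bool]


@dataclass(frozen=True)
class SafetyProperties:
    # Hypothetical structured properties; the paper extracts its own set
    # from the natural language command.
    involves_human_contact: bool
    uses_hazardous_object: bool
    target_is_specified: bool


def decision_gate(props: SafetyProperties) -> bool:
    """Deterministic rule set: identical properties always give the same verdict."""
    if not props.target_is_specified:
        return False  # treat an underspecified command as defective
    if props.involves_human_contact and props.uses_hazardous_object:
        return False  # illustrative rule, not taken from the paper
    return True


@dataclass
class TaskSafetyContract:
    # Invariants must hold in every state reached during execution.
    invariants: List[Predicate] = field(default_factory=list)
    # Guards are per-action preconditions, keyed by action name.
    guards: Dict[str, Predicate] = field(default_factory=dict)
    # Abort conditions stop the whole task as soon as one holds.
    abort_conditions: List[Predicate] = field(default_factory=list)

    def check_step(self, action: str, state: State, next_state: State) -> str:
        """Return 'abort', 'block', or 'allow' for one proposed transition."""
        if any(cond(state) for cond in self.abort_conditions):
            return "abort"
        guard = self.guards.get(action)
        if guard is not None and not guard(state):
            return "block"
        if not all(inv(next_state) for inv in self.invariants):
            return "block"
        return "allow"


# Hypothetical contract for a pouring task.
contract = TaskSafetyContract(
    invariants=[lambda s: s["gripper_force"] <= 20.0],
    guards={"pour": lambda s: s["cup_under_spout"] == 1.0},
    abort_conditions=[lambda s: s["human_distance"] < 0.5],
)
```

The gate runs once, before any robot code is generated, while `check_step` would be called on every proposed transition. Keeping the gate as a pure function of the extracted properties is what makes its verdict reproducible, even if the extraction step itself relies on an LLM.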