This paper uses scene graphs as a structured memory mechanism for robotic imitation learning, improving the policy's awareness of spatial and temporal context. A dynamic scene graph represents object relationships and how they change over time, so the robot can retain historical context and reason over it during task execution. Experiments show substantial performance gains in simulated mobile manipulation and real-world tabletop manipulation, especially under partial observability and in tasks that require long-term reasoning.
Robots that use scene graphs as memory can significantly outperform traditional imitation learning methods in complex, partially observed environments.
Imitation learning enables robots to learn how to execute tasks via observation. However, real-world environments like homes and offices are often severely partially observed due to their large spatial scales. In addition, many tasks involve executing a series of subtasks, which requires autonomous robots to reason over extended time horizons. To address these challenges, we propose using scene graphs as an explicit and structured memory mechanism in imitation learning. By maintaining a dynamic scene graph that captures object-centric relationships and their evolution over time, our method allows the agent to retain relevant historical context during task execution and to reason efficiently over incrementally accrued scene information. Our experiments on simulated mobile manipulation and real-world tabletop manipulation demonstrate that our approach substantially improves policy performance, particularly in settings that demand long-term reasoning and robust generalization under partial observability.
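To make the idea of a scene graph as memory concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class `SceneGraphMemory`, its methods, and the simplification that each object holds exactly one relation at a time are all hypothetical. It shows how partial observations can be merged incrementally so that objects no longer in view remain queryable.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraphMemory:
    """Toy dynamic scene graph (hypothetical, for illustration only).

    Assumption: each object has a single current relation to one other
    object, e.g. ("mug", "inside", "cabinet").
    """
    # object -> (relation, other object) currently believed to hold
    state: dict = field(default_factory=dict)
    # object -> time step at which its relation was last observed
    last_seen: dict = field(default_factory=dict)
    # chronological log of relation changes, i.e. the graph's evolution
    history: list = field(default_factory=list)

    def update(self, step, observed):
        """Merge one partial observation given as (subject, relation, object) triples."""
        for subj, rel, obj in observed:
            if self.state.get(subj) != (rel, obj):
                # Record only actual changes, not repeated sightings.
                self.history.append((step, subj, rel, obj))
            self.state[subj] = (rel, obj)
            self.last_seen[subj] = step

    def query(self, subj):
        """Return (relation, object, last_seen_step), or None if never observed."""
        if subj not in self.state:
            return None
        rel, obj = self.state[subj]
        return rel, obj, self.last_seen[subj]


memory = SceneGraphMemory()
memory.update(0, [("mug", "inside", "cabinet")])
memory.update(1, [("plate", "on", "table")])  # the mug is out of view here
print(memory.query("mug"))                    # still remembered from step 0
memory.update(2, [("mug", "on", "table")])    # the mug has been moved
print(memory.history)
```

In a learned policy the graph would presumably be encoded and passed to the network alongside the current observation; the sketch covers only the bookkeeping that lets context persist under partial observability.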