RAI Institute, UMich, UPenn | May 31, 2026 | arXiv:2606.01072

Expanding Spatial and Temporal Context for Robotic Imitation Learning With Scene Graphs

Jianing Qian, Qinhe Peng, Emmanuel Panov, Leonor Fermoselle, Dinesh Jayaraman, Bernadette Bucher, Tarik Kelestemur

AI Summary

This paper introduces a novel approach to robotic imitation learning by integrating scene graphs as a structured memory mechanism to enhance spatial and temporal context awareness. By leveraging dynamic scene graphs that represent object relationships and their changes over time, the method enables robots to retain crucial historical context, facilitating improved reasoning during task execution. Experimental results show significant performance gains in both simulated mobile manipulation and real-world tabletop scenarios, especially in environments characterized by partial observability and the need for long-term reasoning.

Key Contribution

Using a dynamic scene graph as an explicit, structured memory substantially improves imitation learning policy performance in partially observed environments that require long-horizon reasoning.

Abstract

Imitation learning enables robots to learn how to execute tasks via observation. However, real-world environments like homes and offices are often severely partially observed due to their large spatial scales. In addition, many tasks involve executing a series of subtasks, which requires autonomous robots to reason over extended time horizons. To address these challenges, we propose using scene graphs as an explicit and structured memory mechanism in imitation learning. By maintaining a dynamic scene graph that captures object-centric relationships and their evolution over time, our method allows the agent to retain relevant historical context during task execution and to reason efficiently over incrementally accrued scene information. Our experiments on simulated mobile manipulation and real-world tabletop manipulation demonstrate that our approach substantially improves policy performance, particularly in settings that demand long-term reasoning and robust generalization under partial observability.
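
To make the idea of a scene graph as memory concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical `SceneGraphMemory` class in which each partial observation supplies object positions and pairwise relations; the class name, field names, the relation labels, and the rule for refreshing edges are all illustrative assumptions. The point it shows is that objects seen earlier stay in the graph after they leave the field of view, so a policy can still condition on them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


@dataclass
class ObjectNode:
    name: str
    position: Position
    last_seen: int  # timestep of the most recent observation


@dataclass
class SceneGraphMemory:
    """Hypothetical scene-graph memory (illustrative, not the paper's code)."""
    nodes: Dict[str, ObjectNode] = field(default_factory=dict)
    # Edges map (subject, object) to a relation label, e.g. ("mug", "table") -> "on".
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def update(self, t: int, detections: Dict[str, Position],
               relations: List[Tuple[str, str, str]]) -> None:
        """Merge one partial observation taken at timestep t into the graph."""
        for name, pos in detections.items():
            self.nodes[name] = ObjectNode(name, pos, t)
        # Assumption: when a subject is re-observed, its old outgoing relations
        # are discarded and replaced by the freshly observed ones.
        visible = set(detections)
        self.edges = {k: v for k, v in self.edges.items() if k[0] not in visible}
        for subj, rel, obj in relations:
            self.edges[(subj, obj)] = rel

    def context(self, t: int) -> List[Tuple[str, Position, int]]:
        """Return (name, position, steps since last seen) for every known object."""
        return [(n.name, n.position, t - n.last_seen)
                for n in sorted(self.nodes.values(), key=lambda n: n.name)]


memory = SceneGraphMemory()
# Step 0: the robot sees a mug on a table.
memory.update(0, {"mug": (1.0, 0.0, 0.8), "table": (1.0, 0.0, 0.0)},
              [("mug", "on", "table")])
# Step 1: the robot has moved; only a drawer is visible now.
memory.update(1, {"drawer": (3.0, 2.0, 0.5)}, [])
remembered = memory.context(1)
# Step 2: the mug is re-observed inside the drawer, replacing its old relation.
memory.update(2, {"mug": (3.0, 2.0, 0.5), "drawer": (3.0, 2.0, 0.5)},
              [("mug", "in", "drawer")])
```

After step 1 the mug and table are no longer visible, yet `remembered` still lists them with an age of one step. After step 2 the stale `("mug", "table")` edge has been replaced by `("mug", "drawer")`, which is one simple way to track how relations evolve over time.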

Multimodal Models, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
