This paper introduces EPIC, an algorithm-hardware co-optimization framework for egocentric perception on resource-constrained AR glasses. EPIC uses the wearer's gaze, pose, and inertial signals to decide which parts of the high-resolution perceptual input to process and store, cutting the memory and energy overhead of continuous video capture while preserving accuracy on intelligent assistance tasks. The authors report that, on average, EPIC reduces memory footprint by 27.5x and energy consumption by 24.3x compared with a full-video baseline.
EPIC cuts memory footprint by 27.5x and energy consumption by 24.3x on average relative to a full-video baseline, a step toward making intelligent AR glasses practical for everyday use.
Modern smart AR glasses are evolving into intelligent systems that support foundation model-based assistance through continuous perception of the user and the surrounding environment. However, this perception-first design creates major bottlenecks. Continuously capturing, processing, and storing rich perceptual streams, especially high-resolution egocentric video, imposes substantial power and memory overhead, which is difficult to sustain on resource-constrained AR glasses. In this work, we propose EPIC, an efficient egocentric perception system for embodied intelligence on smart AR glasses. EPIC is an algorithm-hardware co-optimization framework that leverages gaze, pose, and inertial signals to infer user intent and retain only the most informative parts of the high-resolution perceptual input, greatly reducing perception overhead. Our results show that EPIC reduces memory footprint by $27.5\times$ and energy consumption by $24.3\times$ on average compared with a full-video baseline, while preserving intelligent assistance accuracy on egocentric video understanding tasks, a key application scenario for embodied intelligence on smart glasses.
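To make the idea of retaining only the most informative part of a frame concrete, here is a minimal sketch that keeps a fixed-size crop centered on the gaze point and computes the resulting pixel reduction. It is not EPIC's actual algorithm, which the abstract does not specify: the names `Box`, `gaze_crop_box`, and `pixel_reduction`, the frame and crop sizes, and the use of gaze alone (ignoring pose and inertial signals) are all assumptions chosen for illustration. The 16x figure it prints follows only from those sizes and is unrelated to the reported $27.5\times$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned crop region in pixel coordinates (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int


def gaze_crop_box(frame_w, frame_h, gaze_x, gaze_y, crop_w, crop_h):
    """Return a crop_w x crop_h box centered on the gaze point, clamped to the frame."""
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("crop must fit inside the frame")
    # Center the box on the gaze point, then shift it back inside the frame if needed.
    left = min(max(gaze_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(gaze_y - crop_h // 2, 0), frame_h - crop_h)
    return Box(left, top, crop_w, crop_h)


def pixel_reduction(frame_w, frame_h, box):
    """Ratio of full-frame pixels to retained pixels."""
    return (frame_w * frame_h) / (box.width * box.height)


# Illustrative values only: a 1920x1080 frame, gaze near the top-right corner,
# and a 480x270 retained region (one quarter of each dimension).
box = gaze_crop_box(1920, 1080, gaze_x=1900, gaze_y=20, crop_w=480, crop_h=270)
print(box)                               # Box(left=1440, top=0, width=480, height=270)
print(pixel_reduction(1920, 1080, box))  # 16.0
```

Clamping keeps the retained region at a constant size even when the wearer looks toward the edge of the field of view, so the per-frame storage cost stays fixed in this sketch. A real system would also need to decide when to retain a frame at all, which is where pose and inertial signals could plausibly come in.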