This paper introduces a real-time, predictive foveated imaging system that allocates limited pixel bandwidth to task-relevant regions while preserving a low-resolution global context. The authors frame foveated acquisition as a sensor attention policy-learning problem: past observations guide the actions that determine future measurements, which closes the perception-acquisition loop. In simulation across multiple perception tasks, the approach significantly outperforms baselines operating at the same bandwidth. Real-world video captured on a 200-megapixel dual-stream sensor under realistic bandwidth and latency constraints demonstrates that the approach is practically feasible.
Task-aware foveated imaging at acquisition time can significantly outperform baselines operating at the same pixel bandwidth on visual perception tasks.
Ultra-high-resolution image sensors offer the potential to capture fine spatial details critical for many visual perception tasks, but acquiring and processing all pixels at full resolution is often infeasible under realistic bandwidth, latency, and power constraints. Existing approaches address this challenge through acquisition strategies such as spatial or temporal downsampling, which irrevocably discard information before task relevance can be assessed. In this work, we introduce a real-time, predictive, and task-aware foveated imaging system that operates directly at image acquisition time. Leveraging emerging dual-stream sensor architectures, our method dynamically allocates limited pixel bandwidth to task-relevant regions of interest while maintaining a low-resolution global context. We formulate foveated acquisition as a sensor attention policy-learning problem, in which past observations guide actions that determine future measurements, closing the perception-acquisition loop. Through extensive simulation across multiple perception tasks, we demonstrate that our approach achieves high task performance under strict pixel budgets and significantly outperforms relevant baselines operating at the same bandwidth. We further validate our system on a 200-megapixel dual-stream sensor, capturing real-world videos under realistic bandwidth and latency constraints, demonstrating the practical feasibility of task-driven, acquisition-time foveated imaging.
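The closed perception-acquisition loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the learned attention policy is replaced here by a hypothetical hand-written rule (`choose_roi`, which centres the high-resolution window on the brightest cell of the low-resolution context), and the dual-stream sensor is simulated by cropping and block-averaging a full frame. The function names, the 128x128 frame size, and the parameters `roi_size` and `factor` are all assumptions made for illustration. What the sketch does preserve is the structure of the loop: the region read out at frame t is chosen only from what was observed up to frame t-1, and every frame costs the same fixed pixel budget.

```python
import numpy as np


def downsample(frame, factor):
    """Block-average a 2D frame by an integer factor (simulated global stream)."""
    h, w = frame.shape
    h2, w2 = h // factor, w // factor
    blocks = frame[:h2 * factor, :w2 * factor].reshape(h2, factor, w2, factor)
    return blocks.mean(axis=(1, 3))


def choose_roi(context, roi_size, factor):
    """Hypothetical stand-in for a learned policy.

    Returns the top-left corner of a window centred on the brightest
    cell of the low-resolution context, in full-resolution coordinates.
    """
    cy, cx = np.unravel_index(np.argmax(context), context.shape)
    y = cy * factor + factor // 2 - roi_size // 2
    x = cx * factor + factor // 2 - roi_size // 2
    return y, x


def acquire(frame, roi, roi_size, factor):
    """Simulate one dual-stream readout: global context plus one full-res crop."""
    h, w = frame.shape
    # Clamp the requested window so it stays inside the sensor area.
    y = int(np.clip(roi[0], 0, h - roi_size))
    x = int(np.clip(roi[1], 0, w - roi_size))
    context = downsample(frame, factor)
    fovea = frame[y:y + roi_size, x:x + roi_size].copy()
    return context, fovea, (y, x)


def run_loop(frames, roi_size=32, factor=8):
    """Run the loop; return (window position, pixels read) for each frame."""
    h, w = frames[0].shape
    roi = ((h - roi_size) // 2, (w - roi_size) // 2)  # no history yet: start centred
    history = []
    for frame in frames:
        context, fovea, placed = acquire(frame, roi, roi_size, factor)
        history.append((placed, context.size + fovea.size))
        # The action chosen now only affects the NEXT frame's measurement.
        roi = choose_roi(context, roi_size, factor)
    return history


# Usage: a static scene with one bright patch away from the centre.
scene = np.zeros((128, 128))
scene[96:104, 40:48] = 1.0
for position, pixels_read in run_loop([scene, scene, scene]):
    print(position, pixels_read, "of", scene.size)
```

In this toy setting the first readout misses the patch because the window starts at the centre, and later readouts move onto it using only the low-resolution context. A learned policy would replace `choose_roi` with a model trained on the downstream task, and could use more history than the single previous context frame used here.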