This paper introduces FUN, a Focal U-shaped Network for joint hyperspectral image (HSI) reconstruction and object detection, addressing the limitations of slow post-capture reconstruction in snapshot spectral imaging. FUN uses a shared U-shaped backbone with focal modulation, an efficient alternative to self-attention, to enable mutual task interaction and reduce computational complexity. Experiments on a new HSI object detection dataset demonstrate state-of-the-art performance with 40% fewer parameters and 30% less computation compared to alternatives.
Skip slow post-capture hyperspectral reconstruction: FUN jointly learns reconstruction and detection in a single, efficient network, a step toward real-time object detection.
Conventional push-broom hyperspectral imaging suffers from slow acquisition, which precludes real-time object detection. Snapshot spectral imaging, in contrast, captures hyperspectral images (HSIs) instantaneously, making real-time object detection feasible, yet its potential is often compromised by time-consuming post-capture reconstruction. To address this issue, we propose the Focal U-shaped Network (FUN), a novel end-to-end framework that jointly performs HSI reconstruction and object detection via multi-task learning. FUN employs a shared U-shaped backbone in which reconstruction provides the underlying spectral information while detection guides the learning of semantic-aware priors, facilitating mutually beneficial task interaction. Crucially, we introduce focal modulation, an efficient alternative to self-attention that modulates spatial and spectral features while reducing the quadratic computational complexity associated with self-attention, enabling a self-attention-free architecture for joint reconstruction and detection. Furthermore, we contribute a new HSI object detection dataset with 8712 annotated objects across 363 HSIs to facilitate evaluation of the proposed method. Experiments demonstrate that FUN achieves state-of-the-art performance on both tasks while using 40% fewer parameters and 30% less computation than recent alternatives, making it promising for future real-time edge deployment. The code and datasets are available at https://github.com/ShawnDong98/FUN.
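To make the complexity argument concrete, the following is a rough back-of-the-envelope sketch, not the FUN implementation. It compares the token-mixing cost of self-attention, which grows quadratically with the number of spatial positions, against a focal-modulation-style aggregation built from depthwise convolutions, which grows linearly. The function names, the kernel sizes `(3, 5, 7)`, and the channel count are hypothetical choices for illustration, and the counts ignore the linear projections that both designs share.

```python
def self_attention_mixing_cost(height, width, channels):
    """Approximate multiply-adds for QK^T plus the attention-weighted sum over V.

    Both products cost N^2 * C for N = height * width tokens.
    """
    n = height * width
    return 2 * n * n * channels


def focal_style_mixing_cost(height, width, channels, kernel_sizes=(3, 5, 7)):
    """Approximate multiply-adds for stacked depthwise convs plus one modulation.

    kernel_sizes is an assumed set of context levels, not taken from the paper.
    Each depthwise conv costs N * C * k^2; the elementwise modulation costs N * C.
    """
    n = height * width
    conv = sum(n * channels * k * k for k in kernel_sizes)
    modulation = n * channels
    return conv + modulation


if __name__ == "__main__":
    channels = 32  # assumed feature width, for illustration only
    for side in (64, 128, 256):
        sa = self_attention_mixing_cost(side, side, channels)
        fm = focal_style_mixing_cost(side, side, channels)
        print(f"{side}x{side}: self-attention {sa:,} vs focal-style {fm:,} "
              f"(ratio {sa / fm:.1f}x)")
```

Under these assumptions, doubling the spatial side length multiplies the self-attention count by 16 but the focal-style count by only 4, which is the scaling gap the abstract refers to.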