The paper introduces Quantization-aware Distilled Restoration (QDR), a novel framework for compressing image restoration models for edge deployment using quantization-aware training and knowledge distillation. QDR addresses challenges like teacher-student capacity mismatch and spatial error amplification by employing FP32 self-distillation and Decoder-Free Distillation (DFD), which corrects quantization errors at the network bottleneck. The framework also includes Learnable Magnitude Reweighting (LMR) to stabilize optimization and an Edge-Friendly Model (EFM) with Learnable Degradation Gating (LDG) for efficient spatial degradation localization, achieving Int8 performance close to FP32 with significant speedups on edge devices.
An Int8 image restoration model reaches near-FP32 quality while running at 442 FPS on an NVIDIA Jetson Orin, using a quantization-aware distillation framework that avoids decoder distillation.
Quantization-Aware Training (QAT), combined with Knowledge Distillation (KD), holds considerable promise for compressing models for edge deployment. However, their joint optimization for image restoration (IR), a precision-sensitive task that recovers visual quality from degraded images, remains largely underexplored. Directly adapting QAT-KD to low-level vision reveals three critical bottlenecks: teacher-student capacity mismatch, spatial error amplification during decoder distillation, and an optimization "tug-of-war" between reconstruction and distillation losses caused by quantization noise. To tackle these, we introduce Quantization-aware Distilled Restoration (QDR), a framework for edge-deployed IR. QDR eliminates capacity mismatch via FP32 self-distillation and prevents error amplification through Decoder-Free Distillation (DFD), which corrects quantization errors strictly at the network bottleneck. To stabilize the optimization tug-of-war, we propose Learnable Magnitude Reweighting (LMR), which dynamically balances the competing gradients. Finally, we design an Edge-Friendly Model (EFM) featuring a lightweight Learnable Degradation Gating (LDG) module that dynamically modulates spatial degradation localization. Extensive experiments across four IR tasks demonstrate that our Int8 model recovers 96.5% of FP32 performance, achieves 442 frames per second (FPS) on an NVIDIA Jetson Orin, and boosts downstream object detection by 16.3 mAP.
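To make the idea of distilling only at the bottleneck more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names `fake_quant_int8` and `bottleneck_distill_loss`, the symmetric per-tensor scale of 0.1, and the toy feature values are all assumptions made for illustration. The sketch simulates Int8 rounding on a student's bottleneck features and measures the mismatch against the FP32 features of the same network (self-distillation), without touching any decoder features.

```python
def fake_quant_int8(values, scale):
    """Simulate symmetric Int8 quantization: scale, round, clamp, dequantize."""
    out = []
    for v in values:
        q = round(v / scale)            # snap to the integer grid
        q = max(-128, min(127, q))      # clamp to the signed 8-bit range
        out.append(q * scale)           # map back to floating point
    return out


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bottleneck_distill_loss(teacher_bottleneck, student_bottleneck):
    """Distillation term on bottleneck features only (assumed MSE form)."""
    return mse(teacher_bottleneck, student_bottleneck)


# Hypothetical FP32 bottleneck features from the teacher copy of the network.
teacher_feat = [0.12, -0.47, 0.93, -1.30]
scale = 0.1  # assumed per-tensor quantization step

# The student shares the architecture; only its activations are quantized.
student_feat = fake_quant_int8(teacher_feat, scale)
loss = bottleneck_distill_loss(teacher_feat, student_feat)
print(f"bottleneck distillation loss: {loss:.6f}")
```

In an actual QAT setup the rounding step would be paired with a gradient approximation such as the straight-through estimator so the loss can be backpropagated; that part is omitted here to keep the sketch dependency-free.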