Robotics | SJTU | Mar 2, 2026 | arXiv:2603.02139

Rethinking Camera Choice: An Empirical Study on Fisheye Camera Properties in Robotic Manipulation

Han Xue, Nan Min, Xiaotong Liu, Wendi Chen, Yuan Fang, Jun Lv, Cewu Lu, Chuan Wen

AI Summary

This paper empirically investigates the impact of fisheye cameras on imitation learning for robotic manipulation, focusing on spatial localization, scene generalization, and hardware generalization. The study finds that while fisheye cameras improve spatial localization in complex environments and enhance scene generalization with sufficient training diversity, they suffer from scale overfitting during cross-camera transfer. By addressing scale overfitting with Random Scale Augmentation (RSA), the authors demonstrate improved hardware generalization performance.

Key Contribution

Policies trained on wrist-mounted fisheye cameras achieve superior scene generalization, but only when the training data covers enough environmental diversity; trained on simple scenes alone, they are prone to overfitting.

Abstract

The adoption of fisheye cameras in robotic manipulation, driven by their exceptionally wide Field of View (FoV), is rapidly outpacing a systematic understanding of their downstream effects on policy learning. This paper presents the first comprehensive empirical study to bridge this gap, rigorously analyzing the properties of wrist-mounted fisheye cameras for imitation learning. Through extensive experiments in both simulation and the real world, we investigate three critical research questions: spatial localization, scene generalization, and hardware generalization. Our investigation reveals that: (1) The wide FoV significantly enhances spatial localization, but this benefit is critically contingent on the visual complexity of the environment. (2) Fisheye-trained policies, while prone to overfitting in simple scenes, unlock superior scene generalization when trained with sufficient environmental diversity. (3) While naive cross-camera transfer leads to failures, we identify the root cause as scale overfitting and demonstrate that hardware generalization performance can be improved with a simple Random Scale Augmentation (RSA) strategy. Collectively, our findings provide concrete, actionable guidance for the large-scale collection and effective use of fisheye datasets in robotic learning. More results and videos are available on https://robo-fisheye.github.io/
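The abstract names Random Scale Augmentation (RSA) but does not describe its implementation. The following is a minimal sketch of one plausible reading: rescale each training image about its center by a random factor, then crop or pad back to the original resolution so the policy sees objects at varying apparent sizes. The function name `random_scale_augment`, the scale range, nearest-neighbor sampling, and zero padding are all assumptions made for illustration and may differ from the authors' method.

```python
import numpy as np


def random_scale_augment(image, rng, scale_range=(0.8, 1.2)):
    """Rescale an H x W (x C) image about its center by a random factor.

    Hypothetical illustration of scale augmentation, not the paper's code.
    A factor > 1 zooms in (center crop); a factor < 1 zooms out, and the
    uncovered border is filled with zeros. Output shape equals input shape.
    """
    h, w = image.shape[:2]
    s = rng.uniform(*scale_range)  # assumed uniform sampling of the scale

    # Each output pixel samples the source at center + offset / s
    # (nearest-neighbor lookup keeps the sketch dependency-free).
    cy, cx = (h - 1) / 2, (w - 1) / 2
    ys = np.round((np.arange(h) - cy) / s + cy).astype(int)
    xs = np.round((np.arange(w) - cx) / s + cx).astype(int)

    # Source coordinates that fall outside the image are left as zeros.
    valid = ((ys >= 0) & (ys < h))[:, None] & ((xs >= 0) & (xs < w))[None, :]
    sampled = image[np.clip(ys, 0, h - 1)][:, np.clip(xs, 0, w - 1)]

    out = np.zeros_like(image)
    out[valid] = sampled[valid]
    return out, s


# Usage: augment one dummy RGB frame with a seeded generator.
rng = np.random.default_rng(0)
frame = np.ones((8, 8, 3), dtype=np.uint8)
augmented, scale = random_scale_augment(frame, rng)
```

In a training pipeline such a transform would be applied independently to each sampled observation, so the policy cannot rely on the absolute image scale of any single camera.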

Computer Vision | Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 57

Year: 2026

Venue: N/A
