Mar 31, 2026 · arXiv:2603.29192

Efficient Camera Pose Augmentation for View Generalization in Robotic Policy Learning

Sen Wang, Huaiyi Dong, Jingyi Tian, Jiayi Li, Zhuo Yang, Zhuoyi Yang, Tongtong Cao, Anlin Chen, Shuang Wu, Le Wang, Sanping Zhou, Sanpin Zhou

AI Summary

This paper introduces GenSplat, a feed-forward 3D Gaussian Splatting framework for view-generalized robotic policy learning. GenSplat reconstructs high-fidelity 3D scenes from sparse, uncalibrated inputs using a permutation-equivariant architecture and a 3D-prior distillation strategy to prevent geometric collapse. By rendering diverse synthetic views from these 3D representations, the method augments the observational manifold during training, leading to policies that generalize better to novel viewpoints.

Key Contribution

Policies trained with view augmentation from a novel feed-forward 3D Gaussian Splatting framework maintain robust execution under severe spatial perturbations where baselines fail.

Abstract

Prevailing 2D-centric visuomotor policies exhibit a pronounced deficiency in novel view generalization, as their reliance on static observations hinders consistent action mapping across unseen views. In response, we introduce GenSplat, a feed-forward 3D Gaussian Splatting framework that facilitates view-generalized policy learning through novel view rendering. GenSplat employs a permutation-equivariant architecture to reconstruct high-fidelity 3D scenes from sparse, uncalibrated inputs in a single forward pass. To ensure structural integrity, we design a 3D-prior distillation strategy that regularizes the 3DGS optimization, preventing the geometric collapse typical of purely photometric supervision. By rendering diverse synthetic views from these stable 3D representations, we systematically augment the observational manifold during training. This augmentation forces the policy to ground its decisions in underlying 3D structures, thereby ensuring robust execution under severe spatial perturbations where baselines severely degrade.
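The abstract does not say how the synthetic viewpoints are chosen. As a minimal sketch of what camera pose augmentation could look like, the snippet below jitters a nominal camera on a spherical shell around a look-at target and returns camera-to-world matrices that a 3DGS renderer could consume. The function names (`look_at`, `sample_perturbed_poses`), the parameters (`max_angle_deg`, `radius_jitter`), the uniform sampling scheme, and the OpenCV-style axis convention are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    """Return a 4x4 camera-to-world matrix as nested lists.

    Assumed convention: camera x = right, y = down, z = forward (OpenCV style).
    Degenerate if the viewing direction is parallel to `up`.
    """
    forward = _normalize(_sub(target, eye))
    right = _normalize(_cross(forward, up))
    down = _cross(forward, right)
    rows = [[right[i], down[i], forward[i], eye[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def sample_perturbed_poses(eye, target, n, max_angle_deg=15.0,
                           radius_jitter=0.1, seed=None):
    """Sample n poses near a nominal camera, all looking at `target`.

    Azimuth and elevation are jittered uniformly within +/- max_angle_deg,
    and the camera-to-target distance is scaled by a factor drawn from
    [1 - radius_jitter, 1 + radius_jitter]. (Hypothetical scheme.)
    """
    rng = random.Random(seed)
    offset = _sub(eye, target)
    radius = math.sqrt(sum(x * x for x in offset))
    azimuth = math.atan2(offset[1], offset[0])
    elevation = math.asin(offset[2] / radius)
    max_angle = math.radians(max_angle_deg)
    pole_limit = math.pi / 2 - 0.05  # keep away from the look_at singularity

    poses = []
    for _ in range(n):
        az = azimuth + rng.uniform(-max_angle, max_angle)
        el = elevation + rng.uniform(-max_angle, max_angle)
        el = max(-pole_limit, min(pole_limit, el))
        r = radius * rng.uniform(1.0 - radius_jitter, 1.0 + radius_jitter)
        new_eye = [target[0] + r * math.cos(el) * math.cos(az),
                   target[1] + r * math.cos(el) * math.sin(az),
                   target[2] + r * math.sin(el)]
        poses.append(look_at(new_eye, target))
    return poses
```

Each returned pose would be passed to the renderer to produce one augmented observation for the same action label. Widening `max_angle_deg` trades rendering fidelity against viewpoint coverage, since views far from the input cameras rely more heavily on the reconstructed geometry.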

Computer Vision · Data Curation & Synthetic Data · Robotics & Embodied AI

Citation Metrics

Citations: 0
Influential citations: 0
References: 70
Year: 2026
Venue: N/A
