This paper introduces Multimodal Adversarial Quality Policy (MAQP) to improve the safety of vision-guided robot grasping using RGBD data. MAQP employs a Heterogeneous Dual-Patch Optimization Scheme (HDPOS) to address distribution discrepancies between RGB and depth modalities during adversarial patch generation. Furthermore, it uses a Gradient-Level Modality Balancing Strategy (GLMBS) to resolve optimization imbalances between RGB and depth patches by reweighting gradient contributions based on channel sensitivity and distance-adaptive perturbation bounds.
Safer robot grasping through adversarial patches crafted jointly for RGB and depth data, addressing the limits of RGB-only methods on RGBD input.
Vision-guided robot grasping based on Deep Neural Networks (DNNs) generalizes well but poses safety risks in Human-Robot Interaction (HRI). Recent works address this by designing benign adversarial attacks and patches in the RGB modality, yet their depth-independent characteristics limit their effectiveness on the RGBD modality. In this work, we propose the Multimodal Adversarial Quality Policy (MAQP) to realize multimodal safe grasping. Our framework introduces two key components. First, the Heterogeneous Dual-Patch Optimization Scheme (HDPOS) mitigates the distribution discrepancy between RGB and depth modalities in patch generation by adopting modality-specific initialization strategies, a Gaussian distribution for depth patches and a uniform distribution for RGB patches, while jointly optimizing both modalities under a unified objective function. Second, the Gradient-Level Modality Balancing Strategy (GLMBS) resolves the optimization imbalance between RGB and depth patches in patch shape adaptation by reweighting gradient contributions based on per-channel sensitivity analysis and applying distance-adaptive perturbation bounds. We conduct extensive experiments on benchmark datasets and a cobot, showing the effectiveness of MAQP.
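As a rough illustration of the two components described above, the following Python sketch shows modality-specific patch initialization and one simple form of gradient balancing. It is not the authors' implementation. The function names, the default mean and standard deviation for the depth patch, the use of mean absolute gradient as a per-channel sensitivity proxy, and the linear scaling of the perturbation bound with distance are all assumptions made for this example.

```python
import numpy as np


def init_patches(size, rng, depth_mean=0.5, depth_std=0.1):
    """Modality-specific initialization: uniform RGB patch, Gaussian depth patch.

    depth_mean and depth_std are illustrative defaults, not values from the paper.
    """
    rgb = rng.uniform(0.0, 1.0, size=(size, size, 3))
    depth = rng.normal(depth_mean, depth_std, size=(size, size, 1))
    return rgb, np.clip(depth, 0.0, 1.0)


def balance_gradients(grad_rgb, grad_depth, eps=1e-12):
    """Rescale each channel's gradient to roughly unit mean magnitude.

    Mean absolute gradient per channel stands in for "channel sensitivity";
    this proxy is an assumption of the sketch.
    """
    def per_channel_normalize(grad):
        sensitivity = np.abs(grad).mean(axis=(0, 1), keepdims=True)
        return grad / (sensitivity + eps)

    return per_channel_normalize(grad_rgb), per_channel_normalize(grad_depth)


def distance_adaptive_bound(base_bound, distance, ref_distance=1.0):
    """Hypothetical rule: perturbation bound grows linearly with distance."""
    return base_bound * distance / ref_distance


def projected_step(patch, grad, lr, anchor, bound):
    """One descent step, projected into [anchor - bound, anchor + bound] and [0, 1]."""
    stepped = patch - lr * grad
    stepped = np.clip(stepped, anchor - bound, anchor + bound)
    return np.clip(stepped, 0.0, 1.0)
```

In a joint optimization loop, both patches would be updated from gradients of one shared loss. Under these assumptions, `balance_gradients` keeps a modality or channel with much larger raw gradients from dominating the update, and `projected_step` applies the bound from `distance_adaptive_bound` to the depth patch.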