This paper tackles failure detection in multimodal machine learning models, which is critical for deploying these models in high-stakes applications. The authors introduce Adaptive Confidence Regularization (ACR), a framework built on the observation that, in most failure cases, the multimodal prediction is less confident than at least one unimodal branch. ACR combines an Adaptive Confidence Loss, which penalizes this confidence degradation during training, with Multimodal Feature Swapping, a novel outlier synthesis technique that generates failure-aware training examples, to improve failure recognition.
When a multimodal model fails, its prediction is often less confident than that of at least one of its unimodal branches, and this work leverages that insight to build a better failure detector.
The deployment of multimodal models in high-stakes domains, such as self-driving vehicles and medical diagnostics, demands not only strong predictive performance but also reliable mechanisms for detecting failures. In this work, we address the largely unexplored problem of failure detection in multimodal contexts. We propose Adaptive Confidence Regularization (ACR), a novel framework specifically designed to detect multimodal failures. Our approach is driven by a key observation: in most failure cases, the confidence of the multimodal prediction is significantly lower than that of at least one unimodal branch, a phenomenon we term confidence degradation. To mitigate this, we introduce an Adaptive Confidence Loss that penalizes such degradations during training. In addition, we propose Multimodal Feature Swapping, a novel outlier synthesis technique that generates challenging, failure-aware training examples. By training with these synthetic failures, ACR learns to more effectively recognize and reject uncertain predictions, thereby improving overall reliability. Extensive experiments across four datasets, three modalities, and multiple evaluation settings demonstrate that ACR achieves consistent and robust gains. The source code will be available at https://github.com/mona4399/ACR.
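To make the notion of confidence degradation concrete, here is a minimal sketch of how the gap between unimodal and multimodal confidence could be measured. It is an illustration under stated assumptions, not the authors' Adaptive Confidence Loss: confidence is taken to be the maximum softmax probability, the penalty is a simple hinge on the gap, and the function names (`confidence`, `confidence_degradation`) are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits: Sequence[float]) -> float:
    """Assumption: confidence is the maximum softmax probability."""
    return max(softmax(logits))


def confidence_degradation(
    multimodal_logits: Sequence[float],
    unimodal_logits: Sequence[Sequence[float]],
) -> float:
    """Hinge-style gap (an assumed form, not the paper's loss).

    Returns how far the multimodal confidence falls below the most
    confident unimodal branch, or 0.0 when there is no degradation.
    """
    best_unimodal = max(confidence(branch) for branch in unimodal_logits)
    return max(0.0, best_unimodal - confidence(multimodal_logits))


# Toy example: the fused prediction is hesitant while one branch is confident.
fused = [1.0, 0.8, 0.6]
audio = [4.0, 0.0, 0.0]
video = [0.5, 0.4, 0.3]
gap = confidence_degradation(fused, [audio, video])
print(f"confidence degradation: {gap:.3f}")
```

In a training setup, a term of this kind would be averaged over a batch and added to the usual classification loss, so that the fused prediction is discouraged from being less confident than its best unimodal branch. How ACR weights or adapts the penalty is not specified in the abstract and is left out here.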