Technical University of Munich

$$\mathcal{L}_{\text{acl}}=\frac{1}{M}\sum_{i=1}^{M}\max(0,\textit{conf}_{i}-\textit{conf}). \tag{7}$$

Appendix F More Ablation Studies

Parameter Sensitivity. We evaluate the sensitivity of our framework to two key hyperparameters using the HMDB51 dataset. First, the maximum swapping dimension $n_{max}$ for Multimodal Feature Swapping (MFS) was varied among $128$, $256$, and $512$, with results presented in Table 10. An $n_{max}$ value of $256$ yielded the optimal balance, achieving robust performance across all evaluation metrics. Subsequently, with $n_{max}$ fixed at $256$, the weight $\lambda_{\text{acl}}$ for Adaptive Confidence Loss (ACL) was evaluated over the set $0.2$, $0.5$, $1.0$, and $2.0$ (detailed in Table 10). A value of $\lambda_{\text{acl}}=2.0$ consistently delivered the strongest FD performance. Importantly, the framework’s performance remained stable across both parameter sweeps, underscoring its robustness to variations in these hyperparameters.

$n_{max}$   AURC↓   AUROC↑   FPR95↓   ACC↑
128         29.08   88.27    49.57    86.66
256         25.11   90.55    46.22    86.43
512         25.34   90.98    43.90    85.97
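To make Eq. (7) concrete, here is a minimal sketch of the hinge-style term for a single sample. The excerpt does not define its symbols, so the reading used here is an assumption: $\textit{conf}_{i}$ is taken to be the confidence of the $i$-th unimodal branch, $\textit{conf}$ the confidence of the multimodal prediction, and $M$ the number of modalities. The function name `adaptive_confidence_loss` and the input values are hypothetical and not taken from the paper.

```python
def adaptive_confidence_loss(unimodal_confs, multimodal_conf):
    """Sketch of Eq. (7): (1/M) * sum_i max(0, conf_i - conf).

    Assumption (not stated in the excerpt): unimodal_confs holds one
    confidence per modality and multimodal_conf is the fused confidence.
    """
    if not unimodal_confs:
        raise ValueError("need at least one unimodal confidence")
    m = len(unimodal_confs)
    # Each term is positive only when a unimodal confidence exceeds
    # the multimodal one; otherwise it contributes zero.
    return sum(max(0.0, c_i - multimodal_conf) for c_i in unimodal_confs) / m


# Hypothetical values: the first branch is more confident than the
# fused model, the second is not, so only the first term is non-zero.
loss = adaptive_confidence_loss([0.9, 0.6], 0.7)
print(loss)
```

With these made-up inputs only the first branch contributes, giving a loss of roughly 0.1. Under the stated reading, the term vanishes whenever the multimodal confidence is at least as high as every unimodal confidence.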
Multimodal models often exhibit lower confidence than their unimodal counterparts when they're about to fail, and this work leverages that insight to build a better failure detector.