This paper introduces Semi-Supervised Multimodal Domain Generalization (SSMDG), a new problem setting for learning robust multimodal models from multi-source data with limited labeled samples. To address it, the authors propose a unified framework that combines Consensus-Driven Consistency Regularization for reliable pseudo-labeling, Disagreement-Aware Regularization for handling ambiguous samples, and Cross-Modal Prototype Alignment for domain- and modality-invariant representations. Experiments on newly established SSMDG benchmarks show that the proposed method outperforms existing approaches in both standard and missing-modality scenarios.
Achieve robust multimodal generalization with few labels by exploiting both consensus and disagreement among modalities, even when some modalities are missing.
Multimodal models ideally should generalize to unseen domains while remaining data-efficient to reduce annotation costs. To this end, we introduce and study a new problem, Semi-Supervised Multimodal Domain Generalization (SSMDG), which aims to learn robust multimodal models from multi-source data with few labeled samples. We observe that existing approaches fail to address this setting effectively: multimodal domain generalization methods cannot exploit unlabeled data, semi-supervised multimodal learning methods ignore domain shifts, and semi-supervised domain generalization methods are confined to single-modality inputs. To overcome these limitations, we propose a unified framework featuring three key components: Consensus-Driven Consistency Regularization, which obtains reliable pseudo-labels through confident fused-unimodal consensus; Disagreement-Aware Regularization, which effectively utilizes ambiguous non-consensus samples; and Cross-Modal Prototype Alignment, which enforces domain- and modality-invariant representations while promoting robustness under missing modalities via cross-modal translation. We further establish the first SSMDG benchmarks, on which our method consistently outperforms strong baselines in both standard and missing-modality scenarios. Our benchmarks and code are available at https://github.com/lihongzhao99/SSMDG.
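The idea of "confident fused-unimodal consensus" can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `consensus_pseudo_labels`, the fixed confidence threshold, and the rule that every unimodal prediction must match the fused prediction are all assumptions made for illustration. A sample receives a pseudo-label only when the fused prediction is confident and each modality agrees with it; all other samples are returned as `None`, standing in for the non-consensus samples that the abstract says Disagreement-Aware Regularization handles.

```python
from typing import List, Optional, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def consensus_pseudo_labels(
    fused: Sequence[Sequence[float]],
    unimodal: Sequence[Sequence[Sequence[float]]],
    threshold: float = 0.9,  # hypothetical value, not taken from the paper
) -> List[Optional[int]]:
    """Assign a pseudo-label only where fused and unimodal predictions agree.

    fused:    per-sample class probabilities from the fused (multimodal) head.
    unimodal: one list per modality, each holding per-sample class probabilities.
    Returns the class index for consensus samples and None otherwise.
    """
    labels: List[Optional[int]] = []
    for i, fused_probs in enumerate(fused):
        fused_cls = argmax(fused_probs)
        # Assumed rule 1: the fused prediction must be confident.
        confident = fused_probs[fused_cls] >= threshold
        # Assumed rule 2: every modality must predict the same class.
        agree = all(argmax(modality[i]) == fused_cls for modality in unimodal)
        labels.append(fused_cls if confident and agree else None)
    return labels


# Toy example with two modalities and three unlabeled samples.
fused = [[0.95, 0.05], [0.60, 0.40], [0.92, 0.08]]
video = [[0.80, 0.20], [0.70, 0.30], [0.30, 0.70]]
audio = [[0.90, 0.10], [0.55, 0.45], [0.85, 0.15]]

print(consensus_pseudo_labels(fused, [video, audio]))  # [0, None, None]
```

In the toy example, the second sample is rejected because the fused confidence is below the threshold, and the third because the video prediction disagrees with the fused one.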