The paper identifies and addresses Semantic Coverage Imbalance (SCI), a bias in vision datasets where rare but meaningful semantic concepts are underrepresented, leading to unfair model performance. To mitigate SCI, the authors propose SemCovNet, which combines a Semantic Descriptor Map (SDM), a Descriptor Attention Modulation (DAM) module, and a Descriptor-Visual Alignment (DVA) loss to explicitly learn and correct semantic coverage disparities. Experiments across multiple datasets show that SemCovNet reduces the Coverage Disparity Index (CDI) and improves model reliability, yielding fairer performance.
Uncovered: a new bias called Semantic Coverage Imbalance (SCI) plagues vision models, but SemCovNet offers a way to fix it.
Modern vision models increasingly rely on rich semantic representations that extend beyond class labels to include descriptive concepts and contextual attributes. However, existing datasets exhibit Semantic Coverage Imbalance (SCI), a previously overlooked bias arising from long-tailed semantic representations. Unlike class imbalance, SCI occurs at the semantic level, affecting how models learn and reason about rare yet meaningful semantics. To mitigate SCI, we propose the Semantic Coverage-Aware Network (SemCovNet), a novel model that explicitly learns to correct semantic coverage disparities. SemCovNet integrates a Semantic Descriptor Map (SDM) for learning semantic representations, a Descriptor Attention Modulation (DAM) module that dynamically weights visual and concept features, and a Descriptor-Visual Alignment (DVA) loss that aligns visual features with descriptor semantics. We quantify semantic fairness using a Coverage Disparity Index (CDI), which measures the alignment between coverage and error. Extensive experiments across multiple datasets demonstrate that SemCovNet enhances model reliability and substantially reduces CDI, achieving more equitable performance. This work establishes SCI as a measurable and correctable bias, providing a foundation for advancing semantic fairness and interpretable vision learning.
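The abstract describes CDI only as a measure of the alignment between coverage and error and gives no formula. As a rough illustration of that idea, the sketch below computes a stand-in score: the absolute Pearson correlation between how often each descriptor appears and the error rate on samples carrying it. This is an assumption made for illustration and not the paper's definition of CDI; the function names (`descriptor_coverage`, `descriptor_error`, `coverage_disparity_proxy`) and the binary presence-matrix input are hypothetical.

```python
from math import sqrt


def descriptor_coverage(presence):
    """Fraction of samples in which each descriptor appears.

    presence: one row per sample, each row a list of 0/1 flags per descriptor.
    """
    n_samples = len(presence)
    n_desc = len(presence[0])
    return [sum(row[j] for row in presence) / n_samples for j in range(n_desc)]


def descriptor_error(presence, correct):
    """Error rate restricted to the samples that carry each descriptor.

    correct: one 0/1 flag per sample (1 = model prediction was right).
    Descriptors that never appear get an error of 0.0 (an arbitrary choice).
    """
    n_desc = len(presence[0])
    errors = []
    for j in range(n_desc):
        hits = [c for row, c in zip(presence, correct) if row[j]]
        errors.append(1.0 - sum(hits) / len(hits) if hits else 0.0)
    return errors


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coverage_disparity_proxy(presence, correct):
    """Assumed stand-in for CDI: |corr(coverage, error)| over descriptors.

    Values near 1 mean error tracks coverage closely; values near 0 mean
    error is roughly independent of how often a descriptor is seen.
    """
    coverage = descriptor_coverage(presence)
    error = descriptor_error(presence, correct)
    return abs(pearson(coverage, error))


# Toy data: 4 samples, 3 descriptors; the rarest descriptor is misclassified.
presence = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
correct = [1, 1, 1, 0]
score = coverage_disparity_proxy(presence, correct)
```

Under this stand-in, a model whose mistakes concentrate on rarely covered descriptors scores higher than one whose errors are unrelated to coverage, which is the kind of disparity the abstract says SemCovNet reduces.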