SDUUQ · Apr 21, 2026 · arXiv:2604.19339

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

Shijie Wang, Zijian Wang, Yadan Luo, Haojie Li, Zi Huang, Mahsa Baktashmotlagh

AI Summary

This paper introduces a Divide-and-Conquer Holistic Cognition Network (DHCNet) to improve ultra-fine-grained visual categorization (Ultra-FGVC) performance in data-limited scenarios by focusing on holistic cues. DHCNet decomposes holistic cues into spatially-associated subtle discrepancies, using a self-shuffling operation on local regions to establish spatial associations and guide perception of the original topological structure. Experiments on five Ultra-FGVC datasets demonstrate that DHCNet achieves state-of-the-art performance by iteratively refining and incorporating these holistic cues as supervisory signals.

Key Contribution

Decomposing holistic visual cues into subtle, spatially-associated discrepancies allows for state-of-the-art ultra-fine-grained classification even with limited training data.

Abstract

Ultra-fine-grained visual categorization (Ultra-FGVC) aims to classify highly similar subcategories within fine-grained objects using limited training samples. However, holistic yet discriminative cues, such as leaf contours in extremely similar cultivars, remain under-explored in current studies, which limits recognition performance. Although such cues are crucial, modeling holistic cues with complex morphological structures typically requires a large number of training samples, which poses significant challenges in data-limited scenarios. To address this challenge, we propose a novel Divide-and-Conquer Holistic Cognition Network (DHCNet) that implements a divide-and-conquer strategy: it decomposes holistic cues into spatially-associated subtle discrepancies and progressively establishes the holistic cognition process, significantly simplifying holistic cognition while reducing dependency on training data. Technically, DHCNet begins by progressively analyzing subtle discrepancies, transitioning from smaller local patches to larger ones using a self-shuffling operation on local regions. Simultaneously, it leverages the unaffected local regions to potentially guide the perception of the original topological structure among the shuffled patches, thereby helping to establish spatial associations for these discrepancies. Additionally, DHCNet incorporates online refinement of the holistic cues discovered from local regions into the training process to iteratively improve their quality. As a result, DHCNet uses these holistic cues as supervisory signals to fine-tune the parameters of the recognition model, improving its sensitivity to holistic cues across entire objects. Extensive evaluations demonstrate that DHCNet achieves remarkable performance on five widely-used Ultra-FGVC datasets.
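The self-shuffling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `self_shuffle_region`, the region format, the nested-list image, and returning the permutation as a possible topology target are all assumptions made for illustration. It only shows the general idea of shuffling patch-sized tiles inside one local region, leaving the remaining regions untouched, and repeating this from smaller to larger patches.

```python
import random


def self_shuffle_region(image, region, patch, rng):
    """Shuffle patch-sized tiles inside one region; leave the rest untouched.

    Hypothetical helper, not from the paper.
    image  : list of rows (H x W) holding any per-pixel values
    region : (top, left, height, width); height and width must be
             multiples of `patch`
    patch  : side length of each square tile
    rng    : random.Random instance (passed in for reproducibility)
    Returns (new_image, perm), where output tile i is copied from
    input tile perm[i], with tiles counted row-major inside the region.
    """
    top, left, height, width = region
    if height % patch or width % patch:
        raise ValueError("region size must be a multiple of patch size")
    rows, cols = height // patch, width // patch

    # Top-left corner of every tile in the region, in row-major order.
    corners = [(top + r * patch, left + c * patch)
               for r in range(rows) for c in range(cols)]
    perm = list(range(len(corners)))
    rng.shuffle(perm)

    out = [list(row) for row in image]  # copy so the input is not modified
    for dst, src in enumerate(perm):
        dy, dx = corners[dst]
        sy, sx = corners[src]
        for i in range(patch):
            out[dy + i][dx:dx + patch] = image[sy + i][sx:sx + patch]
    return out, perm


# Toy 8x8 "image" whose pixel values encode their original position.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
rng = random.Random(0)

# Assumed schedule: shuffle the top half with progressively larger patches,
# while the bottom half stays intact as spatial context.
for patch in (1, 2, 4):
    shuffled, perm = self_shuffle_region(image, (0, 0, 4, 8), patch, rng)
    print(patch, len(perm))
```

In this sketch the returned permutation records where each tile came from, so a model could in principle be trained to recover the original layout from the shuffled region plus its intact surroundings. Whether DHCNet uses the permutation in this way is not stated in the abstract.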

Computer Vision · Data Curation & Synthetic Data · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
