The paper introduces HMAR, a hierarchical modality-aware medical image retrieval (MIR) framework based on a Mixture-of-Experts (MoE) architecture. It targets three limitations of existing MIR systems: uniform feature encoding, similarity metrics derived from coarse classification labels, and an exclusive focus on global image similarity. HMAR uses two experts, one for global and one for local feature extraction, trained with a two-stage contrastive learning strategy, and employs Kolmogorov-Arnold Network (KAN) layers to generate hash codes for efficient retrieval. Experiments on the RadioImageNet-CT dataset show that HMAR achieves state-of-the-art retrieval performance, with mAP of 0.711 for 64-bit and 0.724 for 128-bit hash codes, an improvement of up to 1.1% over existing methods.
By adaptively routing medical image queries to global and local feature experts, HMAR achieves state-of-the-art retrieval accuracy without relying on expensive bounding box annotations.
Medical image retrieval (MIR) is a critical component of computer-aided diagnosis, yet existing systems suffer from three persistent limitations: uniform feature encoding that fails to account for the varying clinical importance of anatomical structures, ambiguous similarity metrics based on coarse classification labels, and an exclusive focus on global image similarity that cannot meet the clinical demand for fine-grained region-specific retrieval. We propose HMAR (Hierarchical Modality-Aware Expert and Dynamic Routing), an adaptive retrieval framework built on a Mixture-of-Experts (MoE) architecture. HMAR employs a dual-expert mechanism: Expert0 extracts global features for holistic similarity matching, while Expert1 learns position-invariant local representations for precise lesion-region retrieval. A two-stage contrastive learning strategy eliminates the need for expensive bounding-box annotations, and a sliding-window matching algorithm enables dense local comparison at inference time. Hash codes are generated via Kolmogorov-Arnold Network (KAN) layers for efficient Hamming-distance search. Experiments on the RadioImageNet-CT dataset (16 clinical patterns, 29,903 images) show that HMAR achieves mean Average Precision (mAP) of 0.711 and 0.724 for 64-bit and 128-bit hash codes, improving over the state-of-the-art ACIR method by 0.7% and 1.1%, respectively.
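The Hamming-distance search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sign-threshold binarization and the function names `binarize`, `hamming_distance`, and `rank_by_hamming` are assumptions made for illustration, and in HMAR the real-valued inputs would come from the KAN hashing layers rather than from hand-written lists.

```python
from typing import List, Sequence, Tuple


def binarize(features: Sequence[float]) -> int:
    """Pack the signs of a real-valued vector into an integer hash code.

    Assumption: a value > 0 maps to bit 1, anything else to bit 0.
    The first feature becomes the most significant bit.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(
    query: int, database: Sequence[int], top_k: int
) -> List[Tuple[int, int]]:
    """Return the top_k (index, distance) pairs, nearest codes first.

    Ties are broken by database index so the ordering is deterministic.
    """
    scored = [(hamming_distance(query, code), idx) for idx, code in enumerate(database)]
    scored.sort()
    return [(idx, dist) for dist, idx in scored[:top_k]]


if __name__ == "__main__":
    # Toy 4-bit codes; the paper reports results for 64-bit and 128-bit codes.
    database = [0b1111, 0b1010, 0b0000, 0b1000]
    query = binarize([0.3, -1.2, 0.8, -0.1])
    print(rank_by_hamming(query, database, top_k=3))
```

Because each comparison reduces to an XOR followed by a bit count, ranking a query against a large database is cheap compared with computing distances between floating-point embeddings, which is the usual motivation for hashing-based retrieval.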