Mar 4, 2026 · arXiv:2603.04341

Hold-One-Shot-Out (HOSO) for Validation-Free Few-Shot CLIP Adapters

Chris Vorster, Noel Murphy, Derek Molloy

AI Summary

This paper introduces Hold-One-Shot-Out (HOSO), a validation-free method for learning the blending ratio in few-shot CLIP adaptation. It targets a common problem: existing methods tune this hyperparameter either by ablation on the test set or with an additional validation set. HOSO-Adapter learns the blending ratio on a one-shot hold-out set, while the adapter itself trains on the remaining support examples. In experiments, HOSO-Adapter outperforms the CLIP-Adapter baseline by more than 4 percentage points on average across 11 few-shot datasets. In the 8- and 16-shot settings, it even surpasses CLIP-Adapter with the optimal blending ratio selected on the test set.

Key Contribution

HOSO learns the blending ratio for CLIP adapters without a validation set, keeping the setup strictly few-shot. In the 8- and 16-shot settings it outperforms even CLIP-Adapter with a blending ratio selected on the test set.

Abstract

In many CLIP adaptation methods, a blending ratio hyperparameter controls the trade-off between general pretrained CLIP knowledge and the limited, dataset-specific supervision from the few-shot cases. Most few-shot CLIP adaptation techniques report results by ablation of the blending ratio on the test set or require additional validation sets to select the blending ratio per dataset, and thus are not strictly few-shot. We present a simple, validation-free method for learning the blending ratio in CLIP adaptation. Hold-One-Shot-Out (HOSO) presents a novel approach for CLIP-Adapter-style methods to compete in the newly established validation-free setting. CLIP-Adapter with HOSO (HOSO-Adapter) learns the blending ratio using a one-shot, hold-out set, while the adapter trains on the remaining few-shot support examples. Under the validation-free few-shot protocol, HOSO-Adapter outperforms the CLIP-Adapter baseline by more than 4 percentage points on average across 11 standard few-shot datasets. Interestingly, in the 8- and 16-shot settings, HOSO-Adapter outperforms CLIP-Adapter even with the optimal blending ratio selected on the test set. Ablation studies validate the use of a one-shot hold-out mechanism, decoupled training, and improvements over the naively learnt blending ratio baseline. Code is released here: https://github.com/chris-vorster/HOSO-Adapter
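The hold-out idea in the abstract can be illustrated with a minimal sketch. It is not taken from the released code. It assumes that "one-shot" means one held-out example per class, and that the blending ratio mixes adapted and original features in the residual form used by CLIP-Adapter, alpha * adapted + (1 - alpha) * original. The function names `hold_one_shot_out` and `blend` are hypothetical.

```python
import random
from collections import defaultdict


def hold_one_shot_out(labels, seed=0):
    """Split support-set indices into adapter-training and hold-out parts.

    Assumption: one example per class is held out (to learn the blending
    ratio); the remaining shots are kept for training the adapter.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, holdout_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        held = rng.choice(indices)  # the single held-out shot for this class
        holdout_idx.append(held)
        train_idx.extend(i for i in indices if i != held)
    return train_idx, holdout_idx


def blend(adapted, original, alpha):
    """Residual blending: alpha weights the adapted features,
    (1 - alpha) weights the original pretrained features."""
    return [alpha * a + (1.0 - alpha) * o for a, o in zip(adapted, original)]


# Toy 4-shot, 2-class support set: indices 0-3 are class 0, 4-7 are class 1.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
train_idx, holdout_idx = hold_one_shot_out(labels)
blended = blend([1.0, 0.0], [0.0, 1.0], alpha=0.25)
```

In this sketch the adapter would see only `train_idx`, and the blending ratio would be fitted only on `holdout_idx`, so neither step touches a validation or test set. How the ratio is actually parameterised and optimised is specified in the paper and repository, not here.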

Eval Frameworks & Benchmarks, Multimodal Models, Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
