This paper introduces Hold-One-Shot-Out (HOSO), a validation-free method for learning the blending ratio in few-shot CLIP adaptation, where the ratio is usually tuned by ablation on the test set or on an additional validation set. HOSO holds out a one-shot set from the support examples and uses it to learn the blending ratio, while the adapter itself trains on the remaining support examples. Experiments show that HOSO-Adapter outperforms the CLIP-Adapter baseline by more than 4 percentage points on average across 11 few-shot datasets, and in the 8- and 16-shot settings it even surpasses CLIP-Adapter with the optimal blending ratio selected on the test set.
Forget validation sets: HOSO learns the optimal blending ratio for CLIP adapters in a truly few-shot manner, outperforming even test-set-optimized baselines.
In many CLIP adaptation methods, a blending ratio hyperparameter controls the trade-off between general pretrained CLIP knowledge and the limited, dataset-specific supervision from the few-shot examples. Most few-shot CLIP adaptation techniques report results after ablating the blending ratio on the test set, or require additional validation sets to select the blending ratio per dataset, and thus are not strictly few-shot. We present a simple, validation-free method for learning the blending ratio in CLIP adaptation. Hold-One-Shot-Out (HOSO) allows CLIP-Adapter-style methods to compete in the newly established validation-free setting. CLIP-Adapter with HOSO (HOSO-Adapter) learns the blending ratio using a one-shot hold-out set, while the adapter trains on the remaining few-shot support examples. Under the validation-free few-shot protocol, HOSO-Adapter outperforms the CLIP-Adapter baseline by more than 4 percentage points on average across 11 standard few-shot datasets. Interestingly, in the 8- and 16-shot settings, HOSO-Adapter outperforms CLIP-Adapter even with the optimal blending ratio selected on the test set. Ablation studies validate the one-shot hold-out mechanism and decoupled training, and show improvements over a baseline in which the blending ratio is learnt naively. Code is released here: https://github.com/chris-vorster/HOSO-Adapter
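
The split described above can be illustrated with a minimal sketch. It is not taken from the released code: the function names `hold_one_shot_out` and `blend` are hypothetical, "one-shot" is assumed to mean one held-out example per class, and the blend is assumed to follow the usual CLIP-Adapter residual form, alpha * adapted + (1 - alpha) * pretrained. In this reading, the adapter would be trained on `train` and the blending ratio `alpha` would be fitted on `holdout` only.

```python
import random
from collections import defaultdict


def hold_one_shot_out(support, seed=0):
    """Split a few-shot support set into (train, holdout).

    `support` is a list of (example, label) pairs. One example per class
    goes to the hold-out split (assumption: "one-shot" = one per class).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for example, label in support:
        by_class[label].append(example)

    train, holdout = [], []
    for label, examples in by_class.items():
        if len(examples) < 2:
            # Assumption of this sketch only: with a single shot there is
            # nothing left to train the adapter on, so we refuse to split.
            raise ValueError(f"class {label!r} needs at least 2 shots")
        held = rng.randrange(len(examples))
        for i, example in enumerate(examples):
            (holdout if i == held else train).append((example, label))
    return train, holdout


def blend(adapted, pretrained, alpha):
    """Residual blend of adapted and pretrained features (assumed form)."""
    return [alpha * a + (1.0 - alpha) * p for a, p in zip(adapted, pretrained)]


# Toy usage: 2 classes x 4 shots gives 6 training pairs and 2 held-out pairs.
support = [(f"img_{c}_{i}", c) for c in ("cat", "dog") for i in range(4)]
train, holdout = hold_one_shot_out(support)
print(len(train), len(holdout))
```

Keeping the two splits disjoint is what decouples the two training signals: the adapter never sees the examples used to fit `alpha`, so the ratio is chosen on data the adapter has not memorised.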