This paper introduces Training-Free Test-Time Contrastive Learning (TF-TTCL), a novel adaptation framework that improves the robustness of frozen LLMs under distribution shift by distilling supervision from the model's own inference experiences. TF-TTCL employs an "Explore-Reflect-Steer" loop, using semantic query augmentation via multi-agent role-playing, contrastive experience distillation to extract textual rules, and contextual rule retrieval to guide inference. Experiments on reasoning tasks demonstrate that TF-TTCL outperforms zero-shot baselines and other TTA methods in online evaluation.
Frozen LLMs can dynamically improve their reasoning abilities at test time, without any training, by distilling knowledge from their own successes and failures.
Large language models (LLMs) demonstrate strong reasoning capabilities, but their performance often degrades under distribution shift. Existing test-time adaptation (TTA) methods rely on gradient-based updates that require white-box access and incur substantial overhead, while training-free alternatives are either static or depend on external guidance. In this paper, we propose Training-Free Test-Time Contrastive Learning (TF-TTCL), a training-free adaptation framework that enables a frozen LLM to improve online by distilling supervision from its own inference experiences. Specifically, TF-TTCL implements a dynamic "Explore-Reflect-Steer" loop through three core modules: 1) Semantic Query Augmentation first diversifies problem views via multi-agent role-playing to generate diverse reasoning trajectories; 2) Contrastive Experience Distillation then captures the semantic gap between superior and inferior trajectories, distilling it into explicit textual rules; and 3) Contextual Rule Retrieval finally activates these stored rules during inference to dynamically steer the frozen LLM toward robust reasoning patterns while avoiding observed errors. Extensive experiments on closed-ended reasoning tasks and open-ended evaluation tasks demonstrate that TF-TTCL consistently outperforms strong zero-shot baselines and representative TTA methods under online evaluation. Code is available at https://github.com/KevinSCUTer/TF-TTCL.
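The "Explore-Reflect-Steer" loop described in the abstract can be sketched as pseudocode. This is a minimal illustrative sketch, not the authors' implementation (which is at the linked repository): all function names, the role list, the trajectory-ranking heuristic, and the keyword-overlap retrieval are assumptions introduced for illustration, and `llm` is a stub standing in for a frozen model call.

```python
# Hedged sketch of TF-TTCL's Explore-Reflect-Steer loop.
# llm() is a stub; in practice it would call a frozen LLM.

def llm(prompt: str) -> str:
    """Stub standing in for a frozen LLM call."""
    return f"answer({prompt[:20]}...)"

def semantic_query_augmentation(query: str, roles) -> list[str]:
    # Explore: diversify problem views via role-playing prompts
    # (prompt format is an assumption).
    return [llm(f"As a {r}, solve: {query}") for r in roles]

def contrastive_experience_distillation(best: str, worst: str) -> str:
    # Reflect: distill the gap between superior and inferior
    # trajectories into an explicit textual rule.
    return f"Prefer reasoning like {best!r}; avoid patterns like {worst!r}."

def contextual_rule_retrieval(query: str, rules: list[str], k: int = 2) -> list[str]:
    # Steer: retrieve stored rules; naive keyword-overlap scoring
    # stands in for semantic retrieval.
    scored = sorted(rules, key=lambda r: -sum(w in r for w in query.split()))
    return scored[:k]

def tf_ttcl_step(query: str, rule_store: list[str],
                 roles=("mathematician", "skeptic")) -> str:
    trajectories = semantic_query_augmentation(query, roles)
    # Assumption: trajectories would really be ranked (e.g. by a verifier
    # or self-consistency); here the first/last stand in for best/worst.
    best, worst = trajectories[0], trajectories[-1]
    rule_store.append(contrastive_experience_distillation(best, worst))
    guidance = contextual_rule_retrieval(query, rule_store)
    return llm(f"Rules: {guidance}\nQuestion: {query}")

rules: list[str] = []
answer = tf_ttcl_step("What is 17 * 24?", rules)
```

Note that the model weights are never updated: each step only grows the textual rule store, which steers subsequent inference through the prompt.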