May 26, 2026arXiv:2605.27186

MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation

Haoyu Zheng, Yun Zhu, Shu Yuan, Shangming Chen, Qing Wang, Wenqiao Zhang, Jun Xiao

AI Summary

The paper introduces MAIGO, a self-distillation method to mitigate the "lost-in-conversation" (LiC) problem in LLMs by reducing self-contamination, where early deviations in assistant replies propagate through subsequent turns. MAIGO uses "history-cleaned" references, removing prior assistant replies during training while preserving the user-visible dialogue prefix, and distills from full-view references on answer turns. Experiments on Qwen2.5-7B-Instruct demonstrate that MAIGO significantly improves performance on the SHARDED LiC task, increasing accuracy from 52.8% to 66.1% and the SHARDED/FULL ratio from 66.5% to 84.1%.

Key Contribution

LLMs stumble in multi-turn conversations not just because of context length, but because they poison themselves with their own past mistakes – and you can fix it with self-distillation.

Abstract

Large language models often solve tasks from a fully specified prompt but degrade when the same requirements unfold over multiple turns, known as the lost-in-conversation (LiC) gap. We trace part of this degradation to self-contamination: intermediate assistant replies enter later context and carry early deviations forward. Motivated by this mechanism, we propose MAIGO, an on-policy self-distillation method that reduces this contamination using history-cleaned references from the model's own policy. For middle turns, MAIGO removes prior assistant replies while preserving the user-visible sharded prefix; for answer turns, it distills from paired full-view references conditioned on the completed user-side dialogue. A reliability weight downweights middle-turn samples that disagree with the clean reference. MAIGO requires no verifier rewards, state labels, or inference-time scaffolding. Under the LiC paired-view protocol with deterministic verifiers, MAIGO improves Qwen2.5-7B-Instruct SHARDED accuracy from 52.8 to 66.1 and the SHARDED/FULL ratio from 66.5% to 84.1%, while keeping FULL accuracy within 2.3 points. These results show that self-contamination is a trainable component of the LiC gap.

Inference & Quantization Natural Language Processing

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation

Related Papers