The paper introduces Multi-turn Dialogue Selection (MDS), a dialogue-level data selection framework to improve instruction tuning of language models on multi-turn conversations. MDS uses a global coverage stage based on bin-wise selection in the user-query trajectory space, combined with a local structural stage that evaluates topic grounding, information progress, and query-answer form consistency. Experiments show that MDS outperforms single-turn selectors, LLM scorers, and heuristic baselines on multiple benchmarks, especially on long conversations.
Noisy multi-turn dialogue data hurts instruction tuning, but selecting entire conversations based on topic grounding and information flow yields surprisingly robust models.
Instruction-tuned language models increasingly rely on large multi-turn dialogue corpora, but these datasets are often noisy and structurally inconsistent, with topic drift, repetitive chitchat, and mismatched answer formats across turns. We address this from a data selection perspective and propose MDS (Multi-turn Dialogue Selection), a dialogue-level framework that scores whole conversations rather than isolated turns. MDS combines a global coverage stage that performs bin-wise selection in the user-query trajectory space to retain representative yet non-redundant dialogues, with a local structural stage that evaluates within-dialogue reliability through entity-grounded topic grounding and information progress, together with query-answer form consistency for functional alignment. MDS outperforms strong single-turn selectors, dialogue-level LLM scorers, and heuristic baselines on three multi-turn benchmarks and an in-domain Banking test set, achieving the best overall rank across reference-free and reference-based metrics, and is more robust on long conversations under the same training budget. Code and resources are included in the supplementary materials.
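To make the two-stage idea concrete, here is a minimal sketch of how bin-wise selection could combine coverage with a per-dialogue quality score. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the trajectory space is built, how bins are formed, or how the structural signals are aggregated. The function name `select_dialogues`, the 1-D `trajectory` coordinate in [0, 1), the single `score` field, and the round-robin fill are all hypothetical stand-ins.

```python
from collections import defaultdict


def select_dialogues(dialogues, budget, num_bins=4):
    """Pick up to `budget` dialogue ids, spreading picks across bins.

    Assumptions (not from the paper): each dialogue carries a 1-D
    `trajectory` coordinate in [0, 1) standing in for its position in the
    user-query trajectory space, and a single `score` standing in for the
    combined local structural signals (higher is better).
    """
    # Global coverage stage: group dialogues into equal-width bins.
    bins = defaultdict(list)
    for d in dialogues:
        idx = min(int(d["trajectory"] * num_bins), num_bins - 1)
        bins[idx].append(d)

    # Local stage: rank dialogues inside each bin by their score.
    for items in bins.values():
        items.sort(key=lambda d: d["score"], reverse=True)

    # Round-robin over bins so no single region dominates the selection.
    selected = []
    rank = 0
    while len(selected) < budget:
        added = False
        for idx in sorted(bins):
            if rank < len(bins[idx]) and len(selected) < budget:
                selected.append(bins[idx][rank]["id"])
                added = True
        if not added:  # every bin is exhausted
            break
        rank += 1
    return selected


# Toy data: "a" and "b" fall in the same bin, "c" and "d" in two others.
toy = [
    {"id": "a", "trajectory": 0.10, "score": 0.9},
    {"id": "b", "trajectory": 0.15, "score": 0.5},
    {"id": "c", "trajectory": 0.60, "score": 0.7},
    {"id": "d", "trajectory": 0.90, "score": 0.2},
]
picked = select_dialogues(toy, budget=3)
print(picked)
```

In this toy run, "b" is left out even though it scores higher than "d", because its bin is already represented by "a". That is the trade-off a coverage-first selector makes: a purely score-ranked selector would keep "b" and lose the region that only "d" covers.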