Jun 8, 2026 | arXiv:2606.09525

Emergence of Context Characteristics Sensitivity in Large Language Models

Nadya Yuki Wangsajaya, Haeun Yu, Isabelle Augenstein

AI Summary

This study investigates how large language models (LLMs) develop sensitivity to context characteristics during instruction fine-tuning (IFT) across three stages: supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR). The research demonstrates that SFT enhances the models' tendency to prioritize easily interpretable contexts, while subsequent training stages can either reinforce or alter these preferences based on the dataset used. Ultimately, the findings underscore the importance of carefully designing IFT datasets to promote effective context utilization in instruction-tuned models.

Key Contribution

Context sensitivity in LLMs shifts across instruction fine-tuning stages: SFT biases models toward contexts that are easy to understand, and later stages can either reinforce or reshape that bias depending on the training dataset.

Abstract

During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the provided context to answer a query. While prior work has studied how context characteristics correlate with context usage by the LLM, this analysis has been limited to inference time, leaving open how these relationships are acquired in the first place. Here, we measure how models' sensitivity to such characteristics shifts across successive IFT stages: supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR). Experiments across four models and three datasets show that SFT makes models more likely to use contexts that are easy to understand, such as those that are longer, more similar to the query, and more fluent. Post-SFT dynamics may either reinforce or resolve these preferences depending on the training dataset. Our findings reveal that context usage is actively reshaped at each IFT stage, and that designing a balanced IFT dataset is important for ensuring robust context utilization in instruction-tuned models.

Natural Language Processing; Scaling Laws & Emergent Abilities

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
