Beijing Key Laboratory of Intelligent | BUPT | Jun 1, 2026 | arXiv:2606.01967

Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning

Wenhang Shi, Yiren Chen, Shuqing Bian, Zhe Zhao, Jinhao Dong, Pengfei Hu, Wei Lu, Xiaoyong Du

AI Summary

This paper investigates the critical yet overlooked role of prompts during the training of Large Language Models (LLMs), revealing that semantically equivalent prompts can lead to vastly different outcomes in terms of catastrophic forgetting and generalization. The authors introduce State-Adaptive Prompt Optimization (SAPO), a novel training strategy that dynamically adjusts prompts based on task loss, leading to significant performance improvements across various benchmarks. Their findings indicate that superior prompts can be identified prior to learning, providing a new framework for robust fine-tuning that enhances model performance while mitigating forgetting.

Key Contribution

Paraphrased prompts may seem equivalent, but they can drastically alter learning dynamics, with some prompts leading to better generalization and less forgetting.

Abstract

While prompt engineering is instrumental in maximizing the capabilities of Large Language Models (LLMs) during inference, the role of prompts during training remains critically underexplored. Prevailing fine-tuning paradigms typically treat training prompts as mere surface forms, assuming that semantically equivalent instructions yield identical learning outcomes. However, we reveal that this equivalence is deceptive: while paraphrased prompts often lead to comparable in-task performance, they induce drastically different cross-task impacts regarding catastrophic forgetting and generalization. Crucially, these impacts are positively correlated across tasks, indicating the existence of superior prompts that consistently yield better performance. Furthermore, we discover that these superior prompts can be robustly identified by task loss prior to learning. Leveraging these insights, we introduce State-Adaptive Prompt Optimization (SAPO), a lightweight yet effective training strategy that shifts task formulation from a static input to a dynamic, state-adaptive variable. Comprehensive experiments on diverse benchmarks confirm its effectiveness: SAPO significantly mitigates forgetting while improving generalization, achieving substantial performance gains over state-of-the-art methods. These results provide insights into how training prompts shape learning dynamics and offer a practical recipe for robust fine-tuning. Our code is available at https://github.com/Eric8932/SAPO.
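The abstract describes choosing among paraphrased prompts using task loss measured at the current model state, but it does not give the selection rule, the preferred loss direction, or the re-selection schedule. The sketch below is only an illustration of that general idea under stated assumptions: `select_prompt`, `adaptive_prompt_schedule`, `loss_at_step`, `reselect_every`, and `prefer_lower` are all hypothetical names, and preferring the lower-loss prompt is an assumption, not a claim about the authors' method. The released code at the URL above is the authoritative implementation.

```python
from typing import Callable, List, Sequence


def select_prompt(
    candidates: Sequence[str],
    loss_fn: Callable[[str], float],
    prefer_lower: bool = True,  # ASSUMPTION: the abstract does not state the direction
) -> str:
    """Return the candidate prompt with the best task loss under the chosen direction."""
    scored = [(loss_fn(prompt), prompt) for prompt in candidates]
    pick = min if prefer_lower else max
    # Ties resolve to the earliest candidate in the list.
    return pick(scored, key=lambda pair: pair[0])[1]


def adaptive_prompt_schedule(
    candidates: Sequence[str],
    loss_at_step: Callable[[str, int], float],  # hypothetical: loss of a prompt at a training step
    num_steps: int,
    reselect_every: int = 1,  # hypothetical re-selection interval
    prefer_lower: bool = True,
) -> List[str]:
    """Re-select the training prompt as the (simulated) model state changes.

    Returns the prompt used at each step. In a real training loop,
    loss_at_step would evaluate the current model on the task data
    formatted with the given prompt.
    """
    chosen: List[str] = []
    current = candidates[0]
    for step in range(num_steps):
        if step % reselect_every == 0:
            current = select_prompt(
                candidates,
                lambda prompt, s=step: loss_at_step(prompt, s),
                prefer_lower,
            )
        chosen.append(current)
    return chosen
```

With `reselect_every=1` the prompt is treated as a variable that tracks the model state at every step; a larger interval trades adaptivity for fewer loss evaluations. A static-prompt baseline corresponds to selecting once and never re-selecting.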

Natural Language Processing | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
