This paper investigates the critical yet overlooked role of prompts during the training of Large Language Models (LLMs), revealing that semantically equivalent prompts can lead to vastly different outcomes in terms of catastrophic forgetting and generalization. The authors introduce State-Adaptive Prompt Optimization (SAPO), a novel training strategy that dynamically adjusts prompts based on task loss, leading to significant performance improvements across various benchmarks. Their findings indicate that superior prompts can be identified prior to learning, providing a new framework for robust fine-tuning that enhances model performance while mitigating forgetting.
Paraphrased prompts may seem equivalent, but they can drastically alter learning dynamics, with some prompts leading to better generalization and less forgetting.
While prompt engineering is instrumental in maximizing the capabilities of Large Language Models (LLMs) during inference, the role of prompts during training remains critically underexplored. Prevailing fine-tuning paradigms typically treat training prompts as mere surface forms, assuming that semantically equivalent instructions yield identical learning outcomes. However, we reveal that this equivalence is deceptive: while paraphrased prompts often lead to comparable in-task performance, they induce drastically different cross-task impacts regarding catastrophic forgetting and generalization. Crucially, these impacts are positively correlated across tasks, indicating the existence of superior prompts that consistently yield better performance. Furthermore, we discover that these superior prompts can be robustly identified by task loss prior to learning. Leveraging these insights, we introduce State-Adaptive Prompt Optimization (SAPO), a lightweight yet effective training strategy that shifts task formulation from a static input to a dynamic, state-adaptive variable. Comprehensive experiments on diverse benchmarks confirm that SAPO significantly mitigates forgetting while improving generalization, achieving substantial performance gains over state-of-the-art methods. These results provide insights into how training prompts shape learning dynamics and offer a practical recipe for robust fine-tuning. Our code is available at https://github.com/Eric8932/SAPO.
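The idea of ranking paraphrased prompts by task loss before training can be illustrated with a minimal sketch. This is not the SAPO implementation from the repository above. The abstract does not state whether a lower or higher loss marks a superior prompt, how candidate paraphrases are generated, or how often the choice is refreshed, so the names `mean_task_loss`, `select_prompt`, `toy_loss`, and the `prefer_lower` flag are all hypothetical, and `toy_loss` only stands in for a real model's loss under its current weights.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]                    # (input text, target text)
LossFn = Callable[[str, str, str], float]    # (prompt, input, target) -> loss


def mean_task_loss(prompt: str, examples: Sequence[Example], loss_fn: LossFn) -> float:
    """Average loss of the current model state on `examples` under one prompt."""
    if not examples:
        raise ValueError("need at least one example to score a prompt")
    return sum(loss_fn(prompt, x, y) for x, y in examples) / len(examples)


def select_prompt(
    candidates: Sequence[str],
    examples: Sequence[Example],
    loss_fn: LossFn,
    prefer_lower: bool = True,   # ASSUMPTION: the selection direction is not given in the abstract
) -> str:
    """Pick one paraphrase by its pre-training task loss."""
    if not candidates:
        raise ValueError("need at least one candidate prompt")
    scored = [(mean_task_loss(p, examples, loss_fn), p) for p in candidates]
    pick = min if prefer_lower else max
    return pick(scored, key=lambda pair: pair[0])[1]


def toy_loss(prompt: str, source: str, target: str) -> float:
    """Hypothetical stand-in for a model's loss; NOT a real language-model loss."""
    return len(prompt) / 10.0 + len(target) / 100.0


candidates = [
    "Translate to French:",
    "Please render the following text in French:",
]
examples = [("hello", "bonjour")]

# Scored before any gradient step, using only the current model state.
best = select_prompt(candidates, examples, toy_loss)
```

In a state-adaptive setting, one plausible reading is that `select_prompt` would be called again whenever the model state changes (for example, at the start of each new task), with `loss_fn` wrapping a forward pass of the current checkpoint. That schedule is an assumption here, not something the abstract specifies.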