The paper argues that model providers should expose vector prompt inputs as a public interface for LLM customization, claiming that text-only prompts are insufficient for scalable, stable, and inference-only customization. The authors provide evidence that vector prompt tuning keeps improving with increasing supervision, whereas text-based prompt optimization saturates early, and that vector prompts exhibit dense, global attention patterns. They contend that inference-only customization is increasingly important under realistic deployment constraints, and that exposing vector prompts need not fundamentally increase model leakage risk under a standard black-box threat model.
Forget text prompts: vector prompt interfaces are the key to unlocking scalable and stable LLM customization.
As large language models (LLMs) transition from research prototypes to real-world systems, customization has emerged as a central bottleneck. While text prompts can already customize LLM behavior, we argue that text-only prompting does not constitute a suitable control interface for scalable, stable, and inference-only customization. This position paper argues that model providers should expose vector prompt inputs as part of the public interface for customizing LLMs. We support this position with diagnostic evidence showing that vector prompt tuning continues to improve with increasing supervision whereas text-based prompt optimization saturates early, and that vector prompts exhibit dense, global attention patterns indicative of a distinct control mechanism. We further discuss why inference-only customization is increasingly important under realistic deployment constraints, and why exposing vector prompts need not fundamentally increase model leakage risk under a standard black-box threat model. We conclude with a call to action for the community to rethink prompt interfaces as a core component of LLM customization.
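To make the distinction between the two interfaces concrete, the following is a minimal sketch, not the paper's implementation. It assumes the common prompt-tuning setup in which a vector prompt is a short sequence of continuous vectors prepended to the input token embeddings, while a text prompt is restricted to rows of the model's frozen embedding table. The names `EMBEDDING_TABLE`, `embed_text`, and `build_input`, the toy vocabulary, and the dimension `EMBED_DIM` are all hypothetical and chosen only for illustration.

```python
import random

EMBED_DIM = 4  # toy embedding width (assumption; real models use thousands)
VOCAB = ["<pad>", "summarize", "translate", "the", "report"]

# Hypothetical frozen embedding table: one fixed vector per vocabulary token.
_rng = random.Random(0)
EMBEDDING_TABLE = {
    tok: [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for tok in VOCAB
}


def embed_text(tokens):
    """Text interface: every position must be a row of the frozen table."""
    return [list(EMBEDDING_TABLE[tok]) for tok in tokens]


def build_input(user_tokens, text_prompt=None, vector_prompt=None):
    """Assemble the embedding sequence a model would consume.

    text_prompt   -- list of vocabulary tokens (discrete choices only)
    vector_prompt -- list of EMBED_DIM-length float vectors (any point in
                     embedding space, not tied to a vocabulary entry)
    """
    prefix = []
    if vector_prompt is not None:
        for vec in vector_prompt:
            if len(vec) != EMBED_DIM:
                raise ValueError("vector prompt entries must have EMBED_DIM floats")
            prefix.append([float(x) for x in vec])
    if text_prompt is not None:
        prefix.extend(embed_text(text_prompt))
    return prefix + embed_text(user_tokens)


# Text-only customization: the prefix is limited to existing token embeddings.
text_input = build_input(["the", "report"], text_prompt=["summarize"])

# Vector customization: the prefix can be any continuous vectors, for example
# ones produced by gradient-based tuning on the customer's side.
tuned_prefix = [[0.1, -0.3, 0.7, 0.0], [0.5, 0.5, -0.2, 0.9]]
vector_input = build_input(["the", "report"], vector_prompt=tuned_prefix)
```

In this sketch the model weights and the embedding table are never modified; only the prefix changes, which is the sense in which such customization is inference-only. How a provider would actually accept, validate, or constrain such vectors is not specified in the abstract and is left open here.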