This paper introduces Variational Reward Factorization (VRF), a novel approach to LLM personalization that represents user preferences as variational distributions within a shared preference space. VRF uses a variational encoder to infer these user distributions and Wasserstein distance matching to derive user-specific weights from shared probabilistic bases. By incorporating uncertainty awareness through a variance-attenuated loss, VRF achieves superior personalization performance compared to existing methods, particularly in few-shot settings and across varying uncertainty levels, ultimately improving downstream alignment.
VRF, an uncertainty-aware approach to LLM personalization, outperforms all baselines on three benchmarks, including in few-shot settings where user data is scarce.
Reward factorization personalizes large language models (LLMs) by decomposing rewards into shared basis functions and user-specific weights. Yet, existing methods estimate user weights from scarce data in isolation and as deterministic points, leading to inaccurate and unreliable inference. We introduce Variational Reward Factorization (VRF), an uncertainty-aware framework that represents each user's preferences as a variational distribution in a shared preference space. VRF infers user distributions via a variational encoder, derives weights through Wasserstein distance matching with shared probabilistic bases, and downweights uncertain estimates through a variance-attenuated loss. On three benchmarks, VRF outperforms all baselines across seen and unseen users, few-shot scenarios, and varying uncertainty levels, with gains extending to downstream alignment.
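The pipeline described in the abstract (user distribution, Wasserstein matching against shared probabilistic bases, factorized reward, variance-attenuated loss) can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: diagonal Gaussians for both user and basis distributions, a softmax over negative squared 2-Wasserstein distances to turn distances into weights, a Bradley-Terry preference loss, and a common heteroscedastic form of variance attenuation. All function names and the `temperature` parameter are hypothetical, and the variational encoder is left out; its outputs (`user_mu`, `user_sigma`) are taken as given.

```python
import numpy as np


def w2_squared_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between diagonal Gaussians.

    For diagonal covariances the closed form is
    ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2 (sigma = standard deviation).
    Inputs broadcast over leading dimensions; the last axis is the latent dim.
    """
    mean_term = np.sum((mu_a - mu_b) ** 2, axis=-1)
    scale_term = np.sum((sigma_a - sigma_b) ** 2, axis=-1)
    return mean_term + scale_term


def user_weights(user_mu, user_sigma, basis_mu, basis_sigma, temperature=1.0):
    """Assumed matching rule: softmax over negative W2^2 distances to K bases."""
    dist = w2_squared_diag_gaussians(user_mu, user_sigma, basis_mu, basis_sigma)
    logits = -dist / temperature
    logits = logits - logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def factorized_reward(weights, basis_rewards):
    """Reward factorization: r_u = sum_k w_{u,k} * phi_k, basis_rewards[..., k] = phi_k."""
    return basis_rewards @ weights


def variance_attenuated_bt_loss(r_chosen, r_rejected, user_sigma):
    """Assumed attenuation: Bradley-Terry NLL scaled by 1 / (2 s), plus 0.5 * log s,

    where s is the mean variance of the user distribution. A larger s shrinks
    the contribution of the preference term, and the log term discourages
    inflating s without bound.
    """
    margin = r_chosen - r_rejected
    nll = np.logaddexp(0.0, -margin)  # equals -log(sigmoid(margin))
    s = np.mean(user_sigma ** 2)
    return nll / (2.0 * s) + 0.5 * np.log(s)


if __name__ == "__main__":
    # Toy setup: K = 3 shared bases in a 2-dimensional preference space.
    basis_mu = np.array([[0.0, 0.0], [2.0, 2.0], [-3.0, 1.0]])
    basis_sigma = np.ones((3, 2))
    user_mu, user_sigma = np.array([0.2, -0.1]), np.array([0.8, 1.1])

    w = user_weights(user_mu, user_sigma, basis_mu, basis_sigma)
    phi_chosen = np.array([1.5, 0.2, -0.4])    # basis rewards, chosen response
    phi_rejected = np.array([0.3, 0.9, 0.1])   # basis rewards, rejected response
    loss = variance_attenuated_bt_loss(
        factorized_reward(w, phi_chosen),
        factorized_reward(w, phi_rejected),
        user_sigma,
    )
    print("weights:", w, "loss:", loss)
```

In this sketch a user whose inferred distribution sits close to one basis receives most of its weight from that basis, and a user with a wide (uncertain) distribution contributes less to the preference term of the loss. The paper's actual matching rule and loss may differ from these choices.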