Ryan Bahlous-Boldi

Improbable AI Lab

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Natural Language Processing (1), RLHF & Preference Learning (1), Tool Use & Agents (1)

Frequent co-authors

Isha Puri (1), Idan Shenfeld (1), Akarsh Kumar (1), Mehul Damani (1)

Papers (1)

Improbable AI Lab · May 21, 2026 · also MIT CSAIL

Vector Policy Optimization: Training for Diversity Improves Test-Time Search

LLMs trained with Vector Policy Optimization (VPO) learn to produce diverse solutions that unlock previously unsolvable problems in evolutionary search, outperforming models optimized for a single scalar reward.

Ryan Bahlous-Boldi, Isha Puri, Idan Shenfeld +6

Natural Language Processing, RLHF & Preference Learning, Tool Use & Agents