Stop guessing what feels good: this system learns personalized vibration preferences from just 40 pairwise comparisons.
Grounding reward learning in natural language rationales makes policies 2x more robust to spurious correlations and distribution shifts.