Search papers, labs, and topics across Lattice.
3
0
6
Reward models trained only on success are fundamentally misaligned with human values, leading to dangerous over-rewarding of poor robot behaviors.
Steering imaginations in video world models can reveal critical failure points in robotic actions that traditional methods might overlook.
Current language agents are still far from matching human expert performance when faced with real-world professional tasks requiring complex reasoning, authoritative source retrieval, and domain-specific knowledge, as revealed by the new \$OneMillion-Bench benchmark.