Ditch the static image: this method generates realistic talking avatars by learning from *videos* of the subject in completely different scenes.
Injecting optical flow into VLMs lets them spot subtle video transitions that other methods miss, opening the door to more robust video understanding.
Coordinating human input with autonomous orientation control enables stable teleoperation of multiple objects at high accelerations, even in complex scenarios.