Autonomous GUI agents can now outperform humans on complex tasks, thanks to a novel framework that rigorously verifies completion, breaks failure loops, and searches for solutions.
Forget slow, end-to-end models: building real-time voice agents hinges on a cascaded streaming pipeline, as demonstrated by a new tutorial achieving sub-second latency.
A unified Vision-Language Model and Diffusion architecture unlocks surprisingly effective optical flow forecasting from noisy web data, enabling language-conditioned robot control and video generation.