MLLMs struggle to juggle proactive tasks and reactive queries in dynamic video streams, but a simple agentic framework can significantly improve their coordination without any training.
Forget static datasets: RL-based co-training unlocks +20% real-world VLA performance by interactively leveraging simulation while preserving real-world capabilities.