Current multimodal large language models (MLLMs) are surprisingly bad at step-by-step understanding of human intent in egocentric videos, achieving only 33% accuracy on a new benchmark designed to prevent future-frame leakage.