Even GPT-5 struggles with multi-hop retrieval planning in long videos, achieving only 42% accuracy on a new benchmark designed to isolate this skill.