Forget long-context attention: G2F-RAG shows you can boost video reasoning in LMMs by fusing retrieved knowledge directly into the visual space.
Explicitly teaching models to generate and leverage verifiable evidence during both training and inference unlocks state-of-the-art video reasoning performance, even with a small ensemble of candidates.