Key Laboratory of Multimedia Trusted Perception and Efficient Computing
CausalMem achieves over 20x visual token compression while maintaining high accuracy in streaming video understanding, improving the memory efficiency of multimodal large language models (MLLMs).
MLLMs can significantly improve knowledge-based visual question answering (KB-VQA) performance by first identifying the entity from a limited candidate set and then selecting evidence, which makes the workflow more efficient and effective.