CASIA
Current MLLM benchmarks are missing the forest for the trees: Agentic-MME reveals that strong final-answer accuracy masks surprisingly poor tool use and planning in complex multimodal tasks.
Naive application of LLM inference optimizations can *hurt* the performance of smaller reasoning models, highlighting the need for RLLM-specific serving strategies.