Even the best MLLMs struggle to meet user requirements, achieving only 66% coverage of essential task functions.
No current MLLM can reliably issue timely safety warnings: performance varies sharply across domains, and models show a troubling trade-off between recall and false positives.
Scaling up LLMs boosts combinatorial creativity in code generation, but the gains plateau on exploratory tasks, revealing a "convergence-by-scaling" effect in which larger models become less divergent.