Independent researchers *Equally contributed authors
Code agents struggle with evolving user requirements, revealing a 38-point gap in performance across leading LLMs when faced with iterative feedback.
Interactive world models still have a long way to go: a comprehensive benchmark reveals that even state-of-the-art models struggle to consistently perform well across video quality, interaction adherence, and physics compliance.
Surprisingly, general-purpose vision models already contain better action representations for robotic control than specialized embodied models trained explicitly for that purpose.
LongCat-Next shatters the language-centric paradigm by unifying text, vision, and audio into a single autoregressive model with minimal modality-specific design, finally reconciling understanding and generation in discrete vision modeling.
Instruction-based image editing models still struggle to edit small objects, with a new benchmark revealing significant performance gaps despite progress on existing benchmarks.