AI Laboratory
LLM-powered agents can now produce surprisingly strong photographs in complex 3D environments, suggesting a path towards embodied AI with aesthetic awareness.
Visual degradations can cripple the spatial reasoning abilities of even state-of-the-art MLLMs, but targeted finetuning can restore, and even surpass, human-level performance.
By explicitly exposing the model's reasoning process during SVG generation, CTRL-S achieves higher task success rates, superior SVG code quality, and exceptional visual fidelity compared to existing methods.
A 4B-parameter model, InternVL-U, outperforms 14B-parameter models in multimodal generation and editing, proving that size isn't everything.
Forget hand-annotated 3D datasets: a new automated pipeline generates massive, high-quality 3D spatial intelligence data from raw video, unlocking better VLM reasoning.