Lattice: the structure behind the noise


Created by Flynn Lachendro


Kevin J. Shih | Lattice


Papers on Lattice: 2

Total citations: 0

Topics: 5

h-index: 21

Publication activity (papers/week, last 8 weeks)

Research focus

Multimodal Models (2) · Robotics & Embodied AI (1) · World Models & Planning (1) · Eval Frameworks & Benchmarks (1)

Frequent co-authors

Arushi Goel (2) · Siddharth Gururani (2) · Zhifeng Kong (2) · Aditi (1)

Papers (2)

NVIDIA · Jun 1, 2026 · also BAIR, Galbot, Georgia Tech, HKUST +9

Cosmos 3: Omnimodal World Models for Physical AI

Cosmos 3 sets a new benchmark for omnimodal models, outperforming the existing state of the art on both text-to-image and image-to-video tasks.

Aditi, Niket Agarwal, Arslan Ali +285

Multimodal Models · Robotics & Embodied AI · World Models & Planning

Tingle Li +8 · May 28, 2026

Benchmarking Single-Factor Physical Video-to-Audio Generation

V2A models prioritize text captions over visual cues when generating audio, resulting in physically plausible but often temporally misaligned sounds.

Tingle Li, Siddharth Gururani, Kevin J. Shih +6

Eval Frameworks & Benchmarks · Multimodal Models · Speech & Audio

Research focus (continued): Speech & Audio (1)

Frequent co-authors (continued): Niket Agarwal (1) · Martin Antolini (1)