Lattice: the structure behind the noise

Created by Flynn Lachendro

Shusheng Yang | Lattice

Shusheng Yang

New York University

Papers on Lattice: 2

Total citations: 0

Topics: 4

Publication activity (papers/week, last 8 weeks)

Research focus

Computer Vision (2) · Multimodal Models (2) · Eval Frameworks & Benchmarks (1) · Robotics & Embodied AI (1)

Frequent co-authors

Saining Xie (2) · Sihyun Yu (1) · Nanye Ma (1) · Pinzhi Huang (1)

Papers (2)

Jun 2, 2026

1w ago · also KAIST

Benchmarking Visual State Tracking in Multimodal Video Understanding

MLLMs are failing to visually track events in videos, performing only modestly above baseline despite strong results on other benchmarks.

Sihyun Yu, Nanye Ma, Pinzhi Huang +8

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models

May 21, 2026

Meta AI · 3w ago · also BAIR, NYU

Cambrian-P: Pose-Grounded Video Understanding

Camera pose, largely ignored in video LLMs, unlocks significant gains in spatial reasoning and even improves general video QA when used as a lightweight supervisory signal.

Jihan Yang, Zifan Zhao, Xichen Pan +5

Computer Vision · Multimodal Models · Robotics & Embodied AI

Hyunseok Lee (1) · June Suk Choi (1) · Ellis Brown (1) · Oscar Michel (1)