Sangho Lee

Research focus

Multimodal Models (2) · Architecture Design (Transformers, SSMs, MoE) (1) · Computer Vision (1) · Inference & Quantization (1) · Training Efficiency & Optimization (1)

Frequent co-authors

Ranjay Krishna (2) · Winson Han (2) · Christopher Clark (1)

Papers (3)

Christopher Clark +11 · Mar 30, 2026 · also Paul G. Allen School of Computer Science

MolmoPoint: Better Pointing for VLMs with Grounding Tokens

Ditch the coordinate system: VLMs can point *way* better by directly selecting visual tokens, leading to SOTA results and improved sample efficiency.

Christopher Clark, Yue Yang, Jae Sung Park +9

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models

Mar 18, 2026 · also AI2, Korea U, Paul G. Allen School of Computer Science

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

Pruning vision tokens across both the ViT and LLM can yield a 62% efficiency boost in video VLMs with minimal performance loss, and without complex text conditioning.

Jianrui Zhang, Winson Han, Ranjay Krishna +2

Inference & Quantization · Multimodal Models · Training Efficiency & Optimization

AI2 · Aug 11, 2025 · also Microsoft Research, NVIDIA, UW

MolmoAct: Action Reasoning Models that can Reason in Space

Robot foundation models can achieve state-of-the-art performance by explicitly reasoning about spatial plans as editable trajectory traces, rather than directly mapping perception to control.

Jason Lee, Jiafei Duan, Haoquan Fang +1666

Reasoning & Chain-of-Thought · Robotics & Embodied AI · World Models & Planning
