Paul G. Allen School of Computer Science
Forget multiple forward passes: this reward model scores all candidate responses at once, unlocking massive speedups and better comparative reasoning.
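The paper's architecture isn't spelled out in this teaser, but the core idea — pack the prompt and every candidate into one sequence, run a single pass, and read one score per candidate — can be sketched with a toy one-layer attention "encoder" (all names and sizes below are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # toy hidden size

def attend(X):
    # One self-attention pass over the packed sequence, so each
    # candidate's representation can attend to all the others.
    scores = X @ X.T / np.sqrt(D)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X

def score_candidates(prompt_emb, candidate_embs, w_out):
    # Pack prompt + ALL candidates into ONE sequence -> one forward pass.
    X = attend(np.vstack([prompt_emb] + candidate_embs))
    # Read a scalar score from each candidate's last token position.
    ends = np.cumsum([len(prompt_emb)] + [len(c) for c in candidate_embs])[1:] - 1
    return np.array([X[e] @ w_out for e in ends])

prompt = rng.normal(size=(4, D))
candidates = [rng.normal(size=(rng.integers(3, 6), D)) for _ in range(3)]
scores = score_candidates(prompt, candidates, rng.normal(size=D))
print(scores.shape)  # one score per candidate, from a single pass
```

The comparative-reasoning benefit falls out of the packing: candidates score each other through attention, which a per-candidate forward pass can't do.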
Forget training on closed sets: WildDet3D leverages geometric cues and diverse prompts to achieve SOTA 3D object detection across 13.5K categories in the wild.
Open-source web agents can now outperform GPT-4o on key web navigation tasks, thanks to a new dataset and model family that levels the playing field.
Ditch the coordinate system: VLMs can point *way* better by directly selecting visual tokens, leading to SOTA results and improved sample efficiency.
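The teaser doesn't give the model's actual head, but "selecting visual tokens" instead of emitting coordinates amounts to an argmax over patch tokens, then mapping the chosen patch back to pixels — a minimal sketch with invented toy sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
GRID, PATCH, D = 14, 16, 32  # 14x14 patch grid, 16px patches (toy)

patch_tokens = rng.normal(size=(GRID * GRID, D))  # visual tokens from the encoder
query = rng.normal(size=D)                        # embedding of the pointing request

# Select a visual token directly rather than regressing (x, y) as text.
idx = int(np.argmax(patch_tokens @ query))
row, col = divmod(idx, GRID)

# The "point" is just the chosen patch's center in pixel space.
point = ((col + 0.5) * PATCH, (row + 0.5) * PATCH)
print(idx, point)
```

One plausible reading of the sample-efficiency gain: the output space shrinks from free-form coordinate strings to a small, grounded set of token choices.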
Forget redrawing diagrams by hand: VFIG, a new vision-language model, can automatically convert rasterized figures into editable SVGs with near GPT-5.2 quality.
Pruning vision tokens across both the ViT and LLM can yield a 62% efficiency boost in video VLMs with minimal performance loss, and without complex text conditioning.
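The exact pruning criterion isn't stated here; as a stand-in, the two-stage idea — prune once inside the ViT, again before the LLM, with no text conditioning — can be sketched using token norm as a crude text-free saliency proxy (keep ratios below are made up to land near the quoted 62%):

```python
import numpy as np

rng = np.random.default_rng(0)

def prune_tokens(tokens, keep_ratio):
    """Keep the top-k tokens by L2 norm (a simple text-free saliency proxy)."""
    k = max(1, int(len(tokens) * keep_ratio))
    saliency = np.linalg.norm(tokens, axis=-1)
    keep = np.sort(np.argsort(saliency)[-k:])  # preserve spatio-temporal order
    return tokens[keep]

# 8 frames x 196 patch tokens, 32-dim features (toy sizes).
vit_tokens = rng.normal(size=(8 * 196, 32))

# Stage 1: prune inside the ViT (keep 60% of patch tokens).
stage1 = prune_tokens(vit_tokens, 0.6)
# Stage 2: prune again before the LLM (keep 63% of survivors),
# compounding to roughly a 62% overall token reduction.
stage2 = prune_tokens(stage1, 0.63)

print(len(stage2) / len(vit_tokens))  # ~0.38 of the original tokens survive
```

Sorting the kept indices matters: it drops tokens without shuffling the frame/patch order the LLM expects.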
A new video-based reward model beats GPT-5.2 and Gemini-3 Pro at evaluating computer-using agents, offering a scalable, model-agnostic alternative to traditional methods.
Scaling VLMs won't magically unlock reasoning skills; you need to address the reporting bias in training data that suppresses tacit information.
Ditch slow, external segmentation pipelines: TrajTok learns trajectory tokens end-to-end, boosting video understanding while staying lean and adaptable.