Lattice AI Research

Research focus

Architecture Design (Transformers, SSMs, MoE) (2) · Computer Vision (2) · Multimodal Models (2) · Scaling Laws & Emergent Abilities (1)

Frequent co-authors

Bingjun Luo (2) · Xinpeng Ding (2) · NVIDIA (1) · Aaron Blakeman (1)

Papers (3)

Jun 12, 2026

AI21w ago · also NVIDIA, HKUST, Institute of Medical Technology, Motional +3

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Achieving six times the inference throughput of current LLMs while maintaining accuracy, Nemotron 3 Ultra redefines performance benchmarks for agentic reasoning tasks.

NVIDIA, Aaron Blakeman, Aaron Thomas +570

Architecture Design (Transformers, SSMs, MoE) · Scaling Laws & Emergent Abilities · Tool Use & Agents

May 21, 2026

Tsinghua AI · May 21, 2026 · also Shenzhen University, Xidian

Enhancing Visual Token Representations for Video Large Language Models via Training-Free Spatial-Temporal Pooling and Gridding

Video LLMs can get a free performance boost by using ST-GridPool, a novel technique that enhances visual token representations without any additional training.

Bingjun Luo, Tony Wang, Hanqi Chen +1

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models

Tsinghua AI · May 21, 2026 · also Shenzhen University, Xidian

ST-SimDiff: Balancing Spatiotemporal Similarity and Difference for Efficient Video Understanding with MLLMs

Instead of just pruning redundant tokens, ST-SimDiff dramatically cuts MLLM video processing costs by intelligently preserving tokens representing *changes* in the video.

Bingjun Luo, Tony Wang, Chaoqi Chen +1

Computer Vision · Inference & Quantization · Multimodal Models

Tony Wang