Zhehuai Chen

Research focus

Speech & Audio (3)
Multimodal Models (2)
Eval Frameworks & Benchmarks (2)
Architecture Design (Transformers, SSMs, MoE) (1)
Tool Use & Agents (1)

Frequent co-authors

Nvidia Amala Sanjay Deshmukh (1)
K. Chumachenko (1)
Tuomas Rintamaki (1)
Matthieu Le (1)

Papers (3)

Apr 27, 2026

NVIDIA · also Amazon Science, Microsoft Research, UW, Music X Lab +1

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Multimodal models can now achieve state-of-the-art performance in real-world tasks like document understanding and audio-video comprehension with significantly reduced inference latency thanks to novel token-reduction techniques.

Nvidia Amala Sanjay Deshmukh, K. Chumachenko, Tuomas Rintamaki +209

Architecture Design (Transformers, SSMs, MoE) · Multimodal Models · Speech & Audio

Apr 6, 2026

Full-Duplex-Bench-v3: Benchmarking Tool Use for Full-Duplex Voice Agents Under Real-World Disfluency

Real-world speech disfluencies trip up even the most advanced full-duplex voice agents, exposing critical gaps in self-correction and multi-step reasoning abilities.

Guan-Ting Lin, Chen Chen, Zhehuai Chen +2

Eval Frameworks & Benchmarks · Speech & Audio · Tool Use & Agents

Mar 19, 2026

How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation

Text-only LLMs vary surprisingly widely in how much auditory knowledge they already contain, and this pre-existing knowledge strongly predicts their performance when adapted for audio-language tasks.

Ke-Han Lu, Szu-Wei Fu, Chao-Han Huck Yang +14

Eval Frameworks & Benchmarks · Multimodal Models · Speech & Audio
