Yucheng Wang

Research focus

Speech & Audio (2) · Architecture Design (Transformers, SSMs, MoE) (2) · Interpretability & Mechanistic Interp (1) · Tool Use & Agents (1)

Frequent co-authors

Jing Peng (1) · Hanqi Li (1) · Chenghao Wang (1) · Wenming Tu (1)

Papers (5)

May 27, 2026

ETH · 3w ago · also Hunyuan Team, NJU, Northwestern, NTU +4

Audio-Mind: An Auditable Agentic Framework for Audio Understanding

Over-reliance on agentic decomposition can actually *hurt* audio understanding when a strong audio frontend already provides sufficient information, highlighting the importance of conditional evidence acquisition.

Yucheng Wang, Jing Peng, Hanqi Li +6

Interpretability & Mechanistic Interp · Speech & Audio · Tool Use & Agents

Apr 6, 2026

Synthesis4AD: Synthetic Anomalies are All You Need for 3D Anomaly Detection

Forget painstakingly collecting real-world defect data: high-fidelity synthetic anomalies, automatically generated from product designs using an MLLM, can dramatically improve 3D anomaly detection.

Yihan Sun, Yuqi Cheng, Junjie Zu +5

Computer Vision · Data Curation & Synthetic Data · Robotics & Embodied AI

Mar 16, 2026

also Cohere, Moonshot, UCSD

Attention Residuals

Forget fixed residual connections: Attention Residuals let each layer selectively attend to previous layers, boosting performance and gradient flow in deep LLMs.

Kimi Team, Jianlin Su, Weixin Xu +28

Architecture Design (Transformers, SSMs, MoE) · Training Efficiency & Optimization

Mar 12, 2026

also Pengcheng Laboratory

Linking Perception, Confidence and Accuracy in MLLMs

MLLMs are often overconfident, but a new confidence-driven training and test-time scaling approach can boost accuracy by 8.8% across benchmarks.

Yuetian Du, Yucheng Wang, Zhijie Xu +4

Eval Frameworks & Benchmarks · Multimodal Models · RLHF & Preference Learning

Mar 11, 2026

Jing Peng +9 · also SJTU

G-STAR: End-to-End Global Speaker-Tracking Attributed Recognition

G-STAR tackles long-form, multi-speaker ASR by giving Speech-LLMs time-aware speaker tracking, enabling robust identity linking across chunks.