Xue Pu

Papers on Lattice

Total citations

Topics

h-index

Research focus

Speech & Audio (3) · Natural Language Processing (2) · Architecture Design (Transformers, SSMs, MoE) (1) · Multimodal Models (1) · Tool Use & Agents (1)

Frequent co-authors

Tianle Liang (2) · Yifu Chen (2) · Shengpeng Ji (2) · Jingyu Lu (2)

Papers (3)

May 29, 2026

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

SwanSphere achieves real-time, high-fidelity spatial audio generation from panoramic video and text, overcoming the latency and spatial accuracy limitations of existing methods.

Ke Lei, Yu Zhang, Changhao Pan +4

Architecture Design (Transformers, SSMs, MoE) · Multimodal Models · Speech & Audio

Apr 17, 2026

VoxMind: An End-to-End Agentic Spoken Dialogue System

VoxMind raises task completion rates in spoken dialogue agents from 34.88% to 74.57%, surpassing even Gemini-2.5-Pro, by integrating "Think-before-Speak" reasoning and asynchronous tool management.

Tianle Liang, Yifu Chen, Shengpeng Ji +7

Natural Language Processing · Speech & Audio · Tool Use & Agents

Apr 16, 2026

WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training

Reinforcement learning can now be practically applied to spoken dialogue models thanks to a new post-training recipe that disentangles semantic and acoustic improvements.

Yifu Chen, Shengpeng Ji, Qian Chen +10

Natural Language Processing · RLHF & Preference Learning · Speech & Audio
