Bridging the gap between audio reconstruction and language modeling objectives yields neural audio codecs that are both more acoustically faithful and linguistically predictable.
Speech-to-speech translation can now convey laughter and tears with human-like fidelity, thanks to a surprisingly data-efficient approach leveraging LoRA experts.
LALMs reveal their hidden biases when you let them generate freely from real human voices, and gender stereotypes are more pronounced than accent biases.
Real-world speech disfluencies trip up even the most advanced full-duplex voice agents, exposing critical gaps in self-correction and multi-step reasoning abilities.
High-frequency details, often discarded, are actually crucial for spotting singing voice deepfakes, enabling significantly better detection.
Overcome LALMs' struggles with localized dialectal prosody: a new Taiwanese audio-text dataset and fine-tuning strategy boost accuracy by 6.5% on the TAU Benchmark.
Audio watermarks can now survive neural resynthesis, thanks to a latent space embedding technique that resists semantic compression by modern audio codecs.