Martijn Bartelds

Research focus

Speech & Audio (2) · Architecture Design (Transformers, SSMs, MoE) (1) · Multimodal Models (1) · Eval Frameworks & Benchmarks (1) · Natural Language Processing (1)

Frequent co-authors

Potsawee Manakul (1) · Woody Haosheng Gan (1)

Papers (2)

Feb 18, 2026 · also Stanford HAI, Together

Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens

Forget text-first: SODA models show that scaling native audio foundation models with interleaved semantic, acoustic, and text tokens unlocks powerful audio generation and cross-modal capabilities.

Potsawee Manakul, Woody Haosheng Gan +6

Architecture Design (Transformers, SSMs, MoE) · Multimodal Models · Speech & Audio

OpenAI · Feb 12, 2026 · also Google Research, Microsoft Research, Deepgram

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most

Speech recognition models stumble badly on real-world street names, especially for non-English speakers, but a simple synthetic data boost can dramatically improve accuracy.

Martijn Bartelds, Federico Bianchi

Eval Frameworks & Benchmarks · Natural Language Processing · Speech & Audio
