Search papers, labs, and topics across Lattice.
Forget paired video-music training data: V2M-Zero aligns video and music by matching the *timing* of changes within each modality, not the content itself.
Compressing 60-second audio into just 788 tokens, this new autoencoder makes generative audio modeling far more tractable by slashing encoding time and latent rates.
A new model, TAC, trained on synthetic data, achieves state-of-the-art audio and audio-visual reasoning by generating temporally grounded captions that LLMs can then reason over.
Generate entire multi-instrumental tracks in a single pass with Stemphonic, a new diffusion/flow model that is 25-50% faster than existing stem generation methods while producing higher-quality output.
Forget RLHF and DPO – DRAGON lets you fine-tune generative models with rewards that compare entire *distributions* of outputs, unlocking better control and quality without human preference data.