Jihun Yun

By dynamically balancing fast adaptation and stable averaging, AMUSE delivers faster convergence and better final performance than AdamW and Muon, all without any learning rate tuning.

Jueun Kim, Baekrok Shin, Jihun Yun +3

Architecture Design (Transformers, SSMs, MoE) | Training Efficiency & Optimization

Jun 2, 2025

KRAFTON | Jun 2, 2025

Alignment as Distribution Learning: Your Preference Model is Explicitly a Language Model

Forget RLHF's quirks: aligning LLMs is fundamentally a distribution learning problem, and preference distillation offers a theoretically sound and empirically strong alternative.

Jihun Yun, Juno Kim, Jongho Park +4

Natural Language Processing | RLHF & Preference Learning | Scalable Oversight & Alignment Theory

Papers (3)