Ditching the strict unit-sum constraint in softmax attention with a simple affine scaling trick unlocks more stable training and better downstream performance for Transformers.
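To make the idea concrete, here is a minimal sketch of what an affine rescaling of softmax attention weights could look like, assuming a learnable per-head scale and shift; the parameter names (`gamma`, `beta`) and the exact placement of the rescaling are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineScaledAttention(nn.Module):
    """Softmax attention whose weights are affinely rescaled per head,
    so each row is no longer constrained to sum exactly to one.
    Illustrative sketch only; details differ from the paper."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # Assumed per-head affine parameters: scale (gamma) and shift (beta)
        self.gamma = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.beta = nn.Parameter(torch.zeros(num_heads, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split into heads: (B, heads, T, head_dim)
        q = q.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        attn = F.softmax(scores, dim=-1)
        # Affine rescaling: relaxes the strict unit-sum constraint per row
        attn = self.gamma * attn + self.beta
        out = (attn @ v).transpose(1, 2).reshape(B, T, C)
        return self.out(out)
```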
Squeezing out 4.5x lower latency and 3.9x higher throughput in multi-LLM systems, PrefillShare shares the KV cache across models, slashing redundant prefill computation without sacrificing accuracy.
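A toy sketch of the cache-sharing idea is below, assuming prefill KV state can be reused when another model has already processed the same prompt prefix; the `SharedKVCache` class, its keying scheme, and the `compute_kv` helper are hypothetical illustrations, not PrefillShare's actual design or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple
import hashlib

@dataclass
class SharedKVCache:
    """Toy shared store for prefill KV state across models.
    Keyed by (model family, prompt-prefix hash); names are assumptions."""
    store: Dict[Tuple[str, str], Any] = field(default_factory=dict)

    def _key(self, family: str, prefix: str) -> Tuple[str, str]:
        return family, hashlib.sha256(prefix.encode()).hexdigest()

    def get(self, family: str, prefix: str):
        # Reuse KV state another model in the same family already computed
        return self.store.get(self._key(family, prefix))

    def put(self, family: str, prefix: str, kv: Any) -> None:
        self.store[self._key(family, prefix)] = kv

def prefill_with_sharing(model, prompt: str, cache: SharedKVCache):
    """Run prefill, skipping redundant work when a shared KV entry exists.
    `model.family` and `model.compute_kv` are hypothetical hooks."""
    kv = cache.get(model.family, prompt)
    if kv is None:
        kv = model.compute_kv(prompt)  # expensive prefill pass
        cache.put(model.family, prompt, kv)
    return kv
```

The intuition matches the blurb: when several LLMs repeatedly prefill the same prompt prefix, serving one shared cache entry instead of recomputing it per model is where the latency and throughput gains would come from.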