NTU Singapore
Fine-grained management of speculative decoding phases can boost LLM serving throughput by over 50% and cut latency nearly in half.
Serverless functions can achieve a 37% density boost and significantly lower overhead by offloading I/O to a shared backend, without sacrificing ecosystem compatibility.
Video codecs, typically seen as mere compression tools, can unlock 3x faster and 87% more efficient video analytics by guiding vision-language model inference.
In LLM prompt tuning, PromptTuner cuts SLO violations by up to 7.9x and costs by up to 4.5x, outperforming existing resource-management systems.