Apr 28, 2026 | arXiv:2604.25191

How Can Reinforcement Learning Achieve Expert-level Placement?

Ruo-Tong Chen, Ke Xue, Chengrui Gao, Yunqi Shi, Tian Xu, Penguin Xie, Siyuan Xu, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou

AI Summary

This paper addresses the challenge of achieving expert-level chip placement with reinforcement learning by focusing on reward design. Rather than hand-engineering a reward function, the authors learn a reward model directly from expert layouts, inferring step-by-step expert trajectories from the final layouts. Experiments demonstrate that this approach can learn efficiently from a single expert design and generalize to unseen cases, leading to expert-quality layouts.

Key Contribution

Instead of hand-crafting reward functions, this RL approach learns a reward model directly from expert chip layouts, targeting expert-level placement quality from as little as a single expert design.

Abstract

Chip placement is a critical step in physical design. While reinforcement learning (RL)-based methods have recently emerged, their training primarily focuses on wirelength optimization, and they therefore often fail to achieve expert-quality layouts. We identify reward design as the primary cause of the performance gap with experts. Instead of formalizing intricate processes, we circumvent this by learning a reward model directly from expert layouts. Our approach starts from the final expert layouts to infer step-by-step expert trajectories. Using these trajectories as demonstrations or preferences, we train a model that captures the implicit rewards latent in expert results. Experiments show that our framework can efficiently learn from even a single design and generalize well to unseen cases.
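The abstract does not spell out how the reward model is trained, so the following is only a minimal sketch of one common way to learn a reward from trajectory preferences: a Bradley-Terry objective over a linear reward. Everything here is an assumption made for illustration, not the paper's method: the linear reward, the three-dimensional trajectory features, the synthetic preference pairs, and the helper names `bt_loss_and_grad`, `fit_reward`, and `make_toy_pairs` are all hypothetical.

```python
import numpy as np

def bt_loss_and_grad(w, feats_pref, feats_rej):
    """Bradley-Terry negative log-likelihood and its gradient.

    feats_pref, feats_rej: (n_pairs, dim) arrays of per-trajectory features
    (hypothetical; e.g. summed per-step placement features).
    The trajectory reward is assumed linear: r = feats @ w.
    """
    diff = feats_pref - feats_rej
    margin = diff @ w                                # r(preferred) - r(rejected)
    loss = np.mean(np.logaddexp(0.0, -margin))       # stable log(1 + exp(-margin))
    sig_neg = 0.5 * (1.0 - np.tanh(margin / 2.0))    # sigmoid(-margin), stable form
    grad = -(sig_neg[:, None] * diff).mean(axis=0)
    return loss, grad

def fit_reward(feats_pref, feats_rej, lr=0.5, steps=500):
    """Plain gradient descent on the Bradley-Terry loss."""
    w = np.zeros(feats_pref.shape[1])
    for _ in range(steps):
        _, grad = bt_loss_and_grad(w, feats_pref, feats_rej)
        w -= lr * grad
    return w

def make_toy_pairs(n_pairs=200, seed=0):
    """Synthetic stand-in for (expert-like, non-expert) trajectory pairs.

    A hidden linear scorer decides which trajectory of each pair is preferred;
    real pairs would instead come from trajectories inferred from expert layouts.
    """
    rng = np.random.default_rng(seed)
    hidden_w = np.array([1.0, -1.0, 0.5])
    a = rng.normal(size=(n_pairs, 3))
    b = rng.normal(size=(n_pairs, 3))
    a_better = (a @ hidden_w) > (b @ hidden_w)
    pref = np.where(a_better[:, None], a, b)
    rej = np.where(a_better[:, None], b, a)
    return pref, rej

if __name__ == "__main__":
    pref, rej = make_toy_pairs()
    w = fit_reward(pref, rej)
    agreement = np.mean((pref - rej) @ w > 0)
    print("learned weights:", w)
    print("fraction of pairs ranked correctly:", agreement)
```

In an RL placement loop, a reward learned this way would replace or complement a hand-written wirelength reward when scoring candidate placements. How the paper actually infers trajectories and uses them as demonstrations or preferences is not stated in this summary.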

RLHF & Preference Learning | Robotics & Embodied AI | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 33

Year: 2026

Venue: N/A
