This paper introduces Warp-STAR, a GPU-accelerated static timing analysis (STA) engine that mitigates intra-warp load imbalance by orchestrating parallel computations at the warp level. By eliminating load imbalance, Warp-STAR achieves a 2.4x speedup over existing GPU-based STA engines and a 1.7x speedup when integrated into a timing-driven global placement framework. The method also supports differentiable gradient analysis with minimal overhead, enabling gradient-based optimization within EDA flows.
Intra-warp load imbalance, a major bottleneck in GPU-accelerated Electronic Design Automation, can be eliminated through warp-level parallel orchestration, leading to significant speedups in static timing analysis.
Static timing analysis (STA) is crucial for Electronic Design Automation (EDA) flows but remains a computational bottleneck. While existing GPU-based STA engines are faster than CPU-based ones, they suffer from inefficiencies, particularly intra-warp load imbalance caused by irregular circuit graphs. This paper introduces Warp-STAR, a novel GPU-accelerated STA engine that eliminates this imbalance by orchestrating parallel computations at the warp level. This approach achieves a 2.4x speedup over the previous state-of-the-art (SoTA) GPU-based STA engines. When integrated into a timing-driven global placement framework, Warp-STAR delivers a 1.7x speedup over SoTA frameworks. The method also proves effective for differentiable gradient analysis with minimal overhead.
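To make the notion of intra-warp load imbalance concrete, the following is a toy cost model, not Warp-STAR's algorithm. It assumes a warp of 32 lanes executing in lockstep, and it counts iterations only, ignoring reduction overhead and memory effects. The function names and the fan-in list are hypothetical and chosen for illustration. The sketch compares a one-node-per-lane mapping, where every lane waits for the node with the largest fan-in, against a mapping in which the edges of a group of nodes are spread evenly over the lanes.

```python
import math

WARP_SIZE = 32  # lanes per warp on NVIDIA GPUs


def thread_per_node_steps(fanins, warp_size=WARP_SIZE):
    """Lockstep iterations when each lane processes one node's whole fan-in.

    Lanes in a warp advance together, so each group of `warp_size` nodes
    costs as many iterations as its largest fan-in.
    """
    steps = 0
    for i in range(0, len(fanins), warp_size):
        steps += max(fanins[i:i + warp_size])
    return steps


def warp_cooperative_steps(fanins, warp_size=WARP_SIZE):
    """Iterations when the edges of each node group are split evenly over lanes.

    Assumption: work can be divided at edge granularity, and the cost of
    combining partial results across lanes is ignored.
    """
    steps = 0
    for i in range(0, len(fanins), warp_size):
        total_edges = sum(fanins[i:i + warp_size])
        steps += math.ceil(total_edges / warp_size)
    return steps


# Hypothetical irregular fan-in: 31 small gates and one high fan-in node.
fanins = [2] * 31 + [64]
print(thread_per_node_steps(fanins))   # 64
print(warp_cooperative_steps(fanins))  # 4
```

In this model a single high fan-in node makes the one-node-per-lane mapping cost 64 iterations, while the evenly distributed mapping needs 4. With uniform fan-in the two mappings cost the same, which is why the imbalance is tied to irregular circuit graphs.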