This paper introduces STEP, a preference-conditioned reinforcement learning approach for robotic 3D bin packing that explicitly balances space utilization and operational time. At each step, STEP's Transformer-based policy evaluates multiple candidate actions, weighing expected packing benefit against estimated operational time, so the robot accepts extra time only when it yields a meaningful spatial improvement. Experiments show a 44% reduction in operational time without compromising packing density.
Robots can cut bin-packing operational time by 44% without losing packing density by learning when extra handling time is worth the space it gains.
Robotic bin packing is widely deployed in warehouse automation, with current systems achieving robust performance through heuristic and learning-based strategies. These systems must balance compact placement with rapid execution: selecting alternative items or reorienting them can improve space utilization but adds operational time. We propose a selection-based formulation that explicitly reasons over this trade-off: at each step, the robot evaluates multiple candidate actions, weighing expected packing benefit against estimated operational time. This enables time-aware strategies that selectively accept increased operational time when it yields meaningful spatial improvements. Our method, STEP (Space-Time Efficient Packing), uses a preference-conditioned, Transformer-based reinforcement learning policy that generalizes across candidate set sizes and integrates with standard placement modules. It achieves a 44% reduction in operational time without compromising packing density. Additional material is available at https://step-packing.github.io.
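
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the STEP policy: the learned Transformer is replaced here by a hand-written linear score, and the `Candidate` fields, the `preference` weight, the `time_scale` normalizer, and all numeric values are hypothetical, chosen only to show how a preference value could shift the choice between faster and denser actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate action: which item to pick and whether to reorient it."""
    name: str
    packing_benefit: float  # hypothetical estimate of space-utilization gain, in [0, 1]
    time_cost: float        # hypothetical estimate of operational time, in seconds


def select_action(candidates, preference, time_scale=10.0):
    """Return the candidate with the highest preference-weighted score.

    preference is in [0, 1]: 1.0 values only packing benefit, 0.0 only time.
    time_scale is an assumed constant that brings seconds onto a scale
    comparable with packing_benefit. A learned policy would replace score().
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0.0 <= preference <= 1.0:
        raise ValueError("preference must be in [0, 1]")

    def score(c):
        benefit_term = preference * c.packing_benefit
        time_term = (1.0 - preference) * c.time_cost / time_scale
        return benefit_term - time_term

    return max(candidates, key=score)


# Illustrative candidate set; the function accepts a list of any length.
candidates = [
    Candidate("next item, as presented", packing_benefit=0.50, time_cost=4.0),
    Candidate("next item, reoriented", packing_benefit=0.60, time_cost=7.0),
    Candidate("alternative item", packing_benefit=0.80, time_cost=9.0),
]

fast = select_action(candidates, preference=0.2)   # time-leaning preference
dense = select_action(candidates, preference=0.9)  # density-leaning preference
print(fast.name)
print(dense.name)
```

With these made-up numbers, the time-leaning preference picks the item as presented, while the density-leaning preference accepts the slower alternative item because its spatial gain outweighs the added time.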