UMD | Mar 8, 2026 | arXiv:2603.07800

Preference-Conditioned Reinforcement Learning for Space-Time Efficient Online 3D Bin Packing

Nikita Sarawgi, Omey M. Manyar, Fan Wang, Thinh H. Nguyen, Satyandra K. Gupta

AI Summary

This paper introduces STEP, a preference-conditioned reinforcement learning approach for robotic 3D bin packing that explicitly balances space utilization and operational time. STEP uses a Transformer-based policy to evaluate candidate actions at each step, weighing expected packing benefit against estimated time cost, so that extra operational time is accepted only when it yields a meaningful spatial improvement. The reported result is a 44% reduction in operational time without compromising packing density.

Key Contribution

Robots can cut operational time by 44% without compromising packing density by learning when extra handling time (selecting an alternative item or reorienting it) is worth the spatial gain.

Abstract

Robotic bin packing is widely deployed in warehouse automation, with current systems achieving robust performance through heuristic and learning-based strategies. These systems must balance compact placement with rapid execution, where selecting alternative items or reorienting them can improve space utilization but introduce additional time. We propose a selection-based formulation that explicitly reasons over this trade-off: at each step, the robot evaluates multiple candidate actions, weighing expected packing benefit against estimated operational time. This enables time-aware strategies that selectively accept increased operational time when it yields meaningful spatial improvements. Our method, STEP (Space-Time Efficient Packing), uses a preference-conditioned, Transformer-based reinforcement learning policy, and allows generalization across candidate set sizes and integration with standard placement modules. It achieves a 44% reduction in operational time without compromising packing density. Additional material is available at https://step-packing.github.io.
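The selection step described in the abstract can be illustrated with a small sketch. This is not the STEP policy: the paper uses a learned, preference-conditioned Transformer, whereas the code below substitutes a hand-written linear score purely to show how a preference weight can shift the choice between candidates. The names `Candidate`, `select_candidate`, `time_weight`, and `time_scale_s`, and all numeric values, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate action (hypothetical fields, not from the paper)."""
    name: str
    packing_benefit: float  # assumed score in [0, 1]; higher is better
    time_cost_s: float      # assumed estimate of operational time, in seconds


def select_candidate(candidates, time_weight, time_scale_s=10.0):
    """Pick the candidate with the best preference-weighted score.

    Assumed scoring rule (a stand-in for a learned policy):
        score = (1 - w) * packing_benefit - w * (time_cost_s / time_scale_s)
    where w = time_weight in [0, 1] expresses how much time matters.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    if not 0.0 <= time_weight <= 1.0:
        raise ValueError("time_weight must lie in [0, 1]")

    def score(c):
        normalized_time = c.time_cost_s / time_scale_s
        return (1.0 - time_weight) * c.packing_benefit - time_weight * normalized_time

    return max(candidates, key=score)


# Illustrative candidate set; the values are invented for this example.
candidates = [
    Candidate("place_next_item", packing_benefit=0.60, time_cost_s=4.0),
    Candidate("reorient_next_item", packing_benefit=0.75, time_cost_s=7.0),
    Candidate("pick_alternative_item", packing_benefit=0.90, time_cost_s=10.0),
]

space_first = select_candidate(candidates, time_weight=0.1)
time_first = select_candidate(candidates, time_weight=0.8)
print(space_first.name, time_first.name)
```

With a low `time_weight` the slower but denser option wins, and with a high `time_weight` the fastest option wins. Because the function scores each candidate independently, it accepts candidate sets of any size, which loosely mirrors the generalization across candidate set sizes mentioned in the abstract.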

Robotics & Embodied AI | Training Efficiency & Optimization | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
