Mar 10, 2026 · arXiv:2603.09427

Impact of Markov Decision Process Design on Sim-to-Real Reinforcement Learning

Tatjana Krau, Jorge Mandlmaier, Tobias Damm, Frieder Heieck

AI Summary

This paper investigates how Markov Decision Process (MDP) design choices affect sim-to-real transfer in reinforcement learning for industrial process control. The authors systematically evaluate MDP configurations, varying state composition, target inclusion, reward formulation, termination criteria, and dynamics models, on a color mixing task. The key finding is that physics-based dynamics models reach up to 50% real-world success under strict precision constraints, where simplified models fail entirely.

Key Contribution

Physics-based dynamics models can make or break sim-to-real reinforcement learning: in a color mixing task with strict precision constraints, they reach up to 50% real-world success where simplified models fail entirely.

Abstract

Reinforcement Learning (RL) has demonstrated strong potential for industrial process control, yet policies trained in simulation often suffer from a significant sim-to-real gap when deployed on physical hardware. This work systematically analyzes how core Markov Decision Process (MDP) design choices -- state composition, target inclusion, reward formulation, termination criteria, and environment dynamics models -- affect this transfer. Using a color mixing task, we evaluate different MDP configurations and mixing dynamics across simulation and real-world experiments. We validate our findings on physical hardware, demonstrating that physics-based dynamics models achieve up to 50% real-world success under strict precision constraints where simplified models fail entirely. Our results provide practical MDP design guidelines for deploying RL in industrial process control.
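The design axes named in the abstract can be made concrete with a small sketch. The Python below is a hypothetical toy, not the authors' environment: the names `MDPConfig` and `ToyColorMixEnv`, the three-pigment setup, the unit-dose actions, and the linear volume-fraction mixing rule are all assumptions chosen for illustration. It shows where state composition, target inclusion, reward formulation, termination criteria, and the dynamics model would each enter an MDP definition.

```python
from dataclasses import dataclass


@dataclass
class MDPConfig:
    # All fields are illustrative assumptions, not values from the paper.
    include_target_in_state: bool = True  # target inclusion
    reward: str = "dense"                 # "dense": negative distance; "sparse": 1.0 on success
    tolerance: float = 0.05               # precision constraint used for success
    max_steps: int = 20                   # step limit used for termination


class ToyColorMixEnv:
    """Hypothetical toy: each action adds one unit dose of one of three pigments."""

    def __init__(self, cfg, target):
        self.cfg = cfg
        self.target = tuple(target)
        self.volumes = [1.0, 1.0, 1.0]
        self.t = 0

    def reset(self):
        self.volumes = [1.0, 1.0, 1.0]
        self.t = 0
        return self._obs()

    def _color(self):
        # Simplified dynamics (assumption): color is the volume fraction of each pigment.
        # A physics-based mixing model would replace this method.
        total = sum(self.volumes)
        return tuple(v / total for v in self.volumes)

    def _obs(self):
        # State composition: current color, optionally concatenated with the target.
        color = self._color()
        return color + self.target if self.cfg.include_target_in_state else color

    def _distance(self):
        # Largest per-channel deviation from the target color.
        return max(abs(c - g) for c, g in zip(self._color(), self.target))

    def step(self, action):
        self.volumes[action] += 1.0
        self.t += 1
        dist = self._distance()
        success = dist <= self.cfg.tolerance
        # Termination criteria: stop on success or when the step limit is reached.
        done = success or self.t >= self.cfg.max_steps
        reward = -dist if self.cfg.reward == "dense" else float(success)
        return self._obs(), reward, done, {"success": success}
```

In a sketch like this, each configuration in a study of MDP design corresponds to one `MDPConfig` instance plus one choice of `_color`, so the same policy-training code can be rerun across configurations and the resulting policies compared in simulation and on hardware.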

Robotics & Embodied AI · Training Efficiency & Optimization · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
