MIT CSAIL | Mar 18, 2026 | arXiv:2603.17319

Physics-informed offline reinforcement learning eliminates catastrophic fuel waste in maritime routing

Aniruddha Bora, J. Chalfant, C. Chryssostomidis

AI Summary

The paper introduces PIER, an offline reinforcement learning framework for maritime routing that learns fuel-efficient, safety-aware policies from historical vessel tracking data and ocean reanalysis products. Relative to great-circle routing, PIER cuts the share of voyages with catastrophic fuel waste (fuel consumption above 1.5x the median) 9-fold, from 4.8% to 0.5%, and lowers per-voyage fuel variance by a factor of 3.5. Unlike forecast-dependent methods such as A*, whose wave protection degrades under realistic forecast uncertainty, PIER relies only on local observations and maintains constant performance.

Key Contribution

Heuristic great-circle routing incurs extreme fuel waste in 4.8% of voyages; PIER's offline RL policy cuts that rate to 0.5%, a 9-fold reduction.

Abstract

International shipping produces approximately 3% of global greenhouse gas emissions, yet voyage routing remains dominated by heuristic methods. We present PIER (Physics-Informed, Energy-efficient, Risk-aware routing), an offline reinforcement learning framework that learns fuel-efficient, safety-aware routing policies from physics-calibrated environments grounded in historical vessel tracking data and ocean reanalysis products, requiring no online simulator. Validated on one full year (2023) of AIS data across seven Gulf of Mexico routes (840 episodes per method), PIER reduces mean CO2 emissions by 10% relative to great-circle routing. However, PIER's primary contribution is eliminating catastrophic fuel waste: great-circle routing incurs extreme fuel consumption (>1.5x median) in 4.8% of voyages; PIER reduces this to 0.5%, a 9-fold reduction. Per-voyage fuel variance is 3.5x lower (p<0.001), with bootstrap 95% CI for mean savings [2.9%, 15.7%]. Partial validation against observed AIS vessel behavior confirms consistency with the fastest real transits while exhibiting 23.1x lower variance. Crucially, PIER is forecast-independent: unlike A* path optimization, whose wave protection degrades 4.5x under realistic forecast uncertainty, PIER maintains constant performance using only local observations. The framework combines physics-informed state construction, demonstration-augmented offline data, and a decoupled post-hoc safety shield, an architecture that transfers to wildfire evacuation, aircraft trajectory optimization, and autonomous navigation in unmapped terrain.
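The tail-risk statistics quoted in the abstract (share of voyages above 1.5x the median, per-voyage variance ratio, bootstrap 95% CI for mean savings) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the fuel values are synthetic log-normal draws rather than the paper's data, the reference median is taken from the baseline method, and the percentile bootstrap over paired voyages is one common choice that the abstract does not confirm.

```python
import random
import statistics


def catastrophic_rate(fuel, reference_median, factor=1.5):
    """Fraction of voyages whose fuel use exceeds factor * reference_median."""
    threshold = factor * reference_median
    return sum(f > threshold for f in fuel) / len(fuel)


def bootstrap_mean_savings_ci(baseline, candidate, n_resamples=2000,
                              alpha=0.05, seed=0):
    """Percentile bootstrap CI for the relative mean fuel savings.

    Assumes baseline[i] and candidate[i] describe the same voyage (paired),
    so voyages are resampled by index.
    """
    rng = random.Random(seed)
    n = len(baseline)
    savings = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        mean_base = sum(baseline[i] for i in idx) / n
        mean_cand = sum(candidate[i] for i in idx) / n
        savings.append(1.0 - mean_cand / mean_base)
    savings.sort()
    lower = savings[int((alpha / 2) * n_resamples)]
    upper = savings[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic stand-in data (NOT the paper's results): 840 voyages per method,
# with a heavy-tailed baseline and a tighter, slightly cheaper candidate.
rng = random.Random(42)
baseline_fuel = [rng.lognormvariate(0.0, 0.35) for _ in range(840)]
candidate_fuel = [0.9 * rng.lognormvariate(0.0, 0.15) for _ in range(840)]

# Assumption: both methods are compared against the baseline's median.
ref_median = statistics.median(baseline_fuel)
rate_baseline = catastrophic_rate(baseline_fuel, ref_median)
rate_candidate = catastrophic_rate(candidate_fuel, ref_median)
variance_ratio = (statistics.variance(baseline_fuel)
                  / statistics.variance(candidate_fuel))
ci_low, ci_high = bootstrap_mean_savings_ci(baseline_fuel, candidate_fuel)

print(f"catastrophic rate: baseline={rate_baseline:.3f}, "
      f"candidate={rate_candidate:.3f}")
print(f"variance ratio (baseline / candidate): {variance_ratio:.2f}")
print(f"bootstrap 95% CI for mean savings: [{ci_low:.3f}, {ci_high:.3f}]")
```

Reporting the tail rate alongside the mean matters here because a policy can post a modest mean saving while removing most of the extreme voyages, which is the pattern the abstract describes.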

Robotics & Embodied AI | Scientific Discovery & Drug Design | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 40

Year: 2026

Venue: N/A
