This paper analyzes a soft Bellman residual minimization (BRM) objective with a weighted Lp-norm for solving Markov Decision Processes (MDPs) under linear function approximation. It demonstrates that increasing *p* in the Lp-norm aligns the BRM objective with the contraction geometry of the Bellman operator, thereby reducing error propagation. The analysis provides performance error bounds that explicitly connect residual minimization with Bellman contraction properties.
Aligning your Bellman residual minimization objective with the Bellman operator's contraction geometry provably improves performance in MDPs.
The problem of solving Markov decision processes remains a fundamental challenge, even in the linear function approximation setting. A key difficulty arises from a geometric mismatch: while the Bellman optimality operator is contractive in the L∞-norm, commonly used objectives such as projected value iteration and Bellman residual minimization rely on L2-based formulations. To enable gradient-based optimization, we consider a soft formulation of Bellman residual minimization and extend it to a generalized weighted Lp-norm. We show that this formulation aligns the optimization objective with the contraction geometry of the Bellman operator as *p* increases, and derive corresponding performance error bounds. Our analysis provides a principled connection between residual minimization and Bellman contraction, leading to improved control of error propagation while remaining compatible with gradient-based optimization.
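To make the abstract's ingredients concrete, the following is a minimal sketch of soft Bellman residual minimization with a weighted Lp objective under linear features. It is not the paper's implementation: the toy MDP, the log-sum-exp soft operator with temperature `tau`, the uniform state weighting `mu`, the step size, and all variable names are illustrative assumptions. The only point is that the soft operator is differentiable, so the Lp residual objective admits an exact gradient for gradient-based optimization.

```python
import numpy as np

# Hypothetical toy MDP (random transitions/rewards) -- illustrative only.
rng = np.random.default_rng(0)
nS, nA, d = 6, 3, 4          # states, actions, feature dimension
gamma, tau, p = 0.9, 0.5, 4  # discount, softmax temperature, Lp exponent

P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # P[s, a] is a distribution over s'
R = rng.uniform(size=(nS, nA))                 # rewards r(s, a)
Phi = rng.normal(size=(nS, d))                 # linear features, V = Phi @ w
mu = np.full(nS, 1.0 / nS)                     # state weighting in the Lp objective

def soft_bellman(V):
    # Soft (log-sum-exp) Bellman optimality operator with temperature tau.
    Q = R + gamma * P @ V                      # Q[s, a]
    return tau * np.log(np.exp(Q / tau).sum(axis=1))

def loss(w):
    # Weighted Lp-norm (raised to the p-th power) of the soft Bellman residual.
    V = Phi @ w
    delta = soft_bellman(V) - V
    return float(mu @ np.abs(delta) ** p)

def grad(w):
    # Exact gradient of the soft BRM objective; the softmax weights pi
    # come from differentiating the log-sum-exp operator.
    V = Phi @ w
    Q = R + gamma * P @ V
    pi = np.exp((Q - Q.max(axis=1, keepdims=True)) / tau)
    pi /= pi.sum(axis=1, keepdims=True)
    delta = soft_bellman(V) - V
    # d(TV)(s)/dw = gamma * sum_a pi[s,a] * sum_s' P[s,a,s'] * phi(s')
    dT = gamma * np.einsum("sa,saz,zd->sd", pi, P, Phi)
    ddelta = dT - Phi                          # d delta(s) / dw
    return (mu * p * np.abs(delta) ** (p - 1) * np.sign(delta)) @ ddelta

w = np.zeros(d)
for _ in range(5000):                          # plain gradient descent
    w -= 0.02 * grad(w)
print(f"final weighted Lp^p residual: {loss(w):.6f}")
```

Raising `p` toward infinity pushes the objective toward the max-norm residual, which is the norm in which the (soft) Bellman operator contracts; the sketch uses a finite `p` so the objective stays smooth and gradient-friendly.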