Search papers, labs, and topics across Lattice.
This paper analyzes the approximation error of linear value function approximation in reinforcement learning when using state representations based on learned spectral features of the state-graph Laplacian. It provides an upper bound on this error, demonstrating its dependence on the algebraic connectivity of the state-graph, thus linking approximation quality to the MDP's topological structure. The paper also bounds the error from eigenvector estimation, offering an end-to-end error decomposition and validating the theoretical findings with gridworld simulations.
The approximation error of spectral RL representations is fundamentally limited by the algebraic connectivity of the state-graph, revealing a crucial topological bottleneck.
Learning compact state representations in Markov Decision Processes (MDPs) has proven crucial for addressing the curse of dimensionality in large-scale reinforcement learning (RL) problems. Existing principled approaches leverage structural priors on the MDP by constructing state representations as linear combinations of the state-graph Laplacian eigenvectors. When the transition graph is unknown or the state space is prohibitively large, the graph spectral features can be estimated directly via sample trajectories. In this work, we prove an upper bound on the approximation error of linear value function approximation under the learned spectral features. We show how this error scales with the algebraic connectivity of the state-graph, grounding the approximation quality in the topological structure of the MDP. We further bound the error introduced by the eigenvector estimation itself, leading to an end-to-end error decomposition across the representation learning pipeline. Additionally, our expression of the Laplacian operator for the RL setting, although equivalent to existing ones, prevents some common misunderstandings, of which we show some examples from the literature. Our results hold for general (non-uniform) policies without any assumptions on the symmetry of the induced transition kernel. We validate our theoretical findings with numerical simulations on gridworld environments.