Bellman operator convergence enhancements in Reinforcement Learning

20 Feb 2025 (modified: 05 Apr 2025) · AIMS 2025 Workshop T2P Submission · CC BY 4.0
Keywords: Reinforcement Learning; Contraction mapping; Fixed Point; Banach Space; Bellman Operators; Markov Decision Process; Q-Learning.
TL;DR: This paper bridges abstract topological foundations and practical reinforcement learning by leveraging Banach fixed-point theory and novel Bellman operator formulations to enhance algorithm convergence and performance.
Abstract: This paper reviews the topological groundwork for the study of reinforcement learning (RL) by focusing on the structure of state, action, and policy spaces. We begin by recalling key mathematical concepts, such as complete metric spaces, which form the foundation for expressing RL problems. We then show how the Banach fixed-point theorem (the contraction mapping principle) explains the convergence of RL algorithms: Bellman operators, viewed as contraction mappings on Banach spaces, admit a unique fixed point to which iterative updates converge. The work serves as a bridge between theoretical mathematics and practical algorithm design, offering new approaches to enhance the efficiency of RL. In particular, we investigate alternative formulations of Bellman operators and demonstrate their impact on convergence rates and performance in standard RL environments such as MountainCar, CartPole, and Acrobot. Our findings highlight how a deeper mathematical understanding of RL can lead to more effective algorithms for decision-making problems.
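
The contraction argument sketched in the abstract can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it builds a small randomly generated MDP (sizes, discount factor, and variable names are illustrative assumptions), applies the standard Bellman optimality operator, and prints successive sup-norm differences, which the Banach fixed-point theorem guarantees shrink by at least the factor gamma per iteration.

```python
# Minimal sketch (illustrative, not from the paper): the Bellman optimality
# operator as a gamma-contraction in the sup-norm on a small random MDP.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 6, 3, 0.9  # assumed toy sizes and discount

# Random transition kernel P[a, s, s'] (rows summing to 1) and rewards R[a, s].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
R = rng.random((n_actions, n_states))

def bellman_optimality(V):
    """Apply (T V)(s) = max_a [ R(s, a) + gamma * sum_{s'} P(s'|s, a) V(s') ]."""
    return np.max(R + gamma * (P @ V), axis=0)

# Value iteration: since ||T V1 - T V2||_inf <= gamma * ||V1 - V2||_inf,
# the iterates converge geometrically to the unique fixed point V*.
V = np.zeros(n_states)
for k in range(30):
    V_next = bellman_optimality(V)
    if k:  # successive sup-norm differences contract by at least gamma
        print(f"iter {k:2d}  ||V_k+1 - V_k||_inf = {np.max(np.abs(V_next - V)):.2e}")
    V = V_next
```

Alternative Bellman operator formulations of the kind the paper investigates would replace `bellman_optimality` while preserving the contraction property, so the same fixed-point argument applies.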
Submission Number: 3