Markov decision processes: on the convergence of the Monte-Carlo first visit algorithm
Sylvain Delattre, Nicolas Fournier
Published: 2025/1/15
Abstract
We consider the Monte-Carlo first visit algorithm, whose goal is to find the optimal control in a Markov decision process with a finite state space and finitely many possible actions. We show that it converges when the discount factor is smaller than $1/2$.
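
For orientation, here is a minimal sketch of a Monte-Carlo first visit control loop, in the common textbook form with exploring starts. It is an illustration under stated assumptions, not necessarily the exact variant analyzed by the authors: the function name `mc_first_visit`, the table layout of `P` and `R`, the truncation of episodes at `horizon` steps, and the fixed `seed` are all hypothetical choices made for this example.

```python
import random


def mc_first_visit(P, R, gamma, episodes=2000, horizon=40, seed=0):
    """Monte-Carlo first visit control with exploring starts (sketch).

    Assumptions (not taken from the paper):
      P[s][a] is a list of transition probabilities over next states,
      R[s][a] is the deterministic reward for playing action a in state s,
      episodes are truncated after `horizon` steps, which only approximates
      the infinite-horizon discounted return.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]  # action-value estimates
    N = [[0] * n_actions for _ in range(n_states)]    # first-visit counts
    policy = [0] * n_states                           # current greedy policy

    for _ in range(episodes):
        # Exploring start: uniformly random initial state-action pair.
        s, a = rng.randrange(n_states), rng.randrange(n_actions)
        trajectory = []
        for _ in range(horizon):
            trajectory.append((s, a, R[s][a]))
            s = rng.choices(range(n_states), weights=P[s][a])[0]
            a = policy[s]  # afterwards, follow the current greedy policy

        # Discounted return from every time step, computed backwards.
        returns = [0.0] * horizon
        G = 0.0
        for t in reversed(range(horizon)):
            G = trajectory[t][2] + gamma * G
            returns[t] = G

        # First visit update: each (state, action) pair is updated once per
        # episode, using the return that follows its first occurrence.
        seen = set()
        for t, (s, a, _) in enumerate(trajectory):
            if (s, a) in seen:
                continue
            seen.add((s, a))
            N[s][a] += 1
            Q[s][a] += (returns[t] - Q[s][a]) / N[s][a]  # running average
            policy[s] = max(range(n_actions), key=lambda b: Q[s][b])

    return Q, policy


# Toy example (hypothetical): two states, two actions, action a leads to
# state a with probability 1, and only action 1 is rewarded.
toy_P = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
toy_R = [[0, 1], [0, 1]]
toy_Q, toy_policy = mc_first_visit(toy_P, toy_R, gamma=0.4)
```

In the toy problem the discount factor $0.4$ lies below the threshold $1/2$ of the abstract, and the optimal control is to play action 1 in both states, with optimal value $1/(1-0.4)$.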