Markov decision processes: on the convergence of the Monte-Carlo first visit algorithm

Sylvain Delattre, Nicolas Fournier

Published: 2025/1/15

Abstract

We consider the Monte-Carlo first visit algorithm, whose goal is to find the optimal control in a Markov decision process with a finite state space and finitely many possible actions. We show that it converges when the discount factor is smaller than $1/2$.
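
For readers unfamiliar with the method, the following is a minimal sketch of a first-visit Monte-Carlo control loop, not necessarily the exact variant analysed in the paper. It assumes exploring starts (each episode begins at a uniformly random state-action pair), truncates episodes at a finite horizon, and updates the greedy policy after every episode. The function name `mc_first_visit`, its parameters, and the two-state toy problem are all hypothetical choices made for illustration.

```python
import random

def mc_first_visit(P, R, gamma, episodes=2000, horizon=30, seed=0):
    """First-visit Monte-Carlo control with exploring starts (illustrative sketch).

    P[s][a] is a list of transition probabilities over next states,
    R[s][a] is the reward for playing action a in state s.
    Episodes are truncated after `horizon` steps (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]  # action-value estimates
    N = [[0] * n_actions for _ in range(n_states)]    # first-visit counts
    policy = [0] * n_states                           # current greedy policy

    for _ in range(episodes):
        # Exploring start: uniformly random initial state and action.
        s = rng.randrange(n_states)
        a = rng.randrange(n_actions)
        trajectory = []
        for _step in range(horizon):
            trajectory.append((s, a, R[s][a]))
            s = rng.choices(range(n_states), weights=P[s][a])[0]
            a = policy[s]  # after the first step, follow the current policy

        # Walk the episode backwards, accumulating discounted returns.
        # Overwriting the dict entry means the value that survives for each
        # pair is the return from its earliest (first) visit.
        G = 0.0
        first_return = {}
        for s, a, r in reversed(trajectory):
            G = r + gamma * G
            first_return[(s, a)] = G

        # Average the first-visit returns incrementally.
        for (s, a), g in first_return.items():
            N[s][a] += 1
            Q[s][a] += (g - Q[s][a]) / N[s][a]

        # Policy improvement: act greedily with respect to Q.
        for s in range(n_states):
            policy[s] = max(range(n_actions), key=lambda b: Q[s][b])

    return policy, Q

# Hypothetical toy problem: action a moves deterministically to state a,
# action 1 pays reward 1 and action 0 pays reward 0, so always playing
# action 1 is optimal.
P = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
R = [[0, 1], [0, 1]]
policy, Q = mc_first_visit(P, R, gamma=0.4)  # discount factor below 1/2
print(policy)
```

In this toy problem every return observed for action 1 is at least 1, while every return for action 0 is at most $\gamma/(1-\gamma) = 2/3$, so the greedy policy settles on action 1 in both states once each pair has been visited.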

Read Full Paper (arXiv.org)