Abstract: Policy Improvement Algorithm for Singularly Perturbed Discounted Markov Decision Processes


In this paper, we consider a perturbed Markov decision process with the discounted reward criterion. The transition probabilities and the discount factor are slightly perturbed. We assume that the underlying process is completely decomposable into a finite number of separate irreducible processes. We introduce the limit Markov control problem, which is the optimization problem that should be solved in the case of singular perturbations. To solve the limit Markov control problem, we propose an aggregation-disaggregation policy improvement algorithm that converges in a finite number of iterations to an optimal deterministic strategy.
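
For orientation, the classical policy improvement scheme for an unperturbed discounted Markov decision process can be sketched as below. This is standard Howard policy iteration, shown only as background; it is not the aggregation-disaggregation algorithm proposed in the paper and does not handle singular perturbations. The names `solve`, `policy_iteration`, `P`, `r`, and `beta` are illustrative assumptions, not notation from the paper.

```python
from typing import List, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(n):
            if i != col:
                f = m[i][col] / m[col][col]
                for c in range(col, n + 1):
                    m[i][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def policy_iteration(
    P: List[List[List[float]]],  # P[s][a][t]: probability of moving s -> t under action a
    r: List[List[float]],        # r[s][a]: immediate reward (hypothetical layout)
    beta: float,                 # discount factor, assumed to satisfy 0 <= beta < 1
) -> Tuple[List[int], List[float]]:
    n = len(P)
    policy = [0] * n  # start from an arbitrary deterministic strategy
    while True:
        # Policy evaluation: solve (I - beta * P_pi) v = r_pi exactly.
        A = [
            [(1.0 if s == t else 0.0) - beta * P[s][policy[s]][t] for t in range(n)]
            for s in range(n)
        ]
        v = solve(A, [r[s][policy[s]] for s in range(n)])

        # Policy improvement: switch only where an action is strictly better.
        changed = False
        for s in range(n):
            def q(a: int) -> float:
                return r[s][a] + beta * sum(P[s][a][t] * v[t] for t in range(n))

            best = max(range(len(P[s])), key=q)
            if q(best) > q(policy[s]) + 1e-12:
                policy[s] = best
                changed = True
        if not changed:
            return policy, v


# Toy example (made up for illustration): state 1 pays reward 1 forever;
# in state 0, action 0 stays put and action 1 moves to state 1.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0]]]
r = [[0.0, 0.0], [1.0]]
policy, v = policy_iteration(P, r, 0.9)
```

Because the number of deterministic strategies is finite and each improvement step is strict, this classical scheme stops after finitely many iterations, which is the same kind of finite-convergence guarantee the abstract claims for the perturbed setting.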