Markov decision processes
Springer Online Journal Archives 1860-2000
Abstract In this paper we use an approach which uses a superharmonic property of a sequence of functions generated by an algorithm to show that these functions converge in a non-increasing manner to the optimal value function for our problem, and bounds are given for the loss of optimality if the computational process is terminated at any iteration. The basic procedure is to add an additional linear term at each iteration, selected by solving a particular optimisation problem, for which primal and dual linear programming formulations are given.
Type of Medium: