Guest talk: On the Complexity of Value Iteration
Tuesday, 26 February 2019, 11:00 a.m.
Location: RWTH Aachen University, Informatikzentrum, Ahornstr. 55, extension building E3, room 9U10
Speaker: Mahsa Shirmohammadi, CNRS & IRIF, France
Abstract: Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes the maximal n-step payoff by iterating a recurrence equation, naturally associated with the MDP, n times. At the same time, value iteration provides a policy for the MDP that is optimal on a given finite horizon n. In this paper, we settle the computational complexity of value iteration. We show that, given a horizon n in binary and an MDP, computing an optimal policy is EXP-complete, thus resolving an open problem that goes back to the seminal 1987 paper on the complexity of MDPs by Papadimitriou and Tsitsiklis. As a stepping stone, we show that it is EXP-complete to compute the n-fold iteration (with n in binary) of a function given by a straight-line program over the integers with max and + as operators. (joint work with Nikhil Balaji, Stefan Kiefer, Petr Novotný and Guillermo A. Pérez)
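For readers unfamiliar with the recurrence mentioned in the abstract, the following is a minimal sketch of finite-horizon value iteration, not the construction from the paper. The MDP encoding (a dictionary mapping each state and action to a list of `(probability, next_state, reward)` triples), the function name `value_iteration`, and the two-state example are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding, chosen for this sketch only:
# mdp[state][action] = list of (probability, next_state, reward)
MDP = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def value_iteration(mdp: MDP, horizon: int):
    """Return the maximal expected `horizon`-step payoff per state and a
    policy, where policy[k][s] is the action to take in state s when
    k + 1 steps remain."""
    values = {s: 0.0 for s in mdp}  # payoff with 0 steps remaining
    policy = []
    for _ in range(horizon):
        new_values, step_policy = {}, {}
        for s, actions in mdp.items():
            best_action, best_value = None, float("-inf")
            for a, outcomes in actions.items():
                # Expected reward now plus the best payoff for the remaining steps
                q = sum(p * (r + values[t]) for p, t, r in outcomes)
                if q > best_value:
                    best_action, best_value = a, q
            new_values[s] = best_value
            step_policy[s] = best_action
        values = new_values
        policy.append(step_policy)
    return values, policy


# Illustrative two-state MDP (made up for this example).
example_mdp: MDP = {
    "s0": {"stay": [(1.0, "s0", 1.0)], "go": [(1.0, "s1", 0.0)]},
    "s1": {"loop": [(1.0, "s1", 3.0)]},
}

values, policy = value_iteration(example_mdp, horizon=3)
print(values)           # payoff for each state over 3 steps
print(policy[0]["s0"])  # action in s0 with 1 step remaining
print(policy[2]["s0"])  # action in s0 with 3 steps remaining
```

In this example the optimal action in `s0` depends on the number of remaining steps, which is why the policy is indexed by the horizon. Note also that the outer loop runs n times, whereas a horizon written in binary takes only about log n bits, so this direct procedure takes time exponential in the size of the input.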