Bi-Weekly Talk: Alexander Bork: Underapproximations for Indefinite-Horizon POMDPs

Wednesday, June 30, 2021, 10:30am

Location: Online session

Speaker: Alexander Bork



Partially observable Markov decision processes (POMDPs) are a common extension of the classic MDP framework for systems where only imperfect information is available. A policy determining an action plan in a POMDP does not have access to the full system state; its decisions are instead based on a history of observations. The indefinite-horizon reachability problem asks if the expected cumulative reward collected along all possible execution paths exceeds a given threshold. As this problem is generally undecidable for POMDPs, methods that allow the computation of approximate solutions are necessary.

We consider Belief Cut-Offs, a method for the underapproximation of the indefinite-horizon reachability problem in POMDPs. This method is based on the principle of stopping the exploration of the underlying belief structure of the POMDP at some points and approximating missing values at the points where we cut off. We furthermore provide an extension called Belief Clipping, where the approximation stems from different, more explored parts of the belief structure.
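
To make the cut-off idea concrete, here is a minimal Python sketch. It is not the speaker's implementation. The three-state POMDP, the names `T`, `R`, `OBS`, `LOWER`, `successors` and `cutoff_value`, and the choice to cut off at a fixed exploration depth are all assumptions made for illustration; the abstract only says that exploration stops "at some points". The sketch assumes non-negative rewards and a maximizing policy, so the per-state value 0 is a trivially valid lower bound at the cut-off. Belief Clipping is not shown.

```python
from collections import defaultdict

# Hypothetical toy POMDP: states 0 and 1 look identical ("x"), state 2 is the goal.
GOAL = 2
ACTIONS = ("a", "b")
OBS = {0: "x", 1: "x", 2: "g"}
T = {  # T[state][action] = {successor: probability}
    0: {"a": {2: 1.0}, "b": {1: 1.0}},
    1: {"a": {0: 0.5, 1: 0.5}, "b": {2: 1.0}},
    2: {"a": {2: 1.0}, "b": {2: 1.0}},
}
R = {  # R[state][action] = immediate non-negative reward
    0: {"a": 2.0, "b": 0.0},
    1: {"a": 0.0, "b": 1.0},
    2: {"a": 0.0, "b": 0.0},
}
# Assumed per-state lower bounds used at cut-off points (0 is valid because
# all rewards are non-negative).
LOWER = {0: 0.0, 1: 0.0, 2: 0.0}


def successors(belief, action):
    """Return (probability, successor belief) pairs, one per observation."""
    grouped = defaultdict(dict)
    for s, p in belief.items():
        for t, q in T[s][action].items():
            mass = grouped[OBS[t]]
            mass[t] = mass.get(t, 0.0) + p * q
    result = []
    for mass in grouped.values():
        prob = sum(mass.values())
        # Normalize the mass of each observation class to get a belief.
        result.append((prob, {t: m / prob for t, m in mass.items()}))
    return result


def cutoff_value(belief, depth):
    """Under-approximate the maximal expected total reward from a belief.

    Exploration of the belief structure stops after `depth` steps; at the
    cut-off, the missing value is replaced by the belief-weighted LOWER bound.
    """
    if all(s == GOAL for s in belief):
        return 0.0
    if depth == 0:
        return sum(p * LOWER[s] for s, p in belief.items())
    values = []
    for action in ACTIONS:
        immediate = sum(p * R[s][action] for s, p in belief.items())
        future = sum(
            prob * cutoff_value(succ, depth - 1)
            for prob, succ in successors(belief, action)
        )
        values.append(immediate + future)
    return max(values)


initial_belief = {0: 0.5, 1: 0.5}
bounds = [cutoff_value(initial_belief, d) for d in range(4)]
```

With the zero lower bound, exploring deeper can only tighten the result, so `bounds` is non-decreasing in the depth. A better cut-off value, for example the value of a fixed observation-based policy, would give tighter bounds at the same depth.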