Master’s practical course, Master’s thesis, or IDP
AI planning is the discipline concerned with building agents (i.e. computer programs) that act to achieve a given goal, based on a declarative description of the environment and of the actions available to them. In practical applications, such as autonomous driving, it is not possible to model the exact effects of actions. Planning under uncertainty has been studied in AI, control theory, and operations research, where the environment is commonly modelled as a factored Markov decision process (MDP) or a partially observable MDP (POMDP). Tasks in this project include
- extending the existing Isabelle formalisation of MDPs [3] to include the notion of agent rewards
- formalising the notion of (optimal) strategies and verifying algorithms that compute them, such as policy/value iteration [1,2]
- refining the verified algorithms to executable code using the existing Isabelle Refinement Framework [4].
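To illustrate the kind of algorithm to be verified, here is a minimal, unverified sketch of value iteration on a toy two-state MDP. The state and action names, transition probabilities, rewards, and discount factor are illustrative assumptions, not part of the project; the actual work would be carried out in Isabelle/HOL.

```python
GAMMA = 0.9   # discount factor (assumed for illustration)
EPS = 1e-8    # convergence threshold

# P[s][a] = list of (probability, successor state); R[s][a] = immediate reward
P = {
    "s0": {"stay": [(1.0, "s0")], "go": [(0.8, "s1"), (0.2, "s0")]},
    "s1": {"stay": [(1.0, "s1")], "go": [(1.0, "s0")]},
}
R = {
    "s0": {"stay": 0.0, "go": 0.0},
    "s1": {"stay": 1.0, "go": 0.0},
}

def value_iteration(P, R, gamma=GAMMA, eps=EPS):
    """Iterate the Bellman optimality operator until the update is below eps."""
    V = {s: 0.0 for s in P}
    while True:
        V_new = {
            s: max(R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a])
                   for a in P[s])
            for s in P
        }
        if max(abs(V_new[s] - V[s]) for s in P) < eps:
            return V_new
        V = V_new

def greedy_policy(V, P, R, gamma=GAMMA):
    """Extract a policy that is greedy with respect to the value function V."""
    return {
        s: max(P[s], key=lambda a: R[s][a] + gamma * sum(p * V[t] for p, t in P[s][a]))
        for s in P
    }

V = value_iteration(P, R)
pi = greedy_policy(V, P, R)
```

In this toy MDP the only reward is earned by staying in s1, so the extracted policy moves from s0 to s1 and then stays there; the convergence of this fixed-point iteration is exactly what [2] analyses.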
Prerequisites:
Experience with Isabelle/HOL is required.
Advisor:
Mohammad Abdulaziz, email {mansour} AT [in.tum.de]
Supervisor:
Prof. Tobias Nipkow, Room MI 00.09.055, email {nipkow} AT [in.tum.de]
Material to read:
[1] Chapter 6 of this book offers an overview of value/policy iteration: http://projects.laas.fr/planning/book.pdf
[2] A detailed description of the convergence of policy/value iteration: http://www.professeurs.polymtl.ca/jerome.le-ny/teaching/DP_fall09/notes/lec10_VI_PI.pdf
[3] An overview of the formalisation of MDPs in Isabelle: https://www.cs.vu.nl/~jhl890/pub/hoelzl2017mdp.pdf
[4] The Isabelle Refinement Framework: https://www.isa-afp.org/entries/Refine_Monadic.html