Master’s practical course, Master’s thesis, or IDP
AI planning is the discipline of building agents (i.e. computer programs) that act to achieve a given goal, based on a declarative description of the environment and the actions available to them. In practical applications, such as autonomous driving, it is not possible to model the exact effects of actions. Planning under uncertainty has been explored in AI, control theory, and operations research. In these settings the environment is commonly modelled as a factored Markov decision process (MDP) or a partially observable MDP (POMDP). Tasks in this project include
- extending the existing Isabelle formalisation of MDPs to include the notion of agent rewards
- formalising the notion of (optimal) strategies and verifying algorithms that solve these problems, such as policy/value iteration [1,2]
- refining the verified algorithm to executable code using the existing Isabelle refinement framework.
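As background for the verification task, value iteration can be sketched as follows. This is a minimal Python illustration of the algorithm to be verified, not part of the existing Isabelle formalisation; the toy MDP, function name, and data layout are all illustrative assumptions:

```python
# Value iteration on a small explicit MDP: repeatedly apply the Bellman
# optimality operator, which is a gamma-contraction, until the value
# function is (approximately) a fixed point. The toy MDP below is made up.

def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-8):
    """P[s][a] maps successor states to probabilities; R[s][a] is the
    expected reward for taking action a in state s."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {
            s: max(
                sum(p * (R[s][a] + gamma * V[t]) for t, p in P[s][a].items())
                for a in actions
            )
            for s in states
        }
        # Stop once the sup-norm change is below the tolerance.
        if max(abs(V_new[s] - V[s]) for s in states) < eps:
            return V_new
        V = V_new

# Two-state toy MDP: staying in s1 yields reward 1, everything else 0.
states, actions = ["s0", "s1"], ["stay", "go"]
P = {
    "s0": {"stay": {"s0": 1.0}, "go": {"s1": 1.0}},
    "s1": {"stay": {"s1": 1.0}, "go": {"s0": 1.0}},
}
R = {"s0": {"stay": 0.0, "go": 0.0}, "s1": {"stay": 1.0, "go": 0.0}}
V = value_iteration(states, actions, P, R)
```

With discount factor 0.9, the optimal values are V(s1) = 1/(1-0.9) = 10 (stay forever) and V(s0) = 0.9 · V(s1) = 9 (move to s1, then stay). The convergence argument behind this loop is exactly what [1,2] develop formally.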
Experience with Isabelle/HOL is required.
Material to read:
- Chapter 6 of this book offers an overview of value/policy iteration: http://projects.laas.fr/planning/book.pdf
- Detailed treatment of policy/value iteration convergence: http://www.professeurs.polymtl.ca/jerome.le-ny/teaching/DP_fall09/notes/lec10_VI_PI.pdf
- An overview of the formalisation of MDPs in Isabelle: https://www.cs.vu.nl/~jhl890/pub/hoelzl2017mdp.pdf
- Isabelle Refinement Framework: https://www.isa-afp.org/entries/Refine_Monadic.html