A Short Course on Reinforcement Learning
Satinder Singh Baveja, Machine Learning Summer School at Purdue, 2011
This short course will be
a three-part tutorial on reinforcement learning (RL), interpreted broadly to include related methods from decision-theoretic planning, optimal control, and
operations research.
The first part of the course will cover the basics of RL. Topics will include bandit problems and algorithms for solving them, Markov decision problems
(MDPs) and foundational algorithms for solving them in planning and learning settings, and partially observable MDPs (POMDPs) and foundational
algorithms for solving them in the planning and learning settings. The second part of the course will cover advanced methods for solving MDPs and POMDPs,
the use of function approximation in RL, case studies of a couple of RL applications, and narrower topics such as inverse RL and apprenticeship learning.
Time permitting, I might cover some results from RL in multiagent settings. The third part of the course will cover cutting-edge topics including approaches to
state estimation such as predictive state representations (PSRs), the use and learning of structured probabilistic models in controlled dynamical systems,
and the recently defined optimal reward problem. I will conclude with some open challenge problems in RL.