DDLC Seminar: Aditya Mahajan (McGill)

Description

The Data-Driven Learning and Control (DDLC) seminar series, organized by the Information and Decision Science Lab at Cornell University, explores the latest advancements and interdisciplinary approaches to data-driven learning and control systems.

Watch on YouTube Live
 

Agent-state based policies in POMDPs: Beyond belief-state MDPs

The traditional approach to POMDPs is to convert them into fully observed MDPs by considering the belief state as an information state. However, a belief-state based approach requires perfect knowledge of the system dynamics and is therefore not applicable in the learning setting, where the system model is unknown. Various approaches to circumvent this limitation have been proposed in the literature. We present a unified treatment of some of these approaches by viewing them as models in which the agent maintains a local, recursively updateable "agent state" and chooses actions based on that agent state. We highlight the different classes of agent-state based policies and the various approaches that have been proposed in the literature to find good policies within each class. These include the designer's approach to find optimal non-stationary agent-state based policies, policy search approaches to find locally optimal stationary agent-state based policies, and the approximate information state approach to find approximately optimal stationary agent-state based policies. We then present how ideas from the approximate information state approach have been used to improve Q-learning and actor-critic algorithms for learning in POMDPs.
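As a rough illustration of the idea in the abstract (not code from the talk), an agent-state based policy consists of a recursive update z_t = σ(z_{t-1}, y_t) driven only by local observations, and an action rule a_t = π(z_t); no belief over the hidden state is ever computed, so no system model is needed. All names below (`AgentStatePolicy`, `update`, `policy`) are hypothetical:

```python
# Illustrative sketch of an agent-state based policy: the agent keeps a
# local state z, updated recursively from observations y, and chooses
# actions from z alone (no belief state, no knowledge of the dynamics).
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AgentStatePolicy:
    update: Callable  # sigma: (z, y) -> z'  recursive agent-state update
    policy: Callable  # pi:    z -> a        action chosen from agent state

    def run(self, observations: List[float], z0: float) -> List[int]:
        """Roll the policy forward over a sequence of observations."""
        z, actions = z0, []
        for y in observations:
            z = self.update(z, y)           # z_t = sigma(z_{t-1}, y_t)
            actions.append(self.policy(z))  # a_t = pi(z_t)
        return actions

# Toy example: the agent state is just the last observation (a
# finite-memory agent state), and the policy thresholds it at zero.
p = AgentStatePolicy(update=lambda z, y: y,
                     policy=lambda z: 1 if z > 0 else 0)
print(p.run([0.5, -0.2, 0.9], z0=0.0))  # [1, 0, 1]
```

Richer choices of the update map (e.g., a sliding window of observations, or the hidden state of a recurrent network) give the different classes of agent-state based policies discussed in the talk.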

This is joint work with Jayakumar Subramanian, Amit Sinha, Matthieu Geist, Erfan SayedSalehi, Nima Akbarzadeh, Tianwei Ni, and Pierre-Luc Bacon.
 

Bio:
Aditya Mahajan is a professor of electrical and computer engineering at McGill University. He is a member of the McGill Centre for Intelligent Machines, Mila - Québec AI Institute, the International Laboratory for Learning Systems, and the Groupe d'études et de recherche en analyse des décisions. He received his B.Tech. degree in electrical engineering from the Indian Institute of Technology and his M.S. and Ph.D. degrees in electrical engineering and computer science from the University of Michigan. He has held visiting appointments at the University of California, Berkeley and the University of Paris-Saclay.

He is a senior member of the IEEE and a member of Professional Engineers Ontario. He currently serves as an Associate Editor of IEEE Control Systems Letters and of Springer Mathematics of Control, Signals, and Systems. In the past, he has served as an Associate Editor of IEEE Transactions on Automatic Control and of the IEEE Control Systems Society Conference Editorial Board.

He is the recipient of the 2015 George S. Axelby Outstanding Paper Award, the 2016 NSERC Discovery Accelerator Award, the 2014 CDC Best Student Paper Award (as supervisor), and the 2016 NecSys Best Student Paper Award (as supervisor). His principal research interests include decentralized stochastic control, team theory, reinforcement learning, multi-armed bandits, and information theory.