Stable and Sample Efficient Reinforcement Learning
Reinforcement Learning (RL) is widely recognized for its capabilities in control and optimization. Despite its potential, industrial adoption necessitates guarantees of stability and sample efficiency. We present a modular approach to designing inherently stable deep reinforcement learning controllers for linear systems. Our approach leverages the Youla-Kučera parametrization of all stabilizing controllers in conjunction with a purely data-based realization of the system model. This approach retains the “model-free” nature of RL but guarantees closed-loop stability throughout the learning episodes. We also present an extension of RL with meta-learning to improve sample efficiency. We illustrate these algorithms through experiments on a pilot-scale plant and comparison with off-the-shelf industrial controllers.
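To make the stability claim concrete, here is a minimal sketch of the Youla-Kučera idea for a stable plant P: every stabilizing controller can be written as K = Q/(1 - PQ) for some stable Q, so a learner that searches over stable Q never leaves the set of stabilizing controllers. The sketch is an illustration only, not the speaker's implementation. It assumes a scalar first-order discrete-time plant with a perfectly known model (the talk instead uses a data-based realization) and an FIR filter for Q, and the function name `simulate_youla` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_youla(a, b, q_coeffs, r, d=None):
    """Simulate K = Q/(1 - PQ) in its internal-model form.

    Plant (assumed):  y[k+1] = a*y[k] + b*u[k] + d[k], with |a| < 1.
    Youla parameter:  FIR filter u[k] = sum_i q[i] * e[k-i] (always stable).
    """
    r = np.asarray(r, dtype=float)
    q = np.asarray(q_coeffs, dtype=float)
    n = len(r)
    d = np.zeros(n) if d is None else np.asarray(d, dtype=float)
    y = np.zeros(n + 1)        # measured plant output
    y_model = np.zeros(n + 1)  # output of the internal plant model
    e = np.zeros(n)            # signal fed to Q
    u = np.zeros(n)            # control input
    for k in range(n):
        # Feed back only the mismatch between plant and model.
        e[k] = r[k] - (y[k] - y_model[k])
        for i in range(len(q)):
            if k - i >= 0:
                u[k] += q[i] * e[k - i]
        y[k + 1] = a * y[k] + b * u[k] + d[k]
        y_model[k + 1] = a * y_model[k] + b * u[k]
    return y, u

# Arbitrary (even badly tuned) FIR coefficients still give a bounded response.
rng = np.random.default_rng(0)
a, b = 0.9, 0.5
q_random = rng.normal(scale=10.0, size=5)
y, u = simulate_youla(a, b, q_random, np.ones(200))
```

With no disturbance, the closed-loop map from reference to output reduces to PQ, so the output is bounded by |b|/(1 - |a|) times the sum of |q[i]| times the largest reference value, whatever coefficients the learner proposes.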
This work was done in collaboration with Nathan Lawrence, Daniel McClement, Philip Loewen, Michael Forbes and Shuyuan Wang.
Bio: Bhushan Gopaluni is a professor in the Department of Chemical and Biological Engineering and a Vice-Provost & Associate Vice-President at the University of British Columbia. He was previously an associate dean in the Faculty of Applied Science and an associate head in the Department of Chemical & Biological Engineering. He is also an associate faculty member in the Institute of Applied Mathematics, the Institute for Computing, Information and Cognitive Systems, the Pulp and Paper Centre and the Clean Energy Research Centre. He was the Elizabeth and Leslie Gould Teaching Professor from 2014 to 2017. He is currently an associate editor for the Journal of Process Control and was previously an associate editor for the Journal of the Franklin Institute and Results in Control and Optimization.
Bhushan received a Ph.D. from the University of Alberta in 2003 and a Bachelor of Technology from the Indian Institute of Technology, Madras, in 1997, both in chemical engineering. From 2003 to 2005, he worked as an engineering consultant at Matrikon Inc. (now Honeywell Process Solutions), where he designed and commissioned multivariable controllers in British Columbia’s forest bio-products industry and implemented numerous controller performance monitoring projects in the oil & gas, chemical and pharmaceutical industries. He is the recipient of the Killam Teaching Prize and the Dean’s Service Medal from the University of British Columbia and the D. G. Fisher Award in Process Control from the Canadian Society for Chemical Engineering. He is a Fellow of the Canadian Academy of Engineering.