Princeton University

School of Engineering & Applied Science

Safe and Reliable Reinforcement Learning for Continuous Control

Stephen Tu, University of California, Berkeley
105 Computer Science
Thursday, March 7, 2019 - 12:30pm


Many autonomous systems, such as self-driving cars, unmanned aerial vehicles, and personalized robotic assistants, are inherently complex. To deal with this complexity, practitioners are increasingly turning towards data-driven learning techniques such as reinforcement learning (RL) for designing sophisticated control policies. However, there are currently two fundamental issues that limit the widespread deployment of RL: sample inefficiency and the lack of formal safety guarantees. In this talk, I will propose solutions for both of these issues in the context of continuous control tasks. In particular, I will show that in the widely applicable setting where the dynamics are linear, model-based algorithms that exploit this structure are substantially more sample efficient than model-free algorithms, such as the widely used policy gradient method. Furthermore, I will describe a new model-based algorithm that comes with provable safety guarantees and is computationally efficient, relying only on convex programming. I will conclude the talk by discussing the next steps towards safe and reliable deployment of reinforcement learning.


Stephen Tu is a PhD student in Electrical Engineering and Computer Sciences at the University of California, Berkeley, advised by Benjamin Recht. His research interests are in machine learning, control theory, optimization, and statistics. Recently, he has focused on providing safety and performance guarantees for reinforcement learning algorithms in continuous settings. He is supported by a Google PhD fellowship in machine learning.