

Deep Exploration via Randomized Value Functions

An important challenge in reinforcement learning concerns how an agent can simultaneously explore and generalize in a reliably efficient manner. This talk will present a new approach to exploration that induces judicious probing through randomization of value function estimates and operates effectively in tandem with common reinforcement learning algorithms, such as least-squares value iteration and temporal-difference learning, that generalize via parameterized representations of the value function. Theoretical results offer assurances with exhaustive representations of the value function, and computational results suggest that the approach remains effective with generalizing representations.
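To give a flavor of the idea, the sketch below implements a tabular variant of randomized least-squares value iteration on a toy chain environment. Everything here is illustrative: the environment, constants (`SIGMA`, `PRIOR`), and function names are assumptions for the sketch, not details from the talk. The key mechanism is that each episode fits a freshly randomized value function (noisy draws around regularized empirical estimates, with the noise shrinking as data accumulates) and then acts greedily with respect to that sample, so exploration emerges from the randomization itself rather than from dithering.

```python
import random
import math

# Illustrative tabular sketch of exploration via randomized value functions
# (in the spirit of randomized least-squares value iteration). All names and
# constants are assumptions for this toy example.

N = 5            # chain states 0..N-1; reward only at the far-right state
H = N            # episode horizon
SIGMA = 1.0      # stddev scale of the value-function randomization
PRIOR = 1.0      # regularization toward zero (acts like a prior)

def step(s, a):
    """Deterministic chain dynamics: a=1 moves right, a=0 moves left."""
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N - 1 else 0.0
    return s2, r

def rlsvi_q(data):
    """Fit a randomized Q-function by backward value iteration on
    noise-perturbed, regularized empirical estimates (tabular case)."""
    Q = [[[0.0, 0.0] for _ in range(N)] for _ in range(H + 1)]
    for h in range(H - 1, -1, -1):
        for s in range(N):
            for a in (0, 1):
                obs = data.get((s, a), [])
                n = len(obs)
                if n == 0:
                    # No data: a pure prior draw; occasional high samples
                    # push the agent to try untried actions.
                    Q[h][s][a] = random.gauss(0.0, SIGMA)
                    continue
                target = sum(r + max(Q[h + 1][s2]) for s2, r in obs)
                mean = target / (n + PRIOR)
                # Posterior-style noise that shrinks with more data.
                Q[h][s][a] = random.gauss(mean, SIGMA / math.sqrt(n + PRIOR))
    return Q

def run(episodes=200, seed=0):
    random.seed(seed)
    data, returns = {}, []
    for _ in range(episodes):
        Q = rlsvi_q(data)          # fresh value-function sample each episode
        s, total = 0, 0.0
        for h in range(H):
            a = 0 if Q[h][s][0] > Q[h][s][1] else 1   # greedy in the sample
            s2, r = step(s, a)
            data.setdefault((s, a), []).append((s2, r))
            s, total = s2, total + r
        returns.append(total)
    return returns

returns = run()
```

Because the perturbations are resampled per episode but held fixed within one, the agent commits to a coherent plan for a whole episode, which is what makes this style of randomization induce deep, multi-step exploration rather than per-step dithering.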

Type of Seminar:
Control Seminar Series
Prof. Ben van Roy
Stanford University
May 22, 2017, 5:15 pm

Biographical Sketch:
Benjamin Van Roy is a Professor of Electrical Engineering, Management Science and Engineering, and, by courtesy, Computer Science, at Stanford University. His research focuses on understanding how an agent interacting with a poorly understood environment can learn over time to make effective decisions. He is interested in questions concerning what is possible or impossible, as well as how to design efficient learning algorithms. He is an INFORMS Fellow, has served on the editorial boards of Machine Learning, Mathematics of Operations Research, and Operations Research, and has led research programs at, or founded, several technology companies.