Princeton University

School of Engineering & Applied Science

An information-theoretic perspective on the exploration/exploitation tradeoff

Speaker: 
Daniel Russo, Columbia University
Location: 
Engineering Quadrangle B205
Date/Time: 
Tuesday, November 27, 2018 - 4:30pm

Abstract:

Modern online marketplaces feed themselves: they rely on historical data to optimize content and user-interactions, but it’s the data generated from these interactions that is fed back into the system and used to optimize future interactions. As this cycle continues, good performance requires algorithms capable of learning through sequential interactions, systematically experimenting to gather useful information, and balancing exploration with exploitation.

In this talk, Daniel will formulate a broad family of such online decision-making problems. He will then present an information-theoretic analysis of Thompson sampling, an algorithm that has recently been the focus of much attention in academia and industry. Inspired by this information-theoretic perspective, we propose and analyze two new two new algorithms, information-directed sampling and satisficing Thompson sampling, that sometimes vastly outperform Thompson sampling.

*This talk is based on joint work with Benjamin Van Roy

Bio:

Daniel is an Assistant Professor in the Decisions, Risk, and Operations Division of the Columbia Business School. He joined the Decision, Risk, and Operations division of the Columbia Business School as an assistant professor in Summer 2017. Prior to joining Columbia, Daniel spent one great year as an assistant professor in the MEDS department at Northwestern's Kellogg School of Management and one year at Microsoft Research in New England as Postdoctoral Researcher. He received his PhD from Stanford University in 2015, where he was advised by Benjamin Van Roy. In 2011 Daniel received his BS in Mathematics and Economics from the University of Michigan.
His research contributes to the design, analysis, and application of algorithms for statistical machine learning and sequential decision-making. Professor Russo’s specific areas of expertise include multi-armed bandit problems, reinforcement learning, and online-optimization.