Presenter
Tor Lattimore
Research Scientist,
DeepMind, London
Abstract
Tor will give a whirlwind tour of a series of recent papers on the information-directed sampling (IDS) algorithm for sequential decision-making. The results come in three flavours. First, generalising and applying IDS to problems with a rich information structure, such as convex bandits and partial monitoring. Second, showing a connection between the optimisation problem solved by IDS and the optimisation problem that determines the asymptotic lower bound for stochastic structured bandit problems. Third, showing a deep connection between IDS and the mirror descent framework for convex optimisation.
References
https://arxiv.org/abs/2011.05944
https://arxiv.org/abs/2009.12228
https://arxiv.org/abs/1907.05772
https://arxiv.org/abs/2006.00475
https://arxiv.org/abs/2002.11182
https://arxiv.org/abs/1902.00470
https://arxiv.org/abs/1905.11817
Biography
Tor Lattimore is a research scientist at DeepMind working on the foundations of machine learning, especially decision-making. Before joining DeepMind he was an assistant professor at Indiana University and a postdoc at the University of Alberta. He obtained his PhD from the Australian National University in 2014 under the supervision of Marcus Hutter.