DeepMind
Understanding Information-Directed Sampling, When and How to Use It?
Botao Hao
UW
Towards Instance-Optimal Algorithms for Reinforcement Learning
Kevin Jamieson
DeepMind
Epistemic Neural Networks
Ian Osband
NUS
Optimal Clustering with Bandit Feedback
Vincent Y. F. Tan
Adaptivity and Confounding in Multi-Armed Bandit Experiments
Daniel Russo