Ian Osband
OpenAI
Verified email at openai.com - Homepage
Title · Cited by · Year
Deep exploration via bootstrapped DQN
I Osband, C Blundell, A Pritzel, B Van Roy
Advances in neural information processing systems 29, 2016
Cited by 1469 · 2016
Deep Q-learning from demonstrations
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
Cited by 1225 · 2018
A tutorial on Thompson sampling
DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen
Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018
Cited by 1129 · 2018
Minimax regret bounds for reinforcement learning
MG Azar, I Osband, R Munos
International conference on machine learning, 263-272, 2017
Cited by 821 · 2017
Randomized prior functions for deep reinforcement learning
I Osband, J Aslanides, A Cassirer
Advances in Neural Information Processing Systems 31, 2018
Cited by 414 · 2018
Deep Exploration via Randomized Value Functions
I Osband
https://searchworks.stanford.edu/view/11891201, 2016
Cited by 331 · 2016
Generalization and exploration via randomized value functions
I Osband, B Van Roy, Z Wen
International Conference on Machine Learning, 2377-2386, 2016
Cited by 330 · 2016
Why is posterior sampling better than optimism for reinforcement learning?
I Osband, B Van Roy
International conference on machine learning, 2701-2710, 2017
Cited by 269 · 2017
The uncertainty Bellman equation and exploration
B O’Donoghue, I Osband, R Munos, V Mnih
International conference on machine learning, 3836-3845, 2018
Cited by 220 · 2018
Model-based reinforcement learning and the eluder dimension
I Osband, B Van Roy
Advances in Neural Information Processing Systems 27, 2014
Cited by 194 · 2014
Learning from demonstrations for real world reinforcement learning
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...
arXiv preprint arXiv:1704.03732, 2017
Cited by 179 · 2017
Behaviour suite for reinforcement learning
I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...
arXiv preprint arXiv:1908.03568, 2019
Cited by 177 · 2019
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout
I Osband
http://bayesiandeeplearning.org/papers/BDL_4.pdf
Cited by 166*
Deep learning for time series modeling
E Busseti, I Osband, S Wong
Technical report, Stanford University, 1-5, 2012
Cited by 139 · 2012
Near-optimal reinforcement learning in factored MDPs
I Osband, B Van Roy
Advances in Neural Information Processing Systems 27, 2014
Cited by 121 · 2014
On lower bounds for regret in reinforcement learning
I Osband, B Van Roy
arXiv preprint arXiv:1608.02732, 2016
Cited by 112 · 2016
Bootstrapped Thompson sampling and deep exploration
I Osband, B Van Roy
arXiv preprint arXiv:1507.00300, 2015
Cited by 101 · 2015
(More) efficient reinforcement learning via posterior sampling
I Osband, D Russo, B Van Roy
Advances in Neural Information Processing Systems 26, 2013
Cited by 101 · 2013
Meta-learning of sequential strategies
PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...
arXiv preprint arXiv:1905.03030, 2019
Cited by 89 · 2019
Epistemic neural networks
I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by 88 · 2024
Articles 1–20