Ian Osband
OpenAI
Verified email at openai.com - Homepage
Title · Cited by · Year
Deep exploration via bootstrapped DQN
I Osband, C Blundell, A Pritzel, B Van Roy
Advances in neural information processing systems 29, 2016
Cited by 1469 · 2016
Deep Q-learning from demonstrations
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, D Horgan, ...
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
Cited by 1225 · 2018
A tutorial on Thompson sampling
DJ Russo, B Van Roy, A Kazerouni, I Osband, Z Wen
Foundations and Trends® in Machine Learning 11 (1), 1-96, 2018
Cited by 1129 · 2018
Minimax regret bounds for reinforcement learning
MG Azar, I Osband, R Munos
International conference on machine learning, 263-272, 2017
Cited by 821 · 2017
Randomized prior functions for deep reinforcement learning
I Osband, J Aslanides, A Cassirer
Advances in Neural Information Processing Systems 31, 2018
Cited by 414 · 2018
Deep Exploration via Randomized Value Functions
I Osband
https://searchworks.stanford.edu/view/11891201, 2016
Cited by 331 · 2016
Generalization and exploration via randomized value functions
I Osband, B Van Roy, Z Wen
International Conference on Machine Learning, 2377-2386, 2016
Cited by 330 · 2016
Why is posterior sampling better than optimism for reinforcement learning?
I Osband, B Van Roy
International conference on machine learning, 2701-2710, 2017
Cited by 269 · 2017
The uncertainty Bellman equation and exploration
B O’Donoghue, I Osband, R Munos, V Mnih
International conference on machine learning, 3836-3845, 2018
Cited by 220 · 2018
Model-based reinforcement learning and the eluder dimension
I Osband, B Van Roy
Advances in Neural Information Processing Systems 27, 2014
Cited by 194 · 2014
Learning from demonstrations for real world reinforcement learning
T Hester, M Vecerik, O Pietquin, M Lanctot, T Schaul, B Piot, A Sendonaris, ...
arXiv preprint arXiv:1704.03732, 2017
Cited by 179 · 2017
Behaviour suite for reinforcement learning
I Osband, Y Doron, M Hessel, J Aslanides, E Sezener, A Saraiva, ...
arXiv preprint arXiv:1908.03568, 2019
Cited by 177 · 2019
Risk versus Uncertainty in Deep Learning: Bayes, Bootstrap and the Dangers of Dropout
I Osband
http://bayesiandeeplearning.org/papers/BDL_4.pdf
Cited by 166*
Deep learning for time series modeling
E Busseti, I Osband, S Wong
Technical report, Stanford University, 1-5, 2012
Cited by 139 · 2012
Near-optimal reinforcement learning in factored MDPs
I Osband, B Van Roy
Advances in Neural Information Processing Systems 27, 2014
Cited by 121 · 2014
On lower bounds for regret in reinforcement learning
I Osband, B Van Roy
arXiv preprint arXiv:1608.02732, 2016
Cited by 112 · 2016
Bootstrapped Thompson sampling and deep exploration
I Osband, B Van Roy
arXiv preprint arXiv:1507.00300, 2015
Cited by 101 · 2015
(More) efficient reinforcement learning via posterior sampling
I Osband, D Russo, B Van Roy
Advances in Neural Information Processing Systems 26, 2013
Cited by 101 · 2013
Meta-learning of sequential strategies
PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...
arXiv preprint arXiv:1905.03030, 2019
Cited by 89 · 2019
Epistemic neural networks
I Osband, Z Wen, SM Asghari, V Dwaracherla, M Ibrahimi, X Lu, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by 88 · 2024
Articles 1–20