Philip Thomas

Cited by

	All	Since 2019
Citations	4641	3624
h-index	33	29
i10-index	58	52

Citations per year

Year	Citations
2011	16
2012	27
2013	29
2014	41
2015	68
2016	137
2017	179
2018	253
2019	412
2020	581
2021	676
2022	721
2023	770
2024	461

Public access

Based on funding mandates

Available	27 articles
Not available	0 articles

Co-authors

Georgios Theocharous	Adobe Research	Verified email at adobe.com
Emma Brunskill	Associate Professor of Computer Science, Stanford University	Verified email at cs.stanford.edu
Bruno Castro da Silva	University of Massachusetts	Verified email at cs.umass.edu
Scott M. Jordan	Postdoctoral Fellow, University of Alberta	Verified email at ualberta.ca
George Konidaris	Brown	Verified email at cs.brown.edu
Scott Niekum	Associate Professor, University of Massachusetts Amherst	Verified email at cs.umass.edu
Stephen Giguere	University of Massachusetts	Verified email at cs.umass.edu
Yuriy Brun	Manning College of Information and Computer Sciences, University of Massachusetts Amherst	Verified email at cs.umass.edu
Antonie J. (Ton) van den Bogert	Professor of Mechanical Engineering, Cleveland State University	Verified email at csuohio.edu
Chris Nota	University of Massachusetts, Amherst	Verified email at cs.umass.edu
Michael Branicky	Professor of Electrical Engineering & Computer Science, University of Kansas	Verified email at ku.edu
Erik Learned-Miller	Professor of Computer Science, University of Massachusetts Amherst	Verified email at cs.umass.edu
Sarah Osentoski	Vinci4d	Verified email at vinci4d.ai
Blossom Metevier	University of Massachusetts Amherst	Verified email at umass.edu
Sridhar Mahadevan	Director, Data Science Lab, Adobe Research & Professor, University of Massachusetts, Amherst	Verified email at cs.umass.edu
Will Dabney	DeepMind	Verified email at google.com
Francisco M. Garcia	University of Massachusetts - Amherst	Verified email at cs.umass.edu
Robert Kirsch	Professor and Chair of Biomedical Engineering, Case Western Reserve University	Verified email at case.edu
Arthur Guez	Google DeepMind	Verified email at google.com
Rémi Munos	Google DeepMind	Verified email at inria.fr


Philip Thomas

University of Massachusetts Amherst

Verified email at cs.umass.edu

Artificial Intelligence, Reinforcement Learning, AI Safety


Title	Authors	Venue	Cited by	Year
Data-efficient off-policy policy evaluation for reinforcement learning	P Thomas, E Brunskill	International Conference on Machine Learning, 2139-2148, 2016	725	2016
Value function approximation in reinforcement learning using the Fourier basis	G Konidaris, S Osentoski, P Thomas	Proceedings of the AAAI conference on artificial intelligence 25 (1), 380-385, 2011	547	2011
High-confidence off-policy evaluation	P Thomas, G Theocharous, M Ghavamzadeh	Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	319	2015
High confidence policy improvement	P Thomas, G Theocharous, M Ghavamzadeh	International Conference on Machine Learning, 2380-2388, 2015	220	2015
Ad recommendation systems for life-time value optimization	G Theocharous, PS Thomas, M Ghavamzadeh	Proceedings of the 24th international conference on world wide web, 1305-1310, 2015	199	2015
Preventing undesirable behavior of intelligent machines	P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill	Science 366 (6468), 999-1004, 2019	198	2019
Learning action representations for reinforcement learning	Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas	International conference on machine learning, 941-950, 2019	188	2019
Increasing the action gap: New operators for reinforcement learning	MG Bellemare, G Ostrovski, A Guez, P Thomas, R Munos	Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016	170	2016
Bias in natural actor-critic algorithms	P Thomas	International conference on machine learning, 441-448, 2014	159	2014
Safe reinforcement learning	PS Thomas		119	2015
Optimizing for the future in non-stationary MDPs	Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ...	International Conference on Machine Learning, 1414-1425, 2020	71	2020
Is the policy gradient a gradient?	C Nota, PS Thomas	arXiv preprint arXiv:1906.07073, 2019	71	2019
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces	S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ...	arXiv preprint arXiv:1405.6757, 2014	69	2014
Evaluating the performance of reinforcement learning algorithms	S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas	International Conference on Machine Learning, 4962-4973, 2020	67	2020
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards	KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch	IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017	67	2017
Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing	P Thomas, G Theocharous, M Ghavamzadeh, I Durugkar, E Brunskill	Proceedings of the AAAI Conference on Artificial Intelligence 31 (2), 4740-4745, 2017	64	2017
Policy gradient methods for reinforcement learning with function approximation and action-dependent baselines	PS Thomas, E Brunskill	arXiv preprint arXiv:1706.06643, 2017	62	2017
Importance Sampling for Fair Policy Selection	S Doroudi, PS Thomas, E Brunskill	Grantee Submission, 2017	58	2017
Risk Quantification for Policy Deployment	PS Thomas, G Theocharous, M Ghavamzadeh	US Patent App. 14/552,047, 2016	58	2016
Offline contextual bandits with high probability fairness guarantees	B Metevier, S Giguere, S Brockman, A Kobren, Y Brun, E Brunskill, ...	Advances in neural information processing systems 32, 2019	55	2019
