Philip Thomas
Title
Cited by
Year
Value function approximation in reinforcement learning using the Fourier basis
G Konidaris
Computer Science Department Faculty Publication Series, 101, 2008
Cited by: 281, Year: 2008
Data-efficient off-policy policy evaluation for reinforcement learning
P Thomas, E Brunskill
International Conference on Machine Learning, 2139-2148, 2016
Cited by: 209, Year: 2016
High-confidence off-policy evaluation
PS Thomas, G Theocharous, M Ghavamzadeh
Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
Cited by: 137, Year: 2015
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by: 102, Year: 2015
Increasing the action gap: New operators for reinforcement learning
MG Bellemare, G Ostrovski, A Guez, PS Thomas, R Munos
arXiv preprint arXiv:1512.04860, 2015
Cited by: 88, Year: 2015
Bias in natural actor-critic algorithms
P Thomas
International Conference on Machine Learning, 441-448, 2014
Cited by: 82, Year: 2014
Personalized ad recommendation systems for life-time value optimization with guarantees
G Theocharous, PS Thomas, M Ghavamzadeh
Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015
Cited by: 74, Year: 2015
Safe reinforcement learning
PS Thomas
University of Massachusetts Libraries, 2015
Cited by: 44, Year: 2015
Proximal reinforcement learning: A new theory of sequential decision making in primal-dual spaces
S Mahadevan, B Liu, P Thomas, W Dabney, S Giguere, N Jacek, I Gemp, ...
arXiv preprint arXiv:1405.6757, 2014
Cited by: 38, Year: 2014
Learning action representations for reinforcement learning
Y Chandak, G Theocharous, J Kostas, S Jordan, PS Thomas
arXiv preprint arXiv:1902.00183, 2019
Cited by: 34, Year: 2019
Application of the actor-critic architecture to functional electrical stimulation control of a human arm
P Thomas, M Branicky, A van den Bogert, K Jagodnik
Proceedings of the... Innovative Applications of Artificial Intelligence …, 2009
Cited by: 33, Year: 2009
Using options and covariance testing for long horizon off-policy policy evaluation
Z Guo, PS Thomas, E Brunskill
Advances in Neural Information Processing Systems, 2492-2501, 2017
Cited by: 24, Year: 2017
Preventing undesirable behavior of intelligent machines
P Thomas, B Castro da Silva, A Barto, S Giguere, Y Brun, E Brunskill
Science 366 (6468), 999-1004, 2019
Cited by: 23, Year: 2019
Training an actor-critic reinforcement learning controller for arm movement using human-generated rewards
KM Jagodnik, PS Thomas, AJ van den Bogert, MS Branicky, RF Kirsch
IEEE Transactions on Neural Systems and Rehabilitation Engineering 25 (10 …, 2017
Cited by: 22, Year: 2017
Projected natural actor-critic
PS Thomas, WC Dabney, S Giguere, S Mahadevan
Advances in Neural Information Processing Systems, 2337-2345, 2013
Cited by: 22, Year: 2013
Conjugate Markov Decision Processes
P Thomas, A Barto
International Conference on Machine Learning, 137-144, 2011
Cited by: 21, Year: 2011
TD_γ: Re-evaluating complex backups in temporal difference learning
G Konidaris, S Niekum, PS Thomas
Advances in Neural Information Processing Systems, 2402-2410, 2011
Cited by: 21, Year: 2011
Importance Sampling for Fair Policy Selection
S Doroudi, PS Thomas, E Brunskill
Grantee Submission, 2017
Cited by: 20, Year: 2017
Some recent applications of reinforcement learning
AG Barto, PS Thomas, RS Sutton
Proceedings of the Eighteenth Yale Workshop on Adaptive and Learning Systems, 2017
Cited by: 18, Year: 2017
Motor primitive discovery
PS Thomas, AG Barto
2012 IEEE International Conference on Development and Learning and …, 2012
Cited by: 18, Year: 2012