Hengshuai Yao
Verified email at ualberta.ca - Homepage
Title | Cited by | Year
Universal Option Models
H Yao, C Szepesvari, R Sutton, S Bhatnagar, J Modayil
35* | 2014
Negative log likelihood ratio loss for deep neural network classification
H Yao, D Zhu, B Jiang, P Yu
Proceedings of the Future Technologies Conference, 276-282, 2019
25 | 2019
Distributional reinforcement learning for efficient exploration
B Mavrin, H Yao, L Kong, K Wu, Y Yu
International conference on machine learning, 4424-4434, 2019
24 | 2019
Multi-step dyna planning for policy evaluation and control
H Yao, RS Sutton, S Bhatnagar, D Dongcui, C Szepesvári
NIPS, 2009
19 | 2009
Provably convergent two-timescale off-policy actor-critic with function approximation
S Zhang, B Liu, H Yao, S Whiteson
International Conference on Machine Learning, 11204-11213, 2020
18 | 2020
Discounted reinforcement learning is not an optimization problem
A Naik, R Shariff, N Yasui, H Yao, RS Sutton
arXiv preprint arXiv:1910.02140, 2019
17 | 2019
Preconditioned temporal difference learning
H Yao, ZQ Liu
Proceedings of the 25th international conference on Machine learning, 1208-1215, 2008
14 | 2008
Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans
J Jin, NM Nguyen, N Sakib, D Graves, H Yao, M Jagersand
2020 IEEE International Conference on Robotics and Automation (ICRA), 6979-6985, 2020
13 | 2020
Ace: An actor ensemble algorithm for continuous control with tree search
S Zhang, H Yao
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 5789-5796, 2019
13 | 2019
QUOTA: The quantile option architecture for reinforcement learning
S Zhang, H Yao
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 5797-5804, 2019
13 | 2019
Pseudo-MDPs and Factored Linear Action Models
H Yao, C Szepesvari, BA Pires, X Zhang
IEEE ADPRL, 2014
12 | 2014
Approximate policy iteration with linear action models
H Yao, C Szepesvári
Proceedings of the AAAI Conference on Artificial Intelligence 26 (1), 2012
12 | 2012
Deep reinforcement learning with decorrelation
B Mavrin, H Yao, L Kong
arXiv preprint arXiv:1903.07765, 2019
8 | 2019
Weakly supervised few-shot object segmentation using co-attention with visual and semantic embeddings
M Siam, N Doraiswamy, BN Oreshkin, H Yao, M Jagersand
arXiv preprint arXiv:2001.09540, 2020
7 | 2020
Hill climbing on value estimates for search-control in dyna
Y Pan, H Yao, A Farahmand, M White
arXiv preprint arXiv:1906.07791, 2019
7 | 2019
Towards practical hierarchical reinforcement learning for multi-lane autonomous driving
MS Nosrati, EA Abolfathi, M Elmahgiubi, P Yadmellat, J Luo, Y Zhang, ...
7 | 2018
Provably convergent off-policy actor-critic with function approximation
S Zhang, B Liu, H Yao, S Whiteson
5 | 2019
Towards comprehensive maneuver decisions for lane change using reinforcement learning
C Chen, J Qian, H Yao, J Luo, H Zhang, W Liu
5 | 2018
Variance-reduced off-policy memory-efficient policy search
D Lyu, Q Qi, M Ghavamzadeh, H Yao, T Yang, B Liu
arXiv preprint arXiv:2009.06548, 2020
4 | 2020
Reinforcement ranking
H Yao, D Schuurmans
arXiv preprint arXiv:1303.5988, 2013
4 | 2013