Adam Gleave
CEO at FAR AI
Verified email at far.ai
Title / Cited by / Year
Stable-baselines3: Reliable reinforcement learning implementations
A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann
The Journal of Machine Learning Research 22 (1), 12348-12355, 2021
Cited by: 1327*; Year: 2021
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by: 839; Year: 2018
Adversarial policies: Attacking deep reinforcement learning
A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell
International Conference on Learning Representations, 2020
Cited by: 353; Year: 2020
Firmament: Fast, centralized cluster scheduling at scale
I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016
Cited by: 266; Year: 2016
Inverse reinforcement learning for video games
A Tucker, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2018
Cited by: 48; Year: 2018
Quantifying differences in reward functions
A Gleave, M Dennis, S Legg, S Russell, J Leike
International Conference on Learning Representations, 2021
Cited by: 45; Year: 2021
imitation: Clean imitation learning implementations
A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...
arXiv preprint arXiv:2211.11972, 2022
Cited by: 40*; Year: 2022
Multi-task maximum entropy inverse reinforcement learning
A Gleave, O Habryka
GoalsRL Workshop at ICML, 2018
Cited by: 35; Year: 2018
Active inverse reward design
S Mindermann, R Shah, A Gleave, D Hadfield-Menell
GoalsRL Workshop at ICML, 2018
Cited by: 27; Year: 2018
Adversarial Policies Beat Superhuman Go AIs
TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...
arXiv preprint arXiv:2211.00241, 2022
Cited by: 25*; Year: 2022
Understanding learned reward functions
EJ Michaud, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by: 20; Year: 2020
Making compression algorithms for Unicode text
A Gleave, C Steinruecken
Data Compression Conference, 2017
Cited by: 17; Year: 2017
A primer on maximum causal entropy inverse reinforcement learning
A Gleave, S Toyer
arXiv preprint arXiv:2203.11409, 2022
Cited by: 15; Year: 2022
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by: 13; Year: 2023
Uncertainty estimation for language reward models
A Gleave, G Irving
arXiv preprint arXiv:2203.07472, 2022
Cited by: 12; Year: 2022
On The Fragility of Learned Reward Functions
L McKinney, Y Duan, D Krueger, A Gleave
arXiv preprint arXiv:2301.03652, 2023
Cited by: 6; Year: 2023
Preprocessing reward functions for interpretability
E Jenner, A Gleave
arXiv preprint arXiv:2203.13553, 2022
Cited by: 6; Year: 2022
DERAIL: Diagnostic Environments for Reward And Imitation Learning
P Freire, A Gleave, S Toyer, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by: 6; Year: 2020
Reducing exploitability with population based training
P Czempin, A Gleave
arXiv preprint arXiv:2208.05083, 2022
Cited by: 5; Year: 2022
seals: Suite of environments for algorithms that learn specifications
A Gleave, P Freire, S Wang, S Toyer
Cited by: 4; Year: 2020