Adam Gleave
CEO at FAR AI
Verified email at far.ai · Homepage
Title · Cited by · Year
Stable-baselines3: Reliable reinforcement learning implementations
A Raffin, A Hill, A Gleave, A Kanervisto, M Ernestus, N Dormann
Journal of Machine Learning Research 22 (268), 1-8, 2021
Cited by 1794 · 2021
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by 891 · 2018
Adversarial policies: Attacking deep reinforcement learning
A Gleave, M Dennis, C Wild, N Kant, S Levine, S Russell
International Conference on Learning Representations, 2020
Cited by 396 · 2020
Firmament: Fast, centralized cluster scheduling at scale
I Gog, M Schwarzkopf, A Gleave, RNM Watson, S Hand
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2016
Cited by 279 · 2016
Inverse reinforcement learning for video games
A Tucker, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2018
Cited by 55 · 2018
imitation: Clean imitation learning implementations
A Gleave, M Taufeeque, J Rocamonde, E Jenner, SH Wang, S Toyer, ...
arXiv preprint arXiv:2211.11972, 2022
Cited by 53* · 2022
Quantifying differences in reward functions
A Gleave, M Dennis, S Legg, S Russell, J Leike
International Conference on Learning Representations, 2021
Cited by 53 · 2021
Multi-task maximum entropy inverse reinforcement learning
A Gleave, O Habryka
GoalsRL Workshop at ICML, 2018
Cited by 44 · 2018
Adversarial Policies Beat Superhuman Go AIs
TT Wang, A Gleave, T Tseng, N Belrose, J Miller, MD Dennis, Y Duan, ...
arXiv preprint arXiv:2211.00241, 2022
Cited by 39* · 2022
Active inverse reward design
S Mindermann, R Shah, A Gleave, D Hadfield-Menell
GoalsRL Workshop at ICML, 2018
Cited by 29 · 2018
Understanding learned reward functions
EJ Michaud, A Gleave, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 26 · 2020
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by 25 · 2023
Uncertainty estimation for language reward models
A Gleave, G Irving
arXiv preprint arXiv:2203.07472, 2022
Cited by 23 · 2022
A primer on maximum causal entropy inverse reinforcement learning
A Gleave, S Toyer
arXiv preprint arXiv:2203.11409, 2022
Cited by 18 · 2022
Making compression algorithms for Unicode text
A Gleave, C Steinruecken
Data Compression Conference, 2017
Cited by 16 · 2017
On the fragility of learned reward functions
L McKinney, Y Duan, D Krueger, A Gleave
arXiv preprint arXiv:2301.03652, 2023
Cited by 10 · 2023
Exploiting novel GPT-4 APIs
K Pelrine, M Taufeeque, M Zając, E McLean, A Gleave
arXiv preprint arXiv:2312.14302, 2023
Cited by 8 · 2023
Preprocessing reward functions for interpretability
E Jenner, A Gleave
arXiv preprint arXiv:2203.13553, 2022
Cited by 8 · 2022
DERAIL: Diagnostic Environments for Reward And Imitation Learning
P Freire, A Gleave, S Toyer, S Russell
Deep Reinforcement Learning Workshop at NeurIPS, 2020
Cited by 8 · 2020
Reducing exploitability with population based training
P Czempin, A Gleave
arXiv preprint arXiv:2208.05083, 2022
Cited by 5 · 2022
Articles 1–20