Paul Christiano

Cited by

	All	Since 2019
Citations	19002	17551
h-index	19	19
i10-index	22	19

7000

3500

1750

5250

201520162017201820192020202120222023202449 137 410 648 741 938 1102 1529 6684 6515

Paul Christiano

National Institute of Standards and Technology

Verified email at nist.gov - Homepage

Artificial Intelligence


Title Sort by citations Sort by year Sort by title	Cited by Cited by	Year
Training language models to follow instructions with human feedback L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ... Advances in neural information processing systems 35, 27730-27744, 2022	7973	2022
Concrete problems in AI safety D Amodei, C Olah, J Steinhardt, P Christiano, J Schulman, D Mané arXiv preprint arXiv:1606.06565, 2016	2746	2016
Deep reinforcement learning from human preferences PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei Advances in neural information processing systems 30, 2017	2510	2017
Learning to summarize with human feedback N Stiennon, L Ouyang, J Wu, D Ziegler, R Lowe, C Voss, A Radford, ... Advances in Neural Information Processing Systems 33, 3008-3021, 2020	1321	2020
Theano: A Python framework for fast computation of mathematical expressions R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, N Ballas, ... arXiv e-prints, arXiv: 1605.02688, 2016	1129*	2016
Fine-tuning language models from human preferences DM Ziegler, N Stiennon, J Wu, TB Brown, A Radford, D Amodei, ... arXiv preprint arXiv:1909.08593, 2019	1065	2019
Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs P Christiano, JA Kelner, A Madry, DA Spielman, SH Teng Proceedings of the forty-third annual ACM symposium on Theory of computing …, 2011	412	2011
A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models C Finn, P Christiano, P Abbeel, S Levine arXiv preprint arXiv:1611.03852, 2016	409	2016
Transfer from simulation to real world through learning deep inverse dynamics model P Christiano, Z Shah, I Mordatch, J Schneider, T Blackwell, J Tobin, ... arXiv preprint arXiv:1610.03518, 2016	263	2016
Recursively summarizing books with human feedback J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano arXiv preprint arXiv:2109.10862, 2021	222	2021
Quantum money from hidden subspaces S Aaronson, P Christiano Proceedings of the forty-fourth annual ACM symposium on Theory of computing …, 2012	201	2012
A cryptographic test of quantumness and certifiable randomness from a single quantum device Z Brakerski, P Christiano, U Mahadev, U Vazirani, T Vidick Journal of the ACM (JACM) 68 (5), 1-47, 2021	166	2021
AI safety via debate G Irving, P Christiano, D Amodei arXiv preprint arXiv:1805.00899, 2018	142	2018
Unrestricted adversarial examples TB Brown, N Carlini, C Zhang, C Olsson, P Christiano, I Goodfellow arXiv preprint arXiv:1809.08352, 2018	104	2018
Model evaluation for extreme risks T Shevlane, S Farquhar, B Garfinkel, M Phuong, J Whittlestone, J Leung, ... arXiv preprint arXiv:2305.15324, 2023	94	2023
Supervising strong learners by amplifying weak experts P Christiano, B Shlegeris, D Amodei arXiv preprint arXiv:1810.08575, 2018	80	2018
Robust cooperation in the prisoner's dilemma: Program equilibrium via provability logic M Barasz, P Christiano, B Fallenstein, M Herreshoff, P LaVictoire, ... arXiv preprint arXiv:1401.5577, 2014	48*	2014
Sleeper agents: Training deceptive llms that persist through safety training E Hubinger, C Denison, J Mu, M Lambert, M Tong, M MacDiarmid, ... arXiv preprint arXiv:2401.05566, 2024	28	2024
Evaluating language-model agents on realistic autonomous tasks M Kinniment, LJK Sato, H Du, B Goodrich, M Hasin, L Chan, LH Miles, ... arXiv preprint arXiv:2312.11671, 2023	20	2023
Online local learning via semidefinite programming P Christiano Proceedings of the forty-sixth annual ACM symposium on Theory of computing …, 2014	19	2014

The system can't perform the operation now. Try again later.

Articles 1–20

Citations per year

Duplicate citations

Merged citations

Add co-authorsCo-authors

Follow

Cited by