Zeyuan Allen-Zhu

Cited by

	All	Since 2019
Citations	14125	12080
h-index	44	39
i10-index	59	58

Citations per year

Year	Citations
2011	37
2012	62
2013	85
2014	138
2015	155
2016	287
2017	474
2018	744
2019	1111
2020	1432
2021	1648
2022	1907
2023	3589
2024	2369

Public access

18 articles available, 0 articles not available (based on funding mandates)

Co-authors

Yuanzhi Li	Assistant Professor at CMU	Verified email at andrew.cmu.edu
Weizhu Chen	Microsoft	Verified email at microsoft.com
Lorenzo Orecchia	University of Chicago, Computer Science	Verified email at bu.edu
Edward Hu	OpenAI	Verified email at openai.com
Phillip Wallis	Amazon	Verified email at amazon.com
Elad Hazan	Professor at Princeton University and Director Google AI Princeton	Verified email at princeton.edu
Yang Yuan	Tsinghua University	Verified email at tsinghua.edu.cn
Zhao Song	Adobe Research	Verified email at ias.edu
Yelong Shen	Microsoft	Verified email at microsoft.com
Alessandro Chiesa	EPFL	Verified email at epfl.ch
Chenguang Zhu	Head of Zoom GenAI Science	Verified email at zoom.us
zheng chen	Microsoft	Verified email at microsoft.com
Sebastien Bubeck	VP GenAI Research, Microsoft AI	Verified email at microsoft.com
Naman Agarwal	Senior Research Scientist, Google AI Princeton	Verified email at google.com
Tengyu MA	Stanford University	Verified email at stanford.edu
Brian Bullins	Assistant Professor, Purdue University	Verified email at purdue.edu
Zhenyu Liao	Applied Scientist in Amazon Inc.	Verified email at amazon.com
Pinyan Lu	ITCS, Shanghai University of Finance and Economics	Verified email at mail.shufe.edu.cn
Xiaorui Sun	University of Illinois at Chicago	Verified email at uic.edu
Michael I. Jordan	Professor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC Berkeley	Verified email at cs.berkeley.edu

Zeyuan Allen-Zhu

Meta AI / FAIR Labs

Verified email at csail.mit.edu

Language Models Machine Learning Optimization Algorithms Theory


Title	Authors	Venue	Cited by	Year
LoRA: Low-rank adaptation of large language models	EJ Hu, Y Shen, P Wallis, Z Allen-Zhu, Y Li, S Wang, L Wang, W Chen	ICLR 2022: International Conference on Learning Representations, 2022	3912	2022
A convergence theory for deep learning via over-parameterization	Z Allen-Zhu, Y Li, Z Song	ICML 2019: International Conference on Machine Learning, 2019	1483	2019
Is Q-learning Provably Efficient?	C Jin, Z Allen-Zhu, S Bubeck, MI Jordan	NIPS 2018: Neural Information Processing Systems, 2018	879	2018
Learning and generalization in overparameterized neural networks, going beyond two layers	Z Allen-Zhu, Y Li, Y Liang	NeurIPS 2019: Neural Information Processing Systems, 2019	812	2019
Katyusha: the first direct acceleration of stochastic gradient methods	Z Allen-Zhu	STOC 2017: Symposium on Theory of Computing, 19-23, 2017	673	2017
Variance reduction for faster non-convex optimization	Z Allen-Zhu, E Hazan	ICML 2016: International Conference on Machine Learning, 699-707, 2016	421	2016
Linear coupling: An ultimate unification of gradient and mirror descent	Z Allen-Zhu, L Orecchia	ITCS 2017: Innovations in Theoretical Computer Science, 2017	375	2017
Finding approximate local minima faster than gradient descent	N Agarwal, Z Allen-Zhu, B Bullins, E Hazan, T Ma	STOC 2017: Symposium on Theory of Computing, 1195-1199, 2017	335*	2017
Towards understanding ensemble, knowledge distillation and self-distillation in deep learning	Z Allen-Zhu, Y Li	ICLR 2023: International Conference on Learning Representations, 2023	333	2023
Byzantine Stochastic Gradient Descent	D Alistarh, Z Allen-Zhu, J Li	NIPS 2018: Neural Information Processing Systems, 2018	302	2018
A simple, combinatorial algorithm for solving SDD systems in nearly-linear time	JA Kelner, L Orecchia, A Sidford, ZA Zhu	STOC 2013: Symposium on Theory of Computing, 911-920, 2013	289	2013
Natasha 2: Faster Non-Convex Optimization Than SGD	Z Allen-Zhu	NIPS 2018: Neural Information Processing Systems, 2018	254	2018
Improved SVRG for non-strongly-convex or sum-of-non-convex objectives	Z Allen-Zhu, Y Yuan	ICML 2016: International Conference on Machine Learning, 1080-1089, 2016	226	2016
What Can ResNet Learn Efficiently, Going Beyond Kernels?	Z Allen-Zhu, Y Li	NeurIPS 2019: Neural Information Processing Systems, 2019	208	2019
Even faster accelerated coordinate descent using non-uniform sampling	Z Allen-Zhu, Z Qu, P Richtárik, Y Yuan	ICML 2016: International Conference on Machine Learning, 1110-1119, 2016	207	2016
On the convergence rate of training recurrent neural networks	Z Allen-Zhu, Y Li, Z Song	NeurIPS 2019: Neural Information Processing Systems, 2019	190	2019
Asymptotically optimal strategy-proof mechanisms for two-facility games	P Lu, X Sun, Y Wang, ZA Zhu	ACM-EC 2010: Conference on Economics and Computation, 315-324, 2010	189	2010
Neon2: Finding Local Minima via First-Order Oracles	Z Allen-Zhu, Y Li	NIPS 2018: Neural Information Processing Systems, 2018	149	2018
Feature purification: How adversarial training performs robust deep learning	Z Allen-Zhu, Y Li	FOCS 2021: Symposium on Foundations of Computer Science, 977-988, 2022	148	2022
LazySVD: Even faster SVD decomposition yet without agonizing pain	Z Allen-Zhu, Y Li	NIPS 2016: Neural Information Processing Systems, 974-982, 2016	135	2016

Articles 1–20
