Yanping Huang
Title
Cited by
Year
Regularized evolution for image classifier architecture search
E Real, A Aggarwal, Y Huang, QV Le
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 4780-4789, 2019
Cited by: 3167; Year: 2019
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
arXiv preprint arXiv:2210.11416, 2022
Cited by: 1450; Year: 2022
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Y Huang, Y Cheng, A Bapna, O Firat, MX Chen, D Chen, HJ Lee, J Ngiam, ...
Advances in Neural Information Processing Systems 32, 103-112, 2019
Cited by: 1409; Year: 2019
LaMDA: Language models for dialog applications
R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ...
arXiv preprint arXiv:2201.08239, 2022
Cited by: 1124; Year: 2022
PaLM 2 technical report
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
Cited by: 699; Year: 2023
Predictive coding
Y Huang, RPN Rao
Wiley Interdisciplinary Reviews: Cognitive Science 2 (5), 580-593, 2011
Cited by: 672; Year: 2011
GShard: Scaling giant models with conditional computation and automatic sharding
D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat, Y Huang, M Krikun, N Shazeer, ...
International Conference on Learning Representations (ICLR), 2020
Cited by: 671; Year: 2020
GLaM: Efficient scaling of language models with mixture-of-experts
N Du, Y Huang, AM Dai, S Tong, D Lepikhin, Y Xu, M Krikun, Y Zhou, ...
International Conference on Machine Learning, 5547-5569, 2022
Cited by: 432*; Year: 2022
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 197; Year: 2019
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
L Zheng, Z Li, H Zhang, Y Zhuang, Z Chen, Y Huang, Y Wang, Y Xu, ...
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by: 172; Year: 2022
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, E Li, X Wang, ...
arXiv preprint arXiv:2210.11416, 2022
Cited by: 155*; Year: 2022
Just pick a sign: Optimizing deep multitask models with gradient sign dropout
Z Chen, J Ngiam, Y Huang, T Luong, H Kretzschmar, Y Chai, D Anguelov
Advances in Neural Information Processing Systems 33, 2039-2050, 2020
Cited by: 150; Year: 2020
BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition
Y Zhang, DS Park, W Han, J Qin, A Gulati, J Shor, A Jansen, Y Xu, ...
IEEE Journal of Selected Topics in Signal Processing 16 (6), 1519-1532, 2022
Cited by: 135; Year: 2022
Mixture-of-experts with expert choice routing
Y Zhou, T Lei, H Liu, N Du, Y Huang, V Zhao, AM Dai, QV Le, J Laudon
Advances in Neural Information Processing Systems 35, 7103-7114, 2022
Cited by: 110; Year: 2022
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
Cited by: 85; Year: 2021
Beyond distillation: Task-level mixture-of-experts for efficient inference
S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin, MT Luong, O Firat
arXiv preprint arXiv:2110.03742, 2021
Cited by: 71; Year: 2021
Designing effective sparse expert models
B Zoph, I Bello, S Kumar, N Du, Y Huang, J Dean, N Shazeer, W Fedus
arXiv preprint arXiv:2202.08906 2 (3), 17, 2022
Cited by: 68; Year: 2022
Building machine translation systems for the next thousand languages
A Bapna, I Caswell, J Kreutzer, O Firat, D van Esch, A Siddhant, M Niu, ...
arXiv preprint arXiv:2205.03983, 2022
Cited by: 51; Year: 2022
Neurons as Monte Carlo samplers: Bayesian Inference and Learning in Spiking Networks
Y Huang, RPN Rao
Advances in Neural Information Processing Systems 27, 1943-1951, 2014
Cited by: 50; Year: 2014
ST-MoE: Designing stable and transferable sparse expert models
B Zoph, I Bello, S Kumar, N Du, Y Huang, J Dean, N Shazeer, W Fedus
arXiv preprint arXiv:2202.08906, 2022
Cited by: 48; Year: 2022