Benchmarking and analyzing deep neural network training H Zhu, M Akrout, B Zheng, A Pelegris, A Jayarajan, A Phanishayee, ... 2018 IEEE International Symposium on Workload Characterization (IISWC), 88-100, 2018 | 152 | 2018 |
TBD: Benchmarking and analyzing deep neural network training H Zhu, M Akrout, B Zheng, A Pelegris, A Phanishayee, B Schroeder, ... arXiv preprint arXiv:1803.06905, 2018 | 88 | 2018 |
Automatic horizontal fusion for GPU kernels A Li, B Zheng, G Pekhimenko, F Long 2022 IEEE/ACM International Symposium on Code Generation and Optimization …, 2022 | 48 | 2022 |
Echo: Compiler-based GPU memory footprint reduction for LSTM RNN training B Zheng, N Vijaykumar, G Pekhimenko 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 39 | 2020 |
DietCode: Automatic optimization for dynamic tensor programs B Zheng, Z Jiang, CH Yu, H Shen, J Fromm, Y Liu, Y Wang, L Ceze, ... Proceedings of Machine Learning and Systems 4, 848-863, 2022 | 33 | 2022 |
Hidet: Task-mapping programming paradigm for deep learning tensor programs Y Ding, CH Yu, B Zheng, Y Liu, Y Wang, G Pekhimenko Proceedings of the 28th ACM International Conference on Architectural …, 2023 | 22 | 2023 |
IDEAL: Image denoising accelerator M Mahmoud, B Zheng, AD Lascorz, F Heide, J Assouline, P Boucher, ... Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 19 | 2017 |
EcoRNN: Efficient computing of LSTM RNN training on GPUs B Zheng, A Tiwari, N Vijaykumar, G Pekhimenko arXiv preprint arXiv:1805.08899, 2018 | 8 | 2018 |
Tempo: Accelerating transformer-based model training through memory footprint reduction M Andoorveedu, Z Zhu, B Zheng, G Pekhimenko Advances in Neural Information Processing Systems 35, 12267-12282, 2022 | 6 | 2022 |
DNN-Train: benchmarking and analyzing DNN training H Zhu, B Zheng, B Schroeder, G Pekhimenko, A Phanishayee Training 8, 16GBs, 2018 | 6 | 2018 |
Echo: Compiler-based GPU memory footprint reduction for LSTM RNN training. In 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) B Zheng, N Vijaykumar, G Pekhimenko IEEE, 2020 | 5 | 2020 |
EcoRNN: Fused LSTM RNN Implementation with Data Layout Optimization B Zheng, A Nair, Q Wu, N Vijaykumar, G Pekhimenko arXiv preprint arXiv:1805.08899, 2018 | 2 | 2018 |
Grape: Practical and Efficient Graphed Execution for Dynamic Deep Neural Networks on GPUs B Zheng, CH Yu, J Wang, Y Ding, Y Liu, Y Wang, G Pekhimenko Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023 | 1 | 2023 |
EcoRNN: Efficient Computing of LSTM RNN on GPUs B Zheng, G Pekhimenko Memory 9, 1735-1780, 1997 | 1 | 1997 |
Automatic Compiler-based Optimizations for Deep Neural Networks B Zheng | | 2024 |
Domain-Specific Compilation MG Olabi, JG Luna, O Mutlu, W Hwu, I El Hajj, A Li, B Zheng, ... | | |
MICRO 50 Author index A Jaleel, AJ Elmore, A Bhattacharjee, A Holmes, AJ McPadden, ... | | |
TBD SUITE: BENCHMARKING AND PROFILING TOOLS FOR DNNS XY Geoffrey, H Zhu, A Jayarajan, B Zheng, A Tiwari, G Pekhimenko | | |