Benchmarking and analyzing deep neural network training H Zhu, M Akrout, B Zheng, A Pelegris, A Jayarajan, A Phanishayee, ... 2018 IEEE International Symposium on Workload Characterization (IISWC), 88-100, 2018 | 138 | 2018 |
TBD: Benchmarking and analyzing deep neural network training H Zhu, M Akrout, B Zheng, A Pelegris, A Phanishayee, B Schroeder, ... arXiv preprint arXiv:1803.06905, 2018 | 78 | 2018 |
Echo: Compiler-based GPU memory footprint reduction for LSTM RNN training B Zheng, N Vijaykumar, G Pekhimenko 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020 | 35 | 2020 |
Automatic horizontal fusion for GPU kernels A Li, B Zheng, G Pekhimenko, F Long 2022 IEEE/ACM International Symposium on Code Generation and Optimization …, 2022 | 22 | 2022 |
DietCode: Automatic optimization for dynamic tensor programs B Zheng, Z Jiang, CH Yu, H Shen, J Fromm, Y Liu, Y Wang, L Ceze, ... Proceedings of Machine Learning and Systems 4, 848-863, 2022 | 19 | 2022 |
IDEAL: Image denoising accelerator M Mahmoud, B Zheng, AD Lascorz, F Heide, J Assouline, P Boucher, ... Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 18 | 2017 |
EcoRNN: Efficient computing of LSTM RNN training on GPUs B Zheng, A Tiwari, N Vijaykumar, G Pekhimenko arXiv preprint arXiv:1805.08899, 2018 | 8 | 2018 |
DNN-Train: benchmarking and analyzing DNN training H Zhu, B Zheng, B Schroeder, G Pekhimenko, A Phanishayee Training 8, 16GBs, 2018 | 6 | 2018 |
Hidet: Task-mapping programming paradigm for deep learning tensor programs Y Ding, CH Yu, B Zheng, Y Liu, Y Wang, G Pekhimenko Proceedings of the 28th ACM International Conference on Architectural …, 2023 | 4 | 2023 |
EcoRNN: Fused LSTM RNN implementation with data layout optimization B Zheng, A Nair, Q Wu, N Vijaykumar, G Pekhimenko arXiv preprint arXiv:1805.08899, 2018 | 2 | 2018 |
Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction M Andoorveedu, Z Zhu, B Zheng, G Pekhimenko Advances in Neural Information Processing Systems 35, 12267-12282, 2022 | 1 | 2022 |
EcoRNN: Efficient computing of LSTM RNN on GPUs B Zheng, G Pekhimenko Memory 9, 1735-1780, 1997 | 1 | 1997 |
Grape: Practical and efficient graph-based executions for dynamic deep neural networks on GPUs B Zheng, C Yu, J Wang, Y Ding, Y Liu, Y Wang, G Pekhimenko | | 2023 |
Domain-Specific Compilation MG Olabi, JG Luna, O Mutlu, W Hwu, I El Hajj, A Li, B Zheng, ... | | |
MICRO 50 Author index A Jaleel, AJ Elmore, A Bhattacharjee, A Holmes, AJ McPadden, ... | | |
TBD Suite: Benchmarking and profiling tools for DNNs XY Geoffrey, H Zhu, A Jayarajan, B Zheng, A Tiwari, G Pekhimenko | | |