SCNN: An accelerator for compressed-sparse convolutional neural networks A Parashar, M Rhu, A Mukkara, A Puglielli, R Venkatesan, B Khailany, ... ACM SIGARCH computer architecture news 45 (2), 27-40, 2017 | 1471 | 2017 |
GPUs and the future of parallel computing SW Keckler, WJ Dally, B Khailany, M Garland, D Glasco IEEE micro 31 (5), 7-17, 2011 | 848 | 2011 |
Imagine: Media processing with streams B Khailany, WJ Dally, UJ Kapasi, P Mattson, J Namkoong, JD Owens, ... IEEE micro 21 (2), 35-46, 2001 | 506 | 2001 |
Timeloop: A systematic approach to DNN accelerator evaluation A Parashar, P Raina, YS Shao, YH Chen, VA Ying, A Mukkara, ... 2019 IEEE International Symposium on Performance Analysis of Systems and …, 2019 | 460 | 2019 |
Programmable stream processors UJ Kapasi, S Rixner, WJ Dally, B Khailany, JH Ahn, P Mattson, JD Owens Computer 36 (8), 54-62, 2003 | 449 | 2003 |
Simba: Scaling deep-learning inference with multi-chip-module-based architecture YS Shao, J Clemons, R Venkatesan, B Zimmer, M Fojtik, N Jiang, B Keller, ... Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 446 | 2019 |
Register organization for media processing S Rixner, WJ Dally, B Khailany, P Mattson, UJ Kapasi, JD Owens Proceedings Sixth International Symposium on High-Performance Computer …, 2000 | 414 | 2000 |
The Imagine stream processor UJ Kapasi, WJ Dally, S Rixner, JD Owens, B Khailany Proceedings. IEEE International Conference on Computer Design: VLSI in …, 2002 | 367 | 2002 |
A bandwidth-efficient architecture for media processing S Rixner, WJ Dally, UJ Kapasi, B Khailany, A Lopez-Lagunas, PR Mattson, ... Proceedings. 31st Annual ACM/IEEE International Symposium on …, 1998 | 353 | 1998 |
DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement Y Lin, S Dhar, W Li, H Ren, B Khailany, DZ Pan Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019 | 263 | 2019 |
CudaDMA: optimizing GPU memory bandwidth via warp specialization M Bauer, H Cook, B Khailany Proceedings of 2011 international conference for high performance computing …, 2011 | 218 | 2011 |
Evaluating the Imagine stream architecture JH Ahn, WJ Dally, B Khailany, UJ Kapasi, A Das ACM SIGARCH Computer Architecture News 32 (2), 14, 2004 | 196 | 2004 |
Unifying primary cache, scratch, and register file memories in a throughput processor M Gebhart, SW Keckler, B Khailany, R Krashinsky, WJ Dally 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 96-106, 2012 | 163 | 2012 |
A programmable 512 GOPS stream processor for signal, image, and video processing BK Khailany, T Williams, J Lin, EP Long, M Rygh, DFW Tovey, WJ Dally IEEE Journal of solid-state circuits 43 (1), 202-213, 2008 | 146 | 2008 |
Efficient conditional operations for data-parallel architectures UJ Kapasi, WJ Dally, S Rixner, PR Mattson, JD Owens, B Khailany Proceedings of the 33rd annual ACM/IEEE International Symposium on …, 2000 | 142 | 2000 |
MAGNet: A modular accelerator generator for neural networks R Venkatesan, YS Shao, M Wang, J Clemons, S Dai, M Fojtik, B Keller, ... 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2019 | 133 | 2019 |
High performance graph convolutional networks with applications in testability analysis Y Ma, H Ren, B Khailany, H Sikka, L Luo, K Natarajan, B Yu Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019 | 133 | 2019 |
ChipNeMo: Domain-adapted LLMs for chip design M Liu, TD Ene, R Kirby, C Cheng, N Pinckney, R Liang, J Alben, H Anand, ... arXiv preprint arXiv:2311.00176, 2023 | 112 | 2023 |
Stream processors: Programmability and efficiency: Will this new kid on the block muscle out ASIC and DSP? WJ Dally, UJ Kapasi, B Khailany, JH Ahn, A Das Queue 2 (1), 52-62, 2004 | 107 | 2004 |
GRANNITE: Graph neural network inference for transferable power estimation Y Zhang, H Ren, B Khailany 2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020 | 106 | 2020 |