Xuechao Wei

Cited by

	All	Since 2019
Citations	849	766
h-index	11	10
i10-index	12	11

Citations per year
Year	Citations
2016	7
2017	13
2018	57
2019	81
2020	140
2021	173
2022	143
2023	165
2024	64

Public access

9 articles

1 article

available

not available

Based on funding mandates

Co-authors

Yun (Eric) Liang	Professor of EECS, Peking University, ACM Distinguished Scientist	Verified email at pku.edu.cn
Jingsheng Jason Cong	Volgenau Chair for Engineering Excellence, Computer Science and Electrical Engineering, University	Verified email at cs.ucla.edu
Cody (Hao) Yu	Software Engineer @ Anyscale | ex-Amazonian | UCLA PhD ‘19	Verified email at anyscale.com
Peng Zhang	Computer Science, University of California, Los Angeles	Verified email at cs.ucla.edu
Yuxin Wang	PhD student of Computer Science, Peking University	Verified email at pku.edu.cn
Guangyu Sun	School of Integrated Circuits, Peking University	Verified email at pku.edu.cn
Yuan Xie	Chair Professor of Hong Kong University of Science and Technology (HKUST)	Verified email at ust.hk
Xiuhong Li	Peking University	Verified email at pku.edu.cn
Wentai Zhang	Peking University	Verified email at pku.edu.cn
Tao Wang	Peking University	Verified email at pku.edu.cn
Songwu Lu	Professor of Computer Science, UCLA	Verified email at cs.ucla.edu
Mengjie Mao	University of Pittsburgh	Verified email at pitt.edu
Zhe Zhou	PhD Candidate of Computer Architecture, Peking University	Verified email at pku.edu.cn

Xuechao Wei

Peking University

Verified email at pku.edu.cn

Computer Architecture


Title	Authors	Venue	Cited by	Year
Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs	X Wei, CH Yu, P Zhang, Y Chen, Y Wang, H Hu, Y Liang, J Cong	Proceedings of the 54th Annual Design Automation Conference 2017, 1-6, 2017	454	2017
Overcoming data transfer bottlenecks in FPGA-based DNN accelerators via layer conscious memory management	X Wei, Y Liang, J Cong	Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019	75	2019
TGPA: tile-grained pipeline architecture for low latency CNN inference	X Wei, Y Liang, X Li, CH Yu, P Zhang, J Cong	2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2018	74	2018
Systems and methods for systolic array design from a high-level program	P Zhang, CH Yu, X Wei, P Pan	US Patent 10,838,910, 2020	57	2020
Frequency improvement of systolic array-based CNNs on FPGAs	J Zhang, W Zhang, G Luo, X Wei, Y Liang, J Cong	2019 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4, 2019	41	2019
Throughput optimization for streaming applications on CPU-FPGA heterogeneous systems	X Wei, Y Liang, T Wang, S Lu, J Cong	2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), 488-493, 2017	31	2017
PetS: A unified framework for Parameter-Efficient transformers serving	Z Zhou, X Wei, J Zhang, G Sun	2022 USENIX Annual Technical Conference (USENIX ATC 22), 489-504, 2022	23	2022
Generating systolic array accelerators with reusable blocks	L Jia, L Lu, X Wei, Y Liang	IEEE Micro 40 (4), 85-92, 2020	20	2020
FlexBFS: a parallelism-aware implementation of breadth-first search on GPU	G Liu, H An, W Han, X Li, T Sun, W Zhou, X Wei, X Tang	Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012	18	2012
Gnnear: Accelerating full-batch training of graph neural networks with near-memory processing	Z Zhou, C Li, X Wei, X Wang, G Sun	Proceedings of the International Conference on Parallel Architectures and …, 2022	14	2022
FTDL: a tailored FPGA-overlay for deep learning with high scalability	R Shi, Y Ding, X Wei, H Li, H Liu, HKH So, C Ding	2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020	11	2020
Gcnear: A hybrid architecture for efficient gcn training with near-memory processing	Z Zhou, C Li, X Wei, G Sun	arXiv preprint arXiv:2111.00680, 1-15, 2021	10	2021
ArchExplorer: Microarchitecture exploration via bottleneck analysis	C Bai, J Huang, X Wei, Y Ma, S Li, H Zheng, B Yu, Y Xie	Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023	5	2023
FTDL: An FPGA-tailored Architecture for Deep Learning Systems.	R Shi, Y Ding, X Wei, H Liu, HKH So, C Ding	FPGA, 320, 2020	5	2020
Efficient super-resolution system with block-wise hybridization and quantized winograd on fpga	B Shi, J Zhang, Z He, X Wei, S Li, G Luo, H Zheng, Y Xie	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023	3	2023
2022 ICCAD CAD contest problem C: Microarchitecture design space exploration	S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie	Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided …, 2022	3	2022
Distributed Control Independence for Composable Multi-processors	M Mao, H An, T Sun, Q Li, B Deng, X Wei, J Zhou	2012 IEEE/ACIS 11th International Conference on Computer and Information …, 2012	3	2012
An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA	Z Ma, T Dai, X Wei, G Luo	ACM Transactions on Embedded Computing Systems 22 (6), 1-22, 2023	1	2023
Iccad cad contest 2022	S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie		1	2022
POSTER: RadiK: Scalable Radix Top-K Selection on GPUs	Y Li, B Zhou, J Zhang, X Wei, Y Li, Y Chen	Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024		2024
