Jianbin Fang
A comprehensive performance comparison of CUDA and OpenCL
J Fang, AL Varbanescu, H Sips
2011 International Conference on Parallel Processing, 216-225, 2011
Test-Driving Intel Xeon Phi
J Fang, H Sips, L Zhang, C Xu, C Yonggang, AL Varbanescu
The 5th ACM/SPEC International Conference on Performance Engineering, 2014
Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer
C Xu, X Deng, L Zhang, J Fang, G Wang, Y Jiang, W Cao, Y Che, Y Wang, ...
Journal of Computational Physics 278, 275-297, 2014
Performance gaps between OpenMP and OpenCL for multi-core CPUs
J Shen, J Fang, H Sips, AL Varbanescu
2012 41st International Conference on Parallel Processing Workshops, 116-125, 2012
An empirical study of Intel Xeon Phi
J Fang, AL Varbanescu, H Sips, L Zhang, Y Che, C Xu
arXiv preprint arXiv:1310.5842, 2013
An application-centric evaluation of OpenCL on multi-core CPUs
J Shen, J Fang, H Sips, AL Varbanescu
Parallel Computing 39 (12), 834-850, 2013
Performance traps in OpenCL for CPUs
J Shen, J Fang, H Sips, AL Varbanescu
2013 21st Euromicro International Conference on Parallel, Distributed, and …, 2013
Parallel programming models for heterogeneous many-cores: a comprehensive survey
J Fang, C Huang, T Tang, Z Wang
CCF Transactions on High Performance Computing 2, 382-400, 2020
Moving from exascale to zettascale computing: challenges and techniques
X Liao, K Lu, C Yang, J Li, Y Yuan, M Lai, L Huang, P Lu, J Fang, J Ren, ...
Frontiers of Information Technology & Electronic Engineering 19, 1236-1244, 2018
Adaptive Optimization of Sparse Matrix-Vector Multiplication on Emerging Many-Core Architectures
S Chen, J Fang, D Chen, C Xu, Z Wang
The 20th IEEE International Conference on High Performance Computing and …, 2018
Deep learning research and development platform: Characterizing and scheduling with QoS guarantees on GPU clusters
Z Chen, W Quan, M Wen, J Fang, J Yu, C Zhang, L Luo
IEEE Transactions on Parallel and Distributed Systems 31 (1), 34-50, 2019
Auto-tuning Streamed Applications on Intel Xeon Phi
P Zhang, J Fang, T Tang, C Yang, Z Wang
The 31st IEEE International Parallel & Distributed Processing Symposium, 2018
Deep program structure modeling through multi-relational graph-based learning
G Ye, Z Tang, H Wang, D Fang, J Fang, S Huang, Z Wang
Proceedings of the ACM International conference on parallel architectures …, 2020
FlowGAN: A conditional generative adversarial network for flow prediction in various conditions
D Chen, X Gao, C Xu, S Chen, J Fang, Z Wang, Z Wang
2020 IEEE 32nd international conference on tools with artificial …, 2020
Proteus: Network-aware web browsing on heterogeneous mobile systems
J Ren, X Wang, J Fang, Y Feng, D Zhu, Z Luo, J Zheng, Z Wang
Proceedings of the 14th International Conference on emerging Networking …, 2018
Benchmarking Intel Xeon Phi to guide kernel design
J Fang, AL Varbanescu, H Sips, L Zhang, Y Che, C Xu
Delft University of Technology Parallel and Distributed Systems Report …, 2013
Optimizing Sparse Matrix-Vector Multiplications on An ARMv8-based Many-Core Architecture
D Chen, J Fang, S Chen, C Xu, Z Wang
International Journal of Parallel Programming, 2018
To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference
Q Qing, J Ren, J Yu, L Gao, H Wang, J Zheng, Y Feng, J Fang, Z Wang
The 16th IEEE International Symposium on Parallel and Distributed Processing …, 2018
FlowDNN: a physics-informed deep neural network for fast and accurate flow prediction
D Chen, X Gao, C Xu, S Wang, S Chen, J Fang, Z Wang
Frontiers of Information Technology & Electronic Engineering 23 (2), 207-219, 2022
LIBSHALOM: Optimizing small and irregular-shaped matrix multiplications on ARMv8 multi-cores
W Yang, J Fang, D Dong, X Su, Z Wang
Proceedings of the International Conference for High Performance Computing …, 2021