Richard Vuduc
Title
Cited by
Cited by
Year
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
S Williams, L Oliker, R Vuduc, J Shalf, K Yelick, J Demmel
SC'07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 1-12, 2007
8732007
OSKI: A library of automatically tuned sparse matrix kernels
R Vuduc, JW Demmel, KA Yelick
Journal of Physics: Conference Series 16 (1), 521, 2005
6012005
Model-driven autotuning of sparse matrix-vector multiply on GPUs
JW Choi, A Singh, RW Vuduc
ACM sigplan notices 45 (5), 115-126, 2010
4452010
Sparsity: Optimization framework for sparse matrix kernels
EJ Im, K Yelick, R Vuduc
The International Journal of High Performance Computing Applications 18 (1 …, 2004
3612004
Automatic performance tuning of sparse matrix kernels
RW Vuduc, JW Demmel
University of California, Berkeley, 2003
2922003
Self-adapting linear algebra algorithms and software
J Demmel, J Dongarra, V Eijkhout, E Fuentes, A Petitet, R Vuduc, ...
Proceedings of the IEEE 93 (2), 293-312, 2005
2482005
A performance analysis framework for identifying potential benefits in GPGPU applications
J Sim, A Dasgupta, H Kim, R Vuduc
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012
2132012
A massively parallel adaptive fast-multipole method on heterogeneous architectures
I Lashuk, A Chandramowlishwaran, H Langston, TA Nguyen, R Sampath, ...
Proceedings of the Conference on High Performance Computing Networking …, 2009
2012009
Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures
A Rahimian, I Lashuk, S Veerapaneni, A Chandramowlishwaran, ...
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
1792010
Fast sparse matrix-vector multiplication by exploiting variable block structure
R Vuduc, HJ Moon
High Performance Computing and Communications, 807-816, 2005
1662005
Performance optimizations and bounds for sparse matrix-vector multiply
R Vuduc, JW Demmel, KA Yelick, S Kamil, R Nishtala, B Lee
SC'02: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, 26-26, 2002
1612002
On the limits of GPU acceleration
R Vuduc, A Chandramowlishwaran, J Choi, M Guney, A Shringarpure
Proceedings of the 2nd USENIX conference on Hot topics in parallelism 13, 2010
1592010
Many-thread aware prefetching mechanisms for GPGPU applications
J Lee, NB Lakshminarayana, H Kim, R Vuduc
2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 213-224, 2010
1522010
A roofline model of energy
JW Choi, D Bedard, R Fowler, R Vuduc
2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013
1492013
Falcon: fault localization in concurrent programs
S Park, RW Vuduc, MJ Harrold
Proceedings of the 32nd ACM/IEEE International Conference on Software …, 2010
1482010
POET: Parameterized optimizations for empirical tuning
Q Yi, K Seymour, H You, R Vuduc, D Quinlan
2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007
1292007
Statistical models for empirical search-based performance tuning
R Vuduc, JW Demmel, JA Bilmes
International Journal of High Performance Computing Applications 18 (1), 65-94, 2004
1292004
When prefetching works, when it doesn’t, and why
J Lee, H Kim, R Vuduc
ACM Transactions on Architecture and Code Optimization (TACO) 9 (1), 1-29, 2012
1272012
When cache blocking of sparse matrix vector multiply works and why
R Nishtala, RW Vuduc, JW Demmel, KA Yelick
Applicable Algebra in Engineering, Communication and Computing 18 (3), 297-311, 2007
1212007
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
S Venkatasubramanian, RW Vuduc, none none
Proceedings of the 23rd international conference on Supercomputing, 244-255, 2009
1112009
The system can't perform the operation now. Try again later.
Articles 1–20