Paper: Scaling Communication Intensive Applications on BlueGene/P Using One-Sided Communication and Overlap (International Parallel and Distributed Processing Symposium (IPDPS) 2009, Rome, Italy, May 2009) Rajesh Nishtala, Paul Hargrove, Dan Bonachea, and Katherine Yelick
Paper: PDF (289KB)
Paper: Optimizing Collective Communication on Multicores (HotPar 2009, Berkeley, CA, USA, March 2009) Rajesh Nishtala and Katherine Yelick
Paper: PDF (400kB)
Paper: Performance without Pain = Productivity, Data layouts and Collectives in UPC (Principles and Practice of Parallel Programming (PPoPP) 2008, Salt Lake City, USA, February 2008) Rajesh Nishtala, George Almasi, Calin Cascaval
Paper (External Website): PDF (320k)
Talk Slides: PDF (5.8MB)
Poster: Optimized Collectives for PGAS Languages with One-Sided Communication (Supercomputing, Tampa Bay, USA, November 2006)
Dan O. Bonachea, Rajesh Nishtala, Paul Hargrove, Mike Welcome, Katherine Yelick PDF (386kB)
Talk: Efficient Point-to-point Synchronization in UPC (Partitioned Global Address Space Programming Models, Washington DC, USA, October 2006)
Dan Bonachea, Rajesh Nishtala, Paul Hargrove, Katherine Yelick Abstract (PDF 37kB)
Talk Slides: PPT (2.9MB) PDF (945kB)
Masters Report: Architectural Probes for Measuring Communication Overlap Potential (submitted May 19th, 2006 for Master of Science Degree) Rajesh Nishtala PDF (0.5MB)
Paper: Optimizing Bandwidth Limited Problems Using One-Sided Communication and Overlap (International Parallel and Distributed Processing Symposium (IPDPS) 2006, Rhodes, Greece, April 2006)
Christian Bell, Dan Bonachea, Rajesh Nishtala, Katherine Yelick PDF (300kB) Talk Slides (1MB)
Poster: The Performance and Productivity Benefits of Global Address Space Languages (Supercomputing, Seattle, USA, November 2005)
Dan O. Bonachea, Christian Bell, Rajesh Nishtala, Kaushik Datta, Parry Husbands, Paul Hargrove, Katherine Yelick PDF (2.9MB)
Poster: Automatic Tuning of Collective Communications in MPI (SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, USA, February 2004) Rajesh Nishtala, Kushal Chakrabarti, Neil Patel, Kaushal Sanghavi, James Demmel, Katherine Yelick, and Eric Brewer PowerPoint (6MB)
Journal Paper: When Cache Blocking Sparse Matrix Vector Multiply Works and Why (Applicable Algebra in Engineering, Communication and Computing, March 2007) Rajesh Nishtala, Richard W. Vuduc, James W. Demmel, Katherine Yelick Journal Website
Tech Report: Performance Modeling and Analysis of Cache Blocking in Sparse Matrix Vector Multiply
(UCB/CSD-04-1335, June, 2004.)
Rajesh Nishtala, Richard W. Vuduc, James W. Demmel, Katherine A. Yelick
PDF (~8MB)
Talk: When Cache Blocking Sparse Matrix Multiply Works and Why
(PARA'04 Workshop on State-of-the-art in Scientific Computing, Copenhagen, Denmark, June 2004) Rajesh Nishtala, Richard Vuduc, James Demmel, Katherine Yelick.
1 Page Abstract: PDF (61K) 7 Page Abstract: PDF (113K) Talk Slides: PPT (3MB)
Paper: Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
(Proceedings of the IEEE/ACM Conference on Supercomputing, 2002, Baltimore, MD, USA, November 2002.)
Richard Vuduc, James W. Demmel, Katherine A. Yelick, Shoaib Kamil, Rajesh Nishtala, Benjamin Lee. PDF (630k)