Shoaib Kamil

(skamil AT cs dot berkeley dot edu)

[Curriculum Vitae] [Research Statement] [Teaching Statement]


Research Interests

Scientific computing, programming systems for productive parallel programming, software engineering, auto-tuning, embedded DSLs, power-efficient parallel computing, software as a service (SaaS)

I am co-advised by Prof. Armando Fox and Prof. Kathy Yelick, and I work with the BeBOP Group in the Parallel Computing Laboratory. I am also affiliated with the Future Technologies Group at LBNL.

Projects

Asp (Asp is SEJITS for Python) - an implementation of Selective Embedded Just-in-Time Specialization for Python, which bridges the gap between productivity and performance using domain-specific embedded compilers. Asp's goal is to simplify the creation of DSLs in Python and to enable expert programmers in a domain (who are not language experts) to write DSLs or auto-tuned libraries appropriate for their domain. Current results show that non-expert programmers can use these DSLs and auto-tuned libraries to match or beat state-of-the-art hand-tuned low-level code while still writing in a high-level productive language.

Stanza Triad - a modified version of STREAM Triad that tests the effectiveness of prefetch engines. Download v. 0.4

Stencil Probe - a small, easily modifiable probe for simulating the behavior of stencil applications, used as a testbed for evaluating optimizations for stencil codes.

Teaching

CS169: Software Engineering, Fall 2010 (Instructor: Armando Fox)

CS267: Applications of Parallel Computers, Fall 2008 (Instructor: Horst Simon)

CS164: Compilers and Programming Languages, Fall 2002 (Instructor: Richard Fateman)

CS170: Efficient Algorithms and Intractable Problems, Spring 2001 (Instructors: James Demmel and Jonathan Shewchuk)

Journal Papers

[1] Hardware/Software Co-design of Global Cloud System Resolving Models
M. F. Wehner, L. Oliker, J. Shalf, D. Donofrio, L. A. Drummond, R. Heikes, S. Kamil, C. Kono, N. Miller, H. Miura, M. Mohiyuddin, D. Randall, W.-S. Yang
Journal of Advances in Modeling Earth Systems, 2011.
PDF

[2] Communication Requirements and Interconnect Optimization for High-End Scientific Applications
Shoaib Kamil, Leonid Oliker, Ali Pinar, John Shalf
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2009
PDF

[3] Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors
Kaushik Datta, Shoaib Kamil, Sam Williams, Leonid Oliker, John Shalf, Katherine Yelick
SIAM Review, 2009
PDF

[4] Scientific Computing Kernels on the Cell Processor
Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine Yelick
International Journal of Parallel Programming (IJPP), 2007

Peer-Reviewed Papers

[26] Auto-tuning the Matrix Powers Kernel with SEJITS
J. Morlan, S. Kamil, A. Fox
Seventh International Workshop on Automatic Performance Tuning (iWAPT) 2012, to appear.

[5] Portable Parallel Performance from Sequential, Productive, Embedded Domain Specific Languages
S. Kamil, D. Coetzee, S. Beamer, H. Cook, E. Gonina, J. Harper, J. Morlan, A. Fox
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2012 (extended abstract).

[6] Bringing Parallel Performance to Python with Domain-Specific Selective Embedded Just-in-Time Specialization
Shoaib Kamil, Derrick Coetzee, Armando Fox
10th Python for Scientific Computing Conference, 2011.
PDF

[7] CUDA-level Performance with Python-level Productivity for Gaussian Mixture Model Applications
H. Cook, E. Gonina, S. Kamil, G. Friedland, D. Patterson, A. Fox
USENIX Workshop on Hot Topics in Parallelism (HotPar), 2011
PDF

[8] Silicon Nanophotonic Network-On-Chip Using TDM Arbitration
G. Hendry, J. Chan, S. Kamil, L. Oliker, J. Shalf, L. P. Carloni, K. Bergman
IEEE Symposium on High Performance Interconnects (HOTI), 2010.
PDF

[9] An Auto-tuning Framework for Parallel Multicore Stencil Computations
Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, Samuel Williams
IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010), April 2010

[10] SEJITS: Getting Productivity and Performance with Selective Embedded JIT Specialization
Bryan Catanzaro, Shoaib Kamil, Yunsup Lee, Krste Asanovic, James Demmel, Kurt Keutzer, John Shalf, Kathy Yelick, Armando Fox
First Workshop on Programming Models for Emerging Architectures (PMEA), September 2009

[11] A Generalized Framework for Auto-tuning Stencil Computations
Shoaib Kamil, Cy Chan, Sam Williams, Leonid Oliker, John Shalf, Mark Howison, E. Wes Bethel, Prabhat
Cray User Group Conference, 2009
Best Paper Award

[12] Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications
Gilbert Hendry, Shoaib Kamil, A. Biberman, J. Chan, B. Lee, M. Mohiyuddin, A. Jain, K. Bergman, L. Carloni, J. Kubiatowicz, L. Oliker, J. Shalf
International Symposium on Networks-on-Chip (NOCS), 2009

[13] Power Efficiency in High Performance Computing
Shoaib Kamil, John Shalf, Erich Strohmaier
International Parallel & Distributed Processing Symposium, 2008
PS/PDF

[14] Performance and Energy Comparison of Electrical and Hybrid Photonic Networks for CMPs
Ankit Jain, Shoaib Kamil, Marghoob Mohiyuddin, John Shalf, John Kubiatowicz
High Performance Embedded Computing Conference, 2008

[15] Reconfigurable Hybrid Interconnection for Static and Dynamic Scientific Applications
Shoaib Kamil, Ali Pinar, Daniel Gunter, Michael Lijewski, Leonid Oliker, John Shalf
ACM International Conference on Computing Frontiers, 2007
PDF

[16] Scientific Application Performance on Candidate PetaScale Platforms
Leonid Oliker, Andrew Canning, Jonathan Carter, Costin Iancu, Michael Lijewski, Shoaib Kamil, John Shalf, H. Shan, Erich Strohmaier, Stephane Ethier, Tim Goodale
International Parallel & Distributed Processing Symposium (IPDPS) 2007
Best Paper Award
PDF

[17] Implicit and Explicit Optimizations for Stencil Computations
Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine Yelick
Memory Systems Performance and Correctness (MSPC) 2006
PDF

[18] The Potential of the Cell Processor for Scientific Computing
Sam Williams, John Shalf, Parry Husbands, Shoaib Kamil, Leonid Oliker, Katherine Yelick
Computing Frontiers, 2006
PDF

[19] Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect
John Shalf, Shoaib Kamil, Leonid Oliker, David Skinner
Proceedings of the IEEE Conference on Supercomputing, 2005
PDF

[20] Understanding Ultra-Scale Application Communication Requirements
Shoaib Kamil, Leonid Oliker, John Shalf, David Skinner
IEEE International Symposium on Workload Characterization (IISWC) 2005
PDF

[21] Impact of Modern Memory Subsystems on Cache Optimizations for Stencil Computations
Shoaib Kamil, Parry Husbands, Leonid Oliker, John Shalf, Katherine Yelick
3rd Annual ACM SIGPLAN Workshop on Memory Systems Performance (MSP) 2005
PDF

[22] Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
Richard Vuduc, James W. Demmel, Katherine A. Yelick, Shoaib Kamil, Rajesh Nishtala, Benjamin Lee
Proceedings of the IEEE/ACM Conference on Supercomputing, 2002
PDF

[23] Automatic Performance Tuning and Analysis of Sparse Triangular Solve
Richard Vuduc, Shoaib Kamil, Jen Hsu, Rajesh Nishtala, James W. Demmel, Katherine A. Yelick
ICS 2002: Workshop on Performance Optimization via High-Level Languages and Libraries
PDF

Other Papers

[27] Parallel High Performance Statistical Bootstrapping in Python
Aakash Prasad, David Howard, Shoaib Kamil, Armando Fox
11th Annual Scientific Computing with Python Conference, July 2012 (to appear).

[25] Ubiquitous Dynamic Code Generation and Compilation on Future Computing Devices
Shoaib Kamil and Armando Fox
Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2012), Provocative Ideas Session, to appear.

[24] Energy-Efficient Computing for Extreme Scale Science
David Donofrio, Leonid Oliker, John Shalf, Michael Wehner, Chris Rowen, Jens Krueger, Shoaib Kamil, Marghoob Mohiyuddin
IEEE Computer Magazine, November 2009 (non-peer-reviewed)
Link