U.C. Berkeley CS267/EngC233 Home Page

Applications of Parallel Computers

Spring 2010

T Th 9:30-11:00, 250 Sutardja Dai Hall

Instructors:

Jim Demmel

Offices:
564 Soda Hall ("Virginia"), (510)643-5386
831 Evans Hall, same phone number

Office Hours: W 1-2 and Th 1:30-2:30, in 564 Soda Hall (note change, dated 7 March 2010)

(send email)

Horst Simon

Lawrence Berkeley National Lab (LBNL), Building 50B, room 4230, (510)486-7377

Office Hours: T 8:30 - 9:30 and Th 11-12, 422 Sutardja Dai Hall

(send email)

Teaching Assistants:

Razvan Carbunescu

Office: Parlab, 5th floor, Soda Hall, cell: (225) 747-0405

Office Hours: W 3:30 - 5pm and Th 1:30 - 3pm, 576 Soda (Euclid, in the ParLab)

(send email)

Andrew Gearhart

Office: Parlab, 5th floor, Soda Hall, cell: (410) 259-1410

Office Hours: W 4 - 5pm and Th 1:30 - 3pm, 576 Soda (Euclid, in the ParLab)

(send email)

Administrative Assistants:

Tammy Johnson

Office: 565 Soda Hall

Phone: (510)643-4816

(send email)

Link to webcasting of lectures (Active during lectures only; see below under "Lecture Notes" for archived video).

(Jan 19) Due to technical difficulties, we will not be webcasting today. We will try to fix it next time.

To ask questions during live lectures, please email them to this address, which the teaching assistants will be monitoring during lecture.

Syllabus and Motivation

CS267 was originally designed to teach students how to program parallel computers to efficiently solve challenging problems in science and engineering, where very fast computers are required either to perform complex simulations or to analyze enormous datasets. CS267 is intended to be useful for students from many departments and with different backgrounds, although we will assume reasonable programming skills in a conventional (non-parallel) language, as well as enough mathematical skills to understand the problems and algorithmic solutions presented. CS267 satisfies part of the course requirements for a new Designated Emphasis ("graduate minor") in Computational Science and Engineering.

While this general outline remains, a large change in the computing world has started in the last few years: not only are the fastest computers parallel, but nearly all computers will soon be parallel, because the physics of semiconductor manufacturing will no longer let conventional sequential processors get faster year after year, as they have for so long (roughly doubling in speed every 18 months for many years). So all programs that need to run faster will have to become parallel programs. (It is considered very unlikely that compilers will be able to automatically find enough parallelism in most sequential programs to solve this problem.) For background on this trend toward parallelism, click here.

This will be a huge change not just for science and engineering but for the entire computing industry, which has depended on selling new computers that run their users' programs faster without the users having to reprogram them. Large research activities to address this issue are underway at many computer companies and universities, including Berkeley's ParLab, whose research agenda is outlined here.

While the ultimate solutions to the parallel programming problem are far from determined, students in CS267 will get the skills to use some of the best existing parallel programming tools, and be exposed to a number of open research questions.

Tentative Detailed Syllabus

Grading

There will be several programming assignments to acquaint students with basic issues in memory locality and parallelism needed for high performance. Most of the grade will be based on a final project (in which students are encouraged to work in small interdisciplinary teams), which could involve parallelizing an interesting application, or developing or evaluating a novel parallel computing tool. Students are expected to have identified a likely project by mid semester, so that they can begin working on it. We will provide many suggestions of possible projects as the class proceeds.

Homeworks should be submitted by emailing them to cs267.spring2010.submissions@gmail.com.

Class Projects

You are welcome to suggest your own class project, but you may also look for ideas on the ParLab webpage, on the Computational Research Division and NERSC webpages at LBL, or in the class posters and brief oral presentations from CS267 in Spring 2009.

Announcements

(May 5) The poster session will be in the main hallway of the 5th floor of Soda Hall.

(Apr 17) The poster session will be on Thursday May 6 instead of Tuesday May 4. Final reports will be due Wednesday, May 12, by noon.

(Mar 9) Material at the end of Lecture 14 was updated to discuss some possible class projects.

(Mar 7) Starting March 10, Prof. Demmel's Wednesday office hours will be 1-2pm instead of 2-3pm.

(Feb 5) The topics of the lectures scheduled for Feb 11 and 16 have been swapped, see the syllabus for details.

(Jan 28) There are two seminars of interest to CS267 students today: At 11am in the Wozniak Lounge, Soda Hall, Laurent Visconti of Microsoft will talk about "New abstractions for parallel linear algebra libraries." At 4pm in 306 Soda Hall, Phil Colella of LBL will talk about "Models, Algorithms, and Software: Tradeoffs in the Design of High-Performance Computational Simulations in Science and Engineering."

(Jan 28) Andrew Gearhart and Razvan Carbunescu will not have office hours today (Thursday, Jan 28).

(Jan 27) Andrew Gearhart has changed his office hours (see above).

(Jan 26) Prof. Demmel has to cancel his office hours on Thursday, Jan 28, 1:30-2:30pm.

(Jan 21) Note the change in Prof. Demmel's office hours.

(Jan 19) Due to technical difficulties, there will be no webcasting today. We will try to fix this next time.

(Jan 17) Homework Assignment 0 has been posted here, due Feb 2 by midnight.

(Jan 17) Homeworks should be submitted by emailing them to cs267.spring2010.submissions@gmail.com.

(Jan 17) Please fill out the following class survey.

(Jan 17) This course satisfies part of the course requirements for a new Designated Emphasis ("graduate minor") in Computational Science and Engineering.

(Jan 17) NERSC will host a workshop on programming their new supercomputer, the Cray XT5, from Feb 1-3. Students interested in attending should send email to Richard Gerber and say that they are CS267 students. This workshop is suitable for more experienced students.

(Jan 17) For students who want to try some on-line self-paced courses to improve basic programming skills, click here. You can use this material without having to register. In particular, courses like CS 9C (for programming in C) might be useful.

(Jan 17) This course will have students attending from two CITRIS campuses: UC Berkeley and UC Davis. CITRIS is generously providing the webcasting facilities and other resources to help run the course. Lectures will be webcast here (active during lectures only).

Class Resources and Homework Assignments

This will include, among other things, class handouts, homework assignments, the class roster, information about class accounts, pointers to documentation for machines and software tools we will use, reports and books on supercomputing, pointers to old CS267 class webpages (including old class projects), and pointers to other useful websites.

Lecture Notes

Notes from previous offerings of CS267 are posted on old class webpages available under Class Resources.

In particular, the web page from the 1996 offering has detailed, textbook-style notes available on-line that are still largely up-to-date in their presentations of parallel algorithms (the slides to be posted during this semester will contain some more recently invented algorithms as well).

Slides (PowerPoint) and archived video for lectures from Spring 2010 will be posted here.

Lecture 1 - Jan 19 - Introduction (in powerpoint), (not webcast)

Lecture 2 - Jan 21 - Single Processor Machines: Memory Hierarchies and Processor Features; Case Study: Tuning Matrix Multiply (in powerpoint), (video archive)

Lecture 3 - Jan 26 - Introduction to Parallel Machines and Programming Models (in powerpoint), (video archive)

Lecture 4 - Jan 28 - Finish Parallel Machines and Programming Models; Shared Memory Programming with Threads and OpenMP (in powerpoint), (video archive)

Lecture 5 - Feb 2 - Distributed memory machines and programming (in powerpoint), (video archive)

Lecture 6 - Feb 4 - Sources of Parallelism and Locality in Simulation - Part 1 (in powerpoint), (video archive)

Lecture 7 - Feb 9 (video archive)

Sources of Parallelism and Locality in Simulation - Part 2 (in powerpoint)

Tricks with Trees (in powerpoint)

Notes on Homework 1 (in powerpoint)

Lecture 8 - Feb 11 - Graph Partitioning (in powerpoint), (video archive)

Lecture 9 - Feb 16 - (video archive)

Complete Graph Partitioning (same slides as last lecture)

Real-time Knowledge Extraction from Massive Time-Series Datastreams, by Josh Bloom (in pdf)

Lecture 10 - Feb 18 - An Introduction to CUDA/OpenCL and Manycore Graphics Processors, by Bryan Catanzaro, (in powerpoint-x), (video archive)

Lecture 11 - Feb 23 - Architecting Parallel Software with Patterns, by Kurt Keutzer, (in powerpoint-x), (video archive)

Lecture 12 - Feb 25 - Parallel Programming in UPC (Unified Parallel C) by Kathy Yelick, (in powerpoint), (video archive)

Lecture 13 - Mar 2 - Dense Linear Algebra, Part 1 (in powerpoint), (video archive)

Lecture 14 - Mar 4 - Dense Linear Algebra, Part 2 (in powerpoint) (updated March 9), (video archive)

Lecture 15 - Mar 9 - Automatic Performance Tuning and Sparse-Matrix-Vector-Multiplication (SpMV) (in powerpoint) (We will also discuss class projects using slides at the end of Lecture 14.) (video archive)

Lecture 16 - Mar 11 - Evolution of Processor Architecture, and the Implications for Performance Optimization (in powerpoint-x), by Sam Williams, (video archive)

Lecture 17 - Mar 16 - Sparse Matrix Methods on High Performance Computers (in powerpoint), by Xiaoye Sherry Li, (video archive)

Lecture 18 - Mar 18 - Structured Grids (in powerpoint), (video archive)

Lecture 19 - Mar 30 - Performance Analysis Tools by Karl Fuerlinger, (in powerpoint), (video archive)

Lecture 20 - Apr 1 - Fast Fourier Transform (in powerpoint), (video archive)

Sharks and Fish

"Sharks and Fish" is a collection of simplified simulation programs that illustrate a number of common parallel programming techniques in various programming languages (some current, and some old ones no longer in use).

A basic problem description and (partial) code from the 1999 class, written in Matlab, CMMD, CMF, Split-C, Sun Threads, and pSather, are available here.

Partial code from the 2004 class, written in MPI, pthreads, and OpenMP, is available here.