Reference Material for CS 267

Two sets of paper copies of documentation and papers related to parallel computing will be available. One set will be in the Soda Hall Reading Room (681 Soda Hall), for which cardkey access is required, and a second set will be on reserve in Bechtel Engineering Library.

The material reproduced here consists of CM-5 documentation (all of which is available on line as well), a number of language manuals (also available on line), journal articles, and tech reports.

There is also a printed copy of the CS267 lecture notes from 1995, in the form in which they were used for an NSF-CBMS Short Course in summer 1995. As of the beginning of the 1996 semester, these notes are largely identical to the 1996 lecture notes. The 1996 on-line notes will be updated throughout the semester.

The online documents have a mirror site at ICSI, and can also be accessed on a CM5, e.g., in the directory, or using the online documentation program for the CM5, called cmview.

Volume 1: CM-5 Overview

  • CM-5 Supercomputing and Parallelism -- slides from an overview talk
  • CM-5 technical summary documents
  • Getting started in CM Fortran
  • CMMD reference manual, version 3.0

Volume 2: CM Fortran. (See also CMFortran Version 2.1 Detailed Release Notes.)

  • CM Fortran Reference Manual

Volume 3: CMSSL. (See also CMSSL CM5 Scientific Software Library (CMSSL).)

  • CMSSL (CM Scientific Subroutine Library) Manuals, v 1,2

Volume 4:

  • Matlab Primer
  • Split-C home page, Culler, Yelick, et al
  • The PVM (Parallel Virtual Machine) Concurrent Computing System, Sunderam, Geist, Dongarra, Manchek (overview of the most popular, and lowest-level, parallel programming software available)
  • PVM Quick Reference Guide
  • PVM Manual
  • The pSather home page

Volume 5:

  • MPI (Message Passing Interface) Standard
  • HPF (High Performance Fortran) Language Specification

Volume 6:

  • CS 267 Class notes from 1991, Demmel
  • Misleading Performance Reporting in the Supercomputing Field, Bailey (how to lie with statistics)
  • The Architecture of Problems and Portable Parallel Software Systems, Fox (tries to categorize the different problem types arising in engineering and scientific applications according to the style of parallelism suitable for them)
  • Cray XMP - The Birth of a Supercomputer, August et al (brief overview of architecture)
  • Stanford DASH Multiprocessor, Lenoski et al (brief overview of architecture, precursor for SGI Power Challenge and other shared memory parallel machines)
  • A Case for NOW, Anderson, Culler, Patterson et al (why networks of workstations are the future, maybe)
  • Exploiting functional parallelism of POWER2 [IBM RS6000] to design high-performance numerical algorithms, Agarwal, Gustavson, Zubair.
  • IEEE Floating Point Arithmetic Standard (how almost all computers supposedly do floating point arithmetic; terse and short, without background)
  • What Every Computer Scientist Should Know About Floating Point Arithmetic But Was Afraid To Ask, Goldberg (background on floating point)
  • LogP: Towards a realistic model of parallel computation, Culler, Karp, Patterson et al (a simple formal model to characterize the cost of parallel programs) (See LogP: Towards a Realistic Model of Parallel Computation)
  • A Comparison of Sorting Algorithms for the CM-2, Blelloch et al
  • An Improved Supercomputer Sorting Benchmark, Thearling and Smith

Volume 7:

  • Parallel Numerical Linear Algebra, Demmel, Heath, van der Vorst (detailed survey of the field as of 1993).
  • Designing High Performance Linear Algebra Software for Parallel Computer, Demmel (overview of LAPACK and ScaLAPACK projects)
  • Compiler Transformations for High Performance Computing, Bacon, Graham, Sharp (available through anonymous ftp from
  • A hierarchical O(N log N) force calculation algorithm, Barnes and Hut (the first paper to show how to compute the gravitational or electrostatic forces on N particles in fewer than N**2 operations, a major breakthrough in scientific computing)
  • A fast algorithm for Particle Simulations, Greengard and Rokhlin (the second major paper, following Barnes/Hut)
  • Implications of Hierarchical N-body Methods for Multiprocessor Architectures, Singh, Hennessy and Gupta (how to program Barnes/Hut and Greengard/Rokhlin in parallel)
  • Hierarchical Algorithms and Architectures for Parallel Scientific Computing, Chan (argues that the divide-and-conquer idea underlying Barnes/Hut and Greengard/Rokhlin is ubiquitous in science and engineering, applying to much more than gravity and electrostatics)
  • A Dynamic Scheduling Method for Irregular Parallel Programs, Lucco (how to load balance in the presence of irregular amounts of work)
  • The Chaco User's Guide, Version 1.0, Hendrickson and Leland (a different approach to balancing the load with irregular problems)
  • Distributed memory compiler methods for irregular problems - data copy reuse and runtime partitioning, Das, Ponnusamy, Saltz, and Mavriplis (a compiler/runtime system incorporating load balancing ideas from above papers)
  • Programming with LPARX, Baden, Kohn, Fink (a system to help parallelize certain irregular computations on grids, typically arising in PDEs) (See

Volume 8:

  • LAPACK Manual (edition 1 in Soda, the improved edition 2 in Bechtel)

Volume 9:

  • A printed copy of the 1995 lecture notes for CS267, as they were used for the NSF-CBMS Short Course on Parallel Computing

Volume 10:

  • A printed copy of the 1995 lecture notes for Alan Edelman's MIT course 18.337 on Parallel Scientific Computing