Christian Bell, Rajesh Nishtala, Jeffrey Hammel, Hormozd Gahvari, Jimmy Su, Michael Tung
The NAS FT benchmark solves a simple partial differential equation through a 3-D discrete Fourier transform. The particular equation is given in the full FT specification at the NAS site. The FT benchmark can stress all-to-all communications, depending on how the transform is performed.
You are to develop UPC and Titanium codes that solve the given PDE by applying an FFT, stepping the PDE forward through time, and applying an inverse FFT. The mechanisms for stepping the PDE forward are given in the NAS spec and in the sample codes. You can use any sequential FFT implementation that produces results in agreement with the NAS benchmark. Don't spend all your time trying to interface with external codes; you should focus on the communication needs.
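The transform, step, inverse-transform pipeline can be sketched sequentially in one dimension. This is only an illustration, not the benchmark itself: it assumes a heat-type equation in which Fourier mode k decays by exp(-4 α π² k² t), and the NAS spec remains the authority on the exact equation and constants. The names `dft` and `evolve` are hypothetical, and the naive O(n²) transform stands in for whatever sequential FFT you choose.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) DFT; a stand-in for any sequential FFT routine."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[j] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for j in range(n))
        out.append(s / n if inverse else s)
    return out


def evolve(u0, alpha, t):
    """Advance u0 to time t: forward transform, scale each mode, invert."""
    n = len(u0)
    v = dft(u0)
    for k in range(n):
        # Signed frequency index: the upper half of the spectrum is negative.
        kbar = k if k < n // 2 else k - n
        # Assumed decay factor for a heat-type equation (check the NAS spec).
        v[k] *= math.exp(-4.0 * alpha * math.pi ** 2 * kbar ** 2 * t)
    return dft(v, inverse=True)
```

For example, `evolve(u0, alpha=1e-6, t=1.0)` returns the state at t = 1 as a list of complex values. Because the forward transform is computed once and each time step only rescales the modes, the parallel cost is dominated by the transforms and their communication.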
Figure 1: If the data is considered distributed, then each processor performs local 1-D or 2-D FFTs interleaved with all-to-all transposes.
Figure 2: If the data is considered shared, then processors can pull 1-D or 2-D (or 3-D) slices, perform FFTs, and return the transformed slices to the shared data.
Two schemes for performing large FFTs in parallel are illustrated in Figures 1 and 2. An efficient UPC or Titanium code for a given platform will probably combine aspects of both.
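The distributed scheme of Figure 1 can be simulated sequentially to check the data movement before writing any UPC or Titanium. The sketch below is a two-dimensional analogue under stated assumptions: the matrix is square, its size is divisible by the number of simulated processors, and each processor owns a contiguous block of rows. The names `all_to_all_transpose` and `fft2_distributed` are hypothetical, and the naive `dft` again stands in for a real sequential FFT.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) forward DFT; a stand-in for a sequential FFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def all_to_all_transpose(blocks):
    """Simulate the all-to-all: every rank p sends one tile to every rank q.

    blocks[p] holds the rows owned by rank p. The result holds, for each
    rank, its rows of the transposed matrix.
    """
    nprocs = len(blocks)
    rows_per = len(blocks[0])
    n = nprocs * rows_per  # assumes a square matrix
    out = [[[0j] * n for _ in range(rows_per)] for _ in range(nprocs)]
    for p in range(nprocs):        # sending rank
        for q in range(nprocs):    # receiving rank
            for r in range(rows_per):
                for c in range(rows_per):
                    # Global (p*rows_per + r, q*rows_per + c) lands on rank q
                    # at local row c, column p*rows_per + r.
                    out[q][c][p * rows_per + r] = blocks[p][r][q * rows_per + c]
    return out


def fft2_distributed(matrix, nprocs):
    """2-D DFT as local 1-D transforms interleaved with transposes."""
    rows_per = len(matrix) // nprocs
    blocks = [matrix[p * rows_per:(p + 1) * rows_per] for p in range(nprocs)]
    blocks = [[dft(row) for row in blk] for blk in blocks]  # local work
    blocks = all_to_all_transpose(blocks)                   # communication
    blocks = [[dft(row) for row in blk] for blk in blocks]  # local work
    blocks = all_to_all_transpose(blocks)  # restore the original layout
    return [row for blk in blocks for row in blk]
```

The inner loops of `all_to_all_transpose` are where a real code would issue its communication. In the shared view of Figure 2, the same tiles would instead be read from and written back to the shared array by whichever processor needs them.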
These codes are available to get you started. You can no doubt find others on-line.
E. Jason Riedy