The point of this assignment is to gain exposure to some parallel programming models. It's also to get you started on the next one.
Because I was so late in posting this, I can't ask for it this Friday. However, I will be issuing and discussing the next assignment this Friday, 25 February. Since this assignment will give you some basis for understanding the next one, you should try to meet with some people over coffee or dessert and discuss it before then.
You'll need a new group. This group is for both this assignment and the next. Swapping groups will help expose you to more ideas, and may help you find good partners for the final project.
Examine three parallel programs. Each implements a simplistic simulation of interacting particles: fish with gravitational attraction. None of them is particularly optimized. The programs all contain some implementation choices; you need to explore some of that space. The programs are written for MPI, pthreads, and UPC.
These come with Makefiles that should work by default on the Cray. The MPI version also has clauses for Millennium and the NoW. All the programs, along with a basic visualization tool (in Python, using Tkinter), are available in assignment3.tar.gz.
Your group should put together a web page describing the programs. Mail me a URL to a tar file containing the page, any modified code, and Makefiles or other tools needed to build your modified code.
The files have some questions to help guide the discussion. Quantifying your conclusions and comparing them with a good model (like LogP) will make your conclusions more convincing. On the Cray T3E, you can use Vampir to analyze the MPI implementation. Comparisons of timing results between the Cray and the NoW could be interesting, too. You can produce speed-up plots (time for a fixed problem vs. number of processors), but it's not altogether straightforward. Use good judgement, and let the next assignment influence you. Explain why these programs are poor.
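As a starting point for the speed-up plots mentioned above, here is a minimal sketch in Python that turns wall-clock times for one fixed problem size into speed-up and parallel efficiency. The timings are made-up placeholders, not measurements from the Cray or the NoW, and the function name `speedup_table` is hypothetical; substitute your own numbers.

```python
# Hypothetical wall-clock times in seconds for ONE fixed problem size,
# keyed by processor count. These are placeholders, not real measurements.
timings = {1: 120.0, 2: 63.0, 4: 34.0, 8: 20.0, 16: 14.0}


def speedup_table(timings):
    """Return (procs, time, speedup, efficiency) rows.

    Speed-up is measured relative to the smallest processor count present,
    so the table still makes sense if a 1-processor run is unavailable.
    """
    base_p = min(timings)
    base_t = timings[base_p]
    rows = []
    for p in sorted(timings):
        speedup = base_t / timings[p]
        # Efficiency: achieved speed-up divided by the ideal (linear) one.
        efficiency = speedup / (p / base_p)
        rows.append((p, timings[p], speedup, efficiency))
    return rows


rows = speedup_table(timings)
print("procs   time(s)  speedup  efficiency")
for p, t, s, e in rows:
    print(f"{p:5d} {t:9.2f} {s:8.2f} {e:11.2f}")
```

Efficiency falling well below 1 as the processor count grows is the sort of observation worth explaining with a communication model rather than just reporting.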
It may be worthwhile to instrument the UPC implementation using the TAU profiling package, especially for justifying choices in the next assignment. I'm not entirely sure it'll work, but it's worth trying. Don't worry about timing the pthreads implementation; the class doesn't have access to a reasonable SMP. Simply discuss it. If you have access to an SMP with more than 6 processors, feel free to time it there.
I'm not expecting earth-shattering epiphanies and mountains of data, just good comprehension and evidence you can support it. As a warning, some of the coding style is not the best. I tend to write C as if I were compiling from more productive languages, so it can get verbose.
To load the necessary tools into your environment on the Cray, use the following commands:
    module load GNU
    module load python
    module load vampir
    module load tau   # optional
To use the simplistic visualization tool from one of the subdirectories in the tar file, redirect the output of the simulation into a file, say fish.out. Then try running ../aquarium fish.out. If you receive a "command not found" type of error, try running python ../aquarium fish.out. The run button starts and stops the playback. If you get a funky Python error that mentions some capitalization of tkinter, you'll need to use a different Python. On Solaris, you can use mine. To put it in your path, do your shell's equivalent to:
    setenv LD_LIBRARY_PATH /home/cs/ejr/cs267-now-support/lib:${LD_LIBRARY_PATH}
    setenv PATH /home/cs/ejr/cs267-now-support/bin:${PATH}
Note that these paths will only live for this semester. The tool is pretty simplistic and easy to break, but it's less of a kludge than previous X hacks. You typically won't need to see much to know if you've completely killed the simulation. Don't worry about numerical inaccuracies here, but do keep the simulation producing reasonable results. (If you feel like replacing the variant of Euler's method with something better, go for it.)
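For anyone tempted to replace the integrator, the sketch below compares a plain forward Euler step with a velocity Verlet step on a two-body circular orbit and reports the relative energy drift of each. It is written in Python for brevity and is not the assignment's code: the exact Euler variant the programs use isn't shown here, and the units (G = 1), the softening length, and every function name are assumptions made for illustration.

```python
import math

G = 1.0           # gravitational constant in arbitrary units (assumption)
SOFTENING = 1e-3  # hypothetical softening length; avoids division by zero


def accelerations(pos, mass):
    """Direct O(n^2) pairwise gravitational accelerations in 2D."""
    n = len(pos)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r2 = dx * dx + dy * dy + SOFTENING * SOFTENING
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Equal and opposite forces: update both particles at once.
            acc[i][0] += G * mass[j] * dx * inv_r3
            acc[i][1] += G * mass[j] * dy * inv_r3
            acc[j][0] -= G * mass[i] * dx * inv_r3
            acc[j][1] -= G * mass[i] * dy * inv_r3
    return acc


def euler_step(pos, vel, mass, dt):
    """Forward Euler: first-order; positions use the old velocities."""
    acc = accelerations(pos, mass)
    new_pos = [[p[k] + dt * v[k] for k in range(2)] for p, v in zip(pos, vel)]
    new_vel = [[v[k] + dt * a[k] for k in range(2)] for v, a in zip(vel, acc)]
    return new_pos, new_vel


def verlet_step(pos, vel, mass, dt):
    """Velocity Verlet: second-order, at the cost of a second force pass."""
    acc = accelerations(pos, mass)
    new_pos = [[p[k] + dt * v[k] + 0.5 * dt * dt * a[k] for k in range(2)]
               for p, v, a in zip(pos, vel, acc)]
    new_acc = accelerations(new_pos, mass)
    new_vel = [[v[k] + 0.5 * dt * (a[k] + b[k]) for k in range(2)]
               for v, a, b in zip(vel, acc, new_acc)]
    return new_pos, new_vel


def total_energy(pos, vel, mass):
    """Kinetic plus (softened) gravitational potential energy."""
    kinetic = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r = math.sqrt(dx * dx + dy * dy + SOFTENING * SOFTENING)
            potential -= G * mass[i] * mass[j] / r
    return kinetic + potential


def relative_energy_drift(step, n_steps=1000, dt=0.01):
    """Run two unit masses on a (nearly) circular orbit; return |dE / E0|."""
    mass = [1.0, 1.0]
    pos = [[-0.5, 0.0], [0.5, 0.0]]
    speed = math.sqrt(0.5)  # circular-orbit speed for this configuration
    vel = [[0.0, -speed], [0.0, speed]]
    e0 = total_energy(pos, vel, mass)
    for _ in range(n_steps):
        pos, vel = step(pos, vel, mass, dt)
    return abs((total_energy(pos, vel, mass) - e0) / e0)


euler_drift = relative_energy_drift(euler_step)
verlet_drift = relative_energy_drift(verlet_step)
print("forward Euler   relative energy drift:", euler_drift)
print("velocity Verlet relative energy drift:", verlet_drift)
```

Velocity Verlet needs the accelerations at both the old and new positions, so a parallel version either caches them between steps or pays for a second force computation. That trade-off is part of the implementation space worth discussing.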
When running Vampir's graphical tool, you may need to specify -install even when you have a 24- or 32-bit color display. The program doesn't seem to deal with colors well.
I haven't gotten to play with TAU much, but it looks potentially useful. It's worth further examination. I might be able to get it to work on the NoW, as well.
The UPC technical report details many things that don't quite exist. See the README in this assignment's UPC directory for details.
If you feel like banging your head against problems that aren't yours, try using the Millennium cluster. The times on it may be interesting, but you'll have to pester the admins to make the cluster usable.
I tend to use the acronym PE (processing element) when mentioning an individual processing node. It's a holdover from image processing on SIMD machines (essentially data-parallel hardware). And if you see broken things, please tell me. Posting discoveries to the newsgroup (ucb.class.cs267) would be nice, too.
E. Jason Riedy