My name is Jimmy Su. I am a second-year graduate student in the computer science department. My research interest is compiler optimizations for parallel languages. More specifically, I have been looking at optimizations for irregular array accesses in the context of the Titanium project. My goal in taking CS267 is to learn about and develop applications that would benefit from my work on compiler optimizations for irregular problems.
Distributed Immersed Boundary Simulation in Titanium
Problem
The immersed boundary method is a general technique for modeling elastic boundaries immersed within a viscous, incompressible fluid. The method has been applied to several biological and engineering systems, including large-scale models of the heart [3] and cochlea [1]. These simulations have previously been run on shared-memory machines. Porting the algorithms to distributed-memory machines increases the number of available processors, so larger problems can be solved. These simulations have the potential to improve the basic understanding of the biological systems they model and to aid in the development of surgical treatments and prosthetic devices.
Challenge
Despite the popularity of the immersed boundary method and the desire to scale problems up to capture the details of the physical systems accurately, parallelization for large-scale distributed-memory machines has proven challenging. The primary reason is a classic tradeoff between locality and load balance that arises when the immersed boundary data structure is distributed across processors.
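The tradeoff can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration: the 1-D slab decomposition of the fluid grid, the clustered boundary points, and all function names. It is not the distribution scheme used by any particular immersed boundary code. It compares placing each boundary point on the processor that owns its fluid-grid column (good locality) with dealing the points out round-robin (good balance).

```python
# Toy illustration of the locality vs. load-balance tradeoff.
# All names, sizes, and the slab decomposition are hypothetical assumptions.

def owner_of_cell(x, grid_size, num_procs):
    """Processor owning fluid-grid column x under an even 1-D slab split."""
    return x * num_procs // grid_size

def locality_assignment(points, grid_size, num_procs):
    """Place each boundary point with the owner of its grid column."""
    return [owner_of_cell(x, grid_size, num_procs) for x in points]

def round_robin_assignment(points, num_procs):
    """Deal boundary points out evenly, ignoring where they sit in the grid."""
    return [i % num_procs for i in range(len(points))]

def max_load(assignment, num_procs):
    """Largest number of boundary points held by any one processor."""
    counts = [0] * num_procs
    for proc in assignment:
        counts[proc] += 1
    return max(counts)

def remote_fraction(points, assignment, grid_size, num_procs):
    """Fraction of boundary points whose grid column lives on another processor."""
    remote = sum(
        1
        for x, proc in zip(points, assignment)
        if owner_of_cell(x, grid_size, num_procs) != proc
    )
    return remote / len(points)

GRID_SIZE = 64
NUM_PROCS = 4
# Hypothetical boundary: 32 points clustered in columns 16..31,
# which all fall inside processor 1's slab.
points = [16 + i // 2 for i in range(32)]

local = locality_assignment(points, GRID_SIZE, NUM_PROCS)
balanced = round_robin_assignment(points, NUM_PROCS)

for name, assignment in (("locality", local), ("round-robin", balanced)):
    print(
        f"{name:12s} max load = {max_load(assignment, NUM_PROCS):2d}, "
        f"remote fraction = "
        f"{remote_fraction(points, assignment, GRID_SIZE, NUM_PROCS):.2f}"
    )
```

In this toy setup the locality-preserving placement puts every boundary point on one processor, while the balanced placement spreads the work evenly but leaves most points far from the fluid data they interact with.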
Distributed Immersed Boundary Simulation in Titanium
Givelberg and Yelick developed a parallel algorithm for the immersed boundary method that is designed for scalability on distributed-memory multiprocessors and clusters of SMPs [2]. The software package, called IB, is implemented in Titanium [4], a Java-based language for high-performance scientific computing. It takes advantage of the object-oriented features of Titanium to provide a framework for simulating immersed boundaries that separates the generic immersed boundary method code from the application-specific features that define the immersed boundary structure and the forces that arise from those structures. Results showed that IB is scalable and that large-scale immersed boundary computations with the IB package are feasible.
Platform
Experiments were carried out on Seaborg, an IBM SP RS/6000 at the National Energy Research Scientific Computing Center (NERSC). This computer ranks 9th on the Top 500 list. It is a distributed-memory machine built from 16-processor nodes. It currently has 380 nodes, and each node has between 16 and 64 GBytes of memory.
Performance
All of the tests were carried out on 1, 2, 4, or 8 nodes, for a total of 16, 32, 64, or 128 processors. Table 1 summarizes the wall-clock time per time step for a number of test models, as well as the total number of floating-point operations computed (in billions) when the maximal number of processors is employed.

Each processor on Seaborg has a peak performance of 1.5 GFlops. The experiments show that the software package runs at less than 3% of peak for all configurations tested, which is the biggest weakness of this application. Although there is much room for improving performance, the transition from shared-memory to distributed-memory machines has already paid off: because more processors are available, the application can now handle problem sizes that were previously out of reach.
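To make the "less than 3% of peak" figure concrete, the short sketch below works out the aggregate peak rate for each node count and the upper bound on the sustained rate that 3% implies. It uses only the numbers quoted above (1.5 GFlops per processor, 16 processors per node, 1 to 8 nodes); the function and constant names are illustrative, and the results are arithmetic bounds, not measurements.

```python
# Arithmetic on the figures quoted in the text; the results are bounds,
# not measured rates. Names are illustrative.

PEAK_GFLOPS_PER_PROC = 1.5    # peak per Seaborg processor, from the text
PROCS_PER_NODE = 16           # processors per node, from the text
MAX_FRACTION_OF_PEAK = 0.03   # "less than 3% of peak"

def aggregate_peak_gflops(nodes):
    """Combined peak rate of all processors on the given number of nodes."""
    return nodes * PROCS_PER_NODE * PEAK_GFLOPS_PER_PROC

def sustained_upper_bound_gflops(nodes):
    """Upper bound on the sustained rate implied by the 3% figure."""
    return MAX_FRACTION_OF_PEAK * aggregate_peak_gflops(nodes)

for nodes in (1, 2, 4, 8):
    procs = nodes * PROCS_PER_NODE
    print(
        f"{nodes} node(s), {procs:3d} processors: "
        f"peak {aggregate_peak_gflops(nodes):6.1f} GFlops, "
        f"sustained below {sustained_upper_bound_gflops(nodes):.2f} GFlops"
    )
```

For the largest configuration (8 nodes, 128 processors) the aggregate peak is 192 GFlops, so a run at under 3% of peak sustains less than about 5.76 GFlops in total, or less than 45 MFlops per processor.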
References
[1] R. P. Beyer. A computational model of the cochlea using the immersed boundary method. J. Comp. Phys., 98:145–162, 1992.
[2] E. Givelberg and K. Yelick. Distributed immersed boundary simulation in Titanium. Submitted.
[3] D. M. McQueen and C. S. Peskin. Shared-memory parallel vector implementation of the immersed boundary method for the computation of blood flow in the beating mammalian heart. Supercomputing, 11:213–236, 1997.
[4] K. Yelick, L. Semenzato, G. Pike, C. Miyamoto, B. Liblit, A. Krishnamurthy, P. Hilfinger, S. Graham, D. Gay, P. Colella, and A. Aiken. Titanium: A high-performance Java dialect. Concurrency: Practice and Experience, 10(11-13), September-November 1998.