This problem is offered by Prof. Andrew Gelman of the Statistics Dept. (gelman@stat, 642-9959). Please see him for details.

*Background.* An important problem in statistical computation is Monte
Carlo simulation, and a commonly used method is Markov chain, or
``random walk,'' simulation, in which a computer program simulates a
random walk through a probability distribution. The idea is to run
the random walk long enough that it has time to take a tour through
most of the distribution. A basic idea in implementation is to
simulate several parallel random walks, so that you can (1) explore
several parts of the distribution at once, and (2) compare the paths
of the several random walks to assess convergence. When the random
walks are close to converging to their ergodic (stationary)
distribution, their paths will have similar statistical properties.
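As an illustration of point (2), one common way to compare the paths is to contrast between-walk and within-walk variance. The Python sketch below uses a hypothetical Metropolis random walk targeting a standard normal; it is illustrative only and is not the supplied serial program.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_walk(n, delta, start):
    # A Metropolis random walk whose ergodic distribution is standard
    # normal; delta is the maximum jump length per step.
    x = start
    path = np.empty(n)
    for t in range(n):
        proposal = x + rng.uniform(-delta, delta)
        # accept with probability min(1, p(proposal) / p(x)),
        # compared on the log scale to avoid overflow
        if np.log(rng.random()) < 0.5 * (x * x - proposal * proposal):
            x = proposal
        path[t] = x
    return path

# m parallel walks started from overdispersed points
m, n = 10, 10000
paths = np.array([random_walk(n, delta=1.0, start=s)
                  for s in rng.normal(0.0, 5.0, size=m)])

# Between-walk vs. within-walk variance; a ratio near 1 suggests the
# walks have reached the same distribution
W = paths.var(axis=1, ddof=1).mean()        # within-walk variance
B = n * paths.mean(axis=1).var(ddof=1)      # between-walk variance
rhat = np.sqrt(((n - 1) / n * W + B / n) / W)
```

When the walks are still exploring different regions, B dominates W and the ratio is well above 1; as they mix, it falls toward 1.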

*Our current set-up.* We currently have a program written in
``serial/parallel'': it starts m random walks, then cycles through them,
performing one step in each random walk, until each walk has gone
through n steps. Typical values are m=10, n=10000. In some
implementations, the program pauses occasionally (perhaps every 1000
steps) to (a) alter the random walk algorithm, and (b) assess
convergence. (Output from the random walks up to time t can be used
to alter the random walk algorithm and improve the efficiency of the
simulations.)
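That serial/parallel cycle might look like the following Python sketch; the step function, the retuning rule, and all names here are placeholders standing in for the actual supplied algorithms.

```python
import random

def step(x, delta):
    # one step of one random walk: a jump of length up to delta
    # (placeholder dynamics, not the real algorithm)
    return x + random.uniform(-delta, delta)

m, n, check_every = 10, 10000, 1000
delta = 1.0                                          # jump-length parameter
walks = [random.gauss(0.0, 5.0) for _ in range(m)]   # m starting points
paths = [[] for _ in range(m)]

for t in range(1, n + 1):
    for i in range(m):               # one step in each walk, in turn
        walks[i] = step(walks[i], delta)
        paths[i].append(walks[i])
    if t % check_every == 0:
        # pause to (a) alter the algorithm and (b) assess convergence,
        # using the output up to time t
        spread = max(walks) - min(walks)
        delta = 0.5 if spread < 2.0 else 1.0  # placeholder retuning rule
```

Note that every walk must wait at the pause, and a slow step in any one walk stalls the whole cycle; this is the bottleneck the parallel set-up below is meant to remove.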

*Our goal.* We want a parallel set-up in which each of m processors is
continually running a random walk. The parallel processors can be in
a single machine or, perhaps more usefully, on a network of
workstations. The random walks should be run asynchronously, if
possible (so that if one of the processors is busy, the others don't
have to wait for it). There should also be a processor used as a
controller. The controller gives the m processors starting points for
their walks; the controller also supplies them with some parameters
that govern the random walk algorithms (an example of a parameter is
delta, the length of a jump in each step of the walk). The m
processors write their results into some shared memory, and
occasionally the controller reads that memory and uses the results to
(a) decide how to alter the random walk algorithms, and (b) assess
convergence (i.e., decide whether to stop). The results from (a)
are fed back into the random walks, and so forth.
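A minimal sketch of that layout, using Python threads as a stand-in for separate processors or networked workstations (the names, the shared-memory representation, and the retuning rule are all illustrative assumptions, not the supplied program):

```python
import threading
import time
import random

def worker(i, results, params, stop):
    # one of the m processors: runs its walk continually and
    # asynchronously, writing its current position into shared memory
    x = random.uniform(-5.0, 5.0)            # starting point
    while not stop.is_set():
        x += random.uniform(-params["delta"], params["delta"])
        results[i] = x                       # write into shared memory

m = 4
results = [0.0] * m          # shared memory the walkers write into
params = {"delta": 1.0}      # parameters the controller feeds back
stop = threading.Event()
walkers = [threading.Thread(target=worker, args=(i, results, params, stop))
           for i in range(m)]
for w in walkers:
    w.start()

# The controller: occasionally read the (possibly slightly stale)
# shared results without blocking the writers
for _ in range(5):
    time.sleep(0.01)
    snapshot = list(results)
    spread = max(snapshot) - min(snapshot)
    # (a) alter the algorithm: retune the jump length (placeholder rule)
    params["delta"] = 0.5 if spread < 2.0 else 1.0
    # (b) assess convergence: compare statistics of the m paths here

stop.set()
for w in walkers:
    w.join()
```

Because the controller copies a snapshot rather than locking the array, a walker is never forced to wait for the controller; the price is that the snapshot may mix positions from slightly different times, which is harmless for occasional monitoring.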

The technical difficulties seem to be: (1) setting up the
asynchronous processes with the shared memory, and (2) having the
controller read this information without interfering with the
processors that are writing it. I'll supply a serial program with
all the algorithms, so the issue for you is to parallelize it.

I think that parallel processing for this family of algorithms will
be useful to many applied statisticians. Of course, I expect this
work will result in a publication in a statistics journal (I don't
know what gets published in C.S. journals) and will lead to many
other applications.

Sun Jan 14 22:42:23 PST 1996