From daw@CS.Berkeley.EDU Sat Jun 28 23:22:15 PDT 1997
Article: 2821 of isaac.lists.coderpunks
Path: news.isaac.cs.berkeley.edu!not-for-mail
From: daw@CS.Berkeley.EDU (David Wagner)
Newsgroups: isaac.lists.coderpunks
Subject: Re: Fast distributed mod. exponentiation
Date: 2 Jun 1997 16:58:00 -0700
Organization: ISAAC Group, UC Berkeley
Lines: 24
Sender: daemon@abraham.cs.berkeley.edu
Approved: mail2news@news.isaac.cs.berkeley.edu
Distribution: isaac
Message-ID: <5mvlac$7kv@joseph.cs.berkeley.edu>
References: <33922075.446B@cs.umass.edu>
NNTP-Posting-Host: abraham.cs.berkeley.edu
Precedence: bulk

In article <33922075.446B@cs.umass.edu>,
Lewis McCarthy  <lmccarth@cs.umass.edu> wrote:
> I'm searching for bibliographic references to published work on fast
> distributed or parallel exponentiation in large finite groups (e.g.
> those suitable for DH and ElGamal).

Here's a scheme to parallelize exponentiation; the algorithm has near-perfect
scaling.  I'll assume you've got a fixed generator g of order q, and you
have p processors of equal power.  Then divide the exponent space into p
roughly equal parts, each with k ~= log(q)/p bits; the j-th processor
precomputes g^{2^{jk}}, for j = 0,...,p-1.  Write the exponent e as
	e = x_0 + x_1 2^k + x_2 2^{2k} + x_3 2^{3k} + ... + x_{p-1} 2^{(p-1)k}
Now the j-th processor picks out the j-th term above, e.g. x_j 2^{jk}, and
computes the corresponding exponentiation,
	y_j = g^{x_j {2^{jk}} = g^{x_j} g^{2^{jk}}.
Finally, multiply all the y_j terms together with a parallel prefix algorithm.
There is an extra overhead of p + p log(p) multiplies (total across all
processors), but other than that it scales perfectly.  Also, it can be easily
adapted to a distributed environment where computation is expensive by using
a centralized parallel prefix algorithm instead of a tree-based one.

I don't know anything about Montgomery methods, or other sophisticated serial
techniques for exponentiation, so I'll leave it as an exercise to the reader
to investigate parallelizing them.