From daw@CS.Berkeley.EDU Wed Jul 29 11:25:01 PDT 1998
Article: 5551 of isaac.lists.coderpunks
Path: news.isaac.cs.berkeley.edu!not-for-mail
From: David Wagner <daw@CS.Berkeley.EDU>
Newsgroups: isaac.lists.coderpunks
Subject: Re: Entropy loss
Date: 22 Jul 1998 17:07:10 -0700
Organization: ISAAC Group, UC Berkeley
Lines: 17
Sender: daemon@abraham.cs.berkeley.edu
Approved: mail2news@news.isaac.cs.berkeley.edu
Distribution: isaac
Message-ID: <199807222343.QAA14453@joseph.cs.berkeley.edu>
NNTP-Posting-Host: abraham.cs.berkeley.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mailer: ELM [version 2.4 PL25]
Precedence: bulk
Xref: joseph.cs.berkeley.edu isaac.lists.coderpunks:5551

In article <Pine.LNX.3.96.980716010232.24035B-100000@blackbox> you write:
> This raises an issue I've been wondering about for a while: how much
> entropy is lost by recursive hashing?

Suppose we've got a random function F : {0,1}^n -> {0,1}^n.
Let f_j denote the fraction of n-bit values which can appear as the
result of iterating F j times.  Then f_1 ~ 1 - 1/e, and
	f_{j+1} ~ 1 - e^{-f_j}.
The first few numbers in the sequence are .632, .469, .374, .312,
.268, .235, .210, etc.  Asymptotically, f_j ~ 2/(j + log j) is a
pretty good approximation.

Moreover, I'd guess that the lost entropy is going to be on the
order of lg f_j bits, which is not too large as long as j isn't too
big.  Even asymptotically, this is lg f_j ~ 1 - lg(j + log j), so
if you hash it 2^k times, you'll only lose about k-1 bits of entropy
(just a guess -- I have no proof).


