From daw@CS.Berkeley.EDU Wed Jul 29 11:25:01 PDT 1998 Article: 5551 of isaac.lists.coderpunks Path: news.isaac.cs.berkeley.edu!not-for-mail From: David Wagner Newsgroups: isaac.lists.coderpunks Subject: Re: Entropy loss Date: 22 Jul 1998 17:07:10 -0700 Organization: ISAAC Group, UC Berkeley Lines: 17 Sender: daemon@abraham.cs.berkeley.edu Approved: mail2news@news.isaac.cs.berkeley.edu Distribution: isaac Message-ID: <199807222343.QAA14453@joseph.cs.berkeley.edu> NNTP-Posting-Host: abraham.cs.berkeley.edu Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Mailer: ELM [version 2.4 PL25] Precedence: bulk Xref: joseph.cs.berkeley.edu isaac.lists.coderpunks:5551 In article you write: > This raises an issue I've been wondering about for a while: how much > entropy is lost by recursive hashing? Suppose we've got a random function F : {0,1}^n -> {0,1}^n. Let f_j denote the fraction of n-bit values which can appear as the result of iterating F j times. Then f_1 ~ 1 - 1/e, and f_{j+1} ~ 1 - e^{-f_j}. The first few numbers in the sequence are .632, .469, .374, .312, .268, .235, .210, etc. Asymptotically, f_j ~ 2/(j + log j) is a pretty good approximation. Moreover, I'd guess that the lost entropy is going to be on the order of lg f_j bits, which is not too large as long as j isn't too big. Even asymptotically, this is lg f_j ~ 1 - lg(j + log j), so if you hash it 2^k times, you'll only lose about k-1 bits of entropy (just a guess -- I have no proof).