From daw@cs.berkeley.edu Sat Nov 14 17:39:43 PST 1998
Article: 1213 of sci.crypt.research
From: daw@cs.berkeley.edu (David Wagner)
Newsgroups: sci.crypt.research
Subject: Re: Entropy vs Work (Mistake in Crypto FAQ?)
Date: 14 Nov 1998 14:05:04 GMT
Organization: ISAAC Group, UC Berkeley
Lines: 93
Approved: crypt-submission@cs.auckland.ac.nz
Message-ID: <72k2mg$gm0$1@scream.auckland.ac.nz>
References: <72dcle$us7$1@joseph.cs.berkeley.edu>
Reply-To: daw@cs.berkeley.edu (David Wagner)
NNTP-Posting-Host: gate.student.auckland.ac.nz
X-Newsreader: NN version 6.5.0 #3 (NOV)
Path: agate!newsfeed.berkeley.edu!newsfeed.cwix.com!203.97.37.7!newsfeed.clear.net.nz!news.iprolink.co.nz!auckland.ac.nz!not-for-mail
Xref: agate sci.crypt.research:1213




In article <72dcle$us7$1@joseph.cs.berkeley.edu>,
John Pliam  <john.pliam@worldnet.att.net> wrote:
> I believe I have a very strong family of counterexamples to a
> statement in the "Cryptography FAQ".
>
> >   We can measure how bad a key distribution is by calculating
> >   its entropy. This number E is the number of ``real bits
> >   of information'' of the key: a cryptanalyst will typically
> >   happen across the key within 2^E guesses. E is defined as
> >   the sum of -p_K log_2 p_K, where p_K is the probability
> >   of key K.
>
> As a general rule, this is dangerously wrong.
 
Agreed.  Good catch!
 
As a general rule, there are many measures of entropy (not just the
Shannon entropy defined above), which are useful in different contexts;
confusing them can be dangerous.
 
Here are several examples:
 + Shannon entropy:  H(X) = - \sum_x Pr_X(x) \log_2 Pr_X(x)
   Useful for analyzing one-time pads, secret sharing, compression,
   asymptotics, etc.
 + Renyi entropy (of order 2):  H_2(X) = - \log_2 \sum_x Pr_X(x)^2
   Useful for analyzing the probability of collisions.
 + ``Guessing'' entropy:  H_G(X) = \log_2 \sum_i i p_i
       if the probabilities Pr_X(x) are arranged so that p_1 >= p_2 >= ...
   Useful for analyzing the expected cost of keysearch, inverting hash
   functions, etc.
I'm stealing liberally from Chapter 3 of [1] here.
 
The different measures of entropy don't necessarily agree, of course.
 
I think many people are often very sloppy about their use of the word
``entropy.''  Your paper seems to be a useful reminder of just how
misleading our intuition can be.
 
Perhaps an example would be useful.  Let's take Renyi entropy, say.
Suppose we have a memoryless source which generates outputs according to
the distribution of the random variable X.  Then H_2(X) is a good
measure of the collision properties of the source.  The probability that
any two fixed outputs collide is 2^{-H_2(X)}; also, after n outputs, the
expected number of collisions is n(n-1)/2  *  2^{-H_2(X)}.  See [1] for
other results.
 
As an application of Renyi entropy, suppose we encrypt plaintext under
ECB mode, and the random variable X represents the distribution of
blocks of plaintext.  Then after n outputs the expected number of
collisions in ciphertext blocks which leak information on the plaintext
is n(n-1)/2  *  2^{-H_2(X)}.  So when we are interested in collision
properties, it seems that the Renyi entropy roughly matches our
intuition.
 
As an example application of ``guessing'' entropy, imagine analyzing the
Unix password hashing scheme, where we are given the hash of the user's
password.  If we let the r.v. X represent the distribution of users'
passwords, then the expected amount of work needed (in an optimal
dictionary search attack) to recover a single target's password is
2^{H_G(X)}.
 
You give a generalization, wf_p(X), which measures the expected work
required to gain a probability p of success.  This seems useful in the
case that we have many cryptanalytic targets.  Suppose we have a
password file containing n hashes; if we do wf_{c/n}(X) work, the
expected number of passwords broken is c (assuming independence of
password choices).
 
Note that [1] cites a paper by Massey [2] that apparently proves the
bound 2^{H_G(X)} >= 2^{H(X)-2} + 1.  [1] also cites a paper by McEliece
and Yu [3] that apparently proves a weak bound in the other direction,
H(X) >= (2^{H_G(X)} - 1) 2 \log_2 |X| / (|X| - 1), where |X| represents
the number of different possible values X can take on.  See [1] for more
results in this vein.
 
Note that for the uniform distribution, the measures of entropy roughly
agree [H(X) = H_2(X) = \log_2 |X|, H_G(X) = (\log_2 (|X|+1)) - 1].
Interestlying, [1] mentions that when X is nearly uniform in the sense
that H(X) = \log_2 |X| - c for small c, then the other measures of
entropy roughly agree too.  Therefore, to prove security, it is
apparently enough to show that H(X) is close to \log_2 |X|.
 
[1] Christian Cachin, Entropy Measures and Unconditional Security in
    Cryptography, PhD thesis, ETH Zurich, May 1997.
    ftp://ftp.inf.ethz.ch/pub/publications/dissertations/th12187.ps.gz
[2] James Massey, ``Guessing and entropy,'' Proc. 1994 IEEE Int'l Symp.
    on Information Theory, 1994, p. 204.
[3] Robert McEliece and Zhong Yu, ``An inequality on entropy,'' Proc.
    1995 IEEE Int'l Symp. on Information Theory, 1995, p. 329.



