From daw@cs.berkeley.edu Sat Nov 14 17:39:43 PST 1998 Article: 1213 of sci.crypt.research From: daw@cs.berkeley.edu (David Wagner) Newsgroups: sci.crypt.research Subject: Re: Entropy vs Work (Mistake in Crypto FAQ?) Date: 14 Nov 1998 14:05:04 GMT Organization: ISAAC Group, UC Berkeley Lines: 93 Approved: crypt-submission@cs.auckland.ac.nz Message-ID: <72k2mg$gm0$1@scream.auckland.ac.nz> References: <72dcle$us7$1@joseph.cs.berkeley.edu> Reply-To: daw@cs.berkeley.edu (David Wagner) NNTP-Posting-Host: gate.student.auckland.ac.nz X-Newsreader: NN version 6.5.0 #3 (NOV) Path: agate!newsfeed.berkeley.edu!newsfeed.cwix.com!203.97.37.7!newsfeed.clear.net.nz!news.iprolink.co.nz!auckland.ac.nz!not-for-mail Xref: agate sci.crypt.research:1213 In article <72dcle$us7$1@joseph.cs.berkeley.edu>, John Pliam wrote: > I believe I have a very strong family of counterexamples to a > statement in the "Cryptography FAQ". > > > We can measure how bad a key distribution is by calculating > > its entropy. This number E is the number of ``real bits > > of information'' of the key: a cryptanalyst will typically > > happen across the key within 2^E guesses. E is defined as > > the sum of -p_K log_2 p_K, where p_K is the probability > > of key K. > > As a general rule, this is dangerously wrong. Agreed. Good catch! As a general rule, there are many measures of entropy (not just the Shannon entropy defined above), which are useful in different contexts; confusing them can be dangerous. Here are several examples: + Shannon entropy: H(X) = - \sum_x Pr_X(x) \log_2 Pr_X(x) Useful for analyzing one-time pads, secret sharing, compression, asymptotics, etc. + Renyi entropy (of order 2): H_2(X) = - \log_2 \sum_x Pr_X(x)^2 Useful for analyzing the probability of collisions. + ``Guessing'' entropy: H_G(X) = \log_2 \sum_i i p_i if the probabilities Pr_X(x) are arranged so that p_1 >= p_2 >= ... Useful for analyzing the expected cost of keysearch, inverting hash functions, etc. I'm stealing liberally from Chapter 3 of [1] here. The different measures of entropy don't necessarily agree, of course. I think many people are often very sloppy about their use of the word ``entropy.'' Your paper seems to be a useful reminder of just how misleading our intuition can be. Perhaps an example would be useful. Let's take Renyi entropy, say. Suppose we have a memoryless source which generates outputs according to the distribution of the random variable X. Then H_2(X) is a good measure of the collision properties of the source. The probability that any two fixed outputs collide is 2^{-H_2(X)}; also, after n outputs, the expected number of collisions is n(n-1)/2 * 2^{-H_2(X)}. See [1] for other results. As an application of Renyi entropy, suppose we encrypt plaintext under ECB mode, and the random variable X represents the distribution of blocks of plaintext. Then after n outputs the expected number of collisions in ciphertext blocks which leak information on the plaintext is n(n-1)/2 * 2^{-H_2(X)}. So when we are interested in collision properties, it seems that the Renyi entropy roughly matches our intuition. As an example application of ``guessing'' entropy, imagine analyzing the Unix password hashing scheme, where we are given the hash of the user's password. If we let the r.v. X represent the distribution of users' passwords, then the expected amount of work needed (in an optimal dictionary search attack) to recover a single target's password is 2^{H_G(X)}. You give a generalization, wf_p(X), which measures the expected work required to gain a probability p of success. This seems useful in the case that we have many cryptanalytic targets. Suppose we have a password file containing n hashes; if we do wf_{c/n}(X) work, the expected number of passwords broken is c (assuming independence of password choices). Note that [1] cites a paper by Massey [2] that apparently proves the bound 2^{H_G(X)} >= 2^{H(X)-2} + 1. [1] also cites a paper by McEliece and Yu [3] that apparently proves a weak bound in the other direction, H(X) >= (2^{H_G(X)} - 1) 2 \log_2 |X| / (|X| - 1), where |X| represents the number of different possible values X can take on. See [1] for more results in this vein. Note that for the uniform distribution, the measures of entropy roughly agree [H(X) = H_2(X) = \log_2 |X|, H_G(X) = (\log_2 (|X|+1)) - 1]. Interestlying, [1] mentions that when X is nearly uniform in the sense that H(X) = \log_2 |X| - c for small c, then the other measures of entropy roughly agree too. Therefore, to prove security, it is apparently enough to show that H(X) is close to \log_2 |X|. [1] Christian Cachin, Entropy Measures and Unconditional Security in Cryptography, PhD thesis, ETH Zurich, May 1997. ftp://ftp.inf.ethz.ch/pub/publications/dissertations/th12187.ps.gz [2] James Massey, ``Guessing and entropy,'' Proc. 1994 IEEE Int'l Symp. on Information Theory, 1994, p. 204. [3] Robert McEliece and Zhong Yu, ``An inequality on entropy,'' Proc. 1995 IEEE Int'l Symp. on Information Theory, 1995, p. 329.