```
CS 70 - Lecture 15 - Feb 23, 2011 - 10 Evans

Goals for Note 10 (note: we are skipping Note 9): Counting
Preparation for Probability Theory (most of the rest of the course)
analyzing the "average" behavior of algorithms (quicksort)
design faster algorithms that "flip coins" or take random guesses (hashing)
how much of Artificial Intelligence works (deducing "most likely" meaning
of a sentence, CS188)
communicating reliably over a noisy channel (EE126)
building control systems that work well despite noise (Kalman filters)

If I flip a fair coin once, what is the chance of heads?
1 flip giving head / 2 possible equally likely outcomes = .5
If I flip a fair coin twice, what is the chance of 2 heads?
Need to count the number of ways you can get 2 heads: HH
and divide by the number of equally likely outcomes: {HH,HT,TH,TT}
so 1/4 = 25%
If I flip a fair coin 100 times, what is the chance I get exactly 50 heads?
Need to count how many sequences of 100 flips get exactly 50 heads,
divide by the number of sequences of 100 flips altogether (each of which is
equally likely)

For passwords consisting of 6 upper case letters, how long would it
take a hacker to break in, if s/he could try 1 password / microsecond?

Notation: set S = {list of distinct items}
S union T = union of S and T = set of all items in S or in T or both
(with any copies removed)
S intersect T = intersection of S and T = set of items in both S and in T
|S| = cardinality of S = #items in S

Counting Principles
1) The Sum Rule:
EX: If you have to do one project for a class, and are given one list with
2 projects and another with 3 different projects, how many different
projects do you have to choose from? 2+3 = 5

The Sum Rule (formally): Suppose we have to choose one task to do, either
T1 or T2. Let S1 be the set of n1 ways to do task 1, and S2 the set of
n2 ways to do task 2, where S1 and S2 are disjoint. The set of ways to do
either T1 or T2 is S1 U S2.  The number of ways to do either T1 or T2 is
|S1 U S2| = |S1| + |S2| = n1 + n2

2) The Product Rule:
EX: If you have to do two projects for a class, the first one chosen from a
list of 2 projects, and the second one chosen from a list of 3 projects,
how many different pairs of projects could you turn in?
S1={p1,p2}, S2={pa,pb,pc}
pairs={(p1,pa),(p1,pb),(p1,pc),(p2,pa),(p2,pb),(p2,pc)} = S1 x S2
2*3 = 6 different pairs

The Product Rule (formally) Suppose we have two tasks to do, T1 and T2,
with S1, n1, S2, n2 as above. The set of ways to do both T1 and T2 is
S1 x S2. The number of ways to do both T1 and T2 is
|S1 x S2| = |S1| * |S2| = n1 * n2
(remember S1 x S2 is the set of all pairs of entries {(x1,x2) : xi in Si})

3) The Extended Product Rule: If S1 is the set of n1 ways to do T1,
S2 the set of n2 ways to do T2, ... , Sm the set of nm ways to do Tm,
then the set of ways to do T1,T2,...,Tm is S1xS2x...xSm, which has
n1*n2*...*nm elements

ASK&WAIT: How many bit strings of length 9 are there?
ASK&WAIT: How many strings consist of three letters followed by 3 numbers?
ASK&WAIT: How many different computer passwords are there if they may be
8 characters, upper case letters only?
ASK&WAIT: How many different computer passwords are there if they may be
6-8 characters long, upper or lower case letters, digits?
ASK&WAIT: What if there must be at least one letter and one number?

ASK&WAIT: How many ways can you shuffle a deck of 52 cards?

EX: How many ways can a class of 100 students be divided into 2-student
teams?
2 students {s1,s2} -> 1 way
4 students {s1,s2,s3,s4} -> 3 ways
{(s1,s2),(s3,s4)}, {(s1,s3),(s2,s4)}, {(s1,s4),(s2,s3)}
How do we get a simple formula for any even n?
Suppose there are P(n-2) pairings of n-2 students,
whose names are 1, 2, ... , n-2; now add students n-1 and n
What pairings are possible?
Take student n, and choose any other student m to make
the pair.  That leaves n-2 students, with P(n-2)
possible pairings.
m can take on n-1 values, so there are (n-1)*P(n-2)
possible pairings.
Result: recurrence P(n) = (n-1)*P(n-2) , with P(2)=1
Are we sure we have counted every possibility exactly once?
Use induction: assume P(n-2) is correct
In construction, get (n-1) groups of P(n-2) pairings each, where
n is paired with a different m in each group. So
no pairing appears in more than one group. And no pairing
can appear twice in one group because all P(n-2) groupings
of n-2 students are different, by induction. And each
pairing has to appear in one group, depending on partner of n.
ASK&WAIT: What is a closed form formula for P(n)?
For n=100: P(100) = 99*97*95*...*3 \approx 2.7e+78
n     P(n)
2     1
4     3
6     15
8     105
10    945
20    6.5e+08
40    3.2e+23
60    2.9e+40
100   2.7e+78
150   6.1e+130
(number of atoms in universe once thought to be about 1e80)

4) Inclusion-Exclusion Principle:
EX: How many 8-bit strings either start with 1 or end with 00?
S1 = {1xxxxxxx, x= any bit}, S2 = {xxxxxx00, x=any bit}
We want |S1 U S2|. But S1 and S2 overlap: S1 inter S2 = {1xxxxx00}
So we count |S1| = 2^7, |S2| = 2^6. But |S1|+|S2|>|S1 U S2| because
S1 inter S2 has been counted twice, so we subtract it:
|S1 U S2| = |S1| + |S2| - |S1 inter S2| = 2^7 + 2^6 - 2^5 = 160
The Inclusion-Exclusion Principle (formally) Suppose we have to choose one task
to do, either T1 or T2, with S1, n1, S2, n2 as above, except S1 and S2 may
intersect. The set of ways to do either T1 or T2 is S1 U S2. The number of ways
to do either T1 or T2 is
|S1 U S2| = |S1| + |S2| - |S1 inter S2|
EX: How many <= 3 decimal digit numbers are divisible by 3 or by 4?
Suppose you have 3 tasks, in sets S1, S2, S3, which might overlap.
Then | S1 U S2 U S3 | = |S1| + |S2| + |S3|
- |S1 inter S2| - |S2 inter S3| - |S1 inter S3|
+ |S1 inter S2 inter S3 |
EX: How many <= 3 decimal digit numbers are divisible by 3, 4 or 5?

5) Pigeonhole Principle:
If k+1 or more objects (pigeons) are placed in k boxes (holes),
then at least one box contains 2 or more objects.
(Proof: if every box contained at most 1 object,
there would only be k or fewer objects, a contradiction)

EX: In any group of 27 English words, at least 2 begin with the same
letter, since there are only 26 letters.
ASK&WAIT: How large a group of people do you need to be sure that
two of them have the same first and last initials?
ASK&WAIT: How many times do you have to shuffle a deck of cards, to
be sure that the cards are in exactly the same order at least twice?

6) Generalized Pigeonhole Principle:
If N or more objects (pigeons) are placed in k boxes (holes),
then at least one box contains ceiling(N/k) or more objects.
(Proof: if every box contained at most ceiling(N/k)-1 objects,
there would be at most
k*(ceiling(N/k) -1) < k*((N/k +1) - 1) = N objects, a contradiction)
EX: N=k+1 implies ceiling(N/k)=ceiling((k+1)/k)=2, usual Pigeonhole principle

ASK&WAIT: There are 199 students enrolled in CS70. How many have to

EX: Given any set S of n+1 positive integers less than or equal to 2*n,
then one of them must divide another one:
(ex: n=5, if S={2,3,5,7,9,10}, then 5 | 10)
Proof: Let S = {a(1),a(2),...,a(n+1)}. Write a(i) = 2^(k(i)) * q(i),
where q(i) is odd. So {q(1),...,q(n+1)} is a set of positive
odd integers from 1 to 2n-1, of which there are only n, namely
1,3,5,...,2n-1. So by the pigeonhole principle, q(i)=q(j)=q
for some i and j. Thus a(i) = 2^(k(i)) * q and
a(j) = 2^(k(j)) * q. If k(i)>k(j) then a(j)|a(i), else a(i)|a(j).

ASK&WAIT: Assuming California has 36M people, at least how many of them must
have the same 3 initials and were born on the same day of the same
month (but not necessarily in the same year)?
For example, Arnold B. Casey (ABC), born 29 Feb 1956 and
Abigail B. Chen (ABC), born 29 Feb 1992

7) Permutations
DEF: a permutation of a set S of n distinct objects is an ordered list
of these objects
DEF: an r-permutation is an ordered list of r elements of S
EX: S={1,2,3},
all permutations={(1,2,3),(2,1,3),(1,3,2),(2,3,1),(3,1,2),(3,2,1)}
all 2-permutations={(1,2),(2,1),(1,3),(3,1),(2,3),(3,2)}
DEF: the number of r-permutations of a set S with n elements is P(n,r)
Theorem: P(n,r) = n*(n-1)*(n-2)*...*(n-r+1) = n!/(n-r)!
Proof: (product rule): there are n ways to choose the first in list,
n-1 ways to choose second, ... , n-r+1 ways to choose rth
EX: P(3,3)=3*2*1=6, P(3,2)=3*2=6
EX: how many different ways can a salesman visit 8 cities? P(8,8)=8!=40320
EX: How many different ways can 10 horses in a race win, place and show
(come in first, second, third)? P(10,3) = 10*9*8 = 720

8) Combinations
DEF: an r-combination from a set S is simply an unordered subset of
r elements from S
EX: S={1,2,3}, all 2-combinations={{1,2},{1,3},{2,3}}
Compared to the 2-permutations, we see that we ignore order.
DEF: C(n,r) = number of r-combinations from a set with n-elements
Theorem: C(n,r) = n! / [ (n-r)! r! ]
Proof: the set of all r-permutations can be formed from the set of
all r-combinations by taking all r! orderings of each
r-combination, so P(n,r)=r! * C(n,r), and
C(n,r)=P(n,r)/r!= n! / [ (n-r)! r! ]=  n*(n-1)*(n-2)*...*(n-r+1)/r!
EX: C(3,2)=P(3,2)/2!=6/2=3
DEF C(n,r) also called binomial coefficient, written (n \\ r), pronounced
"n choose r"
Note that C(0,0) = 0!/(0!*0!) = 1; C(n,0)=C(n,n)=1
Corollary: C(n,r)=C(n,n-r)
Proof: C(n,r)=n!/[(n-r)! r!] = n!/[ r! (n-r)!] = n!/[(n-(n-r))! (n-r)!]
= C(n,n-r)
EX: C(3,1)=C(3,2)=3

DEF Pascal triangle:
                        C(0,0)
                    C(1,0)  C(1,1)
                C(2,0)  C(2,1)  C(2,2)
            C(3,0)  C(3,1)  C(3,2)  C(3,3)
        C(4,0)  C(4,1)  C(4,2)  C(4,3)  C(4,4)
    C(5,0)  C(5,1)  C(5,2)  C(5,3)  C(5,4)  C(5,5)
C(6,0)  C(6,1)  C(6,2)  C(6,3)  C(6,4)  C(6,5)  C(6,6)
                          ...
Numerically:
                                                       row sum
                           1                                 1
                       1       1                             2
                   1       2       1                         4
               1       3       3       1                     8
           1       4       6       4       1                16
       1       5      10      10       5       1            32
   1       6      15      20      15       6       1        64

Note that to get any entry, you sum its two neighbors in the row above
(upper left and upper right)
Theorem: C(n,r)= C(n-1,r-1)+C(n-1,r) (Pascal's identity)
Proof: need to show
n!/[(n-r)! r!] = (n-1)!/[(n-r)! (r-1)!] + (n-1)!/[(n-r-1)! r!]
Multiply both sides by (n-r)! r!/(n-1)! to get
n = r + (n-r), which is true

Theorem: sum_{r=0}^n C(n,r) = 2^n (note row sums of Pascal's triangle)
proof: 2^n = number of subsets of a set S with n elements
= sum_{r=0}^n number of subsets of size r of S
= sum_{r=0}^n C(n,r)

Binomial Theorem:
(x+y)^n = sum_{r=0}^n C(n,r) * x^r * y^{n-r}
EX:
(x+y)^0=                                    1
(x+y)^1=                          1x        +       1y
(x+y)^2=                 1x^2     +         2xy     +        1y^2
(x+y)^3=        1x^3     +        3x^2y     +       3xy^2    +       1y^3
(x+y)^4=1x^4    +        4x^3y    +         6x^2y^2 +        4xy^3   +     1y^4

Proof: (x+y)^n=(x+y)*(x+y)*...*(x+y) n times
what is coefficient of x^r*y^(n-r)? If we multiply out whole
expression, get one x^r*y^(n-r) term for each subset of
r terms (x+y) out of n from which we choose x, which is C(n,r)

EX: what is coeff of x^12 y^13 in
(2x-3y)^25 = sum_{r=0}^25 C(25,r) (2x)^r (-3y)^(25-r)
= sum_{r=0}^25 C(25,r) 2^r (-3)^(25-r) x^r y^(25-r)
= ... - C(25,12) 2^12 3^13 x^12 y^13 ...
= ... - 25!/(12! 13!) 2^12 3^13 x^12 y^13 ...
= ... - 3.4e+16 x^12 y^13 ...
ASK&WAIT: How many bit strings contain exactly 5 zeros and 14 ones, if each
zero is immediately followed by 2 ones?

ASK&WAIT: show that sum_{k=1}^n k*C(n,k) = n*2^(n-1):

EX: How many ways can a class of n students be divided into m-person teams?
We assume n = q*m, so there are q teams.

Solution: Let us start to write down all the ways of dividing all the
students into q teams of m students each by writing down all n! ways of
ordering n students, and just saying the first m students are the first
team, the 2nd m students are the second team and so on.
But it is clear that we have counted the same
sets of teams too many times. Let us try to divide out n! by the number
of multiple copies of the same set of teams.

First, it is clear that no matter what the order of the first m students
in the list is, we get the same team. Since there are m! such orders, we
have counted sets of teams which differ only in the order of the first
team m! times too often, so we should divide by m!.

Similarly, the order of the 2nd group of m students does not matter, so
we should divide by m! again. The same argument applies to each of the
q groups of m students, so we should divide by m! q times.

But we are still not done, because the team consisting of the first
m students could appear anywhere in the q possible positions, as could
the second group of m students, and so on. In other words, we have still
counted the same set of teams q! times too often, because the teams can
appear in q! possible orders, and still represent the same set of teams.
So we have to divide by q! also.

All in all, we get n!/( (m!)^q q!).

EX:  How many different desserts can you make out of 4 scoops of ice cream,
each of which may be chocolate (C), vanilla (V) or strawberry (S)?
Here are the 15 possibilities:
CCCC  VVVV   SSSS
CCCV  VVVC   SSSC
CCCS  VVVS   SSSV
CCVS  VVCS   SSCV
CCVV  VVSS
CCSS

Here is a more systematic way to get the answers: we will represent each
dessert by a sequence of 4 stars (representing the 4 scoops) and 2 bars
(dividing the stars into 3 groups: C, V and S). Here are some examples:
**|*|*  represents 2 C's, 1 V  and 1 S
*|**|*  represents 1 C , 2 V's and 1 S
*|***|  represents 1 C , 3 V's and 0 S's
|****|  represents 0 C , 4 V's and 0 S's
||****  represents 0 C , 0 V's and 4 S's  etc
The idea is that every sequence of 4 stars and 2 bars represents exactly
one dessert. How many such sequences are there? The idea is that we take
6 possible positions (for 4 stars and 2 bars) and choose 2 of
them for bars. There are C(6,2) = 6!/(2! 4!) = 15 ways to do this.

Here is the general result:
Theorem: Suppose I have n types of objects ("flavors"). Then the number of
different sets ("desserts") consisting of r objects ("scoops"), where a type
may be repeated, is C(n+r-1,n-1).
Proof: The idea is the same as before: each sequence of r stars ("scoops")
and (n-1) bars represents a possible set. There are C(n+r-1,n-1) ways to
pick n-1 places out of r+n-1 locations to put the bars.

Ex: If I have n=3 flavors of ice cream, and make desserts of r=4 scoops,
there are C(n+r-1,n-1)=C(3+4-1,3-1)=C(6,2)=15 different desserts.

EX: How many anagrams are there of the word "mammal"?
Recall that an anagram is a distinct ordering of the letters.
Here are some smaller examples:
the word "the": The 6 anagrams are   the, teh, eth, eht, het, hte
the word "see": The 3 anagrams are   see, ese, ees

Here are different ways to try to solve this problem for the word
"mammal", followed by the general result:

Solution 1: Pick 3 locations for the m's
Pick 2 of the remaining locations for the 2 a's
Pick the remaining location for l
By the product rule, the number of ways to pick locations is
C(6,3)   ... for the m's
* C(3,2)   ... for the a's
* C(1,1)   ... for the l
= 20*3*1 = 60

Solution 2: Pick 1 location for the l
Pick 3 of the remaining locations for the m's
Pick the remaining 2 locations for the a's
By the product rule, the number of ways to pick locations is
C(6,1)   ... for the l
* C(5,3)   ... for the m's
* C(2,2)   ... for the a's
= 6*10*1 = 60, the same answer (whew!)

Solution 3: Let us start by labeling the m's as m1,m2 and m3,
and the a's as a1 and a2, so we can distinguish them.
So now we have 6 distinct symbols, m1,a1,m2,m3,a2,l,
and the number of ways to order them is 6!.
But clearly we have counted some orderings as distinct
that we should not, so let's try to divide out by
the number of multiple copies.

For example, consider all the orderings where the first
3 characters are m's, and the last three are a1,a2,l.
There are clearly 3! = 6 such orderings, since m1,m2,m3
can appear in the first three positions in any order,
but yield the same anagram. This argument that we are
counting each anagram 3! times works no matter where
the 3 m's appear, so we should divide the number
of orderings by 3! to account for the 3 m's.

Similarly, we should divide by 2! to account for the
two a's.

This yields 6!/ (3! 2!) = 60, the same answer (whew!)

Solution 3 is the one that generalizes to arbitrary
anagrams:

Theorem:   Suppose we have
n(1) copies of symbol 1
n(2) copies of symbol 2
...
n(k) copies of symbol k
Let n = n(1) + n(2) + ... + n(k). Then the number of
distinct anagrams constructed from these symbols is

n!
---------------------------
n(1)! n(2)! n(3)! ... n(k)!

Proof: Consider all n! permutations of the n symbols.
Some of these are identical:
Given a permutation, all n(1)! permutations with symbol 1 in
the same positions are identical
Given a permutation, all n(2)! permutations with symbol 2 in
the same positions are identical
...
Given a permutation, all n(k)! permutations with symbol k in
the same positions are identical
Therefore, we need to divide n! by n(1)!*n(2)!*...*n(k)! to get
the correct number of anagrams.
```
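Several of the counts in these notes are small enough to check by brute force. The Python sketch below is one way to do that; it is not part of the lecture, and the function names (pairings, teams, count_start_1_or_end_00, desserts, anagrams) are made up for illustration. It evaluates the recurrence P(n) = (n-1)*P(n-2), compares it with n!/((m!)^q q!) for m = 2, enumerates the 8-bit strings from the inclusion-exclusion example, and enumerates the desserts and the anagrams directly.

```python
from itertools import combinations_with_replacement, permutations, product
from math import comb, factorial


def pairings(n):
    """Ways to split n students into 2-student teams: P(n) = (n-1)*P(n-2), P(2) = 1."""
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    count = 1  # P(2)
    for k in range(4, n + 1, 2):
        count *= k - 1  # P(k) = (k-1) * P(k-2)
    return count


def teams(n, m):
    """Ways to split n students into q = n/m teams of m: n! / ((m!)^q * q!)."""
    q, remainder = divmod(n, m)
    if remainder != 0:
        raise ValueError("m must divide n")
    return factorial(n) // (factorial(m) ** q * factorial(q))


def count_start_1_or_end_00():
    """Brute-force count of 8-bit strings that start with 1 or end with 00."""
    strings = ("".join(bits) for bits in product("01", repeat=8))
    return sum(1 for s in strings if s.startswith("1") or s.endswith("00"))


def desserts(flavors, scoops):
    """Number of unordered selections of `scoops` items from `flavors`, repeats allowed."""
    return sum(1 for _ in combinations_with_replacement(flavors, scoops))


def anagrams(word):
    """Number of distinct orderings of the letters of `word`, by enumeration."""
    return len(set(permutations(word)))


if __name__ == "__main__":
    print(pairings(10))                      # matches the table entry for n = 10
    print(pairings(100) == teams(100, 2))    # recurrence agrees with the closed form
    print(count_start_1_or_end_00())         # compare with 2^7 + 2^6 - 2^5
    print(desserts("CVS", 4), comb(6, 2))    # stars and bars: C(n+r-1, n-1)
    print(anagrams("mammal"), factorial(6) // (factorial(3) * factorial(2)))
```

Enumeration only works for small cases (the anagram check builds all 6! orderings), which is why the closed-form formulas are needed for something like n = 100.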