CS271: RANDOMNESS & COMPUTATION
INSTRUCTOR: Alistair Sinclair
(sinclair@cs, 677 Soda)
LECTURES: Tuesday, Thursday 9:30-11:00 in 310 Soda
OFFICE HOURS: Monday 1:00-2:00, Tuesday 11:00-12:00 in 677 Soda
TA: Greg Valiant
(gvaliant@eecs, 615 Soda)
OFFICE HOURS: Monday 2:00-3:00 in 611 Soda, Friday 2:00-3:00 in 751 Soda
Recent Announcements
(12/17) Problem Set 3 has been graded. Sample solutions are posted below.
Graded solutions can be picked up from my office after Jan 3rd. Grades have
also been assigned for the class. Everybody did pretty well. Have a good
break and Happy New Year!
(12/3) The timing of office hours this Monday (Dec 5) will change. AS office hour
will be 3-4 (instead of 1-2). GV office hour will be 10-12 (instead of 2-3). Also,
AS office hour on Tuesday (Dec 6) will change from 11-12 to 12-1.
(11/30) Problem Set 3 is posted below; it covers the material in Lectures 21
to 27, and is due by 5pm on Wednesday, December 14. As always, start early!
(11/22) Problem Set 2 has been graded. Sample solutions are posted below.
Those who did not pick up their graded solutions in class today can do so during
office hours or at next Tuesday's lecture. Happy Thanksgiving!
(11/10) Yet another (minor) correction for Q5(b). As somebody pointed out, it is not
in fact "obvious" that connectivity is a monotonically increasing property, since
adding a new point can actually cause a connected graph to become disconnected in
this model. (You might like to check this!) So the translation from the PPP to
the original model as given below isn't quite correct. However, the translation
can still be made via the following observation. The number of points in the PPP
will be exactly n with probability e^{-n} n^n / n! > c/\sqrt{n} for a constant c > 0.
So if we show that in the PPP the graph is connected with probability much less
than 1/\sqrt{n} (in fact it will be exponentially small) then we can conclude that
in the original n-point model Pr[G is connected] -> 0. So the bottom line is that
it's still fine for you to follow the outline in the hint below; it's just that the
justification for transferring to the PPP is different.
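The lower bound on the probability of seeing exactly n points follows from Stirling's formula, which gives e^{-n} n^n / n! ~ 1/\sqrt{2\pi n}. A minimal numerical sanity check is sketched below; the function name and the chosen values of n are illustrative assumptions, not part of the problem set.

```python
import math

def poisson_prob_at_mean(n: int) -> float:
    """Pr[X = n] for X ~ Poisson(n), computed in log space to avoid overflow."""
    # log(e^{-n} n^n / n!) = -n + n*log(n) - log(Gamma(n + 1))
    return math.exp(-n + n * math.log(n) - math.lgamma(n + 1))

for n in (1, 10, 100, 10_000, 1_000_000):
    p = poisson_prob_at_mean(n)
    # sqrt(n) * p stays bounded below by a constant and tends to 1/sqrt(2*pi).
    print(n, p, math.sqrt(n) * p)
```

The last printed column should approach 1/\sqrt{2\pi} from below, which is consistent with the existence of the constant c.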
(11/9) The current hint for Q5(b) on HW2 is misleading.
Here is a modified hint that contains some simplifications and corrections; apologies
for the confusion.
[Note: If you have come up with an alternative argument
for this part that does not use the hint, that is fine.]
Hint for Q5(b)
1. First, you can make your life a lot easier by assuming that the points are distributed
in the unit square according to a Poisson Point Process (PPP) of intensity n. This
means that the number of points in any subregion A has a Poisson distribution with
parameter n x area(A), and the numbers of points in disjoint subregions are independent.
(This independence makes things much simpler.) Since the property of being connected
is obviously monotonically increasing with the number of points, it follows by exactly
the same argument as in the proof of Theorem 14.7 in Lecture 14 that
Pr[G is connected] <= 4 x Pr'[G is connected]
where Pr denotes the probability in the original n-point model, and Pr' denotes the
probability in the PPP model. Thus we can work in the PPP model and show that
Pr'[G is connected] -> 0.
2. Since we're working in the PPP model, you will probably need a Chernoff-type bound
for a Poisson r.v. You may assume that a Poisson r.v. X satisfies exactly the same
form of tail bounds as the Angluin bounds for a binomial r.v., as given in Corollary
13.3. (This bound for the upper tail follows immediately by substituting \lambda=\beta\mu
in the bound you derived in Q1(c) of the present HW. The bound for the lower tail
follows by a completely analogous argument.)
3. The strategy outlined in the original hint is still valid, except that condition (iii)
in the definition of a "bad" set of discs should be modified slightly as follows.
Condition (iii) should read: "The intersection of D_5 - D_3 with each disc of radius
1.5r centered at a set of points spaced equally at distance 0.01r around the boundary
of D_3 contains at least (k+1) points." This is the same as the previous condition,
except that the radius of the discs is a bit smaller and (most important) the number
of discs involved is small (actually constant). In addition to verifying the claimed
lower bound on the probability that a given set of three discs is bad, you should
explain clearly why the presence of a bad set of discs ensures that G is not connected.
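For intuition about the PPP model in step 1 of the hint, here is a minimal simulation sketch using only the Python standard library. It relies on the standard fact that, conditioned on the total number of points, the points of a homogeneous PPP are independent and uniform. All function names, the seed, and the parameters (n = 200, 400 trials, the quarter-square test region) are illustrative assumptions.

```python
import random

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Sample Poisson(lam) by counting rate-lam exponential arrivals in [0, 1)."""
    count = 0
    t = rng.expovariate(lam)
    while t < 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count

def sample_ppp_unit_square(n: float, rng: random.Random) -> list:
    """Sample a Poisson Point Process of intensity n on the unit square."""
    num_points = sample_poisson(n, rng)  # total count is Poisson(n x area) = Poisson(n)
    # Given the count, the points are i.i.d. uniform on the square.
    return [(rng.random(), rng.random()) for _ in range(num_points)]

def count_in_box(points, x0, y0, x1, y1) -> int:
    """Number of points falling in the box [x0, x1) x [y0, y1)."""
    return sum(1 for (x, y) in points if x0 <= x < x1 and y0 <= y < y1)

rng = random.Random(271)
n, trials = 200, 400
# Count points in a subregion A of area 0.25 over many independent samples.
counts = [count_in_box(sample_ppp_unit_square(n, rng), 0.0, 0.0, 0.5, 0.5)
          for _ in range(trials)]
mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
print(f"empirical mean {mean:.2f}, variance {var:.2f}; both should be near {n * 0.25}")
```

The empirical mean and variance of the count in A should both be close to n x area(A), as expected for a Poisson random variable.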
(11/8) On HW2, there are a couple of problems with Q5(b) as currently stated.
Some additional hints/corrections will be provided shortly. In the meantime, you are
encouraged to work on the other problems first. Also, in Q6 it is possible
to improve on the constant 8 in the denominator of the exponent; so if you
get a better constant - correctly justified - then that is fine. Finally,
in Q1(c) you should ignore the point about the lower tail.
(11/8) In Q2 of HW2, to avoid possible confusion, note that the factor
\omega(n) is not necessary to obtain the
result of the "Deduce" part. (In fact, a constant will do instead.)
(11/8) The venue for Greg's replacement office hour this week (Thursday 2-3pm)
is his office, 615 Soda.
(11/7) There is a typo in Q5(a) of HW2. The radius should be \sqrt{(10 log n)/n},
rather than (10 log n)/n. Apologies for the confusion.
(11/7) It's been pointed out to me that the deadline for HW2 is on 11/11,
a holiday. Therefore, the deadline is extended to 5pm Monday 11/14.
For the same reason, Greg's second office hour this week will move from
Friday 2-3 to Thursday 2-3, venue TBA.
(10/31) Here are the venues for Greg Valiant's office hours: Monday
2-3pm in 611 Soda; Friday 2-3pm in 751 Soda.
(10/29) There will be NO LECTURE next Thursday (November 3rd). Tuesday's
lecture will take place as usual. Please use the time to work on Problem
Set 2.
(10/29) Problem Set 2 is posted below; it covers the material in Lectures 13
to 20, and is due by 5pm on Friday, November 11. As always, start early!
(10/29) Some people have asked how the hw scores translate to final course grades,
and in particular whether low hw scores could lead to somebody failing the class.
Basically the grading scheme will be set so that anybody who has made
a decent attempt at all three problem sets will pass the class. (Thus, for example,
nobody is in any danger of failing based on HW1.)
(10/29) The class now has a TA:
Greg Valiant,
email gvaliant@eecs.
Starting this coming week (Monday October 31), Greg will hold office hours on Mondays
and Fridays from 2 to 3pm. If you need to speak to Greg and are unable to make either
of those times, you can send him email to arrange an alternative time.
(10/24) Problem Set 1 has been graded. Sample solutions are posted below.
Graded solutions will be returned in class tomorrow, or can be picked up during
office hours.
(10/4) Assessment: The class grade will be based on three Problem Sets, the
first of which is posted below. The last Problem Set will be due after the end of classes.
There will be no final exam.
(10/1) The first Problem Set is posted below; it covers the material in Lectures 1
to 11, and is due by 5pm on Friday, October 14. Start early!
(9/7) Following the move to a larger classroom, all waitlisted students
have now been admitted.
(9/2) THE CLASSROOM HAS BEEN CHANGED TO 310 SODA, STARTING WITH THE NEXT
LECTURE (TUESDAY SEPT 6).
(8/25) Our classroom may be changed as a result of overcrowding. Please
watch this space for announcements.
(8/25) The class is oversubscribed. If you decide to drop the class, please
de-register immediately so that another student can be admitted. If the class
remains full, it may be necessary to limit enrollment to graduate students.
If you plan to audit the class (i.e., come to lectures, but not do assessed
work or receive a letter grade), you should enroll in the class with the S/U
option.
Lecture Notes
Problem Sets
Description
One of the most remarkable developments in Computer Science over the
past 30 years has been the realization that allowing computers to toss
coins can lead to algorithms that are more efficient, conceptually
simpler and more elegant than their best known deterministic counterparts.
Randomization has since become such a ubiquitous tool in algorithm design
that any kind of encyclopedic treatment in one course is impossible.
Instead, I will attempt to survey several of the most widely used
techniques, illustrating them with examples taken from both algorithms
and random structures. A tentative and very rough course outline,
just to give you a flavor of the course, is the following:
- Elementary examples: e.g., checking identities, fingerprinting and
pattern matching, primality testing.
- Moments and deviations: e.g., linearity of expectation, universal hash
functions, second moment method, unbiased estimators, approximate counting.
- The probabilistic method: e.g., threshold phenomena in random graphs
and random k-SAT formulas; Lovász Local Lemma.
- Chernoff/Hoeffding tail bounds: e.g., Hamilton cycles in a random
graph, randomized routing, occupancy problems and load balancing,
the Poisson approximation.
- Martingales and bounded differences: e.g., Azuma's inequality,
chromatic number of a random graph, sharp concentration of Quicksort,
optional stopping theorem and hitting times.
- Random spatial data: e.g., subadditivity, Talagrand's inequality,
the TSP and longest increasing subsequences.
- Random walks and Markov chains: e.g., hitting and cover times,
probability amplification by random walks on expanders, Markov chain
Monte Carlo algorithms.
- Miscellaneous additional topics as time permits: e.g., statistical
physics, reconstruction problems, rigorous analysis of black-box optimization
heuristics,...
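To give a concrete flavor of the first topic, "checking identities", here is a minimal sketch of Freivalds' technique for verifying a matrix product AB = C without computing AB. It is offered only as an illustration and is not necessarily the example used in the lectures; the function names and the default of 20 rounds are assumptions made for this sketch.

```python
import random

def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def freivalds_check(A, B, C, rounds=20, rng=None):
    """Randomized test of whether A*B == C for integer matrices.

    One-sided error: if A*B == C this always returns True. Otherwise each
    round detects the mismatch with probability at least 1/2, so a wrong C
    is accepted with probability at most 2**(-rounds).
    """
    rng = rng or random.Random()
    n = len(C[0])
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]  # uniform 0/1 vector
        # Compare A(Br) with Cr: three matrix-vector products, O(n^2) each.
        if mat_vec(A, mat_vec(B, r)) != mat_vec(C, r):
            return False
    return True

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]
C_bad = [[19, 22], [43, 51]]
print(freivalds_check(A, B, C_good))
print(freivalds_check(A, B, C_bad))
```

Each round costs three matrix-vector products, so the check runs in O(n^2) time per round, compared with the cost of a full matrix multiplication for the obvious deterministic check.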
Prerequisites
Mathematical maturity, and a solid grasp of undergraduate material
on Algorithms and Data Structures, Discrete Probability and
Combinatorics. If you are unsure about the suitability of your
background, please talk to me before committing to the class.
Registration
Following department policy, all students - including auditors -
are requested to register for the
class. Auditors should register S/U; an S grade will be awarded for
class participation and satisfactory scribe notes. If there is excessive
demand for the class, it may be necessary to limit enrollment
to full-time graduate students. Those who decide to drop the class
are requested to do so promptly so that others may take their place.
Suggested References
There is no required text for the class, and no text that covers
more than about one third of the topics. However, the following
books cover significant portions of the material, and
are useful references.
- Noga Alon and Joel Spencer, The Probabilistic Method (3rd ed.),
Wiley, 2008.
- Svante Janson, Tomasz Łuczak and Andrzej Ruciński, Random Graphs,
Wiley, 2000.
- Geoffrey Grimmett and David Stirzaker, Probability and Random
Processes (3rd ed.), Oxford Univ Press, 2001.
- Michael Mitzenmacher and Eli Upfal, Probability and Computing:
Randomized Algorithms and Probabilistic Analysis, Cambridge Univ Press, 2005.
- Rajeev Motwani and Prabhakar Raghavan, Randomized Algorithms,
Cambridge Univ Press, 1995.
Scribe Notes
Scribe notes for all lectures will be posted on this web page shortly
after each lecture. These will be based on (edited versions of) scribe
notes from previous renditions of the class. In some cases, where there
is substantial new material, I may request volunteers to write scribe
notes for occasional classes.
Assessment etc.
The assessment mechanism will depend on the final composition of the class
and will be announced later. A major (and possibly the only) component will
be a small number of sets of homework exercises distributed through the
semester. You are encouraged to do the Exercises sprinkled through the scribe
notes as we go along, as these will ensure that you absorb the material in real
time and should make the homeworks more manageable.
If the class is not too large, students may also be asked to present a
paper at the end of the semester.