|
Time: Wednesdays 2:30-4
Place: 373 Soda EXCEPTION: meeting on 4/2 will be in 411 Soda (alcove) | There are usually one or more papers that you should at least look over before coming, so that we can have a more effective discussion. |
Past semesters' reading group topics: Fall 2001; Spring 2002; Summer 2002; Fall 2002
To subscribe or unsubscribe, click here.
This semester's first topic: "Massive data sets"
| Date | Speaker | Topic & paper reference |
| Februrary 5 | -- | Organizational meeting |
| February 12 | Andrej Bogdanov | Property testing: new results from STOC 2003 about testing satisfiability of SAT formulas |
| February 19 | James Lee |
I will talk about space-bounded computation. Specifically, I will
talk about small-space deterministic algorithms for undirected s-t
connectivity (USTCON). This is the best kind of open problem:
natural, elegant, important, and perched on the brink of resolution.
Until 1992, the best deterministic space-bounded algorithm for USTCON was due to Savitch (1970) and used O(log^2 n) space. I will show the existence of short universal traversal sequences, and explain how the pseudorandom generators of Nisan can be used to generate such a sequence in small space. Using these sequences in an intuitive way, we will see a deterministic algorithm for USTCON that uses only O(log^{3/2} n) space. The algorithm is very simple and achieves a space complexity of O(log^2 n/log log n) even _without_ Nisan's generator, thus improving a 20-year-old result using only elementary techniques. If there is time, I will sketch how to improve the 3/2 to 4/3 [Armoni, et al. 1999]. Some relevant papers:
|
| February 26 | Eran Halperin | Streaming algorithms for clustering (k-median). |
| March 5 | Sam Riesenfeld | "Topic: Locality-preserving hashing, with applications to high dimensional proximity problems. References: Indyk, Motwani, et al. "Locality-Preserving Hashing in Multidimensional spaces." STOC '97. Linial, Sasson. "Non-Expansive Hashing." STOC '96. Possibly more recent results by Indyk et al if I get a chance to read them...." |
| March 12 | Robi Krauthgamer; 411 SODA |
I will talk about approximate hierarchical clustering. Suppose for example
that you want to find approximate k-medians (think of it as k-clustering)
for every k, such that your k-medians are a subset of your (k+1)-medians,
for all k. One reason this may be interesting is that people apply
hierarchical clustering in practice.
I will present Greg Plaxton's paper "Approximation algorithms for hierarchical location problems", UT Austin Technical Report TR-02-60 (see http://www.cs.utexas.edu/users/plaxton/html/all.html). A paper with same author and title was accepted to the upcoming STOC. |
| March 19 | Kevin Chen | Cancelled. |
| March 28 | Spring break | ? |
| April 2 | Jittat Fakcharoenphol | I'll talk about counting inversions in a data stream. The result is from a STOC'02 paper by Ajtai, Jayram, Ravi Kumar, and Sivakumar, "Approximate counting of inversions in a data stream." I might also talk about Gupta and Zane's improvement from SODA'03 if I have time to read it. |
| April 9 | Kunal Talwar |
Dimension Reduction
The Johnson-Lindentrauss(1985) lemma says that any $n$ point euclidean space can be projected down to O(log n) dimensions without losing too much information about interpoint distances. More precisely, every interpoint distance in the log n dimensional projected space is within 1+\epsilon of the distance in the original space. This result has several algorithmic applications and some elementary proofs. A natural question that researchers have sought to answer is: Is there a corresponding result for l_1 spaces? How many dimensions are necessary to faithfully represent a n-point l_1 space as another (low dim) l_1 space. Are logarithmically many dimensions enough? A very recent result of Brinkman and Charikar(2003) answers this last question in the negative. In fact they show that there exists an l_1 space such that any (1+\epsilon)-faithful representation requires \sqrt{n} dimensions. This build up on a recent result of Charikar and Sahai(2002) that shows similar results for linear projection based embeddings. I shall try to understand and present the ideas and techniques used in proving these two results. PS: If possible, have a look at Charikar and Sahai, Dimension reduction in the l_1 norm, FOCS 2002. |
| April 16 | Jordan Kerenidis | In the first half Hoeteck will describe the notion of epsilon-biased sets and show how they relate to linear codes and expander graphs. Then he will sketch their applications to communication complexity and generation of "almost" k-wise independent random variables. In the second half I will talk about the paper "Derandomizing Low degree tests via epsilon-biased sets" from STOC 2003. I will give the proof for derandomizing the BLR linearity test and a sketch for the low degree test. |
| April 23 | Brighten |
I'll talk about Konemann and Ravi's STOC 2003 paper, "Primal-dual meets
local search: Approximating MST's with nonuniform degree bounds". From
their abstract:
"We present a bicriteria approximation algorithm for the degree-bounded minimum-cost spanning tree problem: Given an undirected graph with nonnegative edge weights and degree bounds B_v > 1 for all vertices v, find a spanning tree T of minimum total edge-cost such that the maximum degree of each node v in T is at most B_v. Our algorithm finds a tree in which the degree of each node v is O(V_v + log n) and the total edge-cost is at most a constant times the cost of any tree that obeys all degree constraints. "[Their algorithm] uses repeated application of Kruskal's MST algorithm interleaved with a combinatorial update of approximate Lagrangean node-multipliers maintained by the algorithm." No knowledge of the primal-dual method will be assumed. I'll give a primal-dual interpretation of Kruskal's MST algorithm first. The paper is available here. |
| April 30 | Alex Fabrikant | Cancelled. |
| May 7 | Chris Harrelson |
I will speak about two results to appear in STOC 2003.
The first is a paper by Ostlin and Pagh, which shows
that there exists a family of hash functions, stored
in O(n) words, which can be evaluated in constant time
in the RAM model, and are completely uniform on a set
of size n with probability at least 1-1/poly(n).
This implies that, for example, any RAM execution sequence of length n^O(1) which depends on a random hash function oracle can be replaced by one which uses Ostlin-Pagh hash functions, and with probability 1-1/poly(n) the distribution of memory configurations after each step is the same. The second, by Dietzfelbinger and Woelfel, gives an alternative proof of the same result, but reduces the constant and therefore may be more practical. Their result appears to be inspired by the first result. It also uses a different technique, which may be of independent interest. |