CS294-8 Fall 2000 Reading List
There are extra links at the bottom of this page to papers that are relevant
but not (yet) assigned.
Week 1: Motivation
Tuesday, August 29
Thursday, August 31
- Brown, A., D. Oppenheimer, K. Keeton, R. Thomas, J. Kubiatowicz, and D.A.
Patterson.
ISTORE: Introspective
Storage for Data-Intensive Network Services. Proceedings of the
7th Workshop on Hot Topics in Operating Systems (HotOS-VII), Rio Rico,
Arizona, March 1999. (An earlier
version appeared as University of California, Berkeley Technical Report
UCB CSD-98-1030, December 1998.)
Week 2: Applications of Reliable Services in the PostPC World
Tuesday, September 5
- Homework 1 due Tuesday, 9/5.
- Michael D. Schroeder, Andrew D. Birrell and Roger M. Needham,
Experience with Grapevine: The Growth of a Distributed System,
ACM Trans. on Computer Systems 2(1), February 1984, pp.3-23.
(Copies will be available in class.)
- Yasushi Saito, Brian N. Bershad, Henry M. Levy,
Manageability,
availability and performance in Porcupine:
a highly-scalable, cluster-based mail service. SOSP 1999.
Thursday, September 7 (Seminar by Mary Baker)
- D. Tang and M. Baker,
"Analysis of a Metropolitan-Area Wireless Network."
Proceedings of Mobicom'99, August, 1999. (Extended version forwarded to a future edition of
ACM/Baltzer Wireless Networks (WINET). The journal version
also corrects a timezone mistake in the conference version.)
Week 3: Communication
Tuesday, September 12
- Andrew Birrell, Greg Nelson, Susan Owicki, and Edward Wobber,
Network Objects, Software Practice and Experience,
25(S4):87-130, December 1995. Also appeared as DEC SRC Research
Report 115.
Thursday, September 14 (Seminar by Michael Mitzenmacher)
- Michael Mitzenmacher,
Using Multiple Hash Functions to Improve IP Lookups.
(Send mail to yelick@cs to request a copy of the paper.)
- Michael Mitzenmacher,
Accessing Multiple Mirror Sites in Parallel: Using Tornado Codes
to Speed Up Downloads, INFOCOM 99. (Note: this paper really
belongs later in the semester, but Michael will be speaking
in the Systems Seminar this week on the previous paper, and
would be happy to discuss coding in the post seminar discussion.)
Week 4: Distributed Synchronization
Tuesday, September 19
- Leslie Lamport. Time, Clocks, and the Ordering of Events in a Distributed
System, Communications of the ACM, Vol. 21, No. 7
(July 1978), pp. 558-565.
(Available outside 777 Soda.)
- Chandy and Lamport, Distributed
Snapshots: Determining the Global States of a Distributed System, ACM TOCS, pp. 63-75, Feb. 1985.
Thursday, September 21 (Seminar by David Lowell)
Week 5: Distributed Agreement
Tuesday, September 26
- Leslie Lamport, The Part-Time Parliament,
ACM Transactions on Computer Systems (TOCS) vol. 16, no. 2, May 1998, pp. 133-169.
- Butler Lampson,
How to Build a Highly Available System Using Consensus,
Distributed Algorithms, ed. Babaoglu and Marzullo, Lecture Notes in Computer
Science 1151, Springer, 1996, pp. 1-17.
- Note: The Castro and Liskov paper has been postponed (we will
still read it), in part because the previous two papers are quite long.
Thursday, September 28 (Seminar by Chandramohan Thekkath)
- Homework 2 due Thursday, 9/28.
- Thomas L. Rodeheffer, Chandramohan A. Thekkath, Darrell C. Anderson,
SmartBridge: A Scalable Bridge Architecture, SIGCOMM 2000.
- Chandramohan Thekkath, Timothy Mann, and Edward K. Lee.
Frangipani: A scalable distributed file system.
In Proceedings of the 16th ACM Symposium on Operating Systems Principles, pages 224-237. ACM Press,
October 1997.
- Edward K. Lee and Chandramohan Thekkath.
Petal: Distributed virtual disks. In Proceedings of the Seventh
International Conference on Architectural Support for Programming Languages and Operating Systems,
ASPLOS-VII, pages 84-92. ACM, October 1996.
Week 6: Reasoning about Distributed Algorithms
Tuesday, October 3
- The SPEC language
from an MIT course by Butler Lampson, Martin Rinard, Bill Weihl, and others.
Thursday, October 5 (Seminar by Brendan Murphy)
Week 7: Replication
Tuesday, October 10
- There will not be a class on October 10th.
Thursday, October 12 (Seminar by John Kubiatowicz)
- John Kubiatowicz, David Bindel, Yan Chen, Steven Czerwinski,
Patrick Eaton, Dennis Geels, Ramakrishna Gummadi,
Sean Rhea, Hakim Weatherspoon, Westley Weimer,
Chris Wells, and Ben Zhao.
OceanStore: An Architecture for
Global-Scale Persistent Storage,
Proceedings of the Ninth International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS 2000),
November 2000.
Week 8: Reasoning about Distributed Algorithms
Tuesday, October 17
- Homework 3 due Tuesday, 10/17.
- Sections 8-9 of the
SPEC language
from an MIT course by Butler Lampson, Martin Rinard, Bill Weihl, and others.
Thursday, October 19 (Seminar by Dawson Engler)
Week 9: Language Support for Reliability
Tuesday, October 24
- David L. Detlefs, K. Rustan M. Leino, Greg Nelson, Jim Saxe,
Extended Static Checking, Research Report #159, Compaq
Systems Research Center, December 1998.
Thursday, October 26 (Seminar by Jim Larus)
- Staged Server: Using Cohort Scheduling and Staged Computation
to Enhance Server Performance.
Week 10: Distributed Data Structures and Benchmarking
Tuesday, October 31
Thursday, November 2 (Seminar by Joe L. Hellerstein and Gautam Kar)
Week 11: Benchmarking and Replication
Tuesday, November 7 (Guest lecture by Aaron Brown)
Thursday, November 9 (Seminar by Jim Gray)
Week 12: Load Balancing and Paxos Revisited
Tuesday, November 14
Thursday, November 16 (Seminar by Leslie Lamport)
- Eli Gafni and Leslie Lamport, Disk Paxos, Research
Report #163, Compaq Systems Research Center, July 4, 2000.
Week 13: Resource Allocation
Tuesday, November 21
- No lecture: Individual project meetings this week for everyone doing
a final project.
Thursday, November 23
Week 14: Byzantine Agreement
Tuesday, November 28
- Michael Fischer, Nancy Lynch, and Michael Paterson,
Impossibility of Distributed Consensus with One Faulty Process,
Journal of the ACM, vol 32, no 2, 1985. (Notes: 1) You need ACM digital
library access to obtain the pdf file. If you don't have that, stop
by my office for a copy. 2) Time permitting, we will also discuss the
Castro-Liskov work.)
Thursday, November 30
- M. Castro and B. Liskov,
Practical Byzantine Fault Tolerance,
Proceedings of the Third Symposium on Operating Systems Design and
Implementation (OSDI '99), New Orleans, USA, February 1999.
(Barbara Liskov will be speaking in the Wednesday EECS Colloquium
on Nov. 1.)
Week 15: Self-Stabilizing Algorithms
Tuesday, December 5
- Marco Schneider, Self-Stabilization, UT Austin and ACM Computing
Surveys.
Thursday, December 7
Extra Papers:
- Eric Anderson and David Patterson,
A Retrospective on Twelve Years of LISA Proceedings, Proceedings
of the 13th Systems Administration Conference (LISA) '99, November
1999, Seattle, Washington.
- Subhachandra Chandra, Peter M. Chen,
Whither Generic Recovery From Application Faults? A Fault Study
using Open-Source Software, Proceedings of the 2000 International Conference on Dependable Systems and
Networks / Symposium on Fault-Tolerant Computing (FTCS), June 2000.
- Anthony D. Joseph, Alan F. deLespinasse,
Joshua A. Tauber, David K. Gifford, and M. Frans Kaashoek.
Rover: A Toolkit for Mobile Information Access,
Proceedings of the Fifteenth Symposium on Operating Systems Principles,
December 1995.
- National Research Council Report, Reducing
Disaster Losses Through Better Information, 1999.
- Design and implementation of the Lucent Personalized Web Assistant (LPWA)
D. M. Kristol, E. Gabber, P. Gibbons, Y. Matias and A. Mayer,
Bell Labs TR 1999.
- J. Ousterhout. The Role of Distributed State.
CMU Computer Science: A 25th Anniversary Commemorative.
ACM Press Anthology Series, R. Rashid (Ed.), July 1991.
- James Kistler and M. Satyanarayanan.
Disconnected Operation in the Coda File System,
ACM Trans. on Computer Systems 10(1), February 1992, pp. 3-25.
- Douglas B. Terry, Marvin M. Theimer, Karin Peterson, Alan J. Demers,
Mike J. Spreitzer, and Carl H. Hauser,
Managing Update Conflicts in Bayou, a Weakly Connected Replicated
Storage System,
Proceedings of the 15th ACM Symposium on Operating Systems Principles,
December 1995, pp. 172-183.
- Stefan Savage, Neal Cardwell, David Wetherall and Tom Anderson.
TCP Congestion Control with a Misbehaving Receiver.
ACM Computer Communication Review, v 29, no 5, October 1999.
- Savage et al., Robust Protocol Design in
Uncooperative Environments
- David E. Lowell, Peter M. Chen,
The Theory and Practice of Failure Transparency,
CSE-TR-409-99, October 1999.
- Kenneth P. Birman,
The Process Group Approach to Reliable Distributed Computing,
Communications of the ACM 36(12), December 1993, pp. 37-53.
- David Cheriton and Dale Skeen,
Understanding the Limitations of Causally and Totally Ordered
Communication, Proc. of the Symposium on Operating Systems Principles (SOSP), December 1993.
- Kenneth P. Birman, A Response
to Cheriton and Skeen's Criticism.
- Bruce Lindsay, Laura Haas, C. Mohan, Paul Wilms and Robert Yost.
Computation and Communication in R*: A Distributed Database Manager.
ACM Trans. on Computer Systems 2(1), February 1984, pp. 24-38.
- R. Ladin, B. Liskov, L. Shrira and S. Ghemawat.
Providing High Availability Using Lazy Replication.
ACM Transactions on Computer Systems, vol. 10 (4), pp. 360-391,
November 18 1992.
http://www.acm.org/pubs/citations/journals/tocs/1992-10-4/p360-ladin/
- M. R. Korupolu, C. G. Plaxton, and R. Rajaraman.
Placement algorithms for hierarchical cooperative caching.
In Proceedings of the 10th Annual ACM-SIAM Symposium on
Discrete Algorithms, Baltimore, Maryland, pages 586-595, January 1999.
- C. G. Plaxton and R. Rajaraman.
Fast fault-tolerant concurrent access
to shared objects. In Proceedings of the 37th Annual
IEEE Symposium on Foundations of Computer Science,
Burlington, Vermont, pages 570-579, October 1996.
- Ambuj Singh and Gregory Johnson, Stable
and Fault-Tolerant Resource Allocation, Principles of Distributed Computing, Portland, Oregon,
July 2000.
- P. Gibbons, J. Bruno and S. Phillips,
Post-mortem black-box correctness tests for basic parallel data structures,
SPAA'99.