CS262B Reading Summary
Scalable, Distributed Data Structures for
Internet Service Construction
Steven D. Gribble et al.
Summary by Feng Zhou
2/1/2004
Strong points of the paper are:
- Even today, a flexible global data management facility is a
much-needed service for cluster-based Internet services. Neither
databases nor (distributed) file systems are a good fit for this.
Databases don't scale, and a centralized one offers poor
availability. File systems provide poor atomicity and consistency
support. Distributed data structures are therefore a good
candidate for this use.
- The consistency strategy (optimistic 2-phase commit) is a good
choice for the cluster environment. For example, when the manager
crashes, the replicas talk to each other to agree on committing or
aborting instead of waiting for the manager to recover. This
exploits the fact that all replicas are on the same LAN and
connected. Another useful decision is to remove a replica from the
replica group when it crashes, instead of waiting for it to recover.
This increases availability without losing data consistency.
- The technique used to maintain consistency of metadata maps is
useful. The DDS library piggybacks hashes of its maps on every
command it sends to the bricks, which verify that the maps are
up to date before carrying out any operation. This means the DDS
library instances do not need to maintain an up-to-date version of
the maps and do not need to be notified of every update to them.
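
The map-hash check in the last point can be sketched roughly as
follows. This is only an illustration, not the paper's
implementation: the class names (Brick, Library), the JSON-plus-SHA-256
encoding of the map, and the choice to reject a stale command and
return the current map are all assumptions made for this example.

```python
import hashlib
import json


def map_hash(partition_map):
    """Digest of a metadata map (assumed encoding: key-sorted JSON)."""
    encoded = json.dumps(partition_map, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Brick:
    """Hypothetical storage node that holds the current map."""

    def __init__(self, partition_map):
        self.partition_map = dict(partition_map)
        self.store = {}

    def handle_put(self, key, value, client_map_hash):
        # Verify the piggybacked hash before carrying out the operation.
        if client_map_hash != map_hash(self.partition_map):
            # Assumption: a stale caller gets the current map back.
            return {"ok": False, "current_map": dict(self.partition_map)}
        self.store[key] = value
        return {"ok": True}


class Library:
    """Hypothetical DDS library instance with a possibly stale map."""

    def __init__(self, partition_map):
        self.partition_map = dict(partition_map)

    def put(self, brick, key, value):
        reply = brick.handle_put(key, value, map_hash(self.partition_map))
        if not reply["ok"]:
            # Repair the local map lazily, then retry once.
            self.partition_map = reply["current_map"]
            reply = brick.handle_put(key, value, map_hash(self.partition_map))
        return reply["ok"]
```

In this sketch the library learns about a map change only when one
of its commands is rejected, so no update notifications are needed.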
One major flaw:
DDS does not provide transaction support. This limits its use to
non-mission-critical applications; otherwise the applications must
provide ACID support themselves, which is a daunting job for
application developers.