CS262B Reading Summary
Flexible Update Propagation for Weakly Consistent
Replication
Karin Petersen et al.
Summary by Feng Zhou
3/3/2004
Strong points of the paper are:
- The version vector is the key data structure for tracking updates.
At each server, a vector records the newest update version received
from each server in the system. The prefix property states that
all updates prior to the one recorded in the vector are guaranteed to
have been incorporated locally already. Therefore, only one version
number needs to be recorded for each server. When propagating changes
from S to R, the two version vectors are compared and only the delta
updates are sent from S to R.
- Write stabilization is important because it guarantees that an update
will never need to be reapplied *locally*. So it is safe for the local
site to apply the update to the DB and discard it. However, this does not
mean other sites will never need the update info again, so a judicious
decision needs to be made about when to discard stable writes. The later
they are discarded, the less likely other sites will need full database transfers.
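
The version-vector comparison described in the first point can be sketched
as follows. This is a minimal illustration, not the paper's protocol: the
names Write, Replica, delta and propagate are hypothetical, and accept stamps
are simplified to per-server counters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Write:
    server: str   # server that first accepted the write
    stamp: int    # accept stamp; simplified here to a per-server counter
    payload: str


@dataclass
class Replica:
    name: str
    log: list = field(default_factory=list)
    # server name -> highest accept stamp incorporated from that server
    vector: dict = field(default_factory=dict)

    def accept(self, payload):
        """Accept a new client write at this replica."""
        write = Write(self.name, self.vector.get(self.name, 0) + 1, payload)
        self.incorporate(write)
        return write

    def incorporate(self, write):
        """Append a write to the log and advance the version vector."""
        self.log.append(write)
        self.vector[write.server] = write.stamp


def delta(sender, receiver_vector):
    """Writes the sender holds that the receiver's vector does not cover."""
    return [w for w in sender.log
            if w.stamp > receiver_vector.get(w.server, 0)]


def propagate(sender, receiver):
    """One-way propagation from sender to receiver; returns what was sent."""
    missing = delta(sender, receiver.vector)
    for w in missing:  # sender's log order keeps each server's writes in order
        receiver.incorporate(w)
    return missing
```

Because of the prefix property, a single number per server is enough for the
sender to decide which writes are missing; a second call to propagate with no
new writes in between sends nothing.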
One major flaw.
The correctness of the Bayou protocol depends on a couple of
important assumptions that the authors do not make clear in the
paper. For example, one crucial assumption is that reordering
concurrent updates, whether conflicting or non-conflicting, will result
in the same updates to the database. This mandates "perfect"
conflict-resolving methods, which seem hard to find for a lot of
applications. Without this, the system will need some way of preventing
inconsistent data resulting from different conflict-resolving outcomes.
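
To make the concern concrete, here is a toy sketch (all names hypothetical,
not taken from the paper) of a resolver whose outcome depends on the order in
which two concurrent writes are applied. Replaying the writes in one agreed
order, here sorted by (stamp, server), is one possible way to keep replicas
from diverging; it is shown only as an illustration.

```python
def apply_bookings(writes):
    """Apply booking requests in the given order; the first one applied wins.

    Each write is a tuple (stamp, server, requester). This resolver is
    deliberately order-dependent.
    """
    booked_by = None
    for _stamp, _server, requester in writes:
        if booked_by is None:
            booked_by = requester
    return booked_by


# Two concurrent writes accepted at different servers with equal stamps.
w1 = (1, "A", "alice")
w2 = (1, "B", "bob")

# Applying in arrival order: the two replicas end up disagreeing.
outcome_at_x = apply_bookings([w1, w2])
outcome_at_y = apply_bookings([w2, w1])

# Replaying in an agreed total order: both replicas reach the same result.
agreed_at_x = apply_bookings(sorted([w1, w2]))
agreed_at_y = apply_bookings(sorted([w2, w1]))
```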
A summary table or diagram explaining the multitude of stamps and
variables in the system would help the reader a lot.
The paper also didn't discuss when to truncate committed logs.
This is an important question. A systematic method will be necessary if
predictable behavior is needed.
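
As an example of what a systematic method might look like, the sketch below
keeps only a fixed number of committed writes and falls back to a full
database transfer when a receiver is too far behind. The policy and the names
CommittedLog, max_kept and sync_plan are assumptions made for illustration,
not something the paper prescribes.

```python
class CommittedLog:
    """Committed writes with a simple keep-the-last-N truncation policy."""

    def __init__(self, max_kept):
        self.max_kept = max_kept
        self.writes = []   # committed writes still held, oldest first
        self.omitted = 0   # how many committed writes were already discarded

    def commit(self, payload):
        """Record a committed write, then truncate the oldest if over budget."""
        self.writes.append(payload)
        overflow = len(self.writes) - self.max_kept
        if overflow > 0:
            del self.writes[:overflow]
            self.omitted += overflow

    def sync_plan(self, receiver_committed):
        """Decide how to bring a receiver up to date.

        receiver_committed is the number of committed writes the receiver
        already holds, counted from the start of the commit order.
        """
        if receiver_committed < self.omitted:
            # Some writes the receiver needs have been discarded.
            return ("full_transfer", [])
        return ("incremental", self.writes[receiver_committed - self.omitted:])
```

A larger max_kept makes full transfers rarer at the cost of log storage,
which is the trade-off noted above for discarding stable writes.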