CS262B Reading Summary

Transactional Client-Server Cache Consistency: Alternatives and Performance

Michael J. Franklin et al.

Summary by Feng Zhou
3/1/2004

Strong points of the paper are:

  1. Transactional cache consistency algorithms are closely related to, but quite different from, distributed shared memory cache consistency algorithms. Although the ACID semantics place certain constraints on the algorithms, transactions also provide opportunities for better efficiency than ordinary memory consistency algorithms. These opportunities arise mainly because operations in a transaction can be aborted if anything goes wrong, which is not possible in memory systems. Transactional consistency algorithms can therefore be more optimistic, possibly delaying or combining communications, and thus be more efficient.
  2. The choice of Invalid Access Prevention, either avoidance or detection, is a key dimension of the taxonomy presented in the paper. Avoidance-based algorithms ensure that all cached data is valid, so no validation is required before a transaction reads cached data. Detection-based algorithms allow stale data in local caches and must validate the data a transaction has used no later than when that transaction commits. This dimension is orthogonal to whether the algorithm is pessimistic or optimistic, i.e. either kind can be either pessimistic or optimistic.
  3. Three families of algorithms are evaluated: Server-based 2-phase locking (S2PL), Callback Locking (CBL), and Optimistic 2PL (O2PL). S2PL verifies the validity of pages with the server before they are read by the transaction. In CBL, by contrast, the server maintains callbacks to all clients holding copies of a page and notifies them of updates. It is therefore very much like AFS and is an avoidance scheme; the variants of the algorithm differ in when write intentions are declared. O2PL algorithms are avoidance-based ones in which write intention declaration is delayed until transaction commit time.
  4. The performance evaluation is done by simulation. First a performance model is presented and implemented in a simulation language. Then a couple of characteristic workloads are constructed to match different database applications. In general this is a good strategy for complex systems and algorithms.
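
The avoidance/detection distinction in point 2 can be illustrated with a toy sketch. The Python below is not taken from the paper: the class and method names (Server, AvoidanceClient, DetectionClient, commit, and so on) are hypothetical, and the sketch assumes a single page server, per-page version numbers, and no locking, concurrency, or message costs. It only shows where staleness is handled: by server callbacks before an update becomes visible (avoidance), or by a version check at commit (detection).

```python
class Server:
    """Toy page server (hypothetical): versioned pages plus a copy table."""

    def __init__(self):
        self.pages = {}    # page_id -> (version, value)
        self.copies = {}   # page_id -> set of avoidance clients caching it

    def read(self, page_id, client=None):
        # Register a callback only for clients that ask for one.
        if client is not None:
            self.copies.setdefault(page_id, set()).add(client)
        return self.pages[page_id]

    def write(self, page_id, value):
        version = self.pages.get(page_id, (0, None))[0] + 1
        # Avoidance: call back every registered copy before installing the update.
        for client in self.copies.pop(page_id, set()):
            client.invalidate(page_id)
        self.pages[page_id] = (version, value)

    def version(self, page_id):
        return self.pages[page_id][0]


class AvoidanceClient:
    """Cache never holds stale pages, so reads need no validation."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def invalidate(self, page_id):
        self.cache.pop(page_id, None)

    def read(self, page_id):
        if page_id not in self.cache:
            self.cache[page_id] = self.server.read(page_id, client=self)
        return self.cache[page_id][1]


class DetectionClient:
    """Cache may hold stale pages; the read set is validated at commit."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.read_set = {}  # page_id -> version seen by the current transaction

    def read(self, page_id):
        if page_id not in self.cache:
            self.cache[page_id] = self.server.read(page_id)
        version, value = self.cache[page_id]
        self.read_set[page_id] = version
        return value

    def commit(self):
        stale = [p for p, v in self.read_set.items()
                 if self.server.version(p) != v]
        for p in stale:
            self.cache.pop(p, None)  # drop stale copies so a retry refetches
        self.read_set = {}
        return not stale             # False means the transaction must abort
```

In this sketch, after another party updates a page, an AvoidanceClient refetches it on the next read, while a DetectionClient may still read the old value and only learns of the problem when commit() returns False. That abort-and-retry path is the kind of optimism point 1 attributes to the transactional setting.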