Advanced Topics in Computer Systems  
Joe Hellerstein & Eric Brewer

Concurrency Control: Alternate Realities

Optimistic Concurrency Control (Kung & Robinson)

Attractive, simple idea: optimize case where conflict is rare.

Basic idea: all transactions consist of three phases:

  1. Read. Here, all writes are to private storage (shadow copies).
  2. Validation. Make sure no conflicts have occurred.
  3. Write. If Validation was successful, make writes public. (If not, abort!)
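The three phases above can be sketched in code. This is a minimal illustration of the phase structure, not the paper's implementation; all names are ours, and validation here only checks read sets against the write sets of concurrently committed transactions.

```python
# Minimal sketch of the three-phase structure (illustrative names, not
# from Kung & Robinson). Writes go to a private buffer during the read
# phase and become public only after validation succeeds.

class OCCTransaction:
    def __init__(self, db):
        self.db = db              # shared store: key -> value
        self.read_set = set()
        self.write_set = {}       # private shadow copies

    # Read phase: reads see our own shadow copy first, else the DB;
    # writes are buffered privately.
    def read(self, key):
        self.read_set.add(key)
        return self.write_set.get(key, self.db.get(key))

    def write(self, key, value):
        self.write_set[key] = value

    # Validation phase: no committed writer may have touched our reads.
    def validate(self, committed_write_sets):
        return all(self.read_set.isdisjoint(ws)
                   for ws in committed_write_sets)

    # Write phase: publish shadow copies, or abort.
    def commit(self, committed_write_sets):
        if not self.validate(committed_write_sets):
            return False          # abort; caller may retry
        self.db.update(self.write_set)
        return True
```

On abort, nothing has touched the shared store, so retry is just re-running the transaction.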
When might this make sense? Three examples:
  1. All transactions are readers.
  2. Lots of transactions, each accessing/modifying only a small amount of data, large total amount of data.
  3. Fraction of transaction execution in which conflicts "really take place" is small compared to total pathlength.
The Validation Phase
  1. Assign each transaction a TN during execution.
  2. Ensure that if you run transactions in order induced by "<" on TNs, you get an equivalent serial schedule.
Suppose TN(Ti) < TN(Tj). Then if one of the following three conditions holds, it’s serializable:
    1. Ti completes its write phase before Tj starts its read phase.
    2. WS(Ti) intersect RS(Tj) = emptyset and Ti completes its write phase before Tj starts its write phase.
    3. WS(Ti) intersect RS(Tj) = WS(Ti) intersect WS(Tj) = emptyset and Ti completes its read phase before Tj completes its read phase.
Is this correct? Each condition guarantees that the 3 possible classes of conflicts (W-R, R-W, W-W) on the 2 orderings (i before j, j before i) go in one order only: i before j.  There are 3x2=6 possible conflict orderings to consider.
  1. For condition 1, all conflicts are ordered i before j (true serial execution!)
  2. For condition 2, W-R conflicts cannot occur (WS(Ti) intersect RS(Tj) is empty), W-W conflicts go i before j (Ti finishes its write phase before Tj starts its write phase), and R-W conflicts go i before j (Ti's reads precede its writes, which precede all of Tj's writes).
  3. For condition 3, W-R and W-W conflicts cannot occur (both intersections are empty), and R-W conflicts go i before j (Ti's reads all finish before Ti's read phase ends, hence before Tj's writes, which begin only after Tj's read phase ends).
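As a sketch, the three conditions can be expressed as a predicate over read/write sets and phase boundaries. The dict encoding (keys like "read_end", "write_start") is our own illustrative representation, not anything from the paper.

```python
# Hypothetical encoding of the validation conditions for TN(Ti) < TN(Tj).
# Each transaction is a dict with read/write sets and phase-boundary
# "times" (any totally ordered values). Returns True iff at least one
# condition holds, i.e. the pair serializes in TN order.

def serializable(ti, tj):
    # Condition 1: Ti finishes writing before Tj starts reading.
    if ti["write_end"] < tj["read_start"]:
        return True
    # Condition 2: Ti's writes miss Tj's reads, and Ti finishes its
    # write phase before Tj starts its write phase.
    if ti["WS"].isdisjoint(tj["RS"]) and ti["write_end"] < tj["write_start"]:
        return True
    # Condition 3: Ti's writes miss both Tj's reads and Tj's writes,
    # and Ti finishes its read phase before Tj finishes its read phase.
    if (ti["WS"].isdisjoint(tj["RS"]) and ti["WS"].isdisjoint(tj["WS"])
            and ti["read_end"] < tj["read_end"]):
        return True
    return False
```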
Assigning TNs: assigning them at the beginning of transactions is not optimistic, since a transaction might not be able to validate immediately if its predecessor transactions were still running; this smells like locking.  Instead, assign TNs at the end of the read phase. Note: this satisfies the second half of condition (3).

Note: a transaction T with a very long read phase must check write sets of all transactions begun and finished while T was active.  This could require unbounded buffer space.
Solution: bound buffer space, toss out when full, abort transactions that could be affected.

Serial Validation

Only checks conditions (1) and (2), since writes are not going to be interleaved.

Simple technique: make a critical section around <get xactno; validate (1) or (2) for everybody from your start to finish; write>. Not great if: write phases are long (big write sets), or there are multiple CPUs that could otherwise be validating in parallel.

Improvement to speed up validation: do most of the validation outside the critical section, against transactions that have already finished.

repeat as often as you want {
    <t = current highest xactno>
    validate against all xacts finished up through t (no critical section);
}

<get xactno; validate with new xacts (those finished since t); write>.

Note: read-only xacts don’t need to get xactnos! Just need to validate up to highest xactno at end of read phase (without critical section!)
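The serial-validation scheme, with the pre-validation optimization, might look like the following sketch. The structure and names are ours; the critical section corresponds to <get xactno; validate; write>.

```python
import threading

# Sketch of serial validation (illustrative, not the paper's code).
# A global counter hands out xactnos inside the critical section;
# finished[tn] records the write set of each committed xact.

tn_lock = threading.Lock()
last_tn = 0
finished = {}                 # TN -> frozenset of keys written

def commit(start_tn, read_set, write_set, db):
    """write_set is a dict key -> value; start_tn is the highest TN
    already finished when this xact began its read phase."""
    global last_tn
    # Optimization from the notes: validate against already-finished
    # xacts outside the critical section first.
    seen = last_tn
    for tn in range(start_tn + 1, seen + 1):
        if not read_set.isdisjoint(finished[tn]):
            return False                        # abort
    with tn_lock:                # <get xactno; validate with new xacts; write>
        for tn in range(seen + 1, last_tn + 1):
            if not read_set.isdisjoint(finished[tn]):
                return False
        last_tn += 1
        finished[last_tn] = frozenset(write_set)
        db.update(write_set)
    return True
```

A read-only transaction would skip the critical section entirely: it just validates against everything up to the highest xactno at the end of its read phase and never increments the counter.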

Parallel Validation

Want to allow interleaved writes.
Need to be able to check condition (3). Small critical section.
Problems: a transaction may be validated against a concurrent transaction in the active set that itself later aborts, causing unnecessary restarts; and the (small) critical section is still a global synchronization point.
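A sketch of how the small critical section might work (simplified; names are ours): snapshot the write sets of transactions currently in their write phase and register our own, then validate and write outside the lock, so write phases can interleave.

```python
import threading

# Sketch of parallel validation (simplified, illustrative). Only the
# snapshot/register and unregister steps hold the lock.

lock = threading.Lock()
finished_write_sets = []   # write sets of committed xacts, in commit order
active = {}                # tid -> write set of xacts validating/writing now

def commit(tid, start_idx, read_set, write_set, db):
    """start_idx: len(finished_write_sets) when this xact started."""
    with lock:                                  # small critical section
        others = [ws for t, ws in active.items() if t != tid]
        snapshot = finished_write_sets[start_idx:]
        active[tid] = frozenset(write_set)
    # Condition (2) against xacts that finished while we ran:
    ok = all(read_set.isdisjoint(ws) for ws in snapshot)
    # Condition (3) against concurrent writers: their writes must miss
    # both our read set and our write set.
    ok = ok and all(read_set.isdisjoint(ws) and ws.isdisjoint(write_set)
                    for ws in others)
    if ok:
        db.update(write_set)                    # interleaved write phase
    with lock:
        del active[tid]
        if ok:
            finished_write_sets.append(frozenset(write_set))
    return ok
```

Note how the first problem above shows up here: a transaction in `active` that later aborts can still cause us to fail validation.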

One More Concurrency Control Technique

Time Stamping: Bernstein TODS ’79
Each record carries two timestamps: a Write TS and a Read TS.
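The read/write rules of basic timestamp ordering can be sketched as follows. This is the standard T/O scheme, not Bernstein's exact formulation, and the names are ours.

```python
# Basic timestamp-ordering rules (sketch). Each record keeps a read TS
# and a write TS; an operation arriving "too late" in timestamp order
# forces its transaction to abort.

class Abort(Exception):
    pass

class Record:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0       # highest TS that has read this record
        self.write_ts = 0      # TS of the last write

def read(txn_ts, rec):
    if txn_ts < rec.write_ts:
        raise Abort()          # a younger xact already wrote this record
    rec.read_ts = max(rec.read_ts, txn_ts)
    return rec.value

def write(txn_ts, rec, value):
    if txn_ts < rec.read_ts or txn_ts < rec.write_ts:
        raise Abort()          # a younger xact already read or wrote it
    rec.value, rec.write_ts = value, txn_ts
```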

  Problems:
  1. forces time-stamp order (tighter restriction than other schemes)
  2. cascaded aborts (no isolation)
  3. I/O cost even on "clean" pages
Multi-version timestamping techniques: Timestamping is not dead, but it is not popular, either.  Note that it wasn't used in Postgres (which did keep versions).

Performance Study: Locking vs. Optimistic

Agrawal/Carey/Livny

Previous work had conflicting results.

Goal of this paper:

Methodology:


Measurements


Experiment 1: Infinite Resources


Experiment 2: Limited Resources (1 CPU, 2 disks)

Experiment 3: Multiple Resources (5, 10, 25, 50 CPUs, 2 disks each)

Experiment 4: Interactive Workloads

Add user think time.

Questioning 2 assumptions:

Conclusions