CS262B Reading Summary

Mariposa: a Wide-area Distributed Database System

Michael Stonebraker et al.

Summary by Feng Zhou
4/14/04

Strong points of the paper are:

  1. This paper introduced decentralized control into distributed database systems, in particular into the query processing subsystem. Using the economic model, Mariposa distributes administrative control between the query processor and the participating sites. The site issuing the query (the query processor) uses bids from other sites to optimize the distributed query plan, while the participating sites have local bidding policies that reflect the way the local administrator wants the node to behave in the whole system.
  2. The bidding process is the central idea in the economic model. One input to the process is the budget of the query issuer, which specifies the QoS the client needs. Bids are then collected from various sites; they represent the costs of executing fragments of the query at these sites. Although traditional distributed query optimizers also consider local query processing costs, the key difference is that bids are computed solely by the local sites, and are thus controlled locally. The sites therefore have the right to say no to bid requests. The bids reflect local processing power, current load and any extra cost (e.g. fetching missing data from other nodes). Based on the bids from the various sites, the query processor generates the least expensive plan that satisfies the QoS requirements (total processing time).
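
The bidding process described in point 2 can be illustrated with a small sketch. This is not Mariposa's actual algorithm or API: the names Bid, local_bid and choose_plan, the load-based pricing rule, and the assumption that fragments run one after another (so delays add up) are all hypothetical simplifications made for illustration. Each site computes its own bid and may decline, and the query processor picks the cheapest combination of bids that fits the issuer's time limit and budget.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Bid:
    site: str
    fragment: str
    price: float  # what the site charges to run the fragment
    delay: float  # processing time the site promises


def local_bid(site: str, fragment: str, base_cost: float,
              load: float, accepts: bool = True) -> Optional[Bid]:
    """Hypothetical local bidding policy: the site alone decides the bid.

    Returning None models a site that says no to the bid request.
    Price and delay both grow with the current load (an assumed rule).
    """
    if not accepts:
        return None
    factor = 1.0 + load
    return Bid(site, fragment, price=base_cost * factor, delay=base_cost * factor)


def choose_plan(bids_by_fragment: Dict[str, List[Bid]],
                max_delay: float, budget: float) -> Optional[Tuple[Bid, ...]]:
    """Pick one bid per fragment, minimizing total price.

    Assumes fragments execute sequentially, so total delay is a sum.
    Exhaustive search; only suitable for a handful of fragments.
    """
    best, best_price = None, None
    for combo in product(*bids_by_fragment.values()):
        total_price = sum(b.price for b in combo)
        total_delay = sum(b.delay for b in combo)
        if total_delay > max_delay or total_price > budget:
            continue  # violates the issuer's QoS or budget
        if best_price is None or total_price < best_price:
            best, best_price = combo, total_price
    return best  # None if no combination is acceptable


bids = {
    "scan": [Bid("A", "scan", 5.0, 4.0), Bid("B", "scan", 7.0, 1.0)],
    "join": [Bid("A", "join", 6.0, 5.0), Bid("C", "join", 9.0, 2.0)],
}
plan = choose_plan(bids, max_delay=6.0, budget=20.0)
```

In this example the cheapest pair overall (site A for both fragments) is rejected because it takes too long, so the faster but more expensive scan at site B is chosen. The exhaustive search is only there to keep the sketch short; it says nothing about how the paper's query processor searches for a plan.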
One major flaw:
The paper does not mention any guards against malicious nodes (clients or database sites), although the micro-economic model should be a good paradigm in which to implement such guarding schemes.