CS262A Reading Summary 10

The HP AutoRAID Hierarchical Storage System

John Wilkes et al.
Summary by Feng Zhou
9/24/2002

Four key features:
  1. A two-level storage hierarchy addresses the problem that RAID systems are hard to use. Like a memory hierarchy, it combines the performance of the upper level (mirroring) with the cost-effectiveness of the lower level (RAID-5). This is valid and important because raw disk I/O performance is improving very slowly compared with areas such as CPU speed and disk capacity, so building a hierarchy is generally a good idea.
  2. The mirroring layer and the RAID-5 layer are complementary rather than inclusive: the mirroring layer is not a cache of the RAID-5 layer, so each block is stored in only one of the two layers. This improves the cost-effectiveness of the system. It can be implemented efficiently partly because bulk (sequential) disk reads and writes are very fast compared with random ones, so moving large blocks of data between the two layers does not incur much cost.
  3. Automatic background operations make the system more adaptive to different workloads. By doing compaction, migration and balancing in the background, the system can keep operating in a near-optimal state without off-line maintenance. This greatly improves overall system availability and performance, and reduces administration overhead.
  4. Using prototyping and simulation together during system design is a great idea. Prototyping validates the design as early as possible, while simulation makes it much easier and faster to explore the design space throughout the process.
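
The complementary (non-inclusive) placement in feature 2 and the migration in feature 3 can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not the paper's implementation: the class name TwoLevelStore, its methods, the fixed mirrored capacity, and the policy of demoting the least recently written block are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class TwoLevelStore:
    """Toy model: every block lives in exactly one level (hypothetical names)."""

    def __init__(self, mirrored_capacity):
        # Assumed: the mirrored level holds a fixed number of blocks.
        self.mirrored_capacity = mirrored_capacity
        self.mirrored = OrderedDict()  # block id -> data, oldest write first
        self.raid5 = {}                # block id -> data

    def write(self, block_id, data):
        # A written block is placed in (or promoted to) the mirrored level,
        # and its old RAID-5 copy, if any, is dropped: the levels never overlap.
        self.raid5.pop(block_id, None)
        self.mirrored[block_id] = data
        self.mirrored.move_to_end(block_id)
        self._demote_if_full()

    def read(self, block_id):
        # Reads are served from whichever level holds the block; no copy is made.
        if block_id in self.mirrored:
            return self.mirrored[block_id]
        return self.raid5[block_id]

    def level_of(self, block_id):
        if block_id in self.mirrored:
            return "mirrored"
        if block_id in self.raid5:
            return "raid5"
        return None

    def _demote_if_full(self):
        # Migration (assumed policy): move the least recently written
        # blocks down to RAID-5 until the mirrored level fits again.
        while len(self.mirrored) > self.mirrored_capacity:
            old_id, old_data = self.mirrored.popitem(last=False)
            self.raid5[old_id] = old_data


store = TwoLevelStore(mirrored_capacity=2)
for block_id in ("a", "b", "c"):
    store.write(block_id, b"data")
print(store.level_of("a"), store.level_of("c"))
```

In this sketch the demotion runs inline on every write to keep the code short; a real system would do that work in the background, which is exactly the part the summary's criticism is about.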

One major flaw:

The paper does not evaluate the timeliness and effectiveness of the background operations, i.e., how long compaction, migration and balancing take. Nor does it show how much of this bookkeeping work can actually be done in the background under various workloads, which may affect overall performance considerably.
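
The kind of estimate this criticism asks for can be sketched as a back-of-envelope calculation. The function below is a hypothetical illustration, not something taken from the paper: it assumes each migrated byte is read once and written once, ignores redundancy writes and seek costs, and assumes background work gets only the idle fraction of the disk bandwidth. The numbers in the usage line are made up for illustration and are not measurements.

```python
def background_time_seconds(bytes_to_move, bandwidth_bytes_per_s, idle_fraction):
    """Rough time to finish background data movement using only idle bandwidth.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    if not 0 < idle_fraction <= 1:
        raise ValueError("idle_fraction must be in (0, 1]")
    # Assumed: every byte is read from one level and written to the other.
    traffic = 2 * bytes_to_move
    return traffic / (bandwidth_bytes_per_s * idle_fraction)


# Hypothetical numbers: 1 GB to migrate, 10 MB/s of bandwidth, 25% idle time.
print(background_time_seconds(1_000_000_000, 10_000_000, 0.25))
```

Even this crude model shows why the question matters: as the idle fraction shrinks under a heavy workload, the time to finish background work grows without bound, so the backlog may never be cleared.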