CS262B Reading Summary

A Case for NOW (Networks of Workstations)

Thomas E. Anderson, David E. Culler, David A. Patterson, and the NOW team

Feng Zhou
1/28/2004

Strong points of the paper are:

  1. It gives a good argument for why we need NOW, and also why massively parallel processors (MPPs) must fail.  The most important reason is the volume effect of commodity hardware components, commodity operating systems, and commodity applications.  Given this, a network of cheap workstations has the potential to provide a much better price-performance ratio while still satisfying the raw-performance needs of most applications.  How to make NOW actually work for real applications, however, is where most of the research problems lie.
  2. The paper states good reasons why NOW had become a viable solution for large problems at the time the paper was written.  Essentially, both the processing power of commodity PCs and the speed of switched networks had reached the critical mass needed to make NOW work.  For processing power, the drivers were fast RISC processors and the rapidly improving, cheap Intel processors.  For networks, this was about the first time that latency on fast networks such as ATM and Myrinet had dropped below disk latency.  This makes using the storage of other machines on the same network profitable compared with using local disks.
  3. The paper identifies several key research topics in NOW.  First, low-overhead communication is key to achieving good scalability.  At that time, although network bandwidth kept increasing, OS network stack implementations and traditional I/O interfaces imposed unacceptable latency on communication.  This topic was important and will remain important as long as networking hardware keeps improving and applications need more scalability.  However, the other goals, providing a single system image over the whole cluster and a scalable serverless file system, appear to be harder to achieve.
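
To make the latency argument in point 2 concrete, here is a minimal sketch that compares fetching one page from another machine over the network with reading it from a local disk.  The cost model (fixed latency plus size divided by bandwidth) and every number below are illustrative assumptions, not figures taken from the paper.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_mb_per_s):
    """Time to move size_bytes: fixed latency plus size / bandwidth.

    With 1 MB = 10**6 bytes, 1 MB/s equals 1 byte per microsecond,
    so size_bytes / bandwidth_mb_per_s is already in microseconds.
    """
    return latency_us + size_bytes / bandwidth_mb_per_s


PAGE_BYTES = 8192  # hypothetical page size

# Hypothetical fast switched network: low latency, moderate bandwidth.
remote_us = transfer_time_us(PAGE_BYTES, latency_us=100.0, bandwidth_mb_per_s=20.0)

# Hypothetical local disk: seek and rotation dominate the fixed latency.
disk_us = transfer_time_us(PAGE_BYTES, latency_us=10_000.0, bandwidth_mb_per_s=5.0)

# Going to another machine pays off only while it is cheaper than the disk.
remote_is_profitable = remote_us < disk_us

if __name__ == "__main__":
    print(f"remote fetch: {remote_us:.1f} us")
    print(f"local disk:   {disk_us:.1f} us")
    print(f"remote profitable: {remote_is_profitable}")
```

Under these assumed parameters the fixed latency term dominates for page-sized transfers, which is why the argument hinges on network latency, rather than bandwidth, falling below disk latency.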

One major flaw:

It is rather unconvincing to conclude from one user study at Berkeley that most of the machines in a cluster are idle even during working hours.  This seems to depend heavily on the usage patterns of the users and on how many users share a given cluster.  On the other hand, the trend of PCs getting cheaper and cheaper simply results in everyone having a separate computer and in enough computers being dedicated solely to computation.  Therefore, migrating work to people's PCs becomes less and less interesting.