CS262B Reading Summary
A Case for NOW (Networks of Workstations)
Thomas E. Anderson, David E. Culler, David A. Patterson, and the
NOW team
Feng Zhou
1/28/2004
Strong points of the paper are:
- It gives a good argument for why we need NOW, and also why MPPs
must fail. The most important reason is the volume effect of
commodity hardware components, commodity operating systems, and
commodity applications. Given this, a connected network of cheap
workstations has the potential to provide a much better
price-performance ratio while satisfying the raw-performance needs
of most applications. How to make NOW actually work for real
applications, however, is where most of the research problems lie.
- The paper states good reasons why NOW had become a viable
solution for large problems at the time the paper was written.
Essentially, both the processing power of commodity PCs and the
speed of switched networks had reached the critical mass needed to
make NOW work. For processing power, the drivers were the
development of fast RISC processors and the rapid improvement of
cheap Intel processors. For networks, this was about the first time
that latency on fast networks such as ATM and Myrinet had become
smaller than disk latency. This makes using the storage of other
machines on the same network profitable compared to using local disks.
- The paper identifies several key research topics in NOW.
First, low-overhead communication is key to achieving good
scalability. At that time, although network bandwidth kept
increasing, OS network stack implementations and traditional I/O
interfaces imposed unacceptable latency on communication. This
topic was important and will remain important as long as
networking hardware keeps improving and applications need more
scalability. However, the other goals, providing a single system
image over the whole cluster and a scalable serverless file system,
appear to be harder to achieve.
One major flaw:
It is rather unconvincing to conclude from one user study at
Berkeley that most of a cluster is idle even during working hours.
This seems to depend heavily on the usage patterns of the users
and on how many users share a given cluster. On the other hand,
the trend of PCs getting cheaper and cheaper simply results in
everyone having a separate computer, with enough additional
computers dedicated solely to computation. Therefore, migrating
work to people's PCs becomes less and less interesting.