CS298-1 System Seminar

Fall 1998

Thursdays, 3:30-5, 306 Soda Hall

Course Control #25169 (directions to Soda)

The UC Berkeley Systems Seminar highlights exciting developments in Computer Architecture, Operating Systems, Networking, and related areas. It returns to its traditional Thursday afternoon timeslot, with talks beginning at 3:30 and refreshments served at 4:30 pm. Graduate students may enroll for one unit of credit.

Current Schedule

Each entry lists the date, speaker and affiliation, talk title, related URLs, and UCB host.

9/10/98: Bill Joy and Jon Bostrom, Sun Microsystems. Distributed Computing with Jini. Related: Jini; White Papers. Host: Patterson.
9/17/98: Eugene N. Miya, NASA Ames. A Methodology for Calibrating the Cray Research Hardware Performance Monitor. Host: Remzi Arpaci-Dusseau.
9/24/98: Shmuel Shottan, Meridian Data. The Snap!Server: A leading Plug & Play Network Attached Storage Server. Related: Snap!Server info. Host: Patterson.
10/1/98: Anurag Acharya, Department of Computer Science, UCSB. Active Disks: Programming Model, Algorithms and Evaluation. Related: the paper "Active Disks: Programming Model, Algorithms and Evaluation". Host: Kim Keeton.
10/8/98: Erik Riedel, School of Computer Science, CMU. Active Disks - Remote Execution for Network-Attached Storage. Related: the paper "Active Disks - Remote Execution for Network-Attached Storage". Host: Patterson.
10/15/98: Robert D. Blumofe, The University of Texas at Austin. Thread Scheduling for Multiprogrammed Multiprocessors. Related: Multiprogramming Multiprocessors Papers. Host: Culler.
10/22/98: Dave Anderson, Seagate. Future Storage Architecture: Storage Networking. Related: Seagate Disc Home. Host: Patterson.
10/29/98: Ed Grochowski, Almaden Storage Systems and Technology, IBM. Magnetic Hard Disk Drives: Advances Through The Year 2000 And Beyond. Related: IBM leadership in disk storage technology. Host: Patterson.
11/5/98: David Zager, Avesta Technologies, Inc. A model-based approach to enterprise service management. Related: Avesta Technologies. Host: Eric Anderson.
11/12/98: Steve Kleiman, Network Appliance. Filer Technology and Appliance Philosophy. Related: Network Appliance - Architecture. Host: Patterson.
11/19/98: Donald A. Norman, Nielsen Norman Group. The Invisible Computer: Why the computer industry doesn't work the way you think it should. Related: the book "The Invisible Computer". Host: Canny; joint with the Human-Centered Computing Seminar.

Last updated on 11/10/98 by David A. Patterson (patterson@cs.Berkeley.edu)


Upcoming Talks


Filer Technology and Appliance Philosophy (11/12)

Steve Kleiman
Chief Architect
Network Appliance
srk@netapp.com

Network Appliance's Filers (file server appliances) combine an innovative file system design with an "appliance"-oriented software base to produce a line of fast, simple, and reliable file servers. This talk
will review the details of the WAFL (Write Anywhere File Layout) file system and show how the properties of the file system are used not only to make easy-to-use file servers, but also to simplify high-availability
architecture and wide area data management. The appliance philosophy is also used in the design of the filer hardware, which leverages commercial off-the-shelf components, including a NUMA interconnect.
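
As a rough illustration of the write-anywhere idea behind WAFL (a minimal sketch with invented names, not NetApp's actual design), a file system that never overwrites a block in place can publish each update by writing new blocks and switching a root pointer, so any saved root remains a consistent snapshot:

# Minimal sketch of a "write anywhere" (copy-on-write) block layout.
# Hypothetical illustration only -- not the actual WAFL design.

class Block:
    def __init__(self, data):
        self.data = data

class WriteAnywhereFS:
    def __init__(self):
        self.blocks = {}          # block number -> Block (never overwritten)
        self.next_free = 0
        self.root = {}            # current root: file name -> block number
        self.snapshots = []       # frozen copies of old roots

    def _allocate(self, data):
        """Always write into a fresh block; old blocks stay valid."""
        bno = self.next_free
        self.next_free += 1
        self.blocks[bno] = Block(data)
        return bno

    def write_file(self, name, data):
        new_root = dict(self.root)        # copy-on-write of the root map
        new_root[name] = self._allocate(data)
        self.root = new_root              # atomically publish the new root

    def read_file(self, name, root=None):
        root = self.root if root is None else root
        return self.blocks[root[name]].data

    def snapshot(self):
        """A snapshot is just a saved root; no data is copied."""
        self.snapshots.append(self.root)
        return len(self.snapshots) - 1

fs = WriteAnywhereFS()
fs.write_file("a", "v1")
snap = fs.snapshot()
fs.write_file("a", "v2")
print(fs.read_file("a"))                          # v2 (current)
print(fs.read_file("a", fs.snapshots[snap]))      # v1 (snapshot still consistent)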


The Invisible Computer: Why the computer industry doesn't work the way you think it should. (11/19)

Donald A. Norman
Nielsen Norman Group

Thomas Edison was a great inventor, but a crappy businessman. Consider the phonograph. Edison was first (he invented it), he had the best technology, and he did a brilliant, logical analysis of the business. As a result, he built a technology-centered phonograph that failed to take into account his customers' needs. In the end, his several companies proved irrelevant and bankrupt.

In the early part of a technology's life cycle, the customer segment consists of enthusiasts who nurture the fledgling early products and help them gain power and acceptability. Technology rules the day, guided by feature-driven marketing. Everything changes when products mature. The customers change, and they want different things from the product. Convenience and user experience dominate over technological superiority. This is a difficult transition for a technology-driven industry. This is where the personal computer industry stands today. The customers want change, yet the industry falters, either unwilling or unable to alter its ways. Edison didn't understand this.

If information technology is to serve the average consumer, the technology companies need to become market-driven, task-driven, driven by the real activities of users. Alas, this is a change so drastic that many companies may not be able to make the transition. The very skills that made them so successful in the early stages of the technology are just the opposite of what is needed in the consumer phases.

This talk addresses the changes we might expect to see in the information technology world, and the process by which they might come about.




Past Talks


Distributed Computing with Jini (9/10)

Bill Joy, Founder and VP, Research
Jon Bostrom, Jini Evangelist
Sun Microsystems, Inc.

Since I left Berkeley in 1982, I have wanted to build a better basis for computing. With Java, a better programming language, and now Jini, we have a real candidate for a next generation approach. Programs can now be written in a reliable programming language, and simple distributed systems constructed based on distributed objects and mobile code.

This talk describes the large and long-term forces at work which lead us to believe that a distributed object-oriented future is at hand, and the how and why of the new technologies.

We will publicly demonstrate the prototype Jini software and devices for the first, and nearly the last, time: these early demo machines, which we have been using since the spring of this year, are about to be retired in favor of a more sophisticated, and possibly less whimsical, demonstration for the official Jini product launch later this fall.


A Methodology for Calibrating the Cray Research Hardware Performance Monitor: Initial Observations (9/17)

 

E. N. Miya

Applied Information Systems Division 
NASA Ames Research Center 
Moffett Field, CA 94035-1000 USA 
eugene@ames.arc.nasa.gov

The Cray Research (CRI) X-MP/Y-MP Hardware Performance Monitor (HPM) is a hardware feature unlike anything found on a workstation or educational platform. The HPM provides low-overhead instruction counting, but it is sensitive to hardware and software configuration. Unfortunately, the HPM and its software have lacked adequate calibration. Calibration involves "zeroing" and scaling the instrument. Zeroing is unquestionably the most critical function.

HPM "zeros" were counted for the smallest possible null but complete program and for one well-known program ("Hello world") using four languages. The program start-up overhead is measurable but unseen. Various system load conditions (dedicated versus loaded systems) were measured using different hardware configurations. The resulting measures are neither intuitively obvious nor consistent at first glance.

We conclude that extensive HPM zeroing and scaling are needed for all languages and compilers, but a documented "zero" measurement is essential for future measurement interpretations. The "zero" point deserves special mention, because it is the starting point and easily obtained. Finally, simple output improvements are suggested.
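
As a rough illustration of the zeroing step described above (a hypothetical sketch with made-up numbers, not the talk's measurements), a calibrated count subtracts the overhead charged to a null but complete program from the raw count observed for the program of interest:

# Hypothetical sketch of zero-calibrating a hardware event counter:
# measure a null (empty but complete) program to establish the "zero",
# then subtract that overhead from real measurements.

def calibrate_zero(measure, null_program, runs=10):
    """Average the counter over repeated runs of a null program."""
    samples = [measure(null_program) for _ in range(runs)]
    return sum(samples) / len(samples)

def calibrated_count(measure, program, zero):
    raw = measure(program)
    return raw - zero          # counts attributable to the program itself

# Stand-in "counter": returns (startup overhead + program's own instructions).
STARTUP_OVERHEAD = 12_500      # made-up number for illustration
def fake_measure(program):
    return STARTUP_OVERHEAD + program["instructions"]

null_prog  = {"instructions": 0}
hello_prog = {"instructions": 3_200}

zero = calibrate_zero(fake_measure, null_prog)
print("zero point:", zero)                                               # ~12500
print("hello world:", calibrated_count(fake_measure, hello_prog, zero))  # ~3200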


The Snap!Server: A leading Plug & Play Network Attached Storage Server (9/24)

 

Shmuel Shottan

Meridian Data
5615 Scotts Valley Drive
Scotts Valley, CA 95066

Demand for PC server storage continues to grow strongly in today's market. While server-centric storage subsystems are still the prevalent way of answering the need for additional storage capacity, a new category of network attached storage servers has emerged.

Enterprise-centric network attached storage solutions have been introduced and validated as tried-and-true designs. Meridian Data's Snap!Server establishes a new category of low-cost, true plug & play appliance, aimed at reducing the time, expense, and complexity of adding storage to a network.

Snap!Server has been designed with the goal of appealing to the large base of the market pyramid: the departmental networks and the SOHO market. In order to fit into the targeted environment, Snap!Server was designed to operate in a heterogeneous environment, simultaneously supporting the NFS, CIFS, HTTP, and NCP protocols, while maintaining data integrity across all platforms and locking (oplock) management across protocols.
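
As a rough illustration of what cross-protocol locking coordination can look like (a hypothetical sketch, not Meridian Data's implementation), the protocol front ends can funnel lock requests through one shared lock table so that clients on different protocols cannot hold conflicting locks on the same file:

# Hypothetical sketch: a single lock table shared by all protocol front ends
# (NFS, CIFS, HTTP, NCP), so locks are honored across protocols.
import threading

class CrossProtocolLockManager:
    def __init__(self):
        self._mutex = threading.Lock()
        self._locks = {}   # path -> (protocol, client_id)

    def acquire(self, path, protocol, client_id):
        with self._mutex:
            holder = self._locks.get(path)
            if holder is not None and holder[1] != client_id:
                return False           # someone else (any protocol) holds it
            self._locks[path] = (protocol, client_id)
            return True

    def release(self, path, client_id):
        with self._mutex:
            if self._locks.get(path, (None, None))[1] == client_id:
                del self._locks[path]

locks = CrossProtocolLockManager()
print(locks.acquire("/vol/doc.txt", "CIFS", "pc-17"))   # True: CIFS client locks the file
print(locks.acquire("/vol/doc.txt", "NFS", "ws-04"))    # False: NFS client is refused
locks.release("/vol/doc.txt", "pc-17")
print(locks.acquire("/vol/doc.txt", "NFS", "ws-04"))    # True: now available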

The Snap!Server is a pure embedded appliance, performing its single task well.

In this presentation the design goals, architecture, features, and design tradeoffs of the Snap!Server will be described. An overview of the tradeoffs required to position the Snap!Server within an existing infrastructure and integrate it seamlessly into the current computing environment will be given.

This presentation emphasizes the product-centric aspects of the Snap!Server, but a short assessment of industry and academic initiatives and infrastructure enhancements will also be presented, which should enable the Snap!Server to grow into an Intelligent Disk in future incarnations.


Active Disks: Programming Model, Algorithms and Evaluation (10/1)

Anurag Acharya,
Department of Computer Science, UCSB

In this talk, I will focus on Active Disk architectures, which integrate significant processing power and memory into a disk drive and allow application-specific code to be downloaded and executed on the data that is being read from (written to) disk. The key idea is to offload the bulk of the processing to the disk-resident processors and to use the host processor primarily for coordination, scheduling, and combination of results from individual disks. I will describe a stream-based programming model for Active Disks, which allows disklets to be executed efficiently and safely. I will also present active-disk versions of several efficient data-intensive algorithms. Finally, I will present simulation results comparing the performance of six such algorithms (select, group-by, external sort, datacube, image convolution, and satellite data processing) running on active-disk architectures and on conventional-disk architectures.
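
As a rough sketch of the stream-based disklet idea (the names and interfaces below are illustrative assumptions, not the talk's actual programming model), application code runs as a filter over the stream of records coming off each disk, and the host only combines the much smaller per-disk results:

# Hypothetical sketch of a stream-based "disklet": application code runs
# against the stream of records read on each disk, and the host only
# combines the (much smaller) per-disk results.

def select_disklet(record_stream, predicate):
    """Runs on the disk: filter records as they stream off the platter."""
    for record in record_stream:
        if predicate(record):
            yield record

def host_select(disks, predicate):
    """Runs on the host: merge the filtered streams from every disk."""
    results = []
    for disk in disks:                      # in a real system these run in parallel
        results.extend(select_disklet(disk, predicate))
    return results

# Toy data: two "disks", each holding (key, value) records.
disk0 = [(1, "a"), (4, "d"), (9, "x")]
disk1 = [(2, "b"), (7, "y"), (12, "z")]

print(host_select([disk0, disk1], lambda rec: rec[0] > 5))
# [(9, 'x'), (7, 'y'), (12, 'z')] -- only matching records cross to the host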

Active Disks - Remote Execution for Network-Attached Storage (10/8)

Erik Riedel, CMU

Today's commodity disk drives are actually small computers, with general-purpose processors
(high-end drives have control processors from the Motorola 68000 family), memory (one to four
megabytes today, and moving higher), a network connection (SCSI over short cables today, but
moving to FibreChannel), and some spinning magnetic material to actually store/retrieve the data.
The increasing performance and decreasing cost of processors and memory are going to continue
to cause more and more intelligence to move into peripherals from CPUs. Storage system
designers are already using this trend toward "excess" compute power to perform more complex
processing and optimizations inside storage devices. To date, such optimizations have been at
relatively low levels of the storage protocol. At the same time, trends in storage density,
mechanics, and electronics are eliminating the bottleneck in moving data off the storage media and
putting pressure on interconnects and host processors to move data more efficiently. We propose
a system called Active Disks that takes advantage of processing power on individual disk drives to
run application-level code. Moving portions of an application's processing to execute directly at
disk drives can dramatically reduce data traffic and take advantage of the storage parallelism
already present in large systems. The focus of this work is to identify the characteristics of
applications that make them suitable for execution at storage devices and quantify the benefits to
individual application performance and overall system efficiency and scalability from the use of
Active Disks. In this talk, I will focus on the opportunities opened up by current trends in storage
devices, discuss a model of expected speedup for applications on Active Disks, and present
results from a prototype system on several large-scale applications.

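The talk's speedup model is not reproduced here, but as a hedged back-of-the-envelope illustration of why filtering at the disks helps, a scan's throughput is limited either by the aggregate on-disk processing rate or by the host and interconnect handling only the reduced, post-filter traffic:

# Hypothetical back-of-the-envelope (not the talk's model): throughput of a
# filtering scan with and without on-disk processing.

def conventional_throughput(num_disks, disk_mb_s, host_mb_s):
    # All raw data must cross the interconnect and be processed by the host.
    return min(num_disks * disk_mb_s, host_mb_s)

def active_disk_throughput(num_disks, disk_cpu_mb_s, disk_mb_s, host_mb_s, selectivity):
    # Each disk filters locally; only a 'selectivity' fraction reaches the host.
    per_disk = min(disk_mb_s, disk_cpu_mb_s)
    aggregate = num_disks * per_disk
    host_limited = host_mb_s / selectivity if selectivity > 0 else float("inf")
    return min(aggregate, host_limited)

# Illustrative numbers only.
conv = conventional_throughput(num_disks=16, disk_mb_s=15, host_mb_s=100)
act  = active_disk_throughput(num_disks=16, disk_cpu_mb_s=10, disk_mb_s=15,
                              host_mb_s=100, selectivity=0.01)
print(f"conventional: {conv} MB/s, active disks: {act} MB/s, speedup ~{act/conv:.1f}x")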


Thread Scheduling for Multiprogrammed Multiprocessors (10/15)

Robert D. Blumofe
The University of Texas at Austin

I will present an efficient user-level thread scheduler for shared-memory multiprocessors and an
analysis of its performance under multiprogramming. This scheduler is a non-blocking
implementation of the work-stealing algorithm. Idle processes (kernel threads) steal (user-level)
threads from randomly chosen victims, and all concurrent data structures are implemented with
non-blocking synchronization. Without any need for special kernel-level resource management,
such as coscheduling or process control, this non-blocking work stealer efficiently utilizes whatever
processor resources are provided by the kernel.

We demonstrate this efficiency with an algorithmic analysis and an empirical analysis. For our
algorithmic analysis, we assume that the kernel is an adversary, and we prove that the execution
time is optimal to within a constant factor. We have implemented the non-blocking work stealer in
Hood: a C++ threads library built on top of Solaris pthreads, and we have studied its
performance. This study shows that application performance does conform to the theoretical
bound with a very small constant factor, roughly 1. Applications efficiently utilize processor
resources even when the number of processes exceeds the number of processors and even when
the number of processors grows and shrinks arbitrarily.

This work has been done in collaboration with Nimar Arora, Dionisios Papadopoulos, and Greg
Plaxton of The University of Texas at Austin.
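
Hood itself is a C++ library built on non-blocking deques; the following simplified sketch (in Python, with ordinary locks standing in for the non-blocking synchronization the talk describes) only illustrates the basic work-stealing discipline: each process pops work from the bottom of its own deque and, when idle, steals from the top of a randomly chosen victim's deque:

# Simplified sketch of work stealing (not Hood's non-blocking implementation):
# each worker pops from the bottom of its own deque and, when empty,
# steals from the top of a random victim's deque.
import random
import threading
from collections import deque

class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.deque = deque()
        self.lock = threading.Lock()   # Hood uses non-blocking synchronization instead

    def push(self, task):
        with self.lock:
            self.deque.append(task)

    def pop_bottom(self):
        with self.lock:
            return self.deque.pop() if self.deque else None

    def steal_top(self):
        with self.lock:
            return self.deque.popleft() if self.deque else None

def run(workers, steps=1000):
    done = []
    for _ in range(steps):
        for w in workers:
            task = w.pop_bottom()
            if task is None:                       # idle: try to steal
                victim = random.choice(workers)
                if victim is not w:
                    task = victim.steal_top()
            if task is not None:
                done.append((w.wid, task))
    return done

workers = [Worker(i) for i in range(4)]
for t in range(20):                # all work starts on worker 0
    workers[0].push(f"task-{t}")
print(len(run(workers)), "tasks completed; stealing spreads them across workers")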



 

Magnetic Hard Disk Drives: Advances Through The Year 2000 And Beyond (10/29)

Dr. Edward Grochowski
IBM Almaden Research Center
San Jose CA 95120-6099

Magnetic hard disk drives are the dominant storage technology and have been applied in information
processing applications ranging from servers for large enterprise computers to desktop and mobile
personal computers. This product has been the recipient of significant technology innovations such as
magnetoresistive and giant magnetoresistive sensors, advanced thin film disks, and high-performance
PRML data channels. The results have been areal density increases of 60% per year and commensurate
price-per-megabyte reductions, yielding a 1998 industry average capacity per drive of over 4 gigabytes
at a price of about $200. To maintain this rapid progress throughout the next decade, new sensor and
disk technologies will be employed, continuing this product through the superparamagnetic effect.
These innovations and their impacts on future drive capacities, performance, and overall designs will
be presented.
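
To put the quoted 60% per year areal density growth in perspective (a quick back-of-the-envelope, not figures from the talk), compounding that rate yields roughly an order of magnitude every five years:

# Back-of-the-envelope: compound 60%/year areal density growth from 1998.
rate = 0.60
for years in (1, 3, 5, 10):
    growth = (1 + rate) ** years
    print(f"{1998 + years}: ~{growth:.1f}x the 1998 areal density")
# 5 years  -> ~10.5x
# 10 years -> ~110x (if the trend held that long)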


A model-based approach to enterprise service management (11/5)

 
David Zager, Ph.D.
Vice President and Chief Technology Officer, Avesta Technologies, Inc.
2 Rector Street, New York, NY 10006, USA

This presentation will discuss an approach to enterprise service management we have been working on at Avesta
Technologies (a small New York-based software house). The project is named Trinity.

By "enterprise" we mean the set of resources and their interrelationships that make up an organization's distributed
computing environment, and that deliver automation services to that organization. Resources include power supplies and
network gear, databases and file systems, applications and business groups, and end-users as consumers of automation
services.

By "service management" we mean an approach to organizing operational information during the run-time of the
computing environment that enables people with operational responsibilities to see the availability of complex IT
services. Trinity additionally promotes effective collaboration among various groups with different interests, yet who are
all focused on problem resolution and service restoration. Trinity does not perform the role of an automated operator,
nor is it a generalized configuration tool.

Trinity's architecture consists of an active memory-database, a set of data acquisition agents that run in a
distributed and parallel fashion, and a set of presentation applications. The "database" embodies a data model of the
resources of a computing environment and their interrelationships. Trinity builds its model by auto-discovering the
resources and their interrelations (as far as the environment is instrumented; the rest must be entered manually). It
then animates the model by listening for changes in the real world and allowing the model to change sympathetically.
The model both echoes and predicts changes within the overall system. It presents its results both in real time and
as historical reports.

Trinity's presentation of the model of the enterprise delivers information about the enterprise appropriately to multiple
audiences. Most of an organization sees automation services as end-user IT consumables, rather than as the set of
various infrastructural components needed to deliver those services. Unlike the end-user or the help-desk assistant, IT
staff need to see the computing environment as a set of resources. Trinity can display the IT consumables as whole
things, or in terms of the nested layers of components that make them up, depending on who is interested in seeing the
display and in what context.

Trinity helps reconcile the differences in perspective brought about by different worldviews. The complexity of the
environment is so great that it is beyond the grasp of any individual to understand. Rather, there is a community of
knowledge about its workings; knowledge is distributed among a number of individuals through a number of subgroups.
Each subgroup within that community focuses appropriately on a specialized area of expertise, and interprets the world
in those terms. So, for example, the administrator's broken NIC is the financial analyst's inability to perform a
calculation. Trinity provides the common ground where the broken NIC and the inability to calculate are seen as
consequences of one and the same issue.
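
As a toy illustration of how such a model can reconcile those perspectives (my own sketch, not Trinity's architecture or data model), a resource dependency graph lets a low-level fault such as a broken NIC be reported in terms of the services that transitively depend on it:

# Hypothetical sketch: propagate a resource failure up a dependency graph
# so it can be reported as the services it affects (not Trinity's data model).

# service/resource -> resources it depends on
DEPENDS_ON = {
    "risk-calculation-service": ["app-server-3", "market-data-feed"],
    "app-server-3":             ["nic-eth0", "database-1"],
    "market-data-feed":         ["router-7"],
    "database-1":               ["disk-array-2"],
}

def affected_services(failed_resource, graph=DEPENDS_ON):
    """Return every node that directly or transitively depends on the failure."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for node, deps in graph.items():
            if node in affected:
                continue
            if failed_resource in deps or affected.intersection(deps):
                affected.add(node)
                changed = True
    return affected

# The administrator sees a broken NIC; the analyst sees a failed calculation.
print(affected_services("nic-eth0"))
# {'app-server-3', 'risk-calculation-service'}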