NSF Workshop on New Challenges and Directions for Systems Research

Architecture Subgroup

Bobby Blumofe
Milind Buddhikot
David Culler (group leader)
Andrew Chien
Mark Hill
Jim Smith
J.P. Singh

The general feeling underlying the discussion on future research challenges in architecture was one of excitement about the many new research opportunities presented by advances in technology and by application demands. Many avenues that were impractical in the past have become very reasonable to investigate, such as pervasive use of virtual machines, highly adaptive execution mechanisms, a changing division of effort between compilation and instruction interpretation, extremely large scale systems, integration of basic resources through flexible, high speed interconnects, and data stream-based processing. We are approaching critical technological thresholds that are likely to dramatically change the organization and abstractions employed in machines and systems. The traditional definitions of architecture and system design need to be reexamined as technology allows us to design much higher level systems. The process we used was one of brainstorming about ideas we loved, ideas we hated, and things we wished other researchers would do. After a very active discussion we collated the most completely developed issues into key topic areas.

"Adaptive" Architecture

Many forms of dynamic optimization are emerging at different levels of design. In the past, these efforts have focused largely on instruction processing problems, such as branch prediction. More generally, there is a wide range of opportunities for using run-time mechanisms to observe execution behavior, make predictions of future behavior, and use these predictions to enhance performance, typically through speculation, or to control operation of the system, for example to avoid congestion. These techniques can be generalized to other areas of system design, such as memory and I/O systems. Caches, for example, use only limited aspects of memory behavior and perform very limited speculation. More aggressive approaches may examine the values being read from memory and take speculative action, such as assuming they are addresses and prefetching the locations they name. Techniques for dynamic optimization can be extended considerably. They can be extended 'down' into lower levels of the design by allowing the hardware itself to be reconfigured based on run-time observations. They can be pushed 'up' across the hardware/software boundary by providing feedback to the application so that it might change its utilization of the machine. Both of these directions raise questions about the APIs that will be used, as well as a host of research questions in compilers and operating systems. One of the exciting developments is that questions about dynamic optimization techniques are being addressed in an increasingly systematic fashion, covering a large portion of the design space rather than investigating alternative point solutions.
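To make the idea concrete, the following is a minimal software sketch, in C, of the kind of value-based speculation described above: loaded values that fall within an assumed heap address range are treated as likely pointers, and the locations they name are prefetched. The heap bounds, the scanning loop, and the use of a compiler prefetch intrinsic are illustrative assumptions, not a description of any particular hardware mechanism.

    /* Hypothetical software analogue of a value-based prefetcher: when a
     * loaded word looks like a pointer into the heap, prefetch the location
     * it names. The heap bounds and the traversal are illustrative only. */
    #include <stddef.h>
    #include <stdint.h>

    static const uintptr_t heap_lo = 0x10000000u;   /* assumed heap bounds */
    static const uintptr_t heap_hi = 0x20000000u;

    static inline void speculate_on_value(uintptr_t v)
    {
        /* Predict: a value in the heap range is probably an address. */
        if (v >= heap_lo && v < heap_hi)
            __builtin_prefetch((const void *)v, 0 /* read */, 1 /* low locality */);
    }

    /* Example use: scanning an array of words whose contents may be pointers. */
    void scan(const uintptr_t *words, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            speculate_on_value(words[i]);   /* observe the value, act speculatively */
            /* ... ordinary processing of words[i] ... */
        }
    }

A hardware implementation would apply the same test transparently to the stream of loaded values, replacing the fixed bounds with state learned at run time; the sketch only illustrates the observe-predict-speculate pattern.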

Performance Modeling and Characterization

Many of the group's discussions of novel directions in architecture came back to the question of performance modeling and characterizing the behavior of complex systems. Sophisticated mechanisms that enhance performance within a given technology tend to make the performance less predictable. There are many examples where an apparently insignificant change in a program yields a large, unexpected change in its performance. This makes many other aspects of system design, as well as work in compilers and operating systems, difficult. We believe that it is important to investigate design techniques and strategies that will make a qualitative improvement in the performance predictability of systems for a modest performance penalty. The ability to predict performance underlies work in quality-of-service, compiler optimization, and many other areas. There seems to be an important tension between making systems more adaptive and making them more predictable, although these goals do not necessarily need to be at odds.
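The fragility mentioned above, in which a seemingly trivial change produces a large swing in performance, is easy to reproduce. In the hypothetical C fragment below, the two functions perform the same column-wise summation; on many machines the unpadded version can run several times slower because the power-of-two row length maps successive accesses onto a handful of cache sets, and the exact ratio depends entirely on the cache organization of the machine at hand.

    /* Two nearly identical loops; padding the row length by a few elements
     * can change column-wise traversal time by a large factor, because a
     * power-of-two stride concentrates accesses on a few cache sets. */
    #include <stddef.h>

    #define N 1024

    double a[N][N];        /* row length is a power of two            */
    double b[N][N + 8];    /* same data, rows padded by eight doubles */

    double sum_columns_conflicting(void)
    {
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += a[i][j];    /* 8 KB stride: accesses collide in the cache */
        return s;
    }

    double sum_columns_padded(void)
    {
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += b[i][j];    /* padded stride spreads accesses across sets */
        return s;
    }

A compiler, scheduler, or quality-of-service mechanism that cannot anticipate effects of this kind has little basis for predicting end-to-end performance, which is why design for predictability deserves attention alongside design for speed.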

New Application Drivers

Many new developments in architecture are likely to result from dramatic changes in the requirements presented by emerging applications. In particular, a number of important application areas involve the movement of large amounts of data. Rather than crunching on a localized data set, the computer system needs to stream data from one resource to another, perhaps processing it along the way. The required resources vary with the application, but include disk, high-speed network, display, and audio.
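A minimal sketch of such a workload is shown below in C, under the assumption that the data source is a file and the sink is a pipe or socket reached through standard output; the block size and the per-block transform are placeholders for application-specific processing.

    /* Stream-oriented workload sketch: data flows from one resource (a file)
     * to another (standard output, which might feed a pipe, display, or
     * network socket), with a lightweight transformation applied en route. */
    #include <stdio.h>

    enum { BLOCK = 64 * 1024 };

    static void transform(unsigned char *buf, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            buf[i] ^= 0x5a;                 /* placeholder processing step */
    }

    int main(int argc, char **argv)
    {
        FILE *in = (argc > 1) ? fopen(argv[1], "rb") : stdin;
        if (!in) { perror("fopen"); return 1; }

        unsigned char buf[BLOCK];
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
            transform(buf, n);                    /* process while streaming   */
            if (fwrite(buf, 1, n, stdout) != n) { /* forward to the next stage */
                perror("fwrite");
                return 1;
            }
        }
        if (in != stdin)
            fclose(in);
        return 0;
    }

The salient property for the architect is that the working set never settles: performance is governed by how quickly data can be moved between devices and touched once, not by how well a small data set fits in cache.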

These applications demand that designers develop a new class of evaluation workloads and methods. Newer applications tend to be operating system intensive and tend to incorporate interactivity as a central element of behavior. Thus, new benchmarks and tools need to be developed to evaluate the alternative architectures.

Non-processor-centric Design

Architecture has traditionally been processor-centric, with memory systems, I/O systems, and networks relegated to secondary status. With changing application demands and technological opportunities, it is increasingly important to investigate design points that are not processor-centric. Examples include storage and networking subsystems, internet appliances, and ubiquitous low cost/power/size designs. These areas of investigation will also require the development of new performance evaluation methods.

Design Principles for Systems Across a Wide Range of Scale

Computer systems will continue to broaden in both applications and scale. In the future, computer systems will range from mega-scale systems with millions of processors to very small scale systems integrated on a single chip. It will be technologically reasonable for million-processor systems to be built in less than ten years. Historically, large computer systems have tended to include about 10,000 distinct components, and this has been fairly constant from vacuum tubes to transistors to chips. We are approaching the point where a complete system, including processor and memory, will fit on a single chip. About eight years from now we can expect 100 processors on a chip, so a system of the historical 10,000-component scale would contain on the order of a million processors. Most large scale systems in the past have been designed to operate in a highly synchronous, 'crystalline' fashion. However, most mega-scale systems, such as organic systems and the internet, operate with looser coordination; they are often better described with a fluid model than a mechanical one. One of the challenges is to develop credible modeling tools for mega-scale computer systems.

It is also important to investigate the other extreme of very small scale systems. There is a spectrum of options from current desktop devices and portables, down to appliances, and finally to disposable systems. These are likely to be the systems with the highest volume and widest use, and they impose a unique set of constraints that is likely to yield novel approaches.

We believe it is important to investigate support for high availability and security. These properties are often treated in absolute terms; it will be valuable instead to understand the space of availability/performance trade-offs. How much performance must be sacrificed (or cost incurred) to obtain a given level of availability? In many contexts the need arises from simple APIs that offer composability.

Critical Thresholds

The architecture group examined the issue of approaching critical thresholds that will change the nature of research in architecture. We identified thresholds in five areas: size, cost, complexity, application and infrastructure.

Historically, the density of systems has increased steadily, but the development of novel designs has fallen into very clear epochs delimited by fairly discrete thresholds. Recent examples include the arrival of integrated circuits, the arrival of the microprocessor, and the arrival of a full word-width processor on a chip. The last example marked the architectural renaissance of the mid-80s, which saw the emergence of RISC microprocessors and single board computers; at the system level, bus-based shared-memory multiprocessors and massively parallel processors appeared. We have been in an evolutionary phase for the last couple of years as designs have converged, but we are poised to cross the threshold where a complete system and/or multiple processors fit on a single chip. Address spaces could easily be made large enough to name data anywhere in the world. In addition, a number of advances in high speed links could enable a new level of composition in computer systems.

The other opportunity afforded by improved density is reduced cost. As discussed above, this is likely to present a host of new issues in the areas of disposable devices, ubiquitous devices, and appliances. We are also seeing the opportunity to push more complex processing operations down into the system, for example pushing intelligence into the memory or the disk system.

On the application side, the architecture group identified three general areas that can be expected to have a profound impact on architecture: telepresence, virtual reality, and massive information processing (terabytes per person). Each of these presents a set of demands that will qualitatively change the nature of computer systems.

Finally, we believe it is possible to put in place a new infrastructure for architectural research that will dramatically change the nature of the research. Three areas stand out. (1) Development of the equivalent of MOSIS for "systems", i.e., processor cores plus memory plus devices plus means for novel additions on a chip. Industry is increasingly operating at this scale, but there is no accessible infrastructure for research. (2) Massive simulation environments. Building simulators capable of modeling very large scale, very complex, or very detailed systems is a costly undertaking. The computation resources are available, but the research community will be more effective in its investigations if it shares a simulation infrastructure. (3) Rapid evaluation. There has been a good deal of attention paid to rapid prototyping of systems, but it takes a great deal of work to adequately evaluate the ideas, regardless of how quickly the prototype is assembled. We need to examine ways of assembling new workloads and tools rapidly to evaluate novel ideas.