Writings


X-Trace: A Pervasive Network Tracing Framework

Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica
4th USENIX Symposium on Networked Systems Design and Implementation (NSDI'07)
Link to USENIX site
April 10-12, 2007
Cambridge, Mass.

ABSTRACT:

Modern Internet systems often combine different applications (e.g., DNS, web, and database), span different administrative domains, and function in the context of network mechanisms like tunnels, VPNs, NATs, and overlays. Diagnosing these complex systems is a daunting challenge. Although many diagnostic tools exist, they are typically designed for a specific layer (e.g., traceroute) or application, and there is currently no tool for reconstructing a comprehensive view of service behavior. In this paper we propose X-Trace, a tracing framework that provides such a comprehensive view for systems that adopt it. We have implemented X-Trace in several protocols and software systems, and we discuss how it works in three deployed scenarios: DNS resolution, a three-tiered photo-hosting website, and a service accessed through an overlay network.

Self-managed systems and services: Effective web service load balancing through statistical monitoring

George Porter and Randy H. Katz
ACM Digital Library link
Communications of the ACM
Volume 49, Issue 3, March 2006

ABSTRACT:

Identifying the correlated effects between components to improve response to overload.

IP Options are not an option

Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica
EECS Technical Report UCB/EECS-2005-24
December 9, 2005

Link to the report

ABSTRACT:

A wide variety of enhancements to the Internet architecture have been proposed over the past several years, many of which require attaching metadata, or state, to packets as they flow through the network. Examples of such extensions are IP traceback and XCP. The IP specification supports an "options" mechanism as an extensible way to couple state with packets. However, as we will show in this paper, IP options are not well supported in the Internet. We make use of the PlanetLab planetary scale network testbed to quantify the fate of IP-option enabled packets in the wide-area. We measured wide-area paths with both standard IP packets and packets with options. We discovered that approximately half of Internet paths drop packets with options, raising serious dependability issues. Surprisingly, our findings indicate that it is feasible to restore support for options in the wide-area. We discovered that the core of the network drops very few options packets, with the vast majority of those drops occurring in edge AS networks. Furthermore, these drops are concentrated in a minority of the ASes.

In-network Video Prioritization via iBox Classification Predicates

George Porter, Randy H. Katz
EECS Technical Report UCB/EECS-2005-1
Sep 29, 2005 Link to the report

ABSTRACT:

We propose a novel datapath mechanism for tracking and acting on headers in a variety of layer-7 protocols called Classification Predicates, or "cPredicates". We apply this mechanism to the emerging field of in-network storage (Storage Area Networks, or SANs), and consider a multimedia streaming service with video stored in a converged SAN that also contains non-video content. We show that cPredicates have a low, amortized overhead because they only have to examine a small subset of the packet stream in depth. In our experimental environment, only 5% or fewer packets are examined in depth, leading to less than a 10% amortized latency increase. We built a content-based prioritization system for an iSCSI-based SAN and show that it can provide better than best-effort service for video files in a converged SAN containing both video and non-video content.

COPS: Quality of Service vs. Any Service at All

Randy Katz, George Porter, Scott Shenker, Ion Stoica, Mel Tsai
Thirteenth International Workshop on Quality of Service (IWQoS 2005)
Available in Lecture Notes in Computer Science, Vol. 3552, 2005, Page 3
University of Passau, Germany

June 21-23, 2005

Link to LNCS article | Link to eScholarship Site

ABSTRACT:

Todays networks are awash in illegitimate traffic: port scans, propagating worms, and illegal peer-to-peer transfers of materials. This "noise" has created such a crescendo that legitimate traffic is starved for network resources. Essential network services, like DNS and remote file systems, are rendered unavailable. The challenge is no longer "quality of service" but rather "any service at all". Techniques must be developed to identify and segregate traffic into good, bad, and suspicious classes. Quality of Service should now protect the good, block the bad, and slow the ugly when the network is under stress of high resource utilization. We discuss the research challenges and outline a possible architectural approach: COPS (Checking, Observing, and Protecting Services). It is founded on "Inspection-and-Action Boxes" (iBoxes) and packet annotations. The former are middlebox network elements able to inspect packets deeply while performing filtering, shaping, and labelling actions upon them. The latter is a new layer between routing and transport that tags packets for control purposes while also providing an in-band control plane for managing iBoxes across a network.

Effective Web Service Load-Balancing through Statistical Monitoring

George Porter, Randy H. Katz, Ph.D.
IFIP/IEEE International Workshop on Self-Managed Systems & Services (SelfMan '05)
Nice Acropolis, Nice, France

May 19, 2005

selfman05.pdf (Adobe Acrobat)

ABSTRACT:

Multi-tier web services enable efficient, scalable, and composable Internet services. But, as they become increasingly complex, so too does the underlying middleware layer. During high demand periods, some components, such as persistent storage, become overloaded. This leads to long response times that make the site unusable. Formulating admission control policies for web services is a daunting and error-prone task, because even modest-sized web sites can consist of dozens to hundreds of request types. There is a lack of system tools and instrumentation for operators to identify connections between requests and their effects. Thus, we need more self-adaptive web services that can expose these effects to the operator and give he or she the tools to appropriately respond to overload.

We propose an approach to designing such self-adaptive web services by relying on 1) simple statistical techniques for uncovering request effects in multi-tier systems, 2) a black-box approach to component monitoring, 3) a visualization tool summarizing statistical results to facilitate human decision making, and 4) efficient techniques for operators to invoke admission control based on those statistical findings. We argue that including humans in the loop compliments, rather than detracts, from self-adaptive design goals. Additionally, we describe ongoing work on an implementation of a web service that embodies these mechanisms. Our approach leads to a web service that can serve approximately 70% more requests, while lowering the maximum request latency by over 78%.

Delta Routing: Improving the Price-Performance of Hybrid Private Networks

George Porter, Minwen Ji, Ph.D. (HP Labs)
2004 IEEE/IFIP Network Operations & Management Symposium (NOMS 2004)
COEX Convention Center, Seoul, Korea

April 19-23, 2004

noms04.pdf (Adobe Acrobat)

ABSTRACT:

Large enterprises connect their locations together by building a corporate network (sometimes called an intranet) out of private communication channels such as leased lines, Frame Relay and ATM links, called physical private networks or PPNs. Although these lines are dedicated to the company's traffic, and thus provide good quality of service, they are expensive and thus not always overprovisioned. On the other hand, Internet-based Virtual Private Networks (VPNs) can provide speedy deployment of multisite corporate networks at a small fraction of the cost of private lines. However, Internet-based VPNs do not offer the same accountability and predictability as a traditional intranet does, since the Internet is not admistrated by a single provider.

In order to build a hybrid network that is both reliable and affordable, we have developed the Delta Routing protocol, which allows nodes on a corporate network to communicate with each other using both PPN and Internet-based VPN. The Delta routing protocol works with the existing routing protocol on the PPN and allows each node to determine the best routes on the hybrid network using local information only.

We have simulated and compared the Delta routing protocol and the alternatives using the ns-2 simulator. The results show that the Delta routing protocol outperforms the alternatives in a variety of scenarios.

Traffic Matrix Estimation for Low-loss Routing in Hybrid Networks

Masters Thesis
George Porter, M.S.
Advisor: Randy H. Katz, Ph.D.
Second Reader: Ion Stoica, Ph.D.
May 22, 2003

gpmasters.pdf (Adobe Acrobat)

ABSTRACT:

Large enterprises connect their locations together by building a corporate network (sometimes called an intranet) out of private communication channels such as leased lines, Frame Relay and ATM links. Although these lines are dedicated to the company's traffic, and thus provide good quality of service, they are expensive and thus not always overprovisioned. On the other hand, Internet-based Virtual Private Networks (VPNs) can provide speedy deployment of multisite corporate networks at a small fraction of the cost of private lines. However, Internet-based VPN does not offer the same accountability and predictability as an intranet does, since the Internet is not admistrated by a single provider. To take advantage of the best of both worlds, we have developed the Delta Routing protocol, which allows nodes on a corporate network to communicate with each other using both private lines and Internet-based VPN tunnels. The Delta routing protocol works with the existing routing protocol on the PPN and allows each node to determine the best routes on the hybrid network based on local information only. It uses a novel congestion prediction mechanism called Flow-based Delta Routing to make the best use of the local information. We have simulated and compared the Delta routing protocol and the alternatives with the ns-2 simulator. The results show that the Delta routing protocol outperforms the alternatives in a variety of scenarios.

The SAHARA Model for Service Composition across Multiple Providers

B. Raman, S. Agarwal, Y. Chen, M. Caesar, P. Johansson, K. Lai, T. Lavian, S. Machiraju, Z. Mao, G. Porter, T. Roscoe, M. Seshadri, J. Shih, K. Sklower, L. Subramanian, T. Suzuki, S. Zhuang, A. Joseph, R. Katz, I. Stoica
2002 International Conference on Pervasive Computing (Pervasive 2002)
Zurich, Switzerland

August 26-28, 2002

Link to Springer Lecture Notes in Computer Science #2414

ABSTRACT:

Services are capabilities that enable applications and are of crucial importance to pervasive computing in next-generation networks. Service Composition is the construction of complex services from primitive ones; thus enabling rapid and flexible creation of new services. The presence of multiple independent service providers poses new and significant challenges. Managing trust across providers and verifying the performance of the components in composition become essential issues. Adapting the composed service to network and user dynamics by choosing service providers and instances is yet another challenge. In Sahara, we are developing a comprehensive architecture for the creation, placement, and management of services for composition across independent providers. In this paper, we present a layered reference model for composition based on a classification of different kinds of composition. We then discuss the different overarching mechanisms necessary for the successful deployment of such an architecture through a variety of case-studies involving composition.

A Commuting Diagram Relating Threaded and Non-threaded JVM Models

Undergraduate Thesis
George Porter
Advisor: J Strother Moore, Ph.D.
Second Reader: Robert S Boyer, Ph.D.
May 2, 2001

commute.ps.gz (compressed PostScript) | commute.pdf (Adobe Acrobat)

ABSTRACT:

We establish a commuting diagram that relates two models of the Java Virtual Machine (JVM). The first model, M3, supports much of Java, including classes, objects, and dynamic method resolution. The second model, M4, builds upon M3 by adding threads, monitors, and synchronized methods. We describe a theorem, Main, that asserts that running certain ``single-threaded'' states on M4 is equivalent to transforming those states to the domain of M3, running the transformed state there, and translating the result back to the domain of M4. We define the criteria we use to determine if the resulting states are equivalent, and we define our notion of ``single-threaded''. We then discuss a few lessons learned during the development of Main

Formal Models of Java at the JVM Level -- A Survey from the ACL2 Perspective

J Strother Moore, Robert Krug, Hanbing Liu, George Porter
Presented at the Workshop on Formal Techniques for Java Programs, in association with ECOOP 2001
University Eötvös Loránd, Budapest, Hungary, June 2001

full.ps.gz (compressed PostScript)

ABSTRACT:

We argue that a practical way to apply formal methods to Java is to apply formal methods to the Java Virtual Machine (JVM) instead. A Java system can be proved correct by analyzing the bytecode produced for it. We believe that this clarifies the semantic issues without introducing inappropriate complexity. We say "inappropriate" because we believe the complexity present in the JVM view of a Java class is inherent in the Java, when accurately modeled. If it is desired to model a subset of Java or to model "Java" with a slightly simpler semantics, that can be done by formalizing a suitable abstraction of the JVM. In this paper we support these contentions by surveying recent applications of the ACL2 theorem proving system to the JVM. In particular, we describe how ACL2 is used to formalize operational semantics, we describe several models of the JVM, and we describe proofs of theorems involving these models. We are using these models to explore a variety of Java issues from a formal perspective, including Java's bounded arithmetic, object manipulation via the heap, class inheritance, method resolution, single- and multi-threaded programming, synchronization via monitors in the heap, and properties of the bytecode verifier.

The Apprentice Challenge

J Strother Moore, Ph.D., George Porter
November 12, 2000
ACM's Transactions on Programming Languages and Systems Journal, Vol 24, No. 3 (March 2002)

paper-acm.ps (PostScript)

ABSTRACT:

We describe a mechanically checked proof of a property of a system of small bytecoded programs involving an unbounded number of threads and synchronization via monitors. The bytecode programming language we study is modelled on that for the Java Virtual Machine. The formal semantics of our language is given by an operational model expressed in ACL2, a Lisp-based logic of recursive functions. Our proofs are checked with the ACL2 theorem prover. The proof involves reasoning about arithmetic, conditionals, infinite loops, the creation and modification of instance objects in the heap, including threads, the inheritance of fields from superclasses, pointer chasing and smashing, the invocation of instance methods (and the concomitant dynamic method resolution), use of the start method on thread objects, the use of monitors to attain synchronization between threads, and consideration of all possible interleavings or scheduling over an unbounded number of threads. This paper develops techniques for reasoning mechanically about multi-threaded Java programs. The problem posed here is also put forth as a benchmark against which to measure other approaches to formally proving properties of multi-threaded Java programs.

An Executable Formal Java Virtual Machine Thread Model

J Strother Moore, Ph.D., George Porter
October 25, 2000
Accepted to the 2001 JVM Usenix Symposium in Monterey, California

model.ps.gz (compressed PostScript)

ABSTRACT:

We discuss an axiomatic description of a simple abstract machine similar to the Java Virtual Machine (JVM). Our model supports classes, with fields and bytecoded methods, and a representative sampling of JVM bytecodes for basic operations for both data and control. The GETFIELD and PUTFIELD instructions accurately model inheritance, as does the INVOKEVIRTUAL instruction. Our model supports multiple threads, synchronized methods, and monitors. Our current model is inadequate or inaccurate in many respects (e.g., we do not formalize the JVM's finite arithmetic nor do we describe class loading and initialization). But the model is a useful tool for studying the application of formal reasoning to the JVM and to Java Programs.

We demonstrate two useful aspects of an operational formal semantics. First, the model is executable: bytecoded methods can be run on the model. Second, the model allows us to prove theorems about those methods or, more generally, about the model. Because the JVM provides a relatively clean semantics for Java, our model can be thought of as a step towards Java software verification. We illustrate these points. We cite some theorems proved about our model, including a theorem involving unbounded multithreading and mutual exclusion with MONITORENTER and MONITOREXIT. Our proofs are carried out with the ACL2 theorem prover.

Using Formal Methods to Ensure Hardware and Software Correctness

Faculty Advisor: J Strother Moore, Ph.D.
Presented March 23-24, 2001 as part of the College of Natural Science Honor's Day

ABSTRACT:

On January 17, 1995, the Intel Corporation announced an outlay of $475 million to cover the recall of its Pentium chip. This chip has been designed by hundreds of engineers and scientists, has passed countless tests, and has been successfully used by millions of people, both at home and professionally. The bug that struck the Pentium chip had to do with its ability to divide two numbers. The Pentium chip relied on certain tables of numbers to carry out that division, and a few of the entries in that table were not correct. Given the sophistication of the chip, it is easy to imagine how such a small error made it into the final product (in fact, it is amazing there were not more errors!). A commonly used method to prevent errors like this is testing. But there are just too many numbers to divide against each other to test every possible case. How can we make sure that our chips and computer programs are reliable? In short, we can prove that they work with logic and mathematics.

When AMD designed their K5 chip (a microprocessor similar to Intel's Pentium chip), they wanted to prevent this type of bug. To do that, they expressed their design using the language of mathematics. Rather than testing the chip's ability to divide numbers, they formed a "theorem" that specified exactly what it means to carry out the desired operation. That theorem was then proved and checked with ACL2, a computer program that proves certain types of mathematical statements.

Work like this can extend to software, as well. Last year, Prof. J Moore, Ph.D. and I designed a model of the Java Virtual Machine that supports a feature called threads. These threads are used to allow for a program to do more than one thing at a time. Recently, we have begun proving an interesting property about that model. Specifically, we are in the process of proving that certain types of programs that use threads can be treated in a simpler manner. This means that rather than trying to consider a (usually very) complicated threads-based program, you can instead direct your attention to a much simpler non-threads based program. When this proof is completed, you will know that anything you prove about the simpler model will apply to the more complicated model. A theorem like this is useful since it is always better to try to think in the simplest terms possible, and this theorem will allow you in some cases to think about a complicated program at a simpler level.

M4: A Formal Model of the Java Virtual Machine

Faculty Advisor: J Strother Moore, Ph.D.
Presented April 14-15, 2000 as part of the College of Natural Science Honor's Day

ABSTRACT:

When scientists normally talk about the world they use the language of mathematics. By writing down formulas and equations, they are able to communicate unambiguously to other scientists. Furthermore, they can apply tools like Calculus to their mathematical symbols. Computer scientists can also use mathematics to describe their computer programs with a type of math called "Computational logic." This type of mathematics allows programmers to write down a detailed description of their program's behavior--similar to a formula like F=ma or E=mc2.

This past semester Dr. Moore and I have been extending a model of the Java Virtual Machine (JVM) to add support for threads, or the ability to run more than one task at a time. The JVM is a computer program that executes the Java language, which is often used on the Internet. We created a model of the JVM using a computational logic called ACL2. In addition to describing the JVM, ACL2 also proves theorems about the model. We proved two theorems about the behavior of our model (and thus the JVM). In the process, we discovered a bug--a bug that would most likely have remained undiscovered had we not subjected the program to the scrutiny of the ACL2 automatic theorem prover. By having to write a detailed, unambiguous specification of the JVM, I had to learn how it works in a very detailed way and communicate that to ACL2. It is much like teaching a subject to someone else--you often learn more about the subject than you would have otherwise. By proving the two theorems, we ensured the reliability of a part of our system, because we guaranteed its behavior under certain conditions. The practice of modeling software, and proving theorems about those models, gives the programmer confidence in knowing how their program is going to work--a confidence that is nearly impossible to guarantee by simply running a few tests on the program. We now know with confidence the behavior of part of the JVM.

NeoWebScript: Web Programming with Tcl

Sponsor: NeoSoft, Inc.
Presented March 27-28, 1999 as part of the College of Natural Science Honor's Day

View the poster content by clicking here

ABSTRACT:

In this poster, I will describe NeoWebScript, a World Wide Web server that allows untrusted web developers to easily create powerful, dynamic web content while maintaining server security. This power and security is provided by the Tool Command Language (Tcl), which is not only easy to use, easily extendible, and embeddable, but also natively supports configurable security policies. These policies protect the server from malicious or careless web scripts, and give the user enough power to create any number of web-based applications. When the server receives a request for a dynamically generated web page, it creates a Tcl environment with all the components needed to access a database, create an image, and so forth. It then processes any commands in the document, sending the formatted HTML codes to the client. This approach completely avoids both the need for, and overhead of, external CGI programs. Furthermore, the Tcl code can be included in any HTML document, so that a collection of static web pages can be made dynamic without any filename changes or awkward CGI constructs. With an optional encryption module, Apache and NeoWebScript can be used to create powerful e-commerce systems

NeoWebScript is provided as a module for Apache, the world's most popular web server. Given the large number of commercial and public-domain extension packages for Tcl, adding database connectivity, dynamic image generation, network services, or other "external hooks" into your web pages is easy, since extensions written by unrelated third parties can be used together at runtime. Furthermore, since the Tcl environment is only created when a web page needs it, the overhead for static pages is very small. NeoWebScript , written by NeoSoft, Inc., is publicly available and free for noncommercial use.


Go back