Two mini talks by Joe Hellerstein and Gautam Kar
IBM Research
Using Control Theory To Achieve Service Level Objectives For An
E-Mail Server
by Joe Hellerstein
A widely used approach to achieving service level objectives for a
target system (e.g., an email server) is to add a controller that
manipulates the target system's tuning parameters. We describe a
methodology for designing such controllers for software systems that
builds on classical control theory. The classical approach proceeds
in two steps: system identification and controller design. In system
identification, we construct mathematical models of the target system.
Traditionally, this has been based on a first-principles approach,
using detailed knowledge of the target system. Such models can be
difficult to build, and too complex to validate, use, and maintain. In
our methodology, a statistical (ARMA) model is fit to historical
measurements of the target being controlled. These models are easier
to obtain and use and allow us to apply control-theoretic design
techniques to a larger class of systems. When applied to a Lotus Notes
groupware server, we obtain model fits with $R^{2}$ no lower than 75%
and as high as 98%. In controller design, an analysis of the models
leads to a controller that will achieve the service level objectives.
We report on an analysis of a closed-loop system using an integral
control law with Lotus Notes as the target. The objective is to
maintain a reference queue length. Using root-locus analysis from
control theory, we are able to predict the occurrence (or absence) of
controller-induced oscillations in the system's response. Such
oscillations are undesirable since they increase variability, thereby
resulting in a failure to meet the service level objective. We
implement this controller for a real Lotus Notes system, and observe a
remarkable correspondence between the behavior of the real system and
the predictions of the analysis. This allows us to select the proper
parameter for the controller from the analysis alone.
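The two steps described above can be illustrated with a minimal sketch.
It is not the model or controller from the talk: it assumes a first-order
model y(k+1) = a*y(k) + b*u(k) (a simple special case of the ARMA family)
fit by least squares, and an integral control law u(k+1) = u(k) + K*e(k)
with e(k) = r - y(k). Under those assumptions the closed-loop poles are
the roots of z^2 - (1+a)z + (a + bK) = 0, and complex poles indicate
controller-induced oscillations. All function names and numeric values
are hypothetical.

```python
import cmath


def simulate_plant(a, b, u, y0=0.0):
    """Generate outputs of the assumed model y[k+1] = a*y[k] + b*u[k]."""
    y = [y0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y


def fit_first_order(u, y):
    """System identification: least-squares estimate of (a, b)."""
    n = len(y) - 1
    syy = sum(y[k] * y[k] for k in range(n))
    suu = sum(u[k] * u[k] for k in range(n))
    syu = sum(y[k] * u[k] for k in range(n))
    sny = sum(y[k + 1] * y[k] for k in range(n))
    snu = sum(y[k + 1] * u[k] for k in range(n))
    det = syy * suu - syu * syu  # zero if the input is not exciting enough
    a = (sny * suu - snu * syu) / det
    b = (snu * syy - sny * syu) / det
    return a, b


def closed_loop_poles(a, b, gain):
    """Roots of z^2 - (1 + a) z + (a + b*gain) = 0."""
    disc = cmath.sqrt((1 + a) ** 2 - 4 * (a + b * gain))
    return ((1 + a) + disc) / 2, ((1 + a) - disc) / 2


def step_response(a, b, gain, ref, steps):
    """Simulate the closed loop under integral control, starting at rest."""
    y, u, trace = 0.0, 0.0, []
    for _ in range(steps):
        error = ref - y          # e(k) = r - y(k)
        y_next = a * y + b * u   # plant update uses u(k)
        u = u + gain * error     # integral law: u(k+1) = u(k) + K*e(k)
        y = y_next
        trace.append(y)
    return trace
```

For a given fitted (a, b), sweeping the gain K through closed_loop_poles
and checking whether the poles are real, complex, or outside the unit
circle is a numeric stand-in for reading the same information off a
root-locus plot.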
Taxonomy, Modeling and Computation of Dependencies for Distributed
Management
by Gautam Kar
This talk addresses the role of dependency analysis in the general
area of distributed management. Specifically, we point out the need for
developing a methodology for identifying, classifying, representing and
computing dependency information in order to do effective
configuration, fault and performance management in a complex IT
environment. The discussion focuses on developing a systematic
methodology for obtaining dependency information in an IT service
environment and representing this information within the framework of a
model that can facilitate the design of applications that do fault,
performance, configuration and availability management. The main
questions raised in this talk are: what are the important
characteristics of dependencies?
In other words, when a managed entity, such as a service or resource,
depends on another managed entity, what are the properties of such a
dependency that need to be recorded? How can we classify dependencies
such that they can be used more efficiently to do root cause or impact
analysis in fault management? The paper introduces the concept of
dependency lifetime that traces the flow of dependency information
from the design to installation to runtime stages of a service. Three
categories of models, functional, structural and operational, are used
to represent this information as it flows from the design to the
runtime stages. We show how the information obtained at each of these
stages can be manipulated by management applications. In particular, approaches
that we have pursued to discover dynamic dependencies will be
discussed. Some applications of this approach in the context of an
e-commerce environment will be mentioned.
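As a rough illustration of how recorded dependencies can support impact
and root cause analysis, the sketch below stores dependencies as a
directed graph and walks it in both directions. It is not the model
from the talk: the entity names, the graph, and the rule used to pick
root cause candidates (entities that every failing entity depends on)
are assumptions made for the example.

```python
from collections import deque

# Hypothetical e-commerce setup: each key depends on every entity in its set.
DEPENDS_ON = {
    "storefront": {"web_server", "app_server"},
    "web_server": {"dns"},
    "app_server": {"database", "dns"},
    "reporting": {"database"},
    "database": set(),
    "dns": set(),
}


def _reachable(start, edges):
    """Breadth-first search: every node reachable from start via edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def antecedents(entity, depends_on):
    """Everything the entity depends on, directly or indirectly."""
    return _reachable(entity, depends_on)


def invert(depends_on):
    """Reverse the edges: map each entity to its direct dependents."""
    reverse = {name: set() for name in depends_on}
    for dependent, needed in depends_on.items():
        for entity in needed:
            reverse.setdefault(entity, set()).add(dependent)
    return reverse


def impact(entity, depends_on):
    """Impact analysis: everything that may be affected if entity fails."""
    return _reachable(entity, invert(depends_on))


def root_cause_candidates(symptoms, depends_on):
    """Entities that all failing entities have in common as antecedents."""
    common = None
    for failing in symptoms:
        closure = antecedents(failing, depends_on) | {failing}
        common = closure if common is None else common & closure
    return common or set()
```

With this graph, impact("database", DEPENDS_ON) walks upward to the
services that rely on the database, while
root_cause_candidates(["storefront", "reporting"], DEPENDS_ON) narrows
the search to what both failing services share.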