Formal Methods for Verification of Recovery Code
Archana Ganapathi
Recovery is an essential feature for system reliability and availability; consequently, recovery code is indispensable. Because formal methods can find bugs that are otherwise hard to discover manually [2], they carry significant potential to enhance code robustness, including that of recovery code.
Formal methods fall into two categories:
1) Dynamic analyses include axiomatic approaches, model checking, and theorem proving [4, 5]. These methods systematically explore the system “state-space”, but at a high computational cost, as the state-graph is exponential in the size of the program. Because there are numerous potential failure states, this task is tedious for recovery code. The exploration is exhaustive, yet guaranteed to terminate because the model is finite. Thus, while thorough, the time and knowledge overhead of this technique renders it less practical.
2) Static analyses include data-flow analysis (DFA) and inter-procedural data-flow analysis (IPDFA) (used in optimizing compilers). These methods can statically ascertain data behavior. They need not understand or execute program code; they analyze all execution paths at compile-time [3]. For recovery code, these techniques consider failure behavior patterns rather than implementation details of the recovery. Checks are “local” and usually incomplete, yet more practical, as they attempt to predict and detect failure-causing behavior.
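The exhaustive state-space exploration described in item 1 can be illustrated with a minimal sketch. The toy model below (a mode plus a retry counter), the function names, and the "halted" property are all assumptions made for illustration; they do not come from any particular model checker. Because the model is finite and every state is visited at most once, the search always terminates.

```python
from collections import deque

# Hypothetical toy model: a state is (mode, retries_left).
def successors(state):
    mode, retries = state
    if mode == "running":
        # The system may keep running or fail at any step.
        return [("running", retries), ("failed", retries)]
    if mode == "failed":
        # Recovery is attempted only while retries remain.
        if retries > 0:
            return [("recovering", retries - 1)]
        return [("halted", 0)]
    if mode == "recovering":
        # Recovery may succeed or fail again.
        return [("running", retries), ("failed", retries)]
    return []  # "halted" is terminal

def explore(initial, is_bad):
    """Breadth-first search over every reachable state of the model."""
    seen = {initial}
    queue = deque([initial])
    bad = []
    while queue:
        state = queue.popleft()
        if is_bad(state):
            bad.append(state)
        for nxt in successors(state):
            if nxt not in seen:  # visit each state once, so the loop ends
                seen.add(nxt)
                queue.append(nxt)
    return seen, bad

# Assumed property to check: the system never reaches "halted".
reachable, bad_states = explore(("running", 2), lambda s: s[0] == "halted")
```

Even in this tiny model the number of reachable states grows with the retry budget, which hints at why the approach becomes costly for realistic recovery code with many failure states.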
IPDFA of recovery code can potentially expose erroneous system states that cannot be exposed by simple (global) DFA. Global DFA considers function bodies but terminates at procedure call-sites. IPDFA spans the entire program (it considers caller and callee bodies together). This wider domain for DFA can expose additional opportunities for error-analysis as well as optimization. In practice, recovery code contains small fragments with a chain of procedure calls. When the entire program is represented as a call-graph, some paths are “kosher”, i.e., acceptable under the property/model specification supplied by the user, while others are not. For example, consider the following call-graph.

The problem is to check whether the state changes made by W are allowed inside P's context. W may be acceptable when X calls R, but not when P calls R, which in turn calls W.
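The path check described above can be sketched as follows. The edges are inferred only from the prose (X calls R, P calls R, R calls W); the `FORBIDDEN` set stands in for a user-supplied property, and all names are hypothetical. The sketch assumes the call-graph is acyclic.

```python
# Assumed call-graph edges, taken from the surrounding prose only.
CALLS = {"X": ["R"], "P": ["R"], "R": ["W"], "W": []}

# Hypothetical user-supplied property: W must not run in a call chain
# that starts at P.
FORBIDDEN = {("P", "W")}

def call_paths(graph, start, path=None):
    """Yield every call chain beginning at `start` (graph assumed acyclic)."""
    path = (path or []) + [start]
    yield path
    for callee in graph[start]:
        yield from call_paths(graph, callee, path)

def find_violations(graph, roots):
    """Return the call chains whose (root, last callee) pair is forbidden."""
    found = []
    for root in roots:
        for path in call_paths(graph, root):
            if (path[0], path[-1]) in FORBIDDEN:
                found.append(" -> ".join(path))
    return found

bad_paths = find_violations(CALLS, ["X", "P"])
```

Here the chain through X is accepted while the chain through P is reported, even though both reach W through the same intermediate procedure R. An analysis that stops at R's call-site cannot distinguish the two.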
Assume we have a recovery code block FRecover(); typically, a function FNormal() calls FRecover() as below:
FNormal()
{
    if (error_case_1)
        FRecover();
    else
        FSomeFunc();
}
With simple DFA, we are aware of the state changes inside FRecover() when it is analyzed on its own, but not of their effect inside FNormal(). After inlining FRecover() as well as FSomeFunc() inside FNormal(), the whole body of FRecover() becomes visible inside FNormal(), and the analysis can see the state changes in that context. Simple DFA could not reveal such errors because it stops at procedure call boundaries (it is unaware of the behavior/side-effects of the callee); inside FNormal(), FRecover() is just a black-box. The state changes inside FRecover() may be permissible stand-alone, but in the context of FNormal() (i.e., when FNormal() calls FRecover()) they may not be (they may violate properties that were input to the system). For example, FRecover() has permissions to modify file parameters (owner, group, etc.), but when FNormal() (a user) calls FRecover(), a system function, such permissions are revoked [1].
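As a rough illustration of this argument, the sketch below compares what a call-boundary-limited analysis sees with what becomes visible once callee bodies are merged into the caller. The effect names ("chown", "chgrp", "write_log"), the tables, and the `ALLOWED` property are hypothetical; only the function names FNormal, FRecover and FSomeFunc come from the text. The merge is flow-insensitive (it unions both branches) and assumes no recursion.

```python
# Hypothetical per-function facts: direct state changes and direct callees.
DIRECT_EFFECTS = {
    "FNormal": set(),
    "FRecover": {"chown", "chgrp"},
    "FSomeFunc": {"write_log"},
}
CALLEES = {
    "FNormal": ["FRecover", "FSomeFunc"],
    "FRecover": [],
    "FSomeFunc": [],
}
# Hypothetical input property: state changes permitted in each context.
ALLOWED = {
    "FNormal": {"write_log"},
    "FRecover": {"chown", "chgrp"},
}

def local_effects(func):
    """What simple DFA sees: callees are black boxes."""
    return set(DIRECT_EFFECTS[func])

def inlined_effects(func):
    """Effects visible after inlining every callee (no recursion assumed)."""
    effects = set(DIRECT_EFFECTS[func])
    for callee in CALLEES[func]:
        effects |= inlined_effects(callee)
    return effects

def disallowed(func, effects):
    """State changes that violate the property for `func`'s context."""
    return sorted(effects - ALLOWED[func])

local_report = disallowed("FNormal", local_effects("FNormal"))
inlined_report = disallowed("FNormal", inlined_effects("FNormal"))
standalone_report = disallowed("FRecover", inlined_effects("FRecover"))
```

Under these assumptions the local analysis of FNormal() reports nothing, the stand-alone analysis of FRecover() reports nothing, and only the inlined view flags the ownership changes as violations in FNormal()'s context.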
It appears that the reliability of recovery code can benefit from IPDFA. Recovery code usually consists of short fragments without recursive calls, so simple procedure-inlining may work effectively.
References
[1] Hao Chen, David Wagner, and Drew Dean. Setuid Demystified. 11th USENIX Security Symposium, 2002.
[2] Judith Crow and Ben DiVito. Formalizing space shuttle software requirements: Four case studies. ACM Transactions on Software Engineering and Methodology, 7(3), July 1998.
[3] Checking system rules using system-specific, programmer-written compiler extensions. In Proc. 4th USENIX Symp. on Operating Systems Design and Implementation,
[4] Stanford CS444a Lecture notes on Formal Methods. http://cs444a.stanford.edu/slides/Formal.pdf
[5] UC Berkeley CS263 Lecture notes. http://www.cs.berkeley.edu/~necula/cs263/lectures.html