Recovery Law

Archana Ganapathi

Recovery Law: Every system must have a recovery mechanism. Every constituent component of the system must support the system recovery mechanism. A component's own recovery mechanisms, where they exist, must be compatible with those of its peer components and of the system.

This law ensures that one component's recovery mechanism does not conflict with another's. Sample recovery mechanisms are undo, micro-reboot and system restart. Each component must notify the system's global recovery manager of its individual recovery mechanisms. The recovery manager checks that component recovery methods are consistent, conflict-free (isolated) and dependable, and identifies inter-component recovery interference. With this knowledge, upon failure and during recovery, the recovery manager can estimate the total time to recover and inform interested parties of the downtime.

For example, suppose a system X employs "undo" as its primary recovery mechanism. Component A must then be undo-compatible, i.e., its functionality is not sacrificed upon a system undo. Similarly, if component A requires "undo" as its recovery mechanism while component B is not undo-compatible, the system must detect this incompatibility and display a useful error message at initial configuration. Also, if component B is using an updated version of a file and component A rolls back this file, B must be notified.

Recovery Law applies to Internet and operating systems, networking, databases, application programming and almost all software categories that use recovery techniques.

Uses for Recovery Law:

1. Web browser failure to display page: When a web page is unavailable due to web server failure, the error message is often not descriptive enough because it lacks information about the cause of the outage.
By applying the Recovery Law, the failing component's recovery mechanism must be compatible with the web server's, so the web server would have sufficient information to suggest an alternate access path to the user, or at least to give a descriptive error message stating when normal operation is expected to resume, so that the user can choose to abandon or retry the action.

2. Database server query handling: When a database query arrives and the query handler is on a non-functional server, the incoming request handler should queue the request instead of dropping it. Upon resumption of service, the query handler must service pending queries first. Such a query handler recovery mechanism complements that of the incoming request handler.

3. Network packet loss/drop due to congestion: In case of congestion, for example at a router, the congested component should inform the sender of its current state and notify the sender upon decongestion, rather than dropping packets without notification. The sender should then consider alternate paths rather than flooding the router and contributing to the congestion.

4. Incompatible library versions: Suppose that a new version of a shared library is released (perhaps fixing a major bug that affects all the applications that use it). If the interface or functions change, applications using the library may experience compilation errors. The upgraded library version should include a header describing the interface and function changes, which applications must review before they update their DLLs.
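
The registration and compatibility check that the law asks of a global recovery manager can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Component, RecoveryManager, incompatibilities and estimated_downtime are hypothetical, each component is assumed to report its own recovery time, and components are assumed to recover one after another so that their times add up.

```python
from dataclasses import dataclass


@dataclass
class Component:
    # Hypothetical descriptor a component hands to the recovery manager.
    name: str
    supported: set[str]            # mechanisms the component tolerates, e.g. {"undo"}
    recovery_seconds: float = 0.0  # assumed to be self-reported by the component


class RecoveryManager:
    def __init__(self, system_mechanism: str):
        self.system_mechanism = system_mechanism
        self.components: list[Component] = []

    def register(self, component: Component) -> None:
        # Each component notifies the manager of its recovery mechanisms.
        self.components.append(component)

    def incompatibilities(self) -> list[str]:
        # One readable message per component that cannot take part in the
        # system's recovery mechanism; meant to be shown at configuration time.
        return [
            f"{c.name} is not {self.system_mechanism}-compatible "
            f"(supports: {', '.join(sorted(c.supported)) or 'none'})"
            for c in self.components
            if self.system_mechanism not in c.supported
        ]

    def estimated_downtime(self) -> float:
        # Assumption: components recover sequentially, so the times are summed.
        return sum(c.recovery_seconds for c in self.components)


# System X uses "undo"; component B only supports a full restart.
manager = RecoveryManager("undo")
manager.register(Component("A", {"undo", "micro-reboot"}, 2.0))
manager.register(Component("B", {"system-restart"}, 30.0))
for message in manager.incompatibilities():
    print("configuration error:", message)
```

Running the check at initial configuration, rather than at failure time, mirrors the requirement that an incompatibility between components A and B be reported before the system is relied upon.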