DAISY
Dependable Adaptive Interceptors &
Serialization-based sYstem
Taha Bennani,
Ludovic Courtès,
Jean-Charles Fabre,
Marc-Olivier Killijian,
Eric Marsden,
Francois Taiani
What is DAISY?
DAISY is a demonstration platform developed within the DSoS European
Project (Dependable Systems of Systems)
that focuses on the use of COTS (Commercial Off-The-Shelf)
technologies and reflection to build highly
dependable systems. It uses CORBA portable interceptors,
kernel-level reflection techniques, and system library interception
to harden a distributed application built out of standard software
components (CORBA, Java Virtual Machine (JVM), Linux).
Motivations
COTS Software is increasingly used in large and complex systems that
have high dependability requirements (Telecom, Automotive, Space,
Railways). Very frequently, COTS components serve as the foundation of
these systems and form the executive layers on which
value-adding applications are built.
This central role of COTS raises many problems regarding the
robustness and reliability of the resulting platforms. Indeed, high
dependability is usually not a primary target of COTS providers, and
the behavior of COTS executive layers in the presence of faults is
questionable. Based on our long experience in COTS characterization
and fault containment (for instance in the FRIENDS and
MAFALDA
projects), we show in DAISY how wrapping technologies and
reflective features can be used to build flexible and robust
distributed systems using market components.
Architecture and Configuration
![[architecture2.png]](./images/architecture2.png)
Fig. 1: Primary Backup Replication
To exemplify our approach, we implemented a mini client-server
banking application. This application is tolerant to crash faults
thanks to a classical Primary-Backup Replication scheme (also known
as "passive replication") that we transparently integrated with the
application using standard CORBA interceptors, thus achieving a
high degree of separation of concerns.
Figure 1 shows the resulting architecture we
obtained. PIS and PIC denote the portable interceptors
of the servers and the client respectively. The three interceptors
(one for the client and two for the servers) manage the passive
replication: they synchronize the checkpoints between the
primary and the backup with client requests, and provide error
detection.
One major weakness of such an architecture is that it does not
tolerate any failures other than plain crashes. Because the failure modes of
COTS components must be assumed unrestricted if one has no precise
knowledge of their internal dependability, a plain
primary-backup scheme alone cannot be trusted. Fortunately, using
wrapping and reflective features, this limitation can be overcome to a
large extent, as we show in the following.
![[config2.png]](./images/config2.png)
Fig. 2: Configuration
Figure 2 shows the actual configuration of the system
we used during the DSoS dissemination day. The two server processes
run on a rack in Toulouse, France, while the client is installed on a
laptop in the conference room in Vienna, Austria. This figure illustrates
the diversity of the software and hardware components
involved. This diversity is of high interest for fault tolerance
because it tends to eliminate correlated failure modes (i.e. the primary and
the backup failing because of the same single cause). This can be seen
as a cheap and "degenerate" version of N-version programming. This
diversity, however, comes at the price of increased complexity.
Wrapping the Linux Kernel
Among the various wrappers that could be implemented using the
proposed framework, we chose to wrap thread synchronization and
communication. CORBA and Java implementations usually make intensive
use of multi-threading facilities. A fault affecting the behavior of
synchronization facilities can have severe consequences on
multi-threaded entities, thus impacting various layers of
the system. In particular, mutex locks must behave correctly.
Figure 3 illustrates the functioning of the mutex
wrapper. This wrapper continuously checks an invariant property
that must hold for each mutex used by the layers above. This
invariant is derived from the definition of mutex semaphores and
related operations, and is expressed using Dijkstra's
terminology. #P(s) denotes the number of invocations of P(s)
(lock operation on mutex s), #V(s) the number of release
invocations on s, #Q(s) the number of threads blocked on a P
operation for s, and #C(s) the number of threads that possess
s. Mutual exclusion implies that at most one thread can possess s,
i.e. #C(s) <= 1.
![[wrapper2.png]](./images/wrapper2.png)
Fig. 3: Mutex Wrapper
The formula is a simple balance equation on threads interacting with
mutex s (similar to those found in fluid mechanics, for
instance). #P(s) - #V(s) is the number
of threads interacting with the mutex at a given time. These threads
are either in the queue (#Q(s)) or possess the semaphore
(#C(s)), hence the resulting formula: #P(s) -
#V(s) = #Q(s) + #C(s). To evaluate
this expression, the platform must provide #P, #V,
#C and #Q. The first two, #P and #V,
can easily be obtained using library interposition techniques, a
conventional approach to intercepting operating system calls. The
last two, #Q and #C, require introspecting the
operating system kernel, as this information is normally not
available. We implemented this introspection by inserting into Linux a
reflective kernel module that provides two functions, Get_#Q() and
Get_#C().
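As a minimal sketch of the check itself, the invariant can be written as a predicate over the four counters. The names below (MutexCounters, invariant_holds and the field names) are hypothetical and only mirror the notation of the text; in DAISY, #P and #V come from the interposed library and #Q and #C from the kernel module, whereas here they are plain numbers.

```python
from dataclasses import dataclass


@dataclass
class MutexCounters:
    """Counters for one mutex s (hypothetical names mirroring the text)."""
    p_calls: int = 0   # #P(s): lock invocations seen by the interposed library
    v_calls: int = 0   # #V(s): release invocations seen by the interposed library
    queued: int = 0    # #Q(s): threads blocked on P(s), as reported by the kernel
    holders: int = 0   # #C(s): threads possessing s, as reported by the kernel


def invariant_holds(c: MutexCounters) -> bool:
    """Check #C(s) <= 1 and #P(s) - #V(s) = #Q(s) + #C(s)."""
    balanced = (c.p_calls - c.v_calls) == (c.queued + c.holders)
    return c.holders <= 1 and balanced


# Five locks requested, three released: one holder and one waiter remain.
print(invariant_holds(MutexCounters(p_calls=5, v_calls=3, queued=1, holders=1)))  # True
```

Keeping the check a pure function of the counters separates how the values are collected (interposition, kernel introspection) from what is verified.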
Many situations may render this formula false, all breaking the
mutual exclusion semantics. One of these situations is when the
mutex gets "lost"; i.e. a V (release) operation does not work
properly, and does not actually release the mutex. If this happens,
all subsequent threads that ask for the mutex are blocked
indefinitely, resulting in a partial hang of the
application. Depending on which threads are blocked, this may result in
the server blocking on client invocations indefinitely (server
hang). As explained before, this kind of failure is not tolerated by
a plain, wrapper-less primary-backup replication scheme. In the
following part, we show step by step how, thanks to this mutex
wrapper, this particular fault can now be tolerated by the DAISY platform.
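The "lost" mutex scenario can be traced with a toy, single-threaded model. This is an illustrative sketch only: ModelMutex and run are hypothetical names, the model is not a real lock, and the fault is simply a V operation that is counted but has no effect.

```python
class ModelMutex:
    """Toy single-threaded model of a mutex s; not a real lock."""

    def __init__(self, lose_release=False):
        self.lose_release = lose_release  # injected fault: V(s) has no effect
        self.p_calls = 0      # #P(s)
        self.v_calls = 0      # #V(s)
        self.holder = None    # thread possessing s, if any (#C(s) is 0 or 1)
        self.queue = []       # threads blocked on P(s) (#Q(s) is its length)

    def P(self, thread):
        self.p_calls += 1
        if self.holder is None:
            self.holder = thread
        else:
            self.queue.append(thread)

    def V(self, thread):
        self.v_calls += 1
        if self.lose_release:
            return  # the mutex is "lost": it stays held
        self.holder = self.queue.pop(0) if self.queue else None

    def invariant_holds(self):
        holders = 0 if self.holder is None else 1
        return self.p_calls - self.v_calls == len(self.queue) + holders


def run(lose_release):
    s = ModelMutex(lose_release)
    s.P("T1")   # T1 acquires s
    s.V("T1")   # T1 releases s, or fails to if the fault is injected
    s.P("T2")   # T2 asks for s
    return s.invariant_holds(), s.queue


print(run(lose_release=False))  # (True, [])
print(run(lose_release=True))   # (False, ['T2'])
```

In the faulty run, #P(s) - #V(s) = 1 while #Q(s) + #C(s) = 2: T2 is queued behind a mutex that nobody will ever release, and the invariant exposes it.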
Demonstration Step by Step
We show here step by step the functioning of the mutex wrappers. We
first launch the distributed application (backup, primary, client,
plus a small demo monitoring facility); we run a series of actions
without faults and observe the exchange of state information between
the replicated servers. In a second phase, we activate the mutex
wrapper, and inject a general mutex fault during the same series of
actions. Without wrapping, this fault would freeze both client and
primary, and go undetected by the primary/backup mechanisms. Thanks
to the wrapper, the freeze is detected and the failure mode is
converted into crash-fail behavior, leading to the transparent
recovery of the whole application by the backup.
Step 1: Initialization:
At initialization, both the primary and the backup are launched. We
developed a small monitoring facility to dynamically toggle the
wrapping facility of the platform and to inject faults
artificially. In Figure 4, the backup is running
on a machine called perth.laas.fr, while the primary runs on
canberra.laas.fr. The demo monitoring facility is shown in
front of the two windows.
![[begin-server-small.png]](./images/begin-server-small.png)
Fig. 4: Servers right after Init
Figure 5 gives a closer look at this monitoring
facility. It combines fault-injection features with
reflective capabilities to control and adapt the fault-tolerant
behavior of the application. As such, it can be seen as a kind of
meta-interface for the servers. The monitoring facility does not
run on the same machine as the replicated servers, and communicates
with them over a TCP/IP connection on port 7700.
![[demo-management-begin.png]](./images/demo-management-begin.png)
Fig. 5: Meta-Interface and Injection Monitor
The GUI of the client is shown in Figure 6. The
client provides basic account management functions: creation,
deletion, deposit, withdrawal, balance. This kind of application
has a state structure that is both dynamic and non-trivial, which is
of particular interest in our case.
![[client-during-transaction-small.png]](./images/client-during-transaction-small.png)
Fig. 6: The Client
Step 2: Fault-Free Behavior:
We launch a series of operations without injected faults (Figures
7 and 8):
Creation of account 'marco'
Deposit of 50 on 'marco'
Balance of 'marco'
Deposit of 50 on 'marco'
Balance of 'marco'
Withdrawal of 100 from 'marco'
Balance of 'marco'
For each operation, the primary checkpoints its state and
communicates it to its backup. The portable interceptor of the
backup server ("PISb") receives this checkpoint information and
modifies its own internal state accordingly.
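The checkpoint exchange described above can be sketched as follows. This is a toy in-process model, not DAISY's CORBA code: Bank, Primary, Backup and install_checkpoint are hypothetical names, and a direct method call stands in for the transfer handled by the portable interceptors.

```python
import copy


class Bank:
    """Toy banking state: account name -> balance."""

    def __init__(self):
        self.accounts = {}

    def apply(self, op, name, amount=0):
        if op == "create":
            self.accounts[name] = 0
        elif op == "deposit":
            self.accounts[name] += amount
        elif op == "withdraw":
            self.accounts[name] -= amount
        return self.accounts[name]  # a "balance" request only reads the value


class Backup:
    def __init__(self):
        self.bank = Bank()
        self.checkpoints_received = 0

    def install_checkpoint(self, state):
        # Role played by PISb in the text: overwrite the local state.
        self.bank.accounts = copy.deepcopy(state)
        self.checkpoints_received += 1


class Primary:
    def __init__(self, backup):
        self.bank = Bank()
        self.backup = backup

    def handle(self, op, name, amount=0):
        result = self.bank.apply(op, name, amount)
        self.backup.install_checkpoint(self.bank.accounts)  # one checkpoint per operation
        return result


backup = Backup()
primary = Primary(backup)
ops = [("create", "marco"), ("deposit", "marco", 50), ("balance", "marco"),
       ("deposit", "marco", 50), ("balance", "marco"),
       ("withdraw", "marco", 100), ("balance", "marco")]
print([primary.handle(*op) for op in ops])  # [0, 50, 50, 100, 100, 0, 0]
print(backup.bank.accounts)                 # {'marco': 0}
```

The model sends the full state on every operation, including read-only ones, as the text describes; a real system might send deltas instead, which is outside the scope of this sketch.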
![[primary_normal.png]](./images/primary_normal.png)
Fig. 7: A Series of Actions on the Primary
![[backup_normal.png]](./images/backup_normal.png)
Fig. 8: And their Effect on the Backup
Step 3: Activating the Mutex Wrapping:
We activate the mutex wrapper using the demo monitoring
facility (Figure 9). This means that the invariant property #P(s) -
#V(s) = #Q(s) + #C(s) is now checked for each mutex
on every mutex operation.
![[demo-mngmnt-enable-check.png]](./images/demo-mngmnt-enable-check.png)
Fig. 9: Enabling Mutex Wrapper
We now launch the same series of actions as in Step 2, and
inject a general mutex fault into the primary using the demo
monitoring facility. This fault is immediately detected by the mutex
wrapper (implemented in our example by the interception library
libuspi), which crashes the primary (Figure 10), thus triggering
the switch from the primary to the backup (Figure 11).
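The conversion of a silent mutex fault into a crash, followed by the switch to the backup, can be modelled as below. This is a sketch under assumptions: the text does not detail how the switch is carried out, so ServerCrashed, Replica and ClientInterceptor are hypothetical names, an exception stands in for the process crash, and the retry logic is only one plausible way for a client-side interceptor to react.

```python
class ServerCrashed(Exception):
    """Stands in for a process crash observed from the client side."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.invariant_ok = True   # outcome of the mutex wrapper's check

    def handle(self, request):
        if not self.invariant_ok:
            # Wrapper reaction: turn the silent fault into a crash
            # instead of letting the server hang.
            raise ServerCrashed(self.name)
        return f"{request} handled by {self.name}"


class ClientInterceptor:
    """Hypothetical client-side redirection to the next live replica."""

    def __init__(self, primary, backup):
        self.replicas = [primary, backup]

    def invoke(self, request):
        while self.replicas:
            try:
                return self.replicas[0].handle(request)
            except ServerCrashed:
                self.replicas.pop(0)   # the next replica becomes the primary
        raise RuntimeError("no replica left")


primary, backup = Replica("primary"), Replica("backup")
pic = ClientInterceptor(primary, backup)
print(pic.invoke("deposit"))    # deposit handled by primary
primary.invariant_ok = False    # injected mutex fault, detected by the wrapper
print(pic.invoke("balance"))    # balance handled by backup
```

The point of the model is the failure-mode conversion: the replication layer only needs to cope with crashes, because the wrapper maps the hang onto a failure it already handles.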
![[primary_crash.png]](./images/primary_crash.png)
Fig. 10: Mutex Fault Injection in the Server
![[backup_switch.png]](./images/backup_switch.png)
Fig. 11: Switch to the Backup
Future of DAISY
Right now, requests made on the servers are serialized, which greatly
simplifies the checkpoint algorithms and the state capture
mechanisms. Checkpoints are taken before any reply to a client, which
ensures that a sent reply can always be recovered without requiring
the client to roll back as well. (This avoids the classical domino effect,
which is not acceptable in our case since we consider the client state
and actions to be outside the sphere of control of our mechanisms.)
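The two properties of this paragraph, serialized requests and a checkpoint taken before each reply, can be illustrated with a small threaded model. It is a sketch only: SerializedServer is a hypothetical name, a global lock stands in for request serialization, and appending to a log stands in for sending the checkpoint and the reply.

```python
import threading


class SerializedServer:
    def __init__(self):
        self._lock = threading.Lock()   # only one request is processed at a time
        self.balance = 0
        self.log = []                   # ordered trace of events

    def deposit(self, request_id, amount):
        with self._lock:
            self.balance += amount
            # The checkpoint is recorded first, so every reply the client
            # may have seen is covered by a checkpoint.
            self.log.append(("checkpoint", request_id, self.balance))
            self.log.append(("reply", request_id, self.balance))
            return self.balance


server = SerializedServer()
threads = [threading.Thread(target=server.deposit, args=(i, 10)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(server.balance)  # 40
```

Whatever the thread scheduling, the log alternates checkpoint and reply entries for the same request, which is the ordering that lets recovery proceed without rolling back the client.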
Serializing request processing is straightforward but forfeits the
advantages of multi-threading, notably w.r.t. availability and
throughput. As much prior research has acknowledged,
multi-threading raises two main challenges: non-determinism (of prime
importance for active replication schemes like Triple Modular
Redundancy), and state-restoration (problematic because of the opacity
of the layered executive platform, and the entangling of state
dependencies between the layers).
We have addressed these challenges with a new approach that we termed
"Multi-Level Reflection" (or Multi-Layer Reflection), presented in the
following articles: PRDC-02 and DSN-03. This work led to the
specification of a multi-layer meta-interface targeting those problems,
and we are now working on a prototype implementation of these ideas
within DAISY.
Last generated on 24 Feb 2007
francois.taiani@comp.lancs.ac.uk