Table Of Content

Automated Environment Generation for Software Model Checking Oksana Tkachuk, Matthew B. Dwyer Corina S. Pisireanu Department of CIS Kestrel Technologies Kansas State University, USA Moffett Field, USA oksana@c is.ksu.edu pcorina (3 email.nasa.arc.gov Abstract they have not been adapted for software written in modem programming languages. A key problem in model checking open systems is en- In this paper, we describe automated tool support for vironment modeling (i.e., representing the behavior of the adapting existing software model checking frameworks to execution context of the system under analysis). Sojhare provide a restricted form of modular verification. Specif- systems are fundamentally open since their behavior is de- ically, we consider decomposition of a Java program into pendent on patterns of invocation of system components and two parts: a unit under analysis (henceforth called a unit) values defined outside the system but referenced within the and its environment. A unit is any collection of Java classes system. Whether reasoning about the behavior of whole and its environment consists of the classes with which the programs or about program components, an abstract model unit interacts through the unit’s inte~ace.’ The unit’s source of the environment can be essential in enabling suficiently code will be the subject of verification along with an ab- precise yet tractable verijication. stract model of the environments externally observable be- In this papel; we describe an approach to generating en- havior. This environment model is derived from specifica- vironments of Java program fragments. This approach in- tions written by the user or from the results of analyzing tegrates formally specijied assumptions about environment source code that implements environment components. Ex- behavior with sound abstractions of environment implemen- isting abstraction techniques [9] may be applied to local unit tations to form a model of the environment. The approach is data and to the data that flows between the unit and environ- implemented in the Bandera Environment Generator (BEG) ment. The resulting abstracted unit and environment may which we describe along with our experience using BEG then be analyzed by existing Java model checking frame- to reason about properties of severcl non-trivial concurrent works such as JavaPathFinder [30] and Bandera [7]. Java programs. Thorough treatment of the mechanisms by which the environment may influence the behavior of the unit is es- 1 Introduction sential for sound reasoning. The environment may influence the unit’s control (e.g., by invoking methods in the unit’s interface or influencing synchronization relation- Model checking the source code of realistic software sys- ships) and data (e.g., by passing environment data to the tems is a challenge and is currently the topic of a large num- unit or by modifying unit data that may flow to the en- ber of research efforts (eg, [7, 16, 301). The primary chal- vironment). By unit data we mean objects of unit type lenge lies in overcoming the enormous cost of model check- (i.e., the object’s type is included in the unit). Java classes ing which grows as the product of the number of indepen- are broken into two categories depending on whether dent program components, such as, threads of control.-Most they hold a thread of control. In Java, a class contain- researchers agree that abstraction is the key to overcoming the main method or classes that extend(imp1ement) ing this challenge. Research on abstracting the data state j ava. lang. Thread ( j ava .l ang .R unnable) are of programs using techniques such as predicate abstraction labeled active, the rest are termed passive. For consistency, (eg, [3]) are steadily increasing the size and complexity of we reuse terminology from previous work [IO], and call the programs that can be efficiently analyzed. A complemen- active environment classes drivers and passive environment tary approach involves decomposing the program, check- classes stubs. Our approach provides mechanisms by which ing properties of the components, and then composing the a wide-range of driver and stub behaviors may be safely ap- analysis results to draw conclusions about the overall behavior of the program. A variety of forms of compositional ‘We treat interfaces in Java as classes which comprise a unit inrefice or modular verification have been studied (e.g., [18]) but in our terminology. .. proximated , Experience has shown that the developing environment . I BEG Unit models for software model checking that are sufficiently Under Analysis precise to enable effective reasoning yet not so over- 4iw i restrictive that they mask faulty system behaviors is a sig- \ nificant challenge [19, 201. Developing such an environ- pEq- ment may require an understanding of unstated assumptions UnderU Annita lysis t --1- Java I Model about system usage and software interfaces, careful coding Checking 1: to ensure that those assumptions are satisfied in the least restrictive way, and evaluation through model checking of the environment and the unit under analysis. For this rea- Under Analysis Unit son, we believe that multiple sources of infomation should Under Analysis be combined to generate environment models that reflect a broad range of realistic execution contexts for a unit under analysis. BEG is aimed at both minimizing the effort re- Figure 1. BEG Architecture quired to generate environment models and increasing their fidelity with respect to assumptions about environment be- of the external environment. havior. Specifically, BEG currently automates: the discovery of the unit-environment interface based on minimal user Our approach builds on existing work in assume- supplied information, the synthesis of environment drivers guarantee reasoning and in program flow analysis. The ap- from specifications of the sequences of program actions proach is implemented in the Bandera Environment Gen- they may perform, and the synthesis of environment stubs erator (BEG) which supports modular checking of Java from analysis of the possible program actions executed by source code. Our previous work [28] presented the details of the program analysis and synthesis techniques used existing environment code. Program actions in our setting are statements that may directly influence the data or control to model the data effects of environment implementations. This paper focuses on control effects for active environ- state of the unit. ment components and makes several contributions, includ- We envision two ways in which environment generation ing (i)d efining a language of program actions with which to tools can be used effectively: during component develop- specify environment behaviors; (ii)a dapting existing spec- ment as an adjunct to traditional unit testing approaches and ification forms for defining patterns of environment behav- during program validation to enable more efficient reason- ior; (iii)s ynthesing source code models of environment being and to model non-source-code components. havior that- can be processed by existing- model checking During component development individual classes, or frameworks; and (iv) a preliminary evaluation of the effec- groups of dasses, that constitute cohesive functional com- tiveness of BEG in supporting modular source-code model ponents, perhaps structured as Java packages, may become checking. While BEG supports the checking of Java source code complete when the code they interact with (e.g., client code, the fundamental concepts it embodies are much more code) has not been written. In this setting, the class(es) form broadly applicable. a unit and the missing classes they interact with form the The next Section describes our basic approach and an environment. To enable effective checking, we expect that example that is used throughout the paper. Section 3 de- developers will need to encode assumptions about the be- scribes the formalisms for specifying environment behav- havior of the environment at the unit’s interface. They will ior and generating environment models as Java source code. need to account for both control and data effects. These as- Section 4 discusses the soundness of environments relative sumptions can subsequently be checked against implemen- to specified assumptions. An overview of several case stud- tations of the missing environment classes as they become ies using BEG is presented in Section 5. We then compare code complete. and contrast our work to existing research in Section 6 and During program validation when considering a complete conclude in Section 7. application one may break the system into parts to enable more efficient checking of program properties. In this set- 2 Basic Approach and An Example ting, the user selects classes that comprise the unit under analysis and an environment model is automatically ex- tracted. For applications that interact with external entities, The fundamental assumption in BEG is that precise rea- such as embedded control software processing data from soning about the unit is desired, but that some precision in hardware devices, developers may incorporate assumptions modeling the environment may be sacrificed. Our approach about those interactions to generate a representative model is to safely approximate environment data and the environ- 2 ment statements that may influence the unit’s behavior. public boolean unregister(Watcher w) ( Modelling the effect of environment statements is if (super.r emoveElement (w)) ( w.regrstered = false; return true; achieved by a combination of user specifications and analy- I sis of Java source code. Figure 1 shows the architecture of return false; I BEG. BEG accepts multiple information sources for gener- public Watcher removefirsto ( ating an environment model. Users identify the unit under Watcher result = elementAtIndex(0); removeElement(resu1t); analysis by naming the classes, interfaces and packages that return result; comprise the unit. Users provide specifications of their as- 1 sumptions about the patterns of unit method calls and unit Buffer Implementation field definitions that the environment may make. If an im- public boolean unregister (Watcher PO) ( plementation of the environment is available, BEG may be if(chooseBool0) p0.registered = false; used to automatically extract the environment assumptions return chooseBool () ; 1 using static analysis techniques. Thus, environment mod- public Watcher removefirsto { els can be synthesized from a combination of assumption return ( (Watcher)c hooseClass (“Watcher”)) ; I specifications and the results of analyzing implementations. Generated Environment Those models are encoded as Java source code using a collection of modelling primitives to express the atomic execu- Figure 4. Bounded Buffer Stubs (excerpts) tion of environment actions, to encode non-determinism in the environment, and to reflect the approximation in analysis results. that are immediately referenced in the unit. The part . We illustrate our approach on a small publish- of the environment that is invisible to the unit is safely subscribe program implemented using Java’s Observer approximated. and Observable library components. Figure 2 shows class Subj e ct,w hich is an observable; a field obs of type 2.2 Driver Specification and Synthesis Buffer,s hown on the left side of Figure 4, is a container for Watchers that are registered for the Subject. The One may specify assumptions about sequences of Watcher class contains two bookkeeping fields that record method calls and unit field definitions that the environment the total number of registration attempts and the num- may make on the unit. BEG generates a set of driver ber of aborts. Suppose, we are interested in reasoning threads that implement the most liberal model that is con- about whether “Only registered Watchers are notified of sistent with the given assumptions. Figure 3 illustrates Subj ect updates”. This can be specified in several ways, an assumption with one instance of Subject and two but one approach is to test whether registered field of Watchers and a pair of threads whose behavior is given Watchers is true at the point where a Sub je ct calls by regular expressions over method names with parameter update 0. values elided; elided parameters means that their value is selected non-deterministically from the possible values of 2.1 Interface Discovery the parameter type. The first thread repeatedly calls the changestate ( 1 methodon a selected Sub] ect and the The user designates the unit under analysis by naming a second calls any sequence of add ( ) or delete ( ) calls collection of Java classes whose properties need to be veri- on a selected Subject with some Watcher. fied. In general, selection of the classes in the unit is driven Figure 3 also shows the generated drivers that cap- by the properties that one wants to reason about. ture the assumed behavior. Main allocates the spec- These classes are analyzed to determine: the fields of ified instances and starts the execution of the two unit supertype classes that are referenced in the unit and threads. Thread implementations model the assump- the non-unit classes that are directly referenced by the unit. tion specifications by invoking modeling primitives that Any referenced supertypes are included in the unit. Directly capture non-determinism (e.g., chooseBoo1 ( ) chooses referenced non-unit classes define the unit inte$ace. among {true,f alse}, chooseInt (n) chooses among For our example and the mentioned property, (0,. . . ,n},a nd chooseClass (C) chooses among the Subject and Watcher should be in the unit. allocated instances of class C) [8]. Their supertypes ja va .uti l .O bservable and j ava .u ti1 . Observer and referenced Buffer are 2.3 Stub Analysis and Synthesis in the environment. Note, that the actual environment may consist of more classes due to transitive class and A series of static analyses, including points-to and side- method dependencies, however, BEG identifies classes effects analyses, are applied to determine how the environ- 3 public class Watcher implements Observer{ public void notify(0bject arg) { static public int attempts = 0; Watcher cw; static public int aborts = 0; Buffer lb = new BufferO; public boolean registered = false; synchronized (this) ( public void update(Observab1e 0, Object arg) [ ) if ( !changed) return; 1 obs.copy(lb); changed = false; public class Subject extends Observable ( 1 boolean changed = false; if (obs.size0 != lb.size0) cw = null; Buffer obs; while (!lb.isEmptyO) { public Subject0 ( obs = new BufferO; ] cw = lb.removeFirst0; cw.update(this, arg);] public void changestate0 { setchangedo; notifyOOservers0;) ) public synch. . . void add(Watcher 0) ( obs.r egister (0); ] protected synch ... void setchanged0 ( changed = true;] public synch.,. void delete(Watcher 0) { obs.unregister(o); ) ) Figure 2. Customized Observer Implementation environment [ public class TO extends java.lang.Thread [ instantiations ( 1 Subject; 2 Watcher; ] public Subject SO; regular assumptions ( public Watcher w0, wl; (changeStateOl*; public TO(Sub]ect PO, watcher pl, Watcher pZ)[ (add0 I deleteo)*; so = po; wo = p1; Wl = p2;j public void run0 ( I while(Bandera.chooseBool0) sO.changeState0;) 1 public class EnvDriver ( ppuubblliicc c vloaisds rT1u ne0x t( ends java.la.ng.T.hrea d ( ... public static void main(java.lang.String[l paramO)( while(Bandera,chooseBool~)) Subject SO = new Subject0 ; switch(Bandera.choose1nt (2)1 [ Watcher w0 = new WatcherO; case 0: sO.delete(Bandera.chooseClass("Watcher")); Watcher wl = new WatcherO; break; new TO(s0, w0, wl1.starto; case 1 : SO. add (Bandera.c hooseclass ("Watcher") ; new TL(SO, w0, wl) .start(); ) ' break; ) ) 1 Figure 3. Assumptions and Generated Drivers ment methods can influence the unit data [28]. In our exam- they perform any sequence of calls over the methods in the ple, the analysis of the Buffer implementation calculates system by selecting appropriately typed class instances). effects of the environment on the fields of Watcher, for In its current form, the tools make assumptions about the instance method unregister may side-effect only one lack of divergence indefinite-blocking, and lock acquisition field registered. in the environment. Ongoing work is extending the tools Models are generated to reflect all possible side-effects to support the specification of behavior related to these lan- as calculated by the preceding analyses. To safely reflect guage aspects and the extraction of safe approximations of the possibility of a side-effect, code is generated to execute such behavior from implementations. Despite these limi- abstract assignmenrs non-deterministically. Figure 4 shows tations, the BEG toolset has been effective in supporting the generated environment for several methods of Buf f er. modular reasoning about properties of a number of realistic For example, the access of a Watcher instance, via call systems as discussed in Section 5. elementAtIndex (0) , in method unregister () is approximated as the return of a non-deterministically cho- 3 Driver Specification and Synthesis sen instance of Watcher. We focus in this section on the specification of the 2.3.1 Tool Support expected behavior of environment drivers. The building blocks of those specifications are descriptions of program This example illustrated the basic capabilities of BEG. actions that may influence the control or data State of the BEG the 'pecification a wide-range Of assump- unit under analysis. Those program actions are then corn- tions about environment behavior compactly using regular bined to describe the of environment behavior. expressions, temporal logic formula (defined over program actions), and data side-effects summaries. In the absence of 3.1 Environment Instantiation specified assumptions, BEG can be configured to make rea- - sonable assumptions about the intended environment. For example, it is assumed that the calling environment consists We define a name scope within which environment spec- of a number of unit class instances and threads (specified ifications may refer to specific class instances; by default on the command line) that exhibit universal behavior (Le., the name scope is empty. 4 The global name scope is defined by annotating instanti- unit type, and rhs is either a scalar constant or T rypegc)>if ations. In an instantiation, the number of instances allocated fype(n E B, or chooseClass (typeCf)1 , if type@ E U. of a type by the environment is given and those instances Here Ttypefl denotes any possible value of type@; for may be named. For example, we can adapt the assump- scalar types the expansion of values is done implicitly via tion specification in Figure 3 to explicitly name the lone abstraction [9]. The target of the assignment, T, is either an Subje ct instance, s, and reference it regular expression. introduced name, T for an appropriate type, or the name of environment { a class. riengsutlaanrt iaastsiuomnpst i( on1s Su{ bl(esc at dsd;0 I s.d1 eleteO)*, ... ) Method call actions are defined using standard Java syn- I tax, but where partial specification of parameters is allowed. It is important to distinguish between named instances Consider a method in class C with signature public R and the set of all instances. The latter is the set of all envi- m (PI PI, P2 p2 . We can denote the occurrence of a ronment and unit allocated instances. That set forms the call to this method with any receiver object of type C, a universe from which non-deterministic choice primitives specific value, VI,f orpl, and any value forp2 as m(wl,T ), over reference types are evaluated. where w1 is an introduced name or a scalar. Partially speci- A local name scope can also be defined that applies to a fied calls may omit the receiver object or any parameter by portion of an assumption specification. The preceding ex- replacing it with T. The meaning of such a call is the set ample can be rewritten using a local name scope as: of all calls that can be constructed by replacing T with any legal value of the receiver or parameter type. environment ( instantiations ( 1 Sublect; ... ) We note that BEG, through the process of interface dis- regular assumptions { covery, produces the set of program actions (ie., public <Subject s>:(s.addO I s.deleteO)*; ... 1 method calls and fields at the unit-environment interface) that can be used to define assumptions. By default local names are bound to a non-deterministically selected value of the given type that holds throughout the name scope (which is denoted explicitly by a pair of { } 3.3 Specifying Patterns of Actions and which extends to the end of the expression by default). Thus, they serve a function similar to universal quantifiers Regular expressions defined over this alphabet describe a in logics and their primary use is in correlating event oc- language of actions that can be initiated by the environment. currences (e.g., that a sequence of actions are applied to the The simplest regular expression is a single program action. same receiver object). Complex environment assumptions are built up using the In these examples, there is a single instance 'of standard operators for regular expressions: r;s (concatena- Subject, thus the three specifications are semantically tion), TIS (disjunction), r* (closure), and r? (one or more equivalent. In general, this will not be the case. Lo- occurrences of r). Positive closure (r+), bounded iteration cal name introduction is interpreted as non-deterministic (r * n = rl; 7-2; . . . ;r n), and a generalization of bounded choice over the the set of allocated instances of the named iteration (r * {n,m } = r * nlr * (n+ 1)I . . . * m) are type, Subject in this case. Local name scopes may be also supported. These expressions can appear in introduced nested and may refer to additional names. For example, name scopes, where those names are referenced in the pro- <Subject x>:<Subject-xy >:.. . introduces two gram actions used in the expression. The syntax of these names that are guaranteed to refer to distinct instances of assumptions is given in Figure ??. where a is a program Subj e ct. Local names may also be bound to values from action, afuna refunction call actions, n and m are intro- the unit. For example, <Ref x=getRef ( ) > :x .m ( ) in- duced name, and t is a program type name. Legal assump- troduces a name x that is bound to the value returned by tion specifications must also satisfy some type constraints. a call and that is subsequently used to perform a call on Specifically, type expressions, te, may only involve types method m ( ) . and named variables, m, of that type; m here must refer to a name introduced in an enclosing name scope. Similarly, 3.2 An Alphabet of Program Actions name initializations, ni, may only involve function call actions whose type is compatible with the type of the type Let U be the set of classes that comprise the unit under expression for the introduced name. analysis and let B denote the set of Java builtin types. We As an example, java.uti1.I terator presents a define an alphabet of actions consisting two classes of ac- simple standard interface for generating the elements in tions: field assignments and method calls. an instance of a container. Semantically, this interface as- Assignments can be either static field assignments or as- sumes that for each instance of a class implementing the signments through object references of unit type. Assign- Iterator interface (denoted by the introduced name i), ments are of the form r$= rhs where: type(.) E U,f is of all clients will call methods in an order that is consistent 5 8 a switch (chooseInt (1)) { case 0: code(r); break; case 1: code(s) ; break; 1 T; S -+ code(r) ; code(s); T* -+ while (chooseBool0) {code(r); ) T+ --t do { code(r1 ;] while (chooseBoo1 0 T? 4 if (chooseBool0) {code(r); ) S* T * n 4 for (mt i=O;i<n;i++) { S+ code (r); ) s+i,j 1. < t ni.>: s for (int i=O; <ten>:s icnlcchooseInt (m-n); I++) { n code (r); Tu = Ufvn 1 t <t n>:r --+ { t n = chooseClass (t); te - m code (r); } <t n= f >:T 4 (t n = fo; code (r); ) Figure 5. Assumption Syntax <t-m1- mk n>:r -+ I t n; while (true) { n = chooseClass(t); if (n == ml) continue; with the following specification: ... if (n == mk) continue; environment { instantiations [ k Iterator; } break; regular assumptions { 1 [Iterator 11 :i.iteratorO; code (r); } (i. h asNext () ; i . next () ; i .remove( ) ? 1 * 1 This expresses both required sequencing of calls (e.g., a Figure 6. Assumption Semantics call to iterator 0 must precede a call to hasNext) and allowable optional calls (e.g., the occurrence of a single Regular expressions are a familiar formal notation to remove ( ) call after a call to next ( ) ) over each instance many developers and our experience is that many find it of Iterator. easier to use than temporal logics. We also support assumptions specified as Linear Temporal Logic (LTL) and gener- 3.4 From Regular Expressions to Code ate Java models using an approach that is sirni!ar to the one developed for Ada modeling in [22]. Java models of regular expresson assumption specifications can be generated using the templates shown in Fig- 4 Soundness of Synthesized Environments ure 6. These templates use the non-deterministic choice constructs mentioned previously and are defined recur- sively, using code to refer to the code fragment for a given In this section, we justify the soundness of synthesized subexpression. environments with respect to assumption specifications and One can view name scope introduction for a subexpres- the results of side-effects analyses. sion as prefixing a special name binding action to the subexpression. Name scopes are supported by introducing local 4.1 Preliminaries variables in the body of the driver run ( 1 method and assignments that non-deterministically choose an instance to Formally, we model the behavior of a concurrent pro- be bound to the name at the point where the name binding gram written in Java as a labelled transition system. Cor- action is embedded in the regular expression. bett shows how to model the behavior of Java [6] programs We note that much of the generated model code is in- as transition systems as defined below, using standard tech- ternal to the environment. Internal environment states and niques for constructing control flow graphs. actions are hidden in our models by embedding them in A labelled transition system P is a triple atomic statements. This has two consequences: internal en- (S(V),Act,R),w here V is a set of typed program vironment behavior does not contribute to state explosion vuriubles, S(V)i s the set of states representing valuations and internal actions are elided from counter-examples mak- of the variables from V, Act is an alphabet of actions and ing them shorter and easier to read. R C S(V)x Act x S(V)i s a transition relation. We write 6 s s’ for (s,a , s’) E R. For a set of variables W C V, 4.2 Data and Control Effects SIW denotes the valuation of variables from W in state s. Our program model is general enough to capture differ- States of a system can be regarded as tuples giving the ent interactions between the system and the environment: values of all relevant program variables, including the pro- through shared data and control (i.e., communication ac- gram counter. A transition s I& s’ says that the system can evolve from state s to state s’ by executing action a. The lations); the model does not directly capture dynamic alloca- tion of data, so we put a limit to the number of abjects that bels on transitions can represent variable assignments, vari- can flow into the system from the environment. able tests, and actions modelling transfer of control to and Our generated environments are either drivers or stubs. from a procedure; parameter passing can be simulated by Drivers capture the control influences from the environ- communication through common (shared) variables. ment, while the stubs capture the data influences from the We assume that a system is open, i.e. it can interact with environment. its environment through shared variables and actions. If V is the set of variables for an open system, let Vzntd enote 4.2.1 Simulation and Preservation Results the set of internal (local) variables (that only the system itself may modify) and let Vco“ denote the set of com- We proceed to define when a system is a sound abstrac- mon (shared) variables, such that V = VantU Vcoma nd tion of another one. Abstracting means having less de- vznt n vcom = 0. Also if Act is the set of actions of tails while respecting behaviors of the original system. Let the open system, let Acttnt denote the set of internal ac- P = (S(V),Act,R)an d P’ = (S(V’),Act’,R’)b e two tions (a symbol representing an internal action of a system systems. We say that P is a sound abstraction of P’ iff is in the alphabet of only that system) and let ActCond e- there is a simulation from P to P’. note the set of communication (or inte$ace) actions, such A simulation [ 181 from P to P’ is a pair (ps,p a) of re- that Act = Actznt U ActComa nd Act*nt r lA ctCom = 0. lations with ps 2 S(V) x S(V’)an d pa C Act x Act’ A system is closed if it may not interact with the environ- such that if (s,s ’) E ps and s I& t, then there exists some ment; a closed system has no shared variables or actions A state t‘ E S‘ and some action a‘ E Act‘ such that s’ t’, (i.e. Vznt= 0 and Acttn* = 0). (t,t ’) E ps and (a,a ’) E pa. We say that P simulates P’, In order to define the interaction of an open system denoted P 1: P’, if there is a simulation from P to P’. with the envlronment, we first define a parallel composi- When specifying properties of software systems, we use tion of two systems. Let PI = (S(V1),Actl,Rl)a nd universal temporal logics, i.e., we reason about properties P2 = (S(V2)A, ct2, R2) be two open systems. We say that hold along every possible execution path. A standard that PI and P2 are compatible if both their sets of inter- result, see e.g. [21], says that simulations preserve satis- nal variables and sets of internal actions are disjoint (i.e faction of formulas of such logics. I.e., if P 5 P’, then, Vlznt n Vzznt= 0 and fl Actint = 0). for +ev ery universal tempor+al formula 4, P’ 4 implies P 4, However, if P’ does not hold, it does not Let PI and P2 be two compatible systems as above mean that P 4 is necessarily false (i.e., completeness is The composition of PI and P2, denoted PlIIP2, is an- (s(V), sacrificed). other system P = Act, R), where V = VI U Vz, Our synthesized environments are sound abstractions of Act = Act1 UAct2, (s,a , s’) E R iff (u $ Act, A sIVtn=t S’IV,~~) V (U E Act, A (slv,, a, ~’lv,E) &), i = 1,27 rteemal einn vai rsoynnmtheenstisz (esde ee n[2v 1ir,o2n6m]),e natn dis msooudneld c. hEexckteinndgi an gs yosf- The two systems synchronize on the shared actions and the results from [21, 261, we have the following results: (i) asynchronously interleave all other actions. The internal if E 5 EabS,t hen EllP 5 EabsIIP,a nd (ii) if E! 5 Eabs variables of system P, may be modified only by the actions and P 5 Pabs,t hen Ell P 5 Eabs1 1P abs.I n other words, it of system P,, while the common variables may be modified is safe to check universal temporal properties in the pro- by both systems. grams that use the automatically generated environments since these environments are sound abstractions of real en- An environment for system P is another system E that is vironments. compatible with P. Note that after completing the system with a definition of a system representing the environment, the resulting system (i.e. El IP) is still open, admitting arbi- 5 Experience with BEG trary interference from the environment; once we know that all the processeskode modelling the environment have been BEG is implemented using the SOOT framework 1291. included, and no further interaction with the external world BEG uses SOOT’S symbol table, control flow graph, and is expected, we may “close” system EllP by declaring all bytecode representation, Jimple, to perform its analyses; the shared variables and actions to be internal to the system. Jimple is a 3-address SSA-like intermediate form. The tools 7 produce Java code as output that includes calls to the mod- parallel job scheduling framework. Since neither of these eling methods introduced in Section sec:GENERATE. This programs can be model checked efficiently in combination section describes our experience applying BEG to generate with an environment implementation, rather than focus on environments for portions of programs that have appeared measures of time and space required for checking, we de- recently in the literature on Java verification. The actual scribe how the tools supported the user in performing mod- verification was performed using either JavaPathFinder or ular checking. ‘ Bandera. 5.2 Autopilot 5.1 Case Studies in Environment Generation The MD-11 autopilot tutor is a web-based application We have applied BEG to a variety of examples’. A num- that has a graphical user interface (GUI) that simulates the ber of multi-threaded Java programs that have been the sub- Autopilot Mode Control Panel and a Primary Flight Dis- ject of analysis in literature have been re-verified by gen- play of an MD-11 aircraft autopilot. A user may click on erating the previously hand-built environments with BEG; buttons to dial desired altitude and vertical speed, and ad- the resulting checks are in fact slightly more efficient due vance the aircraft towards its goal altitude. Autopilot is im- to the atomicity of environment behavior in generated en- plemented as an applet. The application code consists of vironments. In addition to the Observer/Observable ex- more than 3600 lines of code clustered in two main classes. ample, these examples include: a ProducerdConsurners These measures bely the true complexity of the system as framework for exercising a bounded buffer [15]; RWVSN, there is intensive use of j ava .a wt and ja va . swing Lea’s [ 171 generic readers-writers synchronization frame- GUI frameworks that influences the behavior of the system; work; and dining philosophers with host, a classic synchro- in fact the main thread of control is owned by the frame- nization problem. work and application methods are invoked as application While BEG proved to be quite useful in generating envi- call-backs. ronments for these small systems, the tool support is much The system was checked for mode confusion by encod- more valuable when attempting to reason about proper- ing a model of a pilot’s understanding of the aircraft state. ties of larger software systems. An increasingly important That mental model was integrated with the system to mon- class of object-oriented software systems are frameworks. itor GUI inputs. Assertions were inserted to compare the Frameworks provide for large-scale reuse of functionality state of system data structures with the state of that model; by collecting threads of control, operations and data struc- assertion violations indicated a mismatch between the men- tures that relate to a specific problem domain (e.g., Swing is tal model and the software’s state which implies a potential a Java framework that supports the development of graphic mode confusion. user interfaces (GUI)). Frameworks present rich interfaces To analyze the system BEG was used to generate stubs that allow application specific processing to be co-ordinated for all the GUI framework components and to generate through the framework. Frameworks are quite difficult to drivers that encode regular assumptions about pilot behav- test due to the complexity of their interfaces and the de- ior. We restrict our attention here to the generation of the gree of parameterization that is possible to configure their drivers, for a more complete description see [27]. behavior. Current state-of-the-practice in framework test- The main class of the system is Autopilot which ex- ing relies on the use of groups of use cases to drive test tends j ava.a pplet .Appletw hich in turn extends sev- case generation. BEG enables the synthesize of drivers that eral AWT classes. This applet makes a large number of calls capture multiple framework use cases and mode state ma- to AWT methods in order to create and update the simulated chines. Furthermore, the use of non-determinism in as- cockpit displays. The properties we wished to reason about, sumption specifications allows drivers to span configura- however, were independent of the state of the GUI and we tion settings. This has the great advantage of allowing were chose the Autopilot class itself as the unit. configuration-independent properties to be analyzed with- BEG calculated the data effects of the AWT methods out having to enumerate combinations of configuration set- called from the Autopilot class and generated safe ap- tings. proximation of the data effects on explicitly defined fields To explore BEG’S support for analyzing frameworks, of Autopilot and on fields inherited from AWT classes. we consider two non-trivial Java programs. Autopilot is For this system, we found it useful to name the pilot a swing-based GUI for an MD-11 autopilot simulator used actions to improve the readability of both the assumption for pilot training at NASA [24]; it is a framework client specifications and generated counter-examples. As shown application. ReplicatedWorkers [ 121, is a parameterizable in Figure 7, BEG allows one to define mnemonics for GUI interface actions and to define regular assumptions in terms ’The details of all examples are given at http : //www. cis.k su. edu/bandera. of those mnemonics. Model checking the Autopilot 8 replicated workers framework makes significant use of in- environment( instantiations ( 1 User(new Autopilot 0 ) ; ) terfaces to enable call-backs to the client supplied compu- definitions ( tations that are to be parallelized. An environment for the start=mouseClicked(l); pullA1tKnob=mouseClicked(6); incMCPAlt=mouseClicked(9) ; incMCPVS=mouseClicked(ll); replicated workers, must define and instantiate classes that fly=mouseClicked(14); pilotExp=getExpectationO; implement each of the interfaces given in the framework 1 regular assumptions ( and define appropriate configuration information. start > incMCPAlt^(l,lO) > We checked several properties from [lo] and were able pullAltKnob > (pilotExp fly)-(l,lO) > incMCPVS^(l,lO] > (pilotExp > fly)-5 ; to reproduce those results with one difference. When I I checked a framework instance under the environment defined in Figure 8 for deadlock, we found an actual deadlock. The bug was in the Java implementation of a barrier Figure 7. Autopilot Assumptions synchronization utility. Its discovery was surprising since the framework has been used in implementing more than environment ( ten non-trivial parallel simulation applications and this bug import ca.replicatedworkers.*; instantiations ( was never discovered. We replaced the barrier implemen- 1 ConcreteWorkCollection; 1 ConcreteWorkItem; tation with one from j ava. uti1 . concurrent and the 1 ConcreteResultsCollection; i ConcreteResultItarn; 1 ReplicatedWorkers( deadlock was eliminated. new Configuration(NONE, SYNCH, SOME), TOP, TOP, 2, 1, 1, 0); 1 6 Related Work regular assumptions ( (putWork(T0P) > execute0 ) > destroy0 ; 1 Modular approaches to model checking have been studied for more than a decade. This work has been carried Figure 8. RW Assumptions out mostly at the theoretical level although there have been some implementations of game-theoretic approaches to reasoning about open systems (e.g., [l]). Our focus is on class with the generated environment using JavaPathFinder capturing the complexities of unidenvironment interaction produced the following counter-example: that arise in real programming language and supporting the specification and extraction of precise, yet compact envi- start > incMCPALT-2 > pullAltKnob > fly-2 > incMCPVS > fly ronment models. which indicated a mode confusion anomaly that is possible Environment generation from specifications presented in in the tutor. this paper builds on work of Avrunin et al. [2], who devel- It is interesting to note, that a previous effort to build an oped tool support for analyzing partially implemented real- environment for this application required approximately 6 time systems with some components implemented in Ada months and yielded an environment model that was incon- and others described using graphical interval logic and reg- sistent with the actual environment implementation. From- ular expressions, and our previous work on model checking relatively simple specifications, BEG generated an environ- of partial software systems in Ada [lo, 11, 221. In addition ment in less than 4 minutes that is-guaranteed to be con- to treating Java programs, we support a much richer class of sistent with the implementation, modulo the fidelity of as- environment specifications and extract environment models sumption specifications. from existing code. Another modular approach to checking multithreaded programs is implemented in Calvin [13]. Their approach is 5.2.1 Replicated Workers aimed at procedure checking relying on a user specifications I Replicated Workers (RW) is a configurable frame- of environment assumptions that describe other procedures work designed to support the parallelization of simulations. in the system and constrain interactions among threads. Un- In previous work [lo], we applied largely manual tech- like in our framework, theirs allows for simple invariant niques to model check a collection of properties of an Ada specifications and requires that programs obey a restricted implementation of this framework. Subsequent to that work class of locking disciplines in interacting. the framework was rewritten in Java and has been widely As complementary to our approach, generation of envi- used [12]. ronment assumptions for optimistic environments has been Like most frameworks, replicated workers instances cre- described in [14, 41 Their work is aimed at finding envi- ate threads internally. Clients control the degree and asyn- ronments within which the unit would satisfy its required chrony of parallelism in the configuration by passing pa- properties. This is an important direction to pursue for mod- , rameters to the constructor of the framework instance. The ular program checking, but we also believe that extraction 9 of environments plays an important role when using model and the U.S. Army Research Office under agreement DAAD190110564, checking as a kind of unit testing approach on existing code by DARPAAXO’s PCES program through AFRL Contract F336 15-00-C- 3044, and by Intel Corporation under grant 11462. bases. There is a number of examples of applying static analysis References techniques used in modular analysis or verification. Verisoft incorporates a static analysis to closing of open systems by calculating the influence of externally defined data [5]. Un- [I] R. Alur, L. de Alfaro, R. Grosu, T. A. Henzinger, M. Kang, C. M. Kirsch, R. Majumdar, F. Mang, and B.-Y. Wang. jmocha: A model- like in our approach, they use a simple notion of data depen- checking tool that exploits design structure. In Proceedings of the dence to drive their analysis and do not have the ability to 23rd Annual IEEWACM International Conference on Software En- control the precision of the generated system. Stoller [25] gineering, pages pp. 835-836. IEEE Computer Society Press, 2001. describes an approach that computes a partition of a sys- [2] G. S. Avrunin, J. C. Corbett, and L. Dillon. Analyzing partially- tem’s inputs based on the data-flow analysis of the system. implemented real-time systems. In Proceedings of the 19th Interna- The idea is to use a single representative input value from tional Conference on Software Engineering, pages 228-238, 1997. each partition to exercise all behaviors of the system and [3] T. Ball,. R. Majumdar, T. Millstein, and S. Rajamani. Automatic to avoid exercising the same behavior twice. BEG gener- predicate abstraction of C programs. In Proceedings of the ACM SIGPLAN ‘01 Conference on Programming Language Design and ates environment values based on the user specification or Implementation (PLDI-OI),v olume 36.5 of ACM SIGPLAN Notices, the assumption that the environment data is to be abstracted pages 203-213. ACMPress, June 20-22 2001. before model checking phase. Rountev et al. [23] explore [4] J. M. Cobleigh, D. Giannakopoulou, and C. S. PWreanu. Learning how points-to and side-effects analyses may be used to pro- assumptions for compositional verification. In Tools and Algorithms duce summaries for library modules that later may be used for the Construction and Analysis of Systems, 9th International Con- ference (WCS 2619): Apr. 2003. for separate analysis of client modules. Unlike in our work, [5] C. Colby, P. Godefroid, and L. J. Jagadeesan. Automatically closing their summries are produced using whole program analysis open reactive programs. In SIGPLAN Conference on Programming under the worst-case assumptions about a client and are tar- Language Design and Implementation, pages 345-357, 1998. geted at the optimizations of the client. Our analyses are [6] J. C. Corbett. Constructing compact models of concurrent Java pro- modular and explore the information about the unit, if there grams. In M. Young, editor, Proceedings ofthe 1998 Internarional are call backs from the environment. Symposium on Software Testing and Analysis (ISSTA). ACM Press, March 1998. 7 Conclusions [7] J. C. Corbett, M. B. Dwyer, J. Hatcliff, S. Laubach, C. S. PWireanu, Robby, and H. Zheng. Bandera : Extracting finite-state models from Java source code. In Proceedings of the 22nd International Confer- Despite the significant computational complexity of ence on Software Engineering, June 2000. model checking, it has proven effective as an analysis tech- [8] J. C. Corbett, M. B. Dwyer, J. Hatcliff, and Robby. Expressing checkable propelties of dynamic systems: The bandera specification nique that is capable of finding errors in real concurrent language. Inrernarional Journal on Software Tools for Technology Java programs (e.g., the Replicated Workers framework). Transfer, 4(1):34-56,2002. Modular approaches promise to further scale the application [9] M. B. Dwyer, J. Hatcliff, R. Joehanes, S. Laubach, C. S. PssHreanu, of model checking to software. The Bandera Environment Robby, W. Visser, and H. Zheng. Tool-supported program abstraction Generator (BEG) provides automated tool support that has for finite-state verifica,tion. In Proceedings of the 23rd International proven effective in enabling useful €orms of modular analy- Conference on Software Engineering, May 2001. sis. [lo] M. B. Dwyer and C. S. PSsiIreanu. Filter-based model checking of We are continuing development of the foundations for partial systems. In Proceedings of the Sixth ACM SIGSOFT Sympo- sium on Foundations of Software Engineering, Nov. 1998. BEG as well as the tool support. Specifically, we are work- [ll] M. B. Dwyer and C. S. PMreanu. Model checking generic container ing on analysis of program lock acquisition to safely ap- implementations. In Proceedings of the 1st Symposium on Generic proximate the synchronization interaction between the envi- Programming, May 1998. ronment and unit. In addition, we are adapting thread mod- [I21 M. B. Dwyer and V. Wallentine. A framework for parallel adap- ular approaches [ 131 to enable model checking for arbitrary tive grid simulations. Concurrency - Pracrice and Experience, numbers of environment threads. BEG is being released as 9(11):1293-1310,1997. part of the Bandera toolset at http : //www. cis . ksu . [I31 C. Flanagan and S. Qadeer. Thread modular model checking. In edu/bander a. Model Checking Software (WCS 2648), May 2003. [14] D. Giannakopoulou, C. S. Pasareanu, and H. Barringer. Assumption generation for software component verification. In Proceedings of 8 Acknowledgments the 17th IEEE Conference on Automated Software Engineering, May 2002. The authors would like to thank the members of the Bandera and JPF [15] K. Havelund and T. Pressburger. Model checking Java programs us- projects for many helpful discussions and comments related to this work. ing Java PathFinder. International Journal on Software Tools for This work was supported in part by the U.S. Army Research Laboratory Technology Transfer, 1999. 10

NASA Technical Reports Server (NTRS) 20030107450: Automated Environment Generation for Software Model Checking PDF

11 Pages·1.1 MB·English

by NASA Technical Reports Server (NTRS)

#additional_collections #NASA_NTRS_Archive

Checking for file health...

Save to my drive

Quick download

Download

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview NASA Technical Reports Server (NTRS) 20030107450: Automated Environment Generation for Software Model Checking

See more

The list of books you might like

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.