Software Unit Test Coverage and Adequacy - UMass PDF

62 Pages·1997·0.46 MB·English

Software Unit Test Coverage and Adequacy

HONG ZHU
Nanjing University

PATRICK A. V. HALL AND JOHN H. R. MAY
The Open University, Milton Keynes, UK

Objective measurement of test quality is one of the key issues in software testing. It has been a major research focus for the last two decades. Many test criteria have been proposed and studied for this purpose. Various kinds of rationales have been presented in support of one criterion or another. We survey the research work in this area. The notion of adequacy criteria is examined together with its role in software dynamic testing. A review of criteria classification is followed by a summary of the methods for comparison and assessment of criteria.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging

General Terms: Measurement, Performance, Reliability, Verification

Additional Key Words and Phrases: Comparing testing effectiveness, fault-detection, software unit test, test adequacy criteria, test coverage, testing methods

Authors’ addresses: H. Zhu, Institute of Computer Software, Nanjing University, Nanjing, 210093, P.R. of China; email: <[email protected]>; P. A. V. Hall and J. H. R. May, Department of Computing, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK.

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.
© 1997 ACM 0360-0300/97/1200–0366 $03.50

ACM Computing Surveys, Vol. 29, No. 4, December 1997

1. INTRODUCTION

In 1972, Dijkstra claimed that “program testing can be used to show the presence of bugs, but never their absence” to persuade us that a testing approach is not acceptable [Dijkstra 1972]. However, the last two decades have seen rapid growth of research in software testing as well as intensive practice and experiments. It has been developed into a validation and verification technique indispensable to software engineering discipline. Then, where are we today? What can we claim about software testing?

In the mid-’70s, in an examination of the capability of testing for demonstrating the absence of errors in a program, Goodenough and Gerhart [1975, 1977] made an early breakthrough in research on software testing by pointing out that the central question of software testing is “what is a test criterion?”, that is, the criterion that defines what constitutes an adequate test. Since then, test criteria have been a major research focus. A great number of such criteria have been proposed and investigated. Considerable research effort has attempted to provide support for the use of one criterion or another. How should we understand these different criteria? What are the future directions for the subject?

In contrast to the constant attention given to test adequacy criteria by academics, the software industry has been slow to accept test adequacy measurement. Few software development standards require or even recommend the use of test adequacy criteria [Wichmann 1993; Wichmann and Cox 1992]. Are test adequacy criteria worth the cost for practical use?

Addressing these questions, we survey research on software test criteria in the past two decades and attempt to put it into a uniform framework.

1.1 The Notion of Test Adequacy

Let us start with some examples. Here we seek to illustrate the basic notions underlying adequacy criteria. Precise definitions will be given later.

- Statement coverage. In software testing practice, testers are often required to generate test cases to execute every statement in the program at least once. A test case is an input on which the program under test is executed during testing. A test set is a set of test cases for testing a program. The requirement of executing all the statements in the program under test is an adequacy criterion. A test set that satisfies this requirement is considered to be adequate according to the statement coverage criterion. Sometimes the percentage of executed statements is calculated to indicate how adequately the testing has been performed. The percentage of the statements exercised by testing is a measurement of the adequacy.

- Branch coverage. Similarly, the branch coverage criterion requires that all control transfers in the program under test are exercised during testing. The percentage of the control transfers executed during testing is a measurement of test adequacy.

- Path coverage. The path coverage criterion requires that all the execution paths from the program’s entry to its exit are executed during testing.

- Mutation adequacy. Software testing is often aimed at detecting faults in software. A way to measure how well this objective has been achieved is to plant some artificial faults into the program and check if they are detected by the test. A program with a planted fault is called a mutant of the original program. If a mutant and the original program produce different outputs on at least one test case, the fault is detected. In this case, we say that the mutant is dead or killed by the test set. Otherwise, the mutant is still alive. The percentage of dead mutants compared to the mutants that are not equivalent to the original program is an adequacy measurement, called the mutation score or mutation adequacy [Budd et al. 1978; DeMillo et al. 1978; Hamlet 1977].

From Goodenough and Gerhart’s [1975, 1977] point of view, a software test adequacy criterion is a predicate that defines “what properties of a program must be exercised to constitute a ‘thorough’ test, i.e., one whose successful execution implies no errors in a tested program.” To guarantee the correctness of adequately tested programs, they proposed reliability and validity requirements of test criteria. Reliability requires that a test criterion always produce consistent test results; that is, if the program tested successfully on one test set that satisfies the criterion, then the program also tested successfully on all test sets that satisfies the criterion. Validity requires that the test always produce a meaningful result; that is, for every error in a program, there exists a test set that satisfies the criterion and is capable of revealing the error. But it was soon recognized that there is no computable criterion that satisfies the two requirements, and hence they are not practically applicable [Howden 1976]. Moreover, these two requirements are not independent since a criterion is either reliable or valid for any given software [Weyuker and Ostrand 1980]. Since then, the focus of research seems to have shifted from seeking theoretically ideal criteria to the search for practically applicable approximations.

Currently, the software testing literature contains two different, but closely related, notions associated with the term test data adequacy criteria. First, an adequacy criterion is considered to be a stopping rule that determines whether sufficient testing has been done that it can be stopped. For instance, when using the statement coverage criterion, we can stop testing if all the statements of the program have been executed. Generally speaking, since software testing involves the program under test, the set of test cases, and the specification of the software, an adequacy criterion can be formalized as a function C that takes a program p, a specification s, and a test set t and gives a truth value true or false. Formally, let P be a set of programs, S be a set of specifications, D be the set of inputs of the programs in P, T be the class of test sets, that is, $T = 2^D$, where $2^X$ denotes the set of subsets of X.

Definition 1.1 (Test Data Adequacy Criteria as Stopping Rules). A test data adequacy criterion C is a function $C: P \times S \times T \to \{\mathit{true}, \mathit{false}\}$. $C(p, s, t) = \mathit{true}$ means that t is adequate for testing program p against specification s according to the criterion C, otherwise t is inadequate.

Second, test data adequacy criteria provide measurements of test quality when a degree of adequacy is associated with each test set so that it is not simply classified as good or bad. In practice, the percentage of code coverage is often used as an adequacy measurement. Thus, an adequacy criterion C can be formally defined to be a function C from a program p, a specification s, and a test set t to a real number $r = C(p, s, t)$, the degree of adequacy [Zhu and Hall 1992]. Formally:

Definition 1.2 (Test Data Adequacy Criteria as Measurements). A test data adequacy criterion is a function C, $C: P \times S \times T \to [0,1]$. $C(p, s, t) = r$ means that the adequacy of testing the program p by the test set t with respect to the specification s is of degree r according to the criterion C. The greater the real number r, the more adequate the testing.

These two notions of test data adequacy criteria are closely related to one another. A stopping rule is a special case of measurement on the continuum since the actual range of measurement results is the set {0,1}, where 0 means false and 1 means true. On the other hand, given an adequacy measurement M and a degree r of adequacy, one can always construct a stopping rule $M_r$ such that a test set is adequate if and only if the adequacy degree is greater than or equal to r; that is, $M_r(p, s, t) = \mathit{true} \Leftrightarrow M(p, s, t) \geq r$. Since a stopping rule asserts a test set to be either adequate or inadequate, it is also called a predicate rule in the literature.

An adequacy criterion is an essential part of any testing method. It plays two fundamental roles. First, an adequacy criterion specifies a particular software testing requirement, and hence determines test cases to satisfy the requirement. It can be defined in one of the following forms.

(1) It can be an explicit specification for test case selection, such as a set of guidelines for the selection of test cases. Following such rules one can produce a set of test cases, although there may be some form of random selections. Such a rule is usually referred to as a test case selection criterion. Using a test case selection criterion, a testing method may be defined constructively in the form of an algorithm which generates a test set from the software under test and its own specification. This test set is then considered adequate. It should be noticed that for a given test case selection criterion, there may exist a number of test case generation algorithms. Such an algorithm may also involve random sampling among many adequate test sets.

(2) It can also be in the form of specifying how to decide whether a given test set is adequate or specifying how to measure the adequacy of a test set. A rule that determines whether a test set is adequate (or more generally, how adequate) is usually referred to as a test data adequacy criterion.

However, the fundamental concept underlying both test case selection criteria and test data adequacy criteria is the same, that is, the notion of test adequacy. In many cases they can be easily transformed from one form to another. Mathematically speaking, test case selection criteria are generators, that is, functions that produce a class of test sets from the program under test and the specification (see Definition 1.3). Any test set in this class is adequate, so that we can use any of them equally.¹ Test data adequacy criteria are acceptors that are functions from the program under test, the specification of the software and the test set to a characteristic number as defined in Definition 1.1. Generators and acceptors are mathematically equivalent in the sense of one-one correspondence. Hence, we use “test adequacy criteria” to denote both of them.

¹ Test data selection criteria as generators should not be confused with test case generation software tools, which may only generate one test set.

Definition 1.3 (Test Data Adequacy Criteria as Generators [Budd and Angluin 1982]). A test data adequacy criterion C is a function $C: P \times S \to 2^T$. A test set $t \in C(p, s)$ means that t satisfies C with respect to p and s, and it is said that t is adequate for (p, s) according to C.

The second role that an adequacy criterion plays is to determine the observations that should be made during the testing process. For example, statement coverage requires that the tester, or the testing system, observe whether each statement is executed during the process of software testing. If path coverage is used, then the observation of whether statements have been executed is insufficient; execution paths should be observed and recorded. However, if mutation score is used, it is unnecessary to observe whether a statement is executed during testing. Instead, the output of the original program and the output of the mutants need to be recorded and compared.

Although, given an adequacy criterion, different methods could be developed to generate test sets automatically or to select test cases systematically and efficiently, the main features of a testing method are largely determined by the adequacy criterion. For example, as we show later, the adequacy criterion is related to fault-detecting ability, the dependability of the program that passes a successful test and the number of test cases required. Unfortunately, the exact relationship between a particular adequacy criterion and the correctness or reliability of the software that passes the test remains unclear.

Due to the central role that adequacy criteria play in software testing, software testing methods are often compared in terms of the underlying adequacy criteria. Therefore, subsequently, we use the name of an adequacy criterion as a synonym of the corresponding testing method when there is no possibility of confusion.

1.2 The Uses of Test Adequacy Criteria

An important issue in the management of software testing is to “ensure that before any testing the objectives of that testing are known and agreed and that the objectives are set in terms that can be measured.” Such objectives “should be quantified, reasonable and achievable” [Ould and Unwin 1986]. Almost all test adequacy criteria proposed in the literature explicitly specify particular requirements on software testing. They are objective rules applicable by project managers for this purpose.

For example, branch coverage is a test requirement that all branches of the program should be exercised. The objective of testing is to satisfy this requirement. The degree to which this objective is achieved can be measured quantitatively by the percentage of branches exercised. The mutation adequacy criterion specifies the testing requirement that a test set should be able to rule out a particular set of software faults, that is, those represented by mutants. Mutation score is another kind of quantitative measurement of test quality.

Test data adequacy criteria are also very helpful tools for software testers. There are two levels of software testing processes. At the lower level, testing is a process where a program is tested by feeding more and more test cases to it. Here, a test adequacy criterion can be used as a stopping rule to decide when this process can stop. Once the measurement of test adequacy indicates that the test objectives have been achieved, then no further test case is needed. Otherwise, when the measurement of test adequacy shows that a test has not achieved the objectives, more tests must be made. In this case, the adequacy criterion also provides a guideline for the selection of the additional test cases. In this way, adequacy criteria help testers to manage the software testing process so that software quality is ensured by performing sufficient tests. At the same time, the cost of testing is controlled by avoiding redundant and unnecessary tests. This role of adequacy criteria has been considered by some computer scientists [Weyuker 1986] to be one of the most important.

At a higher level, the testing procedure can be considered as repeated cycles of testing, debugging, modifying program code, and then testing again. Ideally, this process should stop only when the software has met the required reliability requirements. Although test data adequacy criteria do not play the role of stopping rules at this level, they make an important contribution to the assessment of software dependability. Generally speaking, there are two basic aspects of software dependability assessment. One is the dependability estimation itself, such as a reliability figure. The other is the confidence in estimation, such as the confidence or the accuracy of the reliability estimate. The role of test adequacy here is a contributory factor in building confidence in the integrity estimate. Recent research has shown some positive results with respect to this role [Tsoukalas 1993].

Although it is common in current software testing practice that the test processes at both the higher and lower levels stop when money or time runs out, there is a tendency towards the use of systematic testing methods with the application of test adequacy criteria.

1.3 Categories of Test Data Adequacy Criteria

There are various ways to classify adequacy criteria. One of the most common is by the source of information used to specify testing requirements and in the measurement of test adequacy. Hence, an adequacy criterion can be:

- specification-based, which specifies the required testing in terms of identified features of the specification or the requirements of the software, so that a test set is adequate if all the identified features have been fully exercised. In software testing literature it is fairly common that no distinction is made between specification and requirements. This tradition is followed in this article also;

- program-based, which specifies testing requirements in terms of the program under test and decides if a test set is adequate according to whether the program has been thoroughly exercised.

It should not be forgotten that for both specification-based and program-based testing, the correctness of program outputs must be checked against the specification or the requirements. However, in both cases, the measurement of test adequacy does not depend on the results of this checking. Also, the definition of specification-based criteria given previously does not presume the existence of a formal specification.

It has been widely acknowledged that software testing should use information from both specification and program. Combining these two approaches, we have:

- combined specification- and program-based criteria, which use the ideas of both program-based and specification-based criteria.

There are also test adequacy criteria that specify testing requirements without employing any internal information from the specification or the program. For example, test adequacy can be measured according to the prospective usage of the software by considering whether the test cases cover the data that are most likely to be frequently used as input in the operation of the software. Although few criteria are explicitly proposed in such a way, selecting test cases according to the usage of the software is the idea underlying random testing, or statistical testing. In random testing, test cases are sampled at random according to a probability distribution over the input space. Such a distribution can be the one representing the operation of the software, and the random testing is called representative. It can also be any probability distribution, such as a uniform distribution, and the random testing is called nonrepresentative. Generally speaking, if a criterion employs only the “interface” information (the type and valid range for the software input), it can be called an interface-based criterion:

- interface-based criteria, which specify testing requirements only in terms of the type and range of software input without reference to any internal features of the specification or the program.

In the software testing literature, people often talk about white-box testing and black-box testing. Black-box testing treats the program under test as a “black box.” No knowledge about the implementation is assumed. In white-box testing, the tester has access to the details of the program under test and performs the testing according to such details. Therefore, specification-based criteria and interface-based criteria belong to black-box testing. Program-based criteria and combined specification and program-based criteria belong to white-box testing.

Another classification of test adequacy criteria is by the underlying testing approach. There are three basic approaches to software testing:

(1) structural testing: specifies testing requirements in terms of the coverage of a particular set of elements in the structure of the program or the specification;

(2) fault-based testing: focuses on detecting faults (i.e., defects) in the software. An adequacy criterion of this approach is some measurement of the fault detecting ability of test sets.²

(3) error-based testing: requires test cases to check the program on certain error-prone points according to our knowledge about how programs typically depart from their specifications.

² We use the word fault to denote defects in software and the word error to denote defects in the outputs produced by a program. An execution that produces an error is called a failure.

The source of information used in the adequacy measurement and the underlying approach to testing can be considered as two dimensions of the space of software test adequacy criteria. A software test adequacy criterion can be classified by these two aspects. The review of adequacy criteria is organized according to the structure of this space.

1.4 Organization of the Article

The remainder of the article consists of two main parts. The first part surveys various types of test data adequacy criteria proposed in the literature. It includes three sections devoted to structural testing, fault-based testing, and error-based testing. Each section consists of several subsections covering the principles of the testing method and their application to program-based and specification-based test criteria. The second part is devoted to the rationale presented in the literature in support of the various criteria. It has two sections. Section 5 discusses the methods of comparing adequacy criteria and surveys the research results in the literature. Section 6 discusses the axiomatic study and assessment of adequacy criteria. Finally, Section 7 concludes the paper.

2. STRUCTURAL TESTING

This section is devoted to adequacy criteria for structural testing. It consists of two subsections, one for program-based criteria and the other for specification-based criteria.

2.1 Program-Based Structural Testing

There are two main groups of program-based structural test adequacy criteria: control-flow criteria and data-flow criteria. These two types of adequacy criteria are combined and extended to give dependence coverage criteria. Most adequacy criteria of these two groups are based on the flow-graph model of program structure. However, a few control-flow criteria define test requirements in terms of program text rather than using an abstract model of software structure.

2.1.1 Control Flow Adequacy Criteria. Before we formally define various control-flow-based adequacy criteria, we first give an introduction to the flow graph model of program structure.

A. The flow graph model of program structure. The control flow graph stems from compiler work and has long been used as a model of program structure. It is widely used in static analysis of software [Fenton et al. 1985; Kosaraju 1974; McCabe 1976; Paige 1975]. It has also been used to define and study program-based structural test adequacy criteria [White 1981]. In this section we give a brief introduction to the flow-graph model of program structure. Although we use graph-theory terminology in the following discussion, readers are required to have only a preliminary knowledge of graph theory. To help understand the terminology and to avoid confusion, a glossary is provided in the Appendix.

A flow graph is a directed graph that consists of a set N of nodes and a set $E \subseteq N \times N$ of directed edges between nodes. Each node represents a linear sequence of computations. Each edge representing transfer of control is an ordered pair $\langle n_1, n_2 \rangle$ of nodes, and is associated with a predicate that represents the condition of control transfer from node $n_1$ to node $n_2$. In a flow graph, there is a begin node and an end node where the computation starts and finishes, respectively. The begin node has no inward edges and the end node has no outward edges. Every node in a flow graph must be on a path from the begin node to the end node. Figure 1 is an example of flow graph.

Example 2.1 The following program computes the greatest common divisor of two natural numbers by Euclid’s algorithm. Figure 1 is the corresponding flow graph.

    begin
      input (x, y);
      while (x > 0 and y > 0) do
        if (x > y) then x := x - y
        else y := y - x
        endif
      endwhile;
      output (x + y);
    end

Figure 1. Flow graph for program in Example 2.1.

It should be noted that in the literature there are a number of conventions of flow-graph models with subtle differences, such as whether a node is allowed to be associated with an empty sequence of statements, the number of outward edges allowed for a node, and the number of end nodes allowed in a flow graph, and the like. Although most adequacy criteria can be defined independently of such conventions, using different ones may result in different measures of test adequacy. Moreover, testing tools may be sensitive to such conventions. In this article no restrictions on the conventions are made.

For programs written in a procedural programming language, flow-graph models can be generated automatically. Figure 2 gives the correspondences between some structured statements and their flow-graph structures. Using these rules, a flow graph, shown in Figure 3, can be derived from the program given in Example 2.1. Generally, to construct a flow graph for a given program, the program code is decomposed into a set of disjoint blocks of linear sequences of statements. A block has the property that whenever the first statement of the block is executed, the other statements are executed in the given order. Furthermore, the first statement of the block is the only statement that may be executed directly after the execution of a statement in another block. Each block corresponds to a node in the flow graph. A control transfer from one block to another is represented by a directed edge between the nodes such that the condition of the control transfer is associated with it.

Figure 2. Example flow graphs for structured statements.

Figure 3. Flow graph for Example 2.1.

B. Control-flow adequacy criteria. Now, given a flow-graph model of a program and a set of test cases, how do we measure the adequacy of testing for the program on the test set? First of all, recall that the execution of the program on an input datum is modeled as a traverse in the flow graph. Every execution corresponds to a path in the flow graph from the begin node to the end node. Such a path is called a complete computation path, or simply a computation path or an execution path in software testing literature.

A very basic requirement of adequate testing is that all the statements in the program are covered by test executions. This is usually called statement coverage [Hetzel 1984]. But full statement coverage cannot always be achieved because of the possible existence of infeasible statements, that is, dead code. Whether a piece of code is dead code is undecidable [Weyuker 1979a; Weyuker 1979b; White 1981]. Because statements correspond to nodes in flow-graph models, this criterion can be defined in terms of flow graphs, as follows.

Definition 2.1 (Statement Coverage Criterion). A set P of execution paths satisfies the statement coverage criterion if and only if for all nodes n in the flow graph, there is at least one path p in P such that node n is on the path p.

Notice that statement coverage is so weak that even some control transfers may be missed from an adequate test. Hence, we have a slightly stronger requirement of adequate test, called branch coverage [Hetzel 1984], that all control transfers must be checked. Since control transfers correspond to edges in flow graphs, the branch coverage criterion can be defined as the coverage of all edges in the flow graph.

Definition 2.2 (Branch Coverage Criterion). A set P of execution paths satisfies the branch coverage criterion if and only if for all edges e in the flow graph, there is at least one path p in P such that p contains the edge e.

Branch coverage is stronger than statement coverage because if all edges in a flow graph are covered, all nodes are necessarily covered. Therefore, a test set that satisfies the branch coverage criterion must also satisfy statement coverage. Such a relationship between adequacy criteria is called the subsumes relation. It is of interest in the comparison of software test adequacy criteria (see details in Section 5.1.3).

However, even if all branches are exercised, this does not mean that all combinations of control transfers are checked. The requirement of checking all combinations of branches is usually called path coverage or path testing, which can be defined as follows.

Definition 2.3 (Path Coverage Criterion). A set P of execution paths satisfies the path coverage criterion if and only if P contains all execution paths from the begin node to the end node in the flow graph.

Although the path coverage criterion still cannot guarantee the correctness of a tested program, it is too strong to be practically useful for most programs, because there can be an infinite number of different paths in a program with loops. In such a case, an infinite set of test data must be executed for adequate testing. This means that the testing cannot finish in a finite period of time. But, in practice, software testing must be fulfilled within a limited fixed period of time. Therefore, a test set must be finite. The requirement that an adequacy criterion can always be satisfied by a finite test set is called finite applicability [Zhu and Hall 1993] (see Section 6).

The statement coverage criterion and branch coverage criterion are not finitely applicable either, because they require testing to cover infeasible elements. For instance, statement coverage requires that all the statements in a program are executed. However, a program may have infeasible statements, that is, dead code, so that no input data can cause their execution. Therefore, in such cases, there is no adequate test set that can satisfy statement coverage. Similarly, branch coverage is not finitely applicable because a program may contain infeasible branches. However, for statement coverage and also branch coverage, we can define a finitely applicable version of the criterion by requiring testing only to cover the feasible elements. Most program-based adequacy criteria in the literature are not finitely applicable, but finitely applicable versions can often be obtained by redefinition in this way. Subsequently, such a version is called the feasible version of the adequacy criterion. It should be noted, first, that although we can often obtain finite applicability by using the feasible version, this may cause the undecidability problem; that is, we may not be able to decide whether a test set satisfies a given adequacy criterion. For example, whether a statement in a program is feasible is undecidable [Weyuker 1979a; Weyuker 1979b; White 1991]. Therefore, when a test set does not cover all the statements in a program, we may not be able to decide whether a statement not covered by the test data is dead code. Hence, we may not be able to decide if the test set satisfies the feasible version of statement coverage. Second, for some adequacy criteria, such as path coverage, we cannot obtain finite applicability by such a redefinition.

Recall that the rationale for path coverage is that there is no path that does not need to be checked by testing, while finite applicability forces us to select a finite subset of paths. Thus, research into flow-graph-based adequacy criteria has focused on the selection of the most important subsets of paths. Probably the most straightforward solution to the conflict is to select paths that contain no redundant information. Hence, two notions from graph theory can be used. First, a path that has no repeated occurrence of any edge is called a simple path in graph theory. Second, a path that has no repeated occurrences of any node is called an elementary path. Thus, it is possible to define simple path coverage and elementary path coverage criteria, which require that adequate test sets should cover all simple paths and elementary paths, respectively.

These two criteria are typical ones that select finite subsets of paths by specifying restrictions on the complexity of the individual paths.
