Searching for Test Data

Kamran Ghani

Submitted for the degree of Doctor of Philosophy

The University of York
Department of Computer Science

2009

Abstract

In recent years metaheuristic search techniques have been applied successfully to solve many software engineering problems. One area in particular where these techniques have gained much attention is search based test data generation. Many techniques have been applied to generate test data for structural, functional as well as non-functional testing across different levels of abstraction of a software system. In this thesis we extend current search based approaches to cover stronger criteria, extend current work on higher level models to include further optimisation techniques, enhance current approaches to target “difficult” branches, and show how to apply current approaches to refine specifications generated automatically by tools such as Daikon.

Contents

1 Introduction
  1.1 Introduction
  1.2 Hypothesis
  1.3 Contributions
  1.4 Outline of Thesis

2 Literature Review
  2.1 Software Testing
  2.2 Adequacy Criteria
  2.3 Structure Based Coverage Criteria
    2.3.1 Control Flow Based Testing
    2.3.2 Data Flow Based Criteria
  2.4 Specification Based Coverage Criteria
    2.4.1 Formal Specification Based Approaches
    2.4.2 State Based Approaches
    2.4.3 UML Based Coverage Criteria
  2.5 Fault Based Adequacy Criteria
    2.5.1 Error Seeding
    2.5.2 Mutation Testing
    2.5.3 Conclusion
  2.6 Search Based Optimisation Techniques
    2.6.1 Hill-Climbing
    2.6.2 Simulated Annealing
    2.6.3 Tabu Search
    2.6.4 Population Based Algorithms
    2.6.5 Artificial Immune Systems (AIS)
    2.6.6 Ant Colony Optimisation
  2.7 Automated Test Data Generation
    2.7.1 Search Based Test Data Generation
    2.7.2 Limitations of Search Based Test Data Generation
  2.8 Conclusion

3 Structural Testing: Searching for Fine Grained Test Data
  3.1 Introduction
  3.2 Framework Implementation for Refined Test Data Generation
    3.2.1 Instrumentor
    3.2.2 Cost Function
    3.2.3 Optimisation Rig
    3.2.4 Required Test Case Sequence Generation
  3.3 Tuning Parameters
    3.3.1 Objective (Fitness) Function
    3.3.2 Search Space
    3.3.3 Neighbourhood
    3.3.4 Temperature
    3.3.5 Cooling Schedule and Rate
    3.3.6 Experimentation Results for Parameters Setting
  3.4 Experimentation
    3.4.1 Test Objects
    3.4.2 Experimental Setup
    3.4.3 Analysis
  3.5 Conclusion

4 Application of Genetic Algorithms to Simulink Models
  4.1 Introduction
  4.2 Background
    4.2.1 Simulink
    4.2.2 Search Based Test Data Generation for Simulink Models
    4.2.3 Existing Work
    4.2.4 GAs for Test Data Generation of Simulink Models
    4.2.5 Fitness Function
  4.3 Experimentation
    4.3.1 Experimental Objects
    4.3.2 Experimental Setup
    4.3.3 Analysis
  4.4 Conclusion

5 Program Stretching
  5.1 Introduction
  5.2 Program Transformation
    5.2.1 Model Transformation
  5.3 Testability Transformation
    5.3.1 Survey
  5.4 Program Stretching
    5.4.1 Cost Function
    5.4.2 Transformation Rules
  5.5 Experiments and Evaluation
    5.5.1 Experiment Set 1: Code Examples
    5.5.2 Analysis and Evaluation of Experiment Set 1
    5.5.3 Experiment Set 2: Simulink Models
    5.5.4 Analysis and Evaluation of Experiment Set 2
  5.6 Conclusions
  5.7 Application Outside SBSE

6 Strengthening Inferred Specifications using Search Based Testing
  6.1 Introduction
  6.2 Background
    6.2.1 Dynamic Invariant Generation
    6.2.2 SBTDG for Invariant Falsification
  6.3 Related Work
  6.4 Model for Invariant Falsification
    6.4.1 High Level Model
    6.4.2 Invariant Inference
    6.4.3 Intermediate Invariants' Class
    6.4.4 Test Data Generation Framework
  6.5 Experiments
    6.5.1 Middle Program
    6.5.2 WrapRoundCounter
    6.5.3 BubbleSort
    6.5.4 CalDate
    6.5.5 Analysis
  6.6 Conclusion

7 Evaluation, Future Work and Conclusion
  7.1 Evaluation
    7.1.1 Structural Testing: Searching for Fine Grained Test Data
    7.1.2 Application of Genetic Algorithms to Simulink Models
    7.1.3 Program Stretching
    7.1.4 Strengthening Inferred Specifications using Search Based Testing
  7.2 Future Work
    7.2.1 Extension to the Proposed Framework
    7.2.2 Application of Search Based Techniques to Simulink Models
    7.2.3 Program Stretching
    7.2.4 Likely Invariants' Falsification
  7.3 Conclusion

A Triangle Program with Instrumented Version
  A.1 Instrumented Version of Triangle Program

B Programs used in this thesis work
  B.0.1 CalDate
  B.0.2 Quadratic
  B.0.3 Expint
  B.0.4 Complex or Program3 for program stretching, code based
  B.0.5 Program 1
  B.0.6 Program 2
  B.0.7 Program 3
  B.1 Models for Test data generation using GAs for Simulink Models

Acknowledgment

I would like to thank my supervisor Professor John A. Clark. It would not have been possible for me to finish this work without his kind support, advice and encouragement throughout this period. I would also like to thank Professor Richard Paige for his insightful comments. Many thanks to Yuan Zhan for her kind support and collaboration on part of this work.

I would also like to thank all my friends and colleagues who always encouraged me to carry on this work. Many thanks to my family, who always stand by my side; to my sisters, brothers and aunts, who always prayed for my success; and especially to my elder brother Usman Ghani, who brought me up like a father. And thanks to my wife, who is going through a hard time without me.

Declaration

The work presented in this thesis was carried out at the University of York from October 2005 to September 2009. Much of the work appeared in print as follows:
1. Kamran Ghani and John A. Clark, Strengthening Inferred Specification using Search Based Testing, in proceedings of the 1st International Workshop on Search-Based Software Testing (SBST) in conjunction with ICST 2008, pp. 187-194, Lillehammer, Norway.

2. Kamran Ghani and John A. Clark, Widening the Goal Posts: Program Stretching to Aid Search Based Software Testing, in proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE '09).

3. Kamran Ghani, John A. Clark and Yuan Zhan, Comparing Algorithms for Search-Based Test Data Generation of Matlab Simulink Models, in proceedings of the 10th IEEE Congress on Evolutionary Computation (CEC '09), Trondheim, Norway.

4. Kamran Ghani and John A. Clark, Automatic Test Data Generation for Multiple Condition and MCDC Coverage, in proceedings of the 4th International Conference on Software Engineering Advances (ICSEA '09), Porto, Portugal.

For all publications, the primary author is Kamran Ghani, with John A. Clark providing advice as supervisor. Tool support and advice were also given by Yuan Zhan for paper no. 3 above. In all cases the actual experiments were designed and carried out by Ghani.

Chapter 1

Introduction

1.1 Introduction

Dynamic testing, “the dynamic verification of the behaviour of a program on a finite set of test cases, suitably selected from the usually infinite executions domain, against the expected behaviour” [90], is used to gain confidence in almost all developed software. Various static approaches, such as reviews, walkthroughs and inspections, can be used to gain further confidence, but it is generally felt [22] that only dynamic testing can provide confidence in the correct functioning of the software in its intended environment.

We cannot perform exhaustive testing because the domain of program inputs is usually too large and there are too many possible execution paths. Therefore, the software is tested using a suitably selected set of test cases. A variety of coverage criteria have been proposed to assess how effective test sets are likely to be. Historically, criteria exercising aspects of control flow, such as statement and branch coverage [136], have been the most common. Further criteria, such as data flow [154], or more sophisticated criteria such as MC/DC coverage [162], have been adopted for specific application domains. Many of these criteria are motivated by general principles (e.g. you cannot have much confidence in the correctness of a statement without exercising it); others target specific, commonly occurring fault types (e.g. boundary value coverage).

Finding a set of test data to achieve identified coverage criteria is typically a labour-intensive task when carried out manually.
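To make the relative strength of such criteria concrete, the following minimal C sketch (not taken from the thesis; the eligible function and its thresholds are invented purely for illustration) contrasts branch coverage with MC/DC on a single two-condition decision. Two tests suffice to cover both branches, whereas MC/DC additionally requires that each condition be shown to independently affect the decision's outcome.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative decision with two conditions, A && B.
 * Branch coverage: the decision must evaluate both true and false,
 * achievable with two tests, e.g. (A=T,B=T) and (A=F,B=F).
 * MC/DC: each condition must be shown to independently affect the
 * outcome, which for A && B needs the three tests {TT, FT, TF}. */
static bool eligible(int age, int income) {
    if (age >= 18 && income > 20000) {   /* condition A: age >= 18, condition B: income > 20000 */
        return true;
    }
    return false;
}

int main(void) {
    /* Branch coverage: both outcomes of the decision are exercised. */
    printf("%d\n", eligible(30, 30000));  /* decision true  */
    printf("%d\n", eligible(10, 10000));  /* decision false */

    /* MC/DC: {TT, FT, TF} isolates the effect of each condition. */
    printf("%d\n", eligible(30, 30000));  /* A=T, B=T -> true                     */
    printf("%d\n", eligible(10, 30000));  /* A=F, B=T -> false: A acts independently */
    printf("%d\n", eligible(30, 10000));  /* A=T, B=F -> false: B acts independently */
    return 0;
}

In general, a decision with n conditions needs at least n+1 test cases for MC/DC, which is why it is regarded as a stronger, and more expensive, criterion than branch coverage.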