Quality Measures and Assurance for AI Software*

John Rushby
Computer Science Laboratory
SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025

Technical Report CSL-88-7R, September 1988
(Also available as NASA Contractor Report 4187)

* This work was performed for the National Aeronautics and Space Administration under contract NAS1 17067 (Task 5).

Abstract

This report is concerned with the application of software quality and assurance techniques to AI software. It describes work performed for the National Aeronautics and Space Administration under contract NAS1 17067 (Task 5) and is also available as a NASA Contractor Report.

The report is divided into three parts. In Part I we review existing software quality assurance measures and techniques -- those that have been developed for, and applied to, conventional software. This part, which provides a fairly comprehensive overview of software reliability and metrics, static and dynamic testing, and formal specification and verification, may be of interest to those unconcerned with AI software. In Part II, we consider the characteristics of AI-based software, the applicability and potential utility of measures and techniques identified in the first part, and we review those few methods that have been developed specifically for AI-based software. In Part III of this report, we present our assessment and recommendations for the further exploration of this important area. An extensive bibliography with 194 entries is provided.

Contents

1 Introduction
  1.1 Motivation
  1.2 Acknowledgments

Part I  Quality Measures for Conventional Software

2 Software Engineering and Software Quality Assurance
  2.1 Software Quality Assurance

3 Software Reliability
  3.1 The Basic Execution Time Reliability Model
  3.2 Discussion of Software Reliability

4 Size, Complexity, and Effort Metrics
  4.1 Size Metrics
  4.2 Complexity Metrics
    4.2.1 Measures of Control Flow Complexity
    4.2.2 Measures of Data Complexity
  4.3 Cost and Effort Metrics
  4.4 Discussion of Software Metrics

5 Testing
  5.1 Dynamic Testing
    5.1.1 Random Test Selection
    5.1.2 Regression Testing
    5.1.3 Thorough Testing
      5.1.3.1 Structural Testing
      5.1.3.2 Functional Testing
    5.1.4 Symbolic Execution
    5.1.5 Automated Support for Systematic Testing Strategies
  5.2 Static Testing
    5.2.1 Anomaly Detection
    5.2.2 Structured Walk-Throughs
    5.2.3 Mathematical Verification
      5.2.3.1 Executable Assertions
      5.2.3.2 Verification of Limited Properties
    5.2.4 Fault-Tree Analysis
  5.3 Testing Requirements and Specifications
    5.3.1 Requirements Engineering and Evaluation
      5.3.1.1 SREM
    5.3.2 Completeness and Consistency of Specifications
    5.3.3 Mathematical Verification of Specifications
    5.3.4 Executable Specifications
    5.3.5 Testing of Specifications
    5.3.6 Rapid Prototyping
  5.4 Discussion of Testing

Part II  Application of Software Quality Measures to AI Software

6 Characteristics of AI Software

7 Issues in Evaluating the Behavior of AI Software
  7.1 Requirements and Specifications
    7.1.1 Service and Competency Requirements
    7.1.2 Desired and Minimum Competency Requirements
  7.2 Evaluating Desired Competency Requirements
    7.2.1 Model-Based Adversaries
    7.2.2 Competency Evaluation Against Human Experts
      7.2.2.1 Choice of Gold Standard
      7.2.2.2 Biasing and Blinding
      7.2.2.3 Realistic Standards of Performance
      7.2.2.4 Realistic Time Demands
    7.2.3 Evaluation against Linear Models
  7.3 Acceptance of AI Systems
    7.3.1 Identifying the Purpose and Audience of Tests
    7.3.2 Involving the User
      7.3.2.1 Performance Evaluation of AI Software

8 Testing of AI Systems
  8.1 Dynamic Testing
    8.1.1 Influence of Conflict-Resolution Strategies
    8.1.2 Sensitivity Analysis
    8.1.3 Statistical Analysis and Measures
    8.1.4 Regression Testing and Automated Testing Support
  8.2 Static Testing
    8.2.1 Anomaly Detection
    8.2.2 Mathematical Verification
    8.2.3 Structured Walk-Throughs
    8.2.4 Comprehension Aids

9 Reliability Assessment and Metrics for AI Systems

Part III  Conclusions and Recommendations for Research

10 Conclusions
  10.1 Recommendations for Research

Bibliography

Chapter 1

Introduction

This report is concerned with the application of software quality and evaluation measures to AI software and, more broadly, with the question of quality assurance for AI software. By AI software we mean software that uses techniques from the field of Artificial Intelligence. (Genesereth and Nilsson [73] give an excellent modern introduction to such techniques; Harmon and King [84] provide a more elementary overview.) We consider not only metrics that attempt to measure some aspect of software quality, but also methodologies and techniques (such as systematic testing) that attempt to improve some dimension of quality, without necessarily quantifying the extent of the improvement.

The bulk of the report is divided into three parts. In Part I we review existing software quality measures -- those that have been developed for, and applied to, conventional software. In Part II, we consider the characteristics of AI software, the applicability and potential utility of measures and techniques identified in the first part, and we review those few methods that have been developed specifically for AI software. In Part III of this report, we present our assessment and recommendations for the further exploration of this important area.

1.1 Motivation

It is now widely recognized that the cost of software vastly exceeds that of the hardware it runs on -- software accounts for 80% of the total computer systems budget of the Department of Defense, for example. Furthermore, as much as 60% of the software budget may be spent on maintenance. Not only does software cost a huge amount to develop and maintain, but vast economic or social assets may be dependent upon its functioning correctly. It is therefore essential to develop techniques for measuring, predicting, and controlling the costs of software development and the quality of the software produced.

The quality-assurance problem is particularly acute in the case of AI software -- which for present purposes we may define as software that performs functions previously thought to require human judgment, knowledge, or intelligence, using heuristic, search-based techniques. As Parnas observes [149]:

"The rules that one obtains by studying people turn out to be inconsistent, incomplete, and inaccurate. Heuristic programs are developed by a trial and error process in which a new rule is added whenever one finds a case that is not handled by the old rules. This approach usually yields a program whose behavior is poorly understood and hard to predict."

Unless compelling evidence can be adduced that such software can be "trusted" to perform its function, then it will not -- and should not -- be used in many circumstances where it would otherwise bring great benefit. In the following sections of this report, we consider measures and techniques that may provide the compelling evidence required.

1.2 Acknowledgments

Alan Whitehurst of the Computer Science Laboratory (CSL) and Leonard Wesley of the Artificial Intelligence Center (AIC) at SRI contributed material to this report. It is also a pleasure to acknowledge several useful discussions with Tom Garvey of AIC, and the careful reading and criticism of drafts provided by Oscar Firschein of AIC and Teresa Lunt of CSL. The guidance provided by our technical monitors, Kathy Abbott and Wendell Ricks of NASA Langley Research Center, was extremely valuable.

Part I

Quality Measures for Conventional Software
