arXiv:1501.00820v3 [cs.LO] 14 Feb 2016

Abstract

In cyber-physical systems, software may control safety-significant operations. This report discusses a method to structure software testing to measure the statistical confidence that algorithms are true to their intended design. The subject matter appears in two main parts: theory, which shows the relationship between discrete systems theory, software, and the actuated automaton; and application, which discusses safety demonstration and indemnification, a safety assurance metric.

The recommended form of statistical testing involves sampling algorithmic behavior in a specific area of safety risk known as a hazard. When this sample is random, it is known as a safety demonstration. It provides evidence for indemnification, a statistic expressing an assured upper bound for accident probability. The method obtains results efficiently from practical sample sizes.

Keywords: software, safety, hazard, demonstration, operational profile, automata, confidence, statistics

Software Safety Demonstration and Indemnification

Odell Hegna
[email protected]

Chapter 1 Prologue

1.1 Copyright

This document may be freely copied or modified in accordance with the Creative Commons Attribution license (footnote 1).

1.2 Executive summary

In systems of integrated hardware and software, the intangible nature of software raises the question of fitness in roles bearing safety risk. Such a safety risk in software, known as a hazard, is a region of code involving safety constraints (requirements) necessitating some degree of verification. Hazards are identified and monitored by safety engineers, and possess hypothetical (threatened) frequency and severity ratings. During its development, potentially hazardous software merits not only rigorously controlled general engineering process, but also quantitative assurance of hazards within particular products.

1.2.1 Approach

The topic of this essay is assuring the interplay between safety constraints (requirements) and software control.
Software is appreciated as a branching process whose permutations are too numerous to test exhaustively. Barring exhaustive testing, statistical verification remains an option.

The degree of statistical verification will be expressed as residual risk, a contravariant quantity. A software item's total risk has many constituents. For instance, any software communicating with an operator runs human factors risk. Statistical safety risk, one constituent of total risk, focuses on hazardous code. Code is potentially hazardous if its statistical risk (the numerical product of frequency of execution, probability of error, and expected safety loss per error) is sufficiently high.

The subject matter results from applying standard mathematics to a well-known (but cloudy) problem. It is organized according to a mathematized version of the Joint Software Systems Safety Engineering Handbook [6] of the United States Department of Defense (2010). This mathematization affords a deeper structural view of safety engineering. This view inspires a unification of that document's risk management goals, and exerts commonality against its disparate hardware and software risk disciplines.

Footnote 1: http://creativecommons.org/licenses/by/3.0/

1.2.2 Synopsis

A hazard in software is a region of code involving safety requirements, whose logical correctness is essential to safe operation (hazards do not embrace all forms of software error). This condition motivates some degree of formal verification of correctness. Hazards are measured according to their statistical risk, which is the numeric product of three factors associated with a software point. First is the point's frequency of execution. Second is the probability of encountering error during execution of a code trajectory that reaches the point. Third is the point's severity, its safety consequence (loss) per error.

Software safety assurance may be accomplished via management of statistical risk. It is organized into two phases. First is guestimation, which uses expert opinion to yield a rough ranking of hazard risks, based on the three constituents. The subject of this essay is the approximation phase which follows, producing refined post-development risk for each hazard.
This refinement, following hardware practice, is known as a residual risk.

Data for calculation of each residual risk is drawn from a collection of specially constructed tests called a demonstration. Because error is associated with software sequences rather than points, demonstrations exercise a variety of approaching trajectories. Each demonstration does produce a maximum-likelihood estimate of the probability of walking into error, but this figure isn't useful because it is usually zero. Define the indifference upper bound as the upper bound at 50% confidence, so the odds of underestimation balance those of overestimation. The indifference upper bound yields unbiased assurance.

Indemnification is the risk level assured by the indifference upper bound on proportion failing some test of a demonstration. The indifference upper bound, which is non-zero, functionally replaces the maximum-likelihood estimate. Owing to its definition as a confidence upper bound, indemnification is also a quality assurance metric on completeness of safety testing relative to risk level. This essay proposes a re-unification of hardware and software risk, prescribing that statistical risk become the common standard bearer.

1.2.3 Significance

Profound difference exists between this essay's proposal and current standards such as MIL-STD-882E and its companion Joint Software Systems Safety Engineering Handbook. Present adherents of MIL-STD-882E must break new procedural ground if they intend to evaluate statistical risk. The protocol confuses statistical assurance with other techniques for design vetting. Perhaps in an effort to encompass both, the standard's analysis describes a hierarchy for software based on safety impact: potential human intervention, redundancy, or level of safety responsibility. This protocol's measure is a hierarchy of discrete categories rather than a continuum variable. It may enable some types of analysis, but it renders statistical risk assessment impossible.
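
As a numerical illustration of the indifference upper bound defined in the synopsis (section 1.2.2), the following sketch assumes a simple binomial sampling model that this summary does not itself state: n independent random tests, each failing with the same unknown probability p, with zero failures observed. Under that assumption the maximum-likelihood estimate is 0, while the one-sided upper bound at confidence c solves (1 - p)^n = 1 - c, giving p = 1 - 0.5^(1/n) at 50% confidence. The function names and the example figures are hypothetical, and the essay's own statistical model may differ in detail.

```python
def upper_bound_zero_failures(n_tests: int, confidence: float = 0.5) -> float:
    """Upper confidence bound on failure probability after n_tests passes.

    Assumes independent tests with a common failure probability and zero
    observed failures (binomial model; an assumption of this sketch).
    With confidence=0.5 this is the 50% ("indifference") upper bound.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve (1 - p) ** n_tests == 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_tests)


def statistical_risk(frequency: float, p_error: float, severity: float) -> float:
    """Product of the three factors named in the synopsis."""
    return frequency * p_error * severity


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    frequency = 1000.0   # executions per operating year
    severity = 50000.0   # loss per error, arbitrary units
    for n in (10, 100, 1000):
        bound = upper_bound_zero_failures(n)
        print(n, bound, statistical_risk(frequency, bound, severity))
```

Under this model the bound is never zero and shrinks roughly as ln(2)/n, so enlarging the sample lowers the assured risk level.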
These standards modify the definition of risk, preferring to introduce separate risk concepts for hardware and software. According to the military standard, statistical risk exists only for hardware, and is consequently lost for software. This essay proposes a re-introduction of statistical risk to software, with the result that hardware and software risks become interchangeable in meaning.

1.3 Apologies

This essay is not rendered to academic standards of quality; it benefits from no formal literature search and was written in isolation. The experienced reader may find terms in nonstandard context. The author has strived to maintain consistency, but admits deficiency in standardization of terminology. The author apologizes for resulting inconvenience.

The author also apologizes that the concepts discussed here are nascent. Difficult engineering must be accomplished before a mature technology is available for commercialization.

The author features mathematics centrally (footnote 2), presuming undergraduate background and providing necessary computer science. This approach risks estranging many worthy engineering readers; however, a mathematical foundation is necessary. This essay serves that need.

1.4 Informal introduction

1.4.1 Hall's definitions

The concept of system is intuitively obvious but describing its analytical properties is tricky. A famous example appeared in Hall's 1962 treatise on systems engineering methodology [2, p. 60ff]. Hall proposes succinct definitions of the terms system and environment:

• A system is a set of objects with relationships between the objects and between their attributes.
• For a given system, the environment is the set of all objects outside the system: (1) a change in whose attributes affect the system and (2) whose attributes are changed by the behavior of the system.

These definitions allow a component to belong either to the system or the environment, because Hall's definitions are ambiguous (different phraseology is used that is actually equivalent). Our regimen modifies Hall's historical definitions to remove ambiguity; systems will be regarded as all-inclusive.
From the standpoint of relevant influences, there simply is no “outside” influence. We clarify that a system is characterized as a sequence of stimulus and response. Below, “component” is a synonym for “object.” These descriptions still suffer some circularity:

• A system is the set of all components having attributes, changes to which affect the system's response.
• The environment is the set of all components inside the system whose attributes are not affected by the system's response.

In summary, the environment affects the system's response, but the system response does not affect the environment's attributes. Factors outside the system may influence the environment's attributes.

Footnote 2: The author is a retired software safety engineer, not a mathematician.

1.4.2 Classification

In a system, the terms mechanism, construct, and model have specially differentiated meaning.

• Mechanisms are abstractions, not necessarily separable, whose structure emulates all behaviors of a given phenomenon.
• Constructs are isolatable substructures of a mechanism, for examining particular behaviors.
• Models interpret a behavior of a construct in terms of alternate infrastructure.

For example, hardware and software are mechanisms and operational profiles are constructs, while safety risk is a model. A description of major mechanisms, constructs, and models follows.

Hardware mechanism

The dynamics of hardware components are portrayed as constrained real-time trajectories over a state space. A trajectory is a mapping from time into state space. A constraint relation is an alternative expression for what is familiar as an equation or inequality of state; it is merely a substitute for an equivalent equation. It is characteristic of systems that at any time, intersecting constraints delimit apparently independent choices so that just one is valid. Interacting constraints endow hardware with capabilities. Constraints can be classified according to their engineering significance. A violated safety constraint jeopardizes life, health, equipment, or surroundings.
Software mechanism

Theoretical investigations of discrete reactive systems (footnote 3), or software, can be accomplished using a simple substitute for programming languages: the automaton. Automata are purely mechanistic structures positioned in the machine/language spectrum somewhere between the Turing machine and the Gurevich abstract state language (ASM). Besides their adequacy for examining theory, automata avoid selection of a preferred programming language, which would unnecessarily particularize concepts intended to be general.

Automata perform work in discrete units called steps. A sequence of steps is further known as a walk. This essay presents the actuated automaton, a variant form whose work is deterministic conditional sequencing and application of instructions. Instructions are represented by mathematical morphisms, collectively known as functionalities. The order of these functionalities is governed by the actuated automaton through its state. Iteration of an actuated automaton emulates an operating program.

Reactive mechanism

Reactive systems characteristically need some means to transfer external stimuli. The reactive mechanism contains structures enabling the hardware and software mechanisms to inter-operate cohesively. The nature of time differs between the two; time is a continuum in hardware while it is discrete in software. The clock synchronization permits integration by specifying an ordered cross-reference between discrete and real time. What remains is the need for inter-mechanism communication. Two forms exist:

• Sensors convey information about the hardware environment to the software mechanism. Using the clock synchronization, a real-time trajectory is sampled into a sequence of events.
• Transducers map a point of the software state into a trajectory in hardware. This trajectory is called a control.

Footnote 3: A reactive system responds to its environment, or external stimuli.

Cone construct

The actuated automaton has a generalized inverse called the converse. Through reiteration, the converse constructs a partially ordered set (poset) of effects and potential causes.
This poset is not linear because an effect may have more than one preceding cause.

A cone is the result of decomposing the poset into constituent chains called reverse walks. Viewed as forward walks (reversing the reverse walks), these chains are ordinary sequences of causes and consequent effects. The collection of forward chains converges to a point known as the crux, while the cone diverges from the same point. One subcomponent of a cone is its edge, which is the collection of steps radially opposite the crux.

Operational profile construct

As previously mentioned, automata accomplish work in units called steps. An operational profile is a measure of a step's excitation probability relative to a reference set of steps.

The “reference set of steps” itself historically represented a software usage pattern, so it sought to resemble the natural mix of functionalities in deployed software. This idea is abstracted to a potentially purposeless reference set, but the software usage pattern remains important. From the usage pattern, along with the automaton's static logic, arises the very notion of probability.

An operational profile may be applied to the edge of a cone.

Safety risk model

Accidents occur haphazardly with varying frequency and severity. In the context of software, risk expresses the potential impact of algorithmic design errors. Since its true extent is unknown, software safety risk is expressed as a statistical hypothesis. The compound Poisson process is a model simulating discrete event-based losses that accumulate with passing time. It offers the advantage of independent parameterization of the loss' intensity (frequency) and severity. Indemnification is statistical assurance of software safety.

1.4.3 Principle of emergence

Emergence [9] is a broad principle of physics describing a process whereby larger entities possessing a property arise through interactions between simpler entities that themselves do not exhibit the property. Particularized to software testing, the “principle of (weak) emergence” is that erroneous software can do no actual harm until certain of its values emerge from the realm of digital logic into a physical subsystem.
This principle inquires both into mechanics of transduction, and how transducible values come into being. The automaton of the software mechanism answers the latter question. If software hazard is to be evaluated starting at points of transduction and proceeding backwards through internal logic, then the automaton must support reverse inference, meaning reversed in computational order, from final conclusion to possible premise (see §1.4.2).

1.5 Choice fork

Chapter 2 (Discrete Systems Theory) details relationships between systems theory and automata. From a mathematical standpoint the material is necessary, but there are readers for whom this chapter would duplicate existing knowledge. After verifying their understanding of operational profiles, section 2.7, they are invited to skip forward to Chapter 3.

Chapter 2 is summarized here to decide whether to skip it. Software is described using a triad of structures: the process, the procedure, and the path; not all are independent. Rudiments underlying these structures consist of ensembles and Cartesian products. Walks, the actuated automaton, converse automata, reverse walks, and cones follow.

Those desiring detailed introduction to fundamentals may access Appendix A, which reviews groundwork and notation used here. Its highlights include that an ensemble is a mapping from a set of stimuli into a set of responses. Ensembles are denoted by uppercase Greek letters such as $\Psi$. The general Cartesian product of an ensemble, called a choice space, is denoted $\prod\Psi$.

Chapter 2 Discrete systems theory

Discrete systems theory (software) is identified with the actuated automaton.

2.1 Process

Chains of stimulus and response characterize reactive discrete systems. In this chain, successive links are not independent: the response effected in one link feeds forward into the stimulus of the following link. For instance, in a system of cog-wheels and escapements, gear train movement accomplished in one stage of operation becomes input to the next.
A formalism called a process captures this notion of sequential inheritance. We assemble processes from a simple unit called the frame, which is a two-part structure consisting of starting and ending conditions. A process is a sequence of frames such that the starting condition of each frame subsumes the ending condition of its predecessor frame. Interpreted in systems language, a frame's starting condition is a stimulus and its ending condition is a response. Current response re-appears as part of future stimulus.

Definitions of ensemble and related basic concepts appear in Groundwork, Appendix A.1ff.

Definition 2.1.1. The pair of ensembles $\langle\Psi,\Phi\rangle$ is a basis if $\Phi \subseteq \Psi$.

It is necessary to represent states (variables) which are used but not set, so-called “volatile” variables. For example, such variables can hold the transient values of sensors. The remainder $\Psi \setminus \Phi$ is the generating ensemble of volatile variables (see terminology following definition A.3.4).

Definition 2.1.2. The frame space $F$ of basis $\langle\Psi,\Phi\rangle$ is the set $\prod\Psi \times \prod\Phi$. A member $f \in F$ is a frame.

Terminology. Let $f = (\psi,\varphi) \in \prod\Psi \times \prod\Phi$ be a frame. The choice $\psi \in \prod\Psi$ is the frame's starting condition (abscissa) and $\varphi \in \prod\Phi$ is the frame's ending condition (ordinate).

Two frames may be related such that the ending condition of one frame is embedded within the next frame's starting condition. This stipulation is conveniently expressed as a mapping restriction:

Definition 2.1.3. Let $\langle\Psi,\Phi\rangle$ be a basis with frames $f = (\psi,\varphi)$, $f' = (\psi',\varphi') \in \prod\Psi \times \prod\Phi$. Frame $f$ conjoins frame $f'$ if $\psi' \,|\, \operatorname{dom}\Phi = \varphi$.

Notation. A sequence in a set $S$ is some mapping $\sigma\colon \mathbb{N} \to S$; that is, $\sigma \in S^{\mathbb{N}}$. The anonymous sequence convention allows reference to a sequence using the compound symbol $\{s_n\}$, understanding $s \in S$. Formally, the symbol $s_i$ denotes that term $(i, s_i) \in \{s_n\}$. The convention is clumsy expressing functional notation; for instance $s_i = \{s_n\}(i)$ means $i \overset{\{s_n\}}{\mapsto} s_i$.

Definition 2.1.4. Let $\langle\Psi,\Phi\rangle$ be a basis with sequence of frames $\{f_n\}\colon \mathbb{N} \to \prod\Psi \times \prod\Phi$. The sequence is successively conjoint if $f_i$ conjoins $f_{i+1}$ for each $i \geq 1$.

Definition 2.1.5. With $\langle\Psi,\Phi\rangle$ a basis, a process is a successively conjoint sequence of frames $\mathbb{N} \to \prod\Psi \times \prod\Phi$.

Definition 2.1.6. Let $\langle\Psi,\Phi\rangle$ be a basis with frame space $F = \prod\Psi \times \prod\Phi$.
Define the abscissa projection $\operatorname{absc}\colon F \to \prod\Psi$ by $(\psi,\varphi) \overset{\operatorname{absc}}{\mapsto} \psi$. Define the ordinate projection $\operatorname{ord}\colon F \to \prod\Phi$ by $(\psi,\varphi) \overset{\operatorname{ord}}{\mapsto} \varphi$.

Definition 2.1.7. Let $\langle\Psi,\Phi\rangle$ be a basis with persistent-volatile partition $\Psi = \Phi\Xi$ (see appendix §A.4). Suppose $f$ is a frame in $\prod\Psi \times \prod\Phi$. The reactive state of frame $f$ is $\psi = \varphi\xi = \operatorname{absc} f$. The event or volatile excitation state of frame $f$ is $\xi = (\operatorname{absc} f) \,|\, \operatorname{dom}\Xi$. Similarly, the persistent state of frame $f$ is $\varphi = (\operatorname{absc} f) \,|\, \operatorname{dom}\Phi$.

Terminology. Process concepts interpret into systems language. The reactive space $\prod\Psi$ contains the system stimulus. Sequential conjointness allows circumstantial interpretation of the choice space $\prod\Phi$. It is the system's response in the context of the frame ending condition. To place $\prod\Phi$ in context of the frame's reactive state, the Cartesian product $\prod\Phi = \prod(\Psi \,|\, \operatorname{dom}\Phi) = (\prod\Psi) \,|\, \operatorname{dom}\Phi$ [by theorem A.3.17] is the persistent state space. Using this nomenclature, sequential conjointness is summarized that each frame's response becomes the next frame's persistent state, symbolically $\operatorname{ord} f_i = \operatorname{absc} f_{i+1} \,|\, \operatorname{dom}\Phi$.

2.2 Procedure

The procedure is useful to portray a process frame as a transformation from the stimulus space to the response space. To distinguish such transformations from other mappings, we use the special term “functionality” and stipulate that the collection of functionalities is a finite set called a catalog. The term “catalog” will later be applied to resource sets identified with an automaton.

2.2.1 Functionality

The functionality generalizes the frame. If $f_i = (\psi_i,\varphi_i)$ is the $i$th process frame, this concept permits writing $\varphi_i = f_i(\psi_i)$, where $f_i$ is some functionality belonging to catalog $F$.

Definition 2.2.1. A functionality is a mapping whose domain and codomain are choice spaces (definition A.3.4), with the codomain a subspace (definition A.3.16) of the domain.

Lemma 2.2.2. Let $\langle\Psi,\Phi\rangle$ be a basis. Any mapping $f\colon \prod\Psi \to \prod\Phi$ is a functionality.

Proof. As a basis, definition 2.1.1 establishes that $\Psi$ and $\Phi$ are ensembles with $\Phi \subseteq \Psi$. Since $\Psi$ and $\Phi$ are ensembles, definition A.3.4 asserts that $\prod\Psi$ and $\prod\Phi$ are choice spaces.
Theorem A.3.21 provides that $\prod\Phi$ is a subspace of $\prod\Psi$ because $\Phi \subseteq \Psi$. Since $f \in (\prod\Phi)^{\prod\Psi}$, $f\colon \prod\Psi \to \prod\Phi$ is a mapping from one choice space to another, which is a subspace of the first. These conditions satisfy the premises of definition 2.2.1.

Remark (functionality versus function). In its programming sense, the term “function” will not be used here. A mathematical functionality differs from a software function; functionalities lack arguments. By virtue of its calling protocol, a programming function is effectively a class of functionalities.
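
The definitions of section 2.1 and definition 2.2.1 can be exercised on a toy example. The sketch below is illustrative only and rests on assumptions not drawn from the text: a choice is modeled as a Python dict assigning one value to each variable name, dom Φ is given as a tuple of persistent variable names, and only a finite prefix of a process is built (the definitions use sequences indexed by the natural numbers). The names `restrict`, `conjoins`, `is_process`, `run`, and `accumulate`, and the variables `count` and `sensor`, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Choice = Dict[str, int]        # one value per variable (assumed representation)
Frame = Tuple[Choice, Choice]  # (starting condition psi, ending condition phi)

PERSISTENT = ("count",)        # dom Phi; 'sensor' is the volatile variable


def restrict(choice: Choice, names: Sequence[str]) -> Choice:
    """Restriction of a choice to the given variable names."""
    return {name: choice[name] for name in names}


def conjoins(f: Frame, f_next: Frame, persistent: Sequence[str]) -> bool:
    """Definition 2.1.3: f conjoins f' if psi' restricted to dom Phi equals phi."""
    _, phi = f
    psi_next, _ = f_next
    return restrict(psi_next, persistent) == phi


def is_process(frames: List[Frame], persistent: Sequence[str]) -> bool:
    """Successive conjointness (definition 2.1.4) over a finite prefix."""
    return all(conjoins(a, b, persistent) for a, b in zip(frames, frames[1:]))


def accumulate(psi: Choice) -> Choice:
    """A toy functionality: maps the whole reactive state to a persistent state."""
    return {"count": psi["count"] + psi["sensor"]}


def run(functionality: Callable[[Choice], Choice],
        phi0: Choice, events: List[Choice]) -> List[Frame]:
    """Build frames by feeding each response forward into the next stimulus."""
    frames: List[Frame] = []
    phi = phi0
    for xi in events:
        psi = {**phi, **xi}        # reactive state = persistent part + event
        phi = functionality(psi)   # ending condition of this frame
        frames.append((psi, phi))
    return frames


if __name__ == "__main__":
    sensor_events = [{"sensor": 2}, {"sensor": 5}, {"sensor": 1}]
    frames = run(accumulate, {"count": 0}, sensor_events)
    print(frames)
    print(is_process(frames, PERSISTENT))
```

Note that `accumulate` receives the entire reactive state as a single value rather than a list of separate parameters, which is one way to read the remark that functionalities lack arguments.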