SANDIA REPORT
SAND2007-2758
Unlimited Release
Printed May 2007

Bayesian Methods for Estimating the Reliability in Complex Hierarchical Networks (Interim Report)

Paul T. Boggs, Youssef M. Marzouk, Philippe P. Pébay, John Red-Horse, Kathleen Diegert, and Rena Zurn

Prepared by
Sandia National Laboratories
Albuquerque, New Mexico 87185 and Livermore, California 94550

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under Contract DE-AC04-94-AL85000.

Approved for public release; further dissemination unlimited.

Issued by Sandia National Laboratories, operated for the United States Department of Energy by Sandia Corporation.

NOTICE: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government, nor any agency thereof, nor any of their employees, nor any of their contractors, subcontractors, or their employees, make any warranty, express or implied, or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represent that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government, any agency thereof, or any of their contractors or subcontractors. The views and opinions expressed herein do not necessarily state or reflect those of the United States Government, any agency thereof, or any of their contractors.

Printed in the United States of America. This report has been reproduced directly from the best available copy.

Author affiliations

Paul T. Boggs, Sandia National Laboratories, M.S. 9159, P.O. Box 969, Livermore, CA 94551, U.S.A.
Youssef M. Marzouk, Sandia National Laboratories, M.S. 9051, P.O. Box 969, Livermore, CA 94551, U.S.A.
Philippe P. Pébay, Sandia National Laboratories, M.S. 9051, P.O. Box 969, Livermore, CA 94551, U.S.A.
John Red-Horse, Sandia National Laboratories, M.S. 0828, P.O. Box 5800, Albuquerque, NM 87185, U.S.A.
Kathleen Diegert, Sandia National Laboratories, M.S. 0830, P.O. Box 5800, Albuquerque, NM 87185, U.S.A.
Rena Zurn, Sandia National Laboratories, M.S. 9007, P.O. Box 969, Livermore, CA 94551, U.S.A.

Abstract

Current work on the Integrated Stockpile Evaluation (ISE) project is evidence of Sandia’s commitment to maintaining the integrity of the nuclear weapons stockpile. In this report, we undertake a key element in that process: development of an analytical framework for determining the reliability of the stockpile in a realistic environment of time-variance, inherent uncertainty, and sparse available information. This framework is probabilistic in nature and is founded on a novel combination of classical and computational Bayesian analysis, Bayesian networks, and polynomial chaos expansions. We note that, while the focus of the effort is stockpile-related, it is applicable to any reasonably-structured hierarchical system, including systems with feedback.

Contents

1 Introduction
2 Background
  2.1 Bayesian Networks
  2.2 Bayes’ Theorem
  2.3 Functional Analytic Approach to Probability
  2.4 Variables
3 The Model
  3.1 Class versus instance
  3.2 Reliability
  3.3 First Problem
  3.4 Basic Methodology
4 Specification of a Synthetic System Generator
  4.1 Generic Description of an Individual System
  4.2 Specification of a Sample Set
5 Conclusion
References

1 Introduction

Sandia has tremendous responsibilities in maintaining the integrity of the nuclear weapons stockpile and, in carrying out this mission, manages a comprehensive testing program. A major goal of this program is to determine the reliability of the stockpile and, at the same time, to calculate the confidence in that reliability estimate. This is clearly a cost-intensive activity; thus, the question arises as to whether or not the same levels of overall stockpile integrity can be achieved at lower cost, or if higher levels can be achieved within the same budget constraints.

Weapons systems are tremendously complex, multi-level affairs, and are often rife with uncertainties that can originate in any number of ways. Calculating the reliability in such settings can be extremely difficult; estimating the confidence in the reliability is often even more difficult. In this paper, we set forth our program for investigating these issues.

In this document, we take “reliability” to be the probability that system failure does not occur.
In the context of our development, we define this precisely and discuss our strategy for developing estimates of reliability in complex hierarchical systems composed of interconnected components that inherently possess uncertainty at every level of the system. We also develop a rigorous means of computing the confidence in this estimate. Moreover, owing to data-gathering limitations, among other things, our mathematical descriptions of the underlying uncertainties are often incomplete. This lack of information is itself a source of uncertainty in reliability estimates. Thus, properly accounting for this uncertainty is of paramount importance, particularly in systems where it is impractical, or impossible, to perform a sufficient number of classical tests to ensure a specified confidence level.

In the sections below, we describe the development of a novel mathematical strategy, employing structured probabilistic models, for describing systems-related tests. Our goal will be to develop a rigorous means to assess the overall reliability of the systems, the associated confidence we have in this reliability, and the impact that additional data acquired from subsystem and component-level tests can have. This last ability will allow us to formulate optimization problems that will determine the “best” place in the system to test to give the maximum increase in our confidence in the reliability estimate.

The basis of our strategy will be a Bayesian approach, with which we seek to update the probability distribution for the output of any component given new test data on the performance of the system. This approach is inspired by the work of Martz and Waller [10, 9], who used a Bayesian approach to update reliability estimates based only on pass-fail data and only for special structures of the system. We extend this approach to more general Bayesian networks in which certain components may exist in, or directly affect, multiple levels of the systems.
The strategy will first be applied to time-independent systems with continuous-valued data, but will be naturally extensible to time-dependent systems with continuous data. “Continuous data” refers to real-valued performance variables, e.g., voltage, that must fall between given bounds for the system to be considered working acceptably. A final foundational element of our strategy is the use of polynomial chaos expansions (PCE), with which we will represent the random variables and processes that characterize the system at all of the levels. This facilitates several important tasks: uncertainty propagation through the system; analysis of the uncertainties that are present due to lack of information; and generalizations to the time-dependent case.

Exact or approximate inference algorithms will allow probabilistic information characterizing reliability to be updated through and across systems. This is crucial for estimating the uncertainty in the overall system reliability, and for infusing appropriate updates when additional information and data are acquired.

This work is related to and inspired by our current work on the Integrated Stockpile Evaluation (ISE) project, but is more generally applicable to any hierarchical system. In fact, it should be applicable to any reasonably structured system, including, e.g., systems that have feedback.

This report is organized as follows. For the sake of completeness, we provide in section 2 some technical background on several topics. First, we describe the Bayesian networks that we will consider and why they are appropriate to this study. Next, we review Bayes’ theorem in the form that we exploit. We then describe function analytic approaches to probability and the fundamentals of polynomial chaos expansions (PCE) that are essential to our approach. In this context, we consider the “stochastic dimension” and methods for preventing an explosion of this dimension.
In section 3, we develop the basic model that we will use and consider some of its properties. We also formally define reliability in this setting. We then consider a very simple hierarchical system with linear functional relationships among the components. This allows us to work out in detail the effects of obtaining test data at various levels in the system. Some counter-intuitive results emerge from this exercise that have helped us to understand this effort better. In section 4, we develop a synthetic generator that will be used to produce hierarchical test systems with specified properties. These, in turn, will be used to test our final strategy over a variety of systems. In section 5 we conclude with a discussion of the next steps in our research.

2 Background

In this section we provide some background on some of the fundamental concepts that will be needed in our discussion.

2.1 Bayesian Networks

The systems that we are considering are assumed to possess uncertainty, either inherent or due to lack of information, and this is a fundamental characteristic that we considered in establishing an appropriate mathematical context. Here, we describe this and other important system considerations:

1. They are engineered hierarchical systems; any component may be composed of subcomponents, which, in turn, may themselves contain subcomponents. Also, several instances of a single type of component may appear in a given system. (Footnote 1: These will be referred to as “instances” of a “class”; see §3.) For example, a hydraulic system may contain five valves, three of which have identical model numbers and specifications. It is also possible that a particular instance of a component will play a role in more than one subsystem. A simple example of this is an automobile battery, which has functions both in the starting circuitry and in, say, the instrument panel.

2. We identify the $i$-th component in the system with a variable $X_i$. This variable may represent voltage, impedance, yield stress, or any other quantity of interest. Because of measurement error, limited opportunities for testing, component-to-component variability, aging, and environmental influences, our knowledge of the exact value of $X_i$ will be imprecise. Thus, we adopt a probabilistic approach and treat $X_i$ as a random variable. Information on $X_i$ will be expressed in this probabilistic framework, say using probability distributions or functional representations. Furthermore, we will have the ability to update these probabilistic descriptions using Bayesian inference as more data become available.

3. Bayesian networks will be employed to aggregate information into a system-level probabilistic model, and to encode conditional independence relationships among various components as imposed by the system structure.

Regarding the last item above, consider the joint probability distribution of the random variables $\{X_i : i \in V\}$, where $V$ is the set of component indices. This distribution is high-dimensional and complex, describing the performance of every component of the system and dependencies among the performance values. We would like to estimate and update this distribution based on test data. We would also like to focus our attention on the performance of particular components and subsystems; in other words, to examine the marginal distribution of a particular subset $\{X_i : i \in U \subset V\}$. And we would like to calculate conditional probabilities, i.e., the probability of one subset of the variables given the values of another subset of the variables. These tasks will become computationally intractable unless we take advantage of the structure of the system. In particular, we propose using the engineered structure of the system to factor the joint probability distribution into a number of conditional probabilities.

The above serves as motivation for using probabilistic graphical modeling capabilities present in Bayesian networks. We provide a brief description of this concept here; for more details see, for example, Jensen [5] or Jordan [6].
Our notation here follows that of Jordan [6]. Let $G(V,E)$ be a directed acyclic graph (DAG) with nodes $V$ and edges $E$. Let $X_V \equiv \{X_i : i \in V\}$ be a collection of random variables indexed by the nodes of the graph. Each node $v \in V$ is associated with a set of “parent” nodes, i.e., all the nodes from which a directed edge points towards $v$. This set of parents is denoted by $\pi_v$ and may be the empty set. Using any set of indices as a subscript, we let $X_{\pi_v}$ denote the set of random variables associated with the parents of $v$. In a Bayesian network, the joint probability distribution of $X_V$ factors as follows:

\[
p(x_V) = \prod_{v \in V} p(x_v \mid x_{\pi_v}),
\]

where $p(\cdot)$ is a probability density function in the case of real-valued $X$ and a probability mass function in the case of discrete-valued $X$.

In the present application, the structure of the graph $G$ will reflect the structure of the engineered system. But this correspondence need not be exact; the Bayesian network based on $G$ may include different types of nodes representing the class and specific instances of a particular component, and may also include nodes representing environmental conditions or other external factors that are relevant to system performance.

2.2 Bayes’ Theorem

The ability to update our estimate of the reliability of a system based on newly acquired data is critical. And, since we will be using Bayesian updating to effect this, Bayes’ Theorem is fundamental to our work. We now provide a brief summary of this important result. By definition, the conditional probability of event $A$ given event $B$ is given by

\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}.
\]

Similarly, the conditional probability of event $B$ given event $A$ is given by

\[
P(B \mid A) = \frac{P(A \cap B)}{P(A)}.
\]

Thus, by combining these two equations and rearranging terms, we obtain Bayes’ Theorem:

\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}.
\]

In this setting, the terms have standard names.

• $P(A)$ is called the prior probability. It is our state of knowledge prior to observing event $B$.

• $P(B \mid A)$ is the probability of $B$ given $A$. This is often called the likelihood of $A$.

• $P(A \mid B)$ is called the posterior probability of $A$ given $B$.
That is, it is the updated probability of $A$ given the new event $B$. If subsequent data is collected, this becomes the prior in the next round.

• The term $P(B)$ is called the marginal probability of $B$ and acts as a normalizing constant.

Although we have derived Bayes’ Theorem in terms of probabilities, the same result holds for probability densities. In particular, for a probability density, say $f(x)$, $x \in X$, we can write, given data $y \in Y$,

\[
f(x \mid y) = \frac{f(y \mid x)\, f(x)}{f(y)}.
\]

Again, the terms have standard names:

• $f(x)$ is called the prior distribution of $X$. It is our state of knowledge about the random variable $X$ prior to observing $Y = y$.

• $f(y \mid x)$ is the likelihood function of $x$ given $Y = y$.

• $f(x \mid y)$ is called the posterior distribution of $X$ given $y$. That is, it is the updated probability density of $X$ given the data $Y = y$. If subsequent independent data is collected, this distribution becomes the prior in the next round.

• The term $f(y)$ is the marginal distribution of $y$, also called the evidence, and acts as a normalizing constant.

The notational abuse of using $f$ in all of these is conventional; each one is, in fact, a different function, distinguished by its arguments.

Our approach is to use Bayes’ Theorem to update distributions; thus we will use the latter form in our development. Finally, we note that it is easy to extend Bayes’ Theorem to two or more variables. Specifically, in the context of the hierarchical system structure that we assume, we obtain the following useful result:

\[
f(x \mid y, z) \propto f(y \mid x)\, f(z \mid x)\, f(x),
\]

where we have ignored the normalizing constant. This is true if the data $y$ and $z$ are conditionally independent given $X$.

2.3 Functional Analytic Approach to Probability

In the field of probability, there are two primary means of analysis. The first of these is the traditional probabilistic approach, in which one is concerned with properties of certain probabilistic entities, such as cumulative distribution functions, probability density functions, or a statistical moment, and their behavior under transformation or limit operations.
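The density form of Bayes’ Theorem in Section 2.2 is an instance of this traditional, density-based approach, and the two-observation result $f(x \mid y, z) \propto f(y \mid x) f(z \mid x) f(x)$ can be checked numerically. The following is a minimal sketch, not the report's implementation. It assumes a three-node network in which $X$ is the only parent of $Y$ and $Z$, a normal prior, and normal measurement noise; all numerical values and the names `normal_pdf` and `grid_posterior` are hypothetical choices made for illustration.

```python
import math

def normal_pdf(value, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((value - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def grid_posterior(grid, step, prior, likelihoods):
    """Return f(x | data) on a uniform grid from f(x) and a list of likelihoods f(d | x)."""
    # Unnormalized posterior: prior times the product of the likelihoods,
    # valid when the data are conditionally independent given x.
    unnormalized = [prior(x) * math.prod(lik(x) for lik in likelihoods) for x in grid]
    evidence = sum(unnormalized) * step  # Riemann-sum estimate of the normalizing constant
    return [u / evidence for u in unnormalized]

# Hypothetical numbers, chosen only for illustration (not from the report).
step = 0.005
grid = [i * step for i in range(2001)]       # x from 0 to 10
prior = lambda x: normal_pdf(x, 5.0, 1.0)    # f(x)
lik_y = lambda x: normal_pdf(5.6, x, 0.5)    # f(y | x) with y = 5.6 observed
lik_z = lambda x: normal_pdf(5.2, x, 0.5)    # f(z | x) with z = 5.2 observed

posterior = grid_posterior(grid, step, prior, [lik_y, lik_z])
post_mean = sum(x * p for x, p in zip(grid, posterior)) * step

# Closed-form check for this all-Gaussian case: precisions add,
# and the posterior mean is the precision-weighted average.
precision = 1.0 / 1.0**2 + 1.0 / 0.5**2 + 1.0 / 0.5**2
exact_mean = (5.0 / 1.0**2 + 5.6 / 0.5**2 + 5.2 / 0.5**2) / precision
print(post_mean, exact_mean)
```

The grid is used here only because it makes the normalizing constant explicit; in the all-Gaussian case the closed form is available, which is what allows the comparison at the end.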
There is, however, an alternative approach, which we will refer to as a function analytic approach to probability. The basis of this approach is the recognition that random variables (RVs) and random fields (RFs) are functions with at least a subset of the domain of these functions being a sample space, $\Omega$, of elementary events that is well-defined in the context of a probability space consisting of the triple $(\Omega, \Sigma, P)$. In addition to the sample space, the probability space consists of a $\sigma$-algebra $\Sigma$ of subsets of $\Omega$ called events, and a probability measure $P$. Each of these entities has well-established and precise mathematical properties.

Using this structure, it is easy to observe that a random function is, in fact, not random at all, and that whatever randomness exists in the framework is entirely associated with the occurrence of events. Thus, it seems reasonable to cast these random functions in a function analytic setting. Within this setting there are many analysis possibilities: algebraic, semi-group, topological, etc. Our goal is to develop approximations to exact functions in a Hilbert space setting.

We emphasize that under identical assumptions, probabilistic solutions that result from either analytical path are identical. They offer competing means to package information; the approach taken should be dictated by the particulars associated with a given application class. In our case, the primary goal is first to generalize deterministic problems to accommodate input parameters that are modeled as RVs with approximate probabilistic information. The function analytic approach is particularly well-suited to this. Based on experience, we expect no impediments to generalizing our implementation to time-dependent problems using RFs.

2.3.1 Scalar Polynomial Chaos Expansions

The polynomial chaos expansion (PCE) method was first conceived by N. Wiener as a means to integrate operators possessing differential Brownian motion, at the time viewed as chaotic, as an external forcing influence.
While the random process he described is quite general, we will take a simpler route while noting that the transition to the more general case, first progressing to vectors containing RVs as components, then to more general random processes, is possible [13].

Consider two real-valued scalar RVs $X$ and $Y$, defined on $(\Omega, \Sigma, P)$, each with finite variance. Assume that there exists a functional transformation, $T$, between $X$ and $Y$; that is, that $X = T(Y)$ is well-defined.
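To make the transformation $X = T(Y)$ concrete, the sketch below works through one common form of a scalar PCE, which may differ in detail from the construction the report goes on to develop. It assumes that $Y$ is standard normal and that $T(y) = e^y$, both choices made here purely for illustration, and it projects $X$ onto probabilists' Hermite polynomials $He_k(Y)$, so that $X \approx \sum_k c_k He_k(Y)$ with $c_k = E[T(Y)\,He_k(Y)]/k!$. The function names and the trapezoid-rule quadrature are hypothetical, not taken from the report.

```python
import math

def hermite_he(k, y):
    """Probabilists' Hermite polynomial He_k(y) via He_{n+1} = y*He_n - n*He_{n-1}."""
    h_prev, h_curr = 1.0, y
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h_curr = h_curr, y * h_curr - n * h_prev
    return h_curr

def pce_coefficients(transform, order, half_width=12.0, n_points=4801):
    """Coefficients c_k = E[T(Y) He_k(Y)] / k! for Y ~ N(0, 1), by the trapezoid rule."""
    step = 2.0 * half_width / (n_points - 1)
    coeffs = []
    for k in range(order + 1):
        total = 0.0
        for i in range(n_points):
            y = -half_width + i * step
            weight = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid end weights
            density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
            total += weight * transform(y) * hermite_he(k, y) * density
        # E[He_k(Y)^2] = k!, so dividing by k! completes the orthogonal projection.
        coeffs.append(total * step / math.factorial(k))
    return coeffs

order = 6
coeffs = pce_coefficients(math.exp, order)  # assumed T(y) = exp(y)

# Moments follow directly from the coefficients by orthogonality.
mean = coeffs[0]
variance = sum(coeffs[k] ** 2 * math.factorial(k) for k in range(1, order + 1))
print(mean, variance)
```

For this particular choice of $T$, the Hermite generating function gives $c_k = e^{1/2}/k!$ in closed form, which provides a check on the numerical projection. The variance computed from a truncated expansion is a lower bound on the exact variance, since the omitted terms are non-negative.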

