SANDIA REPORT
SAND2007-2758
Unlimited Release
Printed May 2007
Bayesian Methods for Estimating the
Reliability in Complex Hierarchical
Networks
(Interim Report)
Paul T. Boggs, Youssef M. Marzouk, Philippe P. Pébay, John Red-Horse, Kathleen Diegert,
and Rena Zurn
Prepared by
Sandia National Laboratories
Albuquerque, New Mexico 87185 and Livermore, California 94550
Sandia is a multiprogram laboratory operated by Sandia Corporation,
a Lockheed Martin Company, for the United States Department of Energy’s
National Nuclear Security Administration under Contract DE-AC04-94-AL85000.
Approved for public release; further dissemination unlimited.
Issued by Sandia National Laboratories, operated for the United States Department of
Energy by Sandia Corporation.
NOTICE: This report was prepared as an account of work sponsored by an agency of
the United States Government. Neither the United States Government, nor any agency
thereof, nor any of their employees, nor any of their contractors, subcontractors, or their
employees, make any warranty, express or implied, or assume any legal liability or
responsibility for the accuracy, completeness, or usefulness of any information, apparatus,
product, or process disclosed, or represent that its use would not infringe privately
owned rights. Reference herein to any specific commercial product, process, or service
by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute
or imply its endorsement, recommendation, or favoring by the United States Government,
any agency thereof, or any of their contractors or subcontractors. The views and
opinions expressed herein do not necessarily state or reflect those of the United States
Government, any agency thereof, or any of their contractors.
Printed in the United States of America. This report has been reproduced directly from
the best available copy.
Paul T. Boggs
Sandia National Laboratories
M.S. 9159, P.O. Box 969
Livermore, CA 94551, U.S.A.
ptboggs@sandia.gov

Youssef M. Marzouk
Sandia National Laboratories
M.S. 9051, P.O. Box 969
Livermore, CA 94551, U.S.A.
ymarzou@sandia.gov

Philippe P. Pébay
Sandia National Laboratories
M.S. 9051, P.O. Box 969
Livermore, CA 94551, U.S.A.
pppebay@sandia.gov

John Red-Horse
Sandia National Laboratories
M.S. 0828, P.O. Box 5800
Albuquerque, NM 87185, U.S.A.
jrredho@sandia.gov

Kathleen Diegert
Sandia National Laboratories
M.S. 0830, P.O. Box 5800
Albuquerque, NM 87185, U.S.A.
kvdiege@sandia.gov

Rena Zurn
Sandia National Laboratories
M.S. 9007, P.O. Box 969
Livermore, CA 94551, U.S.A.
rmzurn@sandia.gov
Abstract
Current work on the Integrated Stockpile Evaluation (ISE) project is evidence of
Sandia’s commitment to maintaining the integrity of the nuclear weapons stockpile. In
this report, we undertake a key element in that process: development of an analytical
framework for determining the reliability of the stockpile in a realistic environment of
time-variance, inherent uncertainty, and sparse available information. This framework
is probabilistic in nature and is founded on a novel combination of classical and computational
Bayesian analysis, Bayesian networks, and polynomial chaos expansions.
We note that, while the focus of the effort is stockpile-related, it is applicable to any
reasonably-structured hierarchical system, including systems with feedback.
Contents
1 Introduction
2 Background
  2.1 Bayesian Networks
  2.2 Bayes’ Theorem
  2.3 Functional Analytic Approach to Probability
  2.4 Variables
3 The Model
  3.1 Class versus instance
  3.2 Reliability
  3.3 First Problem
  3.4 Basic Methodology
4 Specification of a Synthetic System Generator
  4.1 Generic Description of an Individual System
  4.2 Specification of a Sample Set
5 Conclusion
References
1 Introduction
Sandia has tremendous responsibilities in maintaining the integrity of the nuclear weapons
stockpile and, in carrying out this mission, manages a comprehensive testing program. A
major goal of this program is to determine the reliability of the stockpile and, at the same
time, to calculate the confidence in that reliability estimate. This is clearly a cost-intensive
activity; thus, the question arises as to whether or not the same levels of overall stockpile
integrity can be achieved at lower cost, or if higher levels can be achieved within the same
budget constraints.
Weapons systems are tremendously complex, multi-level affairs, and are often rife with
uncertainties that can originate in any number of ways. Calculating the reliability in such
settings can be extremely difficult; estimating the confidence in the reliability is often even
more difficult. In this paper, we set forth our program for investigating these issues.
In this document, we take “reliability” to be the probability that system failure does not
occur. In the context of our development, we define this precisely and discuss our strategy
for developing estimates of reliability in complex hierarchical systems comprised of
interconnected components that inherently possess uncertainty at every level of the system.
We also develop a rigorous means of computing the confidence in this estimate. Moreover,
due to, among other things, data gathering limitations, our mathematical descriptions of the
underlying uncertainties are often incomplete. This lack of information is itself a source
of uncertainty in reliability estimates. Thus, properly accounting for this uncertainty is of
paramount importance, particularly in systems where it is impractical, or impossible, to
perform a sufficient number of classical tests to ensure a specified confidence level.
In the sections below, we describe the development of a novel mathematical strategy,
employing structured probabilistic models, for describing systems-related tests. Our goal will
be to develop a rigorous means to assess the overall reliability of the systems, the associ-
ated confidence we have in this reliability, and the impact additional data acquired from
subsystem and component-level tests can have. This last ability will allow us to formulate
optimization problems that will determine the “best” place in the system to test to give the
maximum increase in our confidence in the reliability estimate.
The basis of our strategy will be a Bayesian approach, with which we seek to update the
probability distribution for the output of any component given new test data on the
performance of the system. This approach is inspired by the work of Martz and Waller [10, 9],
who used a Bayesian approach to update reliability estimates based only on pass-fail data
and only for special structures of the system. We extend this approach to more general
Bayesian networks in which certain components may exist in, or directly affect, multi-
ple levels of the systems. The strategy will first be applied to time-independent systems
with continuous-valued data, but will be naturally extensible to time-dependent systems
with continuous data. “Continuous data” refers to real-valued performance variables, e.g.,
voltage, that must fall between given bounds for the system to be considered working ac-
ceptably. A final foundational element of our strategy is the use of polynomial chaos ex-
pansions (PCE), with which we will represent the random variables and processes that
characterize the system at all of the levels. This facilitates several important tasks: uncer-
tainty propagation through the system; analysis of so-called uncertainties present due to
lack of information; and generalizations to the time-dependent case.
Exact or approximate inference algorithms will allow probabilistic information character-
izing reliability to be updated through and across systems. This is crucial for estimating
the uncertainty in the overall system reliability, and for infusing appropriate updates when
additional information and data are acquired.
This work is related to and inspired by our current work on the Integrated Stockpile Eval-
uation (ISE) project, but is more generally applicable to any hierarchical system. In fact,
it should be applicable to any reasonably structured system, including, e.g., systems that
have feedback.
This report is organized as follows: For the sake of completeness, we provide in section
2 some technical background on several topics. First, we describe the Bayesian networks
that we will consider and why they are appropriate to this study. Next, we review Bayes’
theorem in the form that we exploit. We then describe function analytic approaches to
probability and the fundamentals of polynomial chaos expansions (PCE) that are essential
to our approach. In this context, we consider the “stochastic dimension” and methods
for preventing an explosion of this dimension. In section 3, we develop the basic model
that we will use and consider some of its properties. We also formally define reliability
in this setting. We then consider a very simple hierarchical system with linear functional
relationships among the components. This allows us to work out in detail the effects of
obtaining test data at various levels in the system. Some counter-intuitive results emerge
from this exercise that have helped us to understand this effort better. In section 4, we
develop a synthetic generator that will be used to produce hierarchical test systems with
specified properties. These, in turn, will be used to test our final strategy over a variety of
systems. In section 5 we conclude with a discussion of the next steps in our research.
2 Background
In this section we provide background on some of the fundamental concepts that will
be needed in our discussion.
2.1 Bayesian Networks
The systems that we are considering are assumed to possess uncertainty, either inherent or
due to lack of information, and this is a fundamental characteristic that we considered in
establishing an appropriate mathematical context. Here, we describe this and other important
system considerations:
1. They are engineered hierarchical systems; any component may be comprised of
subcomponents, which, in turn, may themselves contain subcomponents. Also, several
instances of a single type of component¹ may appear in a given system. For example,
a hydraulic system may contain five valves, three of which have identical model
numbers and specifications. It is also possible that a particular instance of a component
will play a role in more than one subsystem. A simple example of this is an
automobile battery, which has functions both in the starting circuitry and in, say, the
instrument panel.
2. We identify the i-th component in the system with a variable $X_i$. This variable may
represent voltage, impedance, yield stress, or any other quantity of interest. Because
of measurement error, limited opportunities for testing, component-to-component
variability, aging, and environmental influences, our knowledge of the exact value of
$X_i$ will be imprecise. Thus, we adopt a probabilistic approach and treat $X_i$ as a random
variable. Information on $X_i$ will be expressed in this probabilistic framework, say
using probability distributions, or functional representations. Furthermore, we will
have the ability to update these probabilistic descriptions using Bayesian inference
as more data become available.
3. Bayesian networks will be employed to aggregate information into a system-level
probabilistic model, and to encode conditional independence relationships among
various components as imposed by the system structure.
¹ These will be referred to as “instances” of a “class”; see §3.

Regarding the last item above, consider the joint probability distribution of the random
variables $\{X_i : i \in V\}$, where $V$ is the set of component indices. This distribution is
high-dimensional and complex, describing the performance of every component of the system
and dependencies among the performance values. We would like to estimate and update
this distribution based on test data. We would also like to focus our attention on the
performance of particular components and subsystems; in other words, to examine the marginal
distribution of a particular subset $\{X_i : i \in U \subset V\}$. And we would like to calculate
conditional probabilities, i.e., the probability of one subset of the variables given the values
of another subset of the variables. These tasks will become computationally intractable
unless we take advantage of the structure of the system. In particular, we propose using the
engineered structure of the system to factor the joint probability distribution into a number
of conditional probabilities.
The above serves as motivation for using the probabilistic graphical modeling capabilities
present in Bayesian networks. We provide a brief description of this concept here; for
more details see, for example, Jensen [5] or Jordan [6]. Our notation here follows that of
Jordan [6]. Let $G(V, E)$ be a directed acyclic graph (DAG) with nodes $V$ and edges $E$. Let
$X_V \equiv \{X_i : i \in V\}$ be a collection of random variables indexed by the nodes of the graph.
Each node $v \in V$ is associated with a set of “parent” nodes, i.e., all the nodes from which a
directed edge points towards $v$. This set of parents is denoted by $\pi_v$ and may be the empty
set. Using any set of indices as a subscript, we let $X_{\pi_v}$ denote the set of random variables
associated with the parents of $v$. In a Bayesian network, the joint probability distribution of
$X_V$ factors as follows:

$$p(x_V) = \prod_{v \in V} p(x_v \mid x_{\pi_v}),$$

where $p(\cdot)$ is a probability density function in the case of real-valued $X$ and a probability
mass function in the case of discrete-valued $X$.
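
As an illustration of this factorization, the following minimal Python sketch builds a joint distribution from per-node conditional probabilities and then computes a marginal and a conditional probability by brute-force enumeration. The three-node network (a battery feeding a starter and an instrument panel), the binary works/fails encoding, and every number in it are hypothetical and chosen only for illustration; they are not taken from this report.

```python
from itertools import product

# Hypothetical network: battery -> starter, battery -> panel.
# Each variable is binary: 1 = works, 0 = fails. All numbers are invented.
parents = {"battery": (), "starter": ("battery",), "panel": ("battery",)}

# cpt[v][parent_values] = P(X_v = 1 | parents of v take parent_values)
cpt = {
    "battery": {(): 0.95},
    "starter": {(1,): 0.90, (0,): 0.0},
    "panel": {(1,): 0.99, (0,): 0.0},
}


def joint(x):
    """Joint probability as the product over nodes of p(x_v | x_parents(v))."""
    prob = 1.0
    for v, pa in parents.items():
        p_one = cpt[v][tuple(x[u] for u in pa)]
        prob *= p_one if x[v] == 1 else 1.0 - p_one
    return prob


def all_states():
    """Enumerate every assignment of 0/1 to the nodes."""
    names = list(parents)
    for values in product((0, 1), repeat=len(names)):
        yield dict(zip(names, values))


def marginal(query):
    """Probability that the variables named in `query` take the given values."""
    return sum(
        joint(x)
        for x in all_states()
        if all(x[name] == value for name, value in query.items())
    )


total = marginal({})                      # sums the joint over all states
p_starter = marginal({"starter": 1})      # marginal of one component
p_battery_given_panel_fail = (            # conditional via two marginals
    marginal({"battery": 1, "panel": 0}) / marginal({"panel": 0})
)
```

Enumeration costs grow exponentially with the number of nodes, which is why the factored form matters in practice; this sketch uses it only to define the joint, not to speed up inference.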
In the present application, the structure of the graph $G$ will reflect the structure of the
engineered system. But this correspondence need not be exact; the Bayesian network based
on $G$ may include different types of nodes representing the class and specific instances of
a particular component, and may also include nodes representing environmental conditions
or other external factors that are relevant to system performance.
2.2 Bayes’ Theorem
The ability to update our estimate of the reliability of a system based on newly acquired data
is critical. And, since we will be using Bayesian updating to effect this, Bayes’ Theorem
is fundamental to our work. We now provide a brief summary of this important result. By
definition, the conditional probability of event $A$ given event $B$ is given by

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

Similarly, the conditional probability of event $B$ given event $A$ is given by

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}.$$

Thus, by combining these two equations and rearranging terms, we obtain Bayes’ Theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$

In this setting, the terms have standard names.
• $P(A)$ is called the prior probability. It is our state of knowledge prior to observing
event $B$.
• $P(B \mid A)$ is the probability of $B$ given $A$. This is often called the likelihood of $A$.
• $P(A \mid B)$ is called the posterior probability of $A$ given $B$. That is, it is the updated
probability of $A$ given the new event $B$. If subsequent data are collected, this becomes
the prior in the next round.
• The term $P(B)$ is called the marginal probability of $B$ and acts as a normalizing
constant.
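
The roles of these four terms can be seen in a small numerical sketch. The Python below is only an illustration: the event definitions (A is "the component is defective", B is "a test flags the component") and all probabilities are assumed values, not data from this report. The second update reuses the first posterior as its prior, which is valid here under the stated assumption that the two test outcomes are conditionally independent given A.

```python
def bayes_update(prior, lik_if_a, lik_if_not_a):
    """Return P(A | B) from P(A), P(B | A), and P(B | not A)."""
    # Marginal probability P(B), the normalizing constant.
    evidence = lik_if_a * prior + lik_if_not_a * (1.0 - prior)
    return lik_if_a * prior / evidence


# Assumed values for illustration only.
prior = 0.02        # P(A): component is defective
p_flag_def = 0.90   # P(B | A): test flags a defective component
p_flag_ok = 0.05    # P(B | not A): test flags a good component

# Posterior after one flagged test.
post1 = bayes_update(prior, p_flag_def, p_flag_ok)

# A second flagged test, assumed conditionally independent of the first
# given A: the previous posterior serves as the new prior.
post2 = bayes_update(post1, p_flag_def, p_flag_ok)
```

In odds form, each flagged test multiplies the odds of A by the likelihood ratio $P(B \mid A)/P(B \mid \bar{A})$, so two flagged tests multiply the prior odds by the square of that ratio.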
Although we have derived Bayes’ Theorem in terms of probabilities, the same result holds
for probability densities. In particular, for a probability density, say $f(x)$, $x \in X$, we can
write, given data $y \in Y$,

$$f(x \mid y) = \frac{f(y \mid x)\,f(x)}{f(y)}.$$
Again, the terms have standard names:
• $f(x)$ is called the prior distribution of $X$. It is our state of knowledge about the
random variable $X$ prior to observing $Y = y$.
• $f(y \mid x)$ is the likelihood function of $x$ given $Y = y$.
• $f(x \mid y)$ is called the posterior distribution of $X$ given $y$. That is, it is the updated
probability density of $X$ given the data $Y = y$. If subsequent independent data are
collected, this distribution becomes the prior in the next round.
• The term $f(y)$ is the marginal distribution of $y$, also called the evidence, and acts as
a normalizing constant.
The notational abuse of using $f$ in all of these is conventional; each one is, in fact, a different
function, distinguished by its arguments.
Our approach is to use Bayes’ Theorem to update distributions; thus we will use the latter
form in our development. Finally, we note that it is easy to extend Bayes’ Theorem to two
or more variables. Specifically, in the context of the hierarchical system structure that we
assume, we obtain the following useful result:

$$f(x \mid y, z) \propto f(y \mid x)\,f(z \mid x)\,f(x),$$

where we have ignored the normalizing constant. This is true if the data $y$ and $z$ are
conditionally independent given $X$.
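
The following Python sketch checks this result numerically on a grid. It is an illustration under assumed inputs: a Gaussian prior on $x$, two Gaussian measurement models for $y$ and $z$, and the observed values are all hypothetical. It compares updating on both data at once with updating on $y$ first and then on $z$, using the intermediate posterior as the prior for the second step.

```python
import math


def normal_pdf(value, mean, sd):
    """Density of a normal distribution with the given mean and std. dev."""
    return math.exp(-0.5 * ((value - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def normalize(weights, dx):
    """Scale grid values so that they integrate to one (Riemann sum)."""
    total = sum(weights) * dx
    return [w / total for w in weights]


# Uniform grid over the unknown x.
n = 2001
lo, hi = -10.0, 10.0
dx = (hi - lo) / (n - 1)
grid = [lo + i * dx for i in range(n)]

# Assumed prior f(x): normal with mean 0 and standard deviation 2.
prior = normalize([normal_pdf(x, 0.0, 2.0) for x in grid], dx)

# Hypothetical data and measurement noise levels.
y, sd_y = 1.2, 1.0
z, sd_z = 0.8, 0.5
lik_y = [normal_pdf(y, x, sd_y) for x in grid]   # f(y | x) as a function of x
lik_z = [normal_pdf(z, x, sd_z) for x in grid]   # f(z | x) as a function of x

# Batch update: f(x | y, z) proportional to f(y | x) f(z | x) f(x).
batch = normalize([ly * lz * p for ly, lz, p in zip(lik_y, lik_z, prior)], dx)

# Sequential update: the posterior given y becomes the prior for z.
post_y = normalize([ly * p for ly, p in zip(lik_y, prior)], dx)
sequential = normalize([lz * p for lz, p in zip(lik_z, post_y)], dx)

max_diff = max(abs(a - b) for a, b in zip(batch, sequential))
post_mean = sum(x * p for x, p in zip(grid, batch)) * dx
```

For this all-Gaussian choice the posterior mean has a closed form, the precision-weighted average $(y/\sigma_y^2 + z/\sigma_z^2)/(1/\sigma_0^2 + 1/\sigma_y^2 + 1/\sigma_z^2)$ with prior mean zero, which gives a convenient check on the grid computation.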
2.3 Functional Analytic Approach to Probability
In the field of probability, there are two primary means of analysis. The first of these is
the traditional probabilistic approach, in which one is concerned with properties of certain
probabilistic entities, such as cumulative distribution functions, probability density functions,
or statistical moments, and their behavior under transformation or limit operations.
There is, however, an alternative approach, which we will refer to as a function analytic
approach to probability. The basis of this approach is the recognition that random variables
(RVs) and random fields (RFs) are functions with at least a subset of the domain of these
functions being a sample space, $\Omega$, of elementary events that is well-defined in the context
of a probability space consisting of the triple $(\Omega, \Sigma, P)$. In addition to the sample space, the
probability space consists of a $\sigma$-algebra $\Sigma$ of subsets of $\Omega$ called events, and a probability
measure $P$. Each of these entities has well-established and precise mathematical properties.
Using this structure, it is easy to observe that a random function is, in fact, not random
at all, and that whatever randomness exists in the framework is entirely associated
with the occurrence of events. Thus, it seems reasonable to cast these random functions
in a function analytic setting. Within this setting there are many analysis possibilities:
algebraic, semi-group, topological, etc. Our goal is to develop approximations to exact
functions in a Hilbert space setting.
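
A toy example may make this viewpoint concrete. The Python sketch below is an assumption-laden illustration, not the construction used in this report: it takes a finite sample space (one roll of a fair die), treats random variables as ordinary deterministic functions on that space, defines the inner product as the expectation of a product, and approximates one random variable by its orthogonal projection onto the span of two others. All names (`omega`, `X`, `Y`, `inner`) are hypothetical.

```python
from fractions import Fraction

# Sample space for one roll of a fair die and its probability measure.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}


# Random variables are deterministic functions of the elementary event w.
def X(w):
    return w                       # face value


def Y(w):
    return 1 if w % 2 == 0 else 0  # indicator of an even face


def one(w):
    return 1                       # constant function


def expect(f):
    """Expectation: integral of f against the probability measure."""
    return sum(f(w) * P[w] for w in omega)


def inner(f, g):
    """Inner product of two finite-variance random variables: E[f g]."""
    return expect(lambda w: f(w) * g(w))


# Center Y so that it is orthogonal to the constant function.
e_y = expect(Y)


def Yc(w):
    return Y(w) - e_y


# Orthogonal projection of X onto span{one, Yc}.
c0 = inner(X, one) / inner(one, one)
c1 = inner(X, Yc) / inner(Yc, Yc)


def approx(w):
    return c0 * one(w) + c1 * Yc(w)


def residual(w):
    return X(w) - approx(w)
```

The residual is orthogonal to both basis functions, which is the defining property of the best approximation in this inner product; here the projection reproduces the average face value on even and on odd rolls.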
We emphasize that under identical assumptions, probabilistic solutions that result from ei-
ther analytical path are identical. They offer competing means to package information;
the approach taken should be dictated by the particulars associated with a given applica-
tion class. In our case, the primary goal is first to generalize deterministic problems to
accommodate input parameters that are modeled as RVs with approximate probabilistic
information. The function analytic approach is particularly well-suited to this. Based on
experience, we expect no impediments to generalizing our implementation to time-dependent
problems using RFs.
2.3.1 Scalar Polynomial Chaos Expansions
The polynomial chaos expansion (PCE) method was first conceived by N. Wiener as a
means to integrate operators possessing differential Brownian motion, at the time viewed
as chaotic, as an external forcing influence. While the random process he described is
quite general, we will take a simpler route while noting that the transition to the more general
case, first progressing to vectors containing RVs as components, then to more general
random processes, is possible [13].
Consider two real-valued scalar RVs $X$ and $Y$, defined on $(\Omega, \Sigma, P)$, each with finite
variance. Assume that there exists a functional transformation, $T$, between $X$ and $Y$; that is,
that $X = T(Y)$ is well-defined.