Error Bars for Distributions of Numbers of Events

R. Aggarwal¹,², A. Caldwell²

¹ Panjab University, Chandigarh, India ([email protected])
² Max Planck Institute for Physics, Munich, Germany ([email protected])

arXiv:1112.2593v3 [physics.data-an] 30 Jan 2012

Abstract

The common practice for displaying error bars on distributions of numbers of events is confusing and can lead to incorrect conclusions. A proposal is made for a different style of presentation that more directly indicates the level of agreement between expectations and observations.

1 Introduction

Symmetric error bars centered on the observed number of events can be highly misleading and in practice often generate confusion. A typical example¹ of such a data presentation is given in Fig. 1.

[Figure 1: two panels of Events/25 GeV against Mass (GeV), linear scale (left) and logarithmic scale (right)]

Fig. 1: A typical presentation of data - here event counts as a function of mass in 25 GeV intervals. The same data are plotted twice: left - linear scale; right - logarithmic scale. The error bars cover the range (o − √o, o + √o), where o is the number of observed events. The histogram gives the expectations (means of Poisson distributions).

There are three problems with this standard style of presentation:

– First of all, there is no uncertainty on the number of observed events. We certainly do not mean that there is a high probability that we had 2.3 rather than 2 events in the 7th bin in the plot. Actually, the error bar is intended to represent the uncertainty on a different quantity - the uncertainty on the mean of an assumed underlying Poisson distribution. The probability distribution for this mean given an observed number of events o, P(θ|o), can be quite asymmetric, and different choices can be made regarding what to plot as a summary (e.g., the mode, the mean or the median) of P(θ|o).

– The second problem arises with the length of the error bar. This is routinely taken as ±√o, motivated by the Poisson result that the variance is equal to the mean, so that the error bar should cover ±1σ, or 68% probability, for possible values of θ. However, the probability range covered by this definition of the error bar varies dramatically as o → 0, and the probability above and below the point is highly asymmetric. Usually no error bar is plotted when 0 events are measured, although this measurement also yields information on possible values of θ. These problems are occasionally avoided by using asymmetric error bars, usually covering the central 68% of the probability from the cumulative of P(θ|o), but this is still the exception rather than the rule in experimental particle physics.

– A third problem occurs when data are compared to expectations, as in Fig. 1, and the error bar is used to determine whether the observed number of events represents a significant deviation from the expectation. The error bar on the plot often gives completely wrong information in this case, since the relevant probability is the probability that the expectation could have yielded the number of observed events, not the probability that the observed number of events could have fluctuated to the expectation. For example, in the next-to-last bin in Fig. 1, the expectation is 0.011 and two events are observed. It is VERY wrong to conclude that we have slightly more than a 1σ discrepancy in this bin.

A proposal for an alternative presentation is given here. We focus on the case where we are comparing observations to predictions and the fluctuations can be modeled with a Poisson distribution. We start with the simplest case - that predictions are available with negligible uncertainty. We then show how to include the uncertainty due to predictions based on finite Monte Carlo event sets and due to systematic uncertainties.

¹ The data are artificial and invented for the purposes of this note.
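The size of the error in the last example can be checked directly from the Poisson distribution. A minimal sketch in Python (illustrative only; the note itself provides no code):

```python
from math import exp, factorial

def poisson_pmf(o, nu):
    """P(o|nu) = exp(-nu) * nu**o / o!"""
    return exp(-nu) * nu**o / factorial(o)

# Probability that an expectation of nu = 0.011 yields 2 or more events:
nu = 0.011
p_tail = 1.0 - poisson_pmf(0, nu) - poisson_pmf(1, nu)
print(p_tail)  # about 6e-05
```

The tail probability is about 6 × 10⁻⁵, corresponding to nearly a 4σ one-sided fluctuation, while the ±√o error bar on o = 2 would wrongly suggest rough consistency with the prediction.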
2 Negligible uncertainty on the prediction

We start with the simplest case - the expectations (means of Poisson distributions) are known with very small uncertainty, and we want to compare the observed data to these expectations. The plot should give us an indication of whether the observations are within reasonable statistical fluctuations of the expectation; i.e., the plot should give the user an indication of how rare a particular observed number of events o was expected to be given a Poisson distribution with mean number of events ν. The probability distribution for o is

  P(o|ν) = e^{−ν} ν^o / o! .   (1)

The most probable value for o (the mode of the probability distribution) is given by

  o* = ⌊ν⌋ ;   (2)

i.e., the largest integer not greater than ν. Two choices can be made for the probability intervals to display:

1. The central interval, defined for a probability density P(x) with possible range {x_min, x_max} for the parameter by

  α/2 = ∫_{x_min}^{x_1} P(x) dx = ∫_{x_2}^{x_max} P(x) dx .

The central interval is {x_1, x_2} and contains probability 1−α. In our case, we have a discrete probability distribution P(o|ν) and the equality generally cannot be satisfied. We take instead

  o_1 = sup_{o∈Z} { Σ_{i=0}^{o} P(i|ν) ≤ α/2 } + 1   (3)

  o_2 = inf_{o∈Z} { Σ_{i=o}^{∞} P(i|ν) ≤ α/2 } − 1 .   (4)

If P(o = 0|ν) > α/2, then we take o_1 = 0. We define the set of observations which fall into the central 1−α probability interval² as

  O^C_{1−α} = {o_1, o_1+1, ..., o_2}

and we display these values of o. Different colors can be used to represent different 1−α probability ranges. For example, if we take ν = 3.3̄, then we find the values given in Table 1. If we choose 1−α = 0.68, then our definitions give

  O^C_{0.68} = {2, 3, 4, 5}

whereas

  O^C_{0.95} = {0, 1, 2, 3, 4, 5, 6, 7}

and

  O^C_{0.999} = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11} .

² Note that 1−α is the minimum probability covered and that the set generally covers a larger probability.
Table 1: Values of o, the probability to observe such a value given ν = 3.3̄, and the cumulative probability, rounded to four decimal places. The fourth column gives the rank in terms of probability - i.e., the order in which this value of o is used in calculating the smallest set O^S_{1−α} - and the last column gives the cumulative probability summed according to the rank.

   o   P(o|ν)   F(o|ν)    R   F_R(o|ν)
   0   0.0357   0.0357    7   0.9468
   1   0.1189   0.1546    5   0.8431
   2   0.1982   0.3528    2   0.4184
   3   0.2202   0.5730    1   0.2202
   4   0.1835   0.7565    3   0.6019
   5   0.1223   0.8788    4   0.7242
   6   0.0680   0.9468    6   0.9111
   7   0.0324   0.9792    8   0.9792
   8   0.0135   0.9927    9   0.9927
   9   0.0050   0.9976   10   0.9976
  10   0.0017   0.9993   11   0.9993
  11   0.0005   0.9998   12   0.9998
  12   0.0001   1.0000   13   1.0000

2. The second option for the probability interval is to use the smallest interval containing a given probability. In the case of a unimodal continuous distribution, we can write the condition as

  1−α = ∫_{x_1}^{x_2} P(x) dx   and   P(x_1) = P(x_2) .

For our discrete case, the set making up the smallest interval containing probability at least 1−α, O^S_{1−α}, is defined by the following algorithm:

(a) Start with O^S_{1−α} = {o*}. If P(o*|ν) ≥ 1−α, then we are done. An example where this requirement is fulfilled for 1−α = 0.68 is (ν = 0.001, o* = 0, O^S_{0.68} = {0}).

(b) If P(o*|ν) < 1−α, then we need to add the next most probable number of observations, which in the unimodal case is either o*+1 or o*−1. Assume that P(o*+1|ν) > P(o*−1|ν). Then we would extend our set to O^S_{1−α} = {o*, o*+1} and check again whether P(O^S_{1−α}) ≥ 1−α. We would continue to add members to the set O^S_{1−α} until this condition is met, always taking the highest probability of the remaining possible observations.

In the special case that two observations have exactly the same probability (when ν takes on an integer value, P(o = ν) = P(o = ν−1)), both values should be taken into the set.

If we consider the example given in Table 1, then using the cumulative according to rank, F_R, we find

  O^S_{0.68} = {2, 3, 4, 5}

whereas

  O^S_{0.95} = {0, 1, 2, 3, 4, 5, 6, 7}

and

  O^S_{0.999} = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10} .

The same sets are found as for the central interval for 1−α = 0.68 and 0.95, but the smallest set is smaller for 1−α = 0.999.

If the predicted distribution has a large mean (ν > 50, say), then we can use the Gaussian approximation. In this case, we take the minimal symmetric range around o* such that

  ∫_{o*−(n+0.5)}^{o*+(n+0.5)} (1/√(2πν)) e^{−(x−o*)²/(2ν)} dx ≥ 1−α

and define O_α = {o*−n, ..., o*+n}. This gives both the central interval as well as the minimal interval.

As an example of the procedures defined here, Fig. 2 shows the same distribution of observed numbers of events as a function of invariant mass as was shown in Fig. 1. Three different probability intervals are shown, corresponding to 1−α = 0.68, 1−α = 0.95 and 1−α = 0.999, and follow the definition of the smallest interval. Note that the bands are extended beyond the integer values in the set by 0.5 for clarity of presentation. E.g., the band containing the set {2, 3, 4, 5} is drawn from 1.5 → 5.5. The smallest 1−α color is chosen if more than one 1−α set contains the same set members (e.g., this occurs when the set {0} contains > 95% probability, as in the last bins in the plot). The color scheme is meant to be suggestive. Observed event counts outside the shaded bands should indicate unlikely results.
[Figure 2: two panels of Events/25 GeV against Mass (GeV), linear (left) and logarithmic (right) scales, showing the data, the MC expectation and shaded 68%, 95% and 99.9% probability bands]

Fig. 2: The proposed style of presentation for the same data as shown in Fig. 1. The shaded bands represent different probability intervals for the observed number of events: green is for 1−α = 0.68; yellow is for 1−α = 0.95 and red is for 1−α = 0.999. The bands use the smallest interval containing at least this probability, and extend to the next 1/2 integer value (see text). For the logarithmic scale plot on the right, the data are shown as a solid triangle if o = 0.

3 Predictions based on Monte Carlo with finite statistics

We now extend the prescription to cases where the prediction is uncertain. The first case we consider is that the prediction is based on a Monte Carlo which has non-negligible statistical uncertainties. We derive the probability distribution for the expected number of events o given that a MC set (consisting possibly of different components) gives n events and the MC normalization factor is s (defined such that the MC prediction is divided by the factor s when calculating the expected mean for the data). The factor s is initially taken to be known exactly.

The process of assigning Monte Carlo events to bins indicates the use of the multinomial distribution for calculations (if we generate a fixed number of events rather than a fixed luminosity). We assume here that the probability for an event to populate any given bin is small, so that Poisson statistics can be used as a valid approximation. Assume our Monte Carlo sample has resulted in n events (in a bin of interest). We now need to determine the expected mean for the data sample and the probability distribution for this mean. We use λ for the mean of the MC distribution in the bin, and ν for the mean expected for the observations. Applying Bayes' Theorem [1] and taking the Jeffreys' prior [2] on the mean for the MC, λ,

  P_0(λ) ∝ 1/√λ ,

the pdf for λ is

  P(λ|n) = (4^n n! / (√π (2n)!)) λ^{n−1/2} e^{−λ}

which leads to

  E[λ] = n + 1/2   and   λ* = n − 1/2 .

For scaling the prediction to the mean expected for the data (ν = λ/s), we have:

  P(ν|n,s) = P(λ|n) dλ/dν = (4^n n! / (√π (2n)!)) s (sν)^{n−1/2} e^{−sν} .

For the distribution of data events, we use the Law of Total Probability [3]:

  P(o|n,s) = ∫ P(o|ν) P(ν|n,s) dν = (s^{n+1/2} 4^n n! / (√π o! (2n)!)) ∫ ν^{o+n−1/2} e^{−(s+1)ν} dν .

The integral gives

  √π [2(o+n)]! / (4^{n+o} (1+s)^{o+n+1/2} (n+o)!) ,

so

  P(o|n,s) = (s^{n+1/2} / (1+s)^{o+n+1/2}) · n! [2(o+n)]! / (4^o o! (2n)! (n+o)!)   (5)

and

  E[o] = (n + 1/2)/s .

For o = 0, we have

  P(o = 0|n,s) = (s/(1+s))^{n+1/2} .   (6)

The probability for the succeeding values of o can then be easily calculated as

  P(o+1|n,s) = P(o|n,s) · (2n+2o+2)(2n+2o+1) / (4(o+1)(1+s)(n+o+1)) .

We would now use these probabilities of o for the procedure described in the previous section rather than the Poisson distribution for P(o|ν).

Table 2: Values of o, the probability to observe such a value given n = 10 and s = 3, and the cumulative probability, rounded to four decimal places. The fourth column gives the rank in terms of probability - i.e., the order in which this value of o is used in calculating the smallest set O^S_{1−α} - and the last column gives the cumulative probability summed according to the rank.

   o   P(o|n,s)  F(o|n,s)    R   F_R(o|n,s)
   0   0.0488    0.0488      7   0.9072
   1   0.1280    0.1768      4   0.6654
   2   0.1840    0.3608      2   0.3757
   3   0.1917    0.5525      1   0.1917
   4   0.1617    0.7142      3   0.5374
   5   0.1173    0.8315      5   0.7827
   6   0.0757    0.9072      6   0.8584
   7   0.0446    0.9519      8   0.9519
   8   0.0244    0.9763      9   0.9763
   9   0.0125    0.9888     10   0.9888
  10   0.0061    0.9949     11   0.9949
  11   0.0028    0.9978     12   0.9978
  12   0.0013    0.9991     13   0.9991
  13   0.0006    0.9996     14   0.9996
  14   0.0002    0.9998     15   0.9998
  15   0.0001    0.9999     16   0.9999

An example of the effect of finite Monte Carlo statistics on P(o) is given in Fig. 3. As is clear, if the Monte Carlo used to derive the prediction for the number of events has an effective luminosity only a factor 3 larger than the data, significant differences can result in the probabilities for observed events. If we consider the example given in Table 2, where n = 10 and s = 3, the mean value of ν is ν̄ = 3.5 and the finite MC statistics gives slightly different results for our sets:

  O^C_{0.68} = {1, 2, 3, 4, 5, 6}      O^S_{0.68} = {1, 2, 3, 4, 5}
  O^C_{0.95} = {0, 1, 2, ..., 8}       O^S_{0.95} = {0, 1, 2, ..., 7}
  O^C_{0.999} = {0, 1, 2, ..., 13}     O^S_{0.999} = {0, 1, 2, ..., 12} .

As expected, the set of possible values of o has increased for a given 1−α probability due to the extra uncertainty introduced by the finite number of Monte Carlo counts.

[Figure 3: P(o) against o for 0 ≤ o ≤ 15]

Fig. 3: Comparison of the distributions P(o|ν = 10/3) and P(o|n = 10, s = 3).
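The distribution P(o|n,s) used in this section can be tabulated from Eq. (6) and the recursion, avoiding the large factorials of Eq. (5). A Python sketch (our function names), which reproduces the first entries of Table 2 and the mean E[o] = (n+1/2)/s = 3.5:

```python
def p_counts(n, s, o_max):
    """P(o|n,s) from Eq. (6) and the recursion: start at
    P(0|n,s) = (s/(1+s))**(n+1/2) and step upward in o."""
    p = [(s / (1.0 + s)) ** (n + 0.5)]
    for o in range(o_max):
        p.append(p[-1] * (2*n + 2*o + 2) * (2*n + 2*o + 1)
                 / (4.0 * (o + 1) * (1 + s) * (n + o + 1)))
    return p

p = p_counts(10, 3.0, 40)
print(round(p[0], 4), round(p[3], 4))       # 0.0488 0.1917 (cf. Table 2)
print(sum(o * q for o, q in enumerate(p)))  # ~3.5 = (n + 1/2)/s
```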
4 Prediction with systematic uncertainties

If the predictions have systematic uncertainties, then this can also be taken into account in defining the probabilities for the observed number of events. The probability density for ν will now depend on extra quantities, and we write

  P(ν|n, s, θ⃗)

where θ⃗ is a set of nuisance parameters used to describe the systematic uncertainties (e.g., energy scale parameters). It may be possible to determine P(ν|n,s,θ⃗) directly from Monte Carlo simulations where also the systematically uncertain quantities are varied according to their belief distributions. In this case, we would use this information and have:

  P(o|n,s,θ⃗) = ∫ P(o|ν) P(ν|n,s,θ⃗) dν .

In general, this integral will need to be solved numerically and the P(o|n,s,θ⃗) then input into the prescriptions above.

In many cases, we have a fixed number of MC events and the same events are used repeatedly with different assumptions for the uncertain quantities, so that the systematic uncertainty is on the scaling parameter. In this case, we can write

  P(ν|n,θ⃗) = ∫ P(ν|n,s) P(s|θ⃗) ds ,

where the systematic uncertainty appears as a pdf for the scale factor. The probability distribution for o is now

  P(o|n,θ⃗) = ∫ P(o|ν) [ ∫ P(ν|n,s) P(s|θ⃗) ds ] dν .

Often, we assume the belief in values for the scale factor can be modeled as a Gaussian:

  P(s|s_0, σ_s) = (1/(√(2π) σ_s)) e^{−(s−s_0)²/(2σ_s²)} .

In this case, we would have

  P(o|n,s_0,σ_s) = ∫ (e^{−ν} ν^o / o!) [ ∫ (4^n n! / (√π (2n)!)) s (sν)^{n−1/2} e^{−sν} (1/(√(2π) σ_s)) e^{−(s−s_0)²/(2σ_s²)} ds ] dν .   (7)

This looks rather forbidding but can be solved numerically. Taking our standard example, we now add to the MC statistical uncertainty a 30% systematic uncertainty in the scale factor s and find³ the results given in Table 3. A graphical presentation of the effect on P(o) of including systematic uncertainties on s is shown in Fig. 4.

³ Note that with such a large systematic variation, the Gaussian distribution in Eq. 7 is truncated and is renormalized, leading to a shift in E[o].
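Numerically, the full double integral of Eq. 7 is not needed: the inner ν-integral simply reproduces P(o|n,s) of Eq. (5), so P(o|n,s_0,σ_s) is the average of P(o|n,s) over the truncated, renormalized Gaussian for s. A midpoint-rule sketch in Python (grid limits and step counts are our illustrative choices, not from the note):

```python
from math import exp, pi, sqrt

def p_counts(n, s, o_max):
    # P(o|n,s): Eq. (6) plus the recursion from Section 3
    p = [(s / (1.0 + s)) ** (n + 0.5)]
    for o in range(o_max):
        p.append(p[-1] * (2*n + 2*o + 2) * (2*n + 2*o + 1)
                 / (4.0 * (o + 1) * (1 + s) * (n + o + 1)))
    return p

def p_counts_syst(n, s0, sigma, o_max=200, steps=600):
    """Average P(o|n,s) over a Gaussian in s, truncated to s > 0
    and renormalized (cf. footnote 3), on a midpoint grid."""
    lo, hi = max(1e-6, s0 - 5*sigma), s0 + 5*sigma
    ds = (hi - lo) / steps
    out = [0.0] * (o_max + 1)
    norm = 0.0
    for i in range(steps):
        s = lo + (i + 0.5) * ds
        w = exp(-(s - s0)**2 / (2*sigma**2)) / (sqrt(2*pi) * sigma) * ds
        norm += w
        for o, q in enumerate(p_counts(n, s, o_max)):
            out[o] += w * q
    return [x / norm for x in out]

p = p_counts_syst(10, 3.0, 0.9)   # 30% systematic: sigma_s = 0.3 * s0
print(round(p[0], 3))             # close to Table 3's 0.0539
print(round(sum(p), 2))           # ~1.0
```

With n = 10, s_0 = 3 and σ_s = 0.9 this gives P(o = 0) close to Table 3's 0.0539, up to the grid resolution and the truncation of the o range.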
Table 3: Values of o, the probability to observe such a value given n = 10 and s = 3 with a 30% systematic uncertainty on s, together with the cumulative probability, rounded to four decimal places. The fourth column gives the rank in terms of probability - i.e., the order in which this value of o is used in calculating the smallest set O^S_{1−α} - and the last column gives the cumulative probability summed according to the rank.

   o   P(o|n,s)  F(o|n,s)    R   F_R(o|n,s)
   0   0.0539    0.0539      7   0.8492
   1   0.1272    0.1811      4   0.6111
   2   0.1699    0.3511      2   0.3403
   3   0.1703    0.5214      1   0.1703
   4   0.1436    0.6650      3   0.4839
   5   0.1083    0.7733      5   0.7194
   6   0.0759    0.8492      6   0.7953
   7   0.0509    0.9001      8   0.9001
   8   0.0332    0.9333      9   0.9333
   9   0.0214    0.9547     10   0.9547
  10   0.0138    0.9685     11   0.9685
  11   0.0090    0.9776     12   0.9776
  12   0.0060    0.9835     13   0.9835
  13   0.0040    0.9876     14   0.9876
  14   0.0028    0.9904     15   0.9904

The elements of our different sets are now given by

  O^C_{0.68} = {1, 2, 3, 4, 5, 6}      O^S_{0.68} = {1, 2, 3, 4, 5}
  O^C_{0.95} = {0, 1, 2, ..., 11}      O^S_{0.95} = {0, 1, 2, ..., 9}
  O^C_{0.999} = {0, 1, 2, ..., 44}     O^S_{0.999} = {0, 1, 2, ..., 33} .

The 68% interval is now different between the two definitions, and very large values of o are allowed at probability 1−α = 0.999, in particular for the central interval.

[Figure 4: two panels of P(o) against o for 0 ≤ o ≤ 15]

Fig. 4: Comparison of the distribution P(o|n = 10, s = 3, σ_s = 0.3 s_0) with P(o|ν = 10/3) (top) and of P(o|ν = 10/3, σ_s = 0.3 s_0) with P(o|ν = 10/3) (bottom).

5 Summary

We have described an alternative presentation of data for cases where the aim is to judge whether an observed number of events is consistent with the model predictions. We believe this style of presentation is more appropriate than the common one, where error bars are placed on the observed number of events. Example code snippets for the examples described here can be requested from the authors.

6 Acknowledgments

The authors would like to thank Frederik Beaujean and Kevin Kröninger for useful discussions.

References

[1] See, e.g., G. D'Agostini, Bayesian Reasoning in Data Analysis - A Critical Introduction, World Scientific, 2003.
[2] H. Jeffreys, An Invariant Form for the Prior Probability in Estimation Problems, Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences 186 (1946) 453.
[3] See, e.g., D. Zwillinger and S. Kokoska, CRC Standard Probability and Statistics Tables and Formulae, CRC Press, 2000.
