Mathematics and Mechanics of Complex Systems, vol. 5, no. 1, 2017
dx.doi.org/10.2140/memocs.2017.5.19

DATING HYPATIA’S BIRTH: A PROBABILISTIC MODEL

CANIO BENEDETTO, STEFANO ISOLA AND LUCIO RUSSO

We propose a probabilistic approach as a dating methodology for events like the birth of a historical figure. The method is then applied to the controversial birth date of the Alexandrian scientist Hypatia, proving to be surprisingly effective.

1. Introduction
2. A probabilistic method for combining testimonies
3. Application to Hypatia
4. Conclusions
References

1. Introduction

Although in historical investigation it may appear meaningless to do experiments on the basis of a preexisting theory (and, in particular, it does not make sense to prove theorems of history), it can make perfect sense to use forms of reasoning typical of the exact sciences as an aid to increase the degree of reliability of a particular statement regarding a historical event. This paper deals with the problem of dating the birth of a historical figure when the only information available about it is indirect: for example, a set of testimonies, or scattered statements, about various aspects of his/her life. The strategy is then based on constructing a probability distribution for the birth date out of each testimony and subsequently combining the distributions so obtained in a sensible way. One might raise several objections to this program. According to Charles Sanders Peirce [1901], a probability “is the known ratio of frequency of a specific event to a generic event”, but a birth is neither a specific event nor a generic event but an “individual event”. Nevertheless, probabilistic reasoning is used quite often in situations dealing with events that can be classified as “individual”.
In probabilistic forecasting, one tries to summarize what is known about future events with the assignment of a probability to each of a number of different outcomes that are often events of this kind. For instance, in sport betting, a summary of bettors’ opinions about the likely outcome of a race is produced in order to set bookmakers’ pay-off rates. By the way, this type of observation lies at the basis of the theoretical formulation of the subjective approach in probability theory [de Finetti 1931]. Although we do not endorse de Finetti’s approach in all its implications, we embrace its severe criticism of the exclusive use of the frequentist interpretation in the application of probability theory to concrete problems. In particular, we feel entitled to look at an “individual” event of the historical past with a spirit similar to that with which one bets on a future outcome (this is a well known issue in the philosophy of probability; see, e.g., [Dubucs 1993]). Plainly, as the information about an event like the birth of a historical figure is first extracted from material drawn from various literary sources and then treated with mathematical tools, both our approach and goal are interdisciplinary in their essence.

Communicated by Francesco dell’Isola.
MSC2010: 01A20, 62C05, 62P99.
Keywords: dating, probabilistic method, historical testimonies, decision theory, Hypatia.

2. A probabilistic method for combining testimonies

Let $X = [x_-, x_+] \subset \mathbb{Z}$ be the time interval that includes all possible birth dates of a given subject (terminus ad quem). $X$ can be regarded as a set of mutually exclusive statements about a singular phenomenon (the birth of a given subject in a given year), only one of which is true, and can be made a probability space $(X, \mathcal{F}, P_0)$, with $\mathcal{F}$ the σ-algebra made of the $2^{|X|}$ events of interest and $P_0$ the uniform probability measure on $\mathcal{F}$ (reference measure): $P_0(A) = |A|/|X|$ (where $|A|$ denotes the number of elements of $A$).
In the context of decision theory, the assignment of this probability space can be regarded as the expression of a basic state of knowledge, in the absence of any information that can be used to discriminate among the possible statements on the given phenomenon, namely a situation in which Laplace’s principle of indifference can be legitimately applied.

Now suppose we have $k$ testimonies $T_i$, $i = 1, \dots, k$, which in first approximation we may assume independent of each other, each providing some kind of information about the life of the subject, and which can be translated into a probability distribution $p_i$ on $\mathcal{F}$ so that $p_i(x)$ is the probability that the subject is born in the year $x \in X$ based on the information given by the testimony $T_i$, assumed true, along with supplementary information such as, e.g., life tables for the historical period considered. The precise criteria for the construction of these probability distributions depend on the kind of information carried by each testimony and will be discussed case by case in the next section. Of course, we shall also take into account the possibility that some testimonies are false, thereby not producing any additional information. We model this possibility by assuming that the corresponding distributions equal the reference measure $P_0$.

The problem that we want to discuss in this section is the following: how can one combine the distributions $p_i$ in such a way as to get a single probability distribution $Q$ that somehow optimizes the available information? To address this question, let us observe that from the $k$ testimonies taken together, each one with the possibility to be true or false, one gets $N = 2^k$ combinations, corresponding to as many binary words $\sigma_s = \sigma_s(1) \cdots \sigma_s(k) \in \{0,1\}^k$, which can be ordered lexicographically according to $s = \sum_{i=1}^{k} \sigma_s(i) \cdot 2^{i-1} \in \{0, 1, \dots, N-1\}$. To each word there corresponds the distribution $P_s$ given by

\[
P_s(\cdot) = \frac{\prod_{i=1}^{k} p_i^{\sigma_s(i)}(\cdot)}{\sum_{x \in X} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(x)}, \qquad
p_i^{\sigma_s(i)} = \begin{cases} p_i, & \sigma_s(i) = 1, \\ P_0, & \sigma_s(i) = 0, \end{cases} \tag{2-1}
\]

where the superscript $\sigma_s(i)$ is a label selecting one of the two distributions, not an exponent. In particular, one readily verifies that $P_0$ is but the reference uniform measure.

Now, if $\Omega$ denotes the class of probability distributions $Q : X \to [0,1]$, we look for a pooling operator $T : \Omega^N \to \Omega$ that combines the distributions $P_s$ by weighing them in a sensible way. The simplest candidate has the general form of a linear combination

\[
T(P_0, \dots, P_{N-1}) = \sum_{s=0}^{N-1} w_s P_s, \qquad w_s \ge 0, \qquad \sum_{s=0}^{N-1} w_s = 1, \tag{2-2}
\]

which, as we shall see, can also be obtained by minimizing some information-theoretic function.

Remark 2.1. The issue we are discussing here has been the object of a vast amount of literature regarding the normative aspects of the formation of aggregate opinions in several contexts (see, e.g., [Genest and Zidek 1986] and references therein). In particular, it has been shown by McConway [1981] that, if one requires the existence of a function $F : [0,1]^N \to [0,1]$ such that

\[
T(P_0, \dots, P_{N-1})(A) = F(P_0(A), \dots, P_{N-1}(A)) \quad \text{for all } A \in \mathcal{F} \tag{2-3}
\]

with $P_s(A) = \sum_{x \in A} P_s(x)$, then whenever $|X| \ge 3$, $F$ must necessarily have the form of a linear combination as in (2-2). The above condition implies in particular that the value of the combined distribution on coordinates depends only on the corresponding values on the coordinates of the distributions $P_s$, namely that the pooling operator commutes with marginalization.

However, some drawbacks of the linear pooling operator have also been highlighted. For example, it does not “preserve independence” in general: if $|X| \ge 5$, it is not true that $P_s(A \cap B) = P_s(A) P_s(B)$, $s = 0, \dots, N-1$, entails

\[
T(P_0, \dots, P_{N-1})(A \cap B) = T(P_0, \dots, P_{N-1})(A) \, T(P_0, \dots, P_{N-1})(B)
\]

unless $w_s = 1$ for some $s$ and $0$ for all others [Lehrer and Wagner 1983; Genest and Wagner 1987].

(Another form of the pooling operator considered in the literature to overcome the difficulties associated with the use of (2-2) is the log-linear combination

\[
T(P_0, \dots, P_{N-1}) = C \prod_{s=0}^{N-1} P_s^{w_s}, \qquad w_s \ge 0, \qquad \sum_{s=0}^{N-1} w_s = 1, \tag{2-4}
\]

where $C$ is a normalizing constant [Genest and Zidek 1986; Abbas 2009].)
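To make the objects introduced so far concrete, the following Python sketch builds the $N = 2^k$ distributions $P_s$ of (2-1) from a list of testimony distributions and then pools them, once linearly as in (2-2) and once log-linearly as in (2-4). It is only an illustration under stated assumptions: the function names, the two toy testimonies on a four-year set, and the equal weights are hypothetical placeholders, not values from the paper, and the weights are taken as given since their choice is a separate question.

```python
import math


def combined_distributions(testimonies):
    """Build [P_0, ..., P_{N-1}] following (2-1).

    Bit i of s plays the role of sigma_s(i+1): when it is 1 the
    testimony distribution is used, otherwise the uniform measure.
    """
    k = len(testimonies)
    n = len(testimonies[0])
    uniform = [1.0 / n] * n
    result = []
    for s in range(2 ** k):
        raw = [1.0] * n
        for i in range(k):
            factor = testimonies[i] if (s >> i) & 1 else uniform
            raw = [r * f for r, f in zip(raw, factor)]
        total = sum(raw)
        if total == 0:
            # The product vanishes everywhere: P_s is undefined for this word.
            raise ValueError(f"testimonies in word s={s} have disjoint supports")
        result.append([r / total for r in raw])
    return result


def linear_pool(dists, weights):
    """Arithmetic pooling (2-2): sum_s w_s * P_s."""
    n = len(dists[0])
    return [sum(w * p[x] for w, p in zip(weights, dists)) for x in range(n)]


def log_linear_pool(dists, weights):
    """Geometric pooling (2-4): C * prod_s P_s ** w_s."""
    n = len(dists[0])
    raw = [math.prod(p[x] ** w for w, p in zip(weights, dists)) for x in range(n)]
    total = sum(raw)  # equals 1 / C
    return [r / total for r in raw]


# Hypothetical toy input: two testimonies on a set X of four years.
p1 = [0.1, 0.2, 0.3, 0.4]
p2 = [0.4, 0.3, 0.2, 0.1]
P = combined_distributions([p1, p2])

# Placeholder weights (equal), used only to exercise the two operators.
w = [0.25, 0.25, 0.25, 0.25]
lin = linear_pool(P, w)
geo = log_linear_pool(P, w)
```

In this toy case `P[0]` is the uniform measure, `P[1]` and `P[2]` reproduce the two testimonies, and `P[3]` is their normalized product, which concentrates on the years where both testimonies agree.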
On the other hand, in our context, the independence preservation property does not seem so desirable: the final distribution $T(P_0, \dots, P_{N-1})$ relies on a set of information much wider than that associated with the single distributions $P_s$, and one can easily imagine how the alleged independence between two events can disappear as the information about them increases.

2.1. Optimization. The linear combination (2-2) can also be viewed as the marginal distribution¹ of $x \in X$ under the hypothesis that one of the distributions $P_0, \dots, P_{N-1}$ is the “true” one (without knowing which) [Genest and McConway 1990]. In this perspective, (2-2) can be obtained by minimizing the expected loss of information due to the need to compromise, namely a function of the form

\[
I(w, Q) = \sum_{s=0}^{N-1} w_s D(P_s \,\|\, Q) \ge 0, \tag{2-5}
\]

where

\[
D(P \,\|\, Q) = \sum_{x \in X} P(x) \log\left(\frac{P(x)}{Q(x)}\right) \tag{2-6}
\]

is the Kullback–Leibler divergence [1951], representing the information loss incurred by using the measure $Q$ instead of $P$. Note that the concavity of the logarithm and the Jensen inequality yield

\[
-\sum_{x} P(x) \log \frac{P(x)}{Q(x)} \le \log \sum_{x} P(x) \frac{Q(x)}{P(x)} = 0
\]

and therefore

\[
D(P \,\|\, Q) \ge 0 \quad \text{and} \quad D(P \,\|\, Q) = 0 \iff Q \equiv P. \tag{2-7}
\]

¹ In the sense that a marginal probability can be obtained by averaging conditional probabilities.

We have the following result.

Lemma 2.2. Given a probability vector $w = (w_0, w_1, \dots, w_{N-1})$,

\[
\operatorname*{arg\,min}_{Q \in \Omega} I(w, Q) = Q_w \equiv \sum_{s} w_s P_s. \tag{2-8}
\]

Moreover,

\[
I(w, Q_w) = H\Bigl(\sum_{s} w_s P_s\Bigr) - \sum_{s} w_s H(P_s), \tag{2-9}
\]

where $H(Q) = -\sum_{x \in X} Q(x) \log Q(x)$ is the entropy of $Q \in \Omega$.

Proof. Equation (2-8) can be obtained using the method of Lagrange multipliers. An alternative argument makes use of the easily derived “parallelogram rule”:

\[
\sum_{s} w_s D(P_s \,\|\, Q) = \sum_{s} w_s D(P_s \,\|\, Q_w) + D(Q_w \,\|\, Q) \quad \text{for all } Q \in \Omega. \tag{2-10}
\]

From (2-7), we thus get $I(w, Q_w) \le I(w, Q)$ for all $Q \in \Omega$. The uniqueness of the minimum follows from the convexity of $D(P \,\|\, Q)$ with respect to $Q$.
Finally, checking (2-9) is a simple exercise. □

Remark 2.3. It is worth mentioning that, if we took $\sum_s w_s D(Q \,\|\, P_s)$ (instead of $\sum_s w_s D(P_s \,\|\, Q)$) as the function to be minimized (still varying $Q$ with $w$ fixed), then instead of the “arithmetic mean” (2-2), the “optimal” distribution would have been the “geometric mean” (2-4) (see also [Abbas 2009]).

2.2. Allocating the weights. We have seen that for each probability vector $w$ in the $N$-dimensional simplex $\{w_s \ge 0 : \sum_{s=0}^{N-1} w_s = 1\}$ the distribution $Q_w = \sum_s w_s P_s$ is the “optimal” one. We are now left with the problem of determining a sensible choice for $w$. This cannot be achieved by using the same criterion, in that by (2-7) $\inf_w I(w, Q_w) = 0$ and the minimum is realized whenever $w_s = 1$ for some $s$ and $0$ for all others.

A suitable expression for the weights $w_s$ can be obtained by observing that the term $\sum_{x \in X} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(x)$ is proportional to the probability of the event (in the product space $X^{[1,k]}$) that the birth dates of $k$ different subjects, with the $i$-th birth date distributed according to $p_i^{\sigma_s(i)}$, coincide, and thus it furnishes a measure of the degree of compatibility of the distributions $p_i$ involved in the product associated with the word $\sigma_s$.

It thus appears natural to consider the weights

\[
w_s = \frac{\sum_{x \in X} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(x)}{\sum_{s'=0}^{N-1} \sum_{x \in X} \prod_{i=1}^{k} p_i^{\sigma_{s'}(i)}(x)}, \tag{2-11}
\]

which, once inserted in (2-2), yield the expression

\[
T(P_0, \dots, P_{N-1})(\cdot) = \frac{\sum_{s=0}^{N-1} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(\cdot)}{\sum_{x \in X} \sum_{s=0}^{N-1} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(x)}. \tag{2-12}
\]

Remark 2.4. There are at least $k+1$ strictly positive coefficients $w_s$. They correspond to the words $\sigma_{s^{(i)}}$ with $\sigma_{s^{(i)}}(i) = 1$ for some $i \in \{1, \dots, k\}$ and $\sigma_{s^{(i)}}(j) = 0$ for $j \ne i$, plus one to the word $0^k$, that is, to the distributions $P_{s^{(i)}} \equiv p_i$, $i \in \{0, 1, \dots, k\}$, where $p_0 \equiv P_0$.

2.3. Weights as likelihoods.
A somewhat complementary argument to justify the choice (2-11) for the coefficients $w_s$ can be formulated in the language of probabilistic inference, showing that they can be interpreted as (normalized) average likelihoods associated with the various combinations corresponding to the words $\sigma_s$. More precisely, with each pair of “hypotheses” of the form

\[
D_i^e = \begin{cases} \{T_i \text{ true}\}, & e = 1, \\ \{T_i \text{ false}\}, & e = 0, \end{cases}
\]

we associate its likelihood, given the event that the birth date is $x \in X$, with the expression²

\[
V(D_i^e \mid x) = \frac{P(x \mid D_i^e)}{P(x)} = \begin{cases} p_i(x)/p_0(x), & e = 1, \\ 1, & e = 0, \end{cases} \tag{2-13}
\]

with $i \in \{1, \dots, k\}$ and $p_0 \equiv P_0$. In this way, the posterior probability $P(D_i^e \mid x)$ (the probability of $D_i^e$ in light of the event that the subject was born in the year $x \in X$) is given by the product of $V(D_i^e \mid x)$ with the prior probability $P(D_i^e)$, according to Bayes’s formula.

² Here the symbol $P$ denotes either the reference measure $P_0$ or any probability measure on $X$ compatible with it.

If we now consider two pairs of “hypotheses” $D_i^{e_i}$ and $D_j^{e_j}$, which we assume conditionally independent (without being necessarily independent), that is,

\[
P(D_i^{e_i}, D_j^{e_j} \mid x) = P(D_i^{e_i} \mid x) \, P(D_j^{e_j} \mid x), \qquad e_i, e_j \in \{0, 1\},
\]

then we find

\[
\begin{aligned}
V(D_i^{e_i}, D_j^{e_j} \mid x) &= \frac{P(x \mid D_i^{e_i}, D_j^{e_j})}{P(x)} = \frac{P(D_i^{e_i}, D_j^{e_j} \mid x)}{P(D_i^{e_i}, D_j^{e_j})} = \frac{P(D_i^{e_i} \mid x) \, P(D_j^{e_j} \mid x)}{P(D_i^{e_i}, D_j^{e_j})} \\
&= \frac{P(D_i^{e_i}) \, P(D_j^{e_j})}{P(D_i^{e_i}, D_j^{e_j})} \cdot V(D_i^{e_i} \mid x) \, V(D_j^{e_j} \mid x).
\end{aligned}
\]

More generally, given $k$ testimonies $T_i$, to each of which there corresponds the pair of events $D_i^e$, and given a word $\sigma_s \in \{0,1\}^k$, if we assume the conditional independence of the events $(D_1^{\sigma_s(1)}, \dots, D_k^{\sigma_s(k)})$, we get

\[
V(D_1^{\sigma_s(1)}, \dots, D_k^{\sigma_s(k)} \mid x) = \rho_s \prod_{i=1}^{k} V(D_i^{\sigma_s(i)} \mid x) \tag{2-14}
\]

where

\[
\rho_s = \frac{\prod_{i=1}^{k} P(D_i^{\sigma_s(i)})}{P(D_1^{\sigma_s(1)}, \dots, D_k^{\sigma_s(k)})}. \tag{2-15}
\]

If, in addition, there are grounds to assume unconditional independence, i.e., $\rho_s = 1$, then (2-14) simply reduces to the product rule. Under this assumption, we can evaluate the average likelihood of the set of information $(D_1^{\sigma_s(1)}, \dots, D_k^{\sigma_s(k)})$ with the expression

\[
V_s = \frac{1}{|X|} \sum_{x \in X} V(D_1^{\sigma_s(1)}, \dots, D_k^{\sigma_s(k)} \mid x) = |X|^{k-1} \sum_{x \in X} \prod_{i=1}^{k} p_i^{\sigma_s(i)}(x). \tag{2-16}
\]

Comparing with (2-11), we see that

\[
w_s = \frac{V_s}{\sum_{s'=0}^{N-1} V_{s'}}. \tag{2-17}
\]

In other words, within the hypotheses made so far, the allocation of the coefficients (2-11) corresponds to assigning to each distribution $P_s$ a weight proportional to the average likelihood of the set of information from which it is constructed.

3. Application to Hypatia

This method is now applied to a particular dating process, the one of Hypatia’s birth. This choice stems from the desire to study a case both easy to handle and potentially useful in its results. The problem of dating Hypatia’s birth is indeed open, in that there are different possible resolutions of the constraints imposed by the available data. According to the reconstruction given by Deakin [2007, p. 51], “Hypatia’s birth has been placed as early as 350 and as late as 375. Most authors settle for ‘around 370’”. There are not many testimonies (historical records) concerning the birth of the Alexandrian scientist (far more are about her infamous death), but they have the desirable feature of being independent of one another, as will be apparent in the sequel, so that the scheme discussed in the previous section can be directly applied. The hope is to obtain something that is qualitatively significant when compared to the preexisting proposals, based on a qualitative discussion of the sources, and quantitatively unambiguous. A probability distribution for the year of Hypatia’s birth is extracted from each testimony, the specific reasoning being briefly discussed in each case. Eventually all distributions are combined according to the criteria outlined in the previous section.

3.1. Hypatia was at her peak between 395 and 408.
Under the entry Ὑπατία, the Suda (a Byzantine lexicon) informs us that she flourished under the emperor Arcadius (ἤκμασεν ἐπὶ τῆς βασιλείας Ἀρκαδίου).³ It is well established that Arcadius, the first ruler of the Byzantine Empire, reigned from 395 to 408. Guessing an age or age interval based on the Greek ἤκμασεν, however, is less straightforward. The word is related to ἀκμή, ‘peak’, and we follow the rule of thumb, going back to Antiquity, that it refers to the period of one’s life around 40 years of age. Specifically, we adopt the probability distribution $f(x)$ in Figure 1 to model how old Hypatia would have been at her “peak” in Arcadius’ reign.

³ Υ 166. See http://www.stoa.org/sol-bin/search.pl?field=adlerhw_gr&searchstr=upsilon,166.

[Figure 1. The probability distribution $f(x)$ assumed associated with one’s peak years. Horizontal axis: age (35 to 45). Vertical axis: probability of Hypatia’s floruit (0 to 0.2).]

[Figure 2. The probability distribution $\Upsilon_f(\xi)$ for Hypatia’s birth based on her peak years. Horizontal axis: year (350 to 373). Vertical axis: probability of Hypatia’s birth (0 to 0.08).]

Figure 2 shows $\Upsilon_f(\xi)$, the probability distribution for the year of Hypatia’s birth deduced from this historical datum; it is obtained by averaging fourteen copies of the triangular $f(x)$, each centered around one of the years from 355 through 368 (the beginning and end points of Arcadius’s reign, shifted back by the 40 years corresponding to the peak of $f(x)$).

3.2. Hypatia was intellectually active in 415. The sources ascribe Hypatia’s martyrdom at the hands of a mob of Christian fanatics to the envy that many felt on account of her extraordinary intelligence, freedom of thought, and political influence, being a woman.
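As a numerical illustration of the peak-years construction of Section 3.1, the following Python sketch averages fourteen shifted copies of a discrete triangular age distribution to obtain a birth-year distribution. The exact shape of $f$ is an assumption read off Figure 1 (ages 35 to 45, peak at 40, weights proportional to 0, 1, ..., 5, ..., 1, 0), and the function names are hypothetical; the source states only that $f$ is triangular and peaks at 40.

```python
from fractions import Fraction


def triangular_age_pmf(center=40, half_width=5):
    """Assumed discrete triangular f(x): zero at center +/- half_width."""
    weights = {
        center + d: half_width - abs(d)
        for d in range(-half_width, half_width + 1)
    }
    total = sum(weights.values())
    return {age: Fraction(w, total) for age, w in weights.items()}


def birth_year_pmf(reign_start=395, reign_end=408, age_pmf=None):
    """Average one shifted copy of f per regnal year (fourteen copies here).

    For each year y of the reign, the peak age x contributes to the
    birth year y - x with weight f(x) / (number of years).
    """
    age_pmf = age_pmf or triangular_age_pmf()
    years = range(reign_start, reign_end + 1)
    pmf = {}
    for year in years:
        for age, p in age_pmf.items():
            birth = year - age
            pmf[birth] = pmf.get(birth, 0) + p / len(years)
    return dict(sorted(pmf.items()))


upsilon_f = birth_year_pmf()
```

Under this assumed shape the support runs from 350 to 373, matching the year axis of Figure 2, and the distribution is flat at 1/14 on the years whose whole triangular window fits inside the reign. Exact fractions are used so that normalization can be checked without rounding.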
Her entry in the Suda, already mentioned, states:

Τοῦτο δὲ πέπονθε διὰ φθόνον καὶ τὴν ὑπερβάλλουσαν σοφίαν, καὶ μάλιστα εἰς τὰ περὶ ἀστρονομίαν.⁴

Socrates Scholasticus, in his Εκκλησιαστική Ιστορία, reports:

On account of the self-possession and ease of manner, which she had acquired in consequence of the cultivation of her mind, she not infrequently appeared in public in presence of the magistrates. Neither did she feel abashed in coming to an assembly of men. For all men on account of her extraordinary dignity and virtue admired her the more. Yet even she fell a victim to the political jealousy which at that time prevailed. For as she had frequent interviews with Orestes, it was calumniously reported among the Christian populace, that it was she who prevented Orestes from being reconciled to the bishop.⁵

Because of these and similar testimonies, it seems reasonable to mark 415 as a year of intellectual activity in Hypatia’s life.

To get from this information a probability distribution for the year of birth, it is necessary to have the probability distribution of being intellectually active at a given age. This can be calculated given the probability of being alive at any given age and of being active at any given age (if alive), by simple multiplication.

To derive the first of these probability distributions we have used data from a 1974 mortality table for Italian males,⁶ clipping off ages under 18 since the subject was known to be intellectually active. The resulting probability distribution, $a(x)$, is shown in Figure 3.

[Figure 3. The probability distribution $a(x)$ for an adult to reach a given age. The life expectancy comes to 71.8 years. Horizontal axis: age (18 to 104). Vertical axis: probability of Hypatia being alive (0 to 1).]

⁴ She suffered this [violent death] because of the envy for her extraordinary wisdom, especially in the field of astronomy.
⁵ Book VII, Chapter 15; translation from [Socrates Scholasticus, p. 160].
⁶ All data are taken from http://www.mortality.org.
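The “simple multiplication” mentioned above can be sketched in Python as follows. The sketch multiplies a survival curve by a conditional activity curve and then, as one plausible reading, turns the result into a birth-year distribution by evaluating it at the age reached in 415 and normalizing over the candidate years. Everything numerical here is a hypothetical placeholder: the straight-line survival curve is not the 1974 Italian mortality table, the activity curve is invented, the candidate range 350 to 375 is borrowed from Deakin’s quoted bounds, and the normalization step assumes a uniform prior on birth years.

```python
def active_at_age(alive, active_if_alive):
    """P(active at age) = P(alive at age) * P(active | alive at age)."""
    return {age: alive[age] * active_if_alive[age] for age in alive}


def birth_pmf_from_activity(active, year_active, birth_years):
    """Birth-year distribution given activity in `year_active`.

    Assumes a uniform prior over `birth_years`, so the result is
    active(year_active - birth) normalized over the candidates.
    """
    raw = {b: active.get(year_active - b, 0.0) for b in birth_years}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no candidate birth year is compatible with activity")
    return {b: v / total for b, v in raw.items()}


# Placeholder curves on ages 18..104 (NOT the paper's data).
ages = range(18, 105)
alive = {x: (104 - x) / 86 for x in ages}  # straight line, 1 at 18, 0 at 104
active_if_alive = {
    x: 1.0 if x <= 60 else max(0.0, (90 - x) / 30) for x in ages
}

active = active_at_age(alive, active_if_alive)
pmf_415 = birth_pmf_from_activity(active, 415, range(350, 376))
```

With any curves that decrease with age, as these placeholders do, later birth years come out more probable than earlier ones, since being active in 415 is more likely at 40 than at 65.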