Features modeling with an α-stable distribution: application to pattern recognition based on continuous belief functions

Anthony Fiche (a,∗), Jean-Christophe Cexus (a), Arnaud Martin (b), Ali Khenchaf (a)

(a) LabSticc, UMR 6285, ENSTA Bretagne, 2 rue François Verny, 29806 Brest Cedex 9, France
(b) UMR 6074 IRISA, IUT Lannion / Université de Rennes 1, rue Édouard Branly BP 30219, 22302 Lannion Cedex, France

arXiv:1501.05612v1 [cs.AI] 22 Jan 2015

Abstract

The aim of this paper is to show the interest of fitting features with an α-stable distribution in order to classify imperfect data. The supervised pattern recognition is based on the theory of continuous belief functions, which is a way to take the imprecision and uncertainty of data into account. The distributions of features are assumed to be unimodal and are estimated by a single Gaussian model and by a single α-stable model. Experimental results are first obtained from synthetic data by combining two features of one dimension and by considering a vector of two features. Mass functions are calculated from plausibility functions by using the generalized Bayes theorem. The same study is applied to the automatic classification of three types of sea floor (rock, silt and sand) with features acquired by a mono-beam echo-sounder. We evaluate the quality of the α-stable model and of the Gaussian model by analyzing qualitative results, using a Kolmogorov-Smirnov test (K-S test), and quantitative results with classification rates. The performance of the belief classifier is compared with that of a Bayesian approach.

Keywords: Gaussian and α-stable model, unimodal features, continuous belief functions, supervised pattern recognition, Kolmogorov-Smirnov test, Bayesian approach.

1. Introduction

The choice of a model plays an important role in estimation problems. For example, the Gaussian model fits data well in many applications, is very simple to use and saves computation time. However, like all distribution models, Gaussian laws have weaknesses and can give skewed results. Indeed, as the Gaussian probability density function (pdf) is symmetrical, it is not a valid model when the pdf of the data is not symmetrical.
It is therefore difficult to choose the right model for each application. The Gaussian distribution belongs to a family of distributions called stable distributions. This family allows the representation of heavy tails and skewness. A distribution is said to have a heavy tail if its tail decays more slowly than the tail of the Gaussian distribution. Skewness means that the probability density function is not symmetrical about its mode. The main property of stable laws, introduced by Lévy [1], is that the sum of two independent stable random variables is again a stable random variable. α-stable distributions are used in different fields of research such as radar [2, 3], image processing [4] or finance [5, 6].

The aim of this study is to show the interest of fitting data with an α-stable distribution. In [7], the author proposes to characterize the sea floor from a vector of features modeled by Gaussian mixture models (GMMs) using an Autonomous Underwater Vehicle (AUV). However, the problem with the Bayesian approach is the difficulty of taking the uncertainty of data into account. We therefore favored an approach based on the theory of belief functions [8, 9]. In [10], the authors compared a Bayesian classifier and a belief classifier where data from sensors are modeled using GMMs estimated via an Expectation-Maximization (EM) algorithm [11]. This work has been extended to data modeled by α-stable mixture models [12]. However, it is difficult to choose between these two models because the results are roughly the same. This paper raises two problems. Firstly, it is necessary to work on a data set where features are modeled by α-stable pdfs. The second problem is to know how to apply the theory of belief functions when data from sensors are modeled by α-stable distributions. This point has been dealt with in [13].

∗ Corresponding author. Email addresses: [email protected] (Anthony Fiche), [email protected] (Jean-Christophe Cexus), [email protected] (Arnaud Martin), [email protected] (Ali Khenchaf)

Preprint submitted to Information Fusion, January 23, 2015

The characterization of uncertain environments has recently taken an important place in several fields of application such as SOund Navigation And Ranging (SONAR). SONAR has been used for the detection of underwater mines [14]. In [15], the author developed techniques to perform automatic classification of sediments on the seabed from sonar images. In [10], the authors classified sonar images by extracting features and modeling them with GMMs. In [16], the authors characterized the sea floor using data from a mono-beam echo-sounder. A data set represents an echo signal amplitude as a function of time. They compare the time envelope of the echo signal amplitude with a set of theoretical reference curves. In this paper, we build a classifier based on the theory of belief functions where a vector of features extracted from a mono-beam echo-sounder is modeled by a Gaussian and by an α-stable distribution. The feature pdfs have only one mode. We finally show that it is of interest to fit features with an α-stable model.

This paper is organized as follows. In Section 2, we introduce the definition of stability, the method used to construct α-stable probability density functions and the estimation methods. The definitions in the multivariate case are also presented. We introduce the notion of belief functions in the discrete and continuous cases in Section 3. Belief functions are then calculated in the particular case of α-stable distributions and a belief classifier is constructed in Section 4. Finally, in Section 5, we classify generated and real data whose features are modeled by Gaussian and α-stable distributions with the theory of belief functions, and we compare the results obtained with the theory of belief functions with those of a Bayesian approach.

2. The α-stable distributions

Gauss introduced the probability density function now called the "Gaussian distribution" in his study of astronomy.
Laplace and Poisson developed the theory of characteristic functions by calculating the analytical expression of the Fourier transform of a probability density function. Laplace found that the Fourier transform of a Gaussian law is also a Gaussian law. Cauchy tried to calculate the Fourier transform of a "generalized Gaussian" function with the expression $f_n(x) = \frac{1}{\pi}\int_0^{+\infty} \exp(-c t^n)\cos(tx)\,dt$, but he did not solve the problem. When the integer $n$ is replaced by a real α, this expression defines the family of α-stable distributions. However, Cauchy did not know at that time whether he had defined a probability density function. With the results of Polya and Bernstein, Janicki and Weron [17] demonstrated that the members of the family of α-stable laws are probability density functions. The mathematician Paul Lévy studied the central limit theorem and showed that, under the constraint of infinite variance, the limit law is a stable law [1]. Motivated by this property, Lévy calculated the Fourier transform for all α-stable distributions. These stable laws, which satisfy the generalized central limit theorem, are interesting as they allow the modeling of data such as impulsive noise when the Gaussian model is not valid. Another property of these laws is their ability to model heavy tails and skewness. In the following, we define the notion of stability, the characteristic function, the way to build a pdf from the characteristic function, the different estimators of α-stable distributions and the extension of α-stable distributions to the multivariate case.

2.1. Definition of stability

A law is stable when the sum of two independent random variables following this α-stable law again follows an α-stable law. Mathematically, this definition means the following: a random variable $X$ is stable, denoted $X \sim S_\alpha(\beta,\gamma,\delta)$, if for all $(a,b) \in \mathbb{R}^+ \times \mathbb{R}^+$, there are $c \in \mathbb{R}^+$ and $d \in \mathbb{R}$ such that:

$$aX_1 + bX_2 = cX + d, \tag{1}$$

where $X_1$ and $X_2$ are two independent α-stable random variables which follow the same distribution as $X$, and the equality holds in distribution. Although Equation (1) defines the notion of stability, it does not give any indication as to how to parameterize an α-stable distribution. We therefore prefer to use the definition given by the characteristic function to refer to an α-stable distribution.

2.2.
Characteristic function

Several equivalent definitions have been suggested in the literature to parameterize an α-stable distribution from its characteristic function [18, 19]. Zolotarev [19] proposed the following:

$$\phi_{S_\alpha(\beta,\gamma,\delta)}(t) = \begin{cases} \exp\left(jt\delta - |\gamma t|^{\alpha}\left[1 + j\beta \tan\left(\frac{\pi\alpha}{2}\right)\mathrm{sign}(t)\left(|t|^{1-\alpha} - 1\right)\right]\right) & \text{if } \alpha \neq 1, \\ \exp\left(jt\delta - |\gamma t|\left[1 + j\beta\,\frac{2}{\pi}\,\mathrm{sign}(t)\log|t|\right]\right) & \text{if } \alpha = 1, \end{cases} \tag{2}$$

where each parameter takes its values in a specific range:

• α ∈ ]0, 2] is the characteristic exponent.
• β ∈ [−1, 1] is the skewness parameter.
• γ ∈ R+∗ represents the scale parameter.
• δ ∈ R is the location parameter.

The advantage of this parameterization compared to [18] is that the values of the characteristic and probability density functions are continuous in all parameters. In fact, the parameterization defined by [18] is discontinuous when α = 1 and β ≠ 0.

2.3. The probability density function

The representation of an α-stable pdf, denoted $f_{S_\alpha(\beta,\gamma,\delta)}$, is obtained by calculating the Fourier transform of its characteristic function:

$$f_{S_\alpha(\beta,\gamma,\delta)}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \phi_{S_\alpha(\beta,\gamma,\delta)}(t)\exp(-jtx)\,dt. \tag{3}$$

However, this definition is problematic for two reasons: the integrand is complex-valued and the integration bounds are infinite. Nolan [20] therefore proposed a way to represent normalized α-stable distributions (i.e. γ = 1 and δ = 0). The main idea of Nolan [20] is to use changes of variable so that the integral has finite bounds. Each parameter has an influence on the shape of $f_{S_\alpha(\beta,\gamma,\delta)}$. The curve has a pronounced peak when α is near 0 and a Gaussian shape when α = 2 (Figure 1(a)). The shape of the distribution skews to the left if β = 1 and to the right if β = −1 (Figure 1(b)), while the distribution is symmetrical when β = 0. Finally, the scale parameter enlarges or compresses the shape of the distribution (Figure 1(c)) and the location parameter translates the mode of $f_{S_\alpha(\beta,\gamma,\delta)}$ (Figure 1(d)).

2.4. An overview of α-stable estimators

The estimators of α-stable distributions fall into three families:

• the sample quantile methods [21, 22]
• the sample characteristic function methods [23, 24, 25, 26]
• the Maximum Likelihood Estimation [27, 28, 29, 30].
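To make the second family more concrete, the sketch below illustrates the regression idea behind sample characteristic function methods. It relies only on the fact that, for the characteristic function of Equation (2), the modulus satisfies $|\phi(t)|^2 = \exp(-2|\gamma t|^{\alpha})$, so that $\log(-\log|\phi(t)|^2) = \log(2\gamma^{\alpha}) + \alpha\log|t|$ is linear in $\log|t|$. This is a simplified, non-iterative illustration and not the algorithm of [25, 26]: the function names, the grid of points $t$ and the Cauchy test data are assumptions made for the example.

```python
import math
import random


def ecf_modulus_sq(samples, t):
    """Squared modulus of the empirical characteristic function at t."""
    n = len(samples)
    re = sum(math.cos(t * x) for x in samples) / n
    im = sum(math.sin(t * x) for x in samples) / n
    return re * re + im * im


def estimate_alpha_gamma(samples, t_grid):
    """Least-squares fit of log(-log|phi(t)|^2) = log(2*gamma^alpha) + alpha*log(t)."""
    xs = [math.log(t) for t in t_grid]
    ys = [math.log(-math.log(ecf_modulus_sq(samples, t))) for t in t_grid]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    alpha = sxy / sxx                      # slope of the regression line
    intercept = y_mean - alpha * x_mean    # equals log(2 * gamma^alpha)
    gamma = (math.exp(intercept) / 2.0) ** (1.0 / alpha)
    return alpha, gamma


# Illustrative data (assumption): standard Cauchy samples, i.e. alpha = 1,
# beta = 0, gamma = 1, delta = 0, drawn by inverting the Cauchy cdf.
rng = random.Random(0)
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(4000)]

# Assumed grid of positive points t; [25, 26] tune this choice, we do not.
t_grid = [0.1 * k for k in range(1, 11)]

alpha_hat, gamma_hat = estimate_alpha_gamma(cauchy, t_grid)
print(round(alpha_hat, 2), round(gamma_hat, 2))
```

On such data the two estimates should land close to α = 1 and γ = 1. The skewness and location parameters would require a second regression on the phase of the empirical characteristic function, which is left out of this sketch.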
Fama and Roll [21] developed a method based on quantiles. However, the algorithm proposed by Fama and Roll suffers from a small asymptotic bias in α and γ and is restricted to α ∈ ]0.6, 2] and β = 0. McCulloch [22] extended the quantile method to the asymmetric case (i.e. β ≠ 0). The McCulloch estimator is valid for α ∈ ]0.6, 2].

Press [23, 24] proposed a method based on transformations of the characteristic function. In [31], the author compared the performances of several estimators. For example, the Press method is efficient only for specific values. Koutrouvelis [25] extended the Press method and proposed a regression method to estimate the parameters of α-stable distributions. He later proposed a second, iterative version of his algorithm [26]. In [32], the authors proved that the method proposed by Koutrouvelis is better than both the quantile method and the Press method because it gives consistent and asymptotically unbiased estimates.

The Maximum Likelihood Estimation (MLE) was first studied in the symmetric case [27, 28]. Dumouchel [29] developed an approximate maximum likelihood method. The MLE was also developed in the asymmetric case. Nolan [30] extended the MLE to the general case. The problem with the MLE is the calculation of the α-stable probability density function because there is no closed-form expression. Moreover, the computational algorithm is time-consuming. Consequently, we estimate a univariate α-stable distribution using the Koutrouvelis method [26].

2.5. Multivariate stable distributions

It is possible to extend the α-stable distributions to the multivariate case. A random vector $\mathbf{X} \in \mathbb{R}^d$ is stable if for all $a, b \in \mathbb{R}^+$, there are $c \in \mathbb{R}^+$ and $\mathbf{D} \in \mathbb{R}^d$ such that:

$$a\mathbf{X}_1 + b\mathbf{X}_2 = c\mathbf{X} + \mathbf{D}, \tag{4}$$

where $\mathbf{X}_1$ and $\mathbf{X}_2$ are two independent and identically distributed random vectors which follow the same distribution as $\mathbf{X}$. The characteristic function of a multivariate α-stable distribution, denoted $\mathbf{X} \sim S_{\alpha,d}(\sigma,\delta)$, has the form:

$$\phi_{S_{\alpha,d}(\sigma,\delta)}(t) = \begin{cases} \exp\left(-\int_{S_d} |\langle t,s\rangle|^{\alpha}\left(1 - j\,\mathrm{sgn}(\langle t,s\rangle)\tan\left(\frac{\pi\alpha}{2}\right)\right)\sigma(ds) + j\langle\delta,t\rangle\right) & \text{if } \alpha \neq 1, \\ \exp\left(-\int_{S_d} |\langle t,s\rangle|\left(1 + j\,\frac{2}{\pi}\,\mathrm{sgn}(\langle t,s\rangle)\ln|\langle t,s\rangle|\right)\sigma(ds) + j\langle\delta,t\rangle\right) & \text{if } \alpha = 1, \end{cases} \tag{5}$$

with

• $S_d = \{x \in \mathbb{R}^d \mid \|x\| = 1\}$ the d-dimensional unit sphere.
• $\sigma(\cdot)$ a finite Borel measure on $S_d$.
• $\delta, t \in \mathbb{R}^d$.

The expression for the characteristic function involves an integration over the unit sphere $S_d$. The measure σ is called the spectral measure and δ is called the location parameter. The problem with a multivariate α-stable distribution is that the characteristic functions form a non-parametric set. To avoid this problem, it is possible to consider a discrete spectral measure [33] which has the form:

$$\sigma(\cdot) = \sum_{i=1}^{K} \gamma_i\,\delta_{s_i}(\cdot), \tag{6}$$

with $\gamma_i$ the weights and $\delta_{s_i}$ the Dirac measure at $s_i$. For an α-stable distribution in $\mathbb{R}^2$ with $K$ mass points, the points are $s_i = (\cos\theta_i, \sin\theta_i)$ with $\theta_i = \frac{2\pi(i-1)}{K}$.

There are two methods to estimate an α-stable random vector:

• the PROJection method [34] (PROJ)
• the Empirical Characteristic Function method [35] (ECF)

These two algorithms have the same performances in terms of estimation and computation time.

At this point, it is important to underline that the features extracted from mono-beam echo-sounders have the properties of heavy tails and skewness. Consequently, we decided to model the data with an α-stable model. These data are however imprecise and uncertain: the imprecision and uncertainty of data can be linked to a poor quality estimation of these data. To take these constraints into consideration, we used a theory of uncertainty called the theory of belief functions. The goal of the next section is to present this theory.

3. The belief functions

The final objective of this paper is to classify synthetic and real data using the theory of belief functions. Data obtained from sensors are generally imprecise and uncertain as noise can disrupt their acquisition. It is possible to take these constraints into account by using the theory of belief functions. We first develop the theory of belief functions within a discrete framework before extending it to the real numbers.

3.1. Discrete belief functions

This section outlines basic tools of the theory of belief functions.

3.1.1.
Definitions

Discrete belief functions were introduced by Dempster [8] and formalized by Shafer [9], who considers a discrete set of $n$ exclusive events $C_i$ called the frame of discernment:

$$\Theta = \{C_1, \dots, C_n\}. \tag{7}$$

Θ can be interpreted as the set of all the hypotheses of a problem. Belief functions are defined from $2^{\Theta}$ to [0, 1]. The objective of discrete belief functions is to attribute a weight of belief to each element $A \in 2^{\Theta}$. Belief functions must satisfy the normalization:

$$\sum_{A \subseteq \Theta} m^{\Theta}(A) = 1, \tag{8}$$

where $m^{\Theta}$ is called the basic belief assignment (bba).

A focal element is a subset $A \subseteq \Theta$ such that $m^{\Theta}(A) > 0$. Several functions in one-to-one correspondence with the bba are built from it:

$$bel^{\Theta}(A) = \sum_{B \subseteq A,\, B \neq \emptyset} m^{\Theta}(B), \tag{9}$$

$$pl^{\Theta}(A) = \sum_{A \cap B \neq \emptyset} m^{\Theta}(B), \tag{10}$$

$$q^{\Theta}(A) = \sum_{B \subset \Theta,\, B \supseteq A} m^{\Theta}(B). \tag{11}$$

The credibility function $bel^{\Theta}(A)$ sums the masses of all the elements $B \subseteq A$, which partially support $A$. This function can be interpreted as the minimum belief in $A$. By contrast, the plausibility function $pl^{\Theta}(A)$ represents the maximum belief in $A$, while the commonality function $q^{\Theta}(A)$ represents the sum of the masses allocated to the supersets of $A$. This last function is very useful, as shown below.

3.1.2. Combination rule

We consider $M$ different experts who each give a mass $m_i^{\Theta}$ ($i = 1, \dots, M$) on each element $A \subseteq \Theta$. It is possible to combine these masses using combination rules. There are several combination rules [36] in the literature, which differ in how they address conflicts between sources. The most common rule is the conjunctive combination [37], where the resultant mass of $A$ is obtained by:

$$m^{\Theta}(A) = \sum_{B_1 \cap \dots \cap B_M = A \neq \emptyset}\ \prod_{i=1}^{M} m_i^{\Theta}(B_i), \quad \forall A \in 2^{\Theta}. \tag{12}$$

The mass of the empty set is given by:

$$m^{\Theta}(\emptyset) = \sum_{B_1 \cap \dots \cap B_M = \emptyset}\ \prod_{i=1}^{M} m_i^{\Theta}(B_i). \tag{13}$$

This rule allows us to stay in the open world. However, it is impractical to apply directly because the calculations are cumbersome. It is possible to calculate the resultant mass with commonality functions: each mass $m_i^{\Theta}$ ($i = 1, \dots, M$) is converted into its commonality function and the resultant commonality function is calculated as follows:

$$q^{\Theta}(A) = \prod_{i=1}^{M} q_i^{\Theta}(A). \tag{14}$$

The final mass is obtained by carrying out the inverse operation of Equation (11) [9].

3.1.3.
Pignistic probability

To make a decision on Θ, several operators exist, such as maximum credibility or maximum plausibility, the pignistic probability being the most commonly used [38]. This name comes from pignus, a bet in Latin. This operator approximates the pair (bel, pl) by uniformly sharing the mass of each focal element among its singletons $C_i$. This operator is defined by:

$$betP(C_i) = \sum_{A \subset \Theta,\, C_i \in A} \frac{m^{\Theta}(A)}{|A|\,(1 - m^{\Theta}(\emptyset))}, \tag{15}$$

where $|A|$ represents the cardinality of $A$. We choose the decision $C_i$ by evaluating $\max_{1 \leq k \leq n} betP(C_k)$.

3.2. Continuous belief functions

The basic description of continuous belief functions was given by Shafer [9], then by Nguyen [39] and Strat [40]. More recently, Smets [41] extended the definition of belief functions to the extended set of reals $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$, where masses are only attributed to intervals of $\overline{\mathbb{R}}$.

3.2.1. Definitions

Let us consider $\mathcal{I} = \{[x,y], (x,y], [x,y), (x,y);\ x, y \in \mathbb{R}\}$, the set of closed, half-open and open intervals of $\mathbb{R}$. Focal elements are closed intervals of $\mathbb{R}$. The quantities $m^{\mathcal{I}}(x,y)$ are basic belief densities linked to a specific pdf. If $x > y$, then $m^{\mathcal{I}}(x,y) = 0$. With these definitions, it is possible to define the same functions as in the discrete case. For an interval $[a,b]$ of $\mathbb{R}$ with $a \leq b$, the previous functions are defined as follows:

$$bel^{\mathbb{R}}([a,b]) = \int_{x=a}^{x=b}\int_{y=x}^{y=b} m^{\mathcal{I}}(x,y)\,dy\,dx, \tag{16}$$

$$pl^{\mathbb{R}}([a,b]) = \int_{x=-\infty}^{x=b}\int_{y=\max(a,x)}^{y=+\infty} m^{\mathcal{I}}(x,y)\,dy\,dx, \tag{17}$$

$$q^{\mathbb{R}}([a,b]) = \int_{x=-\infty}^{x=a}\int_{y=b}^{y=+\infty} m^{\mathcal{I}}(x,y)\,dy\,dx. \tag{18}$$

3.2.2. Pignistic probability

The definition of the pignistic probability for $a < b$ is:

$$Betf([a,b]) = \int_{x=-\infty}^{x=+\infty}\int_{y=x}^{y=+\infty} \frac{|[a,b] \cap [x,y]|}{|[x,y]|}\, m^{\mathcal{I}}(x,y)\,dy\,dx. \tag{19}$$

Pignistic probabilities can thus be calculated from basic belief densities. However, many basic belief densities exist for the same pignistic probability. To resolve this issue, we can use the consonant basic belief density. A basic belief density is said to be "consonant" when its focal elements are nested. Focal elements $I_u$ can be labeled by an index $u$ such that $I_u \subseteq I_{u'}$ with $u' > u$.
This definition is used to apply the least commitment principle, which consists in choosing the least informative belief function when a belief function is not totally defined and is only known to belong to a family of functions. The least commitment principle relies on an order relation between belief functions in order to determine whether a belief function is more or less committed than another. For example, it is possible to define an order based on the commonality function:

$$\left(\forall A \subseteq \Theta,\ q_1^{\Theta}(A) \leq q_2^{\Theta}(A)\right) \Leftrightarrow \left(m_1^{\Theta} \subseteq_q m_2^{\Theta}\right). \tag{20}$$

The mass function $m_2^{\Theta}$ is then less committed than $m_1^{\Theta}$ according to the commonality function.

The function $Betf$ can be induced by a set of isopignistic belief functions $\mathcal{B}iso(Betf)$. Many papers [41, 42, 43] deal with the particular case of continuous belief functions with nested focal elements. For example, Smets [41] proved that the least committed basic belief assignment $m^{\mathbb{R}}$ for the commonality ordering attributed to an interval $I = [x,y]$ with $y > x$, related to a bell-shaped¹ pignistic probability function, is determined by²:

$$m^{\mathbb{R}}([x,y]) = \theta(y)\,\delta_d(x - \zeta(y)), \tag{21}$$

with $x = \zeta(y)$ satisfying $Betf(\zeta(y)) = Betf(y)$ and $\theta(y)$ given by:

$$\theta(y) = (\zeta(y) - y)\,\frac{dBetf(y)}{dy}. \tag{22}$$

The resultant basic belief assignment $m^{\mathbb{R}}$ is consonant and belongs to the set $\mathcal{B}iso(Betf)$. However, it is difficult to build belief functions in the particular case of multimodal pdfs because the frame of discernment has connected sets. In [44, 45], the authors propose a way to build belief functions with connected sets by using a credal measure and an index function.

3.3. Credal measure and index function

In [44], the authors propose a way to calculate belief functions from any probability density function. They use an index function $f^I$ and a specific index space $I$ to scan the set of focal elements $\mathcal{F}$:

$$f^I : I \to \mathcal{F}, \tag{23}$$

$$y \mapsto f^I(y). \tag{24}$$

The authors introduce a positive measure $\mu^{\Omega}$ such that $\int_I d\mu^{\Omega}(y) \leq 1$, which describes unconnected sets. The pair $(f^I, \mu^{\Omega})$ defines a belief function. For all $A \in \Omega$, they define subsets which belong to the Borel set:

$$F_{\subseteq A} = \{y \in I \mid f^I(y) \subseteq A\}, \tag{25}$$

$$F_{\cap A} = \{y \in I \mid f^I(y) \cap A \neq \emptyset\}, \tag{26}$$

$$F_{\supseteq A} = \{y \in I \mid f^I(y) \supseteq A\}. \tag{27}$$
From these definitions, they compute belief functions:

$$bel^{\Omega}(A) = \int_{F_{\subseteq A}} d\mu^{\Omega}(y), \tag{28}$$

$$pl^{\Omega}(A) = \int_{F_{\cap A}} d\mu^{\Omega}(y), \tag{29}$$

$$q^{\Omega}(A) = \int_{F_{\supseteq A}} d\mu^{\Omega}(y). \tag{30}$$

They continue their study by considering consonant belief functions. The set of focal elements $\mathcal{F}$ must be ordered by the operator ⊆. They define an index function $f$ from $\mathbb{R}^+$ to $\mathcal{F}$ such that:

$$y \geq x \implies f(y) \subseteq f(x). \tag{31}$$

They generate consonant sets by using a continuous function $g$ from $\mathbb{R}^d$ to $I = [0, \alpha_{max}[$. The α-cuts are the sets:

$$f^I_{cs}(\alpha) = \{x \in \mathbb{R}^d \mid g(x) \geq \alpha\}. \tag{32}$$

They finally define the index function:

$$f^I_{cs} : I = [0, \alpha_{max}] \to \{f^I_{cs}(\alpha) \mid \alpha \in I\}, \tag{33}$$

$$\alpha \mapsto f^I_{cs}(\alpha). \tag{34}$$

¹ i.e. the pdf is unimodal with a mode µ, continuous and strictly monotonically increasing (decreasing) to the left (right) of the mode.
² $\delta_d$ refers to the Dirac measure.

The information available is the conditional pignistic density $Betf$. However, many basic belief densities exist for the same pignistic probability $Betf$. The least commitment principle allows the least informative basic belief density to be chosen, where the focal elements are the α-cuts of $Betf$ such that³:

$$d\mu^{\Omega}(\alpha) = \lambda(f^I_{cs}(\alpha))\,d\lambda(\alpha). \tag{35}$$

4. Continuous belief functions and α-stable distribution

In this section, we model data distributions as a single α-stable distribution. We must however introduce the notion of plausibility function for an α-stable distribution. We first describe how to calculate the plausibility function knowing the pdf in $\mathbb{R}$ before extending it to $\mathbb{R}^d$. We finally explain how we construct our belief classifier.

4.1. Link between pignistic probability function and plausibility function in R

The information available is the conditional pignistic density $Betf[C_i]$ with $C_i \in \Theta$. The function $Betf[C_i]$ is supposed to be bell-shaped for all α-stable distributions (proved by Yamazato [46]).

The plausibility function from a mass $m^{\mathbb{R}}$ is obtained by integrating Equation (22) over $[x, +\infty[$:

$$pl^{\mathbb{R}}[C_i](I) = \int_{x}^{+\infty} (\zeta(t) - t)\,\frac{dBetf(t)}{dt}\,dt. \tag{36}$$

By assuming that $Betf$ is symmetrical, an integration by parts simplifies Equation (36):

$$pl^{\mathbb{R}}[C_i](I) = 2(x - \mu)\,Betf(x) + 2\int_{x}^{+\infty} Betf(t)\,dt. \tag{37}$$
Now let us consider the particular case where the symmetrical $Betf$ is an α-stable distribution (the parameter β = 0). We already know that $Betf(x) = f_{S_\alpha(\beta,\gamma,\delta)}(x)$. We can calculate $\int_x^{+\infty} Betf(t)\,dt$ by using Chasles' relation:

$$\int_{-\infty}^{+\infty} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt = \int_{-\infty}^{x} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt + \int_{x}^{+\infty} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt. \tag{38}$$

By definition, a pdf satisfies $\int_{-\infty}^{+\infty} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt = 1$, and $\int_{-\infty}^{x} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt$ is the definition of the α-stable cumulative distribution function $F_{S_\alpha(\beta,\gamma,\delta)}$.

Consequently, the plausibility function of the symmetrical case, $2(x-\mu)Betf(x) + 2\int_x^{+\infty}Betf(t)\,dt$, can be simplified:

$$pl^{\mathbb{R}}[C_i](I) = 2(x - \mu)\,f_{S_\alpha(\beta,\gamma,\delta)}(x) + 2\left(1 - F_{S_\alpha(\beta,\gamma,\delta)}(x)\right). \tag{39}$$

The plausibility function related to an interval $I = [x,y]$ can also be seen as the area defined under the $\alpha_{cut}$-cut such that $\alpha_{cut} = Betf(x)$. The notation $pl[C_i](x)$ is equivalent to $pl[C_i](I)$.

Now let us consider an asymmetric α-stable probability density function. We must proceed numerically to calculate the plausibility function at point $x_1 > \mu$, with µ the mode of the probability density function (Figure 2). The plausibility function related to an interval $I_1 = [x_1, y_1]$ is defined by the area under the $\alpha_{cut}$-cut such that $\alpha_{cut} = Betf(x_1)$:

$$pl^{\mathbb{R}}[C_i](I_1) = \int_{-\infty}^{x_1} Betf(t)\,dt + (y_1 - x_1)\,Betf(x_1) + \int_{y_1}^{+\infty} Betf(t)\,dt. \tag{40}$$

By definition, a pdf satisfies $\int_{y_1}^{+\infty} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt = 1 - F_{S_\alpha(\beta,\gamma,\delta)}(y_1)$ and $\int_{-\infty}^{x_1} f_{S_\alpha(\beta,\gamma,\delta)}(t)\,dt = F_{S_\alpha(\beta,\gamma,\delta)}(x_1)$. In general, we know only one point $y_1$. We estimate numerically $x_1$ such that $f_{S_\alpha(\beta,\gamma,\delta)}(y_1) = f_{S_\alpha(\beta,\gamma,\delta)}(x_1)$. Finally, the plausibility function related to the interval $I_1$ is:

$$pl^{\mathbb{R}}[C_i](I_1) = 1 + F_{S_\alpha(\beta,\gamma,\delta)}(x_1) - F_{S_\alpha(\beta,\gamma,\delta)}(y_1) + (y_1 - x_1)\,f_{S_\alpha(\beta,\gamma,\delta)}(x_1). \tag{41}$$

³ λ refers to the Lebesgue measure.

In practice, we base the classification on several features defined for different classes $\Theta = \{C_1, \dots, C_n\}$. For example, the Haralick parameters "contrast" and "homogeneity" have different values for the classes rock, silt and sand. The features are modeled by a probability density function because they take continuous values.
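As an illustration of Equation (39), the sketch below evaluates the plausibility in the one symmetric case where both the pdf and the cumulative distribution function have simple expressions, the Gaussian case (α = 2). Under the parameterization of Equation (2), $S_2(0,\gamma,\delta)$ is a Gaussian law with mean δ and standard deviation $\gamma\sqrt{2}$. The function names are hypothetical, and folding a point located to the left of the mode onto its mirror image is an assumption that relies on the symmetry of the pdf.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def plausibility_symmetric(x, mean, std):
    """Equation (39): pl = 2 (x - mu) f(x) + 2 (1 - F(x)) for x >= mu.

    Assumption: a point on the left of the mode is mirrored to the right,
    which is legitimate only because the pdf is symmetrical about mu.
    """
    x_right = mean + abs(x - mean)
    area_under_cut = 2.0 * (x_right - mean) * gaussian_pdf(x_right, mean, std)
    tails = 2.0 * (1.0 - gaussian_cdf(x_right, mean, std))
    return area_under_cut + tails


# Stable parameters (alpha = 2, beta = 0, gamma, delta) mapped to (mean, std).
gamma, delta = 1.0, 0.0
std = gamma * math.sqrt(2.0)
for x in (delta, delta + std, delta + 3.0 * std):
    print(x, plausibility_symmetric(x, delta, std))
```

The plausibility equals 1 at the mode and decreases as the point moves away from it, which matches the interpretation of Equation (39) as the area under the cut plus the two tails.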
We can calculate a plausibility function related to each probability density function by using the least commitment principle. Several plausibility functions associated with the same feature can be combined by using the generalized Bayes theorem [47, 48] to calculate the mass function allocated to a set $A$ of classes for an interval $I$:

$$m^{\mathbb{R}}[x](A) = \prod_{C_j \in A} pl_j(x) \prod_{C_j \in A^c} \left(1 - pl_j(x)\right). \tag{42}$$

For several features, it is possible to combine the mass functions with combination rules.

To validate our approach, we classify planes using kinematic data as in [42, 43] and compare the decision with the approach of Caron et al. [43]. We associate a Gaussian probability density function with the speed of each target (Figure 3):

• Commercial, defined by the probability density function of speed $S_2(0, 8, 722.5)$.
• Bomber, defined by the probability density function of speed $S_2(0, 7, 690)$.
• Fighter, defined by the probability density function of speed $S_2(0, 10, 730)$.

We can observe that the decision is the same with both approaches in the particular case of Gaussian probability density functions.

4.2. Link between pignistic probability function and plausibility function in R^d

It is possible to extend the plausibility function to $\mathbb{R}^d$. In [43], the authors calculate the plausibility function for a Gaussian pdf of mode δ and covariance matrix Σ. The mass function is built in such a way that the isoprobability surfaces $S_i$ with $1 \leq i \leq n$ are focal elements. In $\mathbb{R}^2$, the isoprobability points of a multivariate Gaussian pdf of mean δ and covariance matrix Σ are ellipses. In dimension $d$, the authors defined focal elements as the nested sets $HV_\alpha$ enclosed by the isoprobability hyperconics $HC_\alpha = \{x \in \mathbb{R}^d \mid (x-\delta)^t\Sigma^{-1}(x-\delta) = \alpha\}$. They obtain the basic belief density (bbd) by applying the least commitment principle:

$$m^{\mathbb{R}^d}(HV_\alpha) = \frac{\alpha^{\frac{d+2}{2}-1}}{2^{\frac{d+2}{2}}\,\Gamma\left(\frac{d+2}{2}\right)}\exp\left(-\frac{\alpha}{2}\right) \quad \text{with } \alpha \geq 0. \tag{43}$$

Equation (43) defines a χ² distribution with $d+2$ degrees of freedom. The plausibility function at a point $x$ belonging to a surface $S_i$ corresponds to the volume delimited by $S_i$ (we give an example in Figure 4 for an α-stable pdf). Consequently, the plausibility function at point $x$ is defined by:

$$pl^{\mathbb{R}^d}(x \in \mathbb{R}^d) = \int_{\alpha=(x-\delta)^t\Sigma^{-1}(x-\delta)}^{\alpha=+\infty} \frac{\alpha^{\frac{d+2}{2}-1}}{2^{\frac{d+2}{2}}\,\Gamma\left(\frac{d+2}{2}\right)}\exp\left(-\frac{\alpha}{2}\right)d\alpha. \tag{44}$$
Equation (44) can be simplified to:

$$pl^{\mathbb{R}^d}(x \in \mathbb{R}^d) = 1 - F_{d+2}\left((x-\delta)^t\,\Sigma^{-1}(x-\delta)\right). \tag{45}$$

The function $F_{d+2}$ is the cumulative distribution function of the χ² distribution with $d+2$ degrees of freedom ($d$ is the dimension of the vector $x$).

The authors also calculate plausibility functions in the particular case of GMMs, with their pdfs denoted by $p_{GMM}$:

$$p_{GMM}(x) = \sum_{k=1}^{n} w_k\,\mathcal{N}(x, \delta_k, \Sigma_k), \tag{46}$$

where $\mathcal{N}(x, \delta_k, \Sigma_k)$ is the normal distribution of the $k$-th component with mean $\delta_k$ and covariance matrix $\Sigma_k$, and $w_k$ is the weight of this component in the mixture. They assign belief to nested sets belonging to a component. Consequently, the plausibility function of the GMM can be seen as a weighted sum of the plausibility functions defined by each component of the mixture:

$$pl^{\mathbb{R}^d}(x \in \mathbb{R}^d) = 1 - \sum_{k=1}^{n} w_k\,F_{d+2}\left((x-\delta_k)^t\,\Sigma_k^{-1}(x-\delta_k)\right). \tag{47}$$

We want to extend the calculation of the plausibility function to any α-stable pdf. However, there is no closed-form expression for α-stable pdfs. We therefore use the approach developed by Doré et al. [44] to build belief functions.

4.3. Belief classifier

Let us consider a data set with $N$ samples from $d$ sensors. For example, each feature of a vector $x \in \mathbb{R}^d$ can be seen as a piece of information from a sensor. The classification is divided into two steps. $N \times p$ samples (with $p \in\, ]0,1[$) are first picked out randomly from the data set for the learning base, denoted $\mathbf{X}$, such that:

$$\mathbf{X} = \begin{pmatrix} x_1^{1} & \cdots & x_1^{N \times p} \\ \vdots & \ddots & \vdots \\ x_d^{1} & \cdots & x_d^{N \times p} \end{pmatrix}$$

Each column of $\mathbf{X}$ belongs to a class $C_i$ with $1 \leq i \leq n$. Probability density functions (Gaussian or α-stable models) are estimated from the samples belonging to each class. The rest of the samples is used for the test base ($N \times (1-p)$ vectors). We use a validation test to determine whether the samples belong to an estimated model. If the test is not valid, we stop the classification; otherwise we continue the classification (Figure 5). The classification step diagram in one and $d$ dimensions is shown in Figure 6.
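The learning step described above can be sketched for one feature of one class as follows. The example draws a random learning base of size $N \times p$, fits a Gaussian model on it and measures the Kolmogorov-Smirnov distance between the test base and the fitted model. The K-S test is the one named in the abstract, but the exact form of the validation test, the choice of the samples it is applied to and the threshold are assumptions here, as are the function names and the synthetic data. The value $1.36/\sqrt{n}$ is the usual asymptotic 5% critical value of the K-S statistic.

```python
import math
import random
import statistics


def split_learning_test(samples, p, rng):
    """Randomly pick round(N * p) samples for the learning base; the rest is the test base."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_learn = int(round(len(shuffled) * p))
    return shuffled[:n_learn], shuffled[n_learn:]


def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ks_statistic(samples, cdf):
    """Largest gap between the empirical cdf of the samples and a model cdf."""
    xs = sorted(samples)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        model = cdf(x)
        # The empirical cdf jumps from i/n to (i+1)/n at x: check both sides.
        distance = max(distance, (i + 1) / n - model, model - i / n)
    return distance


# Synthetic one-dimensional feature for a single class (illustrative values).
rng = random.Random(1)
data = [rng.gauss(3.0, 2.0) for _ in range(1000)]

learning, test = split_learning_test(data, 0.5, rng)
mean_hat = statistics.mean(learning)
std_hat = statistics.stdev(learning)

d_test = ks_statistic(test, lambda x: gaussian_cdf(x, mean_hat, std_hat))
threshold = 1.36 / math.sqrt(len(test))  # asymptotic 5% critical value
print(len(learning), len(test), d_test, d_test <= threshold)
```

For an α-stable model, only `gaussian_cdf` would change: the model cdf would have to be evaluated numerically, since no closed form exists.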
Plausibility functions knowing the classes are calculated at $x$ from the probability density functions, either by using Equation (41) for an asymmetric α-stable probability density function or by using Equation (45) for a Gaussian probability density function. We obtain $d$ mass functions (one for each feature) at point $x$ with the generalized Bayes theorem [47, 48] (Equation (42)). These $d$ mass functions are combined by a combination rule to obtain a single mass function (Section 3.1.2). Finally, the mass function is transformed into a pignistic probability (Equation (19)) to make the decision.

It is also possible to work with a vector $x$ of $d$ dimensions. For an α-stable probability density function, plausibility functions are calculated by using the approach of Doré et al. [44] (Section 3.3). For a Gaussian probability density function, we use Equation (45) to calculate plausibility functions. We calculate one mass function at point $x$ by using the generalized Bayes theorem (Equation (42)). There is no combination step because we calculate a single mass function. We use Equation (19) to transform the mass function into pignistic probabilities. The decision is the class with the maximum pignistic probability.

5. Application to pattern recognition

The aim is to build a belief classifier and to perform a classification of synthetic and real data by estimating features using a Gaussian and an α-stable pdf. In [12], synthetic data are classified by modeling features using Gaussian and α-stable mixture models. By observing confidence intervals, the hypothesis of α-stable mixture models is significantly better than the hypothesis of Gaussian mixture models. However, when the number of Gaussian distributions increases, the classification accuracies are no longer significantly different. Images from a side-scan sonar are automatically classified by extracting Haralick features [49, 50]. However, the classification accuracies are roughly the same.

In this section, we limit the algorithms to a vector of features of dimension $d \leq 2$.
Indeed, the generalization of α-stable pdfs to dimension $d > 2$ increases CPU time because there is no closed-form expression. Consequently, we distinguish two cases during the classification step:

• The one-dimension case: each dimension is considered as a feature.
• The two-dimension case: we consider a vector of two features.
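The decision chain of the one-dimension case can be sketched as follows: a mass function is built for each feature with the generalized Bayes theorem (Equation (42)), the mass functions are combined with the conjunctive rule (Equations (12) and (13)), and the result is turned into pignistic probabilities. Since the masses live on subsets of Θ, the sketch uses the discrete form of the pignistic transformation (Equation (15)). The plausibility values are made up for the illustration and the function names are hypothetical; in the classifier they would come from Equation (41) or Equation (45).

```python
from itertools import combinations


def powerset(classes):
    items = list(classes)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def gbt_mass(pl):
    """Equation (42): m(A) = prod_{C in A} pl[C] * prod_{C not in A} (1 - pl[C])."""
    masses = {}
    for subset in powerset(pl):
        mass = 1.0
        for cls, value in pl.items():
            mass *= value if cls in subset else 1.0 - value
        masses[subset] = mass
    return masses


def conjunctive(m1, m2):
    """Equations (12)-(13): unnormalized conjunctive rule for two sources."""
    combined = {}
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            combined[a & b] = combined.get(a & b, 0.0) + mass_a * mass_b
    return combined


def pignistic(masses, classes):
    """Equation (15): share each mass uniformly among the classes of its focal element."""
    conflict = masses.get(frozenset(), 0.0)
    bet = {cls: 0.0 for cls in classes}
    for subset, mass in masses.items():
        for cls in subset:
            bet[cls] += mass / (len(subset) * (1.0 - conflict))
    return bet


# Made-up plausibilities of one test sample for two features (illustrative only).
pl_feature_1 = {"rock": 0.9, "silt": 0.4, "sand": 0.1}
pl_feature_2 = {"rock": 0.7, "silt": 0.6, "sand": 0.2}

combined = conjunctive(gbt_mass(pl_feature_1), gbt_mass(pl_feature_2))
bet = pignistic(combined, pl_feature_1)
decision = max(bet, key=bet.get)
print(decision, bet)
```

In the two-dimension case the same functions apply, but `gbt_mass` is called once on the plausibilities of the feature vector and `conjunctive` is skipped.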