ebook img

Detection of Periodicity Based on Independence Tests - II. Improved Serial Independence Measure PDF

0.48 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Detection of Periodicity Based on Independence Tests - II. Improved Serial Independence Measure

MNRAS000,1–4(0000) Preprint7January2016 CompiledusingMNRASLATEXstylefilev3.0 Detection of Periodicity Based on Independence Tests – II. Improved Serial Independence Measure Shay Zucker⋆ Dept.ofGeosciences,RaymondandBeverlySacklerFacultyofExactSciences,Tel-AvivUniversity,TelAviv6997801,Israel 6 7January2016 1 0 2 ABSTRACT n a We introducean improvementto a periodicitymetric we have introducedin a previous J paper.WeimproveontheHoeffding-testperiodicitymetric,usingtheBlum-Kiefer-Rosenblatt 6 (BKR) test. Besides a consistent improvement over the Hoeffding-test approach, the BKR approachturnsoutto performsuperblywhenapplied to veryshorttime series of sawtooth- ] likeshapes.TheexpectedastronomicalimplicationsaremuchmoredetectionsofRR-Lyrae M starsandCepheidsinsparsephotometricdatabases,andofeccentricKeplerianradial-velocity I (RV)curves,suchasthoseofexoplanetsinRVsurveys. . h Key words: methods:data analysis – methods:statistical – binaries:spectroscopic– stars: p individual:HIP101453–stars:variables:RRLyrae–stars:variables:Cepheids - o r t s a [ 1 INTRODUCTION 2 BLUM-KIEFER-ROSENBLATTTEST 1 Inarecentpaper(Zucker2015,;hereafterPaperI)wehaveintro- ThebestperformingmethodinPaperIwasbasedontheHoeffd- v duced anew non-parametric approach to thedetection of period- ing test. Wassily Hoeffding first proposed it in 1948 (Hoeffding 5 icities in sparse datasets. The new approach follows the logic of 1948)asatestofindependencebetweentworandomvariables.Es- 2 string-lengthtechniques(e.g.Lafler&Kinman1965;Clarke2002) sentially it estimates a measure of deviation of the joint empiri- 2 and quantifies the dependence between consecutive phase-folded caldistributionfunctionfromadistributionthatassumesindepen- 1 samples,foreverytrialperiod.InPaperIwehaveshownthatusu- dence. Themeasureofdeviation that Hoeffdingusedwastheso- 0 ally the classical string-length techniques effectively test solely called Cramér–von Mises criterion for distance between distribu- . 1 forlineardependence betweenconsecutivesamples.Ontheother tions(Cramér1928;vonMises1928). 0 hand, the Hoeffding-test approach we have presented there, tests LetusdenotebyG1andG2thecumulativedistributionfunc- 6 1 for general dependencies, not necessarily linear. PaperI showed tionsofthetworandomvariables,andbyG12 theirjointcumula- thatfortwokindsofperiodicsignals(sawtoothsignalandeccen- tivedistributionfunction.Then,independenceofthetwovariables : v tricspectroscopicbinary(SB)radial-velocity(RV)curve),ourpro- would mean G12=G1G2. Applying Cramér–von Mises criterion i posednewapproach performedbetterthantheconventional tech- fordistancebetweendistributions,Hoeffdingdefinedhisnamesake X niques. statisticDby: r a weemInbsaprikreeddobnyathweidseurccsetusdsfyultoscimomuleatuiopnwsipthrenseewnteadndinimPparpoevreId, D=Z (G12−G1G2)2dG12. (1) periodicitymetrics,similarlybasedondependencemeasures.The InestimatingDusingtheempriricaldata,weusetheempiri- currentLetterrepresentsafirststepinthatdirection,astepwehave caldistributionfunctions,determinedonlybytheobservedvalues. alreadyalludedtointheDiscussionofPaperI. ThisissomewhatreminiscentoftheKolmogorov–Smirnovphilos- Section2describesthedetailsofthemodificationwepropose ophy,whichispopularamongastronomers(e.g.Babu&Feigelson totheHoeffding test,Section3presents thesimulationswehave 2006).Theabovedefinitioneventuallyresultsintheformulaepre- performed in order to test its performance, and in Section 4 we sentedinPaperI. show a test of our new metric on real-life data. In Section 5 we Blum,Kiefer&Rosenblatt (1961) introduced a new version discussourfindings. oftheHoeffdingtest.Theyshowedthatthetwotestswereequiva- lentinlargesamples,butthenewtestwaseasiertocompute,and also more naturally amenable to generalization to more than two variables.Thefundamentaldefinitionoftheirstatisticis: B=Z (G12−G1G2)2dG1dG2. (2) ⋆ E-mail:[email protected] Whilethedifferenceseemstobeminuteandmaybeeveninsignifi- (cid:13)c 0000TheAuthors 2 ShayZucker cant,thechangeintheresultingcomputingformulaisnotnegligi- Sinusoid Almost sinusoidal 1 1 fxboilel(d.ii=nFgo.1ll,Io.n.w.,ioNnrgd)ewPrhatpeoreermtIah,keleeintsduuersxedioerunerofltceeacltctshutelhaetpioohnradsseer-afraoefltdneerodtthadefafpteahcatbesdye detection fraction0.5 detection fraction0.5 0 0 bythearbitraryzero-phasechoice,wealsodefinexN+1≡x1.Let 10 20 30 40 50 10 20 30 40 50 N N us further denote by Ri the phase-folded rank values, such that Sawtooth Pulse ’Rb(bxoiiNtv=h,axrx1Niaj+tm≤e1e)rax=aninsak(nxt’hdNac,xtix,jx1+ai)1sias≤cthctheoxeui+nns1utms.maNfbloloerertsetthotevfhacaplytuacetilhr.isecLw(eexxtrijaus,ptxseajna+rocl1sue)onodfdfoeartfinhnwdeehrpetiahcniher- detection fraction0.15 detection fraction0.15 ders the whole procedure independent of the arbitrary choice of 010 20 30 40 50 010 20 30 40 50 phase. N N Eclipsing binary Spectroscopic binary B=NN−o4wi(cid:229)=Nw1e(Ncacni−deRfiiRnei+o1u)r2d,ependencemeasurebytheformul(a3:) detection fraction0.15 detection fraction0.15 0 0 which can easily be derived from eq. (5.2) in Blumetal. (1961). 10 20 30 40 50 10 20 30 40 50 N N Theaboveexpressionisclearlymuchsimplerthantheparallelex- pressionsineqs.9–12inPaperI. Figure1.Detectionfractionasafunctionofsamplesizefortimeserieswith SNRof3.Thedetectionfractionisestimatedbasedon100simulatedlight curves. Legend: empty circles, dashed line –Generalized Lomb–Scargle 3 PERFORMANCETESTING periodogram; empty squares, dashed line – von-Neumann Ratio; empty upward-pointingtriangles,dashedline–Hoeffdingtest;filledcircles,solid Followingthesamepath asinPaperI, weperformed simulations line–Blum-Kiefer-Rosenblatttest. inwhichwerandomly drew asparsesetofsamplingtimes,from a total baseline spanning 1000 timeunits (’days’). Then we used thosetimestosamplesomeperiodicfunctionwithaperiodoftwo Sinusoid Almost sinusoidal days,andaddedwhiteGaussiannoisewithaprescribedsignal-to- 1 1 wnwoaeivsteeeU,srteanectldilioikipen(sSitPnhNagepR-abe)pri.npIWa:rroseyaintcleuihsgsthwoetidedctafuholr,elvlaesolawmamneodedstseinsecitcnPeouanfsptosreiiirxcdIaSp,le,BwrsieaRowhdVaitcovcoeufturhdvn,eeccp.tiiudolensdes detection fraction0.05 detection fraction0.05 10 20 30 40 50 10 20 30 40 50 this time to use a much simpler and intuitive performance mea- N N Sawtooth Pulse sure:ineachtestedconfigurationofsignalshape,N andSNR,we 1 1 swtahimafersep’dqalyteutteaceniocnctueyinodtngeerdxfidratachtcthetliyaontnuisn’mp.atbWhneenreecodcofartlsrhceiemucltruaaktlneandtgoieowtnh1nse0pi−pne4erw–riio1ohdddi,caichtyhi−ttuyh1se,mowbbeeitttsrahtiicnsssacteiofnporgesr detection fraction0.05 detection fraction0.05 10 20 30 40 50 10 20 30 40 50 of 10−4day−1.Thus, counting thefractionof caseswiththecor- N N Eclipsing binary Spectroscopic binary rectperiodactualymeantafrequencyerrorwhichwassmallerthan 1 1 1Rm0oe−stre4inWcdbaelwya−tet1c-tho.eamsdtpa(inrBetKrdoRdu)thcpeeedripionedrifPcoiartpmyearmnIc,eetarincdoftaolsttohhiestoHBtohleeufmftdw-iKnogie-’fttereasr--t detection fraction0.05 detection fraction0.05 10 20 30 40 50 10 20 30 40 50 ditional’ techniques of Generalized Lomb-Scargle (Lomb 1976; N N Scargle 1982; Zechmeister&Kürster 2009) and von-Neumann ratio (vonNeumannetal. 1941) as a representative of the Figure2.Detection fraction as afunction ofsamplesize fortimeseries string-lengthtechniques(Clarke2002;Lafler&Kinman1965). withSNRof100.Thedetectionfractionisestimatedbasedon100simu- Fig.1examinesthedependenceofthedetectionperformance latedlightcurves.Legend:emptycircles,dashedline–GeneralizedLomb– on the number of samples in the time series, for low-SNR. We Scargle periodogram; emptysquares, dashed line –von-Neumann Ratio; held the SNR fixed at 3 while we varied the sample size N. The emptyupward-pointing triangles, dashedline–Hoeffding test;filledcir- main feature apparent from examining the Figure isthat the new cles,solidline–Blum-Kiefer-Rosenblatttest. BKRapproachconstitutesanimprovementovertheHoeffding-test approach weintroduced inPaperI.Incaseswherethetraditional techniques performed better (e.g., pulse-wave signal shape), they and eccentric SB RV curve). However, specifically in those two usuallyperformedalsobetterthanBKR,andviceversa. cases,anothertrendemerged:theBKRmethodseemedtohavean Fig.2presentsthesametestforthecaseofahighSNR.We exceedinglybetterperformancefordatasetswithveryfewsamples fixedtheSNRatavalueof100andrepeatedthesameexercise.In (N=10). thissituationtheBKRwasagainimprovingovertheHoeffding-test Inordertomakesurethisresultwasnotspuriousorevener- approach, whichalsomeant itperformed significantlybetter than roneous,wehavedecidedtoexaminethisregimemoreclosely.We the traditional approaches in the cases of sawtooth-shaped signal repeated the simulations with SNR=100 for a finer grid of N, MNRAS000,1–4(0000) ImprovedNon-ParametricPeriodicityDetection 3 Sinusoid Almost sinusoidal Original Time Series Phase−Folded Time Series 1 1 4 4 detection fraction0.5 detection fraction0.5 123 123 0 10 15 20 0 10 15 20 00 200 400 600 800 1000 00 0.5 1 1.5 2 N N Sawtooth Pulse Generalized Lomb−Scargle Periodogram von−Neumann Ratio 1 0.8 1 30 detection fraction0.5 detection fraction000...246 0.5 1200 0 0 10 15 20 10 15 20 0 0 N N 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Eclipsing binary Spectroscopic binary Hoeffding Test Blum−Kiefer−Rosenblatt detection fraction0000....2468 detection fraction0.15 000...000123 00..0001..5512 0 0 10 15 20 10 15 20 0 0 N N 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Figure3.Detection fraction asafunction ofsample sizefortimeseries Figure4.Anexampleoftheresultsofapplyingalltheexaminedperiodic- withSNRof100,focusingonsmallN.Thedetectionfractionisestimated itydetectionmethodstoasawtoothsimulatedtime-series,with9samples, basedon100simulatedlightcurves.Legend:emptycircles,dashedline– andaSNRof100.Theuppertwopanelsshowthetimeseriesanditsphase- GeneralizedLomb–Scargleperiodogram;emptysquares,dashedline–von- foldedversion, usingthecorrect period. Theotherpanels showtheperi- NeumannRatio;emptyupward-pointingtriangles,dashedline–Hoeffding odicitymetricscalculatedforthistimeseries,withselfexplanatorytitles. test;filledcircles,solidline–Blum-Kiefer-Rosenblatttest. Notethepoorperformanceofthefirstthreeperiodicitymetrics,compared tothedetectionbytheBKRtechnique. namely,forallNfrom7to20.Fig.3,whichfocusesonthisrange, showstheveryconvincingresultofagradualincreaseoftheper- foldingusingtheresultingperiod,theBKRfunctionandtheGLS formance inmost cases. Specificallyinthecases of thesawtooth periodogramwecalculatedforcomparison.Theprominenceofthe signalandtheeccentricSBRVcurve,Thepaceofthatincreasefor BKR peak at a frequency of 2.5696d−1 is evident, and indeed theBKRmethodismuchfasterthanthatoftheothertechniques, thecorrespondingSDEvalueis17.5.Thephase-foldedlightcurve includingtheHoeffding-testapproach.Itseemedthatalreadywhen demonstratesanobviousperiodicity.Ontheotherhand,TheGLS therewere9(!)samples,theBKRapproachhadverygoodchances periodogramshowsnohintofastatisticallysignificantperiodicity todetecttheperiodicity. detection. Figs.4,5and6showselectedconcreteexamplesofcaseswith To complete the picture, this star isa known RR-Lyrae star, asawtoothsignal shapeandonly 9samples, inwhichtheperfor- withaperiodof0.38918702d(Samusetal.2013),whichisconsis- mance of BKR was definitely superior over the three other tech- tentwithourresult–0.3891656d.TheHipparcoscataloguequotes niqueswetested.Thosecaseswereindeedthemajorityofthesim- the known period as taken from literature, and adds a comment ulatedcases. aboutthescarcityofthedata,whichcastsdoubtaboutthenatureof theperiodicity. Therefore, themaincataloguedoesnot quoteany period.WeshowherethatbyusingournewBKRapproach,wecan assignahighdegreeofcredibilitytotheperiodicity,evenwiththe 4 REAL-LIFETEST veryscarceHipparcosdata. We set out to test our new technique in a real-life situation that couldpotentiallyemphasizeitsadvantagesandallowthemtoma- terialize. To this end we chose to apply it to short Hipparcos 5 DISCUSSION lightcurves.OursampleconsistedofHipparcostargetswithatmost 30samplesintheirHipparcosEpochPhotometryAnnexentry.We This Letter presents an improvement of the serial Hoeffding-test consideredonlysampleswhichwerefullyacceptedbyatleastone periodicitymetricwehavepresentedinPaperI,basedontheBlum- of Hipparcos’ FAST and NDAC consortia (ESA 1997). In total Kiefer-RosenblattmodificationoftheoriginalHoeffdingtest.This therewere51lightcurvesmeetingthosecriteria. new periodicity metric consistently outperforms the Hoeffding- Wescannedtheselectedlightcurvesforperiodicityusingthe test metric. On top of it, it seems to perform superbly better in BKRtest,onafrequencygridrangingfrom10−4to12d−1(follow- sawtooth-like signal shapes, when thenumber of samples isvery ingKoen&Eyer(2002)),withstepsof10−4d−1.Wesearchedfor smallandtheSNRishigh.Thisresultisinlinewiththestatement targetswhose peak intheBKRfunction wassignificantlypromi- ofBlumetal.(1961),thattheirtestisasymptoticallyequivalentto nent. We therefore used the SDE (Signal Detection Efficiency) theHoeffdingtestforlargesamples. statistic originally used by Alcocketal. (2000) and Kovácsetal. This advantage might prove very important for radial- (2002). We chose to single out targets whose SDE statistic was velocity surveys searching for exoplanets. The detection of high- higher than15. Only one targetpassed thishurdle –HIP101453 eccentricity Keplerian RV curves is notoriously difficult when (alsoknownasCHAql). basedonasmallnumberofsamples(Cumming2004).Anothersit- Fig.7showsthelightcurveofHIP101453,aswellasitsphase uationinwhichthisfeaturewillprovevaluableisthedetectionof MNRAS000,1–4(0000) 4 ShayZucker Original Time Series Phase−Folded Time Series Original Time Series Phase−Folded Time Series 4 4 4 4 3 3 3 3 2 2 2 2 1 1 1 1 0 0 0 0 0 200 400 600 800 1000 0 0.5 1 1.5 2 0 200 400 600 800 1000 0 0.5 1 1.5 2 Generalized Lomb−Scargle Periodogram von−Neumann Ratio Generalized Lomb−Scargle Periodogram von−Neumann Ratio 1 30 1 40 30 20 0.5 0.5 20 10 10 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Hoeffding Test Blum−Kiefer−Rosenblatt Hoeffding Test Blum−Kiefer−Rosenblatt 0.03 0.2 0.04 0.2 0.15 0.03 0.15 0.02 0.1 0.02 0.1 0.01 0.05 0.01 0.05 0 0 0 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Figure5.Anotherexampleoftheresultsofapplyingalltheexaminedperi- Figure6.Anotherexampleoftheresultsofapplyingalltheexaminedperi- odicitydetectionmethodstoasawtoothsimulatedtime-series,with9sam- odicitydetectionmethodstoasawtoothsimulatedtime-series,with9sam- ples,andaSNRof100.Theuppertwopanels showthetimeseries and ples,andaSNRof100.Theuppertwopanels showthetimeseries and itsphase-folded version,usingthecorrectperiod. Theotherpanels show itsphase-folded version,usingthecorrectperiod. Theotherpanels show theperiodicitymetricscalculatedforthistimeseries,withselfexplanatory theperiodicitymetricscalculatedforthistimeseries,withselfexplanatory titles.Notethepoorperformanceofthefirstthreeperiodicitymetrics,com- titles.Notethepoorperformanceofthefirstthreeperiodicitymetrics,com- paredtothedetectionbytheBKRtechnique. paredtothedetectionbytheBKRtechnique. RR-LyraestarsandCepheids(whosesignalshapesarealsoessen- LombN.R.1976,Ap&SS,39,447 tiallysawtooth-likeshapes)insparsedatasetssuchasthoseofHip- SamusN.N.,DurlevichO.V.,Goranskij,V.P.,Kazarovets,E.V.,Kireeva, parcosandGaia(Eyeretal.2012).WehaveprovidedinSection4 N.N.,Pastukhova,E.N.,Zharova,A.V.2013,GeneralCatalogofVari- ademonstrationofthispotentialusingHipparcosdata,specifically ableStars,VizieronlinedatacatalogB/gcvs ScargleJ.D.1982,ApJ,263,835 forthecaseofHIP101453. von Mises R. 1928, Wahrscheinlichkeit, Statistik und Wahrheit, Julius Wecontinueourinvestigationofharnessingthepowerofnon- Springer,Vienna parametricindependencemeasuresforthepurposeofdetectingpe- vonNeumannJ.,KentR.H.,BellinsonH.R.,HartB.I.1941,Ann.Math. riodicityinextremecircumstancesofeitherpoorSNR,smallsam- Stat.,12,153 plesizes,ornon-sinusoidalsignalshapes.Inthemeanwhilewealso Zechmeister,M.,&Kürster,M.2009,A&A,496,577 applyournewlydevelopedapproachestoexistingdatasets,bothfor ZuckerS.2015,MNRAS,449,2723 thesakeoftesting,butalsofordetectingpreviouslymissedperiodic variables.Topromotefurtherresearchandtestingofthisperiodic- ThispaperhasbeentypesetfromaTEX/LATEXfilepreparedbytheauthor. ity metric by the community, we make it available online, in the formofaMATLABfunction1. REFERENCES AlcockC.etal.2000,ApJ,542,257 BabuG.J.,Feigelson,E.D.2006inGabrielC.,Arviset,C.,PonzD.,En- rique S.,eds, ASP Conf. Ser. Vol. 351, Astronomical Data Analysis SoftwareandSystemsXV.Astron.Soc.Pac.,SanFrancisco,p.127 BlumJ.R.,KieferJ.,RosenblattM.1961,Ann.Math.Stat.,32,485 ClarkeD.2002,A&A,386,763 CramérH.1928,Scand.Actuar.J.,11,13 Cumming,A.2004,MNRAS,354,1165 ESA1997,TheHipparcosandTychoCatalogues,ESASP-1200 Eyer,L.etal.2012,Ap&SS,341,207 HoeffdingW.1948,Ann.Math.Stat.,19,293 Koen,C.,&Eyer,L.2002,MNRAS,331,45 KovácsG.,ZuckerS.,MazehT.2002,A&A,391,369 LaflerJ.,&Kinman,T.D.1965,ApJS,11,216 1 TheURLfordownloadingaMATLABcodetocalculatetheBKRperi- odicitymetricishttp://www.tau.ac.il/˜shayz/BKR.m MNRAS000,1–4(0000) ImprovedNon-ParametricPeriodicityDetection 5 12.5 12.5 13 13 ag]13.5 13.5 m p [ 14 14 H 14.5 14.5 15 15 8000 8200 8400 8600 8800 −0.4 −0.2 0 0.2 0.4 0.6 0.8 BJD−2 440 000 phase−folded time [days] 0.5 0.8 0.4 0.6 0.3 B p 0.4 0.2 0.1 0.2 0 0 0 2 4 6 8 10 12 0 2 4 6 8 10 12 Frequency [d−1] Frequency [d−1] Figure 7. Detecting the periodicity in the Hipparcos lightcurve of HIP101453.Upperleft:theoriginallightcurve.Upperright:Thelightcurve phase-folded using aperiod of0.3891656d.Theemptycircles represent copiesoftheoriginaldatasetshiftedbackwardsandforwardsbyoneperiod inordertobettervisualizetheperiodicity.Lowerleft:TheBKRperiodicity metric.Notethesharppeakattheknownfrequency.Lowerright:General- izedLomb-Scargleperiodogram.Notetheabsenceofanysignificantpeak. MNRAS000,1–4(0000)

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.