Outlier detection for temporal data PDF

131 Pages·2014·9.64 MB·English

Preview Outlier detection for temporal data

Series ISSN: 2151-0067
Synthesis Lectures on Data Mining and Knowledge Discovery
Morgan & Claypool Publishers

Series Editors: Jiawei Han, University of Illinois at Urbana-Champaign; Lise Getoor, University of Maryland; Wei Wang, University of North Carolina, Chapel Hill; Johannes Gehrke, Cornell University; Robert Grossman, University of Illinois at Chicago

Outlier Detection for Temporal Data
Manish Gupta, Microsoft, India; Jing Gao, State University of New York, Buffalo; Charu Aggarwal, IBM, New York; Jiawei Han, University of Illinois at Urbana-Champaign

ABOUT SYNTHESIS
This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and Computer Science.
Synthesis Lectures provide concise, original presentations of important research and development topics, published quickly, in digital and print formats. For more information visit www.morganclaypool.com

ISBN: 978-1-62705-375-4
Morgan & Claypool Publishers

Outlier Detection for Temporal Data

Synthesis Lectures on Data Mining and Knowledge Discovery

Editors
Jiawei Han, University of Illinois at Urbana-Champaign
Lise Getoor, University of Maryland
Wei Wang, University of North Carolina, Chapel Hill
Johannes Gehrke, Cornell University
Robert Grossman, University of Chicago

Synthesis Lectures on Data Mining and Knowledge Discovery is edited by Jiawei Han, Lise Getoor, Wei Wang, Johannes Gehrke, and Robert Grossman. The series publishes 50- to 150-page publications on topics pertaining to data mining, web mining, text mining, and knowledge discovery, including tutorials and case studies. The scope will largely follow the purview of premier computer science conferences, such as KDD. Potential topics include, but are not limited to, data mining algorithms, innovative data mining applications, data mining systems, mining text, web and semi-structured data, high performance and parallel/distributed data mining, data mining standards, data mining and knowledge discovery framework and process, data mining foundations, mining data streams and sensor data, mining multi-media data, mining social networks and graph data, mining spatial and temporal data, pre-processing and post-processing in data mining, robust and scalable statistical methods, security, privacy, and adversarial data mining, visual data mining, visual analytics, and data visualization.

Outlier Detection for Temporal Data. Manish Gupta, Jing Gao, Charu Aggarwal, and Jiawei Han. 2014
Provenance Data in Social Media. Geoffrey Barbier, Zhuo Feng, Pritam Gundecha, and Huan Liu. 2013
Graph Mining: Laws, Tools, and Case Studies. D. Chakrabarti and C. Faloutsos. 2012
Mining Heterogeneous Information Networks: Principles and Methodologies. Yizhou Sun and Jiawei Han. 2012
Privacy in Social Networks. Elena Zheleva, Evimaria Terzi, and Lise Getoor. 2012
Community Detection and Mining in Social Media. Lei Tang and Huan Liu. 2010
Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions. Giovanni Seni and John F. Elder. 2010
Modeling and Data Mining in Blogosphere. Nitin Agarwal and Huan Liu. 2009

Copyright © 2014 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations in printed reviews, without the prior permission of the publisher.
Outlier Detection for Temporal Data
Manish Gupta, Jing Gao, Charu Aggarwal, and Jiawei Han
www.morganclaypool.com

ISBN: 9781627053754 paperback
ISBN: 9781627053761 ebook
DOI 10.2200/S00573ED1V01Y201403DMK008

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON DATA MINING AND KNOWLEDGE DISCOVERY
Lecture #8
Series Editors: Jiawei Han, University of Illinois at Urbana-Champaign; Lise Getoor, University of Maryland; Wei Wang, University of North Carolina, Chapel Hill; Johannes Gehrke, Cornell University; Robert Grossman, University of Chicago
Series ISSN: Print 2151-0067, Electronic 2151-0075

Outlier Detection for Temporal Data

Manish Gupta
Microsoft, India and International Institute of Technology–Hyderabad, India

Jing Gao
State University of New York, Buffalo, NY

Charu Aggarwal
IBM T.J. Watson Research Center, NY

Jiawei Han
University of Illinois at Urbana-Champaign, IL

SYNTHESIS LECTURES ON DATA MINING AND KNOWLEDGE DISCOVERY #8
Morgan & Claypool Publishers

ABSTRACT
Outlier (or anomaly) detection is a very broad field which has been studied in the context of a large number of research areas like statistics, data mining, sensor networks, environmental science, distributed systems, spatio-temporal mining, etc. Initial research in outlier detection focused on time series-based outliers (in statistics). Since then, outlier detection has been studied on a large variety of data types including high-dimensional data, uncertain data, stream data, network data, time series data, spatial data, and spatio-temporal data. While there have been many tutorials and surveys for general outlier detection, we focus on outlier detection for temporal data in this book.

A large number of applications generate temporal datasets. For example, in our everyday life, various kinds of records like credit, personnel, financial, judicial, medical, etc., are all temporal. This stresses the need for an organized and detailed study of outliers with respect to such temporal data. In the past decade, there has been a lot of research on various forms of temporal data including consecutive data snapshots, series of data snapshots and data streams. Besides the initial work on time series, researchers have focused on rich forms of data including multiple data streams, spatio-temporal data, network data, community distribution data, etc.

Compared to general outlier detection, techniques for temporal outlier detection are very different. In this book, we will present an organized picture of both recent and past research in temporal outlier detection. We start with the basics and then ramp up the reader to the main ideas in state-of-the-art outlier detection techniques. We motivate the importance of temporal outlier detection and brief the challenges beyond usual outlier detection. Then, we list down a taxonomy of proposed techniques for temporal outlier detection. Such techniques broadly include statistical techniques (like AR models, Markov models, histograms, neural networks), distance- and density-based approaches, grouping-based approaches (clustering, community detection), network-based approaches, and spatio-temporal outlier detection approaches. We summarize by presenting a wide collection of applications where temporal outlier detection techniques have been applied to discover interesting outliers.

KEYWORDS
temporal outlier detection, time series data, data streams, distributed data streams, temporal networks, spatiotemporal outliers

To my dear parents, Satyapal Gupta and Madhubala Gupta, and my cute loving wife Nidhi
– Manish Gupta

To my husband Lu, and my parents
– Jing Gao

To my wife Lata, and my daughter Sayani
– Charu Aggarwal

To my wife Dora, and my son Lawrence
– Jiawei Han

Description:
Outlier (or anomaly) detection is a very broad field which has been studied in the context of a large number of research areas like statistics, data mining, sensor networks, environmental science, distributed systems, spatio-temporal mining, etc. Initial research in outlier detection focused on time