Compression-Based Methods of Statistical Analysis and Prediction of Time Series PDF

153 Pages·2016·2.027 MB·English

Preview Compression-Based Methods of Statistical Analysis and Prediction of Time Series

Boris Ryabko · Jaakko Astola · Mikhail Malyutov
Compression-Based Methods of Statistical Analysis and Prediction of Time Series

Boris Ryabko
Inst. of Computational Technologies
Siberian Branch of the Russian Academy of Sciences
Novosibirsk, Russia

Jaakko Astola
Dept. of Signal Processing
Tampere University of Technology
Tampere, Finland

Mikhail Malyutov
Dept. of Mathematics
Northeastern University
Boston, MA, USA

ISBN 978-3-319-32251-3
ISBN 978-3-319-32253-7 (eBook)
DOI 10.1007/978-3-319-32253-7
Library of Congress Control Number: 2016940381
© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

Initially, in the 1960s, universal codes were developed for lossless data compression in their storing and transmission. Those codes can efficiently compress sequences generated by stationary and ergodic sources with unknown statistics. In the last twenty years, it was realized that universal codes can be used for solving many important problems of prediction and statistical analysis of time series. This book describes recent results in this area.

The first chapter of this book is mainly devoted to the application of universal codes to prediction and statistical analysis of time series. The applications of suggested statistical methods to cryptography are quite numerous, so they are described separately in Chap. 2. These two chapters were written by B. Ryabko and J. Astola.

The third chapter presents a sketch of the theory behind many applications of a simplified homogeneity test between literary texts based on universal compressors. In particular, this test can be used for authorship attribution if training texts written by different candidates are available. This chapter was written by M. Malyutov.

Boris Ryabko is partly supported by Russian Science Foundation, grant 14-14-00603 and by Russian Foundation for Basic Research, grant 15-29-07932.

Novosibirsk, Russia    B. Ryabko
Tampere, Finland    J. Astola
Boston, MA, USA    M. Malyutov
August, 2015

Contents

1 Statistical Methods Based on Universal Codes
  1.1 Introduction
  1.2 Definitions and Statements of the Problems
    1.2.1 Estimation and Prediction for I.I.D. Sources
    1.2.2 Consistent Estimations and On-Line Predictors for Markov and Stationary Ergodic Processes
    1.2.3 Hypothesis Testing
    1.2.4 Codes
  1.3 Finite Alphabet Processes
    1.3.1 The Estimation of (Limiting) Probabilities
    1.3.2 Prediction
    1.3.3 Problems with Side Information
    1.3.4 The Case of Several Independent Samples
  1.4 Hypothesis Testing
    1.4.1 Goodness-of-Fit or Identity Testing
    1.4.2 Testing for Serial Independence
  1.5 Examples of Hypothesis Testing
  1.6 Real-Valued Time Series
    1.6.1 Density Estimation and Its Application
    1.6.2 Example of Forecasting
  1.7 The Hypothesis Testing for Infinite Alphabet
  1.8 Conclusion
  Appendix
  References

2 Applications to Cryptography
  2.1 Introduction
  2.2 Data Compression Methods as a Basis for Randomness Testing
    2.2.1 Randomness Testing Based on Data Compression
    2.2.2 Randomness Testing Based on Universal Codes
  2.3 Two New Tests for Randomness and Two-Faced Processes
    2.3.1 The "Book Stack" Test
    2.3.2 The Order Test
    2.3.3 Two-Faced Processes and the Choice of the Block Length for a Process Testing
  2.4 The Experiments
  2.5 Analysis of Stream Ciphers and Random Number Generators Used in Cryptography
    2.5.1 The Distinguishing Attack on ZK-Crypt Cipher
    2.5.2 Analysis of the PRNG RC4
  2.6 A Statistical Attack on Block Ciphers
    2.6.1 Cryptanalysis of Block Ciphers
    2.6.2 Description of the Attack
    2.6.3 Variants and Optimizations
    2.6.4 Experiments with RC5
  Appendix
  References

3 SCOT-Modeling and Nonparametric Testing of Stationary Strings
  3.1 Introduction
  3.2 Theory of UC-Based Discrimination
    3.2.1 Homogeneity Testing with UC
    3.2.2 Preliminaries
    3.2.3 Main Results
    3.2.4 Sketch of AN Justification Under the Null-Hypothesis
    3.2.5 Preceding Work
  3.3 SCOT-Modeling
    3.3.1 m-MC Reduction to MC on A^m
    3.3.2 Counterexample
    3.3.3 1-MC Model Induced by SCOT
    3.3.4 Stationary Distribution for Model 1
    3.3.5 'Comb' Model D_m
    3.3.6 Models 2
  3.4 Limit Theorems for Additive Functions of SCOT Trajectories
    3.4.1 SCOT Models Admitting Continuous Time Limit
    3.4.2 Continuous Time Limit
    3.4.3 'Thorny' TH_{a,b} SCOT Model
    3.4.4 Asymptotic Normality for Additive Functions of m-MC Trajectories
    3.4.5 Asymptotic Expansion for Additive Functions
    3.4.6 Nonparametric Homogeneity Test
    3.4.7 Exponential Tails for Log-Likelihood Functions
    3.4.8 Application to the STI Analysis Under Colored Noise
  3.5 SCOT Homogeneity Test for Real Data
    3.5.1 NASDAQ Data
    3.5.2 Results for Three PC
    3.5.3 Results for Two PC
    3.5.4 Results for One PC
    3.5.5 Prediction Accuracy
    3.5.6 Follow-Up Analysis
    3.5.7 Comparison with GARCH
    3.5.8 GARCH and Epicycles
    3.5.9 Madison vs. Hamilton Discrimination of Styles
    3.5.10 Helium Emissions and Seismic Events
    3.5.11 Discussion and Conclusions
  3.6 UC-Based Authorship Attribution
    3.6.1 CCC- and CC-Statistics
    3.6.2 CCC-Sample Size Requirements
    3.6.3 Naive Explanation of CCC-Consistency on Toy Example
    3.6.4 Methodology
    3.6.5 Follow-Up Analysis of the Most Contributing Patterns
    3.6.6 Results
    3.6.7 Brief Survey of Micro-Stylometry Tools
    3.6.8 Attribution of Literary Texts
    3.6.9 Two Translations of Shakespeare Sonnets
    3.6.10 Two Books of Isaiah
    3.6.11 Two Novels of the Same Author
    3.6.12 Inhomogeneity of an Early Sholokhov's Novel
    3.6.13 Attribution of Federalist Papers
    3.6.14 The Shakespeare Controversy
    3.6.15 Amores et al. Versus Rape of Lucrece
    3.6.16 Hero and Leander Versus Its Continuation
    3.6.17 Comparison with Poems Chapman i, i = 1, 2, 3
  References
