SIGNAL IDENTIFICATION AND FORECASTING IN NONSTATIONARY
TIME SERIES DATA
By
DENG-SHAN SHIAU
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2001
Copyright 2001
by
Deng-Shan Shiau
I dedicate this work to my family.
ACKNOWLEDGMENTS
As chairman of my committee, Dr. Mark C. K. Yang guided me throughout
my dissertation. Without his encouragement and support, this work would never
have been completed. I extend to him my sincere gratitude and will remember
him always. I would also like to thank Dr. Pejaver V. Rao, Dr. Randy L. Carter,
Dr. Samuel S. Wu, and Dr. Panos M. Pardalos for serving on my dissertation
committee.
Dr. J. Chris Sackellares, director of the Brain Dynamics Laboratory at the
University of Florida, and Dr. Leon D. Iasemidis, associate professor at Arizona
State University, supported and encouraged me throughout my years at the
University of Florida. I offer my sincere thanks to them.
I wish to express my special thanks to my family: my wife Shu-Jen, for her
love and patience; and to our daughter, Marie, for being a glorious joy to us. I am
grateful to my parents and mother-in-law for their encouragement.
Finally, I would like to thank all my colleagues and friends for their
assistance.
TABLE OF CONTENTS
page
ACKNOWLEDGMENTS iv
LIST OF TABLES vii
LIST OF FIGURES viii
ABSTRACT x
CHAPTERS
1 INTRODUCTION 1
1.1 Research Scope in Time Series Analysis 1
1.2 Motivation and Purpose of Research 2
1.3 Outline 3
2 LITERATURE REVIEW 4
2.1 Quantification of Regularity for Time Series: Approximate Entropy (ApEn) 4
2.2 Analysis and Forecasting in Nonstationary Time Series 6
2.2.1 ARIMA Models 6
2.2.2 Trend Removing Process 7
2.2.3 ARARMA Models 8
2.2.4 Kalman Filter 9
2.2.5 Dynamical Stochastic Approximation Method 10
2.2.6 Adaptive Forecasting Method 12
2.2.7 Whittle Pseudo-Maximum Likelihood Estimation 13
2.3 Analysis of Wolf’s Sunspot Time Series 13
3 PATTERN MATCH REGULARITY SCORES: A COMPARISON WITH APPROXIMATE ENTROPY 16
3.1 Concept of Pattern Match 16
3.2 Pattern Match Regularity Scores 18
3.3 Example: A Comparison With ApEn 20
3.4 Summary and Discussion 30
4 PATTERN MATCH SIGNAL IDENTIFICATION (PMSI) ALGORITHM 32
4.1 Introduction 32
4.2 Pattern Match Signal Identification (PMSI) Algorithm 33
4.3 Analytic Proof for the Pattern Identifiability by PMSI Algorithm 38
4.4 Applications of PMSI Algorithm 67
4.4.1 Simulation Studies 67
4.4.2 Application on EEG Time Series 72
4.4.3 Application on Sunspot Time Series 76
4.5 Summary and Discussion 78
5 FORECASTING 81
5.1 Method 81
5.2 Forecasting in EEG Time Series 82
5.3 Forecasting in Wolf’s Sunspot Time Series 88
5.4 Summary and Discussion 92
6 CONCLUSION 94
REFERENCES 97
BIOGRAPHICAL SKETCH 100
LIST OF TABLES
Table page
4.1 Minimum required $p_e$ for identifying any pattern of embedded signals
for different selection of $l_1$ in each given $l$ and $a$ 59
4.2 Minimum required value of $p_e$ for identifying any pattern of embedded
signals with optimal selection of $l_1$ in Table 4.1 65
4.3 Results of the 20 simulations ($n = 40{,}000$ in each simulation) for testing
the pattern identifiability by the PMSI algorithm 71
4.4 Pattern identification analysis in a 30-minute EEG time series 73
4.5 Pattern identification analysis in smoothed sunspot time series 78
5.1 AR($l_1$) models fitted in the learning period of an EEG time series for
$l_1 = 5, 6, \ldots, 10$, where $u'_i = U_i - \mu$, and $\mu$ is the mean value of $U$ 83
5.2 Regression equations of the next 6 points regress on the previous $l_1$
points when the subsequences are pattern $l_1$-matched identified by
the PMSI algorithm in the learning period of an EEG time series $U$ 84
5.3 Average prediction errors for the next 6 points in the prediction period
of an EEG time series by regression equations, AR($l_1$) and optimal
AR models fitted in the learning period 87
5.4 Regression equations of the next sixth point regress on the previous $l_1$
points when the subsequences are pattern $l_1$-matched identified by
the PMSI algorithm in the learning period of the smoothed sunspot
time series $U$ 89
5.5 AR($l_1$) models fitted in the learning period of smoothed monthly sunspot
time series for $l_1 = 6, \ldots, 9$, where $u'_i = U_i - \mu$ and $\mu$ is the mean
value of $U$ 89
5.6 Average prediction errors for the next sixth point in the prediction
period of smoothed monthly sunspot time series by regression equations
and AR models fitted in the learning period 92
LIST OF FIGURES
Figure page
3.1 An example of good pattern match but not value match 18
3.2 MIX($p$) process for different values of $p$ 22
3.3 Normalized ApEn versus $p$ in the MIX($p$) process for different values
of $l$, given $r = 0.18$ 24
3.4 Normalized Score1 versus $p$ in the MIX($p$) process for different values
of $l$, given $r = 0.18$ 25
3.5 Normalized Score2 versus $p$ in the MIX($p$) process for different values
of $l$, given $r = 0.18$ 26
3.6 Normalized ApEn versus $p$ in the MIX($p$) process for different values
of $r$, given $l = 3$ 27
3.7 Normalized Score1 versus $p$ in the MIX($p$) process for different values
of $r$, given $l = 3$ 28
3.8 Normalized Score2 versus $p$ in the MIX($p$) process for different values
of $r$, given $l = 3$ 29
4.1 Examples ($l = 7$, $l_1 = 4$, $l_2 = 3$) for pattern $l_1$-matches 36
4.2 Estimated matching probabilities ($P_j$'s) with length $l = 5$ and $a =
0.0$ (white noise), $0.4$, $0.7$, $1.0$ (random walk) 57
4.3 Minimum required embedding probability $p_e$ for different combinations
of $l_1$ and $l_2$ for given length of signal $l = 7$ and $a$ 60
4.4 Minimum required embedding probability $p_e$ for different combinations
of $l_1$ and $l_2$ for given length of signal $l = 8$ and $a$ 60
4.5 Examples of estimating the required value of $p_e$ (given a pattern $pt_j$)
by plotting $p_e$ versus $TS_j - \max_i(TS_i)$ for $l = 5$ and $a = 0.4, 0.8$ in
AR(1) time series 62
4.6 Examples of estimating the required value of $p_e$ (given a pattern $pt_j$)
by plotting $p_e$ versus $TS_j - \max_i(TS_i)$ for $l = 7$ and $a = 0.4, 0.8$ in
AR(1) time series 63
4.7 Examples ($l = 6$, $a = 0, 0.4, 0.7, 1.0$) of the minimum required value
of $p_e$ for different patterns of embedded signals 64
4.8 Minimum required value of $p_e$ for identifying any pattern of embedded
signals, where length $l = 6, 7, 8, 9$ and $a = 0, 0.1, \ldots, 1.0$ 66
4.9 Four different patterns ($pt_1$, $pt_2$, $pt_3$, and $pt_4$) of the embedding
signals for testing the pattern identifiability by the PMSI algorithm 69
4.10 Four different patterns ($pt_1$, $pt_2$, $pt_3$, and $pt_4$) of the embedding
signals in a simulated underlying AR(1) time series ($a = 0.4$) 70
4.11 Ten minute EEG time series data 72
4.12 Ten identified signals in this 30-minute EEG time series using $l_1 =
5, 6$ and $l_2 = 6$ 74
4.13 Ten identified signals in this 30-minute EEG time series using $l_1 =
7, 8$ and $l_2 = 6$ 75
4.14 The first 1500 points, from 1749 to 1873, of the six-month smoothed
Wolf’s monthly sunspot time series 76
4.15 Five identified signals ($l_1 = 6, 7, 8, 9$ and $l_2 = 6$) with the same most
predictable pattern of the six-month smoothed Wolf’s monthly sunspot
time series 77
5.1 Pattern $l_1$-matched subsequences and their 6-step ahead prediction
values in the prediction period of an EEG time series when $l_1 =
5$ and $6$ 85
5.2 Pattern $l_1$-matched subsequences and their 6-step ahead prediction
values in the prediction period of an EEG time series when $l_1 =
7$ and $8$ 86
5.3 Pattern $l_1$-matched subsequences and their prediction values in the
prediction period of smoothed monthly sunspot time series when
$l_1 = 6$ and $7$ 90
5.4 Pattern $l_1$-matched subsequences and their prediction values in the
prediction period of smoothed monthly sunspot time series when
$l_1 = 8$ and $9$ 91
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
SIGNAL IDENTIFICATION AND FORECASTING IN NONSTATIONARY
TIME SERIES DATA
By
Deng-Shan Shiau
December 2001
Chair: Mark C. K. Yang
Major Department: Statistics
Traditional time series analysis focuses on finding the optimal model to fit
the data in a learning period and using this model to make predictions in a future
period. However, many practical applications, such as earthquake time series or
epileptic brain electroencephalogram (EEG) time series, may only contain a few
meaningful, or predictable, patterns, which can be used for meaningful forecasting
such as the occurrences of some specific events following similar patterns. In these
cases, the traditional time series model such as the autoregressive (AR) model
usually gives poor predictions since the model is constructed to fit the entire
learning period, while the pattern useful for prediction may occur during only a
small portion of the period.
The purpose of this research is to provide a statistical algorithm to identify
the most predictable pattern in a given time series and to apply this pattern to
make predictions.
In this dissertation, we propose the Pattern Match Signal Identification
(PMSI) algorithm to identify the most predictable pattern in a given time series.
In this algorithm, the concept of the pattern match is used instead of the generally