ebook img

Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong PDF

0.12 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Discussion of "Feature Matching in Time Series Modeling" by Y. Xia and H. Tong

StatisticalScience 2011,Vol.26,No.1,53–56 DOI:10.1214/11-STS345B MainarticleDOI:10.1214/10-STS345 (cid:13)c InstituteofMathematicalStatistics,2011 Discussion of “Feature Matching in Time Series Modeling” by Y. Xia and H. Tong Kung-Sik Chan and Ruey S. Tsay 2 We thank Xia and Tong for their stimulating ar- Consider the simple case that a time series of re- 1 0 ticle on time series modeling. Their emphasis on es- turns,{rt},follows ageneralized autoregressive con- 2 timation rather than model specification is interest- ditionalheteroscedastic modeloforder(1,1)orsim- n ing. It brings new light to statistical applications in ply a GARCH(1,1) model: a generalandtotimeseriesanalysisinparticular.The J r =σ ε , use of maximum likelihood or least squares method t t|t−1 t 6 is so common, especially with the widely available σt2|t−1=ω+αrt2−1+βσt2−1|t−2, ] statistical software packages, thatonetendsto over- E where ω > 0,α ≥ 0,β ≥ 0, 1 > α + β > 0 are pa- look its limitations and shortcomings. M rameters, {ε } are independent and identically dis- There is hardly any statistical method or proce- t tributed (i.i.d.) random variables with zero mean t. dure that is truly “one-size-fits-all” in real applica- a tions. We welcome Xia and Tong’s contributions as andunitvariance,andεt isindependentofpastone- st they argue so convincingly that feature matching step-ahead conditional variances σs2|s−1,s≤t. Esti- [ often fares better in time series modeling. On the mation of the GARCH model can be done by max- 1 imizing the Gaussian likelihood of the data, which other hand, we’d like to point out some issues that v essentially matches the conditional variances with 7 deserve a careful study. the squared returns. 6 3 A natural generalization of the catch-all method 1. HIGHER ORDER PROPERTIES 1 is to estimate a GARCH model that matches the 1. The conditional mean function generally provides k-step-ahead conditional variance to the kth ahead 0 a good description of the cyclical behavior of the data, for k=1,2,...,m with a fixed m, by minimiz- 2 underlying process, and the catch-all approach can ing some weighted measure of dissimilarity of the 1 : be effectively implemented by estimating the model multi-step conditional variances to future squared v that matches the multi-step conditional means to returns.Variousdissimilaritymeasuresmaybeused. i X the data, as eminently illustrated by the authors. Here, we illustrate the usefulness of this idea by r Here, we want to point out the natural extension adopting the negative twice Gaussian log-likelihood a of estimating a model by matching multi-step con- as the dissimilarity measure, that is, estimating ditional higher moments to the data. For example, a GARCH model by minimizing in financial time series analysis, it is pertinent to n−m m model the dynamics of the conditional variances. 2 2 2 S(ω,α,β)= X Xwℓ{rt+ℓ/σt+ℓ|t+log(σt+ℓ|t)}, t=1 ℓ=1 Kung-Sik Chan is Professor, Department of Statistics where {w } is a set of fixed weights and σ2 is the and Actuarial Science, University of Iowa, Iowa City, ℓ t+ℓ|t Iowa 52242, USA e-mail: [email protected]. conditional variance of rt+ℓ given information avail- Ruey S. Tsay is H. G. B. Alexander Professor, Booth ableattimet.Whenm=1,thenewmethodreduces School of Business, University of Chicago, 5807 South to the Gaussian likelihood method. On the other Woodlawn Avenue, Chicago, Illinois 60637, USA hand, under the assumption that the true model is e-mail: ruey.tsay@ chicagobooth.edu. a GARCH model and for a fixed m>1, the esti- mator is expected to be consistent and asymptoti- This is an electronic reprint of the original article cally normal, with details of the investigation to be published by the Institute of Mathematical Statistics in Statistical Science, 2011, Vol. 26, No. 1, 53–56. This reported elsewhere. However, if the GARCH model reprint differs from the original in pagination and does notcontain thetrue model,as likely is thecase typographic detail. in practice, the (generalized) catch-all method with 1 2 K.-S. CHANAND R.S. TSAY Fig. 1. The red line connects the one-step-ahead conditional variance, σˆ2 , from the GARCH(1,1) model fitted by the t|t−1 catch-all method with m=30 and equal weights, whereas their counterparts from the model fitted by the Gaussian likelihood method are connected by the blue line. The light gray bars are the squared daily returns of the CREF series. m>1 may provide new information for estimating nicely with the increasing kurtosis of the data; see a GARCH model that better matches the observed Tsay [(2010), Chapter 3] for a discussion on contri- volatility clustering pattern. butionofαtokurtosisofthereturnr .Thisexample t We tried this approach by fitting a GARCH(1,1) illustrates the usefulness of the generalization of the model to the daily returns of a unit of the CREF catch-all method by matching higher moments, and stock fund over the period from August 26, 2004 to also the potential usefulness of developing a test for August 15, 2006; this series was analyzed by Cryer model misspecification based on the divergence of and Chan [(2008), Chapter 12], and they identified the catch-all estimates with increasing m. the series as a GARCH(1,1) process. Gaussian like- lihood estimation yields ωˆ =0.0164,αˆ=0.0439,βˆ= 2. INFORMATION CONTENT 0.917.Ontheotherhand,thecatch-allmethod,with As statisticians, we love data. However, data have m=30andtheweightswℓ∝1,resultsintheestima- their limitation. Available data may not be infor- tes: ωˆ =0.0261,αˆ =0.102 and βˆ=0.836. Figure 1 mativetoconductanymeaningfulfeaturematching. contrasts the fitted values, σˆt2|t−1, based on the two When the information content pertaining to the se- fitted GARCH models with the squared returns su- lected feature is low, feature matching as an estima- perimposedaslightgraybars.Thefigureshowsthat, tion method is likely to fail. As an example, assess- ascomparedtothemodelestimatedbytheGaussian ment of financial risk focuses on the tail behavior likelihood method, the GARCH model fitted by the of the loss over time. The relevant feature here is catch-all method appears to track the squared re- the upper quantiles of the loss distribution. Since turnsmoreclosely over thevolatile periodand tran- big losses are rare, available data, no matter how sitintotheensuingquietperiodatafasterratecom- big the sample size is, are not informative about the mensurate with the data. It seems that by requiring extreme quantiles. Theuncertainty in any matching the model to track multi-step squared returns, the would be high. method chooses a GARCH model that gives more As another example, consider the monthly time weight to the current squared return to the future series of global temperature anomalies from 1880 to evolution of conditional variances. Indeed, as m in- 2010. The data are available from many sources on creasesfrom1to30,theARCHcoefficientestimateαˆ the web, for example, the websites http://data.giss. increasesfrom0.0439 to0.102, whereastheGARCH nasa.gov/gistemp/oftheGoddardInstituteforSpa- coefficientβˆdecreasesfrom0.917to0.836.Thesesys- ce Studies (GISS), National Aeronautics and Space tematicparametricchangessignifythatthetruemo- Administration(NASA)andhttp://www.ncdc.noaa. delisunlikelyaGARCH(1,1)model.Italsomatches gov/cmb-faq/anomalies.html oftheNationalClima- 3 DISCUSSION tic Data Center, National Oceanic and Atmospheric Administration (NOAA). These time series are of interest because of the concerns about global warm- ing.Thekey featuretomatch thenis theunderlying trend of the global temperature. To handle the time trend, two approaches are commonly used in the timeseriesliterature.Thefirstapproachiscalledthe difference-stationarity inwhichthetimeseriesisdif- ferenced to obtain stationarity. The ARIMA models of Box and Jenkins are examples of this approach. Thesecondapproach iscalled thetrend-stationarity inwhichoneimposesalineartimetrendforthedata. Thedeviation fromthetimetrendbecomesastatio- naryseries.Eventhoughwehave130yearsofdata,it remainshardforthedatatodistinguishatrend-sta- tionary model from a difference-stationary one. On theotherhand,thelong-termforecasts ofadifferen- ce-stationary model differ dramatically from those Fig. 2. The directed scatterplot on the right side shows the of a trend-stationary model. The eventual forecast- evolution of the ARMA coefficients for the linear trend plus ing function of an ARIMA model without constant ARMA(1,1) noise model fitted to the annual global temper- is a horizontal line with standard error approaching ature anomalies, with m in the catch-all method increasing infinity whereas that of a trend-stationary model is from 1 to 30, while the evolution of the ARMA coefficients for the ARMA(1,1,1) model is shown by the directed scatter- a straight line going to positive or negative infin- plot on the left. The two solid circles show the estimates with ity with a finite standard error. See Tsay (2012) for m=1. further analysis of the data and model comparison. Here, we explore whether feature matching may- mean, σ2 is the ℓ-step-ahead prediction variance, cast new light on the preceding problem. For sim- that is, ℓσ2 =σ2 ℓ−1ψ2 where ψ’s are the coeffi- plicity, we consider the annual global temperature ℓ aPj=0 j cients in the linear MA representation of the model anomalies. Model identification suggests two plausi- also known as the impulse response coefficients; σ2 ble models, namely, ARIMA(1,1,1) model and lin- is the harmonic mean of σ2,ℓ=1,2,...,m. When ear trend plus ARMA(1,1) model. We use the plus ℓ m=1, the catch-all method reduces to Gaussian li- convention for the MA coefficients, that is, the kelihood estimation. But for m > 1, the catch-all ARIMA(1,1,1) modelspecifies (1−φB)(1−B)Y = t methodattempts tomatch modelpredictionwithm (1+θB)a , where B is the backshift operator such t “future”values,intermsofmeansandvariances.We that BYt = Yt−1 and at are i.i.d. with zero mean implementedthecatch-allmethodforfittingtheglo- andvariance σ2.For simplicity, thesemodels arefit- a baltemperatureanomalies,withm=1,2,...,30.Fi- ted by conditional least squares with the residuals gure 2 plots the evolution of the catch-all estimates initialized as 0 at the first time point. We also fit- of the ARMA coefficients for the two models. It is ted themodelwiththegeneralized catch-all method interesting to observe that thecatch-all estimates of that matches the predictive distribution to the fu- the ARIMA(1,1,1) model quickly move away from ture m values, in terms of the k-step-ahead predic- the Gaussian likelihood estimates and fluctuate sta- tive means and variances, k =1,2,...,m. The loss blyaroundanessentiallyARIMA(0,1,1)model,that function is similar to that discussed in the previous is, an exponential smoothing model, for a while, be- section. After profiling out the innovation variance, fore they drift to more extreme values. Thus, the the generalized catch-all method fits a model to the catch-all method suggests that for forecasting on time series {Y |t=1,2,...,T} by minimizing t a decadal scale, an exponential smoothing model m may be appropriate for the temperature data. Simi- S(θ)=XX(Yt+ℓ−Yˆt+ℓ|t)2σ2/σℓ2, larly, the catch-all estimates of the linear trend plus t ℓ=1 ARMA(1,1) noise model quickly move away from where θ is the vector of all parameters except the the Gaussian likelihood estimate, and fluctuate sta- innovation variance; Yˆ is the ℓ-step predictive bly for a number of steps around an estimate with t+ℓ|t 4 K.-S. CHANAND R.S. TSAY its MA(1) coefficient comparable to that of the ex- tionseeksparametersthatgivethehighestprobabil- ponential smoothing model. As the corresponding ity of thedataundertheentertained model.Feature AR(1)estimatesarequitecloseto1,thecatch-alles- matching searches for parameters that bestdescribe timates of thetrendplus noisemodelsuggest strong thefeaturesofinterest.Theexamplesusedinthear- similarity between the two models over a forecast ticle all have clearly defined features such as cycles horizon on a decadal scale. When m approaches 30, and, as expected, feature matching works well. On thecatch-allestimatesofthetrendplusARMA(1,1) theother hand,therearesituations underwhichthe noise model drift off on a flight that suddenly ends objective of data analysis does not match well with into a trend plus uncorrelated noise model. any specific feature. For example, with the economy For long-term prediction, the global temperature under pressure, unemployment is of major impor- series is of limited value. The preceding analysis tance to people of all walks. It is well known that highlights the conundrum that using a trend-statio- unemployment rate exhibits strong business cycles, nary model,we simply force thedata to supportthe which in sharp contrast with examples used in the trend; for difference-stationary models, we basically article donot have a fixed (or even an approximate) endupusingtheexponentialsmoothingmodel.This period. It is then not clear which feature to match is a problem facing all estimation methods, not just if one is interested in understandingand forecasting feature matching. the U.S. unemployment rate. We consider these two examples not because we question thevalue of featurematching in time series modeling; rather we like to point out that care must ACKNOWLEDGMENTS be exercised in using feature matching. In other K.-S.ChanthankstheU.S.NationalScienceFoun- words, feature matching may encounter the same dation (NSF-0934617) and R. S. Tsay thanks the problemasthetraditionalestimationmethods.They Booth Schoolof Business,University ofChicago, for are statistical tools. It is the statisticians who use partial financial support. thetools,notthetools thatprocessthedata.Which tool to use in a given application is the choice of astatistician. Whilewewelcome theaddition offea- ture matching to the tool kits, we like to emphasize REFERENCES that there are limitations in feature matching, too. Cryer,J.D.andChan,K.S.(2008).TimeSeriesAnalysis: With Applications in R, 2nd ed.Springer, New York. 3. FEATURE VERSUS OBJECTIVE Tsay, R. S. (2010). Analysis of Financial Time Series, 3rd ed.Wiley, Hoboken, NJ. Modelselection indataanalysisdependscritically Tsay, R. S.(2012). An Introduction to Financial Data Ana- on theobjective of data analysis. Likelihood estima- lysis. Wiley, Hoboken,NJ.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.