THE JOURNAL OF FINANCE, VOL. LVIII, NO. 4, AUGUST 2003

Spurious Regressions in Financial Economics?

WAYNE E. FERSON, SERGEI SARKISSIAN, and TIMOTHY T. SIMIN*

ABSTRACT

Even though stock returns are not highly autocorrelated, there is a spurious regression bias in predictive regressions for stock returns related to the classic studies of Yule (1926) and Granger and Newbold (1974). Data mining for predictor variables interacts with spurious regression bias. The two effects reinforce each other, because more highly persistent series are more likely to be found significant in the search for predictor variables. Our simulations suggest that many of the regressions in the literature, based on individual predictor variables, may be spurious.

PREDICTIVE MODELS FOR COMMON STOCK RETURNS have long been a staple of financial economics. Early studies, reviewed by Fama (1970), used such models to examine market efficiency. Stock returns are assumed to be predictable, based on lagged instrumental variables, in the current conditional asset pricing literature. Standard lagged variables include the levels of short-term interest rates, payout-to-price ratios for stock market indexes, and yield spreads between low-grade and high-grade bonds or between long- and short-term bonds. Many of these variables behave as persistent, or highly autocorrelated, time series.

This paper studies the finite sample properties of stock return regressions with persistent lagged regressors. We focus on two issues. The first is spurious regression, analogous to Yule (1926) and Granger and Newbold (1974). These studies warned that spurious relations may be found between the levels of trending time series that are actually independent. For example, given two independent random walks, it is likely that a regression of one on the other will produce a "significant" slope coefficient, evaluated by the usual t-statistics.
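The random-walk example in the classical setting can be illustrated with a small Monte Carlo. The following is a minimal sketch, not the authors' code: the function names, the sample size of 100, the number of trials, and the 1.96 cutoff are all illustrative assumptions. It regresses one simulated random walk on another, independent one, and counts how often the usual OLS t-ratio looks "significant".

```python
import math
import random


def random_walk(n, rng):
    """Return a driftless random walk of length n with N(0, 1) steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def ols_slope_t(y, x):
    """t-ratio of the slope in y = a + b*x + e, using the usual OLS standard error."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope / std_err


def rejection_rate(n_obs=100, n_trials=500, cutoff=1.96, seed=0):
    """Share of trials in which two independent random walks look related."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        y = random_walk(n_obs, rng)
        x = random_walk(n_obs, rng)  # independent of y by construction
        if abs(ols_slope_t(y, x)) > cutoff:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print(f"share of |t| > 1.96: {rejection_rate():.2f}")
```

If the t-test were reliable here, the printed share would be close to the nominal 5 percent; with independent random walks it should come out far higher.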
In this paper, the dependent variables are asset rates of return, which are not highly persistent. Thus, one may think that spurious regression problems are unlikely. However, the returns may be considered to be the sum of an unobserved expected return, plus unpredictable noise. If the underlying expected returns are persistent time series, there is still a risk of spurious regression. Because the unpredictable noise represents a substantial portion of the variance of stock returns, the spurious regression results will differ from those in the classical setting.

* Ferson is at the Carroll School of Management, Boston College and is a Research Associate, National Bureau of Economic Research; Sarkissian is on the Faculty of Management, McGill University; Simin is at the Smeal College of Business, Pennsylvania State University. We are grateful to Eugene Fama for suggesting the question that motivates this research and to John Cochrane, Frank Diebold, Richard C. Green, Gordon Hanka, Raymond Kan, Donald Keim, Jeffrey Pontiff, Bill Schwert, Rossen Valkanov, and an anonymous referee for helpful comments. Ferson acknowledges financial support from the Pigott-Paccar professorship at the University of Washington and the Collins Chair in Finance at Boston College. Sarkissian acknowledges financial support from FCAR and IFM2. This paper has benefited from workshops at McGill University, at the July 2000 NBER Asset Pricing Group, the 2000 Northern Finance Association Meetings, and the 2001 American Finance Association Meetings.

The second issue is "data mining" as studied for stock returns by Lo and MacKinlay (1990), Foster, Smith, and Whaley (1997), and others. If the standard instruments employed in the literature arise as the result of a collective search through the data, they may have no predictive power in the future. Stylized "facts" about the dynamic behavior of stock returns using these instruments (e.g., Cochrane (1999)) could be artifacts of the sample. Such concerns are natural, given the widespread interest in predicting stock returns.

We focus on spurious regression and the interaction between data mining and spurious regression bias. If the underlying expected return is not predictable over time, there is no spurious regression bias, even if the chosen regressor is highly autocorrelated. In this case, our analysis reduces to pure data mining as studied by Foster et al. (1997).

When expected returns are persistent, spurious regression bias calls some of the evidence of previous studies into question. We examine univariate regressions for the Standard and Poors 500 (S&P 500) excess return using 13 popular lagged instruments over the sample periods of the original studies. We find that 7 of 26 t-ratios or regression R-squares, significant by the usual five percent criteria, are consistent with the null hypothesis of a spurious regression.

The spurious regression and data mining effects reinforce each other. If researchers have mined the data for regressors that produce high "R-squares" in predictive regressions, the mining is more likely to uncover the spurious, persistent regressors. The standard regressors in the literature tend to be highly autocorrelated, as expected if the regressors result from a spurious mining process. For reasonable parameter values, all the regressions that we review from the literature are consistent with a spurious mining process, even when only a small number of instruments are considered in the mining.

This paper contributes to a substantial literature that studies the sampling properties of predictive regressions for stock returns. Goetzmann and Jorion (1993), Nelson and Kim (1993), Bekaert, Hodrick, and Marshall (1997), and Stambaugh (1999) study biases due to dependent stochastic regressors. Kim, Nelson, and Startz (1991) study structural-change-induced misspecification. Campbell and Shiller (1988) consider dependent regressors with unit roots. Fama and French (1988a), Kandel and Stambaugh (1990), and Hodrick (1992) focus on autocorrelation for long-horizon stock returns. Lanne (2001) and Valkanov (2001) develop statistical inference methods in the presence of near unit roots. Pesaran and Timmermann (1995), Bossaerts and Hillion (1999), Goyal and Welch (2002), and Simin (2002) examine model selection and out-of-sample validity. Boudoukh and Richardson (1994) provide an overview of econometric issues. Schwert (2002) reviews anomalies and trading strategies based on predictability.

The rest of the paper is organized as follows. Section I describes the data. Section II presents the models used in the simulation experiments. Section III presents the simulation results. First, we study the pure spurious regression issue in isolation. Then we consider the interaction between spurious regression and data mining biases. Section IV offers concluding remarks.

I. The Data

Table I surveys nine of the major studies that propose instruments for predicting stock returns. The table reports summary statistics for monthly data, covering various subperiods of 1926 through 1998. The sample size and period depend on the study and the variable, and the table provides the details. We attempt to replicate the data series that were used in these studies as closely as possible. The summary statistics are from our data. Note that the first-order autocorrelations frequently suggest a high degree of persistence. For example, the short-term Treasury bill yields, monthly book-to-market ratios, the dividend yield of the S&P 500, and some of the yield spreads have sample first-order autocorrelations of 0.97 or higher.

Table I also summarizes regressions for the monthly return of the S&P 500 stock index, measured in excess of the one-month Treasury bill return from Ibbotson Associates, on the lagged instruments. These are OLS regressions using one instrument at a time. We report the slope coefficients, their t-ratios, and the adjusted R-squares. The R-squares range from less than one percent to more than seven percent, and 8 of the 13 t-ratios are larger than 2.0. The t-ratios are based on the OLS slopes and Newey-West (1987) standard errors, where the number of lags is chosen based on the number of statistically significant residual autocorrelations.[1]

[1] Specifically, we compute 12 sample autocorrelations and compare the values with a cutoff at two approximate standard errors: $2/\sqrt{T}$, where $T$ is the sample size. The number of lags chosen is the minimum lag length at which no higher order autocorrelation is larger than two standard errors. The number of lags chosen is indicated in the far right column.

The small R-squares in Table I suggest that predictability represents a tiny fraction of the variance in stock returns. However, even a small R-squared can signal economically significant predictability. For example, Kandel and Stambaugh (1996) and Fleming, Kirby, and Ostdiek (2001) find that optimal portfolios respond by a substantial amount to small R-squares in standard models. Studies combining several instruments in multiple regressions report higher R-squares. For example, Harvey (1989), using five instruments, reports adjusted R-squares as high as 17.9 percent for size portfolios. Ferson and Harvey (1991) report R-squares of 5.8 percent to 13.7 percent for monthly size and industry portfolio returns. These values suggest that the "true" R-squared, if we could regress the stock return on its time-varying conditional mean, might be substantially higher than we see in Table I. To accommodate this possibility, we allow the true R-squares in our simulations to vary over the range from 0 to 15 percent. For exposition we focus on an intermediate value of 10 percent.
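The lag-selection rule in footnote 1 can be sketched in a few lines. This is an illustrative reading of the footnote rather than the authors' implementation: the function names are hypothetical, and comparing the absolute value of each autocorrelation with the cutoff is an assumption, since the footnote says only "larger than two standard errors".

```python
import math


def sample_autocorrelations(x, max_lag=12):
    """Sample autocorrelations of x at lags 1..max_lag (demeaned, lag-0 sum of squares as denominator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]


def select_lag(acf, n_obs):
    """Smallest lag length beyond which no autocorrelation exceeds 2/sqrt(T).

    Equivalently, the largest lag whose autocorrelation exceeds the cutoff
    in absolute value (an assumption), or 0 if none does.
    """
    cutoff = 2.0 / math.sqrt(n_obs)
    chosen = 0
    for k, rho in enumerate(acf, start=1):
        if abs(rho) > cutoff:
            chosen = k
    return chosen
```

In use, `x` would be the OLS residuals from the predictive regression, and the returned value would be passed as the number of lags to a Newey-West estimator.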
Table I
Common Instrumental Variables: Sources, Summary Statistics, and OLS Regression Results

This table summarizes variables used in the literature to predict stock returns. The first column indicates the published study. The second column denotes the lagged instrument. The next two columns give the sample (Period) and the number of observations (Obs) on the stock returns. Columns 5 and 6 report the autocorrelation ($\rho_Z$) and the standard deviation of the instrument ($\sigma_Z$), respectively. The next three columns report regression results for S&P 500 excess return on a lagged instrument. The slope coefficient is $\beta$, the t-statistic is $t$, and the coefficient of determination is $R^2$. The last column (HAC) reports the method used in computing the standard errors of the slopes. The method of Newey-West (1987) is used with the number of lags given in parentheses. MA($\cdot$) refers to the number of moving average terms used in the covariance matrix. The abbreviations in the table are as follows. TB1y is the yield on the one-month Treasury bill. Two-one, Six-one, and Lag(two)-one are computed as the spreads on the returns of the two- and one-month bills, six- and one-month bills, and the lagged value of the two-month and current one-month bill. The yield on all corporate bonds is denoted as ALLy. The yield on AAA rated corporate bonds is AAAy, and UBAAy is the yield on corporate bonds with a below BAA rating. The variable "Cay" is the linear function of consumption, asset wealth, and labor income. The book-to-market ratios for the Dow Jones Industrial Average and the S&P 500 are respectively DJBM and SPBM.

| (1) Reference | (2) Predictor | (3) Period | (4) Obs | (5) $\rho_Z$ | (6) $\sigma_Z$ | (7) $\beta$ | (8) $t$ | (9) $R^2$ | (10) HAC |
|---|---|---|---|---|---|---|---|---|---|
| Breen, Glosten, & Jagannathan (1989) | TB1y | 5404-8612 | 393 | 0.97 | 0.0026 | −2.49 | −3.58 | 0.023 | NW(5) |
| Campbell (1987) | Two-one | 5906-7908 | 264 | 0.32 | 0.0006 | 11.87 | 2.38 | 0.025 | NW(0) |
| | Six-one | 5906-7908 | 264 | 0.15 | 0.0020 | 2.88 | 2.13 | 0.025 | NW(0) |
| | Lag(two)-one | 5906-7908 | 264 | 0.08 | 0.0010 | 9.88 | 2.67 | 0.063 | NW(6) |
| Fama (1990) | ALLy-AAAy | 5301-8712 | 420 | 0.97 | 0.0040 | 0.88 | 1.46 | 0.005 | MA(0) |
| Fama & French (1988a) | Dividend yield | 2701-8612 | 720 | 0.97 | 0.0013 | 0.40 | 1.36 | 0.007 | MA(9) |
| Fama & French (1989) | AAAy-TB1y | 2601-8612 | 732 | 0.92 | 0.0011 | 0.51 | 2.16 | 0.007 | MA(9) |
| Keim & Stambaugh (1986) | UBAAy | 2802-7812 | 611 | 0.95 | 0.0230 | 1.50 | 0.75 | 0.002 | MA(9) |
| | UBAAy-TB1y | 2802-7812 | 611 | 0.97 | 0.0320 | 1.57 | 1.48 | 0.007 | MA(9) |
| Kothari & Shanken (1997) | DJBM | 1927-1992 | 66 | 0.66 | 0.2270 | 0.28 | 2.63 | 0.078 | MA(0) |
| Lettau & Ludvigson (2001) | "Cay" | 52Q4-98Q4 | 184 | 0.79 | 0.0110 | 1.57 | 2.58 | 0.057 | MA(7) |
| Pontiff & Schall (1998) | DJBM | 2602-9409 | 824 | 0.97 | 0.2300 | 2.96 | 2.16 | 0.012 | MA(9) |
| | SPBM | 5104-9409 | 552 | 0.98 | 0.0230 | 9.32 | 1.03 | 0.001 | MA(5) |

To incorporate data mining, we compile a randomly selected sample of 500 potential instruments, through which our simulated analyst sifts to mine the data for predictor variables. We select the 500 series randomly from a much larger sample of 10,866 potential variables. The specifics are described in the Appendix. Essentially, the procedure is to generate uniformly distributed random numbers, order the series from 1 to 10,866 and randomly extract 500 series. The 500 series are randomly ordered, and permanently assigned numbers between 1 and 500. When a data miner in our simulations searches through, say 50 series, we use the sampling properties of the 50 series to calibrate the parameters in the simulations.

We also use our sample of potential instruments to calibrate the parameters that govern the amount of persistence in the "true" expected returns in the model. On the one hand, if the instruments we see in the literature, summarized in Table I, arise from a spurious mining process, they are likely to be more highly autocorrelated than the underlying "true" expected stock return. On the other hand, if the instruments in the literature are a realistic representation of expected stock returns, the autocorrelations in Table I may be a good proxy for the persistence of the true expected returns.[2] The mean autocorrelation of our 500 series is 15 percent and the median is 2 percent. Eleven of the 13 sample autocorrelations in Table I are higher than 15 percent, and the median value is 95 percent. We consider a range of values for the true autocorrelation based on these figures, as described below.

[2] There are good reasons to think that expected stock returns may be persistent. Asset pricing models like the consumption model of Lucas (1978) describe expected stock returns as functions of expected economic growth rates. Merton (1973) and Cox, Ingersoll, and Ross (1985) propose real interest rates as candidate state variables, driving expected returns in intertemporal models. Such variables are likely to be highly persistent. Empirical models for stock return dynamics frequently involve persistent, autoregressive expected returns (e.g., Conrad and Kaul (1988), Fama and French (1988b), Lo and MacKinlay (1988), or Huberman and Kandel (1990)).

II. The Models

Consider a situation in which an analyst runs a time-series regression for the future stock return, $r_{t+1}$, on a lagged predictor variable:

$$r_{t+1} = \alpha + \delta Z_t + v_{t+1}. \tag{1}$$

The data are actually generated by an unobserved latent variable, $Z_t^*$, as

$$r_{t+1} = \mu + Z_t^* + u_{t+1}, \tag{2}$$

where $u_{t+1}$ is white noise with variance $\sigma_u^2$. We interpret the latent variable, $Z_t^*$, as the deviations of the conditional mean return from the unconditional mean, $\mu$, where the expectations are conditioned on an unobserved "market" information set at time $t$. The predictor variables follow an autoregressive process:

$$(Z_t^*, Z_t)' = \begin{pmatrix} \rho^* & 0 \\ 0 & \rho \end{pmatrix} (Z_{t-1}^*, Z_{t-1})' + (\varepsilon_t^*, \varepsilon_t)'. \tag{3}$$
The assumption that the true expected return is autoregressive follows previous studies such as Conrad and Kaul (1988), Fama and French (1988b), Lo and MacKinlay (1988), and Huberman and Kandel (1990).

To generate the artificial data, the errors $(\varepsilon_t^*, \varepsilon_t)$ are drawn randomly as a normal vector with mean zero and covariance matrix, $\Sigma$. We build up the time series of the $Z$ and $Z^*$ through the vector autoregression equation (3), where the initial values are drawn from a normal with mean zero and variances, $\mathrm{Var}(Z)$ and $\mathrm{Var}(Z^*)$. The other parameters that calibrate the simulations are $\{\mu, \sigma_u^2, \rho, \rho^*, \Sigma\}$.

We have a situation in which the "true" returns may be predictable, if $Z_t^*$ could be observed. This is captured by the true R-squared, $\mathrm{Var}(Z^*)/[\mathrm{Var}(Z^*) + \sigma_u^2]$. We set $\mathrm{Var}(Z^*)$ to equal the sample variance of the S&P 500 return, in excess of a one-month Treasury bill return, multiplied by 0.10. When the true R-squared of the simulation is 10 percent, the unconditional variance of the $r_{t+1}$ that we generate is equal to the sample variance of the S&P 500 return, and the first-order autocorrelation is similar to that of the actual data. When we choose other values for the true R-squared, these determine the values for the parameter $\sigma_u^2$. We set $\mu$ to equal the sample mean excess return of the S&P 500 over the 1926 through 1998 period, or 0.71 percent per month.

The extent of the spurious regression bias depends on the parameters $\rho$ and $\rho^*$, which control the persistence of the measured and the true regressor. These values are determined by reference to Table I and from our sample of 500 potential instruments. The specifics differ across the special cases, as described below.

While the stock return could be predicted if $Z_t^*$ could be observed, the analyst uses the measured instrument $Z_t$. If the covariance matrix $\Sigma$ is diagonal, $Z_t$ and $Z_t^*$ are independent, and the true value of $\delta$ in the regression (1) is zero.

A. Pure Spurious Regression

To focus on spurious regression in isolation, we specialize equation (3) as follows. The covariance matrix $\Sigma$ is a $2 \times 2$ diagonal matrix with variances $(\sigma^{*2}, \sigma^2)$. For a given value of $\rho^*$ the value of $\sigma^{*2}$ is determined as $\sigma^{*2} = (1 - \rho^{*2})\,\mathrm{Var}(Z^*)$. The measured regressor has $\mathrm{Var}(Z) = \mathrm{Var}(Z^*)$. The autocorrelation parameters, $\rho^* = \rho$, are allowed to vary over a range of values. (We also allow $\rho$ and $\rho^*$ to differ from one another, as described below.)

Following Granger and Newbold (1974), we interpret a spurious regression as one in which the "t-ratios" in regression (1) are likely to indicate a significant relation when the variables are really independent. The problem may come from the numerator or the denominator of the t-ratio: The coefficient or its standard error may be biased. As in Granger and Newbold, the problem lies with the standard errors.[3] The reason is simple to understand. When the null hypothesis that the regression slope $\delta = 0$ is true, the error term $v_{t+1}$ of regression Equation (1) inherits autocorrelation from the dependent variable. Assuming stationarity, the slope coefficient is consistent, but standard errors that do not account for the serial dependence correctly are biased.

[3] While Granger and Newbold (1974) do not study the slopes and standard errors to identify the separate effects, our simulations, designed to mimic their setting (not reported in the tables), confirm that their slopes are well behaved, while the standard errors are biased. Granger and Newbold use OLS standard errors, while we focus on the heteroskedasticity and autocorrelation-consistent standard errors that are more common in recent studies.

Because the spurious regression problem is driven by biased estimates of the standard error, the choice of standard error estimator is crucial. In our simulation exercises, it is possible to find an efficient unbiased estimator, since we know the "true" model that describes the regression error. Of course, this will not be known in practice. To mimic the practical reality, the analyst in our simulations uses the popular autocorrelation-heteroskedasticity-consistent (HAC) standard errors from Newey and West (1987), with an automatic lag selection procedure. The number of lags is chosen by computing the autocorrelations of the estimated residuals and truncating the lag length when the sample autocorrelations become "insignificant" at longer lags. (The exact procedure is described in Footnote 1, and modifications to this procedure are discussed below.)

This setting is related to Phillips (1986) and Stambaugh (1999). Phillips derives asymptotic distributions for the OLS estimators of the regression (1), in the case where $\rho = 1$, $u_{t+1} \equiv 0$, and $\{\varepsilon_t^*, \varepsilon_t\}$ are general independent mean zero processes. We allow a nonzero variance of $u_{t+1}$ to accommodate the large noise component of stock returns. We assume $\rho < 1$ to focus on stationary, but possibly highly autocorrelated, regressors.

Stambaugh (1999) studies a case where the errors $\{\varepsilon_t^*, \varepsilon_t\}$ are perfectly correlated, or equivalently, the analyst observes and uses the correct lagged stochastic regressor. A bias arises when the correlation between $u_{t+1}$ and $\varepsilon_{t+1}^*$ is not zero, related to the well-known small sample bias of the autocorrelation coefficient (e.g., Kendall (1954)). In the pure spurious regression case studied here, the observed regressor $Z_t$ is independent of the true regressor $Z_t^*$, and $u_{t+1}$ is independent of $\varepsilon_{t+1}^*$. The Stambaugh bias is zero in this case. The point is that there remains a problem in predictive regressions, in the absence of the bias studied by Stambaugh, because of spurious regression.

B. Spurious Regression and Data Mining

We consider the interaction between spurious regression and data mining, where the instruments to be mined are independent as in Foster et al. (1997). There are $L$ measured instruments over which the analyst searches for the "best" predictor, based on the R-squares of univariate regressions. In Equation (3) $Z_t$ becomes a vector of length $L$, where $L$ is the number of instruments through which the analyst sifts. The error terms $(\varepsilon_t^*, \varepsilon_t)$ become an $(L+1)$-vector with a diagonal covariance matrix; thus, $\varepsilon_t^*$ is independent of $\varepsilon_t$.
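The mining step described in this subsection (search over L independent instruments and keep the one with the highest univariate R-squared) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's calibration: the grid of candidate autocorrelations, the sample size of 200, the unit-variance normalisation of the instruments, a total return variance of one, a zero mean return, and all function names are hypothetical choices.

```python
import math
import random


def ar1(n, rho, variance, rng):
    """Stationary AR(1) path with autocorrelation rho and unconditional variance `variance`."""
    shock_sd = math.sqrt((1.0 - rho * rho) * variance)
    z = rng.gauss(0.0, math.sqrt(variance))  # start from the stationary distribution
    path = []
    for _ in range(n):
        path.append(z)
        z = rho * z + rng.gauss(0.0, shock_sd)
    return path


def r_squared(y, x):
    """R-squared of the OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def share_most_persistent_wins(rhos, rho_star=0.95, true_r2=0.10,
                               n_obs=200, n_trials=300, seed=0):
    """Fraction of trials in which the most persistent candidate has the highest R-squared."""
    rng = random.Random(seed)
    target = max(range(len(rhos)), key=rhos.__getitem__)
    noise_sd = math.sqrt(1.0 - true_r2)  # total return variance normalised to one
    wins = 0
    for _ in range(n_trials):
        z_star = ar1(n_obs, rho_star, true_r2, rng)  # unobserved expected return
        # r_next[t] plays the role of r_{t+1} = Z*_t + u_{t+1} (mu set to zero)
        r_next = [zs + rng.gauss(0.0, noise_sd) for zs in z_star]
        # every candidate instrument is independent of the returns by construction
        scores = [r_squared(r_next, ar1(n_obs, rho, 1.0, rng)) for rho in rhos]
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == target:
            wins += 1
    return wins / n_trials


if __name__ == "__main__":
    grid = [0.0, 0.2, 0.4, 0.6, 0.8, 0.98]  # hypothetical candidate autocorrelations
    print(f"most persistent candidate selected in {share_most_persistent_wins(grid):.0%} of trials")
```

If persistence did not matter, each of the six candidates would be selected about one time in six. A noticeably larger share for the most persistent candidate is the interaction between mining and spurious regression that the text describes.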
The persistence parameters in Equation (3) become an $(L+1)$-square, diagonal matrix, with the autocorrelation of the true predictor equal to $\rho^*$. The value of $\rho^*$ is either the average from our sample of 500 potential instruments, 15 percent, or the median value from Table I, 95 percent. The remaining autocorrelations, denoted by the $L$-vector $\rho$, are set equal to the autocorrelations of the first $L$ instruments in our sample of 500 potential instruments, when $\rho^* = 15$ percent.[4] When $\rho^* = 95$ percent, we rescale the autocorrelations to center the distribution at 0.95 while preserving the range in the original data.[5] The simulations match the unconditional variances of the instruments, $\mathrm{Var}(Z)$, to the data. The first element of the covariance matrix $\Sigma$ is equal to $\sigma^{*2}$. For a typical $i$th diagonal element of $\Sigma$, denoted by $\sigma_i$, the elements of $\rho(Z_i)$ and $\mathrm{Var}(Z_i)$ are given by the data, and we set $\sigma_i^2 = [1 - \rho(Z_i)^2]\,\mathrm{Var}(Z_i)$.

[4] We calibrate the true autocorrelations in the simulations to the sample autocorrelations, adjusted for first-order finite-sample bias as: $\hat{\rho} + (1 + 3\hat{\rho})/T$, where $\hat{\rho}$ is the OLS estimate of the autocorrelation and $T$ is the sample size.

[5] The transformation is as follows. In the 500 instruments, the minimum bias-adjusted autocorrelation is −0.571, the maximum is 0.999, and the median is 0.02. We center the transformed distribution about the median in Table I, which is 0.95. If the original autocorrelation $\rho$ is less than the median, we transform it to

$$0.95 + (\rho - 0.02)\,\{(0.95 + 0.571)/(0.02 + 0.571)\}.$$

If the value is above the median, we transform it to

$$0.95 + (\rho - 0.02)\,\{(0.999 - 0.95)/(0.999 - 0.02)\}.$$

III. Simulation Results

We first consider spurious regression in isolation. Then we study spurious regression with data mining.

A. Pure Spurious Regression

Table II summarizes the results for the case of pure spurious regression. We record the estimated slope coefficient in regression (1), its Newey-West t-ratio, and the coefficient of determination at each trial and summarize their empirical distributions. The experiments are run for two sample sizes, based on the extremes in Table I. These are $T = 66$ and $T = 824$ in Panels A and B, respectively. In Panel C, we match the sample sizes to the studies in Table I. In each case, 10,000 trials of the simulation are run; 50,000 trials produce similar results.

The rows of Table II refer to different values for the true R-squares. The smallest value is 0.1 percent, where the stock return is essentially unpredictable, and the largest value is 15 percent. The columns of Table II correspond to different values of $\rho^*$, the autocorrelation of the true expected return, which runs from 0.00 to 0.99. In these experiments, we set $\rho = \rho^*$. The subpanels labeled Critical t-statistic and Critical estimated $R^2$ report empirical critical values from the 10,000 simulated trials, so that 2.5 percent of the t-statistics or five percent of the R-squares lie above these values.

The subpanels labeled Mean $\delta$ report the average slope coefficients over the 10,000 trials. The mean estimated values are always small, and very close to the true value of zero at the larger sample size. This confirms that the slope coefficient estimators are well behaved, so that bias due to spurious regression comes from the standard errors.

When $\rho^* = 0$, and there is no persistence in the true expected return, the spurious regression phenomenon is not a concern. This is true even when the measured regressor is highly persistent. (We confirm this with additional simulations, not reported in the tables, where we set $\rho^* = 0$ and vary $\rho$.) The logic is that when the slope in Equation (1) is zero and $\rho^* = 0$, the regression error has no persistence, so the standard errors are well behaved. This implies that spurious regression is not a problem from the perspective of testing the null hypothesis that expected stock returns are unpredictable, even if a highly autocorrelated regressor is used.
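The pure spurious regression experiment described in this subsection can be sketched as a small Monte Carlo. The code below is a minimal illustration rather than a reproduction of Table II. It assumes a fixed Newey-West lag of 4 instead of the automatic selection rule, a total return variance normalised to one, a zero mean return, 200 observations, and a few hundred trials; all function names are hypothetical. It compares how often |t| > 1.96 when the true expected return is persistent and when it is not, holding the persistence of the measured regressor fixed.

```python
import math
import random


def ar1(n, rho, variance, rng):
    """Stationary AR(1) path with autocorrelation rho and unconditional variance `variance`."""
    shock_sd = math.sqrt((1.0 - rho * rho) * variance)
    z = rng.gauss(0.0, math.sqrt(variance))  # start from the stationary distribution
    path = []
    for _ in range(n):
        path.append(z)
        z = rho * z + rng.gauss(0.0, shock_sd)
    return path


def newey_west_t(y, x, lags):
    """t-ratio of the OLS slope in y = a + d*x + v with Newey-West (Bartlett kernel) standard errors."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    sxx = sum(a * a for a in dx)
    slope = sum(a * (b - my) for a, b in zip(dx, y)) / sxx
    resid = [(b - my) - slope * a for a, b in zip(dx, y)]
    g = [a * e for a, e in zip(dx, resid)]  # slope scores: demeaned regressor times residual
    long_run = sum(v * v for v in g)
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)  # Bartlett weights keep the estimate non-negative
        long_run += 2.0 * weight * sum(g[t] * g[t - k] for t in range(k, n))
    return slope / (math.sqrt(long_run) / sxx)


def rejection_rate(rho, rho_star, true_r2=0.10, n_obs=200, lags=4,
                   n_trials=300, cutoff=1.96, seed=0):
    """Share of trials with |t| > cutoff when the measured regressor is truly irrelevant."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(1.0 - true_r2)
    hits = 0
    for _ in range(n_trials):
        z_star = ar1(n_obs, rho_star, true_r2, rng)  # unobserved expected return, equations (2)-(3)
        z = ar1(n_obs, rho, true_r2, rng)            # measured regressor, independent of z_star
        # r_next[t] plays the role of r_{t+1} = Z*_t + u_{t+1} (mu set to zero)
        r_next = [zs + rng.gauss(0.0, noise_sd) for zs in z_star]
        if abs(newey_west_t(r_next, z, lags)) > cutoff:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print("persistent expected return:", rejection_rate(rho=0.98, rho_star=0.98))
    print("no persistence in expected return:", rejection_rate(rho=0.98, rho_star=0.0))
```

Under these assumptions the second rate should stay near the nominal 5 percent while the first should be visibly higher, in line with the observation that a persistent regressor alone does not create the problem when $\rho^* = 0$.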
Table II shows that spurious regression bias does not arise to any serious degree, provided $\rho^*$ is 0.90 or less, and the true $R^2$ is one percent or less. For these parameters, the empirical critical values for the t-ratios are 2.48 ($T = 66$, Panel A), and 2.07 ($T = 824$, Panel B). The empirical critical R-squares are close to their theoretical values. For example, for a five percent test with $T = 66$ (824) the F distribution implies critical R-squared values of 5.9 percent (0.5 percent). The values in Table II when $\rho^* = 0.90$ and true $R^2 = 1$ percent, are 6.2 percent (0.5 percent); thus, the empirical distributions do not depart far from the standard rules of thumb.

Variables like short-term interest rates and dividend yields typically have first-order sample autocorrelations in excess of 0.95, as we saw in Table I. We find substantial biases when the regressors are highly persistent. Consider the plausible scenario with a sample of $T = 824$ observations where $\rho = 0.98$ and true $R^2 = 10$ percent. In view of the spurious regression phenomenon, an analyst who was not sure that the correct instrument is being used and who wanted to conduct a 5 percent, two-tailed t-test for the significance of the measured instrument would have to use a t-ratio of 3.6. The coefficient of determination would have to exceed 2.2 percent to be significant at the 5 percent level. These cutoffs are substantially more stringent than the usual rules of thumb.

Panel C of Table II revisits the evidence from the literature in Table I. The critical values for the t-ratios and R-squares are reported, along with the theoretical critical values for the R-squares implied by the F distribution. We set the true R-squared value equal to 10 percent and $\rho^* = \rho$ in each case. We find that 7 of the 17 statistics in Table I that would be considered significant using the traditional standards, are no longer significant in view of the spurious regression bias.
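The F-implied critical R-squares quoted in this subsection (5.9 percent for $T = 66$ and 0.5 percent for $T = 824$) can be recovered from the standard link between the slope F-statistic and the unadjusted R-squared in a univariate regression with an intercept. The worked lines below are a sketch: the critical F values are approximate table values supplied here as assumptions, not figures taken from the paper.

```latex
% Univariate regression with an intercept: slope F-statistic versus R^2
F = \frac{R^2}{1 - R^2}\,(T - 2)
\quad\Longrightarrow\quad
R^2_{\mathrm{crit}} = \frac{F_{\mathrm{crit}}}{F_{\mathrm{crit}} + T - 2}

% T = 66, assuming F_{0.95}(1, 64) \approx 3.99
R^2_{\mathrm{crit}} \approx \frac{3.99}{3.99 + 64} \approx 0.059

% T = 824, assuming F_{0.95}(1, 822) \approx 3.85
R^2_{\mathrm{crit}} \approx \frac{3.85}{3.85 + 822} \approx 0.0047
```

Both values round to the percentages reported in the text, which suggests the theoretical benchmarks refer to the unadjusted R-squared.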
While Panels A and B of Table II show that spurious regression can be a problem in stock return regressions, Panel C finds that accounting for spurious regression changes the inferences about specific regressors that were found to be significant in previous studies. In particular, we question the significance of the term spread in Fama and French (1989), on the basis of either the t-ratio or the R-squared of the regression. Similarly, the book-to-market ratio of the Dow Jones index, studied by Pontiff and Schall (1998) fails to be significant with either statistic. Several other variables are marginal, failing on the basis of one but not both statistics. These include the short-term interest rate (Fama and Schwert (1977), using the more recent sample of Breen, Glosten, and Jagannathan (1989)), the dividend yield (Fama and French (1988a)), and the quality-related yield spread (Keim and Stambaugh (1986)). All of these regressors would be considered significant using the standard cutoffs.

It is interesting to note that the biases documented in Table II do not always diminish with larger sample sizes; in fact, the critical t-ratios are larger in the lower right corner of the panels when $T = 824$ than when $T = 66$. The mean values of the slope coefficients are closer to zero at the larger sample size, so the larger critical values are driven by the standard errors. A sample as large as $T = 824$ is not by itself a cure for the spurious regression bias. This is typical of spurious regression with a unit root, as discussed by Phillips (1986) for infinite sample sizes and nonstationary data.[6] It is interesting to observe similar patterns, even with stationary data and finite samples.

[6] Phillips derives asymptotic distributions for the OLS estimators of equation (1), in the case where $\rho = 1$, $u_{t+1} \equiv 0$. He shows that the t-ratio for $\delta$ diverges for large $T$, while $t(\delta)/\sqrt{T}$, $\delta$, and the coefficient of determination converge to well-defined random variables. Marmol (1998) extends these results to multiple regressions with partially integrated processes, and provides references to more recent theoretical literature. Phillips (1998) reviews analytical tools for asymptotic analysis when nonstationary series are involved.

Phillips (1986) shows that the sample autocorrelation in the regression studied by Granger and Newbold (1974) converges in limit to 1.0. However, we find only mildly inflated residual autocorrelations (not reported in the tables) for stock return samples as large as $T = 2000$, even when we assume values of the true $R^2$ as large as 40 percent. Even in these extreme cases, none of the empirical critical values for the residual autocorrelations are larger than 0.5. Since $u_{t+1} = 0$ in the cases studied by Phillips, we expect to see explosive autocorrelations only when the true $R^2$ is very large. When $R^2$ is small, the white noise component of the returns serves to dampen the residual autocorrelation. Thus, we are not likely to see large residual autocorrelations in asset pricing models, even where spurious regression is a problem. The residuals-based diagnostics for spurious regression, such as the Durbin-Watson tests suggested by Granger and Newbold, are not likely to be very powerful in asset pricing regressions. For the same reason, naive application of the Newey-West procedure, where the number of lags is selected by examining the residual autocorrelations, is not likely to resolve the spurious regression problem.

Newey and West (1987) show that their procedure is consistent when the number of lags used grows without bound as the sample size $T$ increases, provided that the number of lags grows no faster than $T^{1/4}$. The lag selection procedure in Table II examines 12 lags. Even though no more than nine lags are selected for the actual data in Table I, more lags would sometimes be selected in the simulations, and an inconsistency results from truncating the lag length.[7] However, in finite samples, an increase in the number of lags can make things worse. When "too many" lags are used, the standard error estimates become excessively noisy, which thickens the tails of the sampling distribution of the t-ratios. This occurs

[7] At very large sample sizes, a huge number of lags can control the bias. We verify this by examining samples as large as $T = 5000$, letting the number of lags grow to 240. With 240 lags, the critical t-ratio when the true $R^2 = 10$ percent and $\rho = 0.98$ falls from 3.6 in Panel B of Table II to a reasonably well-behaved value of 2.23.
