
Econometrica, Vol. 73, No. 3 (May, 2005), 771–836

ESTIMATING SEMIPARAMETRIC ARCH(∞) MODELS BY KERNEL SMOOTHING METHODS¹

BY O. LINTON² AND E. MAMMEN³

We investigate a class of semiparametric ARCH(∞) models that includes as a special case the partially nonparametric (PNP) model introduced by Engle and Ng (1993) and which allows for both flexible dynamics and flexible function form with regard to the "news impact" function. We show that the functional part of the model satisfies a type II linear integral equation and give simple conditions under which there is a unique solution. We propose an estimation method that is based on kernel smoothing and profiled likelihood. We establish the distribution theory of the parametric components and the pointwise distribution of the nonparametric component of the model. We also discuss efficiency of both the parametric part and the nonparametric part. We investigate the performance of our procedures on simulated data and on a sample of S&P500 index returns. We find evidence of asymmetric news impact functions, consistent with the parametric analysis.

KEYWORDS: ARCH, inverse problem, kernel estimation, news impact curve, nonparametric regression, profile likelihood, semiparametric estimation, volatility.

1. INTRODUCTION

STOCHASTIC VOLATILITY MODELS are of considerable current interest in empirical finance following the seminal work of Engle (1982). Perhaps the most popular version of this is Bollerslev's (1986) GARCH(1,1) model in which the conditional variance $\sigma_t^2$ of a martingale difference sequence $y_t$ is

(1) $\sigma_t^2=\beta\sigma_{t-1}^2+\alpha+\gamma y_{t-1}^2.$

This model has been extensively studied and generalized in various ways. See the review of Bollerslev, Engle, and Nelson (1994). This paper is about a particular class of nonparametric/semiparametric generalizations of (1). The motivation for this line of work is to increase the flexibility of the class of models we use and to learn from this the shape of the volatility function without restricting it a priori to have or not have certain shapes.

The nonparametric ARCH literature apparently begins with Pagan and Schwert (1990) and Pagan and Hong (1991).
They consider the case where $\sigma_t^2=\sigma^2(y_{t-1})$, where $\sigma(\cdot)$ is a smooth but unknown function, and the multilag version $\sigma_t^2=\sigma^2(y_{t-1},y_{t-2},\ldots,y_{t-d})$. Härdle and Tsybakov (1997) applied local linear fit to estimate the volatility function together with the mean function and derived their joint asymptotic properties. The multivariate extension is given in Härdle, Tsybakov, and Yang (1996). Masry and Tjøstheim (1995) also estimate nonparametric ARCH models using the Nadaraya–Watson kernel estimator. In practice, it is necessary to include many lagged variables. The problem with this is that nonparametric estimation of a multidimensional regression surface suffers from the well-known "curse of dimensionality": the optimal rate of convergence decreases with dimensionality $d$; see Stone (1980). In addition, it is hard to describe, interpret, and understand the estimated regression surface when the dimension is more than two. Furthermore, even for large $d$ this model greatly restricts the dynamics for the variance process since it effectively corresponds to an ARCH($d$) model, which is known in the parametric case not to capture the dynamics well. In particular, if the conditional variance is highly persistent, the nonparametric estimator of the conditional variance will provide a poor approximation, as reported in Perron (1998). So not only does this model not capture adequately the time series properties of many data sets, but the statistical properties of the estimators can be poor and the resulting estimators hard to interpret.

¹ We would like to thank Costas Meghir, two referees, Xiaohong Chen, Paul Doukhan, Wolfgang Härdle, Christian Huse, Dennis Kristensen, Jens Nielsen, and Eric Renault for helpful comments.
² Research was supported by the Economic and Social Science Research Council of the United Kingdom.
³ Supported by the Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 373 "Quantifikation und Simulation Ökonomischer Prozesse," Humboldt-Universität zu Berlin and Project MA1026/6-2.
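As a rough illustration of the single-lag nonparametric ARCH model $\sigma_t^2=\sigma^2(y_{t-1})$ described above, the following sketch estimates $\sigma^2(\cdot)$ by a Nadaraya–Watson regression of $y_t^2$ on $y_{t-1}$. The simulated ARCH(1) design, the Gaussian kernel, the bandwidth, and the function names are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np

def simulate_arch1(T, alpha=0.2, gamma=0.5, seed=0):
    """Simulate y_t = sigma_t * eps_t with sigma_t^2 = alpha + gamma * y_{t-1}^2.

    Illustrative design only (parameter values are assumptions).
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = np.sqrt(alpha + gamma * y[t - 1] ** 2) * eps[t]
    return y

def nw_variance(y, grid, h):
    """Nadaraya-Watson estimate of E[y_t^2 | y_{t-1} = x] at each x in `grid`."""
    lagged, squared = y[:-1], y[1:] ** 2
    # Gaussian kernel weights: one row per grid point, one column per observation.
    u = (grid[:, None] - lagged[None, :]) / h
    w = np.exp(-0.5 * u ** 2)
    return (w @ squared) / w.sum(axis=1)

y = simulate_arch1(T=20000)
grid = np.array([0.0, 0.5, 1.0])
sigma2_hat = nw_variance(y, grid, h=0.1)   # kernel estimate of the variance function
sigma2_true = 0.2 + 0.5 * grid ** 2        # variance function used in the simulation
```

With one lag the kernel estimate can be compared directly with the curve used to simulate the data. The same construction with $d$ lags would need a $d$-dimensional kernel, which is where the curse of dimensionality mentioned above enters.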
Additive models offer a flexible but parsimonious alternative to nonparametric models, and have been used in many contexts, see Hastie and Tibshirani (1990). Suppose that $\sigma_t^2=c_v+\sum_{j=1}^{d}\sigma_j^2(y_{t-j})$. The best achievable rate of convergence for estimates of $\sigma_j^2(\cdot)$ is that of one-dimensional nonparametric regression; see Stone (1985). Yang, Härdle, and Nielsen (1999) proposed an alternative nonlinear ARCH model in which the conditional mean is additive, but the volatility is multiplicative: $\sigma_t^2=c_v\prod_{j=1}^{d}\sigma_j^2(y_{t-j})$. Their estimation strategy is based on the method of partial means/marginal integration using local linear fits as a pilot smoother. Kim and Linton (2004) generalize this model to allow for arbitrary (but known) transformations, i.e., $G(\sigma_t^2)=c_v+\sum_{j=1}^{d}\sigma_j^2(y_{t-j})$, where $G(\cdot)$ is a known function like log or level. Horowitz (2001) has analyzed the model where $G(\cdot)$ is also unknown, but his results were only in a cross-sectional setting. These separable models deal with the curse of dimensionality, but still do not capture the persistence of volatility and specifically they do not nest the favorite GARCH(1,1) process.

This paper analyzes a class of semiparametric ARCH models that has both general functional form aspects and flexible dynamics. A special case of our model is the Engle and Ng (1993) PNP model where $\sigma_t^2=\beta\sigma_{t-1}^2+m(y_{t-1})$, where $m(\cdot)$ is a smooth but unknown function. Our semiparametric model nests the simple GARCH(1,1) model but permits more general functional form: it allows for an asymmetric leverage effect and as much dynamics as GARCH(1,1). A major issue we solve is how to estimate the function $m(\cdot)$ by kernel methods. Our estimation approach is to derive population moment conditions for the nonparametric part and then solve them with empirical counterparts.
The moment conditions we obtain are linear type II Fredholm integral equations, and so they fall in the class of inverse problems reviewed in Carrasco, Florens, and Renault (2003). These equations have been extensively studied in the applied mathematics literature; see, for example, Tricomi (1957). They also arise a lot in economic theory; see Stokey and Lucas (1989). The solution of these equations in our case only requires the computation of two-dimensional smoothing operations and one-dimensional integration, and so is attractive computationally. From a statistical perspective, there has been some recent work on this class of estimation problems. Starting with Friedman and Stuetzle (1981), in Breiman and Friedman (1985) and Hastie and Tibshirani (1990) these methods have been investigated in the context of additive nonparametric regression and related models, where the estimating equations are usually of type II. Recently, Opsomer and Ruppert (1997) and Mammen, Linton, and Nielsen (1999) have provided a pointwise distribution theory for this specific class of problems. Newey and Powell (2003) studied nonparametric simultaneous equations and obtained an estimation equation that was a linear integral equation also, except that it is the more difficult type I. They establish the uniform consistency of their estimator; see also Darolles, Florens, and Renault (2002). Hall and Horowitz (2003) establish the optimal rate for estimation in this problem and propose two estimators that achieve this rate. Neither paper provides pointwise distribution theory. Our estimation methods and proof technique are purely applicable to the type II situation, which is nevertheless quite common elsewhere in economics. For example, Berry and Pakes (2002) derive estimators for a class of semiparametric dynamic models used in industrial organization applications, and which solve type II equations similar to ours.

Our paper goes significantly beyond the existing literature in two respects.
First, the integral operator does not necessarily have norm less than 1 so that the iterative solution method of successive approximations is not feasible. This also affects the way we derive the asymptotic properties, and we cannot directly apply the results of Mammen, Linton, and Nielsen (1999) here. Second, we have also finite-dimensional parameters and their estimation is of interest in itself. We establish the consistency and pointwise asymptotic normality of our estimates of the parameter and of the function. We establish the semiparametric efficiency bound for a Gaussian special case and show that our parameter estimator achieves this bound. We also discuss the efficiency question regarding the nonparametric component and conclude that a likelihood-based version of our estimator cannot be improved on without additional structure. We investigate the practical performance of our method on simulated data and present the result of an application to S&P500 data. The empirical results indicate some asymmetry and nonlinearity in the news impact curve.

Our model is introduced in the next section. In Section 3 we present our estimators. In Section 4 we give the asymptotic properties. In Section 5 we discuss an extension of our basic setting that accommodates a richer variety of tail behavior. Section 6 reports some numerical results and Section 7 concludes.

2. THE MODEL AND ITS PROPERTIES

We shall suppose throughout that the process $\{y_t\}_{t=-\infty}^{\infty}$ is stationary with finite fourth moment. We concentrate most of our attention on the case where there is no mean process, although we later discuss the extension to allow for some mean dynamics. Define the volatility process model

(2) $\sigma_t^2(\theta,m)=\mu_t+\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j}),$

where $\mu_t\in\mathbb{R}$, $\theta\in\Theta\subset\mathbb{R}^p$, and $m\in\mathcal{M}$, where $\mathcal{M}=\{m:\text{measurable}\}$. The coefficients $\psi_j(\theta)$ satisfy at least $\psi_j(\theta)\ge 0$ and $\sum_{j=1}^{\infty}\psi_j(\theta)<\infty$ for all $\theta\in\Theta$. The true parameters $\theta_0$ and the true function $m_0(\cdot)$ are unknown and to be estimated from a finite sample $\{y_1,\ldots,y_T\}$.
The process $\mu_t$ can be allowed to depend on covariates and unknown parameters, but at this stage it is assumed to be known. In much of the sequel it can be put equal to zero without any loss of generality. It will become important below when we consider more restrictive choices of $\mathcal{M}$. Robinson (1991) is perhaps the first study of ARCH(∞) models, although he restricted attention to the quadratic $m$ case.

Following Drost and Nijman (1993), we can give three interpretations to (2). The strong form ARCH(∞) process arises when

(3) $\dfrac{y_t}{\sigma_t}=\varepsilon_t$

is i.i.d. with mean 0 and variance 1, where $\sigma_t^2=\sigma_t^2(\theta_0,m_0)$. The semistrong form arises when

(4) $E(y_t\mid\mathcal{F}_{t-1})=0$ and $E(y_t^2\mid\mathcal{F}_{t-1})\equiv\sigma_t^2,$

where $\mathcal{F}_{t-1}$ is the sigma field generated by the entire past history of the $y$ process. Finally, there is a weak form in which $\sigma_t^2$ is defined as the projection on a certain subspace. Specifically, let $\theta_0,m_0$ be defined as the minimizers of the population least squares criterion function

(5) $S(\theta,m)=E\Bigl[\Bigl\{y_t^2-\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})\Bigr\}^2\Bigr]$

and let $\sigma_t^2=\sum_{j=1}^{\infty}\psi_j(\theta_0)\,m_0(y_{t-j})$. The criterion (5) is well defined only when $E(y_t^4)<\infty$.

In the special case that $\psi_j(\theta)=\theta^{j-1}$, with $0<\theta<1$, we can rewrite (2) as a difference equation in the unobserved variance

(6) $\sigma_t^2=\theta\sigma_{t-1}^2+m(y_{t-1})\quad(t=1,2,\ldots),$

and this is consistent with a stationary GARCH(1,1) structure for the unobserved variance when $m(y)=\alpha+\gamma y^2$ for some parameters $\alpha,\gamma$. It also includes other parametric models as special cases: the Glosten, Jagannathan, and Runkle (1993) model, taking $m(y)=\alpha+\gamma y^2+\delta y^2 1(y<0)$, the Engle (1990) asymmetric model, taking $m(y)=\alpha+\gamma(y+\delta)^2$, and the Engle and Bollerslev (1986) model, taking $m(y)=\alpha+\gamma|y|^{\delta}$.

The function $m(\cdot)$ is the "news impact function," and it determines the way in which the volatility is affected by shocks to $y$. Our model allows for general news impact functions including both symmetric and asymmetric functions, and so accommodates the leverage effect (Nelson (1991)).
The parameter $\theta$, through the coefficients $\psi_j(\theta)$, determines the persistence of the process, and we in principle allow for quite general coefficient values. A general class of coefficients can be obtained from the expansion of autoregressive moving average (ARMA) lag polynomials, as in Nelson (1991).

Our model generalizes the model considered in Carroll, Mammen, and Härdle (2002) in which $\sigma_t^2=\sum_{j=1}^{\tau}\theta_0^{j-1}m_0(y_{t-j})$ for some finite $\tau$. Their estimation strategy was quite different from ours: they relied on an initial estimator of a $\tau$-dimensional surface and then marginal integration (Linton and Nielsen (1995)) to improve the rate of convergence. This method is likely to work poorly when $\tau$ is very large. Also, their theory requires the smoothness of $m$ to increase with $\tau$. Indeed, a contribution of our paper is to provide an estimation method for $\theta_0$ and $m(\cdot)$ that just relies on one-dimensional smoothing operations, but is also amenable to theoretical analysis. Some other papers can be considered precursors to this one. First, Gouriéroux and Monfort (1992) introduced the qualitative threshold ARCH (QTARCH) which allowed quite flexible patterns of conditional mean and variance through step functions, although their analysis was purely parametric. Engle and Ng (1993) analyzed precisely the semistrong model (2) with $\psi_j(\theta)=\theta^{j-1}$ and called it partially nonparametric or PNP for short. They proposed an estimation strategy based on piecewise linear splines. Finally, we should mention some work by Audrino and Bühlmann (2001): their model is that $\sigma_t^2=\Lambda(y_{t-1},\sigma_{t-1}^2)$ for some smooth but unknown function $\Lambda(\cdot)$, and includes the PNP model as a special case. However, although they proposed an estimation algorithm, they did not establish the distribution theory of their estimator.

In the next subsection we discuss a characterization of the model that generates our estimation strategy. If $m$ were known, it would be straightforward to estimate $\theta$ from some likelihood or least squares criterion.
The main issue is how to estimate $m(\cdot)$ even when $\theta$ is known. The kernel method likes to express the function of interest as a conditional expectation or density of a small number of observable variables, but this is not directly possible here because $m$ is only implicitly defined. However, we are able to show that $m$ can be expressed in terms of all the bivariate joint densities of $(y_t,y_{t-j})$, $j=\pm1,\ldots$, i.e., this collection of bivariate densities forms a set of sufficient statistics for our model. We use this relationship to generate our estimator.

2.1. Linear Characterization

Suppose for pedagogic purposes that the semistrong process defined in (4) holds, and for simplicity define $\tilde y_t^2=y_t^2-\mu_t$. Take marginal expectations for any $j\ge1$,

(7) $E(\tilde y_t^2\mid y_{t-j}=y)=\psi_j(\theta_0)\,m(y)+\sum_{k\ne j}^{\infty}\psi_k(\theta_0)\,E[m(y_{t-k})\mid y_{t-j}=y].$

For each such $j$ the above equation implicitly defines $m(\cdot)$. This is really a moment condition in the functional parameter $m(\cdot)$ for each $j$, and can be used as an estimating equation. As in the parametric method of moments case, it can pay to combine the estimating equations in terms of efficiency. Specifically, we take the linear combination of these moment conditions,

(8) $\sum_{j=1}^{\infty}\psi_j(\theta_0)\,E(\tilde y_t^2\mid y_{t-j}=y)=\sum_{j=1}^{\infty}\psi_j^2(\theta_0)\,m(y)+\sum_{j=1}^{\infty}\psi_j(\theta_0)\sum_{k\ne j}^{\infty}\psi_k(\theta_0)\,E[m(y_{t-k})\mid y_{t-j}=y],$

which yields another implicit equation in $m(\cdot)$.

This equation arises as the first order condition from the least squares definition of $\sigma_t^2$, given in (5), as we now discuss. We can assume that the quantities $\theta_0,m_0(\cdot)$ are the unique minimizers of (5) over $\Theta\times\mathcal{M}$ by the definition of conditional expectation, see Drost and Nijman (1993). Furthermore, the minimizer of (5) satisfies a first-order condition and in the Appendix we show that this first-order condition is precisely (8). In fact, if we minimize (5) with respect to $m\in\mathcal{M}$ for any $\theta\in\Theta$ and let $m_\theta$ denote this minimizer, then $m_\theta$ satisfies (8) with $\theta_0$ replaced by $\theta$. Note that we are treating $\mu_t$ as a known quantity.
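The link between the least squares criterion (5) and equation (8) can be sketched with a directional-derivative argument. What follows is a heuristic outline under stated assumptions, not the Appendix proof: it assumes differentiation and expectation can be interchanged, writes $h$ for an arbitrary bounded perturbation direction and $p_0$ for the marginal density of $y_t$, and uses $\tilde y_t^2$ in place of $y_t^2$ in (5) so that $\mu_t$ is handled as in (7).

```latex
\begin{align*}
0 &= \frac{d}{d\epsilon}\,
     E\Bigl[\Bigl\{\tilde y_t^2-\sum_{k=1}^{\infty}\psi_k(\theta)\,
     (m_\theta+\epsilon h)(y_{t-k})\Bigr\}^2\Bigr]\Big|_{\epsilon=0} \\
  &= -2\,E\Bigl[\Bigl\{\tilde y_t^2-\sum_{k=1}^{\infty}\psi_k(\theta)\,
     m_\theta(y_{t-k})\Bigr\}\sum_{j=1}^{\infty}\psi_j(\theta)\,h(y_{t-j})\Bigr] \\
  &= -2\int h(y)\sum_{j=1}^{\infty}\psi_j(\theta)\,
     E\Bigl[\tilde y_t^2-\sum_{k=1}^{\infty}\psi_k(\theta)\,m_\theta(y_{t-k})
     \,\Big|\,y_{t-j}=y\Bigr]\,p_0(y)\,dy .
\end{align*}
% The last step uses iterated expectations and stationarity
% (every y_{t-j} has the same marginal density p_0).
% Since h is arbitrary, the integrand vanishes for p_0-almost all y.
% Splitting off the k = j term with E[m_\theta(y_{t-j}) | y_{t-j}=y] = m_\theta(y):
\begin{equation*}
\sum_{j=1}^{\infty}\psi_j(\theta)\,E(\tilde y_t^2\mid y_{t-j}=y)
 = \sum_{j=1}^{\infty}\psi_j^2(\theta)\,m_\theta(y)
 + \sum_{j=1}^{\infty}\psi_j(\theta)\sum_{k\ne j}\psi_k(\theta)\,
   E[m_\theta(y_{t-k})\mid y_{t-j}=y].
\end{equation*}
```

The final display has the same form as (8) with $\theta$ in place of $\theta_0$ and $m_\theta$ in place of $m$.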
We next rewrite (8) (for general $\theta$) in a more convenient form. Let $p_0$ denote the marginal density of $y$ and let $p_{j,l}$ denote the joint density of $y_j,y_l$. Define

(9) $H_\theta(y,x)=-\sum_{j=\pm1}^{\pm\infty}\psi_j^*(\theta)\,\dfrac{p_{0,j}(y,x)}{p_0(y)\,p_0(x)},$

(10) $m_\theta^*(y)=\sum_{j=1}^{\infty}\psi_j^{\dagger}(\theta)\,g_j(y),$

where $\psi_j^{\dagger}(\theta)=\psi_j(\theta)/\sum_{l=1}^{\infty}\psi_l^2(\theta)$ and $\psi_j^*(\theta)=\sum_{k\ne0}\psi_{j+k}(\theta)\psi_k(\theta)/\sum_{l=1}^{\infty}\psi_l^2(\theta)$, while $g_j(y)=E(\tilde y_t^2\mid y_{t-j}=y)$ for $j\ge1$. Then the function $m_\theta(\cdot)$ satisfies

(11) $m_\theta(y)=m_\theta^*(y)+\int H_\theta(y,x)\,m_\theta(x)\,p_0(x)\,dx$

for each $\theta\in\Theta$ (this equation is equivalent to (8) for all $\theta\in\Theta$). The operator $H_j(y,x)=p_{0,j}(y,x)/\{p_0(y)p_0(x)\}$ is well studied in the statistics literature (see Bickel, Klaassen, Ritov, and Wellner (1993, p. 440)); our operator $H_\theta$ is just a weighted sum of such operators, where the weights are declining to zero rapidly. In additive nonparametric regression, the corresponding integral operator is an unweighted sum of operators like $H_j(y,x)$ over the finite number of dimensions (see Hastie and Tibshirani (1990) and Mammen, Linton, and Nielsen (1999)). Although the operators $H_j$ are not self-adjoint without an additional assumption of time reversibility, it can easily be seen that $H_\theta$ is self-adjoint in $L_2(p_0)$ due to the two-sided summation.⁴

Our estimation procedure will be based on plugging estimates $\hat m_\theta^*$ and $\hat H_\theta$ of $m_\theta^*$ and $H_\theta$, respectively, into (11) and then solving for $\hat m_\theta$. The estimates $\hat m_\theta^*$ and $\hat H_\theta$ will be constructed by plugging estimates of $p_{0,j}$, $p_0$, and $g_j$ into (10) and (9). Nonparametric estimates of these functions only work accurately for arguments not too large. We do not want to enter into a discussion of tail behavior of nonparametric estimates at this point. For this reason we change our minimization problem (5), or rather restrict the parameter sets further.
We consider minimization of (5) over all $\theta\in\Theta$ and $m\in\mathcal{M}_c$, where now $\mathcal{M}_c$ is the class of all bounded measurable functions that vanish outside $[-c,c]$, where $c$ is some fixed constant (this makes $\sigma_t^2=\mu_t$ whenever $y_{t-j}\notin[-c,c]$ for all $j$). Let us denote these minimizers by $\theta_c$ and $m_c$. Furthermore, denote the minimizer of (5) for fixed $\theta$ over $m\in\mathcal{M}_c$ by $m_{\theta,c}$. Then $\theta_c$ and $m_c$ minimize $E[\{\tilde y_t^2-\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})\}^2]$ over $\Theta\times\mathcal{M}_c$ and $m_{\theta,c}$ minimizes $E[\{\tilde y_t^2-\sum_{j=1}^{\infty}\psi_j(\theta)\,m_\theta(y_{t-j})\}^2]$ over $\mathcal{M}_c$. For now we adopt a fixed truncation where $c$ and $\mu_t$ are constant and known, but return to this in Section 5. Then $m_{\theta,c}$ satisfies $m_{\theta,c}(y)=m_\theta^*(y)+\int_{-c}^{c}H_\theta(y,x)\,m_{\theta,c}(x)\,p_0(x)\,dx$ for $|y|\le c$ and vanishes for $|y|>c$. For simplicity but in abuse of notation we omit the subindex $c$ of $m_{\theta,c}$ and we write

(12) $m_\theta=m_\theta^*+H_\theta m_\theta.$

For each $\theta\in\Theta$, $H_\theta$ is a self-adjoint linear operator on the Hilbert space of functions $m$ that are defined on $[-c,c]$ with norm $\|m\|_2^2=\int_{-c}^{c}m(x)^2p_0(x)\,dx$ and (12) is a linear integral equation of the second kind. There are some general results providing sufficient conditions under which such integral equations have a unique solution. See Darolles, Florens, and Renault (2002) for a discussion on existence and uniqueness for the more general class of type I equations. We assume the following high level condition:

ASSUMPTION A1: The operator $H_\theta(x,y)$ is Hilbert–Schmidt uniformly over $\theta$, i.e., $\sup_{\theta\in\Theta}\int_{-c}^{c}\int_{-c}^{c}H_\theta(x,y)^2\,p_0(x)\,p_0(y)\,dx\,dy<\infty$.

⁴ Specifically, with $\langle f,g\rangle=\int f(x)g(x)p_0(x)\,dx$ denoting the usual inner product in $L_2(p_0)$, we have $\langle g,H_\theta m\rangle=-\sum_{j\ne k}\psi_j(\theta)\psi_k(\theta)E[g(y_{t-j})E[m(y_{t-k})\mid y_{t-j}]]=-\sum_{j\ne k}\psi_j(\theta)\psi_k(\theta)E[g(y_{t-j})m(y_{t-k})]=\langle H_\theta g,m\rangle$ because the double sum is symmetric in $j,k$. The definition of adjoint operator can be found in Bickel, Klaassen, Ritov, and Wellner (1993, p. 416).
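Equation (12) becomes a finite linear system once the integral is replaced by a quadrature rule on a grid over $[-c,c]$. The sketch below illustrates this with a made-up symmetric kernel and a standard normal density standing in for $p_0$. The kernel, the grid size, the trapezoidal weights, and the function names are assumptions for illustration only; this is not the estimator developed in the paper. The sketch also evaluates a discretized version of the double integral in Assumption A1.

```python
import numpy as np

def solve_second_kind(H, p0, m_star, w):
    """Solve m = m* + int H(y, x) m(x) p0(x) dx on a grid via (I - A) m = m*."""
    # A[i, k] approximates H(x_i, x_k) * p0(x_k) * dx_k, so A @ m approximates (H m)(x_i).
    A = H * (p0 * w)[None, :]
    m = np.linalg.solve(np.eye(H.shape[0]) - A, m_star)
    return m, A

c, n = 2.0, 201
x = np.linspace(-c, c, n)
w = np.full(n, x[1] - x[0])                       # trapezoidal quadrature weights
w[[0, -1]] *= 0.5
p0 = np.exp(-0.5 * x ** 2) / np.sqrt(2 * np.pi)   # stand-in for p_0 (assumption)

# Hypothetical symmetric kernel H(y, x); the negative sign mimics the form of (9).
H = -0.8 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

# Discretized version of the Hilbert-Schmidt integral in Assumption A1.
hs_norm_sq = np.sum(H ** 2 * np.outer(p0 * w, p0 * w))

# Build m* from a known curve so that the recovered solution can be checked.
m_true = 0.1 + 0.3 * x ** 2
m_star = m_true - (H * (p0 * w)[None, :]) @ m_true
m_hat, A = solve_second_kind(H, p0, m_star, w)
```

Solving the linear system directly does not require the discretized operator to be a contraction, which a fixed-point iteration of the form `m = m_star + A @ m` would.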
A sufficient condition for Assumption A1 is that the joint densities $p_{0,j}(y,x)$ are uniformly bounded for $j\ne0$ and $|x|,|y|\le c$, and that the density $p_0(x)$ is bounded away from 0 for $|x|\le c$.

Under Assumption A1, for each $\theta\in\Theta$, $H_\theta$ is a self-adjoint bounded compact linear operator on the Hilbert space of functions $L_2(p_0)$, and therefore has a countable number of eigenvalues: $\infty>|\lambda_{\theta,1}|\ge|\lambda_{\theta,2}|\ge\cdots$, with $\sup_{\theta\in\Theta}\sum_{j=1}^{\infty}\lambda_{\theta,j}^2<\infty$.

ASSUMPTION A2: There exist no $\theta\in\Theta$ and $m\in\mathcal{M}_c$ with $\|m\|_2=1$ such that $\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})=0$ with probability 1.

This condition rules out a certain "concurvity" in the stochastic process. That is, the data cannot be functionally related in this particular way. It is a natural generalization to our situation of the condition that the regressors be not linearly related in a linear regression. A special case of this condition was used in Weiss (1986) and Kristensen and Rahbek (2003) for identification in parametric ARCH models, see also the arguments used in Lumsdaine (1996, Lemma 5) and Robinson and Zaffaroni (2002, Lemma 9).

ASSUMPTION A3: The operator $H_\theta$ fulfills the following continuity condition for $\theta,\theta'\in\Theta$: $\sup_{\|m\|_2\le1}\|H_\theta m-H_{\theta'}m\|_2\to0$ for $\|\theta-\theta'\|\to0$.

This condition is straightforward to verify. We now argue that because of Assumptions A2 and A3, for a constant $0<\gamma<1$,

(13) $\sup_{\theta\in\Theta}\lambda_{\theta,1}<\gamma.$

To prove this note that for $\theta\in\Theta$ and $m\in\mathcal{M}_c$ with $\|m\|_2=1$,

$0<E\Bigl[\Bigl(\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})\Bigr)^2\Bigr]$
$\quad=\chi_\theta\int_{-c}^{c}m^2(x)\,p_0(x)\,dx+\chi_\theta\int_{-c}^{c}\int_{-c}^{c}m(x)\,m(y)\sum_{|k|\ge1}\psi_k^*(\theta)\,p_{0,k}(x,y)\,dx\,dy$
$\quad=\chi_\theta\int_{-c}^{c}m^2(x)\,p_0(x)\,dx-\chi_\theta\int_{-c}^{c}m(x)\,H_\theta m(x)\,p_0(x)\,dx,$

where $\chi_\theta=\sum_{j=1}^{\infty}\psi_j^2(\theta)$ is a positive constant depending on $\theta$. For eigenfunctions $m\in\mathcal{M}_c$ of $H_\theta$ with eigenvalue $\lambda$ this shows that $\int m^2(x)\,p_0(x)\,dx-\lambda\int m^2(x)\,p_0(x)\,dx>0$. Therefore $\lambda_{\theta,j}<1$ for $\theta\in\Theta$ and $j\ge1$. Now, because of Assumption A3 and compactness of $\Theta$, this implies (13).

From (13) we get that $I-H_\theta$ has eigenvalues bounded from below by $1-\gamma>0$. Therefore $I-H_\theta$ is strictly positive definite and hence invertible, and $(I-H_\theta)^{-1}$ has only positive eigenvalues that are bounded by $(1-\gamma)^{-1}$:

(14) $\sup_{\theta\in\Theta,\;m\in\mathcal{M}_c,\;\|m\|_2=1}\|(I-H_\theta)^{-1}m\|_2\le(1-\gamma)^{-1}.$

Therefore, we can directly solve the integral equation (12) and write

(15) $m_\theta=(I-H_\theta)^{-1}m_\theta^*$

for each $\theta\in\Theta$. The representation (15) is fundamental to our estimation strategy, as it yields identification of $m_\theta$.

We next discuss a further property that leads to an iterative solution method rather than a direct inversion. If it holds that $|\lambda_{\theta,1}|<1$, then $m_\theta=\sum_{j=0}^{\infty}H_\theta^{j}m_\theta^*$. In this case the sequence of successive approximations $m_\theta^{[n]}=m_\theta^*+H_\theta m_\theta^{[n-1]}$, $n=1,2,\ldots$, converges in norm geometrically fast to $m_\theta$ from any starting point. This sort of property has been established in other related problems (see Hastie and Tibshirani (1990) for discussion) and is the basis of most estimation algorithms in this area. Unfortunately, the conditions that guarantee convergence of the successive approximations method are not likely to be satisfied here even in the special case that $\psi_j(\theta)=\theta^{j-1}$. The reason is that the unit function is always an eigenfunction of $H_\theta$ with eigenvalue determined by $-\sum_{j=\pm1}^{\pm\infty}\theta^{|j|}\,1=\lambda_\theta\cdot1$, which implies that $\lambda_\theta=-2\theta/(1-\theta)$. This is less than 1 in absolute value only when $\theta<1/3$. This implies that we will not be able to use directly the particularly convenient method of successive approximations (i.e., backfitting) for estimation: however, with some modifications it can be applied; see Linton and Mammen (2003).

2.2. Likelihood Characterization

In this section we provide an alternative characterization of $m_\theta,\theta$ in terms of the Gaussian likelihood. We use this characterization later to define the semiparametric efficiency bound for estimating $\theta$ in the presence of unknown $m$.
This characterization is also important for robustness reasons, since it does not require fourth moments on $y_t$.

Suppose that $m_0(\cdot),\theta_0$ are defined as the minimizers of the criterion function

(16) $\ell(\theta,m)=E\Bigl[\log\sigma_t^2(\theta,m)+\dfrac{y_t^2}{\sigma_t^2(\theta,m)}\Bigr]$

with respect to both $\theta,m(\cdot)$, where $\sigma_t^2(\theta,m)=\mu_t+\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})$. Notice that this criterion is well defined in many cases where the quadratic loss function is not.

Minimizing (16) with respect to $m$ for each given $\theta$ yields the first-order condition, which is a nonlinear integral equation in $m$:

(17) $\sum_{j=1}^{\infty}\psi_j(\theta)\,E\bigl[\sigma_t^{-4}(\theta,m)\{y_t^2-\sigma_t^2(\theta,m)\}\mid y_{t-j}=y\bigr]=0.$

This equation is difficult to work with from the point of view of statistical analysis because of the nonlinearity; see Horowitz and Mammen (2002). We consider instead a linearized version of this equation. Suppose that we have some initial approximation to $\sigma_t^2$. Then linearizing (17) about $\sigma_t^2$, we obtain the linear integral equation

(18) $m_\theta=m_\theta^*+H_\theta m_\theta;$

$m_\theta^*=\dfrac{\sum_{j=1}^{\infty}\psi_j(\theta)\,g_j^a(y)}{\sum_{j=1}^{\infty}\psi_j^2(\theta)\,g_j^b(y)},$

$H_\theta(x,y)=-\dfrac{\sum_{j=1}^{\infty}\sum_{l=1,\,l\ne j}^{\infty}\psi_j(\theta)\,\psi_l(\theta)\,g_{l,j}^c(x,y)\,p_{0,l-j}(x,y)/\{p_0(y)\,p_0(x)\}}{\sum_{j=1}^{\infty}\psi_j^2(\theta)\,g_j^b(y)}.$

Here, $g_j^a(y)=E[\sigma_t^{-4}y_t^2\mid y_{t-j}=y]=E[\sigma_t^{-2}\mid y_{t-j}=y]$, $g_j^b(y)=E[\sigma_t^{-4}\mid y_{t-j}=y]$, and $g_{l,j}^c(x,y)=E[\sigma_t^{-4}\mid y_{t-l}=x,\,y_{t-j}=y]$. This is a second kind linear integral equation in $m_\theta(\cdot)$ but with a different intercept and operator from (12). See Hastie and Tibshirani (1990, Section 6.5) for a similar calculation. Under our assumptions, see B4, the weighted operator satisfies Assumptions A1 and A3 also. For a proof of Assumption A3 note that $0<E\bigl[\sigma_t^{-4}\sum_{j=1}^{\infty}\psi_j(\theta)\,m(y_{t-j})\bigr]^2$. Note that in general the $m_\theta$ defined by (18) differs from the $m_\theta$ defined by (12), since they are defined as minimizers of different criteria. However, for the strong and semistrong versions of our model the two coincide at $\theta=\theta_0$.

3. ESTIMATION

We shall construct estimates of $\theta$ and $m$ from a sample $\{y_1,\ldots,y_T\}$. We proceed in four steps. First, for each given $\theta$ we compute estimates of $m_\theta^*$ and $H_\theta$, and then estimate $m_\theta$ by solving an empirical version of the integral equation (12). We then estimate $\theta$ by minimizing a profile least squares criterion. We then use the estimated parameter to give an estimator of $m(\cdot)$. Finally, we use our consistent estimators to define likelihood-based estimators that improve efficiency under some conditions. In particular, we solve an empirical version of the linearized likelihood implied integral equation (18) and then
