Revisiting the Neyman-Scott model: an Inconsistent MLE or an Ill-defined Model?

Aris Spanos
Department of Economics, Virginia Tech, USA

January 2013

arXiv:1301.6278v1 [stat.ME] 26 Jan 2013

Abstract

The Neyman and Scott (1948) model is widely used to demonstrate a serious weakness of the Maximum Likelihood (ML) method: it can give rise to inconsistent estimators. The primary objective of this paper is to revisit this example with a view to demonstrating that the culprit for the inconsistent estimation is not the ML method but an ill-defined statistical model. It is also shown that a simple recasting of this model renders it well-defined, and the ML method gives rise to consistent and asymptotically efficient estimators.

1 Introduction

Despite claims for priority by a number of different authors, Maximum Likelihood (ML) was first articulated as a general method of estimation in the context of a parametric statistical model by Fisher (1922). It took several decades to establish the regularity conditions needed to ensure the key asymptotic properties of ML Estimators (MLE), such as efficiency and consistency (Cramer, 1946; Wald, 1949), but since then ML has dominated estimation in frequentist statistics; see Stigler (2007) and Hald (2007) for this history.

Several counterexamples were proposed in the 1940s and 1950s raising doubts about the generality of the ML method. These counterexamples include Hodges's local superefficient estimator (Le Cam, 1953), the mixture of two Normal distributions (Cox, 2006), and the inconsistent MLE example based on the Neyman and Scott (1948) model. Commenting on these examples, Stigler (2007), p.
613, argued that none are considered serious enough to undermine the credibility of the ML method, and singled out the last example:

"The Wald-Neyman-Scott example was of more practical import, and still serves as a warning of what might occur in modern highly parameterized problems, where the information in the data may be spread too thinly to achieve asymptotic consistency."

The primary objective of this note is to revisit the Neyman-Scott example with a view to unpacking Stigler's assessment by demonstrating that the real culprit for the inconsistent estimator is not the ML method, as such, but an ill-defined statistical model. It is also shown that a simple recasting of this model renders it well-defined, and the ML method gives rise to consistent and asymptotically efficient estimators.

2 The Neyman-Scott model

The quintessential example used to demonstrate that ML might give rise to inconsistent estimators is the Neyman-Scott model:

\[
X_{it} = \mu_t + \varepsilon_{it}, \quad i=1,2, \quad t=1,2,\dots,n,\dots, \qquad \varepsilon_{it} \sim \mathrm{NIID}(0,\sigma^2),
\]

which can be viewed as a simple time-effects panel data model. The underlying distribution of the observable random variables \((X_{1t},X_{2t})\) is bivariate Normal, Independent but not Identically Distributed:

\[
\mathbf{X}_t := \begin{pmatrix} X_{1t} \\ X_{2t} \end{pmatrix}
\sim \mathrm{NI}\!\left( \begin{pmatrix} \mu_t \\ \mu_t \end{pmatrix},\
\begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} \right),
\quad t=1,2,\dots,n,\dots
\]

In light of the fact that the non-ID assumption implies that this model suffers from the incidental parameter problem, in the sense that the number of unknown parameters \((\mu_1,\mu_2,\dots,\mu_n)\) increases with the sample size, the latter are viewed as nuisance parameters, and \(\sigma^2\) as the only parameter of interest.

3 Maximum Likelihood Estimation?

Despite the incidental parameter problem, the Neyman-Scott model continues to be used as a counterexample to the ML method.
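As a concrete illustration (not part of the original paper), the data-generating process of the model in Section 2 can be simulated in a few lines; the specific incidental means \(\mu_t = \sin t\) and the value \(\sigma = 1\) are assumptions made purely for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_neyman_scott(n, sigma=1.0):
    """Draw X_it = mu_t + eps_it, i = 1, 2, t = 1..n, with
    eps_it ~ NIID(0, sigma^2); mu_t = sin(t) is an arbitrary
    choice standing in for the unknown incidental means."""
    mu = np.sin(np.arange(1, n + 1))           # hypothetical mu_1, ..., mu_n
    eps = rng.normal(0.0, sigma, size=(2, n))  # two draws per period t
    return mu + eps                            # shape (2, n)

X = simulate_neyman_scott(1000)
print(X.shape)  # (2, 1000)
```

The `(2, n)` shape makes the source of the trouble visible: however large `n` grows, only two observations ever inform each \(\mu_t\).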
That is, the literature ignores the incidental parameter problem and defines the distribution of the sample to be:

\[
f(\mathbf{x};\boldsymbol{\theta})
= \prod_{t=1}^{n}\prod_{i=1}^{2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x_{it}-\mu_t)^2}
= \prod_{t=1}^{n} \frac{1}{2\pi\sigma^2}\, e^{-\frac{1}{2\sigma^2}\left[(x_{1t}-\mu_t)^2+(x_{2t}-\mu_t)^2\right]}.
\]

This gives rise to the 'log-likelihood function':

\[
\ln L(\boldsymbol{\theta};\mathbf{x}) = -n\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{n}\left[(x_{1t}-\mu_t)^2+(x_{2t}-\mu_t)^2\right].
\]

The 'Maximum Likelihood Estimators' (MLE) are supposed to be derived by solving the first-order conditions:

\[
\frac{\partial \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \mu_t} = \frac{1}{\sigma^2}\left[(x_{1t}-\mu_t)+(x_{2t}-\mu_t)\right]=0,
\]
\[
\frac{\partial \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \sigma^2} = -\frac{n}{\sigma^2}+\frac{1}{2\sigma^4}\sum_{t=1}^{n}\left[(x_{1t}-\mu_t)^2+(x_{2t}-\mu_t)^2\right]=0,
\]

giving rise to:

\[
\widehat{\mu}_t = \tfrac{1}{2}(X_{1t}+X_{2t}),\quad t=1,2,\dots,n,
\qquad
\widehat{\sigma}^2 = \frac{1}{2n}\sum_{t=1}^{n}\left[(X_{1t}-\widehat{\mu}_t)^2+(X_{2t}-\widehat{\mu}_t)^2\right] = \frac{1}{n}\sum_{t=1}^{n} s_t^2,
\]

where \(s_t^2 = \tfrac{1}{2}\left[(X_{1t}-\widehat{\mu}_t)^2+(X_{2t}-\widehat{\mu}_t)^2\right]\).

Notice that for \(\ln L(\boldsymbol{\theta};\mathbf{x})\), \(\widehat{\boldsymbol{\theta}}_{MLE}:=(\widehat{\mu}_t,\widehat{\sigma}^2,\ t=1,2,\dots,n)\) is a maximum, since the second derivatives at \(\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}\) are:

\[
\left.\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \mu_t^2}\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}}
= \left.\left(-\frac{2}{\sigma^2}\right)\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}}
= -\frac{2}{\widehat{\sigma}^2}<0,
\]
\[
\left.\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \sigma^2\,\partial \mu_t}\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}}
= \left.-\frac{1}{\sigma^4}\left[(x_{1t}-\mu_t)+(x_{2t}-\mu_t)\right]\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}} = 0,
\]
\[
\left.\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial (\sigma^2)^2}\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}}
= \left.\frac{n}{\sigma^4}-\frac{1}{\sigma^6}\sum_{t=1}^{n}\left[(x_{1t}-\mu_t)^2+(x_{2t}-\mu_t)^2\right]\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}}
= -\frac{n}{\widehat{\sigma}^4}<0,
\]

and thus:

\[
\left.\left[\left(\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \mu_t^2}\right)\left(\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial (\sigma^2)^2}\right)-\left(\frac{\partial^2 \ln L(\boldsymbol{\theta};\mathbf{x})}{\partial \sigma^2\,\partial \mu_t}\right)^2\right]\right|_{\boldsymbol{\theta}=\widehat{\boldsymbol{\theta}}_{MLE}} > 0.
\]

The commonly used argument against the ML method is that since:

\[
E(\widehat{\mu}_t)=\mu_t \quad \text{and} \quad E(s_t^2)=\tfrac{1}{2}\sigma^2,
\]

it follows that the 'MLE' \(\widehat{\sigma}^2\) is both biased and inconsistent, because:

\[
E(\widehat{\sigma}^2)=\frac{1}{n}\sum_{t=1}^{n}E(s_t^2)=\tfrac{1}{2}\sigma^2,
\qquad
\widehat{\sigma}^2 \xrightarrow{a.s.} \tfrac{1}{2}\sigma^2,
\]

since the bias \(E(\widehat{\sigma}^2)-\sigma^2=-\tfrac{1}{2}\sigma^2\) does not go to 0 as \(n\to\infty\).

A moment's reflection reveals that the inconsistency argument is ill-thought-out. This is because the incidental parameter problem renders \(\widehat{\mu}_t=\tfrac{1}{2}(X_{1t}+X_{2t})\) an inconsistent estimator of \(\mu_t\), for \(t=1,2,\dots,n\); there are only two observations for each \(\mu_t\), and thus their variance \(\mathrm{Var}(\widehat{\mu}_t)=\tfrac{1}{2}\sigma^2\) does not go to zero as \(n\to\infty\).
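Both facts are easy to verify numerically. The sketch below (again assuming the illustrative choices \(\mu_t = \sin t\) and \(\sigma = 1\), so that \(\sigma^2 = 1\)) computes \(\widehat{\mu}_t\), \(s_t^2\), and the 'MLE' \(\widehat{\sigma}^2\) for two sample sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0  # true value, so sigma^2 = 1

for n in (100, 10_000):
    mu = np.sin(np.arange(1, n + 1))              # assumed incidental means
    X = mu + rng.normal(0.0, sigma, size=(2, n))  # X_it = mu_t + eps_it
    mu_hat = X.mean(axis=0)                       # mu_hat_t = (X_1t + X_2t)/2
    s2 = ((X - mu_hat) ** 2).mean(axis=0)         # s_t^2, with E[s_t^2] = sigma^2/2
    sigma2_hat = s2.mean()                        # the 'MLE' of sigma^2
    # sigma2_hat hovers near 0.5 = sigma^2/2 rather than 1.0, however
    # large n is, and the spread of mu_hat_t around mu_t also stays
    # near sigma^2/2 instead of shrinking:
    print(n, round(sigma2_hat, 3), round(np.var(mu_hat - mu), 3))
```

On a typical run both reported quantities stay near 0.5 for `n = 100` and `n = 10_000` alike: more periods add more incidental parameters, not more information per parameter.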
What the critics of the ML method do not appreciate enough is the fact that treating the unknown parameters \((\mu_1,\mu_2,\dots,\mu_n)\) as incidental and designating \(\sigma^2\) the only parameter of interest does not let the statistician 'off the hook'. This is because the parameter of interest \(\sigma^2=E(X_{it}-\mu_t)^2\), defining the variation around \(\mu_t\), invokes the incidental parameters. Put more intuitively, when the data come in the form \(\mathbf{Z}_0:=(\mathbf{z}_1,\mathbf{z}_2,\dots,\mathbf{z}_n)\), \(\mathbf{z}_t:=(x_{1t},x_{2t})\), one can only get to \(\sigma^2\) via \(\mu_t\), and using \(\widehat{\mu}_t\) leads to problems because it is an inconsistent estimator. In that sense (Severini, 2000, p. 108):

"... this model falls outside of the general framework we are considering since the dimension of the parameter \((\mu_1,\mu_2,\dots,\mu_n,\sigma^2)\) depends on the sample size."

That is, calling \(\widehat{\sigma}^2=\frac{1}{2n}\sum_{t=1}^{n}\left[(X_{1t}-\widehat{\mu}_t)^2+(X_{2t}-\widehat{\mu}_t)^2\right]\) a MLE is highly misleading, since the ML method was never meant to be applied to statistical models whose number of unknown parameters increases with the sample size n.

In truth, one should be very skeptical of any method of estimation which yields consistent estimators in cases where the statistical model in question is ill-defined, as in the case of the incidental parameter problem. Hence, the more interesting question should be: why would the ML method yield a consistent estimator of \(\sigma^2\)? The fact that the ML method does not yield a consistent estimator of \(\sigma^2\) should count in its favor, not against it! To paraphrase Stigler's quotation: the ML method 'warns the modeler that the information in the data has been spread too thinly'.

4 Recasting the original Neyman-Scott model

The question that naturally arises is: can one respecify the above statistical model to render it well-defined while retaining the parameter of interest? The answer is surprisingly straightforward.
Since the incidental parameter problem arises because of the unknown but t-varying means \((\mu_1,\mu_2,\dots,\mu_n)\), one can re-specify the original bivariate model into a univariate simple Normal (one-parameter) model, using the transformation:

\[
Y_t = \tfrac{1}{\sqrt{2}}(X_{1t}-X_{2t}) \sim \mathrm{NIID}\left(0,\sigma^2\right), \quad t=1,2,\dots,n,\dots,
\]
\[
E(Y_t)=\tfrac{1}{\sqrt{2}}E(X_{1t}-X_{2t})=\tfrac{1}{\sqrt{2}}(\mu_t-\mu_t)=0,
\qquad
\mathrm{Var}(Y_t)=\tfrac{1}{2}\left[\mathrm{Var}(X_{1t})+\mathrm{Var}(X_{2t})\right]=\sigma^2.
\]

This is a sensible thing to do because taking the difference eliminates the nuisance parameters \((\mu_1,\mu_2,\dots,\mu_n)\) without affecting the parameter of interest. For this simple Normal model, the MLE for \(\sigma^2\) is:

\[
\widehat{\sigma}^2_{MLE}=\frac{1}{n}\sum_{t=1}^{n} Y_t^2,
\]

which is unbiased, fully efficient and strongly consistent:

\[
E(\widehat{\sigma}^2_{MLE})=\sigma^2,
\qquad
\mathrm{Var}(\widehat{\sigma}^2_{MLE})=\frac{2\sigma^4}{n},
\qquad
\widehat{\sigma}^2_{MLE} \xrightarrow{a.s.} \sigma^2.
\]

Notes: (i) The above recasting of the Neyman-Scott model can be easily extended to the case \((X_{1t},X_{2t},\dots,X_{mt})\), \(2\leq m<n\). (ii) Hald (2007), pp. 182-183, offers an alternative, highly original way to sidestep the incidental parameter problem using Fisher's two-stage ML method.

5 Conclusion

The main conclusion from the above discussion is that when the ML method gives rise to inconsistent estimators, the modeler should take a closer look at the assumed statistical model; chances are, it is ill-defined. This is particularly true in the case where the assumed model suffers from the incidental parameter problem. In such cases the way forward is to recast the original model to render it well-defined and then apply the ML method. This argument is illustrated above using the Neyman-Scott (1948) model.

References

[1] Cox, D. R. (2006), Principles of Statistical Inference, Cambridge University Press, Cambridge.
[2] Cramer, H. (1946), Mathematical Methods of Statistics, Princeton University Press, Princeton, NJ.
[3] Fisher, R. A. (1922), "On the mathematical foundations of theoretical statistics," Philosophical Transactions of the Royal Society A, 222: 309-368.
[4] Hald, A. (2007), A History of Parametric Statistical Inference from Bernoulli to Fisher, 1713-1935, Springer, NY.
[5] Le Cam, L. (1953), "On some asymptotic properties of maximum likelihood estimates and related Bayesian estimates," University of California Publications in Statistics, 1: 277-330.
[6] Neyman, J. and E. L. Scott (1948), "Consistent estimates based on partially consistent observations," Econometrica, 16: 1-32.
[7] Severini, T. A. (2000), Likelihood Methods in Statistics, Oxford University Press, Oxford.
[8] Stigler, S. M. (2007), "The Epic Story of Maximum Likelihood," Statistical Science, 22: 598-620.
[9] Wald, A. (1949), "Note on the consistency of the maximum likelihood estimate," Annals of Mathematical Statistics, 20: 595-601.
