ebook img

ERIC EJ1049630: Evaluating Specification Tests in the Context of Value-Added Estimation PDF

2015·0.14 MB·English
by  ERIC
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview ERIC EJ1049630: Evaluating Specification Tests in the Context of Value-Added Estimation

JournalofResearchonEducationalEffectiveness,8:35–59,2015 Copyright©Taylor&FrancisGroup,LLC ISSN:1934-5747print/1934-5739online DOI:10.1080/19345747.2014.981905 Evaluating Specification Tests in the Context of Value-Added Estimation CassandraM.Guarino IndianaUniversity,Bloomington,Indiana,USA MarkD.Reckase,BrianW.Stacy,andJeffreyM.Wooldridge MichiganStateUniversity,EastLansing,Michigan,USA Abstract: We study the properties of two specification tests that have been applied to a variety of estimators in the context of value-added measures (VAMs) of teacher and school quality: the Hausmantestforchoosingbetweenstudent-levelrandomandfixedeffects,andatestforfeedback (sometimes called a “falsification test”). We discuss theoretical properties of the tests to serve as background, and propose parsimonious one-degree-of-freedom versions. An extensive simulation study provides important further insight into the VAM setting. Unfortunately, although both the Hausmanandfeedbacktestshavegoodpowerfordetectingthekindsofnonrandomassignmentthat caninvalidateVAMestimates,theyalsorejectinsituationswhereestimatedVAMsperformverywell intermsofrankingteachers.Consequently,thetestsmustbeusedwithcautionwhenstudenttracking isusedtoformclassrooms. Keywords: Value-added,teacherquality,specificationtests INTRODUCTION Measuresofteacherandschoolqualitybasedonvalue-addedmodels(orVAMs)ofstudent achievementaregainingincreasingacceptanceamongpolicymakersasatoolforevaluating teaching and school effectiveness. Therefore, it is important for researchers and policy makers to understand the statistical properties of the estimates derived from VAMs, and to have some knowledge of when they can be expected to perform well—and when they do not perform well. One way to proceed is to apply statistical tests of the assumptions underlying VAMs to see if they appear justified. Rothstein (2010) and Harris, Sass, and Semykina (2014) are two examples of studies that develop and apply statistical tests of assumptionstoVAMsdesignedtoproducemeasuresofteachereffectiveness. InapplyingstatisticaldiagnosticsinVAMsettings,itisimperativetobeclearaboutthe goaloftheanalysis.Isittodeterminewhetheralloftheassumptionsunderlyingconsistent estimationoftheparametersinastructuralproductionfunctionhold?Oristhemaingoal to get reasonably accurate estimates of teacher value added? In the research literature and in policy applications, the main focus appears to be on getting reliable estimates of value added. Structural models are often used to motivate estimation procedures, but the performanceofthevalue-addedestimatesisofprimaryinterest.Ifweassumethatproviding relativelyaccurateperformanceestimatesistheprimarypurposeoftheVAMliterature,then AddresscorrespondencetoCassandraM.Guarino,SchoolofEducation,EducationalLeadership andPolicyStudies,IndianaUniversity,107S.IndianaAvenue,Bloomington,IN47405-7000,USA. E-mail:[email protected] 36 C.M.Guarinoetal. it is critical to understand the difference between rejecting assumptions of an underlying structuralmodelandconcludingthataparticularprocedurelikelyproducespoorestimates ofvalueadded. In earlier work (Guarino, Reckase, & Wooldridge, 2015), we provide a summary of the known theoretical properties of various approaches to estimating VAMs designed to yield teacher performance estimates. More important, we provide extensive simulation evidenceshowinghowsixofthemostcommonlyusedestimatorsbehaveunderdifferent mechanisms used to match teachers and students. One of the key findings is that certain estimatorsthatarenottechnicallyconsistentcanperformwellinestimatingteachereffects forthepurposeofranking.Oneoftheestimators—ordinaryleastsquares(OLS)appliedto adynamictest-scoreequation,whichwedubbed“dynamicOLS,”or“DOLS”1—performs best across many scenarios, although other estimators are slightly better under certain assignmentmechanisms. Given the several choices among estimators in the VAM context it would be helpful tohavetoolsforchoosingamongdifferentestimators,especiallywhentheyproducevery differentestimatedeffects.Morefundamentally,canwedeterminewhetheranyestimator doesagoodenoughjobofestimatingvalueaddedtousetheestimatesforpolicypurposes, such as rewarding or sanctioning teachers based on estimated performance? It is natural toturntoexistingstatisticalteststohelpdiagnosewhetherornotunderlyingassumptions are met. Several general tests have been developed in the panel data literature, and some variantshavebeenproposedbyeducationresearchersfortheexpresspurposeofevaluating VAMs(e.g.,thoseinRothstein[2010]andHarris,Sass,andSemykina[2014]). The main purpose of this article is to determine the efficacy of specification tests in determininghowwellaVAMaccomplishesitstask,whichwetaketobeestimatingteacher effectsprimarilyforrankingpurposes.Todoso,weusesimulationsinwhichweoriginate student test score data based on known teacher effects but then act as if we do not know the true effects and estimate them. We then assess both the degree towhich a model and estimationapproachyieldaccurateteachereffectestimatesandthebehaviorofstatistical testsofunderlyingassumptions.Ourgoalistodeterminetheusefulnessofstatisticaltests inrevealingthequalityofspecificmodelsforestimatingteachereffects. Generally,wefocusontwokindsofteststhataredesignedtodetectnonrandomteacher assignment, that is, where teachers are assigned at least partly on the basis of observed or unobserved student characteristics. The first test—a robust version of the Hausman (1978)testcomparingthestudent-levelrandomandfixed-effectsestimators—isprimarily intended to uncover situations where teacher assignment is based on unobserved, time- constantstudentheterogeneity.Thetesthasthepowertodetectotherkindsofnonrandom assignmentmechanismsbutitsmainpurposeistodeterminewhetherteacherassignment iscorrelatedwithunobservedstudentheterogeneity. Thesecondkindoftestisactuallyaclassoftestswhosepurposeistodetectdynamic teacherassignmentmechanismsthatmightcausebiasinVAMestimates.Suchtestswere popularizedintheVAMcontextbyRothstein(2010),whocalledthem“falsificationtests.” Generally, a falsification test tries to determine whether future teacher assignments are predictive of pass scores or gain scores. Rothstein’s falsification tests are closely related to tests of strict exogeneity in the panel data literature. In the VAM context, a test of strict exogeneity looks for feedback from shocks to student performance today to future teacher assignment, typically allowing for student heterogeneity that is accounted for by 1DOLS essentially regresses the end-of-year score on prior test scores, teacher indicators, and studentcharacteristics. EvaluatingSpecificationTests 37 student-levelfixed-effects(FE)estimation.Rothsteinproposesfalsificationtestsincontexts withandwithoutunobservedstudentheterogeneity. Ideally, we could propose a strategy for applying diagnostic tests that would reject various estimation methods when they produce poor value-added measures and leave us with estimators that perform well. Unfortunately, our findings are not very positive from a practitioner’s perspective. Although in many cases the tests properly reject when the estimation method produces poor value-added estimates, in other cases the tests strongly rejectwhentheunderlyingestimationmethodproducesverygoodestimatesoftheVAMs, atleastforthepurposeofrankingteachers.Theparticularsituationthatcausesproblems for both the Hausman and feedback tests is when students are tracked based on some observable orunobservable factor,buttheclassroomsarerandomly assignedtoteachers. In such cases a variety of estimation methods produce reliable VAMs, depending on the natureofthetracking.Yet,asweshowinthesectionSimulationResults,thespecification testsstronglyrejectmanyofthebestestimators. AnimportantconsequenceofourfindingsisthatcriticismsofVAMsonthebasisof evidenceprovidedbyHausmanorfeedbacktestsarelikelytobeunjustified.Thebottom line is that, applied in the VAM context, the tests have the power to detect nonrandom assignment schemes that have nothing to do with whether popular estimators are doing theirmainjob:providinggoodVAMestimates. Otherauthorshaverecentlyanalyzedtheperformanceofdiagnostictestsinthecontext ofVAMestimation,althoughweseemtobethefirsttoevaluatetheHausmantestforcom- paringtheREandFEestimators.Kinsler(2012)studiesaparticularversionofRothstein’s (2010) falsification test in the presence of student heterogeneity based on Chamberlain’s (1984)correlatedrandomeffects(CRE)framework.2AsdiscussedbyKinsler(2012),Roth- stein’s falsification test is a test of the null hypothesis that teacher assignment is strictly exogenous with respect to time-varying student unobservables. In this article we study therobustregression-basedtestsuggestedinWooldridge(2010,Section10.7),whichcan be computed from simple fixed effects estimation (at the student level). For gain-score equationswithstudentheterogeneity,ourapproachhasseveraladvantagesoverRothstein’s (2010) approach. First, the regression-based test is computationally (and conceptually) much simpler while still robust to student-level serial correlation and heteroskedasticity of unknown form.Second, the regression-based testcan beapplied tounbalanced panels without change; the CRE approach is not well suited for unbalanced panels. Third, it is straightforwardtoobtainaone-degree-of-freedomregression-basedtest,therebyconserv- ingdegreesoffreedomandpossiblyimprovingfinite-samplestatisticalproperties. Goldhaber and Chaplin (2015) independently provide an evaluation of Rothstein’s regression-basedfalsificationtestinsettingsthatdonotallowforstudent-levelheterogene- ity. Goldhaber and Chaplin first determine whether Rothstein’s statistic properly detects omittedvariablebias.Theyconcludethatitispossibletohavedata-generatingmechanisms and estimators that produce unbiased VAMs but where the Rothstein test will reject the specification.Wecometoasimilarconclusion,butourmainfocusisonthereliabilityof VAMs for ranking teachers and not on bias per se.3 In addition, we use a version of the 2Kinsler (2012) shows using a simulation that Rothstein’s chi-square test based on the CRE approachperformspoorlywhenthenumberofstudentobservationsperteacherissmall. 3Guarinoetal.(2015)showthatsystematicbiasinVAMsneednotcauseproblemsforranking teachers. A constant additive bias clearly has no effect on rankings. In some cases, the bias is in theformofamplifyingtheestimatedeffects,andthisactuallyhelpswithranking,eventhoughthe magnitudesoftheestimatedVAMscannotbetrusted. 38 C.M.Guarinoetal. feedbacktestthatissuggestedbythepaneldataliterature,andwealsoproposeaconvenient one-degree-of-freedom version of the test. Further, the tracking mechanisms we use are differentfromthoseinGoldhaberandChaplin,whoexplicitlyintroducenonlinearitiesinto their tracking mechanisms. The nonlinearities in our tracking devices are more subtle in thattheyaregeneratedbyfixedclasssizesandassignmentprobabilitiesthatarenotlinear functionsofpastperformanceorunobservedheterogeneity. The rest of the article is organized as follows. We discuss the value-added modeling frameworkinthenextsection.Followingthat,ADiscussionoftheTestsintheVAMSetting describesthestatisticalteststhatwestudy—boththosethathavebeenappliedtoVAMsby other researchers and some that have not—and discusses their theoretical properties. We discussthekindsofnonrandomgroupingandassignmentmechanismsthatseemparticularly relevantinthesectionStudentGrouping,TeacherAssignment,andBehavioroftheTests. Inthelastthreesectionswediscussoursimulationdesignandthesimulationresults,and thenprovidesomeconcludingremarks. CONCEPTUALFRAMEWORKFORTESTINGVALUE-ADDEDMODELS Itishelpful tobegin withafairlygeneral value-added equation andabriefdiscussionof theassumptionsembeddedinit.Assumethattheachievementscore,Ait,isgeneratedas Ait =λAi,t−1+Eitβ0+ci +uit −λui,t−1 (1) uit =ρui,t−1+rit,t =1,2,...,T, (2) whereAit isameasureofachievementforstudentiingrade(oryear)t andEit isthe(row) vectorofeducationalinputswhosecoefficients,β0,areofgreatestinterest.Generally,Eit canincludeinputsattheschool,classroom,orevenindividuallevel.Inthepresentarticle, Eit isavectorofteacherassignmentindicators.Thevariableciisunobserved,student-level heterogeneity.Weassume{rit}isasequenceofindependent,identicallydistributedrandom variableswithmeanzerosothat{uit}followsanAR(1)model. AsdiscussedinToddandWolpin(2003)andGuarinoetal.(2015),equation(1)can bederivedfromageneralcumulativeeffectsmodel(CEM)undercertainfairlyrestrictive assumptions,and(2)addstheassumptionthattheerrors{uit}haveaparticularpatternof serialcorrelation.TheparameterλisthedecayparameterintheCEM.Eventhisrestricted version of the CEM is never estimated, because accounting for the combined issues of heterogeneity, ρ differing from λ, and the lagged dependent variable is a challenging econometricproblem. SometimesAi,t−1issubtractedfrombothsidesof(1)toobtainanequationforthegain score,(cid:5)Ait: (cid:5)Ait =τt +αAi,t−1+Eitβ0+ci +uit −λui,t−1, (3) whereα =λ−1.Themodelwithnodecayinlearningisgivenbyα =0.Althoughthis modelalleviatessomeendogeneityconcernsregardingthepresenceofalaggeddependent variable,the“nodecay”assumptionisquiterestrictive. There are several types of misspecifications that can cause difficulty for standard estimatorsofβ .Oneisfailureoftheso-called“commonfactorrestriction”(CFR),λ=ρ. 0 Under the CFR the errors uit −λui,t−1in (1) have no serial correlation, but if λ(cid:2)=ρ EvaluatingSpecificationTests 39 then (1) contains serial correlation. In general, serial correlation in the presence of a laggeddependentvariablecausesinconsistentestimationformanyestimationprocedures, includingOLSappliedto(1)(whereweignoreboththepresenceofciandserialcorrelation inuit −λui,t−1).TheArellanoandBond(1991)instrumentalvariablesprocedure,which removesci,reliesonnoserialcorrelationintheerrorsin(1). McClainandWooldridge(1995)proposeasimpletestofthenullhypothesisthatthe CFRholdsinthecontextoftimeseriesregressionmodels.Thetestiseasilyadaptedtothe paneldatacasewhenthereisnoheterogeneity,thatis,whenci isnotin(1).Unfortunately, it is not clear how to extend the test to allow for heterogeneity. Any neglected serial correlation, due to violation of the CFR, higher-order autoregressive properties, or the presence of ci will cause a rejection of the CFR restriction. However, in the sensitivity analysis in Guarino et al. (2015) we found that violation of the CFR did not appreciably affect the dynamic OLS estimator in the sense that DOLS still provided estimates of the teacher effects that produced reliable rankings among teachers. For these reasons we do notstudytheCFRtestfurtherinthisarticle. A second kind of misspecification arises in setting λ in equation (1) equal to unity (whichisthesameasdroppingAi,t−1 inequation[3]).Includingthelaggedachievement inequation(3)isasimple,effectivewaytodetectdynamicmisspecification.Harrisetal. (2014) apply tests for dynamic misspecification by including lagged teacher assignment usingdatafromFlorida.Wedonotstudythepropertiesofdynamicmisspecificationtestsin thecurrentarticlebecausetheyarestandardtestsofomittedvariables(andareconfounded bythepresenceofci,inanycase)4.Rather,weareinterestedinthebehaviorofexogeneity testswhenλisincorrectlysettounity. Finally—and most importantly for this article—we are interested in what happens when teachers were assigned to students in such a way to make Eit endogenous in an estimating equation. In this third kind of “misspecification” it is not necessarily true that (1) is an incorrect equation; it is simply that inputs have been chosen in a way to violate certainexogeneityrequirements,resultingininconsistentestimators. ADISCUSSIONOFTHETESTSINTHEVAMSETTING Thegeneralpurposeofthespecificationtestswestudyistodetectnonrandomassignment ofstudentstoteachers.Weconsidertestsofbothstaticanddynamicassignment.Oneform ofstaticassignmentoccurswhenteachersare(partly)assignedonthebasisofunobserved studentheterogeneity,thatis,studentswithfixedbutunobservedcharacteristicsarematched withparticularteachereffectivenesslevels.Anotherformofstaticassignmentisbasedon aninitial,orbase,testscoreinanearlygrade.Dynamicassignmentoccurswhentheprior testscoresofstudentsarematchedtoparticularteachereffectivenesslevels. Inwhatfollowsitisimportanttodistinguishbetweentwomechanismsthatcanbeused forgeneratingclassroomsofstudentstaughtbyparticularteachers.Studentsmaybefirst groupedonthebasisofunobservedorobservedcharacteristics,aprocessoftenreferredto as“tracking.”Thiskindofgroupingmightbedoneevenifteachersarerandomlyassignedto classrooms.Nonrandomteacherassignmentoccurswhenclassroomswithdifferentaverage 4Inthecurrenttestingsetting,omittingAi,t−1 from(1)andfailureoftheCFRrestrictioncanbe expectedtohavesimilarconsequencesbecause,ineffect,avariablethatcanpredictthegainscoreis omittedfromtheequation,andteacherassignmentmightbecorrelatedwiththatvariable. 40 C.M.Guarinoetal. levelsofabilityorachievementaresystematicallyassignedtoteacherswithdifferentlevels ofcompetence. AlthoughlittleresearchexistsonevaluatingspecificationtestswithintheVAMsetting, certaintestsarewellknown inthepaneldataliterature.Forexample, Wooldridge (2010, Section 10.7.3) discusses different versions of the Hausman test used to compare the RE and FE estimators. In the VAM context, the Hausman test is primarily a test of static assignmentmechanismsbecauseitisintendedmainlytopickupanycorrelationbetween studentheterogeneityandtheobservedinputs,inthiscase,teacherassignment.Asdiscussed in Wooldridge (2010), it is generally important to use a version of the Hausman test that isrobusttoviolationsofassumptionsthatarenotrequiredforconsistentlyestimatingthe parameters,inthiscase,theteachervalue-addedmeasures.Inparticular,wepreferteststhat arerobusttoserialcorrelationandheteroskedasticityinthestudents’idiosyncraticshocks. Aregression-basedtest,whichwereviewinthesectionTheHausmanTestComparingRE (orPOLS)toFE,providesastraightforwardmethodforobtainingafullyrobustHausman test. To detect dynamic forms of teacher assignment, Rothstein (2010) uses a regression- basedfalsificationtest,whichisinthesamespiritastestsforstrictexogeneityinthepanel dataliterature.Inparticular,Rothsteinincludesfutureteacherassignmentsinacurrentgain- scoreequationestimatedbyOLS.Importantly,withthesimpleregression-basedversionof thetest,Rothsteindoesnotallowforstudentfixedeffects.Withoutfixedeffects,standard estimators,suchasOLS,donotrequirestrictexogeneityforconsistentestimation.Thus,it isnotclearwhyonewantstotestforthepresenceofdynamicassignmentinsuchcases.By contrast,becausefailureofstrictexogeneitydoesresultininconsistencywhenstudent-level fixedeffectsareincluded,Wooldridge(2010,Section10.7.1)showshowfeedbackeffects areeasilytestedinthecontextoffixedeffectsestimation.Themoststraightforwardwayto testforfeedbackeffectsistoincludefuturevaluesoftheexplanatoryvariables—usuallyone periodahead—andtesttheirsignificanceusingarobustWaldtestafterFEestimation.We discussthisvariantofRothstein’sfalsificationtestinthesectionTestingStrictExogeneity inEquationswithStudentFixedEffects. In addition to Rothstein (2010), some other recent empirical papers have applied falsification tests in the VAM context. Koedel and Betts (2009) applied falsification tests in the context of VAM estimation using data from the San Diego school district. Harris etal.(2014)applyabatteryoftests(severalofwhichwedonotstudyhere)inanempirical context. In their application to data from Florida, they generally find evidence against random assignment of students to teachers and find estimated VAMs that vary widely acrossprocedures. ABasicGain-ScoreEquationandExogeneityAssumptions Indescribingtestsforendogeneityofteacherassignmentwestartwithagain-scoreequation because both the Hausman test and the falsification test (test of strict exogeneity) can be reasonablyapplied.Astandardgain-scoreequationis (cid:5)Ait =t +Eitβ0+Xitγ0+ci +eit,t =1,...,T, (4) which maintains the “no decay” assumption but where the errors, {eit}, may be serially correlated because (4) no longer includes a lagged dependent variable. The vector Xit includes controls, many of which may be constant, that are often included in empirical EvaluatingSpecificationTests 41 VAMstudies,suchasgender,race/ethnicity,disabilitystatus,andfree-and-reduced-lunch eligibility.Inthisarticlewedonotincludeextracontrolsinoursimulationstudy,butfora generaldiscussionofhowtoapplytests,itisusefultoexplicitlyincludeXit. The constants τt allow for different intercepts for different grades (or, with many cohorts,allowsforcohorteffects).Becausewehavefewgrades(timeperiods),thesecan beestimatedpreciselywithalargenumberofstudents.Theci arethetime-constantstudent unobservedeffects,sometimescalled“studentheterogeneity.”Thepresenceofcicausesthe compositeerror,vit =ci +eit,tobeseriallycorrelated.Moreimportant,ifci iscorrelated withtheinputsEit,leavingci intheerrortermcancauseinconsistencyinestimatingβ0. The idiosyncratic errors, {eit}, are time-varying unobserved factors that affect gain scores.Generally,thesecanbeseriallycorrelatedorheteroskedastic,orboth.Inthecontext ofanunderlyingcumulativeeffectsmodel,eitisalinearcombinationoftheerrorsappearing in the structural production function; see, for example, Guarino et al. (2015) for a more extensivediscussionofthestructuralmodel. An important assumption required of the most common panel data estimators that recognizethepresenceofci—REandFE—isstrictexogeneityoftheinputsconditionalon thestudentheterogeneity,namely, (cid:2) (cid:3) E eit|EiT,Ei,T−1,...,Ei1,XiT,Xi,T−1,...,Xi1,ci =0,t =1,...,T. (5) Note that in (5) the expectation of the error term, conditional on heterogeneity and all current,past,andfutureinputs,iszero. To interpret the strict exogeneity assumption, drop the {Xis :s =1,...,T} for sim- plicity.Then,whenwecombine(5)withequation(1),wehave E((cid:5)Ait|EiT,...,Ei1,ci)=t +Eitβ0+ci. (6) Equation(6)meansthat,oncewecontrolforstudentheterogeneity,onlyinputsattimet, Eit,appearinthegain-scoreequationattimet.Thisrestrictionimpliesthatpastinputs—in thiscase,previousteachers—havenoeffectonthecurrentgainscore,oncecurrentteacher andstudentheterogeneityhavebeenaccountedfor.AsdiscussedinHarrisetal.(2014),itis simpletotestsuchanassumption:simplyinclude,say,Ei,t−1andtestforjointsignificance (using whatever estimation method one settles on). Of course, including lagged inputs costsusayearofdata,butsuchteststypicallycanbecarriedoutforthekindsofdatasets available. Forourpurposesassumption(5)hasanotherimportantimplication:futurevalues,such asEi,t+1,donotappear ontheright-hand sideof(6).Generally, ifteacher assignmentat timet +1dependson(cid:5)Ait orAit,Ei,t+1willbe(partially)correlatedwitheit.Shortly,we usethisobservationtoobtainatestof(5). In addition to strict exogeneity of the inputs conditional on ci, another important assumptioninthepaneldataliteratureis (cid:2) (cid:3) E ci|EiT,Ei,T−1,...,Ei1 =E(ci)=0, (7) wherewehaveagaindroppedthe{Xis :s =1,...,T}.TheassumptionthatE(ci)=0is without loss of generality when the gain-score equation has an intercept (or a full set of time intercepts). When we combine assumptions (6) and (7), the inputs {Eit} are strictly 42 C.M.Guarinoetal. exogenouswithrespecttothecompositeerror: E(vit|EiT,Ei,T−1,...,Ei1)=0,t =1,...,T. (8) Assumption(8)isimportant,becauseitjustifiesgeneralizedleastsquaresestimation(GLS), includingthepopularREestimator,appliedto (cid:5)Ait =τt +Eitβ0+vit,t =1,...,T. (9) AspecialcaseofGLSispooledOLS(POLS),whereanyserialcorrelationinvit duetothe presence of ci—in fact, any serial correlation—is ignored. Inference is handled by using arobustvariancematrixestimator(whichisalsorobusttoheteroskedasticityofarbitrary form). From (9) it is easily seen that consistency of POLS only requires that vit and Eit areuncorrelated;thestrictexogeneityassumptionin(8)isnotneeded.However,because vit includes ci, POLS requires that the inputs are uncorrelated with the student-specific heterogeneity. TheHausmanTestComparingRE(orPOLS)toFE In empirical panel data applications, including VAM estimation, one could estimate the gain-scoreequation(4)bybothREandFE.Bothestimatorsrequirethestrictexogeneity assumptionstatedin(5)forconsistency.Inaddition,REusestheheterogeneityexogeneity conditionin(7).Therefore,itiscommontocomparetheREandFEestimatesasatestof(7). However,itisimportanttorememberthatREandFE—and,forthatmatter,POLS—will typicallyhavedifferentprobabilitylimitsifthestrictexogeneityassumption(1)isviolated. Therefore,anytestthatexplicitlyorimplicitlycomparestheREandFEestimators(orthe POLSandFEestimators)generallyhaspoweragainstviolationof(5)or(7),andonecannot usetheoutcomeoftheHausmantesttoconcludewhichassumptionfails,orwhetherboth fail. ThetraditionalformoftheHausman(1978)statisticusesaquadraticformbasedonthe differencesbetweentheREandFEestimators.Acriticalpointinapplyingthetraditional form is that it assumes that the RE estimator is (asymptotically) efficient: the variance- covariancematrixappearinginthequadraticformisvalidonlywhenREisasymptotically efficient. As discussed by Wooldridge (2010, Section 10.7.3), the relative efficiency of the RE estimator holds only when the idiosyncratic errors {eit} are serially uncorrelated and homoscedastic, both conditional on the covariates and ci. (See Wooldridge, 2010, 10.7.3 for a formal statement of the assumptions.) Yet the Hausman test has no power fordetectingserialcorrelationorheteroskedasticityin{eit}becausetheseproblemsdonot causeinconsistencyineithertheREorFEestimator.InthelanguageofWooldridge(1990), thetraditionalformoftheHausmantestadds“auxiliaryassumptions,”whichareusedto getastandardnulldistributioneven thoughthetesthasnopower fordetectingfailureof theassumptions. ItispossiblebutcomputationallycumbersometomodifytheusualHausmanstatistic to be robust to arbitrary serial correlation and heteroskedasticity in {eit}. One problem is thatthevariance-covariancematrixissingularwhentheestimatedequationincludestime effects,whichisverycommon.Amuchmorestraightforwardapproachistousearobust, regression-basedtest. EvaluatingSpecificationTests 43 The regression-based Hausman test is based on the correlated random effects specification (cid:5)Ait =τt +Eitβ0+E¯iξ +ai +eit (10) (cid:4)T where E¯i =T−1 Eir is the time average and ci =E¯iξ +ai. In this formulation, we r=1 explicitlymodeltheheterogeneityci asalinearfunctionofthetimeaverageoftheinputs (which is where the name “correlated random effects” comes from). Equation (10) still contains unobserved heterogeneity, ai, but it is uncorrelated with the entire history of inputs, {Eit}. If we maintain strict exogeneity conditional on ci, then strict exogeneity holds conditional on ai in (10). Therefore, equation (10) can be estimated by POLS or randomeffects. A well-known algebraic result (e.g., Wooldridge, 2010, Section 10.7.3) is that when POLS or RE is applied to (10), the resulting estimate of β is the fixed effects estimator 0 thatusesdeviationsfromtimeaveragestoremoveci fromtheequation (cid:5)Ait =τt +Eitβ0+ci +eit (11) Therefore, equation (10) is very useful for presenting a unified setting for RE and FE estimation.Inparticular,if(10)isestimatedbyrandomeffects(i.e.,feasibleGLSusingthe REstructure),itisstraightforwardtoconstructarobustWaldtestofH :ξ =0,whichhas 0 asmanydegreesoffreedomasthereareinputsEit5.ObtainingaWaldtestthatisrobust toarbitraryserialcorrelationorheteroskedasticityin{eit},whileremainingasymptotically efficientunderthetraditionalREassumptions,isstraightforwardusingpopularpackages that support RE and FE estimation.6 The regression-based test is asymptotically equiva- lent (against local alternatives) tothe traditional Hausman testwhen the{eit} are serially uncorrelatedandhomoscedastic. ArejectionofH0 :ξ =0istypicallytakentomeanthatci iscorrelatedwithE¯i but,as mentionedearlier,thisinterpretationisbasedonmaintainingassumption(5).Ifwereject H0 :ξ =0 then we have found that E¯i is correlated with the composite error, ci +eit, whichwarrantsastatisticalrejectionoftheREestimator.However,aswewillseeinour simulations in the section on simulation results, in the context of estimating VAMs one mustbecautiousinusingtheHausmantestinthisway.ItcouldbethatREisstatistically rejectedbutprovidesbetterestimatesoftheVAMcoefficientsthanitsnaturalalternative, FE.EventhoughtheREestimatesofVAMsmightbesystematicallybiased,theytypically have less sampling variation—sometimes much less—and the bias may be such that the estimatedVAMsdoagoodjobofrankingteachers.Wewillhavemoretosayonthisinthe sectiononsimulationresults. Apracticalproblemwithusingequation(10)asthebasisfortheHausmantestisthat, withmanyteachers,(10)containsmanyregressors:theoriginalteacherdummiesandthen theproportionoftimesthestudentseesthatteacheroverastudent’sentireobservedhistory 5ThePOLSandREestimatesofβ areequaltotheFEestimator.Further,withabalancedpanel 0 thePOLSandREestimatesofξ arethesame.(Theygenerallydifferwithanunbalancedpanel,in whichcaseREwillbemoreefficientunderthestandardREassumptions.)WhetherPOLSorREis used,thetestshouldbemadefullyrobusttoserialcorrelationandheteroskedasticityin{eit}. 6InStata,afullyrobustWaldtestiseasilyobtainedusingthe“cluster”option,whichishowwe carryoutthetestinoursimulations. 44 C.M.Guarinoetal. (i.e., the time average). Computationally, many regressors are not too difficult to handle with modern computers and statistical packages. A more pressing concern is potential finite-sampledistortionsinusinglarge-samplecriticalvalues(whichiswhattheHausman approach necessarily uses). The proper asymptotics rely on the number of students per teacher getting “large.” In practice, we may not have many student outcomes associated with some teachers, in which case a one-degree-of-freedom test may have better size properties. Rather than including the entire vector E¯i and testing joint significance, which ne- cessitatesaseparatevariableforeachteacherinthedataset,weproposeanewtest.This consists of substituting, for each student, the estimated average teacher effect across all yearsforthevectorE¯i.Thisisidenticaltoestimatingtheequation (cid:2) (cid:3) (cid:5)Ait =τt +Eitβ0+α E¯iβˆ0 +errorit (12) andperformingattestofα =0.Theestimateβˆ istheREestimatefromtheequation(11) 0 obtained in a first stage. One could also apply pooled OLS to (11), but the error term in (11)containsthestudentheterogeneityterm,andunderstandardassumptionsREismore efficientthanPOLS.ForthesamereasonweapplyREtoequation(12),butweuseafully robusttstatisticforαˆ toallowforarbitraryserialcorrelationandheteroskedasticity. A test using (12) rather than (10) conserves on degrees of freedom, but it may not detect certain kinds of teacher assignment mechanisms. In our simulation we study the propertiesofbothtestsandfindthatournew“one-degree-of-freedomHausmantest”has substantialpoweragainstnonrandomassignmentalternatives. InmanyapplicationsofREestimationintheVAMcontext,otherexplanatoryvariables are included as controls. Often such controls are student characteristics, such as family background, socioeconomic status, or baseline test scores that do not vary over time. (Test scores lagged one or more period are not allowed in RE estimation because lagged dependentvariablesalwaysviolatethestrictexogeneityassumption.)Whenavailable,itis importanttoincludesuchcontrolsinequation(10),leadingtoanequationsuchas (cid:5)Ait =τt +Eitβ0+E¯iξ +Ziγ +ai +eit, (13) whereZi isthevectoroftime-constantcontrols.Withgoodcontrols,itismoreplausible that the (remaining) unobserved heterogeneity is uncorrelated with {Eit}. One can also includetime-varying,strictlyexogenouscontrols,say{Xit},andthen(10)becomes (cid:5)Ait =τt +Eitβ0+E¯iξ +Ziγ +Xitη+X¯iλ+ai +eit, (14) wherewealsoincludethetimeaveragesof{Xit}.Totestwhethertheinputsarepartially correlated with heterogeneity we would still test H :ξ =0; failing to reject means we 0 candropE¯i from(11)andestimatetheequationbyRE,typicallyobtainingamoreprecise estimatorofβ .7 Thetestdescribedinequation(12)canbealsoappliedwhenadditional 0 covariatesareincludedinthemodel.Inoursimulations,weonlyconsideranequationwith teacherdummiesandnootherinputs. 7Guggenberger(2010)warnsoftheproblemsofusingtheHausmantestasapretestforchoosing betweenREandFE.Theregression-basedversionoftheHausmantestmakesitclearthattheHausman pretestingproblemisessentiallythesameastheproblemofpretestingwhetherasetofregressors belongsinanequationandthenusinganForWaldtesttodeterminewhetherthoseregressorsappear inthefinalmodel.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.