ebook img

Local average causal effects and superefficiency PDF

0.04 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Local average causal effects and superefficiency

Local average causal effects and superefficiency Peter M. Aronow ∗ 6 February 8, 2016 1 0 2 b e Abstract F Recentapproachesincausalinferencehaveproposedestimatingaveragecausaleffectsthat 5 are local to some subpopulation, often for reasons of efficiency. These inferential targets are ] sometimesdata-adaptive,inthattheyaredependentontheempiricaldistributionofthedata. In T thisshortnote,weshowthatifresearchersarewillingtoadapttheinferentialtargetonthebasis S . ofefficiency, then extraordinary gains inprecision can beobtained. Specifically, when causal h effects are heterogeneous, any asymptotically normal and root-n consistent estimator of the t a populationaveragecausaleffectissuperefficientforadata-adaptivelocalaveragecausaleffect. m Ourresultillustratesthefundamentalgaininstatisticalcertaintyaffordedbyindifferenceabout [ theinferential target. 2 v 3 1 Introduction 1 4 1 Whencausaleffectsareheterogeneous,theninferencesdependonthepopulationforwhichcausal 0 . effects are estimated. Although population average causal effects have traditionally been the 1 0 inferential targets, recent results have focused on estimating average causal effects that are lo- 6 cal to some subpopulation for reasons of efficiency. These approaches include trimming ob- 1 : servations based on the distribution of the propensity score (Crump et al., 2009), using regres- v i sion adjustment to estimate reweighted causal effects (Angristand Pischke, 2009; Humphreys, X 2009), or implementing calipers for propensity-score matching (Rosenbaumand Rubin, 1985; r a Austin, 2011). In some cases, the target parameter is dependent on the empirical distribution of the data, including cases where the researcher is explicitly conducting inference on, e.g., the average treatment effect among the treated conditional on the observed covariate distribution (Abadieand Imbens, 2002), or other causal sample functionals (Aronow,Green, and Lee, 2014; Balzer, Petersen, and vanderLaan,2015), withoutrevisionto theestimatorbeingused. This approach, taken at full generality, implies a form of indifference to which population causal effects are measured for. This indifference can be codified in the form of a data-adaptive ∗Peter M. Aronowis AssistantProfessor,DepartmentsofPoliticalScience andBiostatistics, YaleUniversity,77 ProspectSt., New Haven, CT 06520(Email: [email protected]). TheauthorthanksDonGreen, CyrusSamii, JasSekhonandMarkvanderLaanforhelpfulcomments.Allremainingerrorsaretheauthor’sresponsibility. 1 target parameter (van derLaan, Hubbard,and Pajouh, 2015) that is allowed to vary with the data dependingonwhichsubpopulation’slocalaveragecausaleffectisbestestimated. Whentreatment effectsareheterogeneous,adaptivelychangingthetargetparameteronthebasisofefficiencyyields an unusual result: if the population average causal effect can be consistently estimated with a root-n consistent and asymptotically normal estimator qˆ, then the same estimator qˆ is always superefficient (i.e., faster than root-n consistent) for a data-adaptive local average causal effect. Furthermore, with an additional regularity condition on mean square convergence, we show that themeansquareerror ofqˆ foradata-adaptivelocalaveragecausal effect is ofo(n 1). − 2 Results Consider a full data probability distribution G with an associated causal effect distribution t with finite expectation E [t ], where E [.] denotes the expectation over the distributionG. We impose G G aregularityconditionon t establishingnon-degeneracyoft . Assumption 1 (Effect heterogeneity). min(sup(Supp(t )) E [t ],E [t ] inf(Supp(t )))=c> G G − − 0. Weobservean empiricaldistributionF . Supposewehaveanroot-nconsistentand asymptoti- n cally normalestimatoroftheaveragecausal effect E [t ],qˆ. G Definition1. Anestimatorqˆ isroot-nconsistentandasymptoticallynormalforq if√n(qˆ q )= 0 0 N (0,s 2)+o (1),forsome0<s 2 <¥ . − p We nowdefinethetarget parameter, q . Fn Definition 2. Let thetargetparameter qˆ : qˆ E [t ] c G | − |≤ q Fn = EEG[[tt ]]+cc :qˆ:qˆ−EE[Gt []t<]>cc , G G − − −  where, asin Assumption1, c=min(sup(Supp(t )) E [t ],E [t ] inf(Supp(t ))). G G − − The target parameter adapts naturally to the closest value in an interval surrounding E [t ], G wherethewidthoftheintervalisdefinedbythesupportoft . Weformalizehoweachq isalocal Fn averagetreatmenteffect. Proposition 1. There exists a nonnegative weighting associated with each empirical distribution F , w , suchthatacrossallF , q = EG[wFnt ]. n Fn n Fn EG[wFn] A proof of Proposition 1 follows directly from the fact that a weighted mean can obtain any value in the interval defined by the infimum and supremum of its distribution’s support. We now provethethesuperefficiency ofqˆ. 2 Proposition 2. Suppose that Assumption 1 holds. Then for any root-n consistent and asymptoti- callynormalestimatorofE [t ],qˆ, √n(qˆ q )=o (1). G − Fn p Proof. The author thanks Jas Sekhon for suggesting the following proof strategy. Decompose qˆ intoq˜ =N (E [t ],s 2/n)andu=o (n 1/2),sothatqˆ =q˜+u. Since(q˜ q )iso (a )forany G p − − Fn p n positivesequence(a ), therate ofconvergenceofqˆ isat worst governedbytheboundensured by n q˜ q u’s op(n−1/2) convergence. To prove the claim, note that for any positive e , Pr(cid:16)| −anFn| ≥e (cid:17) ≤ qPˆr(cid:0)q˜q −q=Fno6=(a0(cid:1))=+o2F ((n−1c/√2)n=/s o).(Snin1c/e2)l,imyine→ld¥in2gF t(h−ecre√sunl/ts. ) = 0, (q˜ −q Fn) is op(an). Thus − Fn p n p − p − When an additional regularity condition is imposed on the convergence of qˆ to normality, a strongerresult can beobtainedabouttherateofmean squareconvergence. Proposition3. Supposethatqˆ obeys √n(qˆ E [t ])=N (0,s 2)+e , whereE [e 2]=o(n 1/2). G G − Then E [(qˆ q )2]=o(n 1). − G − Fn − Proof. Wewillshowthatthemeansquareerrorof(q˜ q )convergestozerosufficientlyquickly, implying that the rate of convergence of qˆ is at wors−t goFnverned by the mean square error bound ensured by e ’s convergence rate. To obtain the rate of convergence of the mean square error of q˜, we integrate over its squared deviation from the target parameter. Within c of E [t ], the G squared deviation is zero, thus we need only integrate over the squared deviation over the tails of the normal distribution. To ease calculations, we obtain an upper bound by integrating over the squared deviationfrom E [t ],ratherthanfrom q : G Fn ¥ √n x2n EG[(q˜−q Fn)2]≤2Zc x2s √2p e−2s 2 c2n F c√n =cs 2 e−2s 2 +2s 2 (cid:16)−s (cid:17) rp √n n =o(n 1). − SinceE [(q˜ q )2]=o(n 1)andn 1/2E [e 2]=o(n 1),theCauchy-Schwarzinequalityensures thatE [G(qˆ −q F)n2]=o(n −1)+o(n −1)=oG(n 1). − G − Fn − − − 3 Discussion Our results highlight the additional certainty obtained by indifference about the population for which averagecausal effects are measured. It is well knownthat efficiency gainsmay beobtained through data-adaptiveinference. But the extent to which theresearcher benefits from indifference about thetarget parameterhas been understated. Undertreatment effect heterogeneity – a precon- dition for locality to be a concern – all root-n consistent and asymptotically normal estimators of the average treatment effect are superefficient for a local average treatment effect. And while we donotspeaktothesubstantiveimplicationsofindifferenceinscientificinquiry,weshowhowsuch indifferenceyieldsgreatlyincreased statisticalcertainty. 3 References Abadie,A.andImbens,G.2002.Simpleandbias-correctedmatchingestimatorsforaveragetreat- menteffects. NBERtechnical workingpaperno.283. Angrist, J.D. and Pischke, J.S. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton,NJ: PrincetonUniversityPress. Aronow, P.M., Green, D.P. and Lee, D.K.K. 2014. Sharp bounds on the variance in randomized experiments.AnnalsofStatistics.42(3)850–871. Austin, P.C., 2011. Optimal caliper widths for propensity-score matching when estimating differ- ences in means and differences in proportions in observational studies. Pharmaceutical Statis- tics,10(2),pp.150–161. Balzer, L.B., Petersen, M.L. and van der Laan, M.J. 2015. Targeted estimation and inference for thesampleaveragetreatment effect. bepress. Crump, R.K., Hotz, V.J., Imbens, G.W. and Mitnik, O.A. 2009. Dealing with limited overlap in estimationofaverage treatmenteffects. Biometrika. Humphreys, M., 2009. Bounds on least squares estimates of causal effects in the presence of het- erogeneousassignmentprobabilities.Manuscript,ColumbiaUniversity. Rosenbaum P.R. and Rubin D.B. 1985. Constructing a control group using multivariate matched samplingmethodsthat incorporatethepropensityscore. American Statistician39(1):33–38 van der Laan, M.J., Hubbard, A.E. and Pajouh, S.K. (2013). Statisticalinference for dataadaptive target parameters.bepress. 4

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.