Confidence intervals for means under constrained dependence

Peter M. Aronow (1, 2), Forrest W. Crawford (2), and José R. Zubizarreta (3)

1. Department of Political Science, Yale University
2. Department of Biostatistics, Yale School of Public Health
3. Division of Decision, Risk and Operations, and Department of Statistics, Columbia University

February 2, 2016

arXiv:1602.00359v1 [math.ST] 1 Feb 2016

Abstract

We develop a general framework for conducting inference on the mean of dependent random variables given constraints on their dependency graph. We establish the consistency of an oracle variance estimator of the mean when the dependency graph is known, along with an associated central limit theorem. We derive an integer linear program for finding an upper bound for the estimated variance when the graph is unknown, but topological and degree-based constraints are available. We develop alternative bounds, including a closed-form bound, under an additional homoskedasticity assumption. We establish a basis for Wald-type confidence intervals for the mean that are guaranteed to have asymptotically conservative coverage. We apply the approach to inference from a social network link-tracing study and provide statistical software implementing the approach.

Keywords: dependency graph, HIV prevalence, oracle estimator, variance estimate

1 Introduction

Researchers often encounter dependent data, where the exact nature of that dependence is unknown, and they wish to make inferences about outcome means. Current methods typically assume either independence of unit outcomes, or that the dependency structure is known or directly estimable (Liang and Zeger, 1986; Conley, 1999; White, 2014; Ogburn and VanderWeele, 2014; Cameron and Miller, 2015; Tabord-Meehan, 2015). In many cases, however, researchers may only have limited information about the nature of dependence between units, or perhaps only the number of other units on which a given unit's outcome depends.
For example, in studies of units embedded in a network, the degrees to which subjects are connected may be known, but the identities of the other subjects to whom they are connected may often remain unobserved (e.g., Crawford, 2016). The underlying relationships may be represented by a dependency graph (Baldi and Rinott, 1989), where vertices represent individual units and edges represent the possibility of probabilistic dependence. A dependency graph is not a generative graphical model for outcomes, such as a Markov random field. Rather, a dependency graph is a description of possible non-independence relationships between units.

In this paper, we develop a framework for constructing confidence intervals for the mean of dependent random variables, where their dependency graph is unknown or partially known but subject to topological constraints. Considering the class of Wald-type normal-approximation-based estimators given the sample mean, we seek an upper bound for the estimated variance of the sample mean using upper bounds for the degrees of each unit in the dependency graph and a local dependence assumption. We show that this optimization problem can be expressed as an integer linear program for the elements of the dependency graph adjacency matrix. We implement this approach in the new statistical software package depinf for R. The approach may be used even when no edges in the dependency graph are known. We also derive more computationally simple bounds, including a closed-form bound, when the random variables are assumed to be homoskedastic. We illustrate the utility of the method using data from a social link-tracing study of individuals at high risk for HIV infection in St. Petersburg, Russia.

2 Setting

Consider a simple undirected graph G = (V, E) with no parallel edges or self-loops. Let |V| = N. Associated with each vertex i ∈ V is a random variable X_i, and G characterizes probabilistic dependencies in the outcomes (e.g., Baldi and Rinott, 1989).

Definition 1 (Dependency graph).
G is a dependency graph if for all disjoint sets V_1, V_2 ⊂ V with no edge in E connecting a vertex in V_1 to a vertex in V_2, the set {X_i : i ∈ V_1} is independent from the set {X_j : j ∈ V_2}.

We emphasize that a dependency graph represents a set of possible non-independence relationships among units, not a graphical model that induces dependencies.

Suppose G is a dependency graph and we observe a subset V_S ⊆ V, where |V_S| = n. Label these observed vertices 1, ..., n, and label the unobserved vertices in V \ V_S arbitrarily by n+1, ..., N. For each i ∈ V_S, we observe the outcomes X_1, ..., X_n and the degrees d_i = |{j : {i,j} ∈ E}|.

Definition 2 (Induced subgraph). For a set of vertices V_S ⊆ V, the induced subgraph in G is G_S = (V_S, E_S), where E_S = {{i,j} : i ∈ V_S, j ∈ V_S, and {i,j} ∈ E}.

Let G_S = (V_S, E_S) be the induced subgraph of the observed vertices V_S. It follows that G_S is also a dependency graph. Let G_R = (V_S, E_R) be a subgraph of G_S, consisting of all the observed vertices in V_S and a subset of the edges in E_S.

Assumption 1 (Observed data). We observe the outcomes X_1, ..., X_n, the degrees d_1, ..., d_n, and G_R.

Let X = (X_1, ..., X_n), d = (d_1, ..., d_n), and denote the observed data as Y = (X, d, G_R).

We wish to conduct inference on the mean µ = n⁻¹ Σ_{i ∈ V_S} E[X_i] given Y. The mean µ is a functional of the joint distribution of outcomes for the units in the sample, and is accordingly a data-adaptive target parameter (van der Laan et al., 2013; Balzer et al., 2015) and not necessarily a feature of any broader population of units. Let X̄ = n⁻¹ Σ_{i ∈ V_S} X_i. We proceed by constructing conservative estimators of

    var(X̄) = (1/n²) Σ_{i ∈ V_S} Σ_{j ∈ V_S} cov(X_i, X_j).

We may use the square roots of these estimates as standard error estimators in order to construct Wald-type confidence intervals about the sample mean that are guaranteed to have asymptotic coverage for µ at greater than or equal to nominal levels.

3 Variance estimation

The observed subgraph G_R may not reveal all the edges in G_S that connect observed vertices.
We consider a class of variance estimators that depend on knowledge of G_S, whose structure is represented by an n × n binary symmetric adjacency matrix in which rows and columns are ordered by the indices 1, ..., n of the vertices in V_S. We now define some key concepts.

Definition 3 (Compatibility). The n × n binary symmetric adjacency matrix A is compatible with the observed data Y if for each {i,j} ∈ E_R, A_ij = A_ji = 1, and for each i ∈ V_S, Σ_{j ∈ V_S} A_ij ≤ d_i.

The last condition in Definition 3 requires that the degree of i in the subgraph G_S not be greater than its degree in the full graph G. Let A^O = {A^O_ij} be the true n × n adjacency matrix of G_S, where A^O_ij = 1 if {i,j} ∈ E_S for i, j ∈ V_S and 0 otherwise. Let A(Y) = {A : A is compatible with Y} in the sense of Definition 3; it is clear that A^O ∈ A(Y).

Definition 4 (Oracle estimator). For a family of variance estimators V̂(A; Y) defined for A ∈ A(Y), the oracle estimator is V̂(A^O; Y).

For a variance estimator V̂(A; Y), define the set A^m = {A ∈ A(Y) : V̂(A; Y) is maximized}.

Definition 5 (Maximal compatible estimator). Let A^m ∈ A^m. The maximal compatible estimator is V̂(A^m; Y).

The maximal compatible estimator provides a sharp upper bound for the oracle estimator because V̂(A^O; Y) ≤ V̂(A^m; Y). Finally, define the plug-in sample variance σ̂² = n⁻¹ Σ_{i ∈ V_S} (X_i − X̄)².

We now describe an asymptotic scaling, along with boundedness conditions for outcome values and unit degrees. In particular, bounding degrees suffices to ensure sufficient sparsity in the dependency graph to allow for root-n consistency, a central limit theorem, and convergence of the variance estimator.

Assumption 2 (Asymptotic scaling). Consider the sequence (G, Y)_n of nested graphs G and observed data Y = (G_R, X, d), where G_R = (V_S, E_R), |V_S| = n, and |V| = N ≥ n. Assume there exist finite, positive constants c_1, c_2 such that for every element (G, Y)_n, Pr(|X_i − µ| > c_1) = 0 for all i ∈ V_S (bounded outcome values) and Σ_{j ∈ V_S} A^O_ij ≤ c_2 for all i ∈ V_S (bounded degrees in the dependency graph).
Further assume there exists a finite, positive constant c_3 such that lim_{n→∞} n var(X̄) = c_3 (nondegenerate limiting variance).

We will proceed by deriving oracle estimators under two sets of nested assumptions. We establish their asymptotic properties, then derive feasible estimators that dominate the oracle estimators.

3.1 General Case

We first consider the case where we impose no assumptions on the distribution of any X_i (beyond the boundedness conditions of Assumption 2). Define the estimator

    V̂_1(A; Y) = (1/n²) [ n σ̂² + Σ_{i ∈ V_S} Σ_{j ∈ V_S} A_ij (X_i − X̄)(X_j − X̄) ].    (1)

The corresponding oracle estimator V̂_1(A^O; Y) is consistent.

Proposition 1. Under Assumption 2, for any ε > 0,

    lim_{n→∞} Pr( |n V̂_1(A^O; Y) − n var(X̄)| > ε ) = 0.

Proof. We follow the general proof strategy of Aronow and Samii (2013). We will establish mean square convergence of n V̂_1(A^O; Y) to n var(X̄), allowing us to invoke Chebyshev's inequality to prove the proposition. Decompose σ̂² = n⁻¹ Σ_{i=1}^n X_i² − n⁻² (Σ_{i=1}^n X_i)². Linearity of expectations implies E[X̄] = µ and E[n⁻¹ Σ_{i=1}^n X_i²] = n⁻¹ Σ_{i=1}^n E[X_i²]. Since Assumption 2 guarantees bounded outcomes, and the number of nonzero elements in the covariance matrix of outcome values is O(n), var(X̄) = O(n⁻¹) and var(n⁻¹ Σ_{i=1}^n X_i²) = O(n⁻¹), yielding convergence of σ̂².

Next we address convergence of the second term n⁻¹ Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij (X_i − X̄)(X_j − X̄). Asymptotic unbiasedness follows directly from linearity of expectations and var(X̄) = O(n⁻¹). To establish mean square convergence, we consider the variance

    var( n⁻¹ Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij (X_i − X̄)(X_j − X̄) )
      = (1/n²) Σ_{i,j,k,l ∈ V_S} cov( A^O_ij (X_i − X̄)(X_j − X̄), A^O_kl (X_k − X̄)(X_l − X̄) )    (2)
      = (1/n²) Σ_{i,j,k,l ∈ V_S} A^O_ij A^O_kl cov( (X_i − X̄)(X_j − X̄), (X_k − X̄)(X_l − X̄) ),

where the last line follows from bilinearity of covariance. Letting

    ξ_ijkl = cov( (X_i − X̄)(X_j − X̄), (X_k − X̄)(X_l − X̄) ),

we now examine the conditions under which ξ_ijkl ≠ 0.
Expanding the covariance,

    ξ_ijkl = cov( (X_i − X̄)(X_j − X̄), (X_k − X̄)(X_l − X̄) )
           = E[ (X_i − X̄)(X_j − X̄)(X_k − X̄)(X_l − X̄) ]
             − E[ (X_i − X̄)(X_j − X̄) ] E[ (X_k − X̄)(X_l − X̄) ],    (3)

which expands into a sum of raw moments of the X's and of X̄. Then by root-n consistency of means and Slutsky's Theorem, as n → ∞ expectations involving X̄ factorize, yielding, e.g., E[X_i X̄] = E[X_i] µ + O(n⁻¹). We therefore combine terms and rewrite (3) as

    ξ_ijkl = cov(X_i X_j, X_k X_l)
             − µ [ cov(X_i X_j, X_k) + cov(X_i X_j, X_l) + cov(X_i, X_k X_l) + cov(X_j, X_k X_l) ]
             + µ² [ cov(X_i, X_k) + cov(X_i, X_l) + cov(X_j, X_k) + cov(X_j, X_l) ] + O(n⁻¹)    (4)
           = ξ'_ijkl + O(n⁻¹),

where the limiting covariance is denoted ξ'_ijkl. This can only be nonzero if at least one of the covariance terms in (4) is nonzero. Since G_S is a dependency graph, this condition is only met when there exists at least one edge between a vertex in the set {i, j} and a vertex in the set {k, l}. Therefore A^O_ij A^O_kl ξ'_ijkl can only be nonzero if

    {A^O_ij = A^O_kl = 1}  and  ( {A^O_ik = 1} or {A^O_il = 1} or {A^O_jk = 1} or {A^O_jl = 1} ).

By Assumption 2, the degree of each vertex in V_S is bounded by c_2, so the condition is satisfied by at most 4n c_2³ terms in the summation in (2). In addition, we may compute the remainder term Σ_{i,j,k,l ∈ V_S} A^O_ij A^O_kl (ξ_ijkl − ξ'_ijkl) = Σ_{i,j,k,l ∈ V_S} A^O_ij A^O_kl O(n⁻¹) = O(n), thus both terms are O(n) before dividing by n². Therefore var( n⁻¹ Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij (X_i − X̄)(X_j − X̄) ) = O(n⁻¹) and the result follows.
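As a concrete illustration of the estimator in (1), here is a minimal sketch in Python (the authors' depinf implementation is in R; the function name v1_hat and the toy data below are ours, chosen only to show the arithmetic):

```python
from typing import List

def v1_hat(A: List[List[int]], X: List[float]) -> float:
    """General variance estimator from equation (1):
    (1/n^2) * (n * sigma2_hat + sum_{i,j} A_ij (X_i - Xbar)(X_j - Xbar))."""
    n = len(X)
    xbar = sum(X) / n
    sigma2_hat = sum((x - xbar) ** 2 for x in X) / n  # plug-in sample variance
    cross = sum(
        A[i][j] * (X[i] - xbar) * (X[j] - xbar)
        for i in range(n)
        for j in range(n)
    )
    return (n * sigma2_hat + cross) / n**2

X = [1.0, 0.0, 1.0, 1.0, 0.0]            # toy binary outcomes, n = 5
A_indep = [[0] * 5 for _ in range(5)]    # empty dependency graph
A_one = [[0] * 5 for _ in range(5)]      # single edge {0, 2}
A_one[0][2] = A_one[2][0] = 1

# With no edges, (1) reduces to the usual sigma2_hat / n; each edge adds
# its cross-product term twice, by symmetry of A.
v_indep = v1_hat(A_indep, X)
v_one = v1_hat(A_one, X)
```

With A = 0 this recovers the naïve estimator σ̂²/n, and adding an edge between two above-average (or two below-average) outcomes strictly increases the estimate, which is what the maximization in the next subsection exploits.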
Proposition 1 is readily applicable to problems where the dependency graph is known, as it provides a basis for consistent variance estimation, generalizing results for special cases (Conley, 1999; Aronow et al., 2015).

We now address the case where the true subgraph G_S is not known, but constraints on the graph are available. Let A^m_1 = {A ∈ A(Y) : V̂_1(A; Y) is maximized} be the set of compatible adjacency matrices that maximize V̂_1(A; Y). We can find an element A^m of A^m_1 by solving the 0-1 integer linear program

    maximize_A   (X − X̄1)' A (X − X̄1)
    subject to   A1 ⪯ d,    (5)
                 A ⪰ A_R,

where A_R is the adjacency matrix of G_R, 1 is the vector of ones, and ⪯ denotes the element-wise "less-than-or-equal" relation. Since A is an adjacency matrix, we can reduce the program and maximize over the decision variables that correspond to the upper or lower diagonal elements of A only (for details, see the supplementary materials). The resulting program has n(n−1)/2 decision variables, and in general it is a multidimensional knapsack problem (Kellerer et al., 2004a). In the abstract, this problem is NP-hard, but it admits a polynomial time approximation scheme (PTAS). Nonetheless, typical PTAS running times depend heavily on the size of the problem and can be very high (see, e.g., Section 9.4.2 of Kellerer et al., 2004a). In spite of this, in standard practice, for example with 1000 observations or less as in our application in Section 5, problem (5) can be solved in a few seconds with modern optimization solvers such as Gurobi.
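For intuition, a tiny instance of program (5) can be solved by exhaustive search instead of an ILP solver. This brute-force sketch (the function name and toy data are illustrative, not part of depinf) enumerates all symmetric 0-1 matrices satisfying the compatibility constraints:

```python
import itertools

def solve_program_5(X, d, E_R):
    """Exhaustive search over the compatible class A(Y): symmetric 0-1
    matrices with zero diagonal, containing every edge in E_R (A >= A_R)
    and with row sums bounded by the degrees (A1 <= d). Returns the maximal
    centered quadratic form (X - Xbar)' A (X - Xbar) and a maximizer."""
    n = len(X)
    xbar = sum(X) / n
    c = [x - xbar for x in X]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    best_val, best_A = float("-inf"), None
    for bits in itertools.product([0, 1], repeat=len(pairs)):
        A = [[0] * n for _ in range(n)]
        for (i, j), b in zip(pairs, bits):
            A[i][j] = A[j][i] = b
        if any(A[i][j] == 0 for (i, j) in E_R):
            continue  # must contain all observed edges (A >= A_R)
        if any(sum(row) > di for row, di in zip(A, d)):
            continue  # degree constraints (A1 <= d)
        # each unordered pair contributes twice to the quadratic form
        val = 2 * sum(A[i][j] * c[i] * c[j] for (i, j) in pairs)
        if val > best_val:
            best_val, best_A = val, A
    return best_val, best_A

# Two "high" and two "low" outcomes; every unit has at most one edge.
# The maximizer pairs same-signed deviations: edges {0,1} and {2,3}.
val, A = solve_program_5([1.0, 1.0, 0.0, 0.0], d=[1, 1, 1, 1], E_R=set())
```

For n beyond a dozen or so, the 2^{n(n−1)/2} enumeration is hopeless, which is exactly why the ILP formulation and solvers such as Gurobi are used in practice.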
To obtain a solution within a provably small optimality gap, these solvers use a variety of techniques, including: linear programming and branch-and-bound procedures to reduce the set of feasible solutions; presolve routines applied prior to the branch-and-bound procedures to reduce the size of the problem; cutting plane methods to remove fractional solutions and tighten the formulation; and a collection of heuristics to find good incumbent solutions in the branch-and-bound (Bixby and Rothberg, 2007; Linderoth and Lodi, 2010; Nemhauser, 2013). All these techniques are used in parallel by exploiting the availability of multiple cores in computers today. We provide an implementation in the new statistical package depinf for R.

While the true adjacency matrix A^O is not known, an element A^m ∈ A^m_1 produces a variance estimate V̂_1(A^m; Y) that is at least as large as the oracle estimator V̂_1(A^O; Y). As n grows large, the variance estimate V̂_1(A^m; Y) is conservative: the probability that n V̂_1(A^m; Y) underestimates n var(X̄) by more than ε > 0 tends to zero.

Corollary 1. Given Assumption 2, for any ε > 0,

    lim_{n→∞} Pr( n var(X̄) − n V̂_1(A^m; Y) > ε ) = 0.

Proof. Across all sample realizations, V̂_1(A^m; Y) ≥ V̂_1(A^O; Y). Then

    lim_{n→∞} Pr( n var(X̄) − n V̂_1(A^m; Y) > ε )
      ≤ lim_{n→∞} Pr( n var(X̄) − n V̂_1(A^m; Y) + n V̂_1(A^m; Y) − n V̂_1(A^O; Y) > ε )    (6)
      = lim_{n→∞} Pr( n var(X̄) − n V̂_1(A^O; Y) > ε )
      = 0

by Proposition 1.

Corollary 1 does not imply consistency of V̂_1(A^m; Y) as an estimator of var(X̄), nor does it imply that the estimator converges to any particular limiting value. Rather, we have established that, for large n, its distribution will tend to be at least as large as the true variance.

3.2 Alternative bounds under homoskedasticity

When all variances are equal, we can obtain alternative closed-form bounds that are computationally simpler and less sensitive to between-sample variability in the empirical variance-covariance matrix. The resulting estimator depends essentially only on the estimated variance of unit outcomes and the maximum number of edges in the dependency graph.

Assumption 3 (Homoskedasticity).
var(X_i) = var(X_j) for all i, j ∈ V.

Under homoskedasticity, the general estimator V̂_1(A^m; Y) developed in Section 3.1 provides a conservative variance estimate. A computationally simpler bound can be derived by noting that when var(X_i) = σ², cov(X_i, X_j) ≤ σ² A^O_ij. To this end, define the estimator

    V̂_2(A; Y) = (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} Σ_{j ∈ V_S} A_ij ].    (7)

The oracle estimator V̂_2(A^O; Y) is not generally consistent, though it is asymptotically conservative.

Proposition 2. Given Assumptions 2 and 3, for any ε > 0,

    lim_{n→∞} Pr( n var(X̄) − n V̂_2(A^O; Y) > ε ) = 0.

Proof. To prove the claim, we first define an alternative oracle estimator which presumes knowledge of the correlations ρ_ij,

    V̂_2*(A^O; Y) = (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij ρ_ij ].

Multiplying by n, n V̂_2*(A^O; Y) = σ̂² [ 1 + (1/n) Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij ρ_ij ]. As in the proof of Proposition 1, σ̂² converges in mean square. By Assumption 2, 1 ≤ 1 + (1/n) Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^O_ij ≤ 1 + c_2, allowing us to invoke Slutsky's Theorem and Chebyshev's Inequality to show lim_{n→∞} Pr( |n V̂_2*(A^O; Y) − n var(X̄)| > ε ) = 0. The Cauchy-Schwarz Inequality (i.e., all ρ_ij ≤ 1) implies V̂_2*(A^O; Y) ≤ V̂_2(A^O; Y) across all sample realizations. The result follows directly.

As before, we can maximize the estimator V̂_2(A; Y) over the family of compatible graphs. Define A^m_2 = {A ∈ A(Y) : V̂_2(A; Y) is maximized}, and let A^m ∈ A^m_2. To find an element of A^m_2, we solve the 0-1 integer linear program

    maximize_A   1' A 1
    subject to   A1 ⪯ d,    (8)
                 A ⪰ A_R,

where again A is an arbitrary 0-1 adjacency matrix and A_R is the adjacency matrix of G_R. Note that finding the solution to this problem does not depend on the empirical variance-covariance matrix; the variability of the estimator V̂_2(A^m; Y) is purely attributable to estimation error in σ̂².

Since V̂_2(A; Y) does not rely on any feature of A other than the number of positive entries, we can derive a looser closed-form upper bound by considering the maximum number of edges that can be in E_S. For i ∈ V_S, let d'_i = min{d_i, n−1} be the degree of i in G, truncated at n−1. Let

    V̂'_2(Y) = (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} d'_i ].    (9)

The estimator (9) does not depend on any particular member of the set A of compatible adjacency matrices.

Lemma 1. We have V̂_2(A^O; Y) ≤ V̂_2(A^m; Y) ≤ V̂'_2(Y), with V̂_2(A^m; Y) = V̂'_2(Y) when there exists a compatible adjacency matrix A^m ∈ A such that d'_i = Σ_{j ∈ V_S} A^m_ij.

Proof. By definition, V̂_2(A; Y) ≤ V̂_2(A^m; Y) for every A ∈ A. Since A^O ∈ A, it follows that V̂_2(A^O; Y) ≤ V̂_2(A^m; Y). Now let d^m_i = Σ_{j ∈ V_S} A^m_ij be the degree of i in the adjacency matrix A^m, and note that for every i ∈ V_S, d^m_i ≤ d'_i. Then

    V̂_2(A^m; Y) = (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} Σ_{j ∈ V_S} A^m_ij ]
                 = (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} d^m_i ]
                 ≤ (σ̂²/n) [ 1 + (1/n) Σ_{i ∈ V_S} d'_i ]
                 = V̂'_2(Y),

as claimed. Now consider a compatible adjacency matrix A ∈ A with the property that d'_i = Σ_{j ∈ V_S} A_ij. From the program (8) we see that 1'A1 = Σ_i d'_i is two times the maximal number of edges in G_S, A1 = d' ⪯ d by the definition of d' = (d'_1, ..., d'_n), and A ⪰ A_R since A ∈ A. It follows that A ∈ A^m_2, so we may call A^m = A. Therefore V̂_2(A^m; Y) = V̂'_2(Y), as claimed.

Lemma 1 implies a simple, conservative correction to the variance under homoskedasticity: simply multiply the conventional variance estimate σ̂²/n by 1 + d̄', where d̄' is the average truncated degree.

As expected, the upper bound estimators under homoskedasticity are asymptotically conservative.

Corollary 2. Given Assumptions 2 and 3, for any ε > 0,

    lim_{n→∞} Pr( n var(X̄) − n V̂_2(A^m; Y) > ε ) = 0,
    lim_{n→∞} Pr( n var(X̄) − n V̂'_2(Y) > ε ) = 0.

The proof follows from Lemma 1 and the same reasoning employed in the proof of Corollary 1.

4 Wald-type confidence intervals

We now prove that our variance estimates can be used to form valid Wald-type confidence intervals about the sample mean. First, we establish a central limit theorem for the sample mean given our asymptotic scaling.

Lemma 2. Given Assumption 2,

    (X̄ − µ) / √var(X̄) →_d N(0, 1).

Lemma 2, a standard result in applying Stein's method to the setting of local dependence, has been proven by, e.g., Theorem 2.7 of Chen et al. (2004).
Similarly, we reiterate the well-known basis for Wald-type confidence intervals.

Lemma 3. Given Assumption 2, if a variance estimator V̂(A; Y) satisfies, for any ε > 0,

    lim_{n→∞} Pr( |n V̂(A; Y) − n var(X̄)| > ε ) = 0,

then confidence intervals formed as X̄ ± z_{1−α/2} √V̂(A; Y) will have 100(1−α)% coverage for µ in large n.

Lemma 3 follows directly from Lemma 2 and Slutsky's Theorem. We now establish the validity of confidence intervals constructed via Lemma 3.

Proposition 3. Given Assumption 2, if a variance estimator V̂(A; Y) satisfies, for any ε > 0,

    lim_{n→∞} Pr( n var(X̄) − n V̂(A; Y) > ε ) = 0,

then confidence intervals formed as X̄ ± z_{1−α/2} √V̂(A; Y) will have at least 100(1−α)% coverage for µ in large n.

Proof. Define a random variable U such that

    U = V̂(A; Y)   if V̂(A; Y) ≤ var(X̄),
        var(X̄)    otherwise.

Then lim_{n→∞} Pr( |nU − n var(X̄)| > ε ) = 0, and by Lemma 3 Wald-type confidence intervals formed with U as a variance estimate will have at least proper coverage. Across every sample realization, V̂(A; Y) ≥ U, and thus the coverage of Wald-type confidence intervals using V̂(A; Y) will also be at least at proper levels.

It therefore follows that Wald-type confidence intervals constructed using the conservative variance estimators derived in Section 3 yield asymptotic coverage at at least nominal levels.

Corollary 3. Given Assumption 2, confidence intervals formed as X̄ ± z_{1−α/2} √V̂_1(A^m; Y) have at least 100(1−α)% coverage for µ in large n.

Corollary 4. Given Assumptions 2 and 3, confidence intervals formed as X̄ ± z_{1−α/2} √V̂_2(A^m; Y) or X̄ ± z_{1−α/2} √V̂'_2(Y) have at least 100(1−α)% coverage for µ in large n.

Proofs for Corollaries 3 and 4 follow directly from Corollaries 1 and 2 and Proposition 3.

Table 1: Standard error estimates and 95% asymptotic Wald-type confidence intervals for the population HIV prevalence µ.
    Naïve                      General                         Homoskedastic
    √(σ̂²/n) = 0.0147           √V̂_1(A^m; Y) = 0.0563           √V̂_2(A^m; Y) = 0.0602
    95% CI: (0.299, 0.357)     95% CI: (0.217, 0.438)          95% CI: (0.210, 0.446)
                                                               √V̂'_2(Y) = 0.0602
                                                               95% CI: (0.210, 0.446)

5 Application: HIV prevalence in a network study

The "Sexual Acquisition and Transmission of HIV-Cooperative Agreement Program" (SATH-CAP) surveyed n = 1022 injection drug users, men who have sex with men, and their sexual partners in St. Petersburg, Russia from 2005 to 2008 (Iguchi et al., 2009; Niccolai et al., 2010). Subjects were recruited using a social network link-tracing procedure known as "respondent-driven sampling" (RDS) (Heckathorn, 1997; Broadhead et al., 1998). Participants in an RDS study recruit other eligible subjects to whom they are connected within the target population social network. To preserve privacy, subjects do not report identifying information about their network alters; instead they report their degree in the target population network. Researchers observe the social links along which recruitment takes place, and the degrees of recruited individuals. Each subject in the SATH-CAP study completed a demographic and behavioral questionnaire and also received a rapid HIV test.

We treat the underlying social network as a dependency graph, denoted G = (V, E), representing possible probabilistic dependencies between surveyed subjects' HIV status. Let the subgraph of recruitments be G_R = (V_S, E_R), a subgraph of G; since only recruitment links in G were observed, the study design did not reveal the induced subgraph G_S. For each subject i ∈ V_S, we observe their reported total degree d_i and their binary HIV status X_i. Let the vector of subjects' HIV status be X = (X_1, ..., X_n), and let the vector of their degrees be d = (d_1, ..., d_n). The study reveals Y = (X, d, G_R), as described in Assumption 1.

The estimated HIV prevalence in the SATH-CAP study is µ̂ = X̄ = 0.328. Table 1 shows variance estimates and Wald-type 95% asymptotic confidence intervals computed using the variance estimators described in this paper.
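Since the outcomes are binary, σ̂² = X̄(1 − X̄), and the naïve column of Table 1 can be checked by direct arithmetic. A short sketch (the reported values are from Table 1; the implied average truncated degree is our back-of-the-envelope calculation, not a figure reported in the paper):

```python
import math

# Figures from Section 5 / Table 1 of the paper.
n, prevalence = 1022, 0.328
sigma2_hat = prevalence * (1 - prevalence)  # plug-in variance of a binary outcome
z = 1.959964                                # z_{0.975}

se_naive = math.sqrt(sigma2_hat / n)
ci_naive = (prevalence - z * se_naive, prevalence + z * se_naive)

# Back-of-the-envelope: the reported sqrt(V2'(Y)) = 0.0602 would imply an
# average truncated degree of about (0.0602 / se_naive)^2 - 1, since (9)
# multiplies sigma2_hat/n by (1 + average truncated degree).
dbar_implied = (0.0602 / se_naive) ** 2 - 1
```

This reproduces the naïve standard error 0.0147 and interval (0.299, 0.357) from Table 1, and suggests an average truncated degree on the order of 16 in the SATH-CAP data (our inference from the table).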
The first column shows the naïve standard error estimate with the corresponding confidence interval below. The second column gives results for the general case in which no assumptions are made about the variance of each X_i (Section 3.1). The third column gives results for the homoskedastic case in which var(X_i) is assumed to be equal to var(X_j) for i ≠ j (Section 3.2).

The naïve confidence interval is the narrowest, and is equivalent to the case where the adjacency matrix A is diagonal. Confidence intervals computed using the naïve estimator may dramatically understate the uncertainty in estimates of µ, as the estimator ignores the possibility of dependence between units. Confidence intervals computed using the estimates V̂_1 in the general case are narrower than those computed using the estimators V̂_2 under the homoskedasticity assumption. The widest intervals are obtained from the bounds given by V̂_2(A^m; Y) and V̂'_2(Y). From Lemma 1, we see that V̂_2(A^m; Y) = V̂'_2(Y) because d' = (d'_1, ..., d'_n) is the degree sequence of a compatible adjacency matrix in A.

6 Discussion

We have developed conservative estimators for the variance of the sample mean under partial observation of a dependency graph and assumptions about the variance of individual outcomes. The variance estimation setting we address here is quite flexible, and can accommodate a wide variety of dependency and observation assumptions. For example, Assumption 1, which states that we observe Y = (X, d, G_R), can be weakened when G_R is completely unknown. In this case the constraint in the integer linear programs (5) and (8) becomes A ⪰ 0, where 0 is the n × n matrix of all zeros; this constraint is met for all adjacency matrices A, so it becomes superfluous. Alternatively, we may not have full knowledge of the degrees d = (d_1, ..., d_n), and may instead have only an upper bound d*_i for each d_i, or a global upper bound d_i ≤ d* for all i = 1, ..., n. Conservative variance estimation in both of these cases can be achieved (by substituting d*_i or d* for d_i) with no change to the programs (5) and (8) or to the asymptotic results given here.
When no information about G_R or the degrees d is available, setting every d_i = d* = n − 1 delivers a maximally conservative upper bound.

We note here four extensions. (i) Upper bounds for the variance estimates can be obtained by solving a relaxed form of the programs (5) and (8). By Proposition 3, using such upper bounds as a basis for conservative inference will also yield valid confidence intervals. In practice, the results obtained by modern optimization solvers will be tighter, with a provably small optimality gap, and thus will typically be preferable. (ii) It is possible to extend our results to obtain confidence intervals more generally for asymptotically linear estimators (including regression estimators, e.g., Cameron and Miller, 2015) using an empirical analogue of the variance of the influence function as the objective function. (iii) Our results facilitate conservative inference for causal estimands under interference between units (e.g., Tchetgen and VanderWeele, 2010; Liu and Hudgens, 2014), given interference that can be characterized by a constrained dependency graph. (iv) Given additional assumptions about the manner in which the units in the sample are drawn from a broader population, our results could be extended to facilitate confidence intervals for the mean of this broader population.

Acknowledgement

Forrest W. Crawford was supported by NIH/NCATS grant KL2 TR000140 and NIMH grant P30MH062294. José R. Zubizarreta acknowledges support from a grant from the Alfred P. Sloan Foundation. We are grateful to Robert Heimer for helpful comments and for providing the SATH-CAP data, funded by NIH/NIDA grant U01DA017387. We also thank Daniel Bienstock, Winston Lin, Luke W. Miratrix, Molly Offer-Westort, Lilla Orr, Cyrus Samii, and Jiacheng Wu for valuable comments. We express special thanks to Sahand Negahban for important early discussions regarding the formulation of the problem.

Supplementary Material

A Formulation of the integer linear programs

In order to solve the program (5), let v̂_ij be the ij-th element of the sample covariance matrix, with i = 1, ..., n and j = 1, ..., n.
Since the sample covariance matrix is symmetric, we can focus on its upper triangular part and use the decision variable a_ij ∈ {0, 1}, indicating whether the edge {i, j} is included, for each i < j. Based on these decision variables, the integer linear program (5) can be written as

    maximize_a   Σ_{i=1}^{n} Σ_{j=i+1}^{n} v̂_ij a_ij

    subject to   Σ_{j=1}^{i−1} a_ji + Σ_{j=i+1}^{n} a_ij ≤ d_i,   i = 1, ..., n,

                 a_ij ∈ {0, 1},   i = 1, ..., n,  j = 1, ..., n,  i < j,
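By contrast with the program above, the homoskedastic objective in (8) ignores the covariances v̂_ij entirely, and the closed-form bound (9) needs no optimization at all. A sketch (function name ours) of V̂'_2 and the Lemma 1 correction, multiplying σ̂²/n by one plus the average truncated degree:

```python
def v2_prime(X, d):
    """Closed-form bound (9): (sigma2_hat / n) * (1 + average truncated degree),
    where each reported degree is truncated at n - 1."""
    n = len(X)
    xbar = sum(X) / n
    sigma2_hat = sum((x - xbar) ** 2 for x in X) / n
    avg_trunc_degree = sum(min(di, n - 1) for di in d) / n
    return (sigma2_hat / n) * (1 + avg_trunc_degree)

X = [1.0, 0.0, 1.0, 1.0]                 # toy outcomes, n = 4, sigma2_hat = 3/16
naive = v2_prime(X, [0, 0, 0, 0])        # no dependence: the usual sigma2_hat / n
bounded = v2_prime(X, [3, 3, 3, 3])      # maximal degrees: 4x the naive variance
capped = v2_prime(X, [10, 10, 10, 10])   # reported degrees truncated at n - 1 = 3
```

With all degrees zero the bound collapses to the naïve variance, and reported degrees above n − 1 are harmless because of the truncation, matching the definition of d'_i.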
