StatisticalScience 2009,Vol.24,No.2,223–237 DOI:10.1214/09-STS294 (cid:13)c InstituteofMathematicalStatistics,2009 A Multivariate Variance Components Model for Analysis of Covariance in Designed Experiments James G. Booth, Walter T. Federer, Martin T. Wells and Russell D. Wolfinger 0 1 0 2 Abstract. Traditional methods for covariate adjustment of treatment n means in designed experiments are inherently conditional on the ob- a J served covariate values. In order to develop a coherent general method- 8 ology for analysis of covariance, we propose a multivariate variance 1 components model for the joint distribution of the response and co- variates. It is shown that, if the design is orthogonal with respect to ] E (random) blocking factors, then appropriate adjustments to treatment M means can be made using the univariate variance components model obtained by conditioning on the observed covariate values. However, . t a it is revealed that some widely used models are incorrectly specified, t leadingtobiasedestimatesandincorrectstandarderrors.Theapproach s [ clarifies some issues that have been the source of ongoing confusion in 1 the statistics literature. v Key words and phrases: Adjusted mean, blocking factor, conditional 1 model, orthogonal design, randomized blocks design. 1 0 3 1. 1. INTRODUCTION Cochran and Cox (1957), Snedecor and Cochran 0 (1967), as well as more recent texts such as Milliken 0 This article concerns the adjustment of treatment and Johnson (2002). We are specifically interested 1 meansindesignedexperimentstoaccount foroneor in settings in which the covariates are random vari- : v more covariates. Analysis of covariance has a long ables, not fixed by the design. Also, we generally i history, dating back to Fisher (1934). Much of its X suppose that the covariate values are not affected development in the context of designed experiments r by the treatments, for example, because they were a followed soon after. Examples can befoundin many measuredpriortoapplicationofthetreatments.The classical textbooks, including Federer (1955), multivariate mixed model we propose can be modi- fied to handle problems in which this assumption is James Booth e-mail: [email protected] and Martin not valid. However, there are additional inferential Wells e-mail: [email protected] are Professors and issues in this case. Walter Federer e-mail: [email protected] was Emeritus As a canonical example we discuss in detail the Professor, Department of Biological Statistics and randomized complete blocks (RCB) design with a Computational Biology, Comstock Hall, Cornell University, Ithaca, New York 14853, USA. Russell single covariate. In the classical analysis of this de- Wolfinger is Director of Scientific Discovery and signtheblocksaretreatedasfixed,andthecovariate Genomics, SAS Institute Inc., Cary, North Carolina is included as a predictor. The least squares treat- 27513, USA e-mail: Russ.Wolfi[email protected]. ment means obtained from the fixed blocks model fit adjust the arithmetic treatment means to ac- This is an electronic reprint of the original article count for differences in the average covariate mea- published by the Institute of Mathematical Statistics in Statistical Science, 2009, Vol. 24, No. 2, 223–237. This surements among the treatments. Including the co- reprint differs from the original in pagination and variate block means as an additional predictor has typographic detail. no effect on the least squares fit because differences 1 2 BOOTH, FEDERER, WELLS AND WOLFINGER attheblocklevelarealreadyaccountedfor.Treating showthatconditionalmaximumlikelihoodcanelim- blocks asfixedeffectively confinesthescopeofinfer- inate the bias. ence toonly thoseblocks andcovariate values in the An outline of the remainder of the paper is as fol- study. In particular, the standard errors obtained lows. In the next section we look back historically from the fixed effects model for the least squares and attempt to explain why some fundamental is- treatment means do not account for repeated sam- sues in analysis of covariance are still unresolved. pling of blocks. Therandomized complete blocks design is discussed However, in most applications the blocks in the in detail in Section 3. In Section 4 we show how the studycanbeviewedasarandomsamplefromapop- bivariate model for the randomized complete blocks ulation of interest, and it is of interest to extend the designcanbegeneralized toorthogonaldesigns,and scope of inference to the population of blocks. An to allow adjustment for multiple covariates. In the obvious modification of the classical model in this orthogonal case the conditional model for the re- case is to simply treat the block effects as random; sponsegiventhecovariates impliedbythemultivari- that is, to fit a (univariate) mixed model with two ate mixed model for the joint distribution turns out variance components. Under the standard indepen- to be a univariate mixed model. This implies that dence and normality assumptions, adjusted treat- appropriate adjustment of treatment means can be ment means obtained from the mixed model have accomplished usingstandardsoftware. Themethod- the same form as the classical least squares means, ology is applied to some standard examples of or- except that the covariate regression coefficient is thogonaldesigns.Thenonorthogonalcaseisdiscussed a weighted average of the estimates obtained from in Section 5. Here we show that appropriate adjust- intra- and inter-block regressions. ment cannot be accomplished by fitting a univari- A key point made in this paper is that both uni- ate mixed model. However, an EM algorithm for variate fixed and mixed models for analysis of co- fitting a general multivariate linear variance com- variance are inherently conditional on the measured ponents model is developed using arguments that covariate values. An obvious question is, therefore, parallel those for the univariate case, as discussed what joint distribution for response and covariate inSearle, Casella and McCulloch(1992),Chapter8. leads to the conditional models. We show that by The methodology is applied to balanced incomplete simply treating the block effects as random in the block designs and unbalanced designs. We conclude randomized complete blocks design setting, one is with some discussion in Section 6. implicitly assuming that the marginal block vari- ance for the covariate is zero; that is, an implied 2. HISTORICAL PERSPECTIVE model for the joint distribution that is not realistic. As a result, the adjusted treatment means obtained Since analysis of covariance in designed experi- from the naive, univariate mixed model are biased. ments has such a long history, readers may won- However, by starting with a bivariate variance com- der why confusion over such basic modeling issues ponents model for the joint distribution of response persists even today. We attempt to explain this by andcovariate, oneisledtoasensibleunivariatecon- discussing the topic in its historical context. Ar- ditional model, which properly accounts for the de- guably,theheydayofanalysisofcovariancewaspre- sign with respect to the covariate. The idea of a 1960, predating the widespread use of matrix alge- bivariate model is suggested in Cox and McCullagh bra in statistics and clearly before the availability [(1982), Section 7], but not fully developed. Multi- of high-speed computing. Matrices are two hundred variate variance components models are discussed andsomeyearsoldbuttheiruseinstatisticsonlybe- in Khuri, Mathew and Sinha [(1998), Chapter 10], came commonplace in the late 1950s. The very first but the application to analysis of covariance is not paperinthefirstissueoftheAnnalsofMathematical considered. The fully conditional approach was also Statistics by Wichsell (1930) is entitled “Remarks advocated by Neuhaus and McCulloch (2006) who on Regression,” yet it has no matrices. It is likely consider the situation where random effects in a that the lateness of adoption of matrices arose from generalized linear mixed model may be correlated their being treated as a topic in pure mathematics with one of the predictors. Classical likelihood ap- and, hence, their practical use in statistical mod- proaches lead to inconsistent estimators in this set- eling remained hidden. In the classic design books ting. Results in Neuhaus and McCulloch (2006) Cochran and Cox (1957) and Federer (1955), there COVARIANCEIN DESIGNED EXPERIMENTS 3 is no mention of matrices, whereas in Kempthorne Some statisticians, however, have advo- (1952)thedesignproblemisformulatedas ageneral cated a more general model which allows linear model but is not applied to analyze any ad- the intra-block regression coefficients to vanced designs. Kempthorne [(1952), page 66] even be different from the inter-block regres- notes that, at the time, there did not appear to sion. In this paper, all models are such be a complete matrix formulation of the general thattheintra-block regressionis thesame linear model anywhere in the literature. In unbal- as that for the inter-block regression. It is anced data it was a longtime puzzle why statistical difficult for this writer to visualize situa- methods gave two different least squares estimates tions allowing separate regressions. of fixed effects in a one-way classification depending It turns out that the “[s]ome statisticians” Zelen upon whether one assumed that one effect was zero, referred to were right, but why? Clearly their argu- orthatalltheeffectssummedtozero.Theliterature ments were not entirely convincing at the time. The had certainly not kept up with R. C. Bose’s (1949) bivariate variance components model described in concept of estimability. Nor were 1950s design and Section 3.3 reveals that Zelen’s two statements are linearmodelresearchersawarethatPenrose’s(1955) incompatible.Infact,betweenblock variationinthe generalized inverse matrix could be used to solve covariate implies that the intra- and inter-block re- the normal equations in the nonfull rank setting, as gressions are not the same. Putting it another way, in design problems. With the aid of a generalized forcing the two slopes to beequal amounts to an as- inverse, Rao (1962) demonstrated how the unique sumptionthatthereis novariation among theblock unbiased estimators of estimable functions could be covariate means.Thisassumption is clearly violated constructed. In a recent Statistical Science conver- in practical circumstances and therefore leads to bi- sation (Wells (2009)) Shayle Searle points out that ased (adjusted) treatment means as well as incon- random effects modeling in the 1950s was quite lim- sistent estimates of variance components. ited and mostly for balanced data. One of the early formulations of matrix methods in variance and co- 3. RANDOMIZED COMPLETE BLOCKS variancecomponentsanalysiscanbefoundinSearle DESIGN (1956). An excellent paper by Zelen (1957) reviews the 3.1 Classical Approach thinking at the time for balanced incomplete block Considerarandomizedcompleteblocksdesignwith (BIB) designs. The algebraic manipulations associ- response, Y, and associated concomitant variable, ated with the multivariate model described in this Z. Suppose that Z is measured prior to application paperwouldbeverydifficult,ifnotimpossible,with- of the treatment but is possibly correlated with the out the use of matrix algebra and facility with the response.Let i=1,...,t bethe index for treatment, multivariatenormaldistribution,mathematicaltools that were not fully developed in the statistics liter- and j=1,...,b be the index for block. The classical ature at the time of Zelen’s paper. We focus on Ze- fixedeffectslinearmodelforthisdesignisasfollows: len’s discussion of intra- and inter-block regressions, (1) Y =µ+τ +β +γz +E , ij i j ij ij which we summarize with the following two quotes from Section 3 of his paper: where Eij ∼ N(0,σe2), independently, and iτi = β =0. Notice that replacing the covariate z in the analysis of covariance, the inter- j j P ij with the block centered value z −z¯ has no effect ij ·j block model will be important if the vari- P on the fit of this model because the term, −γz¯ , ·j ability of the concomitant variate is large can be incorporated into the fixed effect for block j. for “between blocks” as compared to the The adjusted mean for treatment i is the estimated variability “within blocks.” This situation mean response at a fixed value of Z, conventionally may allow more precise inter-block esti- taken to be its average observed value, z¯ . matesoftheregressioncoefficientsascom- ·· ThroughoutthispaperGreeklettersrepresentfixed paredto thecorrespondingintra-block es- effects (unknown parameters), upper case Roman timates letters arerandom variables or known matrices, and and later, when discussing the slopes of intra- and lower case Roman letters are either observed values inter-block regressions, ofrandomvariablesorknownconstants(orvectors). 4 BOOTH, FEDERER, WELLS AND WOLFINGER Table 1 Adjusted treatment means and standard errors for Pearce’s apple yield data Univariate fixed Univariate mixed Bivariate mixed Treatment Adj.Mean Std.Err. Adj.Mean Std.Err. Adj.Mean Std.Err. A 280.48 6.37 280.41 13.69 280.48 12.98 B 266.57 6.36 266.55 13.68 266.57 12.98 C 274.07 6.36 274.05 13.68 274.07 12.98 D 281.14 6.44 281.32 13.72 281.14 13.02 E 300.92 6.72 301.33 13.87 300.92 13.19 S 251.34 6.86 250.85 13.95 251.34 13.28 Adjusted means and their standard errors for the apple yield data from Pearce (1953,1982),basedon fixedeffects,univariatemixedeffectsandbivariatemixed effectsmodels. ThestandarderrorsinvolvetheMLestimates ofvariancecompo- nents. With these conventions it is implicit in the notation ordinary least squares estimate that model (1) characterizes the conditional distri- zT(C ⊗C )y t b bution of the response given the observed values of (3) γˆols= zT(C ⊗C )z, the concomitant variable. t b We define the inter-block regression model to be where Ct = It − J¯t is the centering matrix of di- the implied model for the block means, specifically, mension t, and y=(y11,y12,...,ytb)T is the entire response vector, with z defined analogously. Since Y¯·j =µ+βj +γz¯·j +E¯·j. CbJ¯b=0,itfollowsfrom(3)thatγˆols isindependent oftheunadjustedtreatmentmeanvector,(I ⊗J¯ )y, Thus, in this context, the inter-block model con- t b with components, y¯ , i=1,...,t. Hence, the vari- i· tains no information about treatment differences, ance formula for the adjusted means based on the or about the regression parameter, γ, because the traditional model with fixed block effects is terms, γz¯ , are confounded with the block effects. ·j σ2 σ2 Wedefinetheintra-block regressionmodelusingthe (4) var(µˆ )= e + e (z¯ −z¯ )2. i,adj b zT(C ⊗C )z i· ·· t−1 orthogonal Helmert contrasts, h ,...,h , be- t b 2 t tween components of theobservation vector, Y ,for For numerical illustration we consider the apple j block j. Specifically, let Y∗ =hTY , for i=2,...,t yield data from Pearce (1953, 1982). In this experi- ij i j and j = 1,...,b, and similarly define z∗ and E∗. ment there were b=4 blocks, and t=6 treatments ij ij Then the intra-block model in this case is (A, B, C, D, E and S), with S being the standard practice in English apple orchards of keeping the Yi∗j =τi∗+γzi∗j +Ei∗j, land clean in the summer. The response, Y, is the yield per plot, and the covariate, Z, is the number where τ∗ =hTτ, i= 2,...,t, are t−1 orthogonal i i of boxes of fruit, measured to the nearest tenth of a contrasts among the treatment means. It follows box, for the four seasons previous to the application thatalltheinformationabouttreatmentdifferences, of the treatments. The adjusted treatment means and the covariate regression parameter, is contained and their estimated standard errors, based on three in the intra-block model. different models, are reported in Table 1. The adjusted treatment means are estimates of 3.2 Univariate Mixed Model the mean responses when Z =z¯ . These are given ·· by In most applications the blocks can be regarded as a random sample from a population, and it is of (2) µˆ =µˆ+τˆ +γˆz¯ =y¯ −γˆ(z¯ −z¯ ), i,adj i ·· i· i· ·· interest to make inferences about the average treat- menteffectsacrossthepopulationofpotentialblocks. for i=1,...,t, where γˆ is the BLUE for γ. These Insuchcasesitmakessensetotreattheblockeffects do not involve the (estimated) block effects because as random. Thus, model (1) becomes they are averages over the blocks and the block ef- fects sum to zero. In the fixed effects case, γˆ is the (5) Y =µ+τ +B +γz +E , ij i j ij ij COVARIANCEIN DESIGNED EXPERIMENTS 5 wherenow B ∼ i.i.d.N(0,σ2) independently of the case in which the block effects are omitted from the j b random errors, E . Replacing the covariate z with model. ij ij thedatacenteredvalue,z −z¯ ,hasnoeffect onthe Theadjustedmeansandtheirstandarderrorsbased ij ·· fit of this model because the term, −γz¯ , can be in- on the mixed effects model (5), for the apple yield ·· corporated into the fixed intercept. However, unlike data, are tabulated in Table 1. The ML variance the fixed block model (1), replacing the covariate componentestimatesinthisexampleareσˆ2=194.55 e with block centered values, z −z¯ , does affect the and σˆ2=553.98, resulting in a correlation estimate ij ·j b fit. ρˆ=0.9447, and γˆ=28.89. This compares with the The inter-block regression model derived from (5) ordinary least squares estimate γˆ=28.40. Thus, in this case theadjusted mean values are quitesimilar. is However, the standard errors reported by the soft- (6) Y¯·j =µ+γz¯·j +Bj +E¯·j, ware are quite different. This is because inferences from the fixed effects model (1) are restricted to while the intra-block model is the same as in the the four blocks in the study, whereas those from the fixed effects case. Thus, when the block effects are model (5) apply to the population of blocks. Specif- treated as random, they are incorporated into the ically, since (7) implies γˆ is independent of the mixed error term of the inter-block model. As a result, the unadjusted treatment means, inter-block model does contain additional informa- tion about the covariate regression parameter. In var(µˆi,adj) particular, it would appear from (6) that the infor- σ2+σ2 mation in the inter-block model will increase with = e b b the variability of the covariate block means. This zT[(I −ρJ¯ )⊗C ]Σ[(I −ρJ¯ )⊗C ]z explains the first quote from Zelen (1957) given in t t b t t b + Section 2 (albeit for a BIB design). (zT[(It−ρJ¯t)⊗Cb]z)2 The adjusted treatment means based on model ·(z¯ −z¯ )2, i· ·· (5) have the form (2), being the expected treat- where Σ ≡ var(Y) = σ2I ⊗I +σ2J ⊗I . Notice ment means in repeated sampling (involving differ- e t b b t b that, even as ρ approaches 1, this variance formula entblocks)atacommonconcomitantvariablevalue, still differs from the fixed effects variance given in Z =z¯ . However, the BLUE for γ is a weighted av- ·· (4) by an additive amount, σ2/b, which accounts for erage of the estimates obtained from the intra- and b variation due to sampling of blocks. inter-block regression models,wherethe weights are inversely proportional to their variances. Let ρ de- 3.3 Bivariate Mixed Model note correlation between any two sample treatment As noted in the Introduction, the models (1) and means, that is, (5) are inherently conditional on the observed val- σ2 ues of the covariate Z. The fixed effects model (1) is ρ=cor(Y¯ ,Y¯ )= b . i· k· σ2+σ2/t appropriate if the blocks in the experiment are the b e only ones of interest, whereas model (5) is an at- Then, it is shown in the Appendix that the BLUE tempt to broaden the applicability of inferences to of γ based on (5) has the form the hypothetical population from which the blocks were drawn. An obvious question is what model(s) zT[(I −ρJ¯ )⊗C ]y γˆ = t t b for the joint distribution of (Y,Z) leads to the con- mixed zT[(I −ρJ¯ )⊗C ]z t t b ditional model (5)? (7) zT(C ⊗C )y+(1−ρ)zT(J¯ ⊗C )y Consider a bivariate model in which the distribu- t b t b = . tionofZ isindependentofthetreatmentsbutallows zT[(I −ρJ¯ )⊗C ]z t t b forrandomvariationbetweenblocksandresidualer- If the block variance dominates the error variance, ror. Specifically, and hence ρ ≈ 1, then the mixed effects estimate Y µ τ of γ is close to the ordinary least squares estimate ij = y + i,y Z µ 0 ij z in (3). On the other hand, if the block variance is (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) (cid:19) (8) dominated by the error variance, then ρ≈0, and B E + j,y + ij,y the estimate in (7) corresponds to the fixed effect B E j,z ij,z (cid:18) (cid:19) (cid:18) (cid:19) 6 BOOTH, FEDERER, WELLS AND WOLFINGER where the block effects are i.i.d. bivariate normal, σ2 b,z · I − J (σ I +σ J ) Bj,y ∼ i.i.d.N 0 ,Σ = σb2,y σb,yz , (cid:18) t σe2,z+tσb2,z t(cid:19) e,zy t b,zy t (cid:18)Bj,z(cid:19) 2(cid:20)(cid:18)0(cid:19) B (cid:18)σb,zy σb2,z (cid:19)(cid:21) =σe2It+σb2Jt, independently of the bivariate residual errors, where σ2=σ2 −σ2 /σ2 , and e e,y e,yz e,z E 0 σ2 σ Eij,y ∼ i.i.d.N2 0 ,ΣE = σe,y σe2,yz . (13) σb2=σb2,y−[γeσb,yz+γb(σb,yz +σe,yz/t)]. (cid:18) ij,z(cid:19) (cid:20)(cid:18) (cid:19) (cid:18) e,zy e,z (cid:19)(cid:21) It follows from (10) and (12) that the univariate As before, let Y = (Y ,...,Y )T denote the re- j 1j tj conditional model implied by (9) is sponse vector for the jth block, and similarly define Zj. Then the conditionally specified model implied (14) Yij =µ+τi+Bj +γezij +γbz¯·j +Eij, bybivariatemodel(8)canbeformally derivedusing where µ=µ −(γ +γ )µ , τ ≡τ , and B ∼i.i.d. the fact that y e b z i i,y j N(0,σ2) independently of E ∼i.i.d.N(0,σ2). The Y b ij e j inter-block regression model implied by (14) is Z j (cid:18) (cid:19) Y¯ =µ+γ z¯ +B +E¯ , ·j be ·j j ·j µ σ2 I +σ2 J (9) ∼ i.i.d.N y , e,y t b,y t 2t µ 1 σ I +σ J where z t e,zy t b,zy t (cid:20)(cid:18) (cid:19) (cid:18) σ +σ /t σσe,e2y,zzIItt++σσbb2,,zyzJJtt(cid:19)(cid:21), γbe≡γe+γb= σb,b2y,zz +σee2,,zyz/t whereµ isthevectoroftreatmentmeanswithcom- is the slope of the inter-block regression. Thus, in y ponents, µ =µ +τ . It follows that this case the inter-block model contains no infor- i,y y i,y mation about the intra-block covariate regression E(Y |Z =z ) j j j coefficient. Similarly, in the case of a generalized =µ +(σ I +σ J )(σ2 I +σ2 J )−1 linear mixed model, where random effects may be y e,yz t b,yz t e,z t b,z t correlated with one of the predictors, it is shown ·(z −1 µ ) j t z in Neuhaus and McCulloch (2006) that conditional (10) 1 maximum likelihood also leads naturally to the par- =µ +(σ I +tσ J¯ ) y e,yz t b,yz t σ2 titioning of the covariate into between- and within- e,z cluster components. tσ2 Writing (13) in terms of the intra-block and inter- · I − b,z J¯ (z −1 µ ) t σ2 +tσ2 t j t z block slopes, γe and γbe, we obtain (cid:18) e,z b,z (cid:19) =µy+γe(zj −1tµz)+γb1t(z¯·j −µz), σb2=σb2,y−γeσe,yz/t−γbe(σb,yz +σe,yz/t). where γ =σ /σ2 and Thus,theblockvariancefortheresponseinthecon- e e,yz e,z ditional model is the marginal block variance ad- σe2,zσb,yz−σe,yzσb2,z justed for intra- and inter-block regression on the (11) γ = . b σ2 (σ2 +σ2 /t) covariate. e,z b,z e,z The univariate mixed model (5) is the conditional The conditional variance is model implied by (8) when γ =0, which only hap- b var(Yj|Zj =zj) pens if σb2,z =0, an unrealistic assumption in prac- tice. At this point it is interesting to recall Zelen’s =σ2 I +σ2 J e,y t b,y t (1957) comments, quoted earlier, concerning the −(σ I +σ J )(σ2 I +σ2 J )−1 equality of slopes in the inter- and intra-block mod- e,yz t b,yz t e,z t b,z t els, and the information in the inter-block model ·(σe,zyIt+σb,zyJt) abouttheintra-blockslopeincreasingwiththeblock- (12) to-block variability in the covariate. It is now clear =σ2 I +σ2 J e,y t b,y t that these statements are incompatible. Block-to- 1 block variability in the covariate implies that the −(σ I +σ J ) e,yz t b,yz t σ2 inter- and intra-block slopes are different. For this e,z COVARIANCEIN DESIGNED EXPERIMENTS 7 reason the use of the univariate mixed model (5) 4. GENERAL ORTHOGONAL BLOCKING leads to biased estimates of adjusted means and in- DESIGNS consistent estimates of variance components as the 4.1 Theory number of blocks increases. We definetheadjustedtreatment means to bethe Let Z =(Z ,...,Z )T be a covariate vector ij ij1 ijm expected responses if the covariate values were all associated with the response Y , for i=1,...,k in ij equal to theaverage observed covariate value. Thus, replicatej=1,...,b.Thus,thedatamatrixforrepli- the model (14) implies cate j is given by µi,adj=µ+τi+(γb+γe)z¯·· Y1j ZT1j (15) Y ZT =µi,y+γbe(z¯··−µz). ..2j ..2j=[Yj,Zj], It is shown in the Appendix that µˆ =z¯ , that γˆ . . z ·· e equals the ordinary least squares estimate based on Y ZT kj kj univariate fixed effects model, and that µˆi,y =y¯i·− say. Let Z∗ denote the rth column of Z . Suppose γˆ (z¯ −z¯ ). It follows that the adjusted treatment jr j e i· ·· thatY canbedecomposedintoY =µ +T +U , means based on (14) are identical to (2). j j y j j whereµ is the fixedmean of Y ,which dependson Estimatesoftheadjustedtreatmentmeansforthe y j the treatments, T is the sum random factors asso- apple yield data based on (14) are given in Table 1. j ciated with treatments (and therefore independent The estimate of the inter-block slope in this case is ofZ ),andU isthesumof q randomdesignfactors γˆ =37.25. This is quite different in magnitude (al- j j be plus residual errors. though not statistically) from the estimated intra- Wesupposethat vec[U ,Z ],j=1,...,b,arei.i.d. block slope, γˆ =28.40. Since the intra-block esti- j j e multivariate normal vectors of dimension k(m+1), mate is identical to those based on the univariate withmeans,vec[0 ,µT⊗1 ],wherethecomponents fixed effects model, the standard errors for the ad- k z k of µ are the marginal means of the m covariates, justed means are given by z and with covariance matrix, V. We say that the de- σ2+σ2 σ2 var(µˆ )= e b + e (z¯ −z¯ )2. signisan“orthogonal blockingdesign”ifthematrix i,adj b zT(Ct⊗Cb)z i· ·· V has thefollowing structure.Let A ≡J¯ ,and A , 0 k l The estimated standard errors are larger than those l =1,...,q, be k ×k matrices with the properties basedonthefixedeffectsmodelbyanadditivefactor that (a) A is idempotent, (b) A A =0 if l6=l′, l l l′ of σ2/b. This is as is should be, because the scope and q A =I .Thenwesupposethereexistnon- b l=0 l k of inference has been broadened to the population singularmatrices,G ,G ,...,G ,eachofdimension 0 1 q of blocks. m+P1, such that Up to now we have assumed that the covariate q values are not affected by the treatments. If they (16) V= G ⊗A . l l are, then the bivariate model (8) is no longer appro- l=0 priate. An obvious modification of (8) in this case X Noticethatadesigncanbeorthogonal,inthissense, is regardless of the assignment of the treatments. Y µ τ B E ij = y + i,y + j,y + ij,y . In general, the matrix V is a function of (q + Z µ τ B E (cid:18) ij(cid:19) (cid:18) z(cid:19) (cid:18) i,z(cid:19) (cid:18) j,z(cid:19) (cid:18) ij,z(cid:19) 1)1(m+1)(m+2) free variance and covariance pa- 2 The conditional model for Y given Z implied by rameterswhichdeterminetheq+1variance-covariance this model has exactly the same form as (14). How- matrices, Σ ,...,Σ , associated with the residual 0 q ever, the treatment effect parameter τ is equal to i errors, and the q random design factors. In particu- τ −γ τ . This makes sense in that what is be- i,y e i,z lar, we note that the variance–covariance structure ing estimated is the direct effect of treatments on for U is j the response mean, as opposed to the indirect effect q through the covariate. As noted by Bartlett (1936), V = g A , there is reason for caution in this setting dueto hid- uu l,uu l den extrapolation. Comparing conditional expecta- Xl=0 tions of treatment means at equal covariate levels where g , l=0,...,q, are scalar parameters, and l,uu may not make sense if the treatments affect what that this structure is that implied by orthogonality covariate values are observed. of the random blocking factors. 8 BOOTH, FEDERER, WELLS AND WOLFINGER Example. Consider the RCB design discussed the covariates are equal to their respective marginal in Section 3. In this case there is only one covariate, means is so m= 1. The vector U associated with the jth j µ =µ −1 γT(z¯∗ −µ ), block consists of thesum of the block effect and and adj y k 0 ·· z the residual error vector, which generalizes the formula (15) for the RCB de- sign. U =1 B +E . j t j j The conditional variance of U is given by j Finally,thecovariancematrixfor(YjT,ZTj )T isgiven var(Uj|Zj =zj)=Vuu−VuzVz−z1Vzu by q V=Σ ⊗I +Σ ⊗J = (g −g G−1 g )⊗A E t B t l,uu l,uz l,zz l,zu l =(Σ +tΣ )⊗J¯ +Σ ⊗(I−J¯ ). Xl=0 E B t E t q The fact that vec[Uj,Zj], j = 1,...,b, are i.i.d. = λlAl, multivariate normal vectors implies that marginally l=0 X vec(Z ),j=1,...,b,arei.i.d.N(µ ⊗1 ,V ),where j z k zz corresponding to an orthogonal design with orthog- u and z subscript combinations are used to denote onal partition {A }. l componentsofthepartitionedmatrix.Moreover,con- ditionally upon Z, U , j =1,...,b, have indepen- Example. Consider again the RCB design of j dent normal distributions with means Section 3. Note that the conditional mean (10) can be reexpressed in the form, E(U |Z =z ) j j j E(Y |Z =z )=µc +γ J¯ z +γ (I −J¯ )z , =V V−1(z −µ ⊗1 ) j j j y be t j e t t j uz zz j z k where µc =µ −γ µ , and γ =γ +γ . Moreover, q q y y be z be e b the conditional variance (12) can be reexpressed as = g ⊗A G−1 ⊗A (z −µ ⊗1 ) l,uz l l,zz l j z k Xl=0 ! Xl=0 ! var(Yj|Zj =zj)=(σe2+tσb2)J¯t+σe2(It−J¯t). q 4.2 Examples = g G−1 ⊗A (z −µ ⊗1 ) l,uz l,zz l j z k ! 4.2.1 Split-plotdesigns. Considerastandardsplit- l=0 X plot experiment with t whole-plot treatments, each q = γT ⊗A (z −µ ⊗1 ), replicated r times, and s split-plot treatments in l l j z k ! each wholeplot. Let Yijk denote response to split- l=0 X plot treatment k, in whole-plot j assigned to whole- where γTl =gl,uzG−l,z1z is a 1×m parameter vector. plot treatment i. Similarly index the covariate val- Since(γT⊗A )(µ ⊗1 )=γTµ ⊗A 1 =0,unless uesZ .Then,themarginalmodelsfortheresponse l l z k z l k ijk l=0, in which case it equals γTµ 1 , we have and covariate are 0 z k q m Y =µ +α +W +τ +ατ +E ijk y i,y (i)j,y k,y ik,y ijk,y E(U |Z =z )=−γTµ 1 + γ A z∗ . j j j 0 z k lr l jr and l=0 r=1 XX Z =µ +W +E Finally, since T has mean zero, and is independent ijk z (i)j,z ijk,z j of Zj, respectively.Bivariatenormalityforthepairs,(W(i)j,y, q m W(i)j,z) and (Eijk,y,Eijk,z), imply that the condi- E(Y |Z =z )=µc + γ A z∗ , tional model for appropriate covariate adjustment j j j y lr l jr has the form, l=0 r=1 XX whereµcy =µy−γT0µz1k.Thus,theconditionalmean Yijk =µ+αi+W(i)j +τk+ατik (17) oftheresponseisgiven byalinearmodelwithtreat- +γ z¯ +γ z +E , ment effects incorporated into µc, and covariate re- w ij· e ijk ijk y gression effects with slopes, {γ }, l=0,...,q, asso- whereα isthefixedmaineffectoftheithwholeplot lr i ciated with each of the m covariates, r=1,...,m. treatment,τ isthefixedmaineffectofthekthsplit- k Since A 1 =0 for l>0,the expected responseif all plottreatment, and W is therandomeffect of the l k (i)j COVARIANCEIN DESIGNED EXPERIMENTS 9 jth wholeplot replicate nested within the ith whole- 4.2.3 Incomplete blockdesigns. Consideranincom- plot treatment. Milliken and Johnson [(2002), Sec- plete block design with k<t treatments appearing tion15.4]discussasplit-plotdesigninthecontextof in each block. Let Y and Z denote the response j j a cookie baking experiment in which oven tempera- and covariate vectors for block j. The arguments ture is the whole plot factor, and cookie type is the of Section 3.3 lead to the conditional model (14), split-plot factor. The covariate in their example is with the subscript i taking k values in {1,2,...,t} the thickness of the slices of cookie dough, but their depending on the value of j, and with the block re- proposed “equal slopes” model does not include the gressionparameter,γ ,havingthesameformas(11) b whole plot regression term in (17). with t replaced by k.We noteherethateven though If the experiment is arranged in b blocks, with this design is not orthogonal with respect to the re- r replicate wholeplots for each wholeplot treatment sponse, it is orthogonal from the perspective of the level in each block, then the marginal model for the covariate.Itfollowsthattheappropriateadjustment response is for covariates in an incomplete block design can be Y =µ +B +α +(Bα) +W carried out using a univariate mixed model. ijkl y i,y j ij (ij)k,y Asanexampleweconsiderdatafromastudycon- +τ +(Bτ) +(ατ) +(Bατ) +E . l il jl ijl ijkl,y ducted by the National Bureau of Standards, dis- Sincethetreatments have noeffect onthecovariate, cussed in Zelen [(1957), Section 6], to determine the marginal model for Z is the effects of four geometrical shapes on the current Z =µ +B +W +E . noise of resistors. As described by Zelen, the “ge- ijkl z i,z (ij)k,z ijkl,z ometrical shapes were rectangular parallelepipeds Notice that, in this case, there are random inter- (all having the same thickness) formed by taking all actions between the blocking factor and treatments fourcombinationsof2widths(w ,w )and2lengths that affect the response, but not the covariate. Bi- 1 2 (l ,l ).” Three resistors were mounted on each of 12 variate normality of the pairs, (B ,B ), (W , 1 2 i,y i,z (ij)k,y ceramic plates according to a BIB design. The re- W )and(E ,E ),resultsinaconditional (ij)k,z ijkl,y ijkl,z sponsewas the logarithm of the noise measurement, model for the response with a covariate adjustment and the covariate was the logarithm of the resis- at the individual response level, as well as adjust- tance of each resistor. Estimated treatment effects ments for covariate variation in wholeplot and block and their standard errors obtained using the uni- means, specifically, variate mixed model of the form (5), and using the Y =µ+B +α +(Bα) +W ijkl i j ij (ij)k conditional model (14) derived from the bivariate +τ +(Bτ) +(ατ) +(Bατ) model (8), are given in Table 2. There are substan- l il jl ijl tial differences in both the estimated effects and the +γ z¯ +γ z¯ +γ z +E . b i··· w ijk· e ijkl ijkl standard errors obtained using the two models. The 4.2.2 Latin square design. Let Yijk denote the re- estimates are also highly correlated, and these cor- sponsein cell (i,j) of alatin squaredesign involving relations mustbetaken into account in comparisons two random blocking factors and a fixed treatment among the length and width combinations. In par- factor, each with b levels. An appropriate model ig- ticular, Zelen considered the interaction contrast, noring any covariate information is π=(l w −l w )−(l w −l w ). 2 1 2 2 1 1 1 2 Y =µ +R +C +τ +E . rijk y i,y j,y k ijk,y Estimatesofπunderthetwomodels(5)and(14)are A marginal model for a random covariate is 0.022 and 0.018 respectively, with standard errors Z =µ +R +C +E . rijk z i,z j,z ijk,z 0.056and0.061.Thus,bothmodelsleadtothesame Bivariate normality of the pairs, (R ,R ), (C , conclusion that there is little statistical evidence for i,y i,z j,y C ) and (E ,E ), results in a conditional interaction. j,z ijk,y ijk,z model for the response with a covariate adjustment at the individual response level, as well as adjust- 5. NONORTHOGONAL DESIGNS ments for covariate variation in row and column 5.1 Factorization means. That is, A key feature of the multivariate mixed model Y =µ+R +C +τ +γ z¯ ijk i j k r i·· in the orthogonal design case is that the parame- +γcz¯·j··+γezijk+Eijk. ters in the conditional model for Y (µc and γ , l= y l 10 BOOTH, FEDERER, WELLS AND WOLFINGER Table 2 to µ +τ . Now suppose that only k < t covariate i Estimated treatment effects and standard errors for Zelen’s values are recorded in block j. Let z denote the BIB j,o observed vector of covariate values (of length k)and Univariate Bivariate let z¯ denote its mean value. Then, modifying the ·j,o arguments that led to equation (10) results in the Treatment Effect Std.Err. Effect Std.Err. conditional mean l w −0.519 0.112 −0.449 0.233 1 1 ll1ww2 −00..223489 00..002391 −00..222398 00..004405 E(Yj|Zj,o=zj,o)=µy+γe(zj,o−1kµz) 2 1 l2w2 0.508 0.109 0.440 0.226 +γb,o1k(z¯·j,o−µz), Estimatedtreatmenteffectsandstandarderrorsobtainedus- whereγ hasthesamefunctionalformas (11)with b,o ing themixed models (5) and (14).In each case thevariance t replaced by k. Thus, if the blocks have different components were estimated REML which explains why the numbersof covariate measurements, the parameters firstsetofestimates(labeled“univariate”)differslightlyfrom those obtained byZelen (1957). of the conditional model for Y are not separable from those of the marginal model for Z. 0,1,...,q) are variation independent of those in the 5.2 General Multivariate Mixed Model marginal model for Z (µ and G , l=0,1,...,q). z l,zz The two sets of parameters combined represent a To simplify the notation, we relabel the vector of 1–1 transformation of the bivariate model param- responsesas Z0 (i.e., Z0≡Y). ThenZ=vec[Z0,Z1, eterization (µy, µz, Gl, l=0,1,...,q). In general, ...,Zm] is a vector containing all the responses and this decomposition of the parameter space may not associated values of m covariates stacked on top of bepossible,inwhichcaseappropriateadjustmentof one another. Thus, if the number of responses is n, the treatment means cannot be accomplished using then Z has length n× (m+ 1). The multivariate a univariate mixed model. To see this, suppose that mixed model described in this paper can be written the covariate data is only partially observed, say, in the form, z = (z ,z ), where z denotes the observed part, o m o r q and z the unobserved.Then,thejoint distribution m Z=Xβ+ C T + D B , of the data is i i i i i=1 i=0 X X f(y,z ;µ ,µ ,G) o y z whereXdeterminesthemeansstructure,T ∼N(0, i σ2I ), independently for i = 1,...,r, are random = f (y|z;µc,γ)f (z;µ ,G )dz , i ci Y|Z y Z z zz m factors associated with treatments, and B ∼N(0, i Z Σ ⊗I ), independently for i=0,1,...,q, are ran- and, hence, the marginal distribution of zo is i di dom (blocking) factors associated with the design, fY|Z(y|z;µcy,γ)fZ(z;µz,Gzz)dzmdy. with the exception of B0, which is the residualerror term (so that d ≡n). It is convenient to partition Z Z 0 Thereis now no guarantee that the parameters that the matrices X, C and D into blocks consisting i i determine the marginal distribution of z will be of the n rows associated with the response, or one o separable from those that determine the conditional of the m covariates. Thus, X=[XT,XT,...,XT ]T, 0 1 m distribution of y given zo. Ci=[CTi0,...,CTim]T andBi=[BTi0,...,BTim]T.Note To further illustrate this point, consider again the that, if the covariates are unaffected by the treat- RCB design discussed in Section 3. In this case the ments, then C ≡0 for j >0. The model implies ij bivariate model has t+7 parameters which deter- that Z has variance-covariance matrix equal to mine the marginal means and block and error co- r q variance matrices, (µ ,µ ,Σ ,Σ ). The parame- y z B E V≡var(Z)= C C σ2+ D (Σ ⊗I )DT. terization in terms of the marginal model for Z, i i i i i di i and the conditional model for Y, is a union of two Xi=1 Xi=0 variation independent components of dimensions 3 We define the adjusted response mean vector as and t+4 respectively. Specifically, (µz,σb2,z,σe2,z)∪ itsconditionalexpectation giventhecovariates eval- (µ,γ ,γ ,σ2,σ2), where µ has ith component equal uatedattheirestimatedmeanvalues.Ifwepartition e b e b