ANOVA models for Brownian motion Gordon Hazen†‡ Daniel Apley† Neehar Parikh‡ †Department of Industrial Engineering and Management Sciences ‡Northwestern University Transplant Outcomes Research Collaborative Northwestern University December 2013 Corresponding Author: Gordon Hazen IEMS Department, McCormick School of Engineering Northwestern University Evanston IL 60208-3119 Running Head: ANOVA for Brownian Motion 1 Abstract We investigate longitudinal models having Brownian-motion covariance structure. We show that any such model can be viewed as arising from a related “timeless” classical linear model where sample sizes correspond to longitudinal observation times. This relationship is of practical impact when there are closed-form ANOVA tables for the related classical model. Such tables can be directly transformed into the analogous tables for the original longitudinal model. We in particular provide complete results for one-way fixed and random effects ANOVA on the drift parameter in Brownian motion, and illustrate its use in estimating heterogeneity in tumor growth rates. KEY WORDS: LONGITUDINAL MODELS; BROWNIAN MOTION; ANALYSIS OF VARIANCE; ANOVA; RANDOM EFFECTS; STOCHASTIC HETEROGENEOUS TUMOR GROWTH. 1. Introduction The statistical analysis of longitudinal data has a long history (e.g., (Diggle, 1988; Diggle, Heagerty, Liang, & Zeger, 2002; Singer & Willett, 2003; Verbeke & Molenberghs, 2009)). Consider the situation in which repeated measurements x ,x ,,x on individual i are taken at times 0t t t . We consider here a particular i0 i1 iJi i0 i1 iJi linear model with parameters ,, ,2: 1 M x x (uu )t ~ normal(0, 2t ). (1) ij i0 i1 1 iM M ij ij ij ij where u ,,u are specified constants or observed covariates for individual i. We allow the index i to be i1 iM lexicographically nested, so that (1) includes, among many possible examples, the additively separable model with i = (g,h): x x ( )t ghj gh0 g h ghj ghj in which (,, ) = (,,, ,,, ). In (1) we restrict ourselves to the specific correlation structure 1 M 1 G 1 H Corr(,) = t t if j j (2) ij ij ij ij that arises when observations come from a Brownian motion process (e.g. (Karlin & Taylor, 1975; Oksendal, 1998)) with drift u u and volatility 2. i i1 1 iM M 2 Random effects versions of model (1) are important when measurements are available from multiple individuals (e.g., cancer growth data from multiple patients – see section 5). In this case, one might wonder whether the population of individuals is homogeneous – all sharing the same growth rate coefficients ,..., – or 1 M heterogeneous. In the latter case, one might suspect that individual growth rate coefficients are sampled from a hypothetical population (e.g., slow- versus fast-growing tumors). Such models are attractive because they allow an analyst to forecast adaptively based on an individual’s history. For instance, in a contribution not involving repeated measurements, Ayer et al (Ayer, Alagoz, & Stout, 2012) examine personalizing protocols for breast cancer screening, in which healthy screening outcomes for an individual yield an adaptive forecast of reduced cancer incidence. This allows avoidance of unnecessary screening, which would reduce costs and false-positive morbidities without impacting survival. In this paper, we show that any model of the form (1)-(2) can be viewed as arising from a related “timeless” classical linear model in which sample sizes correspond to observation time increments t = t t . Moreover, ij i,j+1 ij inferences from the related classical model apply directly to model (1)-(2), after modifying the degrees of freedom in the error sum of squares. When the classical linear model possesses a closed-form sum-of-squares table, this table converts directly into a sum-of-squares table for the desired model (1)-(2). In particular, we show that the special case of one way Brownian ANOVA, in which the parameters are the patient drifts themselves, arises from a classical one-way ANOVA that is simple enough to solve by hand or on a spreadsheet. Recasting the analysis in a familiar ANOVA format can provide greater insight than the “black box” results from statistical software. Any inference procedure for Brownian motion would also apply to processes such as geometric Brownian motion obtained by transformation (see section 5), since inference could occur on transformed data. We believe the practical import of this paper will be for analysts examining longitudinal data, but with limited access to or experience with statistical software or general linear models, who wish to quickly obtain ANOVA results in a familiar format without acquiring software or studying general linear models. We structure the paper as follows. In Section 2 we transform to a general linear model with constant error variances, and derive closed-form estimates and tests. In Section 3 we specialize these general results to one-way ANOVA on a Brownian drift parameter, and also consider estimation and testing for a random-effects versions of this model. In Section 4, we examine the connection described above between model (1)-(2) and related “timeless” classical linear models. Section 5 provides a one-way Brownian ANOVA on tumor growth data. Section 6 concludes. 3 2. ANOVA for the general longitudinal model In model (1)-(2), let x = x x t = t t j = 1,…,J. ij ij i,j1 ij ij i,j1 i Then we have x = t + ~ independent normal(0,2t ) (3) ij i ij ij ij ij where u u and there are a total of J = J observations. We rewrite the equation for the drift i i1 1 iM M i i vector = (,...,) in terms of the parameter vector = (,, ): 1 I 1 M = U (4) Notation We consider both fixed effects and random effects models. We use the notation x x x x x x i j ij iJi i1 i j ij t t t t t t i j ij iJi i1 i j ij The material below uses the following matrix-vector notation. The prime symbol denotes matrix transpose. I will J denote an identity matrix of dimension J. If A ,…,A are matrices, then diag(A) is the block diagonal matrix with 1 I i i blocks A. If the A have the same number of columns, then stack(A) is the matrix A A obtained by i i i i 1 I stacking A onto A onto … onto A. The following identity will be useful: 1 2 I diag(A)·stack(B) = stack(AB) i i i i i i i if A,B are compatible under matrix multiplication. i i The basic model We treat both fixed and random effects, but initially do not distinguish which case we are considering. With the transformation y = x /t1/2 ij ij ij the basic model (3) becomes y = t1/2+ e e ~ independent normal(0,2) ij i ij ij ij In vector notation, this is y = diag(t1/2) + e e ~ normal(0,2I ) (5) i i J where 4 ystack (y ) y stack (y ) i i i j ij t1/2 stack (t1/2) i j ij = stack(). i i The resulting model is: y = diag(t1/2)U + e. (6) i i This is a special case of the linear model y = X + e in which X = diag(t1/2)U. (7) i i In the following, we explore what happens to ANOVA for this linear model. Error sum of squares The least-squares estimates αˆ are the unrestricted values of that minimize the sum of squares S() subject to = U, where S(α) yXα 2 ydiag (t1/2)Uα 2 ydiag (t1/2)μ 2 stack y t1/2 2 i i i i i i i i 2 (8) x y t1/22 t ij i j ij ij i i j ijt i ij Let μˆ =Uαˆ The resulting error sum of squares is given by 2 x SS = yXαˆ 2 t ij ˆ (9) E i j ijt i ij SS has J r degrees of freedom, where r is the rank of X = diag(t1/2)U. The latter is equal to the rank of U E i i because if u is the ith row of U, then X = diag(t1/2)U = stack(t1/2u). Assuming each u is nonzero, then each i i i i i i i element t1/2u of this stack X is an array of rank 1 with rows proportional to u. If r is the rank of U, then there i i i are r linearly independent rows u, and X must have rank r = r. i Numerator sum of squares Suppose the null hypothesis to test is H: (against an alternative hypotheses with no restrictions on ) and consists of q independent linear restrictions on . Let αˆ be the least squares estimate under H, and let μˆ = Uαˆ . The numerator sum of squares for an F-test of H is SS XαˆXαˆ 2 diag t1/2U(αˆαˆ ) 2 diag t1/2(μˆμˆ ) 2 i i i i . (10) stack t1/2(ˆ ˆ ) 2 t ˆ ˆ 2. i i i i i i i i SS has q degrees of freedom. 5 For the fixed-effects scenario, the expectation of the mean square MS = SS /q can be obtained from Scheffe’s Rule 2 ((Scheffe, 1959), p.39), and is given by E[MS] = 2 + q1 t ( )2 (11) i i i i where is obtained from the formula for ˆ by replacing each term y by its mean t1/2. This is equivalent to i i ij ij i replacing each term x by t. ij ij i F‐Test SS and SS are independent, with SS /2 having a chi-square distribution with the appropriate degrees of freedom, E E and SS /2 being noncentral chi-square in general, and chi-square under . So the usual F-test of the hypothesis H applies. The noncentrality parameter for SS /2 may be obtained from Scheffe’s Rule 1 ((Scheffe, 1959), p.39): If in SS , each observation y is replaced by its expected value t1/2 (or x by t), the result is 22. ij ij i ij ij i Random effects In a random effects model, we would have ~ normal(0, ). (If only some terms of are random, this would be reflected by zeroes in .) The distributional and mean claims above remain conditionally true given . In particular, given , the quantity SS /2 is conditionally 2(Jr) a distribution that does not depend on . E Therefore, SS /2 is unconditionally 2(Jr). E The random-effects counterpart to the fixed-effects null hypothesis H: is a null hypotheses H that places certain restrictions on . Namely, H restricts the variances of the q independent linear contrasts of corresponding to to be zero. Then under H , SS /2 and SS /2 are conditionally independent given with E 2(Jr) and 2(q) distributions not depending on , so are therefore independent 2(Jr) and 2(q) variables under H . The F statistic therefore still has an F-distribution under H , and the same F-test applies. 3. Special Case of One‐way Brownian ANOVA As we have noted, our framework encompasses one-way or multi-way ANOVA models, nested or crossed, with or without covariates, etc. In this section, we specialize to the simplest version of Brownian ANOVA, a one-way model, which we define as the general model (1)-(2) with the I drifts ,, the parameters of interest. In the 1 I framework of the general model, for the one-way ANOVA special case we have = and U = I in (4), yielding X I 6 = diag(t1/2). The general results from the prior section lead directly to many of the assertions summarized in i i Table 1 below. The fixed-effects assertions in particular fall out in a straightforward manner when the least-squares estimates x x ˆ i ˆ i t i t i are substituted. Hence, in the next subsection, we focus attention on deriving the random-effects assertions in Table 1. Table 1. One-way fixed and random effects Brownian ANOVA Fixed effects Random effects Model ,, ,, independent normal(,2) 1 I 1 I parameters Hypothesis H: H: 2 = 0 1 I Error sum of 2 x x squares SS = t ij i with J I degrees of freedom E i j ijt t ij i SS MS = E E J I Hypothesis 2 x x ssquuma roefs SS = ititii t with I1 degrees of freedom SS MS = I1 Expected E[MS ] = 2 E[MS ] = 2 E E mean squares E[MS] = 2 + (I1)1iti(i )2 E[MS]= 2 + t2. t1 t t (I1)1t t1 t2. i i i i i F-test MS Reject H at significance level if > F MS ,I1,JI E Parameter ˆ2 MS ˆ2 MS E E estimates ˆ = xi ˆ2 t1(MS MS ) † i t E i x i seˆˆi ˆt12/2 ˆ iˆ2ˆ2ti i ti iˆ2ˆ2t i 1/2 seˆ ti iˆ2ˆ2t i †In case this quantity is negative, it is common practice to replace the estimate with zero. 7 A key take-away is that Table 1 is nearly identical to the analogous table for a “timeless” classical one-way ANOVA if one substitutes observation time increments t in the Brownian model for within-cell sample sizes in the classical ij model. This interesting connection continues to hold in the general model (1)-(2). We elaborate in section 4. Derivation of Random effects Results in Table 1 In the random effects version, we assume the effects are distributed as {: i = 1, 2, . . ., I} ~ independent normal(, i 2). Regarding the fixed-effects expectation of the mean square MS , we have in equation (11) from the prior section t1Ex t1 t t1 t , i i j i ij i i i which we denote by the quantity in Table 1, as it does not depend on i. In the random effects model, we have from (11) that E[MS] = 2 +.(I1)1Eiti(i i)2 Using Lemma A2, this becomes E[MS ]= 2 + t2, where t (I1)1t t1 t2, i i as shown in the table. Here, in parallel to classical random effects ANOVA, when the total observation times t i are identical across patients i, the quantity t will be their common value. Combining the last equation with the fact that MS is an unbiased estimate of 2, we obtain in the conventional E ANOVA moment estimator for 2 : ˆ2 t1(MS MS ). E As in classical random effects ANOVA, it is possible that this estimate can be negative, in which case it is common practice to report a zero estimate. A least-squares estimate of the random effects population mean can be obtained as follows when we treat 2 and 2 as known. Substituting ~ normal(1, 2I) into the random effects model y = diag(t1/2)+ e gives I I i i y = X1 + g g ~ normal(0, 2I + 2 XX) I J 8 Defining W = (2I+ 2 XX)1/2X1, J I the weighted least-squares estimate of , treating 2 and 2 as known, is Wz 1X(2I 2XX)1/2(2I 2XX)1/2y 1X(2I 2XX)1y ˆ I J J I J WW 1X(2I 2XX)1/2(2I 2XX)1/2X1 1X(2I 2XX)1X1 I J J I I J I The matrix to be inverted here simplifies to (2I 2XX)1 diag (2I 2t1/2t1/2)1 J i Ji i i Use Lemma A3 to write 2 (2I 2t1/2t1/2)1 2I (2I )1t1/2t1/2(2I )1 Ji i i Ji 12t1/2(2I )1t1/2 Ji i i Ji i Ji i 2 2I 4t1/2t1/2 Ji 122t i i i 2 2I t1/2t1/2. Ji 22t i i i Substitute this into the denominator of the last equation for ˆ to get 2 t1/2(2I 2t1/2t1/2)1t1/2 t1/22I t1/2t1/2t1/2 i i Ji i i i i i Ji 22t i i i i 2 2 t t 2 i i i22t i i 2t 2 t 1 i i i 22t i 2 2 t i i22t i t i i22t i and into the numerator to get 9 2 t1/2(2I 2t1/2t1/2)1y t1/22I t1/2t1/2y i i Ji i i i i i Ji 22t i i i i 2 2 x t x i i i22t i i i 2t 2 x 1 i i i 22t i 2 2 x i i22t i x i i22t i The resulting quotient is the estimate ˆ listed in Table 1. To obtain an estimated standard error for ˆ, calculate the numerator variance using independence of the x : i x 1 Var i Varx i22ti i22t 2 i i 1 VarE[x |]EVar[x |] i22t 2 i i i i i 1 Vart E2t i22t 2 i i i i 1 2t2 2t i22t 2 i i i t i . i22t i Then substitute to obtain seˆ listed in the table. 4. Equivalence to classical ANOVA models We noted in the prior section the near equivalence between one-way classical and Brownian ANOVAs when observation time is treated as sample size. This equivalence in fact extends to any longitudinal model (1)-(2), that is, to any parameterization = U. Such parameterizations include regression and ANCOVA models. In this section, we explain how this connection arises. We restrict ourselves here to fixed-effects models. For comparison, we rewrite our longitudinal model (3) here: x = t + ~ NID(0,2t ) i = 1,…,I; j = 1,…,J, (#) ij i ij ij ij ij i 10
Description: