DEPARTMENT OF STATISTICS
University of Wisconsin
1210 West Dayton St.
Madison, WI 53706

TECHNICAL REPORT NO. 942
January 1995

GRKPACK: Fitting Smoothing Spline ANOVA Models for Exponential Families

Yuedong Wang*
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.

January 17, 1995

Abstract

Wahba et al (1994c) introduced the Smoothing Spline ANalysis of VAriance (SS ANOVA) method for data from exponential families. Based on RKPACK, which fits SS ANOVA models to Gaussian data, we introduce GRKPACK: a collection of subroutines for binary, binomial, Poisson and Gamma data. We also show how to calculate Bayesian confidence intervals for SS ANOVA estimates.

Key Words: generalized cross validation; Newton-Raphson iteration; RKPACK; smoothing parameter; smoothing spline ANOVA; unbiased risk estimate.

1 Introduction

Generalized linear models (GLMs) for the analysis of data from exponential families have been extensively studied and widely used since the 1970s (Nelder and Wedderburn, 1972; McCullagh and Nelder, 1989). As the popularity of these methods has increased, so has the need for more sophisticated model building and diagnostic checking techniques. For nonparametric estimation of the GLM regression surface, O'Sullivan et al (1986) and Gu (1990) used the penalized likelihood method with smoothing splines and thin plate splines, and Hastie and Tibshirani (1990) used additive models. Wahba et al (1994c) introduced SS ANOVA models using penalized likelihood and Smoothing Spline ANalysis of Variance methods. See also Wahba et al (1994a, 1994b), Wang (1994) and Wang et al (1995) for details of SS ANOVA models. In this paper we describe a package for estimation of SS ANOVA models with binary, binomial, Poisson and Gamma data. We call this package GRKPACK, which stands for generalized RKPACK.
* Supported by the National Institutes of Health under Grants R01 EY09946, P60 DK20572 and P30 HD18258.

First, we describe the computational part of the SS ANOVA model. Suppose the data have the form $(y_i, t_i)$, $i = 1, 2, \cdots, n$, where the $y_i$ are independent observations and $t_i = (t_{1i}, \cdots, t_{di})$. The distribution of $y_i$ is from an exponential family with density function

$$g(y_i; f_i, \phi) = \exp\{(y_i h(f_i) - b(f_i))/a(\phi) + c(y_i, \phi)\},  (1)$$

where $f_i = f(t_i)$ is the parameter of interest, $h(f_i)$ is a monotone transformation of $f_i$ known as the canonical parameter, and $\phi$ is an unknown scale parameter. Let $t = (t_1, \ldots, t_d)$ and $t_j \in \mathcal{T}^{(j)}$, where $\mathcal{T}^{(j)}$ is a measurable space. Let $\mathcal{T} = \mathcal{T}^{(1)} \otimes \cdots \otimes \mathcal{T}^{(d)}$, so that $t \in \mathcal{T}$. Denote the log likelihood given $y_i$ and $t_i$ by

$$l_i(f_i) = \log g(y_i; f_i, \phi) = (y_i h(f_i) - b(f_i))/a(\phi) + c(y_i, \phi).  (2)$$

The purpose is to investigate the global relationship between $f$ and $t$.

Let $d\mu_j$ be a probability measure on $\mathcal{T}^{(j)}$ and let $\mathcal{H}^{(j)}$ be a reproducing kernel Hilbert space (RKHS) (Aronszajn, 1950) of functions on $\mathcal{T}^{(j)}$ with $\int_{\mathcal{T}^{(j)}} f_j(t_j)\,d\mu_j = 0$ for $f_j(t_j) \in \mathcal{H}^{(j)}$. Let $\{1^{(j)}\}$ be the one-dimensional space of constant functions on $\mathcal{T}^{(j)}$. Consider the RKHS

$$\mathcal{G} = \prod_{j=1}^{d} (\{1^{(j)}\} \oplus \mathcal{H}^{(j)}) = \{1\} \oplus \sum_j \mathcal{H}^{(j)} \oplus \sum_{j<k} (\mathcal{H}^{(j)} \otimes \mathcal{H}^{(k)}) \oplus \cdots,  (3)$$

where $\{1\}$ denotes the constant functions on $\mathcal{T}$. An element $f_j$ of $\mathcal{H}^{(j)}$ is called a main effect, an element $f_{jk}$ of $\mathcal{H}^{(j)} \otimes \mathcal{H}^{(k)}$ is called a two-factor interaction, and so on. As in the usual ANOVA, the model space is a subspace $\mathcal{M}$ of $\mathcal{G}$. By deleting some higher-order interactions, we get less flexible, but more "estimable", models.
When a model is chosen, we can regroup and write the model space as

$$\mathcal{M} = \mathcal{H}^0 \oplus \sum_{\beta=1}^q \mathcal{H}^\beta,  (4)$$

where $\mathcal{H}^0$ is a finite-dimensional space containing functions which are not going to be penalized, usually lower-order polynomials. See Wahba et al (1994c) for details. An SS ANOVA estimate is the solution to the following variational problem:

$$\min_{f\in\mathcal{M}} \left\{ -\sum_{i=1}^n l_i(f_i) + \frac{n}{2}\sum_{\beta=1}^q \lambda_\beta \|P_\beta f\|^2 \right\}.  (5)$$

The first part in (5) is the negative log likelihood; it measures the goodness of fit. In the second part, $P_\beta$ is the orthogonal projector in $\mathcal{M}$ onto $\mathcal{H}^\beta$ and $\|P_\beta f\|^2$ is a quadratic roughness penalty. The $\lambda_\beta$'s are a set of smoothing parameters; they control the trade-off between the goodness of fit and the roughness of the estimate. Writing $\lambda_\beta = \lambda/\theta_\beta$, (5) becomes

$$\min_{f\in\mathcal{M}} \left\{ -\sum_{i=1}^n l_i(f_i) + \frac{n}{2}\lambda \|P_* f\|_\Theta^2 \right\},  (6)$$

where $P_* = \sum_{\beta=1}^q P_\beta$ is the orthogonal projection in $\mathcal{M}$ onto $\mathcal{H}_* = \sum_{\beta=1}^q \mathcal{H}^\beta$ and

$$\|f\|_\Theta^2 = \|P_0 f\|^2 + \sum_{\beta=1}^q \theta_\beta^{-1} \|P_\beta f\|^2  (7)$$

is a modified norm indexed by $\Theta = (\theta_1, \cdots, \theta_q)$. Denote by $R^\beta$ the reproducing kernel (RK) (Aronszajn, 1950) for $\mathcal{H}^\beta$ under the original norm. The RK for $\sum_{\beta=1}^q \mathcal{H}^\beta$ under $\|\cdot\|_\Theta$ is

$$R_\Theta = \sum_{\beta=1}^q \theta_\beta R^\beta.  (8)$$

The solution to (6) has the form (Wahba 1990; O'Sullivan et al 1986)

$$f_{\lambda,\Theta}(t) = \sum_{\nu=1}^M d_\nu \phi_\nu(t) + \sum_{i=1}^n c_i \Big( \sum_{\beta=1}^q \theta_\beta R^\beta(t_i, t) \Big) = \phi(t)^T d + \xi(t)^T c,  (9)$$

where $\{\phi_\nu\}_{\nu=1}^M$ is a set of basis functions of $\mathcal{H}^0$, $M = \dim(\mathcal{H}^0)$, $\phi^T(t) = (\phi_1(t), \cdots, \phi_M(t))$ and $\xi^T(t) = (R_\Theta(t_1, t), \cdots, R_\Theta(t_n, t))$.
$c_{n\times 1}$ and $d_{M\times 1}$ are vectors of coefficients to be estimated. Substituting (9) into (6), we can estimate $c$ and $d$ by minimizing

$$I(c, d) = -\sum_{i=1}^n l_i(\phi^T(t_i)d + \xi^T(t_i)c) + \frac{n}{2}\lambda c^T Q_\Theta c,  (10)$$

where $Q_\Theta = \sum_{\beta=1}^q \theta_\beta Q_\beta$ and the $Q_\beta$'s are $n\times n$ matrices with $Q_\beta(i,k) = R^\beta(t_i, t_k)$. Since the $l_i$'s are not quadratic, (10) cannot be minimized directly. But if all the $l_i(f_i)$'s are strictly concave, we can use the Newton-Raphson procedure to compute $c$ and $d$ for fixed $\lambda$ and $\Theta$. Let $u_i = -dl_i/df_i$ and $w_i = -d^2 l_i/df_i^2$. Let

$$u = (u_1, \cdots, u_n)^T,  (11)$$
$$W = \mathrm{diag}(w_1, \cdots, w_n),  (12)$$
$$S = (\phi(t_1), \cdots, \phi(t_n))^T.  (13)$$

Then

$$\partial I/\partial c = Q_\Theta u + n\lambda Q_\Theta c,  (14)$$
$$\partial I/\partial d = S^T u,  (15)$$
$$\partial^2 I/\partial c\partial c^T = Q_\Theta W Q_\Theta + n\lambda Q_\Theta,  (16)$$
$$\partial^2 I/\partial c\partial d^T = Q_\Theta W S,  (17)$$
$$\partial^2 I/\partial d\partial d^T = S^T W S.  (18)$$

The Newton-Raphson iteration satisfies the linear system

$$\begin{pmatrix} Q_\Theta W_- Q_\Theta + n\lambda Q_\Theta & Q_\Theta W_- S \\ S^T W_- Q_\Theta & S^T W_- S \end{pmatrix} \begin{pmatrix} c - c_- \\ d - d_- \end{pmatrix} = \begin{pmatrix} -Q_\Theta u_- - n\lambda Q_\Theta c_- \\ -S^T u_- \end{pmatrix},  (19)$$

where the subscript minus indicates quantities evaluated at the previous Newton-Raphson iteration. Denote by

$$f = Sd + Q_\Theta c  (20)$$

the vector of estimates of $f$ at the design points. Let

$$\tilde Q_\Theta = W_-^{1/2} Q_\Theta W_-^{1/2},  (21)$$
$$\tilde c = W_-^{-1/2} c,  (22)$$
$$\tilde S = W_-^{1/2} S,  (23)$$
$$\tilde d = d,  (24)$$
$$\tilde y = W_-^{-1/2}(W_- f_- - u_-).  (25)$$

Then (19) can be simplified to

$$(\tilde Q_\Theta + n\lambda I)\tilde c + \tilde S \tilde d = \tilde y, \qquad \tilde S^T \tilde c = 0.  (26)$$

Choosing appropriate smoothing parameters is crucial for effectively estimating the true function from the data by fitting smoothing spline models.
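To make the Newton step concrete, the following sketch (illustrative Python, not GRKPACK's Fortran interface; the function name, the Bernoulli/logit specialization and the use of a single fixed smoothing parameter are all assumptions) iterates the transformations (21)-(25) and solves the working system (26) directly, where for Bernoulli data with the logit link $u_i = p_i - y_i$ and $w_i = p_i(1-p_i)$:

```python
import numpy as np

def fit_ss_bernoulli(y, S, Q, nlam, tol=1e-6, max_iter=30):
    """Iterate the Newton step (19)-(26) for Bernoulli data (logit link)
    with a single, fixed smoothing parameter n*lambda = nlam.
    Illustrative only: GRKPACK instead re-chooses lambda and Theta by
    GCV/UBR at every pass via RKPACK."""
    n, M = S.shape
    f = np.zeros(n)                                # starting vector f0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-f))               # mean under the logit link
        u = p - y                                  # u_i = -dl_i/df_i
        w = p * (1.0 - p)                          # w_i = -d^2 l_i/df_i^2
        rw = np.sqrt(w)
        S_t = rw[:, None] * S                      # S~ = W^{1/2} S        (23)
        Q_t = rw[:, None] * Q * rw[None, :]        # Q~ = W^{1/2} Q W^{1/2}(21)
        y_t = (w * f - u) / rw                     # y~ = W^{-1/2}(Wf - u) (25)
        # solve (26): (Q~ + nlam I) c~ + S~ d = y~,  S~^T c~ = 0
        K = np.block([[Q_t + nlam * np.eye(n), S_t],
                      [S_t.T, np.zeros((M, M))]])
        sol = np.linalg.solve(K, np.concatenate([y_t, np.zeros(M)]))
        c_t, d = sol[:n], sol[n:]
        c = rw * c_t                               # c = W^{1/2} c~        (22)
        f_new = S @ d + Q @ c                      # f = S d + Q c         (20)
        # simple relative-change stopping rule
        crit = np.sum(w * ((f_new - f) / (1 + np.abs(f_new))) ** 2) / np.sum(w)
        f = f_new
        if crit < tol:
            break
    return f, c, d
```

With Q built from any reproducing kernel evaluated at the design points, the returned f is the vector (20) of fitted values at the data.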
The generalized cross validation (GCV) method estimates the smoothing parameters by minimizing the GCV score

$$V(\lambda,\Theta) = \frac{(1/n)\|(I - A(\lambda,\Theta))\tilde y\|^2}{[(1/n)\,\mathrm{tr}(I - A(\lambda,\Theta))]^2},  (27)$$

where $A(\lambda,\Theta)$ satisfies

$$(w_{1-}^{1/2} f_{\lambda,\Theta}(t_1), \cdots, w_{n-}^{1/2} f_{\lambda,\Theta}(t_n))^T = A(\lambda,\Theta)\tilde y,  (28)$$

and the $f_{\lambda,\Theta}(t_i)$'s are computed from the solution of (26). The unbiased risk (UBR) method estimates the smoothing parameters by minimizing the following unbiased risk estimate:

$$\tilde U(\lambda,\Theta) = \frac{1}{n}\|(I - A(\lambda,\Theta))\tilde y\|^2 + 2\frac{\hat\sigma^2}{n}\,\mathrm{tr}\,A(\lambda,\Theta),  (29)$$

where $\hat\sigma^2 = (1/n)\sum_{i=1}^n u_{i-}^2/w_{i-}$ is an estimate of the dispersion parameter. If the dispersion parameter is known to be 1, as in the case of binary data and Poisson data, a better estimate is

$$U(\lambda,\Theta) = \frac{1}{n}\|(I - A(\lambda,\Theta))\tilde y\|^2 + \frac{2}{n}\,\mathrm{tr}\,A(\lambda,\Theta).  (30)$$

See Wang (1994) and Wang et al (1995) for detailed discussions of the GCV and UBR methods.

2 The Algorithm

A generic code, RKPACK (Gu, 1989; Gu and Wahba, 1991), is available to solve (26) and estimate $\lambda$ and $\Theta$ via GCV (option V) or the UBR method at the same time. When using the UBR method, we can either specify $\sigma^2 = 1$ (option U) or estimate $\sigma^2$ (option $\tilde U$). This suggests the following algorithm.

Algorithm. Given the matrices $S$, the $Q_\beta$'s, the response vector $y$ and the starting vector $f_0$:

1. Compute $u$ and $W$. Compute the transformations $\tilde S$, $\tilde Q_\beta = W^{1/2} Q_\beta W^{1/2}$ and $\tilde y$;
2. Call RKPACK with inputs $\tilde S$, $\{\tilde Q_\beta,\ \beta = 1,\cdots,q\}$ and $\tilde y$. That is, solve (26) and choose $\lambda$ and $\Theta$ by GCV (option V) or the UBR method (option U or option $\tilde U$);
3. Compute the new $f$.
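Once the influence matrix $A(\lambda,\Theta)$ and the pseudo-data $\tilde y$ of the working problem (26) are in hand, the scores (27), (29) and (30) are one-liners. A minimal sketch with hypothetical helper names (RKPACK evaluates these internally while searching over $\lambda$ and $\Theta$):

```python
import numpy as np

def gcv_score(A, y_t):
    """GCV score (27): (1/n)||(I - A) y~||^2 / [(1/n) tr(I - A)]^2."""
    n = len(y_t)
    r = y_t - A @ y_t
    return ((r @ r) / n) / ((np.trace(np.eye(n) - A) / n) ** 2)

def ubr_score(A, y_t, sigma2=1.0):
    """UBR score: (1/n)||(I - A) y~||^2 + 2 (sigma2/n) tr A.
    With sigma2 = 1 this is (30); plugging in the estimate
    sigma2_hat = (1/n) sum u_i^2 / w_i gives (29)."""
    n = len(y_t)
    r = y_t - A @ y_t
    return (r @ r) / n + 2.0 * sigma2 * np.trace(A) / n
```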
Stop if the algorithm converges under some criterion (for example,

$$\sum_{i=1}^n w_{i-}\big((f_i - f_{i-})/(1 + |f_i|)\big)^2 \Big/ \sum_{i=1}^n w_{i-} < p$$

for some prespecified $p > 0$ is used in our programs) or if the number of iterations exceeds some prespecified number $L$; otherwise go to step 1.

The starting value $f_0$ may be a constant function, a GLM fit or some other estimate. We usually let $p = 10^{-6}$. The algorithm usually takes 5 to 15 iterations to converge. We believe $L = 30$ is big enough for most applications. Since changing $\lambda$ and $\Theta$ at each iteration means modifying the problem successively, convergence is not guaranteed. Nevertheless, the algorithm converges most of the time.

3 Approximate Bayesian Confidence Intervals for SS ANOVA Estimates

In an observational study the design is often not balanced. It is desirable to construct confidence intervals for the SS ANOVA estimate and to determine a region in which the estimates are deemed reliable. Confidence intervals for components like main effects and interactions are also useful for model selection. Wahba et al (1994c) derived approximate Bayesian confidence intervals for SS ANOVA estimates and showed how to use them in practice. Let the prior for $f(t)$ be

$$F_\xi(t) = \sum_{\nu=1}^M \tau_\nu \phi_\nu(t) + b^{1/2}\sum_{\beta=1}^q \sqrt{\theta_\beta}\,Z_\beta(t),  (31)$$

where $\tau = (\tau_1,\cdots,\tau_M)^T \sim N(0,\xi I)$ and the $Z_\beta$ are independent, zero-mean Gaussian stochastic processes, independent of $\tau$, with $E Z_\beta(t)Z_\beta(z) = R^\beta(t,z)$. With the log likelihood in (2) and $\xi \to \infty$, Wahba et al (1994c) derived the approximate posterior mean and covariance. We list them in the following theorem.

Theorem 1. Let $c$ and $d$ be a solution to (10). Let $\lambda$, $\Theta$ be the smoothing parameters at convergence and let $W$, $Q_\Theta$ be the matrices based on the converged values.
Let $n\lambda = \hat\sigma^2/b$ and $M = Q_\Theta + n\lambda W^{-1}$. For $g_{0,\nu}(t) = \tau_\nu\phi_\nu(t)$ and $g_\beta(t) = b^{1/2}\sqrt{\theta_\beta}\,Z_\beta(t)$, $\nu = 1,\cdots,M$, $\beta = 1,\cdots,q$, we have

$$E(g_{0,\nu}(t)|y) \approx d_\nu\phi_\nu(t),$$
$$E(g_\beta(t)|y) \approx \sum_{i=1}^n c_i\theta_\beta R^\beta(t,t_i),$$
$$\frac{1}{b}\mathrm{Cov}(g_{0,\nu}(z),g_{0,\mu}(t)|y) \approx \phi_\nu(z)\phi_\mu(t)\,e_\nu^T(S^TM^{-1}S)^{-1}e_\mu,$$
$$\frac{1}{b}\mathrm{Cov}(g_\beta(z),g_{0,\nu}(t)|y) \approx -d_{\nu,\beta}(z)\phi_\nu(t),$$
$$\frac{1}{b}\mathrm{Cov}(g_\beta(z),g_\beta(t)|y) \approx \theta_\beta R^\beta(z,t) - \sum_{i=1}^n c_{i,\beta}(z)\theta_\beta R^\beta(t,t_i),$$
$$\frac{1}{b}\mathrm{Cov}(g_\gamma(z),g_\beta(t)|y) \approx -\sum_{i=1}^n c_{i,\gamma}(z)\theta_\beta R^\beta(t,t_i),$$

where $e_\nu$ is the $\nu$th unit vector, and $d_\beta(z) = (d_{1,\beta}(z),\cdots,d_{M,\beta}(z))^T$ and $c_\beta(z) = (c_{1,\beta}(z),\cdots,c_{n,\beta}(z))^T$ are given by

$$d_\beta(z) = (S^TM^{-1}S)^{-1}S^TM^{-1}\,(\theta_\beta R^\beta(z,t_1),\cdots,\theta_\beta R^\beta(z,t_n))^T,  (32)$$
$$c_\beta(z) = [M^{-1} - M^{-1}S(S^TM^{-1}S)^{-1}S^TM^{-1}]\,(\theta_\beta R^\beta(z,t_1),\cdots,\theta_\beta R^\beta(z,t_n))^T.  (33)$$

4 A Special Case: Tensor Product of $W_2$

To illustrate how to use the SS ANOVA method and how to construct Bayesian confidence intervals, consider the special case $\mathcal{T}^{(j)} = [0,1]$, $j = 1,\cdots,d$, $d \ge 2$ (we usually transform all continuous variables into $[0,1]$ for fitting and then transform them back). Take the component space on $[0,1]$ to be the RKHS

$$W_2 = \Big\{f:\ f \text{ and } f^{(1)} \text{ abs. cont.},\ \int_0^1 (f^{(2)})^2 < \infty\Big\}  (34)$$

with norm

$$\|f\|^2 = \Big(\int_0^1 f\Big)^2 + \Big(\int_0^1 f^{(1)}\Big)^2 + \int_0^1 (f^{(2)})^2.  (35)$$

We decompose $W_2 = N \oplus L \oplus S$, where $N$ is the space of constants, with square norm $(\int_0^1 f)^2$; $L$ is the space of linear functions which integrate to zero, with square norm $(\int_0^1 f^{(1)})^2$; and $S$ is the space of functions with square-integrable second derivative satisfying $\int_0^1 f^{(v)} = 0$, $v = 0, 1$, with square norm $\int_0^1 (f^{(2)})^2$. The RK for the subspace $S$ is

$$R(t,z) = k_2(t)k_2(z) - k_4(t-z),  (36)$$

where $k_\nu(\cdot) = B_\nu(\cdot)/\nu!$ and $B_\nu(\cdot)$ is the $\nu$th Bernoulli polynomial. Let $\mathcal{G}$ be the tensor product of the component spaces:

$$\mathcal{G} = \otimes_{j=1}^d (N^j \oplus L^j \oplus S^j) = \{1\} \oplus \{(\oplus_{j=1}^d L^j) \oplus (\oplus_{j=1}^d S^j)\} \oplus \{(\oplus_{j<k}(L^j\otimes L^k)) \oplus (\oplus_{j\ne k}(L^j\otimes S^k)) \oplus (\oplus_{j<k}(S^j\otimes S^k))\} \oplus \cdots,$$

where, with some abuse of notation, we omit factors of the form $\otimes\{N^j\}$; $\{1\}$ is the space of constant functions on $[0,1]^d$; and $L^j = N^1\otimes\cdots\otimes L^j\otimes\cdots\otimes N^d$ is the space of functions that are constant in $t_k$, $k \ne j$, and linear in $t_j$. The others have similar interpretations. The terms in the three brackets are the spaces of the constant, the main effects and the two-factor interactions, respectively. For simplicity of notation (the calculations extend easily to more than one interaction), suppose we choose a model space containing the constant, all main effects and the interaction between $t_1$ and $t_2$:

$$\mathcal{M} = \{1\} \oplus \{\oplus_{j=1}^d(L^j \oplus S^j)\} \oplus \{(L^1\otimes L^2)\oplus(S^1\otimes L^2)\oplus(L^1\otimes S^2)\oplus(S^1\otimes S^2)\}.  (37)$$

Therefore we have $M = d+2$, $q = d+3$.
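The kernel (36) is straightforward to evaluate from the scaled Bernoulli polynomials $k_2$ and $k_4$. A sketch (since $B_4(x) = B_4(1-x)$, applying $k_4$ to $|t-z|$ gives the value of $k_4(t-z)$ interpreted on $[0,1]$):

```python
import numpy as np

def k2(t):
    """k2 = B2(t)/2! with B2(t) = t^2 - t + 1/6."""
    return (t * t - t + 1.0 / 6.0) / 2.0

def k4(t):
    """k4 = B4(t)/4! with B4(t) = t^4 - 2 t^3 + t^2 - 1/30."""
    return (t ** 4 - 2.0 * t ** 3 + t * t - 1.0 / 30.0) / 24.0

def R(t, z):
    """RK (36) for the subspace S of W2 on [0,1]:
    R(t, z) = k2(t) k2(z) - k4(|t - z|)."""
    return k2(t) * k2(z) - k4(np.abs(t - z))
```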
We usually take

$$\phi_1(t) = 1,  (38)$$
$$\phi_\nu(t) = t_{\nu-1} - 0.5, \quad \nu = 2,\cdots,d+1,  (39)$$
$$\phi_M(t) = (t_1 - 0.5)(t_2 - 0.5)  (40)$$

as basis functions for $\mathcal{H}^0$, and

$$R^\beta(t,z) = R(t_\beta, z_\beta), \quad \beta = 1,\cdots,d,  (41)$$
$$R^{d+1}(t,z) = R(t_1,z_1)(t_2 - 0.5)(z_2 - 0.5),  (42)$$
$$R^{d+2}(t,z) = (t_1 - 0.5)(z_1 - 0.5)R(t_2,z_2),  (43)$$
$$R^{d+3}(t,z) = R(t_1,z_1)R(t_2,z_2)  (44)$$

as the RK's for $\mathcal{H}^1,\cdots,\mathcal{H}^{d+3}$ respectively. Write

$$f(t) = C + f_1(t_1) + \cdots + f_d(t_d) + f_{1,2}(t_1,t_2),  (45)$$

where the $f_j$'s are the main effects and $f_{1,2}$ is the interaction between $t_1$ and $t_2$. Comparing to Theorem 1, we have

$$C = g_{0,1},  (46)$$
$$f_j(t_j) = g_{0,j+1}(t) + g_j(t), \quad j = 1,\cdots,d,  (47)$$
$$f_{1,2}(t_1,t_2) = g_{0,M}(t) + g_{d+1}(t) + g_{d+2}(t) + g_{d+3}(t).  (48)$$

Suppose we want to calculate the posterior means and standard deviations of the main effects, the interaction and the overall function on grid points $Z_1 \times \cdots \times Z_d$, where the $Z_j$'s are sets of points in $[0,1]$. For simplicity of notation, suppose $Z_j = Z = \{z_1,\cdots,z_K\}$. Calculations for different $Z_j$'s are the same.
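The inputs $S$ and $\{Q_\beta\}$ required by the algorithm of Section 2 follow directly from (38)-(44). A sketch of their construction for this model (hypothetical helper, not GRKPACK's interface; the kernel functions restate (36) so the block is self-contained):

```python
import numpy as np

def k2(t):
    return (t * t - t + 1.0 / 6.0) / 2.0                     # B_2(t)/2!

def k4(t):
    return (t ** 4 - 2.0 * t ** 3 + t * t - 1.0 / 30.0) / 24.0  # B_4(t)/4!

def R1(t, z):
    return k2(t) * k2(z) - k4(np.abs(t - z))                 # RK (36)

def design_matrices(T):
    """Build the basis matrix S from (38)-(40) and the RK matrices
    Q_1, ..., Q_{d+3} from (41)-(44) for an n x d design T in [0,1]^d,
    d >= 2, with the interaction between t_1 and t_2 as in (37)."""
    n, d = T.shape
    S = np.column_stack([np.ones(n)]
                        + [T[:, j] - 0.5 for j in range(d)]
                        + [(T[:, 0] - 0.5) * (T[:, 1] - 0.5)])   # M = d + 2
    Rj = [R1(T[:, j][:, None], T[:, j][None, :]) for j in range(d)]
    L1 = np.outer(T[:, 0] - 0.5, T[:, 0] - 0.5)
    L2 = np.outer(T[:, 1] - 0.5, T[:, 1] - 0.5)
    # (41): d main-effect kernels; (42)-(44): the three interaction kernels
    Q = Rj + [Rj[0] * L2, L1 * Rj[1], Rj[0] * Rj[1]]             # q = d + 3
    return S, Q
```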
Let

$$f_j = (f_j(z_1),\cdots,f_j(z_K))^T, \quad j = 1,\cdots,d,  (49)$$
$$f_{1,2} = (f_{1,2}(z_1,z_1),\cdots,f_{1,2}(z_K,z_1),\cdots,f_{1,2}(z_1,z_K),\cdots,f_{1,2}(z_K,z_K))^T,  (50)$$
$$A = (S^TM^{-1}S)^{-1},  (51)$$
$$\phi_\nu = (\phi_\nu(z_1),\cdots,\phi_\nu(z_K))^T, \quad \nu = 2,\cdots,d+1,  (52)$$
$$\phi_{1,2} = (\phi_M(z_1,z_1),\cdots,\phi_M(z_K,z_1),\cdots,\phi_M(z_1,z_K),\cdots,\phi_M(z_K,z_K))^T,  (53)$$
$$d_j = (d_{j+1,j}(z_1),\cdots,d_{j+1,j}(z_K))^T, \quad j = 1,\cdots,d,  (54)$$

and, with the grid points ordered as in (50),

$$d_{1,2} = (d_{M,d+1}(z_1,z_1),\cdots,d_{M,d+1}(z_K,z_K))^T + (d_{M,d+2}(z_1,z_1),\cdots,d_{M,d+2}(z_K,z_K))^T + (d_{M,d+3}(z_1,z_1),\cdots,d_{M,d+3}(z_K,z_K))^T,  (55)$$
$$C_\beta = (c_\beta(z_1),\cdots,c_\beta(z_K))_{n\times K}, \quad \beta = 1,\cdots,d,  (56)$$
$$C_{1,2} = (c_{d+1}(z_1,z_1),\cdots,c_{d+1}(z_K,z_K))_{n\times K^2} + (c_{d+2}(z_1,z_1),\cdots,c_{d+2}(z_K,z_K))_{n\times K^2} + (c_{d+3}(z_1,z_1),\cdots,c_{d+3}(z_K,z_K))_{n\times K^2},  (57)$$
$$\Lambda_\beta = (\theta_\beta R^\beta(z_k,t_{\beta i}))_{K\times n}, \quad \beta = 1,\cdots,d,  (58)$$
$$\Lambda_{1,2} = ((\theta_{d+1}R^{d+1}+\theta_{d+2}R^{d+2}+\theta_{d+3}R^{d+3})((z_{k_1},z_{k_2}),(t_{1i},t_{2i})))_{K^2\times n},  (59)$$
$$\Sigma_\beta = (\theta_\beta R^\beta(z_i,z_k))_{K\times K}, \quad \beta = 1,\cdots,d,  (60)$$
$$\Sigma_{1,2} = ((\theta_{d+1}R^{d+1}+\theta_{d+2}R^{d+2}+\theta_{d+3}R^{d+3})((z_{i_1},z_{i_2}),(z_{k_1},z_{k_2})))_{K^2\times K^2}.  (61)$$

From Theorem 1, the posterior
means and standard deviations for the constant, the main effects at points $Z$ and the interaction at points $Z\times Z$ are

$$E(C|y) \approx d_1,  (62)$$
$$\mathrm{Cov}(C|y) \approx b\,A(1,1),  (63)$$
$$E(f_j|y) \approx d_{j+1}\phi_{j+1} + \Lambda_j c, \quad j = 1,\cdots,d,  (64)$$
$$\mathrm{Cov}(f_j|y) \approx b[A(j+1,j+1)\phi_{j+1}\phi_{j+1}^T - d_j\phi_{j+1}^T - \phi_{j+1}d_j^T + \Sigma_j - \Lambda_jC_j], \quad j = 1,\cdots,d,  (65)$$
$$E(f_{1,2}|y) \approx d_M\phi_{1,2} + \Lambda_{1,2}c,  (66)$$
$$\mathrm{Cov}(f_{1,2}|y) \approx b[A(M,M)\phi_{1,2}\phi_{1,2}^T - d_{1,2}\phi_{1,2}^T - \phi_{1,2}d_{1,2}^T + \Sigma_{1,2} - \Lambda_{1,2}C_{1,2}].  (67)$$

For any point $z \in Z^d$, the posterior mean of the overall function is

$$E(f(z)|y) = E(C|y) + \sum_{j=1}^d E(f_j(z)|y) + E(f_{1,2}(z)|y).  (68)$$

Table 1: Drivers in GRKPACK

data      single smoothing parameter drivers   multiple smoothing parameter drivers
binary    dbsdr                                dbmdr
binomial  dbisdr                               dbimdr
Poisson   dpsdr                                dpmdr
Gamma     dgsdr                                dgmdr

The posterior variance of the overall function can be calculated using the formula

$$\mathrm{Var}\Big(C + \sum_{j=1}^d f_j(z) + f_{1,2}(z)\Big|y\Big) = \mathrm{Var}(C|y) + \sum_{j=1}^d \mathrm{Var}(f_j(z)|y) + \mathrm{Var}(f_{1,2}(z)|y) + 2\sum_{j=1}^d \mathrm{Cov}(C,f_j(z)|y) + 2\,\mathrm{Cov}(C,f_{1,2}(z)|y) + 2\sum_{j=1}^d \mathrm{Cov}(f_j(z),f_{1,2}(z)|y).  (69)$$

Each term in (69) can be read off easily from the calculations for the component posterior variances.

We need to calculate $(S^TM^{-1}S)^{-1}$, $c_\beta(t)$ and $d_\beta(t)$. Gu and Wahba (1993) discussed how to calculate these quantities when $W = I$. Let $\tilde Q_\Theta = W^{1/2}Q_\Theta W^{1/2}$, $\tilde S = W^{1/2}S$, and $\tilde M = \tilde Q_\Theta + n\lambda I$. We can calculate $(\tilde S^T\tilde M^{-1}\tilde S)^{-1}$, $\tilde d_\beta(t)$ and $\tilde c_\beta(t)$ in the same way as Gu and Wahba (1993), with their $R^\beta(t,t_i)$ replaced by $\tilde R^\beta(t,t_i) = \sqrt{w_i}\,R^\beta(t,t_i)$. We then have $(S^TM^{-1}S)^{-1} = (\tilde S^T\tilde M^{-1}\tilde S)^{-1}$, $d_\beta(t) = \tilde d_\beta(t)$ and $c_\beta(t) = W^{1/2}\tilde c_\beta(t)$. Two utility routines, dcrdr and dsms in RKPACK, can be used to calculate these quantities.
The $(1-\alpha)100\%$ Bayesian confidence interval for a function $g$ ($g$ could be the constant $C$, a main effect $f_j$, the interaction $f_{1,2}$ or the overall function $f$) at a given point $z$ is

$$\Big(E(g(z)|y) - z_{\alpha/2}\sqrt{\mathrm{Var}(g(z)|y)},\ E(g(z)|y) + z_{\alpha/2}\sqrt{\mathrm{Var}(g(z)|y)}\Big),  (70)$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ percentile of the standard Normal distribution.

5 GRKPACK

GRKPACK is a collection of subroutines that use the SS ANOVA algorithm of Section 2 for binary, binomial, Poisson and Gamma data. Users can modify these routines for other distributions in exponential families. We are developing routines for ordinal data. The subroutines in GRKPACK are listed in Table 1. The correspondence of notation between this report and GRKPACK is listed in Table 2.
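Interval (70) is immediate once a posterior mean and variance have been assembled from (62)-(69). A small sketch (normal quantiles hardcoded for common $\alpha$; `scipy.stats.norm.ppf` could be used instead):

```python
import numpy as np

def bayes_ci(mean, var, alpha=0.05):
    """(1-alpha)100% Bayesian confidence interval (70):
    E(g(z)|y) +/- z_{alpha/2} * sqrt(Var(g(z)|y))."""
    z = {0.10: 1.644854, 0.05: 1.959964, 0.01: 2.575829}[alpha]
    half = z * np.sqrt(var)
    return mean - half, mean + half
```

Applied elementwise to vectors of posterior means and variances, this yields pointwise intervals for a component over a whole grid.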
