ebook img

using smoothing spline anova to examine the relation of - Deep Blue PDF

20 Pages·1997·1.15 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview using smoothing spline anova to examine the relation of - Deep Blue

STATISTICSINMEDICINE,VOL.16,1357—1376(1997) USING SMOOTHING SPLINE ANOVA TO EXAMINE THE RELATION OF RISK FACTORS TO THE INCIDENCE AND PROGRESSION OF DIABETIC RETINOPATHY YUEDONGWANG(cid:49)*,GRACEWAHBA(cid:50),CHONGGU(cid:51),RONALDKLEIN(cid:52) ANDBARBARAKLEIN(cid:52) (cid:49)(cid:68)(cid:101)(cid:112)(cid:97)(cid:114)(cid:116)(cid:109)(cid:101)(cid:110)(cid:116)(cid:111)(cid:102)(cid:66)(cid:105)(cid:111)(cid:115)(cid:116)(cid:97)(cid:116)(cid:105)(cid:115)(cid:116)(cid:105)(cid:99)(cid:115)(cid:44)(cid:85)(cid:110)(cid:105)v(cid:101)(cid:114)(cid:115)(cid:105)(cid:116)(cid:121)(cid:111)(cid:102)(cid:77)(cid:105)(cid:99)(cid:104)(cid:105)(cid:103)(cid:97)(cid:110)(cid:44)(cid:49)(cid:52)(cid:50)(cid:48)(cid:87)(cid:97)(cid:115)(cid:104)(cid:105)(cid:110)(cid:103)(cid:116)(cid:111)(cid:110)(cid:72)(cid:101)(cid:105)(cid:103)(cid:104)(cid:116)(cid:115)(cid:44)(cid:65)(cid:110)(cid:110)(cid:65)(cid:114)(cid:98)(cid:111)(cid:114)(cid:44)(cid:77)(cid:73)(cid:52)(cid:56)(cid:49)(cid:48)(cid:57)(cid:44)(cid:85)(cid:46)(cid:83)(cid:46)(cid:65)(cid:46) (cid:50)(cid:68)(cid:101)(cid:112)(cid:97)(cid:114)(cid:116)(cid:109)(cid:101)(cid:110)(cid:116)(cid:111)(cid:102)(cid:83)(cid:116)(cid:97)(cid:116)(cid:105)(cid:115)(cid:116)(cid:105)(cid:99)(cid:115)(cid:44)(cid:85)(cid:110)(cid:105)v(cid:101)(cid:114)(cid:115)(cid:105)(cid:116)(cid:121)(cid:111)(cid:102)(cid:87)(cid:105)(cid:115)(cid:99)(cid:111)(cid:110)(cid:115)(cid:105)(cid:110)(cid:44)(cid:49)(cid:50)(cid:49)(cid:48)(cid:87)(cid:46)(cid:68)(cid:97)(cid:121)(cid:116)(cid:111)(cid:110)(cid:83)(cid:116)(cid:46)(cid:77)(cid:97)(cid:100)(cid:105)(cid:115)(cid:111)(cid:110)(cid:44)(cid:87)(cid:73)(cid:53)(cid:51)(cid:55)(cid:48)(cid:54)(cid:44)(cid:85)(cid:46)(cid:83)(cid:46)(cid:65)(cid:46) (cid:51)(cid:68)(cid:101)(cid:112)(cid:97)(cid:114)(cid:116)(cid:109)(cid:101)(cid:110)(cid:116)(cid:111)(cid:102)(cid:83)(cid:116)(cid:97)(cid:116)(cid:105)(cid:115)(cid:116)(cid:105)(cid:99)(cid:115)(cid:44)(cid:85)(cid:110)(cid:105)v(cid:101)(cid:114)(cid:115)(cid:105)(cid:116)(cid:121)(cid:111)(cid:102)(cid:77)(cid:105)(cid:99)(cid:104)(cid:105)(cid:103)(cid:97)(cid:110)(cid:44)(cid:77)(cid:97)(cid:115)(cid:111)(cid:110)(cid:72)(cid:97)(cid:108)(cid:108)(cid:44)(cid:65)(cid:110)(cid:110)(cid:65)(cid:114)(cid:98)(cid:111)(cid:114)(cid:44)(cid:77)(cid:73)(cid:53)(cid:56)(cid:49)(cid:48)(cid:57)(cid:44)(cid:85)(cid:46)(cid:83)(cid:46)(cid:65)(cid:46) (cid:52)(cid:68)(cid:101)(cid:112)(cid:97)(cid:114)(cid:116)(cid:109)(cid:101)(cid:110)(cid:116)(cid:111)(cid:102)(cid:79)(cid:112)(cid:104)(cid:116)(cid:104)(cid:97)(cid:108)(cid:109)(cid:97)(cid:108)(cid:111)(cid:103)(cid:121)(cid:44)(cid:85)(cid:110)(cid:105)v(cid:101)(cid:114)(cid:115)(cid:105)(cid:116)(cid:121)(cid:111)(cid:102)(cid:87)(cid:105)(cid:115)(cid:99)(cid:111)(cid:110)(cid:115)(cid:105)(cid:110)(cid:44)(cid:54)(cid:49)(cid:48)(cid:87)(cid:97)(cid:108)(cid:110)(cid:117)(cid:116)(cid:83)(cid:116)(cid:46)(cid:77)(cid:97)(cid:100)(cid:105)(cid:115)(cid:111)(cid:110)(cid:44)(cid:87)(cid:73)(cid:53)(cid:51)(cid:55)(cid:48)(cid:54)(cid:44)(cid:85)(cid:46)(cid:83)(cid:46)(cid:65)(cid:46) SUMMARY SmoothingsplineANOVA(ANalysisOfVAriance)methodsprovideaflexiblealternativetothestandard parametricGLIM(generalizedlinearmodels)methodsforanalysingtherelationshipofpredictorvariables tooutcomeswithdatafromlargeepidemiologicstudies.Thesemethodsallowthevisualizationofrelation- shipsnotreadilyfitbysimpleGLIMmodels,andprovidefortheabilitytovisualizeinteractionsbetweenthe variables.Atthe sametime, theyreduceto GLIMmodelsif the data suggestthat the addedflexibility is unwarranted. Using this method, we investigate risk factors for incidence and progression of diabetic retinopathyinagroupofpatientswitholderonsetdiabetesfromtheWisconsinEpidemiologicalStudyof DiabeticRetinopathy.Wecarryoutfouranalysestoillustratevariouspropertiesofthisclassofmethods. Some of the results confirm previous findings with use of standard methods, while others allow the visualizationofmorecomplexrelationshipsnotevidentfromtheapplicationofparametricmethods.(cid:40)1997 byJohnWiley&Sons,Ltd. Statist.Med.,16,1357—1376(1997) No.ofFigures:10 No.ofTables:1 No.ofReferences:38 1. INTRODUCTION Manydemographicmedicalstudiescollectdataoftheform(cid:77)y,t(i),i"1,2n(cid:78),whereiindexes (cid:105) theithstudyparticipant,y is1or0,indicatingwhetheramedicalconditionofinterestispresent (cid:105) orabsentatfollow-up,andt (i)"(t (i),2,t (i))isavectorof dpredictorvariablesatbaseline, (cid:49) (cid:100) thatmayormaynotrelatetothelikelihoodthaty isa1.Fromthesedatawewishtoestimate (cid:105) p(t)(cid:34) Prob(cid:77)y"1(cid:68)t(cid:78), the probability that a person with predictor vector t presents with the condition of interest at follow-up. We use such estimates for the prevalence of the condition of interestin generalpopulations and to study the sensitivity of p to the predictor variables,or, if * Correspondenceto:Y.Wang,DepartmentofBiostatistics,UniversityofMichigan,1420WashingtonHeights,Ann Arbor,MI48109,U.S.A. Contractgrantsponsor:NIH Contractgrantnumber:EY09946,P60DK20572,P30HD18258,U54HD29184,EY03083 Contractgrantsponsor:NSF Contractgrantnumber:DMS9121003,DMS9301511 CCC 0277—6715/97/121357—20$17.50 (cid:82)(cid:101)(cid:99)(cid:101)(cid:105)v(cid:101)(cid:100) (cid:68)(cid:101)(cid:99)(cid:101)(cid:109)(cid:98)(cid:101)(cid:114) (cid:49)(cid:57)(cid:57)(cid:53) (cid:40) 1997 by John Wiley & Sons, Ltd. (cid:82)(cid:101)v(cid:105)(cid:115)(cid:101)(cid:100) (cid:65)(cid:117)(cid:103)(cid:117)(cid:115)(cid:116) (cid:49)(cid:57)(cid:57)(cid:54) 1358 Y.WANGE„A‚. possible, to combinations of them. The traditional GLIM models(cid:49) do this by defining f(t), the logit, as f(t)"log[p(t)/(1!p(t))] (1) and assuming that f is a simple parametric function of the components of t. When the t are (cid:97) continuous variables, the most commonly used model is linear in the components of t (cid:100) f(t)"f(t ,2,t )"(cid:107)#(cid:43) (cid:98) t (2) (cid:49) (cid:100) (cid:97) (cid:97) (cid:97)(cid:47)(cid:49) butsometimesoneusessecondoreventhirddegreepolynomialsifitappearsthatalinearmodel isinadequate.Ifsomeofthet arecategoricalvariables,thenwecanuseindicatorfunctions.More (cid:97) generally, given M parametric functions (cid:47) ,(cid:108)"1,2,M subject to some identifiability condi- (cid:108) tions, f is modelled parametrically as (cid:77) f(t)"(cid:43) (cid:98) (cid:47) (t). (3) (cid:108) (cid:108) (cid:108)(cid:47)(cid:49) Then ((cid:107) and) the (cid:98)’s are obtained by minimizing the negative log-likelihood(cid:76)(y,f) given by (cid:110) (cid:76)(y,f)"(cid:43) [y(cid:105)f(t(i))!log(1#e(cid:102)(cid:40)(cid:116)(cid:40)(cid:105)(cid:41)(cid:41))] (4) (cid:105)(cid:47)(cid:49) with(2)or,moregenerally,(3),substitutedforf.Ifthe‘truebutunknown’fisactuallysomelinear combinationofthespecified(cid:47) thenthisis,ofcoursethecorrectprocedure.Unfortunately,aswe (cid:108) examine large data sets more closely, it becomes clear that linear models, or even quadratic or cubic models, are often inadequate. What happens if, for example, the dependence on t is ‘J’ (cid:97) shaped,or,evenhastwopeaks,possiblyrepresentingtwodistinctsubpopulations?Ifthe‘true’log oddsratiofis notwellapproximatedbysomefunctionofthespecifiedform,thenwe mayhave largebiasesintheestimatesoff.Furthermore,thecommonstatisticalhypothesistests,confidence intervals,P-values,etc.arenotnecessarilyvalidiffisnotofthespecifiedform.Inthispaperwe describe and demonstrate the use of smoothing spline ANOVA (SS-ANOVA) methods to estimate f. These methods avoid many of the above difficulties, yet they provide the ability to visualize some of the relationships between the variables not easily observed with use of more traditional methods. We demonstrate their use to examine the relation of risk factors to the incidenceandprogressionofdiabeticretinopathy,usingdatafromtheWisconsinEpidemiologi- cal Study of Diabetic Retinopathy (WESDR). The analysis here is based on (a special case of) the smoothing spline ANOVA method for modelling and estimating f that appears in Wahba, Wang, Gu, Klein and Klein(cid:50) (WWGKK), andwe haveimplementedthe analysisviathe publiclyavailablecodeGRKPACKdescribedin Wang.(cid:51)Thecodeitselfisavailablefromnetlib(cid:64)research.att.cominthegcvdirectorythere, and through statlib(cid:64)lib.stat.cmu.edu. An SS-ANOVA estimate is a form of penalized likelihood estimate. We first note two importantpenalizedlikelihoodestimatesthatappearintheliterature.Then,intheremainderof thisintroduction,wedescribethemainfeaturesofSS-ANOVAingeneraland thedetailsofthe SS-ANOVAmodels in particular that we apply to the WESDR data. MajorprecursorsoftheworkunderdiscussionincludeO’Sullivan(cid:52)andO’Sullivanetal.,(cid:53)who proposedapenalizedlog-likelihoodestimateforf based onthin platesplines.Thesesplinesare useful in many contexts,and one can incorporatethem in an SS-ANOVAmodel;(cid:54)(cid:44)(cid:55) we do not employ them in the present work. Statist.Med.,Vol.16,1357—1376(1997) (cid:40)1997byJohnWiley&Sons,Ltd. SMOOTHINGSPLINEANOVA 1359 Hastie and Tibshirani(cid:56) (see also other references there) discussed estimates of f of the form (cid:100) f(t)"f(t ,2,t )"(cid:107)#(cid:43) f (t ) (5) (cid:49) (cid:100) (cid:97) (cid:97) (cid:97)(cid:47)(cid:49) where the f are ‘smooth’ functions obtained by some smoothing process, including the use of (cid:97) cubic smoothing splines. The S code (Chambers and Hastie(cid:57)) provides the facility for fitting models of the form (5). They also note that some of their work extends to the SS-ANOVA methods that we consider here. The varying-coefficient models of Hastie and Tibshirani(cid:49)(cid:48) are averyinterestingsub-familyoftheSS-ANOVAmodels.Cubicsmoothingsplinesforthef in(5) (cid:97) are the solution to the minimization problem: find f (in an appropriate function space), and (cid:97) subject to some identifiability criteria such as (cid:58)f (t )dt "0, to minimize the penalized log- (cid:97) (cid:97) (cid:97) likelihood functional (cid:100) (cid:73)(cid:106)(y,f)"(cid:76)(y,f)#(cid:43) (cid:106)(cid:97)J(cid:97)(f(cid:97)) (6) (cid:97)(cid:47)(cid:49) where we define the penalty functionals J by (cid:97) (cid:80)(cid:49) J (f )" (f(cid:64)(cid:64)(t ))(cid:50)dt . (7) (cid:97) (cid:97) (cid:97) (cid:97) (cid:97) (cid:48) As the smoothing parameter (cid:106) tends to infinity, f tends to a linear function in t , so that the (cid:97) (cid:97) (cid:97) minimizer of (6) tends to the linear function as in (2) if all the (cid:106) ’s become large. Penalized (cid:97) likelihoodmethodswithmoregeneralpenaltyfunctionalsandunpenalizedtermsarediscussedin reference 11. Recentresearchhasfocusedonmoregeneralmodelsforfthatallowtheexplicitmodellingand visualization of possible interactions between variables, via functional analysis of variance decompositions(to be described)and SS-ANOVAestimationmethods.Given a fairlyarbitrary functionf(t ,2,t )ofseveralvariables,wecandefinea(functional)ANOVAdecompositionof (cid:49) (cid:100) f (generalizing ideas from parametric ANOVA) as (cid:100) f(t ,2,t )"(cid:107)#(cid:43) f (t )#(cid:43) f (t ,t )#2#f (t ,2,t ) (8) (cid:49) (cid:100) (cid:97) (cid:97) (cid:97)(cid:98) (cid:97) (cid:98) (cid:49)(cid:44)2(cid:44)(cid:100) (cid:49) (cid:100) (cid:97)(cid:47)(cid:49) (cid:97)(cid:58)(cid:98) where the f are the main effects, f are the two factor interactions, etc. The components are (cid:97) (cid:97)(cid:98) uniquelydeterminedgivenasetofaveragingoperators(cid:69)(cid:97),whichaveragefunctionsoverthet(cid:97)in some specified way. For example, if t is a continuous variable in the interval [0,1], then (cid:97) a possible choice for (cid:69)(cid:97) is (cid:80)(cid:49) ((cid:69)(cid:97)f) (t(cid:49),2,t(cid:97)(cid:126)(cid:49),t(cid:97)(cid:96)(cid:49),2,t(cid:100))" f(t(cid:49),2,t(cid:100))dt(cid:97). (9) (cid:48) Then the mean is (cid:107)"(cid:60)(cid:100)(cid:97)(cid:47)(cid:49)(cid:69)(cid:97)f, the main effects are f(cid:97)"(I!(cid:69)(cid:97))(cid:60)(cid:98)O(cid:97)(cid:69)(cid:98)f, the two factor interactionsaref(cid:97)(cid:98)"(I!(cid:69)(cid:97))(I!(cid:69)(cid:98))(cid:60)(cid:99)(cid:47)(cid:97)(cid:44)(cid:98)(cid:69)(cid:99)f,etc.Since(cid:69)(cid:97)(cid:69)(cid:97)"(cid:69)(cid:97),wecanseethattheterms inthefunctionalANOVAdecompositionsatisfysideconditionsanalogoustothoseinordinary parametricANOVA,forexample, (cid:69)(cid:97)f(cid:97)"0. In general,for model fittingpurposes,we eliminate higherordertermsinthefunctionalANOVAdecompositionandweestimatesomeorallofthe lower order terms by finding (cid:107), and (some of) the f ,f , etc. to minimize a penalized log- (cid:97) (cid:97)(cid:98) likelihoodfunctional(cid:73)(cid:106)(y,f)givenbyanexpressionthatgeneralizes(6),withaseparatepenalty functionalfor eachindependentlysmoothedterm in the model. We can also incorporatein the (cid:40) 1997byJohnWiley&Sons,Ltd. Statist.Med.,Vol.16,1357—1376(1997) 1360 Y.WANGE„A‚. modelindicatorfunctionsforcategoricalvariables.Otheraveragingoperatorsincludeweighted sums over possible or observed values of t . (cid:97) It is well known(cid:49)(cid:49) that solutions to variational problems like the minimizer of (cid:73)(cid:106)(y,f) and generalizationsto (cid:73)(cid:106)(y,f) with f containing interactionterms as in (8) havea representation as alinearcombinationoftheunpenalizedtermsinthemodelplusn(basis)functions,thatwecan constructfromthet(i),thereproducingkernelsassociatedwiththepenaltyfunctionals,andthe smoothingparameters,seereferences3,4,11—14.Giventhesmoothingparameters,thenumerical problemofcomputingtheminimizerf reducestofindingthecoefficientsofarepresentationforit (cid:106) asa linearcombinationofthe unpenalizedtermsand the abovementionedbasis functions.We can solve the numerical problem for the coefficients with a Newton—Raphson iteration in conjunction with matrix decompositions. The history of these numerical methods includes references4,5,11,15—18.Therearevariousapproximatenumericalmethodsavailableandunder developmentfordatasetsthataretoolargeformatrixdecompositionmethods;these,however, are beyond the scope of this article. Objective methods for choosing the smoothing parameters are desirable. This paper and the code GRKPACK employ the iterative unbiased risk method given in Gu(cid:49)(cid:53)(cid:44)(cid:49)(cid:54) for the non- Gaussiancaseandextendedtothepsmoothingparametercase(cid:106)"((cid:106) ,2,(cid:106) )inWang(cid:49)(cid:57)and (cid:49) (cid:112) WWGKK. This method is a computable proxy for the Kullback—Liebler information distance from the estimate to the ‘truth’, see references 2, 16, 20 and 21. Thus, the method attempts to choosethe smoothingparametersto minimizethe Kullback—Lieblerinformationdistancefrom the estimate f and the unknown ‘true’ f. Other related references are 22—24. (cid:106) As with any estimation method, it is important to have some indication of its accuracy. It is particularly important in the case here of non-parametric regression with Bernoulli data derivedfrommedicalrecordsbecausethesedatatendtobeirregular,andmaycontaininfluential outliers.Therigidityofparametricmodelsmaymasktheeffectofoutliers,aswellasobscurethe fact that we sometimes make inferences in fairly data-sparse regions. The non-parametric estimatesmaybemoresensitivetooutliers,anditisimportanttobeabletodelineatearegionin thepredictorvariablespacewherewecanrelyupontheestimate,aswellaswherewecanprovide some sort of confidence statement. In this work we use the Bayesian ‘confidence intervals’ proposedinreference25,adapatedtothecomponent-wisemultiplesmoothingparametercasein reference 7, to non-Gaussian data in reference 26, and to the multiple smoothing parameter non-Gaussian case in reference 19 and WWGKK. We use the Bayesian ‘confidence interval’ forf to delineatethe regioninpredictorvariablespace forwhichwedeemtheoverallestimate (cid:106) as reliable, by computing an appropriate level curve in a contour plot of the width of the confidence interval, and using this to enclose a ‘reliable’ region. We also use it in conjunction with cross-sectional plots so that we can visualize the accuracy estimates. We have used the component-wise confidence intervals to eliminate some terms that we cannot distinguish from noise, by deleting terms whose confidence intervals contain 0 over most of their domain. We remarkthatwe base theseconfidenceintervals onan across-the-functionproperty,that is, a 95 per cent confidence interval has the property that (approximately) we expect the confidence intervals to cover the true curve at about 95 per cent of the n data points, see references 3, 21 and 26. We now proceed to the details of the particular models employed in the analysis of the WESDRdata.Wecarryoutfouranalyses,eachofwhichillustratessomeparticularfeatureofthis classof models.In eachcasewehave rescaledthe(continuous)predictorvariablest ,2, t to (cid:49) (cid:100) [0,1], andweusetheaveragingoperator(cid:69)"(cid:69)(cid:97)definedin(9).Weconsideronlymaineffectsand twofactorinteractions.WeemploythepenaltyfunctionalJ of(7)andatwo-dimensionalrelative (cid:97) J defined below.If we eliminatedall of the twofactorinteractions, thenthe modelsdescribed (cid:97)(cid:98) Statist.Med.,Vol.16,1357—1376(1997) (cid:40)1997byJohnWiley&Sons,Ltd. SMOOTHINGSPLINEANOVA 1361 immediately below reduce to the form (5) as studied by Hastie and Tibshirani,(cid:56) although our estimation method is different. We require somedefinitions to define the terms in the particular ANOVAdecomposition(8) thatweuse.Definethe(linear)‘trend’function(cid:47)(u)"u!1/2foru3[0,1],and,forcontinuous functions g defined on [0,1] let (cid:66)g stand for (cid:66)g"g(1)!g(0) . For future reference note that (cid:69)(cid:47),(cid:58)(cid:49)(cid:47)(u)du"0 and (cid:66)(cid:47)"1. We use (cid:69) and (cid:66) to define ANOVA and identifiability side conditio(cid:48)ns.Asubscript(cid:97)on(cid:69)or(cid:66)meansthatitappliestowhatfollowsconsideredasafunction of t . The typical main effect term f(t ) in this set up has a decomposition of the form (cid:97) (cid:97) (cid:97) f (t )"d (cid:47)(t )#f (t ) (10) (cid:97) (cid:97) (cid:97) (cid:97) (cid:52)(cid:97) (cid:97) whered (cid:47)(t )isthelinearandunpenalizedpartoff andf isthe(detrended)‘smooth’partoff . (cid:97) (cid:97) (cid:97) (cid:52)(cid:97) (cid:97) f(cid:52)(cid:97)satisfiesthesideconditions(cid:69)(cid:97)f(cid:52)(cid:97)"(cid:66)(cid:97)f(cid:52)(cid:97)"0andappearsinsideapenaltyfunctionalasJ(cid:97)(f(cid:52)(cid:97)). Due to the side conditions on f ,J (f )"0 implies that f "0. (cid:97) (cid:97) (cid:52)(cid:97) (cid:52)(cid:97) The two factor interaction f (t ,t ) has an analogous decomposition into four interaction (cid:97)(cid:98) (cid:97) (cid:98) terms, namely f (t ,t )"d (cid:47)(t )(cid:47)(t )#(cid:47)(t ) f(cid:40)(cid:97)(cid:41)(t )#(cid:47)(t ) f(cid:40)(cid:98)(cid:41)(t )#f (t ,t ) (11) (cid:97)(cid:98) (cid:97) (cid:98) (cid:97)(cid:98) (cid:97) (cid:98) (cid:97) (cid:52)(cid:98) (cid:98) (cid:98) (cid:52)(cid:97) (cid:97) (cid:52)(cid:97)(cid:98) (cid:97) (cid:98) where the ‘smooth’ factors of the trend(cid:93)smooth interactions satisfy the side conditions (cid:69)(cid:97)f(cid:52)(cid:40)(cid:97)(cid:98)(cid:41)"(cid:66)(cid:97)f(cid:52)(cid:40)(cid:97)(cid:98)(cid:41)"(cid:69)(cid:98)f(cid:52)(cid:40)(cid:98)(cid:97)(cid:41) "(cid:66)(cid:98)f(cid:52)(cid:40)(cid:98)(cid:97)(cid:41)"0 and the ‘smooth—smooth’ interaction term satisfies ((cid:69)(cid:97)f(cid:52)(cid:97)(cid:98))(t(cid:98))"((cid:66)(cid:97)f(cid:52)(cid:97)(cid:98))(t(cid:98))"((cid:69)(cid:98)f(cid:52)(cid:97)(cid:98))(t(cid:97))"((cid:66)(cid:98)f(cid:52)(cid:97)(cid:98))(t(cid:97))"0, for all t(cid:97),t(cid:98). Letting J(cid:97)(cid:98)(f(cid:52)(cid:97)(cid:98))" (cid:58)(cid:49)(cid:58)(cid:49)( (cid:76)(cid:52) f (t ,t ))(cid:50)dt dt ,thenonecanshowthatthesideconditionsinsurethatJ (f )"0 im(cid:48)p(cid:48)lies(cid:76)t(cid:50)(cid:97)th(cid:76)ta(cid:98)(cid:50)t(cid:52)(cid:97)f(cid:98) "(cid:97) 0(cid:98). Lett(cid:97)ing(cid:98)(cid:106)"((cid:106) ,(cid:106) ,(cid:106)(cid:40)(cid:98)(cid:41),(cid:106)(cid:40)(cid:97)(cid:41),(cid:106) ) and (cid:97)(cid:98) (cid:52)(cid:97)(cid:98) (cid:52)(cid:97)(cid:98) (cid:97) (cid:98) (cid:97) (cid:98) (cid:97)(cid:98) (cid:74)(cid:106)(f)"(cid:106)(cid:97)J(cid:97)(f(cid:52)(cid:97))#(cid:106)(cid:98)J(cid:98)(f(cid:52)(cid:98))#(cid:106)(cid:40)(cid:97)(cid:98)(cid:41)J(cid:97)(f(cid:52)(cid:40)(cid:97)(cid:98)(cid:41))#(cid:106)(cid:98)(cid:40)(cid:97)(cid:41)J(cid:98)(f(cid:52)(cid:40)(cid:98)(cid:97)(cid:41))#(cid:106)(cid:97)(cid:98)J(cid:97)(cid:98)(f(cid:52)(cid:97)(cid:98)) (12) we estimate f by finding (cid:107),d ,d ,d , and f ,f ,f(cid:40)(cid:98)(cid:41),f(cid:40)(cid:97)(cid:41) and f to minimize (cid:97) (cid:98) (cid:97)(cid:98) (cid:52)(cid:97) (cid:52)(cid:98) (cid:52)(cid:97) (cid:52)(cid:98) (cid:52)(cid:97)(cid:98) (cid:73)(cid:106)(y,f)"(cid:76)(cid:106)(y,f)#(cid:74)(cid:106)(f). (13) Duetothesideconditions,asacomponentof(cid:106)becomeslarge,theestimateofthe‘smooth’term thatitmultipliesbecomessmall.Thus,agoodmethodforchoosingthecomponentsof(cid:106)fromthe data effectively eliminates unneeded terms in the expansion if one estimates their companion smoothing parameters as large. We can include categorical variables in the model, by, for example, letting (cid:75) f(t ,t ,z)"(cid:107)#f (t )#f (t )#f (t )#(cid:43) (cid:99) I (z) (14) (cid:97) (cid:98) (cid:97) (cid:97) (cid:98) (cid:98) (cid:97)(cid:98) (cid:97)(cid:44)(cid:98) (cid:107) (cid:107) (cid:107)(cid:47)(cid:50) wherezisavariablewithKpossiblevaluesz ,2,z andI (z)"1ifz"z and0otherwise.We (cid:49) (cid:75) (cid:107) (cid:107) then include the (cid:99)’s in the minimization of (13). Themathematicsbehindtherepresentationoff,thenumericalmethodfortheminimizationof (cid:73)(cid:106),thedata-basedchoiceof(cid:106)accordingtotheiterativeunbiasedriskmethod,andthecalculation oftheconfidenceintervalsaredescribedinWWGKK.Furtherdetailsappearinreference3and the documentation for GRKPACK. In the four analyses that we carry out below, there are betweend"2andd"4predictorvariables.Inthed"4case,consideringthepenaltyfunctional in(12),andincludingall ofthepossiblemaineffectsandtwofactorinteractiontermswiththeir own smoothing parameters, the result is 4 main effects smoothing parameters ((cid:106) ), 12 (cid:97) trend(cid:93)smoothsmoothingparameters((cid:106)(cid:40)(cid:98)(cid:41))and6smooth(cid:93)smoothparameters((cid:106) ),morethan (cid:97) (cid:97)(cid:98) wewouldwanttofitsimultaneouslywiththesamplesizeshere(alllessthan500).Wereducedthe (cid:40) 1997byJohnWiley&Sons,Ltd. Statist.Med.,Vol.16,1357—1376(1997) 1362 Y.WANGE„A‚. numberoftermswithsmoothingparameterstoamaximumof7basedonpreviouslypublished analyses of these data by more traditional methods, by extensive pre-screening, by fitting parametricpolynomialGLIMmodelswithtermsashighascubicordertodetectanddeleteterms thatfailedtogiveanyevidenceofsignificance,andbyfittingsomemarginalSS-ANOVAmodels asdemonstratedinSection6.Explorationofmoresystematicpre-screeningmethodsisanareaof activeresearch.Oncewehadreducedthenumberofsmoothingparametersto7orless,weused informal model selection methods as discussed in WWGKK. These included the deletion of component terms too small to have an observable effect on cross-sectional plots of f, and the deletion of terms whose componentwise confidence intervals included the 0 function. Further detailsspecifictoeachanalysisappearbelow.Leaving-out-one-thirdmodelselectionprocedures have been discussed in reference 20, but we have not used them here. Section 2 discusses the Wisconsin Epidemiologic Study of Diabetic Retinopathy, and Sections3,4,5and6discussfouranalysesofthisstudy.Section7isasummaryandconclusion. 2. WISCONSIN EPIDEMIOLOGICSTUDY OF DIABETIC RETINOPATHY(WESDR) TheWESDRisanongoingepidemiologicstudyofacohortofpatientswhoreceivetheirmedical care in an 11-county area in southern Wisconsin, who were first examined in 1980—1982, then again in 1984—1986 and 1990—1992. At this writing a third follow-up is currently in progress. Detaileddescriptionsof thedatahaveappearedinreferences27and28andreferencesthere.In brief,asampleof2990diabeticpatientswasselectedinan11-countyareainsouthernWisconsin. This sample consisted of two groups. The first group was 1210 patients diagnosed as diabetic beforetheywere30yearsofageandwhotookinsulin(‘youngeronsetgroup’).Thesecondgroup consistedof1780patientsdiagnosedwithdiabetesafter30yearsofage.Ofthese,824weretaking insulin (‘older onset group taking insulin’) and 956 were not (‘older onset group not taking insulin’).Ofthe2990eligiblepatients,2366participatedinthebaselineexaminationfrom1980to 1982. A large number of medical, demographic, ocular and other covariates were recorded at the baseline and later examinations along with a retinopathy score for each eye (to be described). Relationsbetweenvariouscovariatesandtheretinopathyscoreshavehadextensiveanalysiswith standardstatisticalmethodsincludingcategoricaldataanalysis andlinearlogistic models,with resultsreportedinaseriesof WESDRmanuscripts29—35.WeappliedSS-ANOVAmethodsto asubsetofthedatafromtheyoungeronsetgroupinWWGKKinconjunctionwithanextensive technicalaccountofthemathematicaltheoryandnumericalmethodsbehindthemethod.Itisthe purpose of this account to carry out four SS-ANOVA analyses on data from the older onset WESDR group, and in the process, to illustrate some of the more important features of the method. Our goal is to explain the possible results, advantages and disadvantages in a less technical manner, aimed at clinicians and medical researchers. We limited the present study to the construction of predictive models for incidence and for progression(tobedefined)ofdiabeticretinopathyatthefirstfollow-up,asafunctionofsomeof thecovariatesavailableatbaseline.Wedothisbothforthe‘olderonsetgrouptakinginsulin’(ID group) and the ‘older onset group not taking insulin’ (NID group). We analyse both ‘incident events’and‘progressionevents’forboththeIDandtheNIDgroups.Weonlylistthecovariates pertinent to our analysis: 1. Age: age at the examination (years); 2. Duration: duration of diabetes at the examination (years); 3. Glycosylated haemoglobin: a measure of hyperglycaemia (%);(cid:56) Statist.Med.,Vol.16,1357—1376(1997) (cid:40)1997byJohnWiley&Sons,Ltd. SMOOTHINGSPLINEANOVA 1363 4. Body mass index (bmi): weight in kg/(height in m)(cid:50); 5. Pulse rate counted for 30 seconds; 6. Baseline retinopathy severity levels (base-retinopathy-level).See explanation below. At the baseline and follow-up examinations, each eye was graded as one of the 6 levels: 10,21,31,41,51 and 60#, in order of increasing retinopathy severity with 10 indicating no retinopathyand60#indicatingthemostseverestage,proliferativeretinopathy.Wederivedthe retinopathy level for a participant by giving the eye with the higher level (more severe re- tinopathy) greater weight. For example, we specify the level for a participant with level 31 retinopathyineacheyewiththenotation‘level31/31’;foraparticipantwithlevel31inoneeye andlesssevereretinopathyintheothereyethenotationis‘level31/(31’.Thisschemeprovided an 11-step scale: 10/10; 21/(21; 21/21; 31/(31; 31/31; 41/(41; 41/41; 51/(51; 51/51; 60#/(60#,and60#/60#.Participantsintheanalysisforincidenceconsistedofsubjectswith level10/10atthebaselineandnomissingdata.Themodelprovidesanestimateoftheincidence rate,definedastheprobabilitythatsuchaparticipanthaslevel21/(21orworseatthefollow-up examination. Participants in the analysis for progression consisted of subjects with no or non-proliferativeretinopathyatbaselineandnomissingdata.Themodelprovidesanestimateof theprogressionrate,withprogressiondefinedasanincreaseinbaselinelevelbytwostepsormore (10/10 to 21/21 or greater, or 21/(21 to 31/(31 or greater, for instance). These analyses correspond to analyses previously carried out by traditional methods in some of the WESDR referencescited.Notethatthisallowssubjectsincludedintheincidenceanalysisalsotobepartof the progression analysis. 3. INCIDENCE IN THE OLDER ONSET GROUP NOT TAKING INSULIN After excluding participants with missing values, there were 297 participants in the older onset NID group with a baseline score of 10/10, thus qualifying them for inclusion in the NID ‘Incidence’analysis.Therewasoneobservationofglycosylatedhaemoglobinrecordedas23·6 per cent, much greater than the others. We decided to delete this influential observation. Our conclusions would remain the same if we included this observation in the analysis. Kleinetal.(cid:50)(cid:55)found,usinglinearlogisticmodels,thatglycosylatedhaemoglobinistheonly statisticallysignificantpredictorofincidenceofretinopathyinolderonsetpatients.Usinglinear logisticmodelsasascreeningtool,wefoundthattheeffectof ageis notlinearand thatthereis a strong interactionbetween age and glycosylated haemoglobin. We first fit an SS-ANOVA model with the main effects of age and glycosylated haemoglobinandallthreeinteractionterms.Themaineffectofglycosylatedhaemoglobin islinear.Allinteractiontermsexcepttrend(glycosylatedhaemoglobin)(cid:93)smooth(age)arenear zero. The final model is f(age, glycosylated haemoglobin) "(cid:107)#f (age)#a (cid:93)glycosylated haemoglobin (cid:49) (cid:49) #trend(glycosylated haemoglobin) (cid:93)smooth(age). (15) We plot age versus glycosylated haemoglobin on Figure 1(a). We have marked those participants with incident retinopathy as solid circles and those without as open circles. We superimposethecontourlinesofestimatedposteriorstandarddeviations.Thesecontoursagree wellwiththedistributionoftheobservations.Thuswecanusethemtodelineatearegioninwhich (cid:40) 1997byJohnWiley&Sons,Ltd. Statist.Med.,Vol.16,1357—1376(1997) 1364 Y.WANGE„A‚. Figure1. NID‘Incidence’analysis:(a)dataandcontoursofconstantposteriorstandarddeviation.Solidcirclesindicate incidenceandopencirclesindicatenon-incidence;(b)estimatedprobabilityofincidenceinthedefinedregion,asafunction ofageandglycosylatedhaemoglobin Statist.Med.,Vol.16,1357—1376(1997) (cid:40)1997byJohnWiley&Sons,Ltd. SMOOTHINGSPLINEANOVA 1365 Figure2. NID‘Incidence’analysis.Cross-sectionsofestimatedprobabilityofincidenceasafunctionofage,withtheir90 per cent Bayesian confidence intervals, at four quantiles of glycosylated haemoglobin. q1, q2, q3 and q4 are the quantilesat0·125,0·375,0·625and0·875 wedeem the estimateof the probabilityfunctionas reliable.We decidedto use the region with estimated posterior standard deviations less than or equal to 1, in the logit scale. We plot the probability functionestimate on Figure 1(b). Figure 2 gives cross-sectionsof the estimatedprobabilityofincidenceasafunctionofagewiththeir90percentBayesianconfidence intervals at the cross-sections, at four quantiles of glycosylated haemoglobin. The width of theconfidenceintervalssuggeststhatthesmallbumpsareprobablyanartifact.Thegeneralshape of the response, however, is evident, and it appears that the age effect is not particularly well modelled with a second or even a third degree polynomial. The risk for incidence of retinopathy increases with increasing glycosylated haemoglobin. The risk increases with increasing age for lower glycosylated haemoglobin. This increase might result fromthedevelopmentofotherconditionssuchashypertensionandatheroscleroticvasculardisease inoldercomparedtoyoungersubjectsandthatleadstoincreasedriskofretinopathy,evenwhen the glycosylated haemoglobin level is relatively low. The risk decreases with increasing age for higherglycosylatedhaemoglobin.Thisdecreasemayresultfrommortalitysinceanoldparticipant with high glycosylated haemoglobin has a higher mortality risk during the observation period.(cid:51)(cid:54) (cid:40) 1997byJohnWiley&Sons,Ltd. Statist.Med.,Vol.16,1357—1376(1997) 1366 Y.WANGE„A‚. Figure3. NID‘Progression’analysis.Maineffectsestimatesofglycosylatedhaemoglobinandbmialongwiththeir component-wise90percentBayesianconfidenceintervals,logitscale 4. PROGRESSION OF THE OLDER ONSET GROUP NOT TAKING INSULIN After excluding participants with missing values, there were 432 participants in the older onset NID group qualified for inclusion in the ‘Progression’ analysis. Klein et al.(cid:50)(cid:55) found that age, duration, and glycosylated haemoglobin were statistically significant in a linear logistic model.Usinglinearlogisticmodels,wefoundthatthedurationmaineffectofpolynomialsupto thecubicissignificantandthebmimaineffectofpolynomialsuptothequarticissignificant.We also found that the multiplication interaction between age and polynomials up to cubic of durationissignificant.Theseindicatethatalinearlogisticmodelwithlowerorderpolynomials may be inadequate. We decided to fit the SS-ANOVA model: f(age, duration, glycosylated haemoglobin, bmi) "(cid:107)#f (age)#f (duration)#f (age, duration) (cid:49) (cid:50) (cid:49)(cid:44)(cid:50) #f (glycosylated haemoglobin)#f (bmi). (16) (cid:51) (cid:52) Figure3plotsthemaineffectsestimatesandtheir90percentBayesianconfidenceintervalsfor glycosylatedhaemoglobinandbmi.Weseethattheeffectofglycosylatedhaemoglobinis strong.Theeffectofbmiissmall,and0iscontainedwithinitsconfidenceinterval.Theestimates areeffectivelylinearonalogitscaleeventhoughwealloweda‘smooth’termforeachofthemin the model. This illustrates the ability of the method to reduce to a partially parametric form if warranted by the data. The effects of glycosylated haemoglobin and bmi are additive in the logit scale. In the remaining plots, we fix glycosylated haemoglobin and bmi at their median values. WeplotageversusdurationonFigure4(a).Participantswhohadprogressionofretinopathy appear as solid circles and those with no progressionas open circles. We superimposecontour lines of the estimated posterior standard deviations as a function of age and duration with glycosylatedhaemoglobin and bmi fixed attheir medianvalues. Thesecontoursagree well with the distribution of the observations. We decided to use the region with estimated Statist.Med.,Vol.16,1357—1376(1997) (cid:40)1997byJohnWiley&Sons,Ltd.

Description:
Smoothing spline ANOVA (ANalysis Of VAriance) methods provide a flexible . here is based on (a special case of) the smoothing spline ANOVA method for.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.