
Generalized Additive Models PDF

397 Pages · 2005 · 10.55 MB · English
by Simon N. Wood

Preview Generalized Additive Models

Generalized Additive Models: an introduction with R
Simon N. Wood

Contents

Preface

1 Linear Models
  1.1 A simple linear model
      Simple least squares estimation
  1.1.1 Sampling properties of β̂
  1.1.2 So how old is the universe?
  1.1.3 Adding a distributional assumption
      Testing hypotheses about β
      Confidence intervals
  1.2 Linear models in general
  1.3 The theory of linear models
  1.3.1 Least squares estimation of β
  1.3.2 The distribution of β̂
  1.3.3 (β̂_i − β_i)/σ̂_{β̂_i} ∼ t_{n−p}
  1.3.4 F-ratio results
  1.3.5 The influence matrix
  1.3.6 The residuals, ε̂, and fitted values, μ̂
  1.3.7 Results in terms of X
  1.3.8 The Gauss Markov Theorem: what's special about least squares?
  1.4 The geometry of linear modelling
  1.4.1 Least squares
  1.4.2 Fitting by orthogonal decompositions
  1.4.3 Comparison of nested models
  1.5 Practical linear models
  1.5.1 Model fitting and model checking
  1.5.2 Model summary
  1.5.3 Model selection
  1.5.4 Another model selection example
      A follow up
  1.5.5 Confidence intervals
  1.5.6 Prediction
  1.6 Practical modelling with factors
  1.6.1 Identifiability
  1.6.2 Multiple factors
  1.6.3 'Interactions' of factors
  1.6.4 Using factor variables in R
  1.7 General linear model specification in R
  1.8 Further linear modelling theory
  1.8.1 Constraints I: general linear constraints
  1.8.2 Constraints II: 'contrasts' and factor variables
  1.8.3 Likelihood
  1.8.4 Non-independent data with variable variance
  1.8.5 AIC and Mallow's statistic
  1.8.6 Non-linear least squares
  1.8.7 Further reading
  1.9 Exercises

2 Generalized Linear Models
  2.1 The theory of GLMs
  2.1.1 The exponential family of distributions
  2.1.2 Fitting Generalized Linear Models
  2.1.3 The IRLS objective is a quadratic approximation to the log-likelihood
  2.1.4 AIC for GLMs
  2.1.5 Large sample distribution of β̂
  2.1.6 Comparing models by hypothesis testing
      Deviance
      Model comparison with unknown φ
  2.1.7 φ̂ and Pearson's statistic
  2.1.8 Canonical link functions
  2.1.9 Residuals
      Pearson Residuals
      Deviance Residuals
  2.1.10 Quasi-likelihood
  2.2 Geometry of GLMs
  2.2.1 The geometry of IRLS
  2.2.2 Geometry and IRLS convergence
  2.3 GLMs with R
  2.3.1 Binomial models and heart disease
  2.3.2 A Poisson regression epidemic model
  2.3.3 Log-linear models for categorical data
  2.3.4 Sole eggs in the Bristol channel
  2.4 Likelihood
  2.4.1 Invariance
  2.4.2 Properties of the expected log-likelihood
  2.4.3 Consistency
  2.4.4 Large sample distribution of θ̂
  2.4.5 The generalized likelihood ratio test (GLRT)
  2.4.6 Derivation of 2λ ∼ χ²_r under H_0
  2.4.7 AIC in general
  2.4.8 Quasi-likelihood results
  2.5 Exercises

3 Introducing GAMs
  3.1 Introduction
  3.2 Univariate smooth functions
  3.2.1 Representing a smooth function: regression splines
      A very simple example: a polynomial basis
      Another example: a cubic spline basis
      Using the cubic spline basis
  3.2.2 Controlling the degree of smoothing with penalized regression splines
  3.2.3 Choosing the smoothing parameter, λ: cross validation
  3.3 Additive Models
  3.3.1 Penalized regression spline representation of an additive model
  3.3.2 Fitting additive models by penalized least squares
  3.4 Generalized Additive Models
  3.5 Summary
  3.6 Exercises

4 Some GAM theory
  4.1 Smoothing bases
  4.1.1 Why splines?
      Natural cubic splines are smoothest interpolators
      Cubic smoothing splines
  4.1.2 Cubic regression splines
  4.1.3 A cyclic cubic regression spline
  4.1.4 P-splines
  4.1.5 Thin plate regression splines
      Thin plate splines
      Thin plate regression splines
      Properties of thin plate regression splines
      Knot based approximation
  4.1.6 Shrinkage smoothers
  4.1.7 Choosing the basis dimension
  4.1.8 Tensor product smooths
      Tensor product bases
      Tensor product penalties
  4.2 Setting up GAMs as penalized GLMs
  4.2.1 Variable coefficient models
  4.3 Justifying P-IRLS
  4.4 Degrees of freedom and residual variance estimation
  4.4.1 Residual variance or scale parameter estimation
  4.5 Smoothing Parameter Estimation Criteria
  4.5.1 Known scale parameter: UBRE
  4.5.2 Unknown scale parameter: Cross Validation
      Problems with Ordinary Cross Validation
  4.5.3 Generalized Cross Validation
  4.5.4 GCV/UBRE/AIC in the Generalized case
      Approaches to GAM GCV/UBRE minimization
  4.6 Numerical GCV/UBRE: performance iteration
  4.6.1 Minimizing the GCV or UBRE score
      Stable and efficient evaluation of the scores and derivatives
      The weighted constrained case
  4.7 Numerical GCV/UBRE optimization by outer iteration
  4.7.1 Differentiating the GCV/UBRE function
  4.8 Distributional results
  4.8.1 Bayesian model, and posterior distribution of the parameters, for an additive model
  4.8.2 Structure of the prior
  4.8.3 Posterior distribution for a GAM
  4.8.4 Bayesian confidence intervals for non-linear functions of parameters
  4.8.5 P-values
  4.9 Confidence interval performance
  4.9.1 Single smooths
  4.9.2 GAMs and their components
  4.9.3 Unconditional Bayesian confidence intervals
  4.10 Further GAM theory
  4.10.1 Comparing GAMs by hypothesis testing
  4.10.2 ANOVA decompositions and Nesting
  4.10.3 The geometry of penalized regression
  4.10.4 The "natural" parameterization of a penalized smoother
  4.11 Other approaches to GAMs
  4.11.1 Backfitting GAMs
  4.11.2 Generalized smoothing splines
  4.12 Exercises

5 GAMs in practice: mgcv
  5.1 Cherry trees again
  5.1.1 Finer control of gam
  5.1.2 Smooths of several variables
  5.1.3 Parametric model terms
  5.2 Brain Imaging Example
  5.2.1 Preliminary Modelling
  5.2.2 Would an additive structure be better?
  5.2.3 Isotropic or tensor product smooths?
  5.2.4 Detecting symmetry (with by variables)
  5.2.5 Comparing two surfaces
  5.2.6 Prediction with predict.gam
      Prediction with lpmatrix
  5.2.7 Variances of non-linear functions of the fitted model
  5.3 Air Pollution in Chicago Example
  5.4 Mackerel egg survey example
  5.4.1 Model development
  5.4.2 Model predictions
  5.5 Portuguese larks
  5.6 Other packages
  5.6.1 Package gam
  5.6.2 Package gss
  5.7 Exercises

6 Mixed models: GAMMs
  6.1 Mixed models for balanced data
  6.1.1 A motivating example
      The wrong approach: a fixed effects linear model
      The right approach: a mixed effects model
  6.1.2 General principles
  6.1.3 A single random factor
  6.1.4 A model with two factors
  6.1.5 Discussion
  6.2 Linear mixed models in general
  6.2.1 Estimation of linear mixed models
  6.2.2 Directly maximizing a mixed model likelihood in R
  6.2.3 Inference with linear mixed models
      Fixed effects
      Inference about the random effects
  6.2.4 Predicting the random effects
  6.2.5 REML
      The explicit form of the REML criterion
  6.2.6 A link with penalized regression
  6.3 Linear mixed models in R
  6.3.1 Tree Growth: an example using lme
  6.3.2 Several levels of nesting
  6.4 Generalized linear mixed models
  6.5 GLMMs with R
  6.6 Generalized Additive Mixed Models
  6.6.1 Smooths as mixed model components
  6.6.2 Inference with GAMMs
  6.7 GAMMs with R
  6.7.1 A GAMM for sole eggs
  6.7.2 The Temperature in Cairo
  6.8 Exercises

A Some Matrix Algebra
  A.1 Basic computational efficiency
  A.2 Covariance matrices
  A.3 Differentiating a matrix inverse
  A.4 Kronecker product
  A.5 Orthogonal matrices and Householder matrices
  A.6 QR decomposition
  A.7 Choleski decomposition
  A.8 Eigen-decomposition
  A.9 Singular value decomposition
  A.10 Pivoting
  A.11 Lanczos iteration

B Solutions to exercises
  B.1 Chapter 1
  B.2 Chapter 2
  B.3 Chapter 3
  B.4 Chapter 4
  B.5 Chapter 5
  B.6 Chapter 6

Bibliography
Index

Description:
A Generalized Additive Model (GAM) is a GLM in which part of the linear predictor is specified in terms of a sum of smooth functions of predictor variables. The book develops the linear model and GLM background, introduces GAMs and the theory of penalized regression splines, and then works through their practical use in R with the mgcv package, including mixed-model extensions (GAMMs).
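A worked example makes the definition concrete. The following is a minimal sketch, not taken from the book, of fitting a GAM in R with the mgcv package used in the book's practical chapters; the simulated data and the variable names x0, x1 and y are illustrative assumptions only.

library(mgcv)  # provides gam() and the s() smooth-term constructor

## Simulated data, purely for illustration: a smooth signal in x0 plus a
## smooth signal in x1, with Gaussian noise.
set.seed(1)
n   <- 200
x0  <- runif(n)
x1  <- runif(n)
y   <- sin(2 * pi * x0) + 0.5 * x1^2 + rnorm(n, sd = 0.3)
dat <- data.frame(x0 = x0, x1 = x1, y = y)

## Linear predictor = intercept + f1(x0) + f2(x1), where f1 and f2 are smooth
## functions represented by penalized regression splines, with smoothing
## parameters selected automatically (GCV/UBRE-type criteria in the book's
## treatment).
m <- gam(y ~ s(x0) + s(x1), family = gaussian, data = dat)

summary(m)          # approximate significance of each smooth term
plot(m, pages = 1)  # the two estimated smooth functions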
