Applied Linear Regression
Third Edition
SANFORD WEISBERG
University of Minnesota
School of Statistics
Minneapolis, Minnesota
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2005 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.
Library of Congress Cataloging-in-Publication Data:
Weisberg, Sanford, 1947–
Applied linear regression / Sanford Weisberg.—3rd ed.
p. cm.—(Wiley series in probability and statistics)
Includes bibliographical references and index.
ISBN 0-471-66379-4 (acid-free paper)
1. Regression analysis. I. Title. II. Series.
QA278.2.W44 2005
519.536—dc22
2004050920
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
To Carol, Stephanie
and
to the memory of my parents
Contents
Preface xiii
1 Scatterplots and Regression 1
1.1 Scatterplots, 1
1.2 Mean Functions, 9
1.3 Variance Functions, 11
1.4 Summary Graph, 11
1.5 Tools for Looking at Scatterplots, 12
1.5.1 Size, 13
1.5.2 Transformations, 14
1.5.3 Smoothers for the Mean Function, 14
1.6 Scatterplot Matrices, 15
Problems, 17
2 Simple Linear Regression 19
2.1 Ordinary Least Squares Estimation, 21
2.2 Least Squares Criterion, 23
2.3 Estimating σ2, 25
2.4 Properties of Least Squares Estimates, 26
2.5 Estimated Variances, 27
2.6 Comparing Models: The Analysis of Variance, 28
2.6.1 The F-Test for Regression, 30
2.6.2 Interpreting p-values, 31
2.6.3 Power of Tests, 31
2.7 The Coefficient of Determination, R2, 31
2.8 Confidence Intervals and Tests, 32
2.8.1 The Intercept, 32
2.8.2 Slope, 33
2.8.3 Prediction, 34
2.8.4 Fitted Values, 35
2.9 The Residuals, 36
Problems, 38
3 Multiple Regression 47
3.1 Adding a Term to a Simple Linear Regression Model, 47
3.1.1 Explaining Variability, 49
3.1.2 Added-Variable Plots, 49
3.2 The Multiple Linear Regression Model, 50
3.3 Terms and Predictors, 51
3.4 Ordinary Least Squares, 54
3.4.1 Data and Matrix Notation, 54
3.4.2 Variance-Covariance Matrix of e, 56
3.4.3 Ordinary Least Squares Estimators, 56
3.4.4 Properties of the Estimates, 57
3.4.5 Simple Regression in Matrix Terms, 58
3.5 The Analysis of Variance, 61
3.5.1 The Coefficient of Determination, 62
3.5.2 Hypotheses Concerning One of the Terms, 62
3.5.3 Relationship to the t-Statistic, 63
3.5.4 t-Tests and Added-Variable Plots, 63
3.5.5 Other Tests of Hypotheses, 64
3.5.6 Sequential Analysis of Variance Tables, 64
3.6 Predictions and Fitted Values, 65
Problems, 65
4 Drawing Conclusions 69
4.1 Understanding Parameter Estimates, 69
4.1.1 Rate of Change, 69
4.1.2 Signs of Estimates, 70
4.1.3 Interpretation Depends on Other Terms in the Mean Function, 70
4.1.4 Rank Deficient and Over-Parameterized Mean Functions, 73
4.1.5 Tests, 74
4.1.6 Dropping Terms, 74
4.1.7 Logarithms, 76
4.2 Experimentation Versus Observation, 77
4.3 Sampling from a Normal Population, 80
4.4 More on R2, 81
4.4.1 Simple Linear Regression and R2, 83
4.4.2 Multiple Linear Regression, 84
4.4.3 Regression through the Origin, 84
4.5 Missing Data, 84
4.5.1 Missing at Random, 84
4.5.2 Alternatives, 85
4.6 Computationally Intensive Methods, 87
4.6.1 Regression Inference without Normality, 87
4.6.2 Nonlinear Functions of Parameters, 89
4.6.3 Predictors Measured with Error, 90
Problems, 92
5 Weights, Lack of Fit, and More 96
5.1 Weighted Least Squares, 96
5.1.1 Applications of Weighted Least Squares, 98
5.1.2 Additional Comments, 99
5.2 Testing for Lack of Fit, Variance Known, 100
5.3 Testing for Lack of Fit, Variance Unknown, 102
5.4 General F Testing, 105
5.4.1 Non-null Distributions, 107
5.4.2 Additional Comments, 108
5.5 Joint Confidence Regions, 108
Problems, 110
6 Polynomials and Factors 115
6.1 Polynomial Regression, 115
6.1.1 Polynomials with Several Predictors, 117
6.1.2 Using the Delta Method to Estimate a Minimum or a Maximum, 120
6.1.3 Fractional Polynomials, 122
6.2 Factors, 122
6.2.1 No Other Predictors, 123
6.2.2 Adding a Predictor: Comparing Regression Lines, 126
6.2.3 Additional Comments, 129
6.3 Many Factors, 130
6.4 Partial One-Dimensional Mean Functions, 131
6.5 Random Coefficient Models, 134
Problems, 137
7 Transformations 147
7.1 Transformations and Scatterplots, 147
7.1.1 Power Transformations, 148
7.1.2 Transforming Only the Predictor Variable, 150
7.1.3 Transforming the Response Only, 152
7.1.4 The Box and Cox Method, 153
7.2 Transformations and Scatterplot Matrices, 153
7.2.1 The 1D Estimation Result and Linearly Related Predictors, 156
7.2.2 Automatic Choice of Transformation of Predictors, 157
7.3 Transforming the Response, 159
7.4 Transformations of Nonpositive Variables, 160
Problems, 161
8 Regression Diagnostics: Residuals 167
8.1 The Residuals, 167
8.1.1 Difference Between ê and e, 168
8.1.2 The Hat Matrix, 169
8.1.3 Residuals and the Hat Matrix with Weights, 170
8.1.4 The Residuals When the Model Is Correct, 171
8.1.5 The Residuals When the Model Is Not Correct, 171
8.1.6 Fuel Consumption Data, 173
8.2 Testing for Curvature, 176
8.3 Nonconstant Variance, 177
8.3.1 Variance Stabilizing Transformations, 179
8.3.2 A Diagnostic for Nonconstant Variance, 180
8.3.3 Additional Comments, 185
8.4 Graphs for Model Assessment, 185
8.4.1 Checking Mean Functions, 186
8.4.2 Checking Variance Functions, 189
Problems, 191
9 Outliers and Influence 194
9.1 Outliers, 194
9.1.1 An Outlier Test, 194
9.1.2 Weighted Least Squares, 196
9.1.3 Significance Levels for the Outlier Test, 196
9.1.4 Additional Comments, 197
9.2 Influence of Cases, 198
9.2.1 Cook’s Distance, 198
9.2.2 Magnitude of Di, 199
9.2.3 Computing Di, 200
9.2.4 Other Measures of Influence, 203
9.3 Normality Assumption, 204
Problems, 206
10 Variable Selection 211
10.1 The Active Terms, 211
10.1.1 Collinearity, 214
10.1.2 Collinearity and Variances, 216
10.2 Variable Selection, 217
10.2.1 Information Criteria, 217
10.2.2 Computationally Intensive Criteria, 220
10.2.3 Using Subject-Matter Knowledge, 220
10.3 Computational Methods, 221
10.3.1 Subset Selection Overstates Significance, 225
10.4 Windmills, 226
10.4.1 Six Mean Functions, 226
10.4.2 A Computationally Intensive Approach, 228
Problems, 230
11 Nonlinear Regression 233
11.1 Estimation for Nonlinear Mean Functions, 234
11.2 Inference Assuming Large Samples, 237
11.3 Bootstrap Inference, 244
11.4 References, 248
Problems, 248
12 Logistic Regression 251
12.1 Binomial Regression, 253
12.1.1 Mean Functions for Binomial Regression, 254
12.2 Fitting Logistic Regression, 255
12.2.1 One-Predictor Example, 255
12.2.2 Many Terms, 256
12.2.3 Deviance, 260
12.2.4 Goodness-of-Fit Tests, 261
12.3 Binomial Random Variables, 263
12.3.1 Maximum Likelihood Estimation, 263
12.3.2 The Log-Likelihood for Logistic Regression, 264
12.4 Generalized Linear Models, 265
Problems, 266
Appendix 270
A.1 Web Site, 270
A.2 Means and Variances of Random Variables, 270
A.2.1 E Notation, 270
A.2.2 Var Notation, 271
A.2.3 Cov Notation, 271
A.2.4 Conditional Moments, 272
A.3 Least Squares for Simple Regression, 273
A.4 Means and Variances of Least Squares Estimates, 273
A.5 Estimating E(Y|X) Using a Smoother, 275
A.6 A Brief Introduction to Matrices and Vectors, 278
A.6.1 Addition and Subtraction, 279
A.6.2 Multiplication by a Scalar, 280
A.6.3 Matrix Multiplication, 280
A.6.4 Transpose of a Matrix, 281
A.6.5 Inverse of a Matrix, 281
A.6.6 Orthogonality, 282
A.6.7 Linear Dependence and Rank of a Matrix, 283
A.7 Random Vectors, 283
A.8 Least Squares Using Matrices, 284
A.8.1 Properties of Estimates, 285
A.8.2 The Residual Sum of Squares, 285
A.8.3 Estimate of Variance, 286
A.9 The QR Factorization, 286
A.10 Maximum Likelihood Estimates, 287
A.11 The Box-Cox Method for Transformations, 289
A.11.1 Univariate Case, 289
A.11.2 Multivariate Case, 290
A.12 Case Deletion in Linear Regression, 291
References 293
Author Index 301
Subject Index 305
Preface
Regression analysis answers questions about the dependence of a response variable on one or more predictors, including prediction of future values of a response, discovering which predictors are important, and estimating the impact of changing a predictor or a treatment on the value of the response. At the publication of the second edition of this book about 20 years ago, regression analysis using least squares was essentially the only methodology available to analysts interested in questions like these. Cheap, widely available high-speed computing has changed the rules for examining these questions. Modern competitors include nonparametric regression, neural networks, support vector machines, and tree-based methods, among others. A new field of computer science, called machine learning, adds diversity, and confusion, to the mix. With the availability of software, using a neural network or any of these other methods seems to be just as easy as using linear regression.

So, a reasonable question to ask is: Who needs a revised book on linear regression using ordinary least squares when all these other newer and, presumably, better methods exist? This question has several answers. First, most other modern regression modeling methods are really just elaborations or modifications of linear regression modeling. To understand, as opposed to use, neural networks or the support vector machine is nearly impossible without a good understanding of linear regression methodology. Second, linear regression methodology is relatively transparent, as will be seen throughout this book. We can draw graphs that will generally allow us to see relationships between variables and decide whether the models we are using make any sense. Many of the more modern methods are much like a black box in which data are stuffed in at one end and answers pop out at the other, without much hope for the nonexpert to understand what is going on inside the box. Third, if you know how to do something in linear regression, the same methodology with only minor adjustments will usually carry over to other regression-type problems for which least squares is not appropriate. For example, the methodology for comparing response curves for different values of a treatment variable when the response is continuous is studied in Chapter 6 of this book. Analogous methodology can be used when the response is a possibly censored survival time, even though the method of fitting needs to be appropriate for the censored response and not least squares. The methodology of Chapter 6 is useful both in its