
Regression Analysis: Theory, Methods, and Applications

360 pages · 1990 · 6.491 MB · English

Springer Texts in Statistics

Advisors: George Casella, Stephen Fienberg, Ingram Olkin

Springer: New York, Berlin, Heidelberg, Barcelona, Hong Kong, London, Milan, Paris, Singapore, Tokyo

Springer Texts in Statistics

Alfred: Elements of Statistics for the Life and Social Sciences
Berger: An Introduction to Probability and Stochastic Processes
Bilodeau and Brenner: Theory of Multivariate Statistics
Blom: Probability and Statistics: Theory and Applications
Brockwell and Davis: An Introduction to Time Series and Forecasting
Chow and Teicher: Probability Theory: Independence, Interchangeability, Martingales, Third Edition
Christensen: Plane Answers to Complex Questions: The Theory of Linear Models, Second Edition
Christensen: Linear Models for Multivariate, Time Series, and Spatial Data
Christensen: Log-Linear Models and Logistic Regression, Second Edition
Creighton: A First Course in Probability Models and Statistical Inference
Dean and Voss: Design and Analysis of Experiments
du Toit, Steyn, and Stumpf: Graphical Exploratory Data Analysis
Durrett: Essentials of Stochastic Processes
Edwards: Introduction to Graphical Modelling, Second Edition
Finkelstein and Levin: Statistics for Lawyers
Flury: A First Course in Multivariate Statistics
Jobson: Applied Multivariate Data Analysis, Volume I: Regression and Experimental Design
Jobson: Applied Multivariate Data Analysis, Volume II: Categorical and Multivariate Methods
Kalbfleisch: Probability and Statistical Inference, Volume I: Probability, Second Edition
Kalbfleisch: Probability and Statistical Inference, Volume II: Statistical Inference, Second Edition
Karr: Probability
Keyfitz: Applied Mathematical Demography, Second Edition
Kiefer: Introduction to Statistical Inference
Kokoska and Nevison: Statistical Tables and Formulae
Kulkarni: Modeling, Analysis, Design, and Control of Stochastic Systems
Lehmann: Elements of Large-Sample Theory
Lehmann: Testing Statistical Hypotheses, Second Edition
Lehmann and Casella: Theory of Point Estimation, Second Edition
Lindman: Analysis of Variance in Experimental Design
Lindsey: Applying Generalized Linear Models
Madansky: Prescriptions for Working Statisticians
McPherson: Statistics in Scientific Investigation: Its Basis, Application, and Interpretation
Mueller: Basic Principles of Structural Equation Modeling: An Introduction to LISREL and EQS

(continued after index)

Ashish Sen · Muni Srivastava

Regression Analysis: Theory, Methods, and Applications

With 38 Illustrations

Springer

Ashish Sen
College of Architecture, Art, and Urban Planning
School of Urban Planning and Policy
The University of Illinois
Chicago, IL 60680
USA

Muni Srivastava
Department of Statistics
University of Toronto
Toronto, Ontario
Canada M5S 1A1

Editorial Board
George Casella, Biometrics Unit, Cornell University, Ithaca, NY 14853-7801, USA
Stephen Fienberg, Department of Statistics, Carnegie-Mellon University, Pittsburgh, PA 15213, USA
Ingram Olkin, Department of Statistics, Stanford University, Stanford, CA 94305, USA

Mathematical Subject Classification: 62Jxx, 62-01

Library of Congress Cataloging-in-Publication Data
Sen, Ashish K.
Regression analysis: Theory, methods, and applications / Ashish Sen, Muni Srivastava.
p. cm. (Springer Texts in Statistics)
ISBN-13: 978-1-4612-8789-6
1. Regression analysis. I. Srivastava, M.S. II. Title. III. Series.
QA278.2.S46 1990 519.5'36 dc20 89-48506

Printed on acid-free paper.

© 1990 Springer-Verlag New York Inc. Softcover reprint of the hardcover 1st edition 1990.

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Photocomposed copy prepared from the authors' LaTeX file.

9 8 7 6 5

ISBN-13: 978-1-4612-8789-6
e-ISBN-13: 978-1-4612-4470-7
DOI: 10.1007/978-1-4612-4470-7

To Ashoka Kumar Sen and the memory of Jagdish Bahadur Srivastava

Preface

Any method of fitting equations to data may be called regression. Such equations are valuable for at least two purposes: making predictions and judging the strength of relationships. Because they provide a way of empirically identifying how a variable is affected by other variables, regression methods have become essential in a wide range of fields, including the social sciences, engineering, medical research and business.

Of the various methods of performing regression, least squares is the most widely used. In fact, linear least squares regression is by far the most widely used of any statistical technique. Although nonlinear least squares is covered in an appendix, this book is mainly about linear least squares applied to fit a single equation (as opposed to a system of equations).

The writing of this book started in 1982. Since then, various drafts have been used at the University of Toronto for teaching a semester-long course to juniors, seniors and graduate students in a number of fields, including statistics, pharmacology, engineering, economics, forestry and the behavioral sciences. Parts of the book have also been used in a quarter-long course given to Master's and Ph.D. students in public administration, urban planning and engineering at the University of Illinois at Chicago (UIC). This experience and the comments and criticisms from students helped forge the final version.
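The linear least squares fit that the preface singles out as the book's central method can be sketched in a few lines. The sketch below is not from the book: the data are invented for illustration, and NumPy's `lstsq` is used as a stand-in for the normal-equations solution the book develops.

```python
import numpy as np

# Minimal sketch of simple linear least squares (invented data):
# fit y = b0 + b1*x by minimizing the sum of squared residuals.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves the normal equations X'X b = X'y
residuals = y - X @ b

print(b)  # estimated coefficients [b0, b1]
```

The same matrix formulation extends unchanged to multiple regression: adding predictors simply adds columns to the design matrix `X`.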
The book offers an up-to-date account of the theory and methods of regression analysis. We believe our treatment of theory to be the most complete of any book at this level. The methods provide a comprehensive toolbox for the practicing regressionist. The examples, most of them drawn from 'real life', illustrate the difficulties commonly encountered in the practice of regression, while the solutions underscore the subjective judgments the practitioner must make. Each chapter ends with a large number of exercises that supplement and reinforce the discussions in the text and provide valuable practical experience. When the reader has mastered the contents of this book, he or she will have gained both a firm foundation in the theory of regression and the experience necessary to competently practice this valuable craft.

A first course in mathematical statistics, the ability to use statistical computer packages and familiarity with calculus and linear algebra are prerequisites for the study of this book. Additional statistical courses and a good knowledge of matrices would be helpful.

This book has twelve chapters. The Gauss-Markov Conditions are assumed to hold in the discussion of the first four chapters; the next five chapters present methods to alleviate the effects of violations of these conditions. The final three chapters discuss the somewhat related topics of multicollinearity, variable search and biased estimation. Relevant matrix and distribution theory is surveyed in the first two appendices at the end of the book, which are intended as a convenient reference. The last appendix covers nonlinear regression.

Chapters and sections that some readers might find more demanding are identified with an asterisk or are placed in appendices to chapters. A reader can navigate around these without losing much continuity. In fact, a reader who is primarily interested in applications may wish to omit many of the other proofs and derivations.
Difficult exercises have also been marked with asterisks. Since the exercises and examples use over 50 data sets, a disk containing most of them is provided with the book. The READ.ME file on the disk gives further information on its contents.

This book would have been much more difficult, if not impossible, to write without the help of our colleagues and students. We are especially grateful to Professor Siim Soot, who examined parts of the book and was an all-round friend; George Yanos of the Computer Center at UIC, whose instant e-mail responses to numerous cries for help considerably shortened the time to do the numerical examples (including those that were ultimately not used); Dr. Chris Johnson, who was a research associate of one of the authors during the time he learnt most about the practical art of regression; Professor Michael Dacey, who provided several data sets and whose encouragement was most valuable; and Professor V. K. Srivastava, whose comments on a draft of the book were most useful. We also learnt a lot from earlier books on the subject, particularly the first editions of Draper and Smith (1966) and Daniel and Wood (1971), and we owe a debt of gratitude to their authors.

Numerous present and former students of both authors contributed their time in editing and proof-reading, checking the derivations, inputting data, drawing diagrams and finding data sets. Soji Abass, Dr. Martin Bilodeau, Robert Drozd, Andrea Fraser, Dr. Sucharita Ghosh, Robert Gray, Neleema Grover, Albert Hoang, M.R. Khavanin, Supin Li, Dr. Claire McKnight, Cresar Singh, Yanhong Wu, Dr. Y. K. Yau, Seongsun Yun and Zhang Tingwei constitute but a partial list of their names. We would like to single out for particular mention Marguerite Ennis and Piyushimita Thakuriah for their invaluable help in completing the manuscript.
Linda Chambers TeXed an earlier draft of the manuscript, Barry Grau was most helpful in identifying computer programs, some of which are referred to in the text, Marilyn Engwall did the paste-up on previous drafts, Ray Brod drew one of the figures and Bobbie Albrecht designed the cover. We would like to express our gratitude to all of them. A particular thanks is due to Dr. Colleen Sen, who painstakingly edited and proofread draft after draft.

We also appreciate the patience of our colleagues at UIC and the University of Toronto during the writing of this book. The editors at Springer-Verlag, particularly Susan Gordon, were most supportive. We would like to gratefully acknowledge the support of the Natural Sciences and Engineering Research Council of Canada and the National Science Foundation of the U.S. during the time this book was in preparation. The help of the Computer Center at UIC, which made computer time freely available, was indispensable.

Preface to the Fourth Printing

We have taken advantage of this as well as previous reprintings to correct several typographic errors. In addition, two exercises have been changed: one because it required too much effort, and another because we were able to replace it with problems we found more interesting. In order to keep the price of the book reasonable, the data disk is no longer included. Its contents have been placed at web sites from which they may be downloaded. The URLs are http://www.springer-ny.com and http://www.uic.edu/~ashish/regression.html.

Contents

1 Introduction 1
1.1 Relationships 1
1.2 Determining Relationships: A Specific Problem 2
1.3 The Model 5
1.4 Least Squares 7
1.5 Another Example and a Special Case 10
1.6 When Is Least Squares a Good Method? 11
1.7 A Measure of Fit for Simple Regression 13
1.8 Mean and Variance of b0 and b1 14
1.9 Confidence Intervals and Tests 17
1.10 Predictions 18
Appendix to Chapter 1 20
Problems 23

2 Multiple Regression 28
2.1 Introduction 28
2.2 Regression Model in Matrix Notation 28
2.3 Least Squares Estimates 30
2.4 Examples 31
2.5 Gauss-Markov Conditions 35
2.6 Mean and Variance of Estimates Under G-M Conditions 35
2.7 Estimation of σ² 37
2.8 Measures of Fit 39
2.9 The Gauss-Markov Theorem 41
2.10 The Centered Model 42
2.11 Centering and Scaling 44
2.12 *Constrained Least Squares 44
Appendix to Chapter 2 46
Problems 49

3 Tests and Confidence Regions 60
3.1 Introduction 60
3.2 Linear Hypothesis 60
3.3 *Likelihood Ratio Test 62
3.4 *Distribution of Test Statistic 64
3.5 Two Special Cases 65
3.6 Examples 66
3.7 Comparison of Regression Equations 67
3.8 Confidence Intervals and Regions 71
3.8.1 C.I. for the Expectation of a Predicted Value 71
3.8.2 C.I. for a Future Observation 71
3.8.3 *Confidence Region for Regression Parameters 72
3.8.4 *C.I.'s for Linear Combinations of Coefficients 73
Problems 74

4 Indicator Variables 83
4.1 Introduction 83
4.2 A Simple Application 83
4.3 Polychotomous Variables 84
4.4 Continuous and Indicator Variables 88
4.5 Broken Line Regression 89
4.6 Indicators as Dependent Variables 92
Problems 95

5 The Normality Assumption 100
5.1 Introduction 100
5.2 Checking for Normality 101
5.2.1 Probability Plots 101
5.2.2 Tests for Normality 105
5.3 Invoking Large Sample Theory 106
5.4 *Bootstrapping 107
5.5 *Asymptotic Theory 108
Problems 110

6 Unequal Variances 111
6.1 Introduction 111
6.2 Detecting Heteroscedasticity 111
6.2.1 Formal Tests 114
6.3 Variance Stabilizing Transformations 115
6.4 Weighting 118
Problems 128

7 *Correlated Errors 132
7.1 Introduction 132
7.2 Generalized Least Squares: Case When Ω Is Known 133
7.3 Estimated Generalized Least Squares 134
7.3.1 Error Variances Unequal and Unknown 134
7.4 Nested Errors 136
