
Applied logistic regression (Wiley Series in probability and statistics) PDF

396 Pages·2000·4.42 MB·English

Preview Applied logistic regression (Wiley Series in probability and statistics)

Wiley Series in Probability and Statistics

Applied Logistic Regression
Second Edition

David W. Hosmer
Stanley Lemeshow

WILEY SERIES IN PROBABILITY AND STATISTICS
TEXTS AND REFERENCES SECTION

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors: Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, David W. Scott, Bernard W. Silverman, Adrian F. M. Smith, Jozef L. Teugels; Vic Barnett, Emeritus, Ralph A. Bradley, Emeritus, J. Stuart Hunter, Emeritus, David G. Kendall, Emeritus

A complete list of the titles in this series appears at the end of this volume.

Applied Logistic Regression
Second Edition

DAVID W. HOSMER
University of Massachusetts
Amherst, Massachusetts

STANLEY LEMESHOW
The Ohio State University
Columbus, Ohio

A Wiley-Interscience Publication
JOHN WILEY & SONS, INC.
New York • Chichester • Weinheim • Brisbane • Singapore • Toronto

To Trina, Wylie, Tri, D. W. H.
To Elaine, Jenny, Adina, Steven, S. L.

This text is printed on acid-free paper.

Copyright © 2000 by John Wiley & Sons, Inc. All rights reserved.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, E-Mail: PERMREQ@WILEY.COM.

To order books or for customer service, please call 1(800)-CALL-WILEY (225-5945).

Library of Congress Cataloging in Publication Data:

Hosmer, David W.
Applied logistic regression / David W. Hosmer, Jr., Stanley Lemeshow. 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-471-35632-8 (cloth : alk. paper)
1. Regression analysis. I. Lemeshow, Stanley. II. Title.
QA278.2.H67 2000
519.5'36-dc21
00-036843

Printed in the United States of America

10 9 8 7 6 5 4

CONTENTS

1 Introduction to the Logistic Regression Model
  1.1 Introduction
  1.2 Fitting the Logistic Regression Model
  1.3 Testing for the Significance of the Coefficients
  1.4 Confidence Interval Estimation
  1.5 Other Methods of Estimation
  1.6 Data Sets
    1.6.1 The ICU Study
    1.6.2 The Low Birth Weight Study
    1.6.3 The Prostate Cancer Study
    1.6.4 The UMARU IMPACT Study
  Exercises

2 Multiple Logistic Regression
  2.1 Introduction
  2.2 The Multiple Logistic Regression Model
  2.3 Fitting the Multiple Logistic Regression Model
  2.4 Testing for the Significance of the Model
  2.5 Confidence Interval Estimation
  2.6 Other Methods of Estimation
  Exercises

3 Interpretation of the Fitted Logistic Regression Model
  3.1 Introduction
  3.2 Dichotomous Independent Variable
  3.3 Polychotomous Independent Variable
  3.4 Continuous Independent Variable
  3.5 The Multivariable Model
  3.6 Interaction and Confounding
  3.7 Estimation of Odds Ratios in the Presence of Interaction
  3.8 A Comparison of Logistic Regression and Stratified Analysis for 2 x 2 Tables
  3.9 Interpretation of the Fitted Values
  Exercises

4 Model-Building Strategies and Methods for Logistic Regression
  4.1 Introduction
  4.2 Variable Selection
  4.3 Stepwise Logistic Regression
  4.4 Best Subsets Logistic Regression
  4.5 Numerical Problems
  Exercises

5 Assessing the Fit of the Model
  5.1 Introduction
  5.2 Summary Measures of Goodness-of-Fit
    5.2.1 Pearson Chi-Square Statistic and Deviance
    5.2.2 The Hosmer-Lemeshow Tests
    5.2.3 Classification Tables
    5.2.4 Area Under the ROC Curve
    5.2.5 Other Summary Measures
  5.3 Logistic Regression Diagnostics
  5.4 Assessment of Fit via External Validation
  5.5 Interpretation and Presentation of Results from a Fitted Logistic Regression Model
  Exercises

6 Application of Logistic Regression with Different Sampling Models
  6.1 Introduction
  6.2 Cohort Studies
  6.3 Case-Control Studies
  6.4 Fitting Logistic Regression Models to Data from Complex Sample Surveys
  Exercises

7 Logistic Regression for Matched Case-Control Studies
  7.1 Introduction
  7.2 Logistic Regression Analysis for the 1-1 Matched Study
  7.3 An Example of the Use of the Logistic Regression Model in a 1-1 Matched Study
  7.4 Assessment of Fit in a Matched Study
  7.5 An Example of the Use of the Logistic Regression Model in a 1-M Matched Study
  7.6 Methods for Assessment of Fit in a 1-M Matched Study
  7.7 An Example of Assessment of Fit in a 1-M Matched Study
  Exercises

8 Special Topics
  8.1 The Multinomial Logistic Regression Model
    8.1.1 Introduction to the Model and Estimation of the Parameters
    8.1.2 Interpreting and Assessing the Significance of the Estimated Coefficients
    8.1.3 Model-Building Strategies for Multinomial Logistic Regression
    8.1.4 Assessment of Fit and Diagnostics for the Multinomial Logistic Regression Model
  8.2 Ordinal Logistic Regression Models
    8.2.1 Introduction to the Models, Methods for Fitting and Interpretation of Model Parameters
    8.2.2 Model Building Strategies for Ordinal Logistic Regression Models
  8.3 Logistic Regression Models for the Analysis of Correlated Data
  8.4 Exact Methods for Logistic Regression Models
  8.5 Sample Size Issues When Fitting Logistic Regression Models
  Exercises

Addendum
References
Index

Preface to the Second Edition

The use of logistic regression modeling has
exploded during the past decade. From its original acceptance in epidemiologic research, the method is now commonly employed in many fields including but not nearly limited to biomedical research, business and finance, criminology, ecology, engineering, health policy, linguistics and wildlife biology. At the same time there has been an equal amount of effort in research on all statistical aspects of the logistic regression model. A literature search that we did in preparing this Second Edition turned up more than 1000 citations that have appeared in the 10 years since the First Edition of this book was published.

When we worked on the First Edition of this book we were very limited by software that could carry out the kinds of analyses we felt were important. Specifically, beyond estimation of regression coefficients, we were interested in such issues as measures of model performance, diagnostic statistics, conditional analyses and multinomial response data. Software is now readily available in numerous easy to use and widely available statistical packages to address these and other extremely important modeling issues. Enhancements to these capabilities are being added to each new version. As is well-recognized in the statistical community, the inherent danger of this easy-to-use software is that investigators are using a very powerful tool about which they may have only limited understanding. It is our hope that this Second Edition will bridge the gap between the outstanding theoretical developments and the need to apply these methods to diverse fields of inquiry.

Numerous texts have sections containing a limited discussion of logistic regression modeling but there are still very few comprehensive texts on this subject. Among the textbooks written at a level similar to this one are: Cox and Snell (1989), Collett (1991) and Kleinbaum (1994).
As was the case in our First Edition, the primary objective of the Second Edition is to provide a focused introduction to the logistic regression model and its use in methods for modeling the relationship between a categorical outcome variable and a set of covariates. Topics that have been added to this edition include: numerous new techniques for model building including determination of scale of continuous covariates; a greatly expanded discussion of assessing model performance; a discussion of logistic regression modeling using complex sample survey data; a comprehensive treatment of the use of logistic regression modeling in matched studies; completely new sections dealing with logistic regression models for multinomial, ordinal and correlated response data, exact methods for logistic regression and sample size issues. An underlying theme throughout this entire book is the focus on providing guidelines for effective model building and interpreting the resulting fitted model within the context of the applied problem.

The materials in the book have evolved considerably over the past ten years as a result of our teaching and consulting experiences. We have used this book to teach parts of graduate level survey courses, quarter- or semester-long courses, and focused short courses to working professionals. We assume that students have a solid foundation in linear regression methodology and contingency table analysis.

The approach we take is to develop the model from a regression analysis point of view. This is accomplished by approaching logistic regression in a manner analogous to what would be considered good statistical practice for linear regression. This differs from the approach used by other authors who have begun their discussion from a contingency table point of view. While the contingency table approach may facilitate the interpretation of the results, we believe that it obscures the regression aspects of the analysis.
Thus, discussion of the interpretation of the model is deferred until the regression approach to the analysis is firmly established.

To a large extent there are no major differences in the capabilities of the various software packages. When a particular approach is available in a limited number of packages, it will be noted in this text. In general, analyses in this book have been performed in STATA [Stata Corp. (1999)]. This easy to use package combines excellent graphics and analysis routines, is fast, is compatible across Macintosh, Windows and UNIX platforms and interacts well with Microsoft Word. Other
