
Log-Linear Models PDF

420 Pages·1990·9.915 MB·English

Preview Log-Linear Models

Springer Texts in Statistics
Advisors: Stephen Fienberg, Ingram Olkin

Springer Texts in Statistics

Alfred: Elements of Statistics for the Life and Social Sciences
Blom: Probability and Statistics: Theory and Applications
Chow and Teicher: Probability Theory: Independence, Interchangeability, Martingales. Second Edition
Christensen: Plane Answers to Complex Questions: The Theory of Linear Models
Christensen: Linear Models for Multivariate, Time Series, and Spatial Data
Christensen: Log-Linear Models
du Toit, Steyn and Stumpf: Graphical Exploratory Data Analysis
Finkelstein and Levin: Statistics for Lawyers
Kalbfleisch: Probability and Statistical Inference: Volume 1: Probability. Second Edition
Kalbfleisch: Probability and Statistical Inference: Volume 2: Statistical Inference. Second Edition
Keyfitz: Applied Mathematical Demography. Second Edition
Kiefer: Introduction to Statistical Inference
Kokoska and Nevison: Statistical Tables and Formulae
Madansky: Prescriptions for Working Statisticians
McPherson: Statistics in Scientific Investigation: Its Basis, Application, and Interpretation
Nguyen and Rogers: Fundamentals of Mathematical Statistics: Volume 1: Probability for Statistics

(continued after index)

Ronald Christensen

Log-Linear Models

Springer Science+Business Media, LLC

Ronald Christensen
Department of Mathematics and Statistics
University of New Mexico
Albuquerque, NM 87131
USA

Editorial Board

Stephen Fienberg
Department of Statistics
Carnegie-Mellon University
Pittsburgh, PA 15213
USA

Ingram Olkin
Department of Statistics
Stanford University
Stanford, CA 94305
USA

Mathematics Subject Classification: 62H17

Library of Congress Cataloging-in-Publication Data
Christensen, Ronald R.
Log-linear models / Ronald Christensen.
p. cm. - (Springer texts in statistics)
Includes bibliographical references and index.
1. Log-linear models. I. Title. II. Series.
QA278.C49 1990    90-42812
519.5'35-dc20     CIP

Printed on acid-free paper.

© 1990 by Springer Science+Business Media New York
Originally published by Springer-Verlag New York Inc. in 1990.
Softcover reprint of the hardcover 1st edition 1990

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Photocomposed copy prepared from the author's LaTeX file.

9 8 7 6 5 4 3 2 1

ISBN 978-1-4757-4113-1
ISBN 978-1-4757-4111-7 (eBook)
DOI 10.1007/978-1-4757-4111-7

To Sharon and Fletch

Preface

This book examines log-linear models for contingency tables. Logistic regression and logistic discrimination are treated as special cases, and generalized linear models (in the GLIM sense) are also discussed. The book is designed to fill a niche between basic introductory books such as Fienberg (1980) and Everitt (1977) and advanced books such as Bishop, Fienberg, and Holland (1975), Haberman (1974), and Santner and Duffy (1989).
It is primarily directed at advanced Masters degree students in Statistics but it can be used at both higher and lower levels. The primary theme of the book is using previous knowledge of analysis of variance and regression to motivate and explicate the use of log-linear models. Of course, both the analogies and the distinctions between the different methods must be kept in mind.

The book is written at several levels. A basic introductory course would take material from Chapters I, II (deemphasizing Section II.4), III, Sections IV.1 through IV.5 (eliminating the material on graphical models), Section IV.10, Chapter VII, and Chapter IX. The advanced modeling material at the end of Sections VII.1, VII.2, and possibly the material in Section IX.2 should be deleted in a basic introductory course. For Masters degree students in Statistics, all the material in Chapters I through V, VII, IX, and X should be accessible. For an applied Ph.D. course or for advanced Masters students, the material in Chapters VI and VIII can be incorporated. Chapter VI recapitulates material from the first five chapters using matrix notation. Chapter VIII recapitulates Chapter VII. This material is necessary (a) to get standard errors of estimates in anything other than the saturated model, (b) to explain the Newton-Raphson (iteratively reweighted least squares) algorithm, and (c) to discuss the weighted least squares approach of Grizzle, Starmer, and Koch (1969). I also think that the more general approach used in these chapters provides a deeper understanding of the subject. Most of the material in Chapters VI and VIII requires no more sophistication than matrix arithmetic and being able to understand the definition of a column space. All of the material should be accessible to people who have had a course in linear models. Throughout the book, Chapter XV of Christensen (1987) is referenced for technical details. For completeness, and to allow the book to be used in nonapplied Ph.D. courses, Chapter XV has been reprinted in this volume under the same title, Chapter XV.

The prerequisites differ for the various courses described above. At a minimum, readers should have had a traditional course in statistical methods. To understand the vast majority of the book, courses in regression, analysis of variance, and basic statistical theory are recommended. To fully appreciate the book, it would help to already know linear model theory.

It is difficult for me to understand, but many of my acquaintance view me as quite opinionated. While I admit that I have not tried to keep my opinions to myself, I have tried to clearly acknowledge them as my opinions.

There are many people I would like to thank in connection with this work. My family, Sharon and Fletch, were supportive throughout. Jackie Damrau did an exceptional job of typing the first draft. The folks at BMDP provided me with copies of 4F, LR, and 9R. MINITAB provided me with Versions 6.1 and 6.2. Dick Lund gave me a copy of MSUSTAT. All of the computations were performed with this software or GLIM. Several people made valuable comments on the manuscript; these include Rahman Azari, Larry Blackwood, Ron Schrader, and Elizabeth Slate. Joe Hill introduced me to statistical applications of graph theory and convinced me of their importance and elegance. He also commented on part of the book. My editors, Steve Fienberg and Ingram Olkin, were, as always, very helpful. Like many people, I originally learned about log-linear models from Steve's book.
Two people deserve special mention for how much they contributed to this effort. I would not be the author of this book were it not for the amount of support provided in its development by Ed Bedrick and Wes Johnson. Wes provided much of the data used in the examples.

I suppose that I should also thank the legislature of the state of Montana. It was their penury, while I worked at Montana State University, that motivated me to begin the project in the spring of 1987. If you don't like the book, blame them!

Ronald Christensen
Albuquerque, New Mexico
April 5, 1990 (Happy Birthday Dad)

Contents

Preface  vii

I  Introduction  1
  I.1  Conditional Probability and Independence  1
  I.2  Random Variables and Expectations  11
  I.3  The Binomial Distribution  12
  I.4  The Multinomial Distribution  14
  I.5  The Poisson Distribution  18
  I.6  Exercises  20

II  Two-Dimensional Tables  23
  II.1  Two Independent Binomials  23
  II.2  Testing Independence in a 2 x 2 Table  29
  II.3  I x J Tables  33
  II.4  Maximum Likelihood Theory for Two-Dimensional Tables  41
  II.5  Log-Linear Models for Two-Dimensional Tables  46
  II.6  Exercises  53

III  Three-Dimensional Tables  61
  III.1  Simpson's Paradox and the Need for Higher Dimensional Tables  62
  III.2  Independence and Odds Ratio Models for Three-Dimensional Tables  63
  III.3  Iterative Computation of Estimates  78
  III.4  Log-Linear Models for Three-Dimensional Tables  81
  III.5  Product-Multinomial and Other Sampling Plans  91
  III.6  Exercises  95

IV  Higher Dimensional Tables  99
  IV.1  Model Interpretations and Graphical Models  100
  IV.2  Collapsing Tables  113
  IV.3  Stepwise Procedures for Model Selection  115
  IV.4  Initial Models for Selection Methods  118
  IV.5  Example of Stepwise Methods  128
  IV.6  Aitkin's Method of Backward Selection  137
  IV.7  Model Selection Among Decomposable and Graphical Models  143
  IV.8  Model Selection Criteria  149
  IV.9  Residuals and Influential Observations  154
  IV.10  Drawing Conclusions  161
  IV.11  Exercises  162

V  Models for Factors with Quantitative Levels  167
  V.1  Models for Two-Factor Tables  168
  V.2  Higher Dimensional Tables  175
  V.3  Unknown Factor Scores  177
  V.4  Exercises  183

VI  The Matrix Approach to Log-Linear Models  185
  VI.1  Maximum Likelihood Theory for Multinomial Sampling  189
  VI.2  Asymptotic Results  193
  VI.3  Product-Multinomial Sampling  210
  VI.4  Inference for Model Parameters  213
  VI.5  Methods for Finding Maximum Likelihood Estimates  216
  VI.6  Regression Analysis of Categorical Data  218
  VI.7  Residual Analysis and Outliers  224
  VI.8  Exercises  230

VII  Response Factors and Logistic Discrimination  233
  VII.1  Logit Models  236
  VII.2  Logit Models for a Multinomial Response Factor  246
  VII.3  Logistic Regression  255
  VII.4  Logistic Discrimination and Allocation  269
  VII.5  Recursive Causal Models  279
  VII.6  Exercises  294

VIII  The Matrix Approach to Logit Models  300
  VIII.1  Estimation and Testing for Logistic Models  300
  VIII.2  Asymptotic Results  308
  VIII.3  Model Selection Criteria for Logistic Regression  316
  VIII.4  Likelihood Equations and Newton-Raphson  318
  VIII.5  Weighted Least Squares for Logit Models  320
  VIII.6  Multinomial Response Models  322
  VIII.7  Discrimination, Allocation, and Retrospective Data  323
  VIII.8  Exercises  330
