Event History Analysis with R PDF

234 Pages·2012·2.539 MB·English

Preview Event History Analysis with R

The R Series
Statistics

Event History Analysis with R
Göran Broström

With an emphasis on social science applications, Event History Analysis with R presents an introduction to survival and event history analysis using real-life examples. Keeping mathematical details to a minimum, the book covers key topics, including both discrete and continuous time data, parametric proportional hazards, and accelerated failure times.

Features
• Introduces parametric proportional hazards models with baseline distributions like the Weibull, Gompertz, Lognormal, and Piecewise constant hazard distributions, in addition to traditional Cox regression
• Presents mathematical details as well as technical material in an appendix
• Includes real examples with applications in demography, econometrics, and epidemiology
• Provides a dedicated R package, eha, containing special treatments, including making cuts in the Lexis diagram, creating communal covariates, and creating period statistics

A much-needed primer, Event History Analysis with R is a didactically excellent resource for students and practitioners of applied event history and survival analysis.

K11534

Chapman & Hall/CRC
The R Series

Series Editors
John M. Chambers, Department of Statistics, Stanford University, Stanford, California, USA
Torsten Hothorn, Institut für Statistik, Ludwig-Maximilians-Universität, München, Germany
Duncan Temple Lang, Department of Statistics, University of California, Davis, Davis, California, USA
Hadley Wickham, Department of Statistics, Rice University, Houston, Texas, USA

Aims and Scope
This book series reflects the recent rapid growth in the development and application of R, the programming language and software environment for statistical computing and graphics. R is now widely used in academic research, education, and industry. It is constantly growing, with new versions of the core software released regularly and more than 2,600 packages available. It is difficult for the documentation to keep pace with the expansion of the software, and this vital book series provides a forum for the publication of books covering many aspects of the development and application of R.

The scope of the series is wide, covering three main threads:
• Applications of R to specific disciplines such as biology, epidemiology, genetics, engineering, finance, and the social sciences.
• Using R for the study of topics of statistical methodology, such as linear and mixed modeling, time series, Bayesian methods, and missing data.
• The development of R, including programming, building packages, and graphics.

The books will appeal to programmers and developers of R software, as well as applied statisticians and data analysts in many fields. The books will feature detailed worked examples and R code fully integrated into the text, ensuring their usefulness to researchers, practitioners and students.

Published Titles
Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R, Robert E. Krider and Daniel S. Putler
Event History Analysis with R, Göran Broström
Programming Graphical User Interfaces with R, John Verzani and Michael Lawrence
R Graphics, Second Edition, Paul Murrell
Statistical Computing in C++ and R, Randall L. Eubank and Ana Kupresanin

The R Series
Event History Analysis with R
Göran Broström
Professor of Statistics
Centre for Population Studies
Umeå University
Umeå, Sweden

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2012 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20120106
International Standard Book Number-13: 978-1-4398-3167-0 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Contents

List of Figures
List of Tables
Preface

1 Event History and Survival Data
  1.1 Introduction
  1.2 Survival Data
  1.3 Right Censoring
  1.4 Left Truncation
  1.5 Time Scales
  1.6 Event History Data
  1.7 More Data Sets

2 Single Sample Data
  2.1 Introduction
  2.2 Continuous Time Model Descriptions
  2.3 Discrete Time Models
  2.4 Nonparametric Estimators
  2.5 Doing it in R

3 Cox Regression
  3.1 Introduction
  3.2 Proportional Hazards
  3.3 The Log-Rank Test
  3.4 Proportional Hazards in Continuous Time
  3.5 Estimation of the Baseline Hazard
  3.6 Explanatory Variables
  3.7 Interactions
  3.8 Interpretation of Parameter Estimates
  3.9 Proportional Hazards in Discrete Time
  3.10 Model Selection
  3.11 Male Mortality

4 Poisson Regression
  4.1 Introduction
  4.2 The Poisson Distribution
  4.3 The Connection to Cox Regression
  4.4 The Connection to the Piecewise Constant Hazards Model
  4.5 Tabular Lifetime Data

5 More on Cox Regression
  5.1 Introduction
  5.2 Time-Varying Covariates
  5.3 Communal covariates
  5.4 Tied Event Times
  5.5 Stratification
  5.6 Sampling of Risk Sets
  5.7 Residuals
  5.8 Checking Model Assumptions
  5.9 Fixed Study Period Survival
  5.10 Left- or Right-Censored Data

6 Parametric Models
  6.1 Introduction
  6.2 Proportional Hazards Models
  6.3 Accelerated Failure Time Models
  6.4 Proportional Hazards or AFT Model?
  6.5 Discrete Time Models

7 Multivariate Survival Models
  7.1 Introduction
  7.2 Frailty Models
  7.3 Parametric Frailty Models
  7.4 Stratification

8 Competing Risks Models
  8.1 Introduction
  8.2 Some Mathematics
  8.3 Estimation
  8.4 Meaningful Probabilities
  8.5 Regression
  8.6 R Code for Competing Risks

9 Causality and Matching
  9.1 Introduction
  9.2 Philosophical Aspects of Causality
  9.3 Causal Inference
  9.4 Aalen’s Additive Hazards Model
  9.5 Dynamic Path Analysis
  9.6 Matching
  9.7 Conclusion

A Basic Statistical Concepts
  A.1 Introduction
  A.2 Statistical Inference
  A.3 Asymptotic theory
  A.4 Model Selection

B Survival Distributions
  B.1 Introduction
  B.2 Relevant Distributions in R
  B.3 Parametric Proportional Hazards and Accelerated Failure Time Models

C A Brief Introduction to R
  C.1 R in General
  C.2 Some Standard R Functions
  C.3 Writing Functions
  C.4 Graphics
  C.5 Probability Functions
  C.6 Help in R
  C.7 Functions in eha and survival
  C.8 Reading Data into R

D Survival Packages in R
  D.1 Introduction
  D.2 eha
  D.3 survival
  D.4 Other Packages

Bibliography
Index

List of Figures

1.1 Survival data
1.2 Right censoring
1.3 A Lexis diagram
1.4 Marital fertility
1.5 The illness-death model
2.1 Life table and survival function
2.2 Interpretation of the density function
2.3 Interpretation of the hazard function
2.4 The geometric distribution
2.5 A simple survival data set
2.6 Preliminaries for estimating the hazard function
2.7 Nonparametric estimation of the hazard function
2.8 The Nelson–Aalen estimator
2.9 The Kaplan–Meier estimator
2.10 Male mortality, Nelson–Aalen and Kaplan–Meier plots
2.11 Male mortality, Weibull fit
3.1 Proportional hazard functions
3.2 Proportional hazard functions, log scale
3.3 Two-sample data
3.4 Old age mortality, women vs. men, cumulative hazards
3.5 Old age mortality by birthplace, cumulative hazards
3.6 Effect of proportional hazards
3.7 Estimated cumulative baseline hazard function
3.8 Effects, interaction with centering
3.9 Effects, interaction without centering
4.1 The Poisson cdf
4.2 Number of children beyond one for married women
4.3 Theoretical Poisson distribution
4.4 Hazard functions for males and females, model based
4.5 Hazard functions for males and females, raw data
5.1 A time-varying covariate
5.2 Log rye price deviations from trend
5.3 Nelson–Aalen plot with the aid of the function risksets
