A Pocket Guide to Epidemiology

David G. Kleinbaum
Department of Epidemiology
Rollins School of Public Health
Emory University
1518 Clifton Road, NE
Atlanta, GA 30322
USA

Kevin M. Sullivan
Department of Epidemiology
Rollins School of Public Health
Emory University
1518 Clifton Road, NE
Atlanta, GA 30322
USA

Nancy D. Barker
2465 Traywick Chase
Alpharetta, GA 30004
USA

Library of Congress Control Number: 2006933294
ISBN-10: 0-387-45964-2    e-ISBN-10: 0-387-45966-9
ISBN-13: 978-0-387-45964-6    e-ISBN-13: 978-0-387-45966-0

Printed on acid-free paper.

© 2007 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

springer.com

Preface

Four years ago (2002), I (DGK) authored a unique educational program, ActivEpi (Springer Publishers), developed in CD-ROM format to provide a multimedia interactive “electronic textbook” on basic principles and methods of epidemiology. In 2003, the ActivEpi Companion Text, authored by myself (DGK), KM Sullivan and ND Barker and also published by Springer, was developed to provide a hard copy of the material contained in the ActivEpi CD-ROM.
The CD-ROM contains 15 chapters, with each consisting of a collection of “activities” including narrated expositions, interactive study questions, quizzes, homework questions, and web links to relevant references on the Internet.

In the nearly three years since the publication of the ActivEpi CD-ROM, we have received several suggestions from instructors of introductory epidemiology courses as well as health and medical professionals to produce an abbreviated version that narrows the discussion to the most “essential” principles and methods. Instructors expressed to us their concern that the material covered by the CD-ROM (and likewise, the Companion Text) was too comprehensive to conveniently fit the amount of time available in an introductory course. Professionals expressed their desire for a less time-consuming version that would conveniently fit their “after hours” availability.

To address these suggestions, we have herewith produced A Pocket Guide to Epidemiology, which provides a much shorter, more “essential” version of the material covered by the ActivEpi CD-ROM and Companion Text. We realize that determining what is “essential” is not a simple task, especially since, from our point of view, the original CD-ROM was already restricted to “essential” topics. Nevertheless, to produce this text, we decided to remove from the original material a great many fine points of explanation and complicated topics/issues about epidemiologic principles and methods, with our primary goal being a “quicker read”.

A Pocket Guide to Epidemiology contains fewer than half as many pages as the ActivEpi Companion Text. We have continued to include in A Pocket Guide to Epidemiology many of the study questions and quizzes that are provided in each Lesson of the CD-ROM, but we have eliminated homework exercises, computer exercises, and Internet linkages from the original CD-ROM.
Nevertheless, we indicate throughout A Pocket Guide to Epidemiology how and where the interested reader can turn to the ActivEpi CD-ROM (or the Companion Text) to pursue more detailed information.

We authors view A Pocket Guide to Epidemiology as a stand-alone introductory text on the basic principles and concepts of epidemiology. Our primary audience for this text is the public health student or professional, clinician, health journalist, and anyone else of any age or life experience who is interested in learning what epidemiology is all about in a convenient, easy-to-understand format with timely, real-world health examples. We believe that the reader of this text will also benefit from using the multimedia learner-interactive features of the ActivEpi CD-ROM electronic textbook to further clarify and enhance what is covered in this more abbreviated (non-electronic) text. Nevertheless, we suggest that, on its own, A Pocket Guide to Epidemiology will provide the interested reader with a comfortable, time-efficient and enjoyable introduction to epidemiology.

About the Authors

David G. Kleinbaum, Kevin M. Sullivan, and Nancy Barker

David G. Kleinbaum is a Professor of Epidemiology at Emory University's Rollins School of Public Health in Atlanta, GA, and an internationally recognized expert in teaching biostatistical and epidemiological concepts and methods at all levels. He is the author of several widely acclaimed textbooks, including Applied Regression Analysis and Other Multivariable Methods, Epidemiologic Research: Principles and Quantitative Methods, Logistic Regression - A Self-Learning Text, and Survival Analysis - A Self-Learning Text. Dr. Kleinbaum has more than 25 years of experience teaching over 100 short courses on statistical and epidemiologic methods to a variety of international audiences, and has published widely in both the methodological and applied public health literature.
He is also an experienced and sought-after consultant, and is presently an ad-hoc consultant to all research staff at the Centers for Disease Control and Prevention. On a personal note, Dr. Kleinbaum is an accomplished jazz flutist, and plays weekly in Atlanta with his jazz combo, The Moonlighters Jazz Band.

Dr. Kevin M. Sullivan is an Associate Professor of Epidemiology at Emory University’s Rollins School of Public Health. He has worked in the area of epidemiology and public health for over 30 years, has over 80 publications in peer-reviewed journals, and has published chapters in several books. Dr. Sullivan has used the ActivEpi Companion Textbook and CD-ROM in a number of courses he teaches, both in traditional classroom-based courses and distance learning courses. He is one of the developers of Epi Info, a freely downloadable web-based software package for the analysis of epidemiologic data published by the Centers for Disease Control and Prevention. He is also the co-author of OpenEpi, a freely downloadable web-based calculator for epidemiologic data (www.OpenEpi.com).

Ms. Nancy Barker is a statistical consultant who formerly worked at the Centers for Disease Control and Prevention. She is an Instructor in the Career MPH at the Rollins School of Public Health at Emory University, where she teaches a distance learning course on Basic Epidemiology that uses the ActivEpi CD and ActivEpi Companion Text as the course textbooks. She has also been co-instructor in several short courses on beginning and intermediate epidemiologic methods in the Epi in Action Program sponsored by the Rollins School of Public Health.

Contents

Preface
Chapter 1   A Pocket-Size Introduction
Chapter 2   The Big Picture - With Examples
Chapter 3   How to Set Things Up? Study Designs
Chapter 4   How Often does it Happen? Disease Frequency
Chapter 5   What’s the Answer? Measures of Effect
Chapter 6   What is the Public Health Impact?
Chapter 7   Is There Something Wrong? Validity & Bias
Chapter 8   Were Subjects Chosen Badly? Selection Bias
Chapter 9   Are the Data Correct? Information Bias
Chapter 10  Other Factors Accounted For? Confounding and Interaction
Chapter 11  Confounding can be Confounding - Several Risk Factors
Chapter 12  Simple Analyses - 2x2 Tables are Not That Simple
Chapter 13  Control - What it’s All About
Chapter 14  How to Deal with Lots of Tables - Stratified Analysis
Chapter 15  Matching - Seems Easy, but not That Easy
Index

CHAPTER 1
A POCKET-SIZE INTRODUCTION

Epidemiology is the study of health and illness in human populations. For example, a randomized clinical trial conducted by epidemiologists at the Harvard School of Public Health showed that taking aspirin reduces heart attack risk by 20 to 30 percent. Public health studies in the 1950s demonstrated that smoking cigarettes causes lung cancer. Environmental epidemiologists have been evaluating the evidence that living near power lines may carry a high risk for childhood leukemia. Cancer researchers wonder why older women are less likely to be screened for breast cancer than younger women. All of these are examples of epidemiologic research, because they all attempt to describe the relationship between a health outcome and one or more explanations or causes of that outcome. All of these examples share several challenges: they must choose an appropriate study design, they must be careful to avoid bias, and they must use appropriate statistical methods to analyze the data. Epidemiology deals with each of these three challenges.

CHAPTER 2
THE BIG PICTURE - WITH EXAMPLES

The field of epidemiology was initially concerned with providing a methodological basis for the study and control of population epidemics. Now, however, epidemiology has a much broader scope, including the study of both acute and chronic diseases, the quality of health care, and mental health problems.
As the focus of epidemiologic inquiry has broadened, so has the methodology. In this overview chapter, we describe examples of epidemiologic research and introduce several important methodological issues typically considered in such research.

The Sydney Beach Users Study

Epidemiology is primarily concerned with identifying the important factors or variables that influence a health outcome of interest. In the Sydney Beach Users Study, the key question was “Is swimming at the beaches in Sydney associated with an increased risk of acute infectious illness?”

In Sydney, Australia, throughout the 1980s, complaints were expressed in the local news media that the popular public beaches surrounding the city were becoming more and more unsafe for swimming. Much of the concern focused on the suspicion that the beaches were being increasingly polluted by waste disposal.

In 1989, the New South Wales Department of Health decided to undertake a study to investigate the extent to which swimming and possible pollution at 12 popular Sydney beaches affected the public's health, particularly during the summer months when the beaches were most crowded. The primary research question of interest was: are persons who swim at Sydney beaches at increased risk for developing an acute infectious illness?

The study was carried out by selecting subjects on the beaches throughout the summer months of 1989-90. Those subjects eligible to participate at this initial interview were then followed up by phone a week later to determine swimming exposure on the day of the beach interview and subsequent illness status during the week following the interview. Water quality measurements at the beaches were also taken on each day that subjects were sampled in order to match swimming exposure to pollution levels at the beaches.
Analysis of the study data led to the overall conclusion that swimming in polluted water carried a statistically significant 33% increased risk for an infectious illness when compared to swimming in non-polluted water. These results were considered by health department officials and the public alike to confirm that swimming at Sydney beaches posed an important health problem. Consequently, the state and local health departments, together with other environmental agencies in the Sydney area, undertook a program to reduce sources of pollution of beach water that led to improved water quality at the beaches during the 1990s.

Summary

• The Sydney Beach Users Study is an example of the application of epidemiologic principles and methods to investigate a localized public health issue.
• The key question in the Sydney Beach Users Study was:
    o Does swimming at the beaches in Sydney, Australia (in 1989-90) pose an increased health risk for acute infectious illnesses?
    o The conclusion was yes, a 33% increased risk.

Important Methodological Issues

We provide a general perspective of epidemiologic research by highlighting several broad issues that arise during the course of most epidemiologic investigations. There are many issues to worry about when planning an epidemiologic research study (see Box below). In this chapter we will begin to describe a list of broad methodological issues that need to be addressed. We will illustrate each issue using the previously described Sydney Beach Users Study of 1989.

Issues to consider when planning an epidemiologic research study

Question    Define a question of interest and key variables
Variables   What to measure: exposure (E), disease (D), and control (C) variables
Design      What study design and sampling frame?
Frequency   Measures of disease frequency
Effect      Measures of effect
Bias        Flaws in study design, collection, or analysis
Analysis    Perform appropriate analyses

The first two issues require clearly defining the study question of interest, followed by specifying the key variables to be measured. Typically, we first should ask: What is the relationship of one or more hypothesized determinants to a disease or health outcome of interest? A determinant is often called an exposure variable and is denoted by the letter E. The disease or health outcome is often denoted as D. Generally, variables other than exposure and disease that are known to predict the health outcome must be taken into account. We often call these variables control variables and denote them using the letter C.

Next, we must determine how to actually measure these variables. This step requires determining the information-gathering instruments and survey questionnaires to be obtained or developed.
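
As a rough illustration of how an exposure variable (E) and a disease variable (D) combine into a statement such as the "33% increased risk" reported for the Sydney Beach Users Study, the sketch below compares the risk of illness in an exposed group with the risk in an unexposed group. The counts are hypothetical, chosen only so that the ratio comes out near 1.33; they are not the study's data. The function name risk_ratio and its parameters are likewise assumptions made for this example.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return the risk of disease among the exposed divided by the risk among the unexposed."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group totals must be positive")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("risk in the unexposed group is zero; ratio undefined")
    return risk_exposed / risk_unexposed


# Hypothetical counts (NOT the Sydney study data):
# E = swam in polluted water (exposed) vs. non-polluted water (unexposed)
# D = acute infectious illness reported during the follow-up week
rr = risk_ratio(exposed_cases=200, exposed_total=1000,
                unexposed_cases=150, unexposed_total=1000)

# A risk ratio of 1.33 corresponds to a 33% increase in risk.
percent_increase = (rr - 1) * 100
print(f"risk ratio = {rr:.2f}, increased risk = {percent_increase:.0f}%")
```

This crude comparison ignores control variables (C). Taking them into account requires the methods for confounding and stratified analysis treated in later chapters.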