Analysis of Correlated
Data with SAS and R
Fourth Edition
http://taylorandfrancis.com
Mohamed M. Shoukri
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2018 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed on acid-free paper
International Standard Book Number-13: 978-1-1381-9745-9 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged, please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Shoukri, M. M. (Mohamed M.), author. | Shoukri, M. M. (Mohamed M.). Analysis of correlated data with SAS and R.
Title: Statistical analysis of health data using SAS and R / Mohamed M. Shoukri.
Description: Fourth edition. | Boca Raton : CRC Press, 2018. | Previous edition: Analysis of correlated data with SAS and R / Mohamed M. Shoukri (Boca Raton : Chapman & Hall/CRC, 2007).
Identifiers: LCCN 2017050107 | ISBN 9781138197459 (hardback)
Subjects: LCSH: Epidemiology–Statistical methods. | Mathematical
Visit the Taylor & Francis Website at
http://www.taylorandfrancis.com
and the CRC Press Website at
http://www.crcpress.com
Success is not final. Failure is not fatal. It is the courage to continue that counts.
Sir Winston Churchill
To my loving wife Suhair.
Contents
Preface
1. Study Designs and Measures of Effect Size
1.1 Study Designs
1.1.1 Introduction
1.1.2 Nonexperimental or Observational Studies
1.1.3 Types of Nonexperimental Designs
1.1.3.1 Descriptive/Exploratory Survey Studies
1.1.3.2 Correlational Studies (Ecological Studies)
1.1.3.3 Cross-Sectional Studies
1.1.3.4 Longitudinal Studies
1.1.3.5 Prospective or Cohort Studies
1.1.3.6 Case-Control Studies
1.1.3.7 Nested Case-Control Study
1.1.3.8 Case-Crossover Study
1.1.4 Quasi-Experimental Designs
1.1.5 Single-Subject Design (SSD)
1.1.6 Quality of Designs
1.1.7 Confounding
1.1.8 Sampling
1.1.9 Types of Sampling Strategies
1.1.10 Summary
1.2 Effect Size
1.2.1 What Is Effect Size?
1.2.2 Why Report Effect Sizes?
1.2.3 Measures of Effect Size
1.2.4 What Is Meant by “Small,” “Medium,” and “Large”?
1.2.5 Summary
1.2.6 American Statistical Association (ASA) Statement about the p-value
Exercises
2. Comparing Group Means When the Standard Assumptions Are Violated
2.1 Introduction
2.2 Nonnormality
2.3 Heterogeneity of Variances
2.3.1 Bartlett’s Test
2.3.2 Levene’s Test (1960)
2.4 Testing Equality of Group Means
2.4.1 Welch’s Statistic (1951)
2.4.2 Brown and Forsythe Statistic (1974b) for Testing Equality of Group Means
2.4.3 Cochran’s (1937) Method of Weighing for Testing Equality of Group Means
2.5 Nonindependence
2.6 Nonparametric Tests
2.6.1 Nonparametric Analysis of Milk Data Using SAS
3. Analyzing Clustered Data
3.1 Introduction
3.2 The Basic Feature of Cluster Data
3.3 Effect of One Measured Covariate on Estimation of the Intracluster Correlation
3.4 Sampling and Design Issues
3.4.1 Comparison of Means
3.5 Regression Analysis for Clustered Data
3.6 Generalized Linear Models
3.6.1 Marginal Models (Population Average Models)
3.6.2 Random Effects Models
3.6.3 Generalized Estimating Equation (GEE)
3.7 Fitting Alternative Models for Clustered Data
3.7.1 Proc Mixed for Clustered Data
3.7.2 Model 1: Unconditional Means Model
3.7.3 Model 2: Including a Family Level Covariate
3.7.4 Model 3: Including the Sib-Level Covariate
3.7.5 Model 4: Including One Family Level Covariate and Two Subject Level Covariates
Appendix
Exercises
4. Statistical Analysis of Cross-Classified Data
4.1 Introduction
4.2 Measures of Association in 2 × 2 Tables
4.2.1 Absolute Risk
4.2.2 Risk Difference
4.2.3 Attributable Risk
4.2.4 Relative Risk
4.2.5 Odds Ratio
4.2.6 Relationship between Odds Ratio and Relative Risk
4.2.7 Incidence Rate and Incidence Rate Ratio As a Measure of Effect Size
4.2.8 What Is Person-Time?
4.3 Statistical Analysis from the 2 × 2 Classification Data
4.3.1 Cross-Sectional Sampling
4.3.2 Cohort and Case-Control Studies
4.4 Statistical Inference on Odds Ratio
4.4.1 Significance Tests
4.4.2 Interval Estimation
4.5 Analysis of Several 2 × 2 Contingency Tables
4.5.1 Test of Homogeneity
4.5.2 Significance Test of Common Odds Ratio
4.5.3 Confidence Interval on the Common Odds Ratio
4.6 Analysis of Matched Pairs (One Case and One Control)
4.6.1 Estimating the Odds Ratio
4.6.2 Testing the Equality of Marginal Distributions
4.7 Statistical Analysis of Clustered Binary Data
4.7.1 Approaches to Adjust the Pearson’s Chi-Square
4.7.2 Donner and Donald Adjustment
4.7.3 Procedures Based on Ratio Estimate Theory
4.7.4 Confidence Interval Construction
4.7.5 Adjusted Chi-Square for Studies Involving More than Two Groups
4.8 Inference on the Common Odds Ratio
4.8.1 Donald and Donner’s Adjustment
4.8.2 Rao and Scott’s Adjustment
4.9 Calculations of Relative and Attributable Risks from Clustered Binary Data
4.10 Sample Size Requirements for Clustered Binary Data
4.10.1 Paired-Sample Design
4.10.2 Comparative Studies for Cluster Sizes Greater or Equal to Two
4.11 Discussion
Exercises
5. Modeling Binary Outcome Data
5.1 Introduction
5.2 The Logistic Regression Model
5.3 Coding Categorical Explanatory Variables and Interpretation of Coefficients
5.4 Interaction and Confounding in Logistic Regression
5.5 The Goodness of Fit and Model Comparisons
5.5.1 The Pearson’s χ² Statistic
5.5.2 The Likelihood Ratio Criterion (Deviance)