METHODS IN MOLECULAR BIOLOGY™

Series Editor
John M. Walker
School of Life Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK

For further volumes: http://www.springer.com/series/7651

Statistical Human Genetics
Methods and Protocols

Edited by

Robert C. Elston
Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA

Jaya M. Satagopan
Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, NY, USA

Shuying Sun
Department of Epidemiology and Biostatistics, Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA

ISSN 1064-3745
e-ISSN 1940-6029
ISBN 978-1-61779-554-1
e-ISBN 978-1-61779-555-8
DOI 10.1007/978-1-61779-555-8
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011945439

© Springer Science+Business Media, LLC 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Humana Press, c/o Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper

Humana Press is part of Springer Science+Business Media (www.springer.com)

Preface

The recent advances in genetics, especially in the molecular techniques that have over the last quarter of a century spectacularly reduced the cost of determining genetic markers, open up a field of research that is becoming of increasing help in detecting, preventing and/or curing many diseases that afflict us. This has brought with it the need for novel methods of statistical analysis and the implementation of these methods in a wide variety of computer programs. It is our aim in this book to make these methods and programs more easily accessible to the beginner who has data to analyze, whether a student or a senior investigator. Apart from the first chapter, which defines some of the genetic terms we shall use, and the last chapter, which compares three major multipurpose programs/software packages, each chapter of this book takes up a particular analytical topic and illustrates the use of at least one piece of software that the authors have found helpful for the relevant statistical analysis of their own human genetic data. There is often more than one program that performs a particular type of analysis and, once you have used one program for a particular analysis, you may find you prefer another program, and there is a good chance you will find that the same basic analysis is described in more than one chapter of this book.

You may therefore wish to browse over several chapters, in the first place restricting your reading to only the introductory sections, which describe the underlying theory. The chapters are ordered in the approximate logical order in which human genetic studies are often conducted; so, if you are new to research in human genetics, this initial reading could serve as an introduction to the subject. Our main purpose, however, is to serve the needs of those who have already performed their study and now need to analyze their data.
The second sections of the chapters give you step-by-step instructions for running the programs and interpreting the program outputs, with extra notes in the third sections. However, although our aim is very much to offer a "do it yourself" manual, there will be times when you will need to consult a statistical geneticist, especially for the interpretation of computer output. We have tried to be fairly comprehensive in covering statistical human genetics, but we do not cover here any of the bioinformatic software for gene sequencing, which is still very much in its infancy.

Cleveland, OH, USA    Robert C. Elston
New York, NY, USA     Jaya M. Satagopan
Cleveland, OH, USA    Shuying Sun

Contents

Preface
Contributors

1 Genetic Terminology
   Robert C. Elston, Jaya M. Satagopan, and Shuying Sun
2 Identification of Genotype Errors
   Yin Y. Shugart and Ying Wang
3 Detecting Pedigree Relationship Errors
   Lei Sun
4 Identifying Cryptic Relationships
   Lei Sun and Apostolos Dimitromanolakis
5 Estimating Allele Frequencies
   Indra Adrianto and Courtney Montgomery
6 Testing Departure from Hardy–Weinberg Proportions
   Jian Wang and Sanjay Shete
7 Estimating Disequilibrium Coefficients
   Maren Vens and Andreas Ziegler
8 Detecting Familial Aggregation
   Adam C. Naj, Yo Son Park, and Terri H. Beaty
9 Estimating Heritability from Twin Studies
   Karin J.H. Verweij, Miriam A. Mosing, Brendan P. Zietsch, and Sarah E. Medland
10 Estimating Heritability from Nuclear Family and Pedigree Data
   Murielle Bochud
11 Correcting for Ascertainment
   Warren Ewens and Robert C. Elston
12 Segregation Analysis Using the Unified Model
   Xiangqing Sun
13 Design Considerations for Genetic Linkage and Association Studies
   Jérémie Nsengimana and D. Timothy Bishop
14 Model-Based Linkage Analysis of a Quantitative Trait
   Audrey H. Schnell and Xiangqing Sun
15 Model-Based Linkage Analysis of a Binary Trait
   Rita M. Cantor
16 Model-Free Linkage Analysis of a Quantitative Trait
   Nathan J. Morris and Catherine M. Stein
17 Model-Free Linkage Analysis of a Binary Trait
   Wei Xu, Shelley B. Bull, Lucia Mirea, and Celia M.T. Greenwood
18 Single Marker Association Analysis for Unrelated Samples
   Gang Zheng, Jinfeng Xu, Ao Yuan, and Joseph L. Gastwirth
19 Single-Marker Family-Based Association Analysis Conditional on Parental Information
   Ren-Hua Chung and Eden R. Martin
20 Single Marker Family-Based Association Analysis Not Conditional on Parental Information
   Junghyun Namkung
21 Allowing for Population Stratification in Association Analysis
   Huaizhen Qin and Xiaofeng Zhu
22 Haplotype Inference
   Xin Li and Jing Li
23 Multi-SNP Haplotype Analysis Methods for Association Analysis
   Daniel O. Stram and Venkatraman E. Seshan
24 Detecting Rare Variants
   Tao Feng and Xiaofeng Zhu
25 The Analysis of Ethnic Mixtures
   Xiaofeng Zhu
26 Identifying Gene Interaction Networks
   Gurkan Bebek
27 Structural Equation Modeling
   Catherine M. Stein, Nathan J. Morris, and Nora L. Nock
28 Genotype Calling for the Affymetrix Platform
   Arne Schillert and Andreas Ziegler
29 Genotype Calling for the Illumina Platform
   Yik Ying Teo
30 Comparison of Requirements and Capabilities of Major Multipurpose Software Packages
   Robert P. Igo Jr. and Audrey H. Schnell

Index

Contributors

INDRA ADRIANTO • Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
TERRI H. BEATY • Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
GURKAN BEBEK • Center for Proteomics and Bioinformatics, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, USA
D. TIMOTHY BISHOP • Section of Epidemiology and Biostatistics, Leeds Institute of Molecular Medicine, University of Leeds, Cancer Genetics Building, Leeds, UK
MURIELLE BOCHUD • Institute of Social and Preventive Medicine, University of Lausanne, Lausanne, Switzerland
SHELLEY B. BULL • Samuel Lunenfeld Research Institute of Mount Sinai Hospital and Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
RITA M. CANTOR • Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Center for Neurobehavioral Genetics, Department of Psychiatry, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
REN-HUA CHUNG • John P. Hussman Institute for Human Genomics, Leonard M. Miller School of Medicine, University of Miami, Miami, FL, USA
APOSTOLOS DIMITROMANOLAKIS • Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
ROBERT C. ELSTON • Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
WARREN EWENS • Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
TAO FENG • Department of Epidemiology and Biostatistics, Case Western Reserve University School of Medicine, Cleveland, OH, USA
CELIA M.T. GREENWOOD • Centre for Clinical Epidemiology, Lady Davis Research Institute, Jewish General Hospital, Montreal, QC, Canada; Cancer Research Society Division of Epidemiology, Department of Oncology, McGill University, Montreal, QC, Canada
JOSEPH L. GASTWIRTH • Department of Statistics, George Washington University, Washington, DC, USA
ROBERT P. IGO JR • Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
JING LI • Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA
XIN LI • Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA
EDEN R. MARTIN • John P. Hussman Institute for Human Genomics, Leonard M. Miller School of Medicine, University of Miami, Miami, FL, USA