ebook img

Modeling X-linked ancestral origins in multiparental populations PDF

56 Pages·2015·0.54 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Modeling X-linked ancestral origins in multiparental populations

G3: Genes|Genomes|Genetics Early Online, published on March 4, 2015 as doi:10.1534/g3.114.016154 Modeling X-linked ancestral origins in multiparental 1 populations 2 ChaozhiZheng∗ 3 Biometris,WageningenUniversityandResearchCentre,TheNetherlands 4 March1,2015 5 Runninghead: ModelingX-linkedancestralorigins 6 Keywords: advancedintercrosslines(AIL),collaborativecross(CC),diversityoutcross(DO), 7 Drosophilasyntheticpopulationresource(DSPR),mapexpansion. 8 ∗ Correspondingauthor: 9 ChaozhiZheng 10 Biometris 11 WageningenUniversityandResearchCentre 12 POBox16,6700AAWageningen 13 TheNetherlands 14 Phone: +31317480806 15 Email: [email protected] 16 1 © The Author(s) 2013. Published by the Genetics Society of America. Abstract 17 The models for the mosaic structure of an individual’s genome from multiparental populations 18 have been developed primarily for autosomes, whereas X chromosomes receive very little at- 19 tention. In this paper, we extend our previous approach to model ancestral origin processes 20 along two X chromosomes in a mapping population, which is necessary for developing hid- 21 den Markov models in the reconstruction of ancestry blocks for X-linked quantitative trait lo- 22 cus (QTL) mapping. The model accounts for the joint recombination pattern, the asymmetry 23 between maternally and paternally derived X chromosomes, and the finiteness of population 24 size. The model can be applied to various mapping populations such as the advanced inter- 25 cross lines (AIL), the Collaborative Cross (CC), the heterogeneous stock (HS), the Diversity 26 Outcross (DO), and the Drosophila synthetic population resource (DSPR). We further derive 27 the map expansion, density (per Morgan) of recombination breakpoints, in advanced intercross 28 populations with L inbred founders under the limit of an infinitely large population size. The 29 analytic results show that for X chromosomes the genetic map expands linearly at a rate (per 30 generation)oftwo-thirdstimes1−10/(9L)fortheAIL,andatarateoftwo-thirdstimes1−1/L 31 for the DO and the HS, whereas for autosomes the map expands at a rate of 1 − 1/L for the 32 AIL,theDO,andtheHS. 33 2 Introduction 34 There have been recently designed QTL mapping populations with either multiple parents to 35 increase the genetic diversity of the founder population, or many intercross generations to im- 36 provethemappingresolutionbyaccumulatinghistoricalrecombinationevents. Someexamples 37 38 include the Collaborative Cross (CC) (CHURCHILL et al. 2004), the advanced intercross lines 39 (AIL) (DARVASI and SOLLER 1995), the heterogeneous stock (HS) (MOTT et al. 2000), the 40 diversity outcross (DO) (SVENSON et al. 2012), and the Drosophila synthetic population re- 41 source (DSPR) (KING et al. 2012). The CC can be regarded as a set of eight-way recombinant inbredlines(RIL)bysiblingmating,whereeightfoundersofeachlinearepermuted. 42 ThegenomesofindividualsinQTLmappingpopulationsarerandommosaicsofthefounders’ 43 genomes. The QTL mapping generally necessitates the reconstruction of these genome blocks 44 along two homologous chromosomes of a sampled individual from available genotype data. 45 Such reconstruction is often performed under a hidden Markov model (HMM) with the latent 46 state being the pair of ancestral origins at a locus, where the transition probability of ancestral 47 originsbetweentwoloci,orthetwo-locusdiplotype(two-haplotype)probabilitiesarerequired. 48 Modelingancestraloriginsalongapairofautosomalchromosomeshasbeenwelldeveloped 49 50 recently. BROMAN (2012a) extended the approach of HALDANE and WADDINGTON (1931) from the two-way to four- and eight-way RIL by sibling mating, and provided recipes for cal- 51 52 culating autosomal two-locus diplotype probabilities numerically. JOHANNES and COLOME- 53 TATCHE (2011) derived autosomal two-locus diplotype probabilities for the two-way RIL by 54 selfing. ZHENG et al. (2014) described a general modeling framework for ancestral origins that can be applied to autosomes in various mapping populations such as the RIL by selfing or 55 siblingmatingandtheAIL. 56 A special treatment is required for modeling ancestral origins along a pair of X chromo- 57 58 somes. HALDANE and WADDINGTON (1931) derived the recurrence relations of the X-linked two-locusdiplotypeprobabilitiesforthetwo-wayRILbysiblingmatingandthebi-parentalre- 59 peated parent-offspring mating, and their closed form solutions for the final homozygous lines. 60 61 BROMAN (2005) extended the solutions to the two- and three-locus haplotype probabilities for 62 thetwo,four,oreight-wayRILbysiblingmating. BROMAN (2012b)derivedtheX-linkedtwo- 3 locushaplotypeprobabilitiesinadvancedintercrosspopulationsincludingtheAIL,theHS,and 63 theDO,assuminganinfinitelylargepopulationsize. 64 65 In this paper, we extend our previous work (ZHENG et al. 2014) to model the ancestral origins along a pair of X chromosomes in a finite mapping population. This extension also 66 67 builds on the theory of junctions in inbreeding (FISHER 1949, 1954). A junction is defined as a boundary point of genome blocks on chromosomes where two distinct ancestral origins 68 meet, and the boundary points that occur at the same location along multiple chromosomes are 69 counted as a single junction. The map expansion is the expected junction density (per Morgan) 70 on a maternally or paternally derived X chromosome, denoted by Rm or Rp, respectively. We 71 denotebyρmp theoveralljunctiondensityalongtheXXchromosomesofafemale,anditcanbe 72 73 usedasameasureofX-linkedQTLmappingresolution(DARVASIandSOLLER1995;WELLER 74 and SOLLER 2004). The key feature of this extension is to account for the asymmetry between maternally and 75 paternally derived X chromosomes because the latter did not experience any crossover events 76 withYchromosomes. WefirstpresentamodelframeworkforX-linkedancestralorigins,where 77 the recurrent relations are derived for various junction densities including the map expansions 78 Rm and Rp. Then, we derive the closed form solutions for these expected densities in mapping 79 populationsincludingtheRILbysiblingmating,theAIL,theHS,theDO,andtheDSPR;they 80 are evaluated by forward simulation studies. Lastly, we discuss the model assumptions and the 81 implicationsoftheanalyticresultsonhaplotypereconstructionsandbreedingdesigns. 82 A model for X-linked ancestral origins 83 Assumptions and notation 84 Consideradioeciouspopulationwithtwoseparatesexes: homogameticfemaleswithsexchro- 85 mosomesXXandheterogameticmaleswithsexchromosomesXY.Therearenorecombination 86 events between X and Y, and thus we ignore the pseudoautosomal regions on the XY chro- 87 mosomes. As in most mammals and some insects (Drosophila), some flowering species, such 88 as white campion (Silene latifolia), papaya (Carica papaya), and asparagus (Asparagus offi- 89 4 90 cianalis), have the XY sex determination system (MING and MOORE 2007). The dioecious population was founded in generation 0, and it has non-overlapping generations. There are no 91 natural or artificial selections since the founder population. The mating schemes of produc- 92 ing the next generation are random, and they may vary from one generation to the next. The 93 assignmentsofoffspringgendersareassumedtobeindependentofmatingschemes. 94 The ancestral origins along two homologous autosomes have been modeled as a continu- 95 96 ous time Markov chain (CTMC) (ZHENG et al. 2014). We extend the approach to account for the asymmetry of XX chromosomes, using superscript m (p) for maternally (paternally) 97 derived genes or chromosomes. See Table S1 for a list of symbols used in this paper. Let 98 O(x) = (Om(x),Op(x))betheorderedpairoftheancestraloriginsatlocationxalongthetwo 99 X chromosomes of a randomly sampled female. The ancestral origin process O(x) is assumed 100 to follow a CTMC , where x is the time parameter in unit of Morgan. We assign a unique an- 101 cestral origin to the X chromosomes of each inbred founder, or to each X chromosome of each 102 outbred founder. Let L be the number of possible ancestral origins that Om(x) or Op(x) may 103 take. L may be less than the number of inbred founders if some male founders did not produce 104 daughters to pass down their X chromosomes. For example, L = 3 for the four-way RIL by 105 siblingmatingsinceoneofthefoundermatingpairsproducesonlyoneson(Figure1A). 106 The L possible ancestral origins are assumed to be exchangeable, so that we focus on the 107 changesofancestralorigins. SeeFigure1BandtherelevantpartofDiscussionontheexchange- 108 ability assumption. The initial distribution of O(0) at the leftmost locus x = 0 is specified by 109 αmp(11),aprobabilitythatthetwoancestraloriginsarethesame(identicalbydescent,IBD)at 110 alocus. Letαmp(12) = 1−αmp(11)bethenon-IBDprobability. GiveneitherIBDornon-IBD 111 at the locus, the ancestral origin pair O(0) takes each of the possible combinations with equal 112 probability. 113 The transition rate matrix of the CTMC can be constructed from the expected densities 114 Jmp(abcd) of all the junction types (abcd) along the two X chromosomes of a female. The 115 junction type (abcd) denotes the four-gene IBD configuration (abcd) on both sides of a junc- 116 tion,whereab(cd)isontheleft-hand(right-hand)side,haplotypeac(bd)isonthefirst(second) 117 chromosome, and the same integers denote IBD. Figure 1C illustrates the seven types of junc- 118 5 tions: (1112), (1121), (1122), (1211), (1213), (1222), and (1232) for L ≥ 3, where the two 119 types (1213) and (1232) do not exist for L = 2. We do not define junction types for the eight 120 two-locus configurations (1111), (1123), (1212), (1221), (1223), (1231), (1233), and (1234), 121 since there are either zero or no less than two junctions between the two loci. Figure 1D shows 122 thetransitionratematrixoftheCTMCinthefour-wayRILbysiblingmating. Figure1Eshows 123 therelationshipsbetweentheexpecteddensitiesJmp(abcd)andthetransitionrates,andtheyare 124 derived based on the interpretation that Jmp(abcd)∆d is the two-locus diplotype probability, in 125 thelimitthatthegeneticdistance∆d(inMorgan)betweentwolocigoestozero. 126 ThemapexpansionsRm andRp andtheoverallexpectedjunctiondensityρmp aregivenby 127 Rm =Jmp(1121)+Jmp(1122)+Jmp(1222)+Jmp(1232), (1) Rp =Jmp(1112)+Jmp(1122)+Jmp(1211)+Jmp(1213), (2) ρmp =Rm +Rp −Jmp(1122), (3) 128 similar to those for autosomes (ZHENG et al. 2014) except that Rm (cid:54)= Rp for X chromosomes. We have Jmp(1112) = Jmp(1211) and Jmp(1121) = Jmp(1222), since the junction densities 129 do not depend on the direction of chromosomes. In contrast to the single-locus two-gene non- 130 IBD probability αmp(12), the ordering of the superscripts in Jmp(abcd) generally does matter, 131 that is, Jmp(abcd) (cid:54)= Jpm(abcd) except for the junction type (1122). In addition, we have 132 Jmp(1213) = Jpm(1232) (see Figure 1C). Thus, the CTMC of X-linked ancestral origins can 133 be described by one non-IBD probability αmp(12) and the five expected junction densities Rm, 134 Rp, Jmp(1122), Jmp(1232), and Jpm(1232), under the exchangeability assumption of the L 135 possibleancestralorigins. 136 Single-locus non-IBD probabilities 137 The calculation of the expected junction densities necessitates the introduction of the proba- 138 bilities for the two- and three-gene IBD configurations at a single locus. All the following 139 derivationsoftherecurrencerelationsfortheseprobabilitiesarebasedontheMendelianinher- 140 6 itance of X-linked genes: a paternally derived gene must be a copy of the maternally derived 141 gene in a male of the previous generation, and a maternally derived gene has equal probability 142 ofbeingacopyofeitherthematernallyderivedgeneorthepaternallyderivedgeneinafemale 143 ofthepreviousgeneration. 144 In a dioecious mapping population, the single-locus two-gene probabilities of IBD con- 145 figuration (ab) depend on whether the two homologous genes are in a single individual or 146 not. Thus, we denote by βmm(ab), βmp(ab), and βpp(ab) the two-gene probability of IBD 147 configuration (ab), given that the two homologous genes are in two distinct individuals in gen- 148 eration t and have parental origins mm, mp, and pp, respectively (Figure 2A); it holds that 149 βpm(ab) = βmp(ab). 150 The recurrence relations of the two-gene non-IBD probabilities are derived by tracing the 151 parental origins of two homologous genes from generation t ≥ 1 into the previous generation, 152 andtheyaregivenby 153 (cid:20) (cid:21) 1 1 1 1 βmm(12) =sm αmp(12)+(1−sm) βmm(12)+ βmp + βpp (4a) t t 2 t−1 t 4 t−1 2 t−1 4 t−1 1 1 βmp(12) = βmm(12)+ βmp(12) (4b) t 2 t−1 2 t−1 βpp(12) =(1−sp)βmm(12) (4c) t t t−1 αmp(12) =βmp(12) (4d) t t whereequation(4d)holdsimmediatelyafteronegenerationofrandommating,althoughitmay 154 not hold in the founder population at t = 0. In equation (4a), the first term on the right-hand 155 sidereferstothescenariothatthetwogeneswithparentoriginsmmingenerationtcomefrom 156 asinglefemaleofthepreviousgenerationwiththeprobabilitysm,andwithprobability1/2that 157 t they come from different genes of the female. In equation (4b), the two genes with parental 158 origins mp cannot merge since they must come from one male and one female of the previous 159 generation. In equation (4c), the two genes with parental origins pp in generation t come from 160 a single male of the previous generation with the probability sp; if so, they must merge since 161 t thereisonlyoneXchromosomeinamale. 162 7 We introduce the single-locus three-gene probabilities of IBD configuration (abc). Let 163 βmmm(abc), βmmp(abc), βmpp(abc), and βppp(abc) be the probabilities of IBD configuration 164 (abc), given that the three homologous genes are in three distinct individuals in generation t 165 and have parental origins mmm, mmp, mpp, and ppp, respectively (Figure 2B). Similarly, we 166 define αmmp(abc) and αmpp(abc) for three homologous genes in two distinct individuals. The 167 ordering of the superscripts does not matter for these three-gene probabilities, for example, 168 βmmp(abc) = βmpm(abc) = βpmm(abc). 169 The recurrence relations of the three-gene non-IBD probabilities are derived by tracing the 170 parentaloriginsofthreehomologousgenesfromgenerationt ≥ 1intothepreviousgeneration, 171 andtheyaregivenby 172 (cid:20) (cid:21) 1 1 1 βmmm(123) =3qm αmmp(123)+ αmpp(123) +(1−sm −2qm) t t 2 2 t−1 2 t−1 t t (5a) (cid:20) (cid:21) 1 3 3 1 × βmmm(123)+ βmmp(123)+ βmpp(123)+ βppp(123) 8 t−1 8 t−1 8 t−1 8 t−1 1 βmmp(123) =sm αmmp(123)+(1−sm) t t 2 t−1 t (cid:20) (cid:21) (5b) 1 1 1 × βmmm(123)+ βmmp(123)+ βmpp(123) 4 t−1 2 t−1 4 t−1 (cid:20) (cid:21) 1 1 βmpp(123) =(1−sp) βmmm(123)+ βmmp(123) (5c) t t 2 t−1 2 t−1 βppp(123) =(1−sp −2qp)βmmm(123) (5d) t t t t−1 αmmp(123) =βmmp(123) (5e) t t αmpp(123) =βmpp(123) (5f) t t where qm is the coalescence probability of three maternally derived genes in generation t that 173 t a particular pair of genes come from a single female of the previous generation and the third 174 comes from another female of the previous generation, and similarly qp for three paternally 175 t derivedgenes. Theequations(5e,5f)holdimmediatelyafteronegenerationofrandommating, 176 althoughtheymaynotholdinthefounderpopulationatt = 0. 177 Thederivationsoftherecurrenceequations(5a–5d)forthethree-genenon-IBDprobabilities 178 are similar to equations (4a–4c) for the two-gene non-IBD probabilities. In equation (5a), the 179 8 pre-factor3denotesthateachofthethreepossiblepairsofgenesmaycomefromasinglefemale 180 ofthepreviousgeneration;theterm(1−sm −2qm)istheprobabilitythatthethreematernally 181 t t derivedgenesingenerationtcomefromthreedistinctfemalesofthepreviousgeneration,andit 182 isobtainedbytheprobability1−smthatonepairofgenescomefromtwodistinctfemalesminus 183 t theprobability2qm thatthethirdgeneandeithergeneofthepaircomefromasinglefemaleof 184 t the previous generation. Similarly, the term (1−sp −2qp) in equation (5d) is the probability 185 t t that the three paternally derived genes in generation t come from three distinct males of the 186 previousgeneration. 187 Expected junction densities 188 We derive the recurrence relations for Rm, Rp, Jmp(1122), Jmp(1232), and Jpm(1232). The 189 190 recurrencerelationforRm followsfromthetheoryofjunctions(FISHER 1954): anewjunction is formed whenever a recombination event occurs between two X chromosomes that are non- 191 IBDatthelocationofacrossover. TherecurrencerelationsforthemapexpansionsRm andRp 192 aregivenby 193 1 1 Rm = Rm + Rp +αmp(12), (6a) t 2 t−1 2 t−1 t−1 Rp = Rm , (6b) t t−1 where equation (6b) follows directly from no recombination events occurring between the XY 194 chromosomesinamaleofthepreviousgeneration. 195 Tomeasuredifferentialmapexpansionsbetweenmaternallyandpaternallyderivedchromo- 196 somes,wedefineRX = (2Rm+Rp)/3andR− = (Rm−Rp)/2,andtheirrecurrencerelations 197 t t t t t t aregivenby 198 2 RX =RX + αmp(12), (7a) t t−1 3 t−1 1 1 R− =− R− + αmp(12), (7b) t 2 t−1 2 t−1 9 accordingtotherecurrenceequations(6a,6b). Ifthereareequalnumbersofmalesandfemales 199 in the population, a randomly chosen X chromosome is maternally derived with probability 200 2/3, and it is paternally derived with probability 1/3. Thus RX can be interpreted as the map 201 t expansiononarandomlychosenXchromosome. 202 For comparisons, we denote by RA the map expansion on a random chosen autosome, and 203 t 204 anditsrecurrencerelationisgivenby(MACLEOD etal.2005; ZHENG etal.2014) RA = RA +αAA(12), (8) t t−1 t−1 whereαAA(12)referstothenon-IBDprobabilitybetweentwohomologousautosomalgenesin 205 t an individual. The equations (7a, 8) show that the map expansion RX for an X chromosome is 206 t two-thirds RA for an autosome if the non-IBD probability αAA(12) for autosomes is the same 207 t t asαmp(12)forXXchromosomes,andthesexratiois1. 208 t In addition to Jmp(abcd) and Jpm(abcd), we define Kmm(abcd), Kmp(abcd) , Kpm(abcd), 209 t t t t t and Kpp(abcd) for haplotypes ac and bd that are in two distinct individuals and have parental 210 t originsmm,mp,pm,andpp,respectively(Figure2C).Thecontributionstothejunctionsinthe 211 current generation come from either the existing junctions at the previous generation, or a new 212 junction via a crossover event. In the following, we focus on the formation of a new junction, 213 since the contributions of the existing junctions in the previous generation are similar to those 214 forthetwo-genenon-IBDprobabilitiesintherecurrenceequations(4a–4c). 215 Theschematicsoftherecurrencerelationsforjunctiontypes(1232)and(1122)areshownin 216 FigureS1. Theancestrytransitionsoftype(1122)occuronbothhaplotypesacandbdatexactly 217 the same location, and thus a new junction of type (1122) can be formed only by duplicating a 218 chromosome segment. It holds that Jmp(1122) = Jpm(1122) and Kmp(1122) = Kpm(1122) 219 t t t t becauseofthesymmetryoftype(1122). Wehave 220 10

Description:
populations with L inbred founders under the limit of an infinitely large population size The transition rate matrix of the CTMC can be constructed from the .. recurrence relation for Rm follows from the theory of junctions (FISHER
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.