Peaetal.BMCGenomics2013,14:61 http://www.biomedcentral.com/1471-2164/14/61 RESEARCH ARTICLE Open Access Extensive genomic characterization of a set of near-isogenic lines for heterotic QTL in maize Zea mays ( L.) Giorgio Pea1*, Htay Htay Aung1, Elisabetta Frascaroli2, Pierangelo Landi2 and Mario Enrico Pè1 Abstract Background: Despite thecrucial role that heterosis has playedin crop improvement, itsgenetic and molecular bases are still elusive. Several types ofstructured populations were used to discover thegenetic architecture underlying complex phenotypes,and several QTLrelated to heterosis were detected. However, such analyses generally lacked thestatistical power required for the detailed characterization of individual QTL. Currently, QTL introgression intonear-isogenic materials is considered themost effective strategy to this end, despite such materials inevitably contain a variable, unknown and undesired proportion of non-isogenic genome. Anintrogression program based on residual heterozygous lines allowed us to develop five pairs of maize (Zea mays L.) near-isogeniclines (NILs) suitable for thefine characterization of threemajorheterotic QTL previously detected. Here wedescribe the results ofthedetailed genomic characterization of these NILs that weundertook to establish their genotypic structure, to verify thepresence oftheexpected genotypeswithin target QTLregions,and to determine the extentand location of residual non-isogenic genomic regions. Results: The SNP genotypingapproach allowed us to determine the parent-of-origin allele for 14,937 polymorphic SNPsand to describe in detail thegenotypicstructure of allNILs. The correct introgression was confirmed for all target QTL inthe respective NIL and several non-isogenic regions were detectedgenome-wide. Possible linkage drag effectsassociated to thespecific introgressed regions were observed. The extent and position of other non-isogenic regions varied amongNIL pairs, probably deriving from random segregating sections still present at theseparationof lineages within pairs. Conclusions: The results of this work strongly suggest that the actual isogenicity and the genotypic architecture of near-isogenic materials should be monitored both during theintrogression procedure and onthefinal materials as a paramount requisite for a successful mendelization of target QTL. The informationhere gathered onthe genotypic structure of NILs will be integrated infuture experimental programs aimed at thefine mappingand isolation of major heterotic QTL, a crucial step towards the understanding ofthe molecular bases of heterosis in maize. Keywords: Genome-wide genotyping, Singlenucleotide polymorphism, Maize, Near-isogenic lines, Heterosis, Linkage drag *Correspondence:[email protected] 1ScuolaSuperioreSant’Anna,PiazzaMartiridellaLibertà33,56127,Pisa,Italy Fulllistofauthorinformationisavailableattheendofthearticle ©2013Peaetal.;licenseeBioMedCentralLtd.ThisisanOpenAccessarticledistributedunderthetermsoftheCreative CommonsAttributionLicense(http://creativecommons.org/licenses/by/2.0),whichpermitsunrestricteduse,distribution,and reproductioninanymedium,providedtheoriginalworkisproperlycited. Peaetal.BMCGenomics2013,14:61 Page2of15 http://www.biomedcentral.com/1471-2164/14/61 Background Despite the crucial role that heterosis has played in crop M BH M BB 1 1 improvement over the years, its genetic and molecular QTL BH QTL BB bases are still elusive [1-3]. Several types of structured M BH M BB populations, such as RIL (Recombinant Inbred Line) 2 2 populations, have been widely used to discover the gen- RHL-F x RIL-F 4:5 12:13 etic architecture underlying complex phenotypes, in- MAS cluding heterosis. Numerous QTL (Quantitative Trait Loci) for yield and/or yield components related to heter- M BH M BB osis were detected in maize [4-6] as well as in several 1 1 QTL BH QTL BB other species [7-18], showing a variety of effects ranging M BH M BB from dominance at various levels to epistasis. However, 2 2 QTL analysis in structured populations has not been RHL-ΨBC x RIL-F 1 12:13 able yet to provide a definite answer on the genetic na- ture of hybrid vigor. In fact, whereas structured popula- MAS tions are statistically powerful for discovering relevant M BH genomic intervals, they generally lack the statistical 1 power needed for the isolation and the detailed QTL BH characterization of individual QTL [19]. Accurate esti- M BH 2 mation of the effects of QTL, including those for heter- RHL-BC otic QTL [3], introgressed into near-isogenic materials is 1 considered the most effective strategy for their characterization and the identification of their molecular BC -S 1 1 bases; this in turn could allow incorporating specific MAS QTL into effective breeding programs [20,21]. Near- BC1-S1[BB] BC1-S1[HH] isogenic Lines (NILs) and Introgression Lines (ILs) [22] are suitable genetic materials to this aim, and can be M BB M HH used as a resource to initiate positional cloning projects BC1-S2[BB] 1 1 BC1-S2[HH] QTL BB QTL HH and to address more general questions regarding epi- NIL-BB NIL-HH M BB M HH static interactions, genome organization and genetic 2 2 linkage. General-purpose NIL or IL panels have been Figure1Introgressionschemeadoptedfortheproductionof developed for QTL analyses in maize, such as those recombinantheteroticNILs.ThetwoNILs(NIL-BBandNIL-HH)were obtained in crosses between B73 and Tx303 [23], be- designedtointrogressthetargetQTL(redbar)intheB73/B73andinthe tween B73 and Gaspe Flint [24] and between B73 and H99/H99genotype,respectively,whilebeingisogenicanywhereelsein thegenome.EachpairofNILswasindependentlyobtainedstartingfrom Mo17 [19], with the aim of providing tools for validating differentRIL-F individuals(RHL-F )selectedbecauseheterozygousat QTL andforinitiating fine-mappingexperiments. 4:5 4:5 thetargetQTL.ThesewerecrossedtwicetothecorrespondingRIL-F 11:13 In previous work undertaken to shed light on the gen- asrecurrentparent,resultingintheintrogressionoftargetQTLinto etic basis of heterosis in maize, we performed a QTL ana- differenthighlyhomozygousB73×H99recombinantgenetic lysis on genetic materials derived from a RIL-F12:13 backgrounds.Asanexample,aback-crosstoahypotheticalRIL-F12:13 homozygousB73/B73atbothflankingmarkers(M andM)is population derived from the single cross B73×H99 [4]. 1 2 represented.Applicationofmarker-assistedselection(MAS)isreported Thelevelof heterosisunderlyinggeneticeffectsforseveral alongsidearrowswhichindicate(pseudo)-backcrossandselfing agronomictraitswasevaluated,andanumberofQTLwith generations.Verticalbarsrepresentachromosomeportionofindividual heterotic effects on phenotypes were detected. With the plants;blue,pinkandgreencolordenoterespectivelygenotypesB73/ aim of identifying the molecular determinants underlying B73,H99/H99andB73/H99.GenotypesindicatedattheQTLarethose expectedinabsenceofdoublerecombinationevents.Abbreviations: heterotic QTL, we undertook an introgression program RHL,residualheterozygousline;ΨBC,pseudo-backcrossone;BC, based on residual heterozygous lines to develop on- 1 1 backcrossone;BC-S andBC-S,firstandsecondselfinggenerations. 1 1 1 2 purpose near-isogenic materials [25] suitable for the fine characterization of QTL chosen for their over-dominance effects [4]. In particular, following the crossing-and- in different genetic combinations and growing conditions selection approach schematized in Figure 1, we obtained [25,26].Intheperspectiveoffinemappingtheintrogressed pairsofNILsdesignedtohavecontrastinggenotypesatthe heterotic QTL, we chose to focus our efforts on the five selected QTLregionswithinanotherwiseisogenic recom- NILpairsintrogressingQTL3.05,4.10and10.03indiffer- binant genetic background (recombinant NILs). The pres- ent genetic backgrounds (Table 1); these showed over- ence of heterotic phenotypes in these NILs was validated dominanceforgrainyieldandnumberofkernelsperplant, Peaetal.BMCGenomics2013,14:61 Page3of15 http://www.biomedcentral.com/1471-2164/14/61 Table1LocationandexactlengthofintrogressedQTLregions QTLa Left Position Right Position Interval Introgression Markerb (bp) Markerb (bp) length linesc (Mbp) 3.05 bnlg1505 147,812,359 dupssr23 166,846,373 19.03 RIL8,RIL40 4.10 umc1101 241,805,620 umc1109 243,738,469 1.93 RIL40,RIL55 10.03 bnlg1451 4,436,646 umc2016 62,064,437 57.66 RIL63 aQTLnamescorrespondtothechromosomebinswheretheQTLweremapped.bUpstreamanddownstreammarkersusedformarker-assistedintrogression. cIntrogressionlinesaretheB73×H99RIL-F employedasrecurrentparentsintheintrogressionscheme(seeFigure1).PositionsrefertoexactBLASTNlocal 11:13 alignmentontheB73RefGen_v1maizereferencesequence(withthecorrectdistanceandorientation)ofbothforwardandreversemarkerPCRprimers(publicly availableatMaizeGDB[31]). which are important agronomic and economic traits. genotyping techniques available, those based on highly Figure 2 summarizes the performance of these five NIL parallelized polymorphisms detection are better suited to pairs and their hybrids obtained in our previous studies efficiently tackle this goal. The recent completion of the comprising six field trials with two replications per trial maizegenomereferencesequence[27],allowingthephys- ([26]andunpublisheddata).Resultsrefertograinyieldper ical mapping of SNPs (Single Nucleotide Polymorphisms) plant and its component number of kernels per plant, the and other variations to the maize chromosomes, set the two traits exhibiting the highest level of heterosis. Field path for the development and testing of highly inform- performance is expressed as percentage of the parental ative,easytouseandrobustgenotypingplatformsforthis mean,toallowacomparisonamongQTLeffects.Ourdata species. These platforms are based either upon Compara- clearlyshowsthattheheterozygotewassignificantlysuper- tiveGenomic Hybridization (CGH),whichalsoallows the ioratleasttotheparentalmeaninallinstancesbutkernel detection of Copy Number Variations (CNVs) relative to per plant for 3.05 R8. In two cases, 4.10 R40 and 10.03 the reference genome [28], Genotyping by Sequencing R63, the heterozygote was even strongly superior to the (GBS) methods [29] or large scale SNP genotyping arrays best homozygote. Taken together, these data corroborate [30].Thelatterplatform, developedbyIllumina, Inc.(San our previous observations [4,25], justifying the more Diego, CA, USA) under the name of MaizeSNP50 Bead- detailed investigation here presented on these NILs. In Chip, is based on SNP markers selected to be preferen- fact,althoughdesignedaspairsofinbredlinesthatideally tially located in genes and evenly distributed across the differ by a single and well defined region, NILs inevitably genome and it has been tested with a large set of maize contain a variable proportion of non-isogenic genome. germplasm, including North American and European in- This is mainly due to linkage drag, that carries segments bred lines, parent/hybrid combinations, and distantly linked to the one targeted for introgression, and to the relatedteosintematerials. presenceofotherresidualunlinkednon-isogenicgenomic Herewe describe the results ofthe detailed genotyping, fragments. Therefore, to perform an accurate research, a obtained by using the Illumina MaizeSNP50 BeadChip detailed picture of the genotypic structure of the genetic [30], of five NIL pairs that we specifically produced [25] materialunderstudyisparamount.Amongthenumerous for the introgression of three heterotic QTL [4]. The GRAIN YIELD KERNELS PER PLANT 130 130 A B mean value 120 mean value120 % of the overall parental 11019000 % of the overall parental 11019000 BBHPBHMH 80 80 3.05 R8 3.05 R40 4.10 R40 4.10 R55 10.03 R63 3.05 R8 3.05 R40 4.10 R40 4.10 R55 10.03 R63 Figure2PerformanceofheteroticNILsforgrainyieldandkernelsperplant.Performance,aspercentageoftheoverallparentalmean acrosssixfieldtrials(PM,redline),ofthethreegenotypesB73/B73(BB,blue),H99/H99(HH,pink)andB73/H99(BH,green)forgrainyield(panelA) andforkernelsperplant(panelB)ofQTL3.05,4.10and10.03inallfiveNILsets.VerticalbarsrepresentthestandarderroraspercentageofPM. Peaetal.BMCGenomics2013,14:61 Page4of15 http://www.biomedcentral.com/1471-2164/14/61 objectivesofthisstudywere:(1)toassessthereliabilityof considered pedigree consistency (see Methods) to the full theMaizeSNP50BeadChipplatformanddetermineitsde- SNPdataset(genotypecallsandqualitystatusofallSNPsis scriptivepowerontheyetuntestedB73×H99geneticsys- reportedinAdditionalfile2).Briefly,thefollowingnumber tem; (2) to establish the detailed genetic structure of five of SNPs were excluded at each subsequent filtering step: NIL pairs produced to introgress three heterotic QTL in 8,319 failed in all samples; 5,386 failed in any samples different recombinant genetic backgrounds; (3) to verify among B73-SSA, H99 and B73-SSA×H99 F hybrid; 618 1 the presence of the expected genotypes within the target unmapped or mapped to the unknown chromosome; 114 QTL introgression regions; (4) to compare genotypic pat- heterozygous in either of the NIL parental lines (of which ternswithinNILpairsinordertoestimatetheextentand 29inB73-SSA,81inH99and4inboth);625havingincon- positionofresidualnon-isogenicgenomicregions. sistent genotypes in the B73-SSA×H99 F hybrid; and, fi- 1 nally,5havinginconsistentgenotypeinNILssamples. Results The descriptive power of the MaizeSNP50 BeadChip SNPqualitycontrolandgenomicdistributionintheB73× withrespecttotheB73×H99geneticsystemwasassessed H99geneticsystem bydeterminingthegenomicdistributionoftheSNPwork- Before proceeding to the genetic characterization of NILs, ingsetamongandwithinthemaizechromosomes(Table2 we assessed the reliability of the MaizeSNP50 BeadChip and Figure 3). The distribution of the 42,771 SNPs platform and determined its descriptive power on the yet (Table 2, “Total”) was found significantly non-uniform untested B73×H99 genetic system. To this aim, we per- among chromosomes due to a significant higher- and formed thorough quality controls of SNPs and analyzed lower-than-expected number of SNPs in chromosomes 1 their genomic distribution based upon the genotypic calls and 4, respectively. The same pattern was also observed obtainedfortheNILs’parentalinbredlines(B73andH99), when considering all the 55,326 SNPs with known fortheirF hybridandfortworeferenceB73samples. chromosomepositionoriginallyincludedinthechip(data 1 InordertoassesstheconsistencyofSNPgenotypecalls not shown), indicating that the overall SNP distribution across replicate samples, we compared to each other the wasnotbiasedbytheappliedfilteringcriteria. twoB73referencesamplesincludedasinternalcontrolsin The comparison ofgenotype calls between B73-SSA and the experiment (see Methods for details on the adopted H99 showed that 14,937 SNPs, about 35% of the retained filteringcriteria for thisspecific task). They wereidentical SNPs, are polymorphic between the two inbred lines, with forallthe47,937consideredSNPsandweresubsequently proportionsrangingfrom31.1%inchromosome5to38.5% treatedasasinglesample(indicatedas“B73”).B73sample in chromosome 7 (Table 2, “Polymorphic”). The ratios of wasthencomparedtoinbredlineB73fromScuolaSuper- polymorphic vs. monomorphic SNPs (Table 2, PM ratio) iore Sant’Anna (B73-SSA, i.e., the parental line of the ori- resulted overall non-independent from the chromosome ginal B73×H99 cross) in order to determine the level of wheretheSNPsweremapped.Pairwisetestsindicatedthat divergence between samples from different seed stocks. this was due to significant deviations from independence The full list of the B73 vs. B73-SSAcompared SNPgeno- forallchromosomesexceptchromosomes6and10.Inpar- typesisavailableinAdditionalfile1.Atotalof121and21 ticular, chromosomes 3, 4, 7 and 9 showed PM ratios SNPs showed a heterozygous genotype in either B73 or higher than expected under the hypothesis of independ- B73-SSA, respectively. The vast majority of SNPs (47,556 ence,whereastheoppositewastrueforchromosomes1,2, SNPs, 99.2%) showed an identical genotype between the 5and8.Allnon-independentPMratios(seeTable2)could two B73 samples, 14 SNPs being heterozygous in both be accounted for by various combinations of significant samples. The discordant genotypes at the remaining 381 non-uniform distributions among chromosomes of either SNPs (0.79%) were due either to a heterozygous state in the polymorphic (P) or the monomorphic (M) SNPs (i.e., onesamplebutnotintheother(142SNPs)ortodiscord- the numerator or the denominator of the PM ratio, re- ant homozygous genotypes (239 SNPs). Ninety-five per- spectively), or both. For example, the significantly low PM cent (362) of the total discordant SNPs were mapped ratio observed in chromosome 2 corresponds to a signifi- among chromosomes 2 (110 SNPs, 28.9% of discordant cant depletion of polymorphic SNPs only, whereas in SNPs),4(149SNPs,39.1%)and5(103SNPs,27.0%),most chromosome5itisduetopolymorphicandmonomorphic of them being within single uninterrupted clusters cover- SNPs being at the same time, respectively, less and more ing in total about 27 Mbp along intervals 202.7–208.6 thanexpectedbytheiruniformdistribution. Mbp on chromosome 2, 234.6–246.5 Mbp on chromo- Thegenome-widedistributionofdistancesbetweenadja- some4and140.4–149.5Mbponchromosome5. cent polymorphic SNPs is shown in Figure 3. The overall Beforeproceedingtotheanalysisofthegeneticstructure mean distance is 137 kbp, with means per chromosome of NILs, a working subset of 42,771 good quality SNPs ranging from 125 kbp in chromosome 7 to 150 kbp in (73.9%ofthe57,838SNPsavailableonthechip)wasidenti- chromosome 5 (data not shown). Half of the polymorphic fiedbyapplyingstringentqualityfilteringcriteriawhichalso SNPs fall within 41 kbp from each other and 95% within Peaetal.BMCGenomics2013,14:61 Page5of15 http://www.biomedcentral.com/1471-2164/14/61 Table2SNPsgenomicdistributionamongchromosomes Chromosomes SNPdistribution No. Length %a Total Polymorphic Monomorphic PMratio (Mbp) (T) (P) (M) No. %b No. %c No. %c 1 300.2 14.7% 6,812 15.9%▲ 2,272 33.4% 4,540 66.6%▲ ▼ 2 234.8 11.5% 4,910 11.5% 1,625 33.1%▼ 3,285 66.9% ▼ 3 230.6 11.3% 4,697 11.0% 1,730 36.8% 2,967 63.2%▼ ▲ 4 247.1 12.1% 4,770 11.2%▼ 1,752 36.7% 3,018 63.3%▼ ▲ 5 216.9 10.6% 4,640 10.8% 1,444 31.1%▼ 3,196 68.9%▲ ▼ 6 169.3 8.3% 3,468 8.1% 1,238 35.7% 2,230 64.3% 7 171.0 8.4% 3,539 8.3% 1,361 38.5%▲ 2,178 61.5%▼ ▲ 8 174.5 8.5% 3,712 8.7% 1,212 32.7% 2,500 67.3%▲ ▼ 9 152.4 7.4% 3,151 7.4% 1,192 37.8%▲ 1,959 62.2%▼ ▲ 10 149.7 7.3% 3,072 7.2% 1,111 36.2% 1,961 63.8% Total 2,046 100% 42,771 100% 14,937 34.9%d 27,834 65.1%d aChromosomelengthovertotalgenomelength;bnumberofSNPsperchromosomeovertotalnumberofSNPs;cnumberofpolymorphic(P)andmonomorphic (M)SNPsoverthetotalnumberofSNPsmappedontherespectivechromosome;dnumberofpolymorphic(P)andmonomorphic(M)SNPsovertotalnumber SNPs.Up/downarrowsindicatethesignofsignificant(P<0.05)deviationsfromtheexpectedSNPdistributionamongchromosomesforthecorrespondingtested nullhypotheses,namely:uniformdistributionoftotalSNPs(T);uniformdistributionofpolymorphicSNPs(P);uniformdistributionofmonomorphicSNPs(M); independentdistributionofpolymorphicvs.monomorphicSNPsfromtheirmappingchromosome(PMratio). 525 kbp. Only 210 SNP intervals have sizes exceeding 1 101.3-108.4 Mbp), the largest annotated maize centromere Mbp, 90% of which (187 SNPs) being below 3 Mbp. The byafactoroffive[31]. maximum SNP interval sizes observed per single chromo- Polymorphic SNPs were also found significantly non- some range from 2.9 Mbp in chromosome 3 to 33.1 Mbp uniformly distributed within each of the ten maize chro- inchromosome5.Noticeably,thislatterinterval,byfarthe mosomes arbitrarily divided in 10 Mbp bins (Figure 4). largest detected (the second one being 8.4 Mbp in size), is Orthogonal pairwise chi-square tests corrected for mul- mapped between two SNPs at 89.3 and 122.5 Mbp and is tiple tests with Benjamini-Hochberg False Discovery Rate centered around the centromere of chromosome 5 (cent5, (FDR<5%) allowed us to identify significantly low- and 10,000 100% 9,000 90% 2 5 8 8,000 7, 80% 7,000 70% 6,000 60% y 5,000 50% c n e u q 4,000 40% e Fr 3,000 30% 5 SNP frequency 2 8 2,000 1, 76 Cumulative % 20% 2 1,000 1, 948 649 479 377 300 239 171 137 102 87 79 44 43 40 27 28 24 210 10% 0 0% Distance (bp) Figure3Genome-widedistributionofdistancesbetweenadjacentpolymorphicSNPs.TotalnumberofSNPs:14,937;averagedistance:137 kbp;median:41kbp;95thpercentile:525kbp. Peaetal.BMCGenomics2013,14:61 Page6of15 http://www.biomedcentral.com/1471-2164/14/61 Interval (Mbp) chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 0-10 0 10-20 20-30 30-40 40-50 50-60 50 60-70 70-80 80-90 90-100 100-110 100 110-120 120-130 130-140 140-150 150-160 150 160-170 170-180 180-190 190-200 200-210 200 210-220 220-230 230-240 240-250 250-260 250 260-270 270-280 280-290 290-300 300-310 300 Legend: No. of polymorphic SNPs per bin Deviation from uniform distribution (5% FDR) Features 0 (min) higher than expected annotated centromere 40 not significant introgressed QTL region 80 lower than expected incomplete terminal bin 120 160 (max) Figure4DistributionofpolymorphicSNPwithinchromosomes.Binsize:10Mbp.Bincolorsindicatesignificantnon-uniformdistributionof SNPs(seetext).Red:over-representationofpolymorphicSNPs;Blue:under-representationofpolymorphicSNPs.Thickboxesindicatebins containingtheannotatedcentromeres;greenverticalbarsdenotebinscontainingtheintrogressedQTL;greyrectanglesindicateincomplete terminalbins(excludedfromtests). high-polymorphicbinswithineachchromosome,i.e.,bins represented genome), included 24 out of 42 (61.5%) low- comprising less or more polymorphic SNPs than those polymorphic bins, against only 4 out of 39 (9.5%) high- expectedunderthenullhypothesisoftheiruniformdistri- polymorphic ones. Noticeably, annotated centromeres map bution within each chromosome. A total of 81 out of the withinsignificantlylow-polymorphicbinsinallinstancesex- 200 bins (40.5%) were classified as significantly low- (42, ceptchromosomes2and7.Higherresolution(2Mbpbins) blue color) or high-polymorphic (39, red color), the former heat-maps representing the distribution on the maize chro- and the latter being preferentially observed in centromeric mosomesofallgood-qualitySNPsandoftheproportionof andtelomericregions,respectively.Infact,thefirstfivebins polymorphic SNPs within each bin, respectively, are pro- located at the two telomeres of each chromosome (i.e. 100 videdinAdditionalfile3. “telomeric” bins, corresponding to half of the represented genome), included 71.8% (28 out of 39) of the high- AssessmentofalleleinheritancepatternsinNILs polymorphic bins and only 15.4% (6 out of 42) of the low- Fivepairs ofmaize recombinantNILswereanalyzed (see polymorphic ones. Conversely, the 5 bins per chromosome Table 1 and Methods for a full description and nomen- centered on the bin including the annotated centromere clature), each designed to introgress one of three heter- (i.e. 50 “centromeric” bins, corresponding to 25% of the otic QTL mapping on chromosome bins 3.05, 4.10 and Peaetal.BMCGenomics2013,14:61 Page7of15 http://www.biomedcentral.com/1471-2164/14/61 10.03 [25]. Each pair consists of two inbred lines, named 61 and 14 heterozygous in NIL-BB and NIL-HH only, NIL-BB and NIL-HH, designed as to introgress the tar- respectively). Cases (1) and (2) could be accounted for get QTL in the B73/B73 and in the H99/H99 genotype, bytheco-dominantnatureofSNPmarkers(i.e.,hetero- respectively (Figure 1). Two pairs of NILs were available zygous calls are obtained when genetically heteroge- for QTL3.05 and QTL4.10, whereas only one pair was neous samples are scored in bulk), whereas cases (3) produced for QTL10.03. The full list of SNP genotype and (4) are not expected by segregation. However, the calls and a summary of genotypes distributions among factthatthe3SNPsofcase(3)delimitaregionbetween chromosomes of polymorphic SNPs in all NIL samples 157.3 and 176.2 Mbp on chromosome 4 where other 86 are reported in Additional file 2 and Additional file 4, adjacent SNPs show consistent genotypes (i.e.,homozy- respectively. Failure rates in NILs averaged 0.86%, gouscontrastinggenotypesinNIL-BBandNIL-HHand ranging from 0.19% in NIL10.03_R63-BB to 3.2% in a corresponding heterozygous genotype in NIL-BH), NIL4.10_R55-HH, with an observed residual heterozy- suggests that in this case the observed genotypic incon- gosity of 0.92% (ranging from 0.04% in NIL3.05_R8-BB sistencymightbeactuallyduetoSNPcallinginaccuracy to 3.2% in NIL4.10_R55-HH). The lowest and the high- in the NIL-BH sample. Besides invoking possible SNP est average heterozygosity level per single chromosome call errors in any of the three samples, a reasonable ex- were observed in chromosomes 8 (0.18%) and 5 (3.3%), planation for the 75 SNPs of case (4) could reside in a respectively. Thegenome-wideproportion ofB73 homo- “sampling effect”, that is, only NIL-BH individuals homo- zygous SNPs over all the polymorphic SNPs ranged zygous at these SNPs were actually genotyped, having between 27.7% in NIL4.10_R55-HH and 51.1% in beenselectedbyeffectofchanceamongindividualsinfact NIL3.05_R8-BB. segregatingina1:1proportion. Graphicalrepresentationsofthepatternofallelicinherit- ance along chromosomes of each NIL (Figure 5) were AssessmentofisogenicityinNILs obtainedbyplottingcolor-codedgenotypesofallnon-failed A summary of the number, chromosomal distribution and polymorphic SNP against their positions on the B73 size of all detected non-isogenic regions is reported in RefGen_v1maizereferencegenome[27].QTLregions3.05, Table3.Forthesakeofsimplicity,thespecificchromosome 4.10 and 10.03 (Table 1) include respectively 122, 40 and containing the QTL introgressed in each NIL was treated 392 polymorphic good-quality SNPs on the maize Mai- separatelyfromtheotherchromosomes,andnodistinction zeSNP50chip,correspondingintheordertoSNPdensities was made between fully homozygous and partially hetero- of 6.57, 20.69 and 7.15 SNP/Mbp. Analysis of SNP geno- zygousnon-isogenicregions.Withtheexclusionofthespe- typic data indicates that all target QTL regions were intro- cific chromosomes containing the QTL introgressed in gressed as expected in all the respective NIL pair(s), i.e., as each pair, the overall number of distinct non-isogenic homogeneous regions of homozygous B73/B73 and H99/ blocks per NIL pair fell between 10 (3.05_R8) and 19 H99genotypesinNIL-BBandNIL-HH,respectively. (3.05_R40), with an average block length comprised be- We evaluated also the cross between NIL4.10_R55-BB tween 3.6 and 13.7 Mbp in NIL pairs 10.03_R63 and and NIL4.10_R55-HH, where the heterozygosity level 3.05_R8, respectively. The estimated proportion of non- was the highest detected among all NIL materials (834 isogenic genome, also excluding the QTL chromosome, heterozygous SNPs). Out of the 14,036 polymorphic ranged from 2.4% in NIL pair 10.03_R63 to 11.5% in pair SNPs successfully scored in all three samples of set 3.05_R40. The largest proportions of non-isogenic regions 4.10_R55, genotype inheritance in NIL-BH with respect innon-QTLchromosomeswereobservedforchromosome to the genotype observed in its two parental lines (NIL- 5 in NIL pairs 3.05_R8 (54.3%) and 4.10_R40 (55.0%), fol- BB and NIL-HH) could be unambiguously confirmed for lowed by chromosome 10 in pair 3.05_R40 (44.9%), 13,858 SNPs (98.7%). Among these, 705 of the 805 chromosome8inpair4.10_R40(26.3%),chromosome7in (87.6%) heterozygous SNPs overall detected in NIL-BH pair 4.10_R55 (20.7%). In three of the remaining cases, corresponded to contrasting homozygous genotypes in non-isogenic regions represented 10 to 20% of each NIL-BB and NIL-HH. For the remaining 178 non- chromosome,whereasinallremaininginstancestheywere matching SNPs (1.3%), four different situations were below 10%. The average observed non-isogenicity per observed: (1) a heterozygous genotype in NIL-BH when chromosome, excluding in each case the QTL chromo- one of the parents was heterozygous (96 SNPs in total, somes in the corresponding NILs, was 6.8% and ranged 44 heterozygous only in NIL-BB and 52 only in NIL- from 0.51% in chromosome 4 (n=3) to 22.0% in chromo- HH); (2) a heterozygous genotype in all three samples some 5 (n=5). The largest proportions of non-isogenic (4 SNPs); (3) a homozygous genotype in NIL-BH and regions within QTL specific chromosomes (excluding the contrasting homozygous genotypes in parental NILs QTL region) were observed in both NIL pairs for QTL (3 SNPs); (4) a homozygous genotype in NIL-BH when 3.05, with 63.7% and 46.9% in NIL R40 and R8, respect- oneoftheparentswasheterozygous(75SNPs,ofwhich ively. Non-isogenicity for the QTL chromosome were very Peaetal.BMCGenomics2013,14:61 Page8of15 http://www.biomedcentral.com/1471-2164/14/61 NIL pair3.05_R8 NIL pair 3.05_R40 chr chr chr 1 chr 1 chr 2 chr 2 chr 3 chr 3 chr 4 chr 4 chr 5 chr 5 chr 6 chr 6 chr 7 chr 7 chr 8 chr 8 chr 9 chr 9 chr 10 chr 10 05 0 100 150 200 250 300 05 0 100 150 200 250 300 Position (Mbp) Position (Mbp) NIL pair 4.10_R40 NIL pair 4.10_R55 chr chr chr 1 chr 1 chr 2 chr 2 chr 3 chr 3 chr 4 chr 4 chr 5 chr 5 chr 6 chr 6 chr 7 chr 7 chr 8 chr 8 chr 9 chr 9 chr 10 chr 10 05 0 100 150 200 250 300 05 0 100 150 200 250 300 Position (Mbp) Position (Mbp) NIL pair10.03_R63 Legend: chr chr 1 B73/B73 chr 2 H99/H99 SNP genotype chr 3 chr 4 B73/H99 chr 5 Isogenic(sharedg enotype) chr 6 SNP chr 7 isogenicity GenotypeNIL-HH chr 8 Non-isogenic status chr 9 GenotypeNIL-BB chr 10 TargetQTL region 05 0 100 150 200 250 300 Position (Mbp) Figure5GenotypicstructureandisogenicityofNILpairs.GraphssynthesizinginformationbothontheoverallgenotypicstructureofNILs (i.e.,positionandsizeofrecombinationblocks,givenbycolors)andongenotypeandpositionofnon-isogenicregions(unalignedplottinglines). EachpanelrepresentcontrastingNILswithinapair.GenotypesatSNPsarecolorcoded(B73/B73=blue;H99/H99=pink;B73/H99=green).For eachchromosome,non-isogenicregionsareplottedastwoseparatelinesabove(SNPscoreinNIL-HH)andbelow(SNPscoreinNIL-BB)ashared lineinthemiddle,whereSNPswiththesamegenotypeinbothNILs(i.e.,isogenic)areplotted.RedbarsrepresenttheQTLintrogressionregions. similar to each other for the two NIL pairs for QTL 4.10, inFigure5.Non-isogenicregionsofdifferentextenthav- 15.9% and 17.1% in pair R40 and R55 respectively, and ing concordant genotype with the QTL region were much lower than that of the two pairs for QTL 3.05. Fi- found immediately upstream and downstream of the nally,theobservedresidualnon-isogenicityinchromosome flanking markers defining each QTL region. The pres- 10fortheNILpair10.03_R63was6.0%. ence of these regions could be ascribed to linkage drag The chromosome positions of all non-isogenic regions effect specifically associated with the markers employed detected for each NIL pair,alongwith the corresponding for the introgression of each QTL. Non-isogenicity SNP genotypes of the individual NILs can be observed around QTL 3.05 extended asymmetrically by 17.3 Mbp Peaetal.BMCGenomics2013,14:61 Page9of15 http://www.biomedcentral.com/1471-2164/14/61 Table3Summarystatisticsofthegenomicdistributionofnon-isogenicregionsinNILpairs Chromosome Statsa NIL3.05 NIL3.05 NIL4.10 NIL4.10 NIL10.03 R8 R40 R40 R55 R63 1 NB 1 2 - 6 3 NS 15 123 - 183 148 TS 2.5 17.2 - 52.4 17.1 CF 0.84% 5.7% - 17.5% 5.7% 2 NB - 2 2 3 3 NS - 113 55 96 155 TS - 16.5 8.1 11.3 17.6 CF - 7.0% 3.4% 4.8% 7.5% 3 NB 5b 5b 1 3 2 NS 762 1,023 52 171 46 TS 108.0 146.8 3.8 15.7 6.3 CF 46.9% 63.7% 1.6% 6.8% 2.7% 4 NB - 3 1b 3b 1 NS - 35 275 251 2 TS - 3.8 39.3 42.2 0.00 CF - 1.5% 15.9% 17.1% 0.0002% 5 NB 4 1 2 - - NS 490 10 459 - - TS 117.8 1.2 119.3 - - CF 54.3% 0.54% 55.0% - - 6 NB 1 3 - - - NS 25 82 - - - TS 1.4 6.7 - - - CF 0.85% 4.0% - - - 7 NB 1 3 2 4 1 NS 3 21 22 273 25 TS 0.15 1.8 2.0 35.4 2.3 CF 0.09% 1.0% 1.2% 20.7% 1.3% 8 NB 2 3 3 1 1 NS 94 225 398 2 16 TS 13.4 22.6 45.9 0.00 1.2 CF 7.7% 13.0% 26.3% 0.00% 0.68% 9 NB - - 3 - 2 NS - - 97 - 14 TS - - 10.3 - 1.9 CF - - 6.8% - 1.2% 10 NB 1 2 3 - 1b NS 10 417 144 - 69 TS 1.6 67.2 17.9 - 9.0 CF 1.1% 44.9% 12.0% - 6.0% Totalc NB 10(15) 19(24) 16(17) 17(20) 13(14) NS 637(1,399) 1,026(2,049) 1,227(1,502) 725(976) 406(475) TS 137.0(245.0) 136.9(283.7) 207.3(246.6) 114.8(157.0) 46.3(55.2) GF 7.5%(12.0%) 7.5%(13.9%) 11.5%(12.1%) 6.4%(7.7%) 2.4%(2.7%) Number,chromosomaldistributionandsizeofdetectednon-isogenicblocksineachNILpairarereported.Datareferringtothechromosomescontainingthe introgressedQTLforthecorrespondingNILpairareunderlined.aStatistics:NB=numberofnon-isogenicblocks;NS=numberofnon-isogenicSNPs;TS=totalsize ofnon-isogenicblocks(Mbp);CF=non-isogenicchromosomefraction;GF=non-isogenicgenomefraction.B73RefGen1genomesize:2,046,341,370bp.bQTL introgressionregions(seeTable1)areexcludedfromwithin-chromosomecomputations;cTotalsarereportedbothexcludingandincluding(inbrackets)datafrom thechromosomecontainingtheintrogressedQTLinthecorrespondingNILpair. Peaetal.BMCGenomics2013,14:61 Page10of15 http://www.biomedcentral.com/1471-2164/14/61 in NIL pair 3.05_R8 (13.4 Mbp upstream and 3.9 Mbp isogenic blocks could be observed in NIL pairs 3.05_R40 downstream) and by 145.2 Mbp in NIL pair 3.05_R40 and 4.10_R40, clearly a manifestation of their relatedness (128.1 Mbp upstream and 17.1 Mbp downstream). In the since they were developed in the same genetic RIL 40 former pair, an additional non-isogenic region of 86.7 background. MbpdetectedfurtherupstreamtotheQTLregionlargely overlapped with the more extended non-isogenic region Discussion detectedinNILpair 3.05_R40. Acomparablelinkagedrag In this study we undertook the analysis of the detailed effect observed in the two genetic backgrounds suggests genotypic structure of near-isogenic materials specifically that it might be specifically associated with the markers produced for the introgression of three heterotic QTL in employedforintrogressingtheQTL.InNILpair3.05_R40, maize that we detected and then further characterized twoadditionaladjacentnon-isogenicregionsharboringop- [4,25,26].TheNILpairsherebyanalyzedrepresentunique posite homozygous genotypes, being probably the effect of material for the study of hybrid vigor in maize, since the twocloserecombinationeventsaroundpositions19.2–19.6 analysis of mendelizedheteroticQTLmight shedlight on Mbp and 12.2 Mbp, were also detected immediately somerelevant,andpossiblygeneral,geneticandmolecular preceding the upstream non-isogenic extended region, mechanisms underlying this phenomenon. Given this expanding it of additional 9.4 Mbp (up to position premise,itisparamountthatgreateffortsaremadetoac- 10,302,495bp).InthissameNILpair,asmallsub-segment quireadetailedknowledgeofthegeneticstructureofsuch within the QTL region (19 adjacent SNPs between materials, possibly on a whole-genome scale. In fact, ac- positions 152.0 and 154.4 Mbp) resulted isogenic, and curatedeterminationoftheactualisogenicityofNILsata homozygous H99/H99, in the two contrasting NILs. Non- genome-wide level must not be neglected, since conclu- isogenicity around QTL 4.10 also extended beyond the sions derived on the effects of QTL mendelizing in NIL QTL boundaries in both NIL pairs, also suggesting the materialslargelyrelyuponit.Pursuingthisgoal,wegeno- presenceoflinkagedrag,althoughofamorelimitedextent. typedtheoriginalinbredlines(B73-SSAandH99)andthe In this case, non-isogenic regions extended by about 39.3 recombinant NILs obtained from their cross by the Illu- Mbp (39.0 Mbp upstream and 0.3 Mbp downstream) in mina MaizeSNP50 BeadChip. This platform allows the NIL pair R40 and by about 6.9 Mbp (4.9 Mbp upstream scoring of predetermined genotype variants at more than and 2.0 Mbp downstream) in NIL pair R55. Albeit non 50,000 SNPs selected upon a large maize diversity panel contiguous,thenon-isogenicregioninpair4.10_R55might and was recently tested on several US inbred lines [30]. be considered to extended upstream for further 20.8 Mbp The Illumina MaizeSNP50 BeadChip system was chosen (up to position 216.1 Mbp) for a total of about 25.7 Mbp because, in the context of a bi-parental genetic system upstream, thus overlapping to a larger extent that of the based on common US inbred lines such as the one under otherQTL4.10NILpair,similarlytowhatobservedabove analysis, it provided the best combination of potential in- for the QTL 3.05 NIL pairs. Finally, the non-isogenic formation content, technical reliability, and resolution regions flanking QTL 10.03 region extended about 9.3 needed to ascertain the overall genetic structure and the Mbp(0.2Mbpupstreamand9.1Mbpdownstream). levelofisogenicityofNILs.Ithasbeenproposedthathet- Additional unlinked non-isogenic regions between con- erosis mightbe associated to large structuralvariations in trasting NILs were also detected in all pairs and mapped the genome leading to complex patterns of gene comple- genome-wide. Non-isogenic blocks were generally found as mentation through the combination of the dispensable stretchesofSNPsofcoherentgenotypes,whereasrecombin- genomes within the extremely diverse maize germplasm ation within non-isogenic blocks was observed in a few [32]. The platform chosen for the present analysis, differ- casesonly(e.g.blocksonchromosomes3and6inNILpair ently from others based upon CGH or re-sequencing 3.05_R40). Chromosomal regions having a heterozygous techniques, is not suited for addressing the study of such genotype in one of the compared NILs only constitute for variations, which were at this point beyond the scope of the most part full independent blocks, being only in fewer the present work, even though they might indubitably be instances one of the two boundaries of otherwise larger relevantforacloserinvestigationonthemolecularnature homozygousnon-isogenicblocks.Theonlyexceptiontothis oftheintrogressedQTL. latter situation was observed in NIL pair 3.05_R40 for the First of all we compared the genotypic structure of non-isogenic block mapped at interval 130.5-135.7 Mbp on B73-SSA, the inbred line from which all NILs were chromosome 10 (20 SNPs), which consists in fact of three derived, with that of the reference B73 accession, for sub-blocks, the middle one of which being heterozygous which both replicate SNP scoring produced identical in NIL-BB only. Only 2–3 isolated SNPs per NIL pair are results, confirming the technical reproducibility of geno- heterozygous in both NILs. Finally, positions of unlinked type calls. The few differences detected between B73-SSA non-isogenic blocks across all NIL pairs appear to be and B73 were in line with the level of inconsistency largely unrelated. A more extended overlapping of non- already observed with duplicate samples from different