
Healing Length and Bubble Formation in DNA

Z. Rapti¹, A. Smerzi²,³, K. Ø. Rasmussen² and A. R. Bishop²
¹ Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, and Department of Mathematics, University of Illinois at Urbana-Champaign, 1409 W. Green Street, Urbana, IL 61801
² Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545
³ Istituto Nazionale di Fisica per la Materia BEC-CRS, Università di Trento, I-38050 Povo, Italy

C.H. Choi, A. Usheva
Endocrinology, Beth Israel Deaconess Medical Center and Harvard Medical School, Department of Medicine, 99 Brookline Avenue, Boston, Massachusetts 02215

(Dated: February 6, 2008)
arXiv:cond-mat/0601304v1 [cond-mat.soft] 13 Jan 2006

We have recently suggested that the probability for the formation of thermally activated DNA bubbles is, to a very good approximation, proportional to the number of soft AT pairs over a length L(n) that depends on the size n of the bubble and on the temperature of the DNA. Here we clarify the physical interpretation of this length by relating it to the (healing) length that is required for the effect of a base-pair defect to become negligible. This provides a simple criterion to calculate L(n) for bubbles of arbitrary size and for any temperature of the DNA. We verify our findings by exact calculations of the equilibrium statistical properties of the Peyrard-Bishop-Dauxois model. Our method permits calculations of equilibrium thermal openings with several orders of magnitude less numerical expense than direct evaluations.

I. INTRODUCTION.

Local separation of double-stranded DNA into single-stranded DNA is fundamental to transcription and other important intra-cellular processes in living organisms. In equilibrium, DNA will locally denature when the free energy of the separated single-stranded DNA is less than that of the double-stranded DNA. Because of the larger entropy of the flexible single strand, the more rigid double strand can be thermally destabilized locally to form temporary "bubbles" in the molecule already at physiological temperatures [1]. Considering this entropic effect together with the inherent energetic heterogeneity of a DNA sequence (GC base pairs are 25 % more strongly bound than AT base pairs), it is plausible that certain regions (subsequences) are more prone to such thermal destabilization than others. In fact, recent work [2] demonstrates not only that such a phenomenon exists but, more importantly, that the location of these large bubble openings in a variety of DNA sequences coincides with sites active during transcription events. This discovery represents a significant advance in the understanding of the relationship between local conformation and function in bio-molecules. While there is no guarantee that this mechanism applies to all transcription initiation events, the present agreement is very encouraging. Similarly, other large bubbles identified may well have a relationship to other biological functions.

The agreement is based on the Peyrard-Bishop-Dauxois (PBD) model [3], which evidently contains some essential basic ingredients: local constraints (nonlinearity), base-pair sequence (colored disorder) and entropy (temperature). The equilibrium thermodynamic properties of the model were numerically calculated from the partition function using the transfer integral operator (TIO) method. (A complementary direct numerical evaluation of the partition function has been reported in Ref. [5].) This allows the precise evaluation of probabilities of bubbles as a function of temperature, location in a given base-pair sequence, and bubble size.

In recent work [4], we reported that the probabilities of finding bubbles extending over n sites do not depend on specific DNA subsequences. Rather, such probabilities depend on the density of soft A/T base pairs within a region of length L(n). Here we suggest that this characteristic length is simply related to the characteristic distance away from an AT base pair, considered as a defect placed in a homogeneous GC sequence, at which the probability values of the base pairs return to the GC bulk value. Lastly, based on this concept of effective density approximation, we examine five different human promoter sequences and demonstrate the striking agreement in the predictions from the two methods.

II. THE PBD MODEL AND THE TIO METHOD.

The potential energy of the PBD model is

\[
E = \sum_{n=1}^{N} \left[ V(y_n) + W(y_n, y_{n-1}) \right] = \sum_{n=1}^{N} E(y_n, y_{n-1}). \tag{1}
\]

Here $V(y_n) = D_n \left( e^{-a_n y_n} - 1 \right)^2$ represents the nonlinear hydrogen bonds between the bases, and $W(y_n, y_{n-1}) = \frac{k}{2} \left( 1 + \rho\, e^{-b (y_n + y_{n-1})} \right) (y_n - y_{n-1})^2$ is the nearest-neighbor coupling that represents the (nonlinear) stacking interaction between adjacent base pairs: it consists of a harmonic coupling with a state-dependent coupling constant, effectively modeling the change in stiffness as the double strand is opened (i.e., entropic effects). The sum in Eq. (1) is over all base pairs of the molecule, and $y_n$ denotes the relative displacement from equilibrium of the bases at the nth base pair. The heterogeneity of the sequence is incorporated by assigning different values to the parameters of the Morse potential, depending on the base-pair type. The parameter values we have used are those in Refs. [6, 9], chosen to reproduce a variety of experimentally observed thermodynamic properties.

The equilibrium thermodynamic properties of the PBD model can be calculated from the partition function

\[
Z = \int \prod_{n=1}^{N} dy_n\, e^{-\beta E(y_n, y_{n-1})}
  = \int \prod_{n=s}^{s+k-1} dy_n\, Z_k(s)\, e^{-\beta E(y_n, y_{n-1})}, \tag{2}
\]

where we have introduced the notation

\[
Z_k(s) = \int \prod_{n \neq s, \ldots, s+k-1}^{N} dy_n\, e^{-\beta E(y_n, y_{n-1})}
\]

and $\beta = (k_B T)^{-1}$ is the inverse thermal energy. In order to evaluate the partition function (2) using the TIO method, we first symmetrize $e^{-\beta E(x,y)}$ by introducing [7]

\[
S(x,y) = \exp\left( -\frac{\beta}{2} \left( V(x) + V(y) + 2 W(x,y) \right) \right) = S(y,x).
\]

Here the second equality holds only when x and y correspond to base pairs of the same kind. Using Eq. (2), the expression for $Z_k(s)$ is rewritten as

\[
Z_k(s) = \int \left[ \prod_{n \neq s, \ldots, s+k-1}^{N} dy_n\, S(y_n, y_{n-1}) \right] dy_0\, e^{-\frac{\beta}{2} V(y_1)}\, e^{-\frac{\beta}{2} V(y_N)}, \tag{3}
\]

where open boundary conditions at n = 1 and n = N have been used. To proceed, a Fredholm integral equation with a real symmetric kernel,

\[
\int dy\, S(x,y)\, \phi(y) = \lambda\, \phi(x), \tag{4}
\]

must be solved separately for the A/T and for the G/C base pairs.

Since the eigenfunctions are orthonormal and form a complete basis, Eq. (4) can be used sequentially to replace all integrals in Eq. (3) by matrix multiplications. Unlike in Ref. [8], where the kernels S(x,y) were expanded in terms of orthonormal bases, here we choose to use Eq. (4) iteratively. In this way we reduce the number of integral equations that need to be solved from 4 to 2, and at the same time the matrices that need to be multiplied are lower dimensional. Whenever the sequence heterogeneity results in a non-symmetric S(x,y), Eq. (4) cannot be used and we resort to a symmetrization technique, based on the successive introduction of auxiliary integration variables, as explained in Ref. [10].

We evaluate the probability $P_k(s)$ for a base-pair opening spanning k base pairs (our operational definition of a bubble of size k), starting at base pair s, as

\[
P_k(s) = Z^{-1} \int_{t}^{\infty} \prod_{n=s}^{s+k-1} dy_n\, Z_k(s)\, e^{-\beta E(y_n, y_{n-1})}, \tag{5}
\]

where t is the separation of the double strand (here taken as 1.5 Å) above which we define the strand to be melted.

III. LENGTHSCALES AND EFFECTIVE DENSITY APPROXIMATION.

In Ref. [4] we suggested that the probability of finding a bubble extending over n sites localized around a given bp is, to a very good approximation, proportional to the density of soft A/T base pairs within a region of length L(n) centered around the same bp, an approach we term here the effective density approximation (EDA). The lengths L(n) were obtained from numerical transfer integral calculations of the bubble probabilities of several simple (but experimentally realizable) sequences. The A/T density profiles were then compared with the exact probabilities for thermal activation of bubbles of sizes n = 1 and n = 5 of a wild and a mutant version of the AAV P5 promoter. The agreement was excellent. However, no physical explanation for the origin of these characteristic lengths was provided, nor were they connected to any intrinsic length of the PBD model. Since they appear prominently in the formation of DNA bubbles, it is important to investigate both of these questions.

In Figs. 1-3 we consider a sequence composed of 150 G/C + 1 A/T + 150 G/C. In other words, we place a defect (A/T instead of G/C) at the site $l_0 = 151$. This defect is 150 bp away from the two ends of the sequence in order to eliminate boundary effects.

A/T base pairs have a smaller bonding energy than G/C bps. Therefore, the A/T defect softens a number of G/C bps around it and increases the opening probability. Clearly, sufficiently far from the defect the opening probability regains the bulk value of a homogeneous G/C sequence at the given temperature, threshold and bubble size. Our claim is that the characteristic length L(n) is the distance from the defect beyond which the G/C bps are no longer affected. This can be quantified by calculating the relative fluctuation

\[
\frac{P_n(l_0 - L(n)) - P_n(110)}{P_n(110)} = \alpha, \tag{6}
\]

where $l_0 - L(n)$ is the bp site obtained by counting L(n) sites downstream from the defect site (see Figs. 1-3), and at site 110 we assume that the bulk value has been regained.
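The relative fluctuation of Eq. (6) is straightforward to evaluate once a probability profile is available. The following is a minimal sketch, not the authors' code: the profile is synthetic (a plateau next to the defect followed by an exponential decay), and the names `relative_fluctuation`, `characteristic_length`, the decay length of 4 sites and the threshold `alpha=0.025` are illustrative assumptions. It computes the left-hand side of Eq. (6) and scans for the first distance from the defect at which it falls to the chosen threshold.

```python
import math


def relative_fluctuation(profile, site, bulk_site):
    """Left-hand side of Eq. (6): (P(site) - P(bulk_site)) / P(bulk_site)."""
    return (profile[site] - profile[bulk_site]) / profile[bulk_site]


def characteristic_length(profile, defect_site, bulk_site, alpha):
    """Smallest distance from the defect at which the relative fluctuation
    is alpha or below; None if that never happens before bulk_site."""
    for distance in range(defect_site - bulk_site + 1):
        site = defect_site - distance
        if relative_fluctuation(profile, site, bulk_site) <= alpha:
            return distance
    return None


# Synthetic profile (an assumption for illustration, not data from the paper):
# a plateau of width n next to the defect, then an exponential decay to the bulk.
BULK, EXCESS, DECAY_LENGTH = 1.0e-5, 0.34e-5, 4.0
DEFECT_SITE, BUBBLE_SIZE, BULK_SITE = 151, 5, 110
edge = DEFECT_SITE - BUBBLE_SIZE + 1
profile = {
    site: BULK + EXCESS * math.exp(-max(0, edge - site) / DECAY_LENGTH)
    for site in range(BULK_SITE, DEFECT_SITE + 1)
}

L = characteristic_length(profile, DEFECT_SITE, BULK_SITE, alpha=0.025)
print(L)
```

With a real TIO profile, `profile` would simply map each bp site to the computed bubble probability; the scan itself does not depend on how the profile was obtained.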
The remarkable finding is that with the choice of L(n) considered in our previous work [4], obtained independently by merely fitting the full numerical TIO calculations of the bubble formation probabilities of different simple sequences, we obtain from Eq. (6) α ≃ 2.5%, independently of the size of the bubble and the temperature of the DNA sequence. This can be seen in Figs. 1-3: the circle at bp = 151 = $l_0$ is the A/T defect, while the circles at bp = 141, 139, 135 are the positions of the bp at $l_0 - L(n)$. We can therefore reverse the perspective and define the characteristic length as the one given by Eq. (6), with α ≃ 2.5%. This is important for practical applications, since it gives a simple criterion to estimate bubble probabilities for arbitrary bubble sizes and DNA temperatures (and arbitrary PBD inter-base-pair interaction parameters), and it also immediately suggests a simple physical explanation for L(n).

FIG. 1: The probability profile for the creation of a bubble of size 5 bp ($P_5$ versus bp site, sites 110 to 190). The isolated A/T bp embedded in a sequence of G/C bps at bp = 151 is denoted by a solid black circle. A second black circle is located at bp = 151 − L(5) = 141. The relative error is $[P_5(141) - P_5(110)]/P_5(110) = 0.0232$.

FIG. 2: The probability for the creation of a bubble of size 7 bp ($P_7$ versus bp site, sites 110 to 190). The black circle at bp = 151 represents the defect. The second black circle is located at bp = 151 − L(7) = 139. The relative error is $[P_7(139) - P_7(110)]/P_7(110) = 0.0279$.

FIG. 3: The probability for the creation of a bubble of size 10 bp ($P_{10}$ versus bp site, sites 110 to 190). As before, the defect is represented by a solid black circle, and a second one is located at bp = 151 − L(10) = 135. The relative error is $[P_{10}(135) - P_{10}(110)]/P_{10}(110) = 0.0203$.

We parametrize the decay of the probability values as a function of the downstream distance from the A/T defect according to

\[
P_n(l) = A_n + B_n \exp\left[ -\frac{l_0 - n + 1 - l}{\xi_n} \right], \tag{7}
\]

where $P_n(l)$ is the probability of finding a bubble of size n located at the site l, $A_n$ is the bulk value of the homogeneous G/C sequence, and $A_n + B_n$ is the value of the probability at the site $l_0 - n + 1$, which is the same as the probability value at the defect site $l_0$. $\xi_n$ is the healing length of the system, namely the characteristic length for the perturbation to die out, which, quite generally, depends on the size n of the bubble, the temperature of the DNA and the parameters of the PBD model. Replacing Eq. (7) in Eq. (6), with $P_n(110) = A_n$, gives $(B_n/A_n) \exp[-(L(n) - n + 1)/\xi_n] = \alpha$, and hence the relation

\[
L(n) = n - 1 + \xi_n \ln\left( \frac{B_n}{\alpha A_n} \right),
\]

where $B_n/A_n = [P_n(l_0) - P_n(\mathrm{bulk})]/P_n(\mathrm{bulk}) \simeq 0.34$. We emphasize that both $B_n/A_n$ and α are independent of the size n of the bubble. It follows that there is a simple linear relation between the healing and characteristic lengths:

\[
L(n) = n - 1 + 2.6\, \xi_n. \tag{8}
\]

This is a very important result of this report. The healing length can be easily calculated as a function of the bubble size and temperature with a homogeneous G/C sequence plus a single defect, as shown above. From this, we can calculate the value of L(n) and, therefore, estimate the probability for the creation of bubbles for arbitrary DNA sequences at any temperature. For instance, for bubbles of size n = 7 we obtain L(7) = 12, while for bubbles of size n = 10 we have L(10) = 16, at T = 300 K and with PBD parameters as in [9].

In order to examine how the values of the parameters of the PBD model affect those of L(n), we set ρ = 0 and repeat the calculation of L(10). Since the "cooperativity" of the base pairs decreases when ρ decreases, one would expect to observe a drop in the L(10) value. This is indeed the case: L(10) = 14, while for ρ = 2 the value is 16. We will show in the next Section how this approach compares with exact transfer integral operator calculations of the statistical properties of the PBD model.

FIG. 4: The probability for the creation of a bubble of size 10 bp when the coupling constant ρ = 0 ($P_{10}$ versus bp site, sites 110 to 190). As indicated by the relative error $[P_{10}(137) - P_{10}(110)]/P_{10}(110) = 0.0216$, now L(10) = 14.

We conclude this Section by noting that Figs. 1-3 exhibit symmetry, but not with respect to the defect. While the defect is always at bp = 151, the symmetry is with respect to bp 149 in the $P_5$ case, bp 148 in the $P_7$ case, and the axis that separates bps 146 and 147 in the $P_{10}$ case. Another feature is the existence of a second local maximum with the same value as $P_n(151)$, and a slight drop in the probability values in the middle of the peak. We notice that the two maxima are located at sites $l_0$ and $l_0 - n + 1$. This suggests that a bubble with a defect at its boundary has a higher probability to form: in the $P_n(l_0 - n + 1)$ case the defect is at the end of the bubble, while in the $P_n(l_0)$ case it is at the beginning of the bubble. The probability drops in the middle of the peak because the bubble there contains a defect that is trapped within G/C bps, and it turns out that the probability of formation of a bubble of this kind is smaller.

IV. COMPARISON OF THE EDA AND TIO METHOD.

We now compare the probability profiles obtained from the effective density approach, with the characteristic length L(n) calculated as in the previous Section, with exact results obtained with the TIO method. We consider five different human genome subsequences and compare the calculations for the probability of formation of bubbles of sizes n = 7 and n = 10.

In panels (a,b) of Figs. 5-9 we plot, as a function of the bp site, the numbers $N_7$ and $N_{10}$ of A/T bps calculated over a distance L(7) = 12 (panel a) and L(10) = 16 (panel b). These A/T density profiles can be compared with the probabilities for the thermal creation of bubbles of seven, $P_7$, and ten, $P_{10}$, sites, shown in panels (c,d). In all cases (and in several others not reported here) the resemblance in the main features of the respective profiles is striking. In particular, the EDA correctly predicts the locations and relative weights of the probability peaks. The crucial point is that, while the profiles obtained with the EDA require a few seconds to be calculated, the full TIO method is very time consuming (of the order of several hours in the cases presented here). To fully appreciate this advantage, we note that with the EDA the entire human genome can be scanned for bubble formation probabilities in a few minutes, while a statistical approach based on the calculation of the partition function is clearly impossible.

FIG. 5: Effective density profiles for 7- and 10-site long bubbles (a,b: $N_7$ and $N_{10}$ versus bp site, −100 to 100) and probability profiles calculated with the transfer integral approach (c,d: $P_7$ and $P_{10}$ versus bp site). The sequence is the cox8 promoter.

FIG. 6: Same as Fig. 5. The sequence is the cox11 promoter.

FIG. 7: Same as Fig. 5. The sequence is the gtf2f2 promoter.

FIG. 8: Same as Fig. 5. The sequence is the h33a promoter.

FIG. 9: Same as Fig. 5. The sequence is the h3b promoter.

V. CONCLUSIONS.

It has been suggested that DNA transcription initiation sites can coincide with the location of large bubble openings. A thorough investigation of this hypothesis requires the statistical analysis of many DNA promoters within the PBD model. Such a task quickly becomes prohibitive when studying bubble-promoter correlations in a significantly large number of cases (namely, for large sequences). This problem has motivated the development of an alternative, simplified approach to calculate the bubble formation probabilities. We have found that these probabilities are proportional to the density of soft A/T base pairs calculated over an effective length which depends on the size of the bubble and the temperature of the DNA. We have clarified the physical origin of such a length and suggested a simple procedure for its calculation. The results of our effective density approach are in extremely good agreement with full exact calculations, but with a numerical effort reduced by several orders of magnitude. In this way, the full human genome can be analyzed, opening a unique possibility to understand the existence and nature of the correlations between thermally activated bubbles and promoters.

VI. ACKNOWLEDGMENTS

Work at Los Alamos National Laboratory is supported by the US Department of Energy under contract No. W-7405-ENG-36.

[1] M. Gueron, M. Kochoyan, and J.L. Leroy, Nature 328, 89 (1987); M. Frank-Kamenetskii, Nature 328, 17 (1987).
[2] C.H. Choi, G. Kalosakas, K.Ø. Rasmussen, M. Hiromura, A.R. Bishop and A. Usheva, Nucleic Acids Res. 32, 1584 (2004).
[3] M. Peyrard and A.R. Bishop, Phys. Rev. Lett. 62, 2755 (1989); T. Dauxois, M. Peyrard and A.R. Bishop, Phys. Rev. E 47, R44 (1993).
[4] Z. Rapti, A. Smerzi, K.Ø. Rasmussen, A.R. Bishop, C.H. Choi and A. Usheva, cond-mat/0511128.
[5] T.S. van Erp, S. Cuesta-Lopez, J.-G. Hagmann, and M. Peyrard, Phys. Rev. Lett. 95, 218104 (2005).
[6] A. Campa and A. Giansanti, Phys. Rev. E 58, 3585 (1998).
[7] T. Dauxois and M. Peyrard, Phys. Rev. E 51, 4027 (1995).
[8] Y. Zhang, W.-M. Zheng, J.-X. Liu, and Y.Z. Chen, Phys. Rev. E 56, 7100 (1997).
[9] The parameters were chosen in Ref. [6] to fit thermodynamic properties of DNA: k = 0.025 eV/Å², ρ = 2, β = 0.35 Å⁻¹ (the parameter denoted b in Eq. (1)) for the inter-site coupling; for the Morse potential, D_GC = 0.075 eV, a_GC = 6.9 Å⁻¹ for a G-C bp, and D_AT = 0.05 eV, a_AT = 4.2 Å⁻¹ for the A-T bp.
[10] M. B. Fogel, Nonlinear Order Parameter Fields: I. Soliton Dynamics, II. Thermodynamics of a Model Impure System, Ph.D. Thesis, Cornell University (1977).
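As a rough illustration of the effective density approximation compared with the TIO method in Section IV, the sketch below evaluates the relation L(n) = n − 1 + ξ_n ln(B_n/(α A_n)) behind Eq. (8) and counts the A/T base pairs in a window of a given length around each site. It is a sketch under stated assumptions rather than the authors' implementation: the function names, the toy sequence, the example healing length, the way an even-length window is centered, and the truncation of the window at the sequence ends are all hypothetical choices.

```python
import math


def characteristic_length(n, xi, ratio=0.34, alpha=0.025):
    """L(n) = n - 1 + xi * ln(ratio / alpha); ln(0.34 / 0.025) is about 2.6,
    which reproduces the coefficient in Eq. (8)."""
    return n - 1 + xi * math.log(ratio / alpha)


def at_count_profile(sequence, window):
    """Number of A/T bases in a window of `window` sites around each site.

    Assumption: the window for site i covers sites i - window // 2 up to
    i - window // 2 + window - 1 and is truncated at the sequence ends.
    """
    seq = sequence.upper()
    # Prefix sums make each window count an O(1) difference.
    prefix = [0]
    for base in seq:
        prefix.append(prefix[-1] + (1 if base in "AT" else 0))
    counts = []
    for i in range(len(seq)):
        lo = max(0, i - window // 2)
        hi = min(len(seq), i - window // 2 + window)
        counts.append(prefix[hi] - prefix[lo])
    return counts


# Toy sequence and healing length, purely for illustration.
toy_sequence = "GGGGATATGGGG"
print(at_count_profile(toy_sequence, window=4))
print(round(characteristic_length(n=7, xi=2.0), 2))
```

For the promoter profiles of Figs. 5-9 the window lengths would be L(7) = 12 and L(10) = 16, as quoted in the text; the resulting counts are only expected to be proportional to the bubble probabilities, so the proportionality constant would still have to be fixed by comparison with an exact calculation.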
