ebook img

Absolute free energies estimated by combining pre-calculated molecular fragment libraries PDF

0.33 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Absolute free energies estimated by combining pre-calculated molecular fragment libraries

Absolute free energies estimated by ombining pre- al ulated mole ular fragment libraries ∗ Xin Zhang 9 & 0 Department of Physi s Astronomy, 0 2 n University of Pittsburgh, Pittsburgh, Pennsylvania 15260 a J ∗ † 0 Artem B. Mamonov and Daniel M. Zu kerman 3 ] h Department of Computational Biology, S hool of Medi ine, p - o University of Pittsburgh, Pittsburgh, Pennsylvania 15213 i b . s (Dated: January 30, 2009) c i s y Abstra t h p [ The absolute free energy (cid:22) or partition fun tion, equivalently (cid:22) of a mole ule an be estimated 2 v omputationally using a suitable referen e system. Here, we demonstrate a pra ti al method for 4 4 9 stagingsu h al ulationsbygrowingamole ulebasedonaseriesoffragments. Signi(cid:28) ant omputer 3 . 1 time is saved by pre- al ulating fragment on(cid:28)gurations and intera tions for re-use in a variety of 0 9 0 mole ules. We employ su h fragment libraries and intera tion tables for amino a ids and apping : v i X groups to estimate free energies for small peptides. Equilibrium ensembles for the mole ules are r a generated at no additional omputational ost, and are used to he k our results by omparison to standard dynami s simulation. We explain how our work an be extended to estimate relative binding a(cid:30)nities. ∗ These authors ontributed equally. † Ele troni mail: ddmmzz+pitt.edu 1 I. INTRODUCTION The use of a referen e system for free energy al ulations has a long history in physi s 1 and hemistry. The basi idea is to employ a referen e system ((cid:16)ref(cid:17)) for whi h the abso- lute free energy is available, and whi h is as similar as possible to the physi al system of interest ((cid:16)phys(cid:17)). Histori ally, Stoessel and Nowak applied the referen e-system strategy to a mole ular system for the (cid:28)rst time, using a solid harmoni referen e system in Cartesian 2 oordinates. Zu kerman and Ytreberg extended that work in two ways designed to improve 3 overlap between the referen e and physi al systems: (i) by using internal oordinates; and (ii) by using a more (cid:29)exible, numeri ally exa t referen e system based on histograms from a short dynami s simulation, rather than an arti(cid:28) ial analyti ally tra table referen e state. Huang and Makarov also employed the referen e-system approa h embodied in (1), but in 4 a di(cid:27)erent way. Fphys = Fref +∆F , ref→phys (1) Fx ∆F ref→phys where istheabsolute free energy ofmodel(cid:16)x,(cid:17) and isthefree energy di(cid:27)eren e between the systems. In essen e, this paper is about pra ti al hoi es for the both the ∆F ref→phys referen e system and strategies for al ulating when the physi al system is a large mole ule. Histori ally, Stoessel and Nowak applied the referen e-system strategy to a mole ular 2 system for the (cid:28)rst time, using a solid harmoni referen e system in Cartesian oordinates. Zu kerman and Ytreberg extended that work in two ways designed to improve overlap 3 between the referen e and physi al systems: (i) by using internal oordinates; and (ii) by using a more (cid:29)exible, numeri ally exa t referen e system based on histograms from a short dynami s simulation, rather than an arti(cid:28) ial analyti ally tra table referen e state. Huang 2 andMakarovalsoemployedthereferen e-system approa hembodiedin(1),butinadi(cid:27)erent 4 way. Other e(cid:27)orts to al ulate absolute free energies for mole ular systems have been ongoing 5,6,7,8 9,10,11,12 for years in the groups of Meirovit h and Gilson and more approximately using 13 harmoni and quasi-harmoni methods. The work of Meirovit h builds on long-standing polymer-growth methodologies for estimating partition fun tions whi h date to the work 14 of Rosenbluth and Rosenbluth. The original Rosenbluth work was generalized for higher 15,16,17,18,19,20,21,22,23,24,25,26,27 e(cid:30) ien y and more realisti models by many workers. Ideas from these polymer-growth sampling methods also inform the present work. 3 Thepresentpapersigni(cid:28) antlyextendsthepreviousworkbyYtrebergandZu kerman by estimating absolute free energies for mole ules built up gradually from mole ular fragments. 3 Larger mole ules an be treated, ompared to our previous paper , be ause a series of staged intermediate systems are adopted. In essen e the free energy di(cid:27)eren e of Eq. (1) is sub-divided (staged) into a sum of easy-to- al ulate terms. Staging in rements are highly tunable, based on the hoi e of fragment sizes and even by sele tion of subsets of intera tions as detailed below. The use of fragments in other types of mole ular me hani s al ulations 28,29,30 has a long history. A key ontribution of this work is a pra ti al strategy of pre- al ulation whi h minimizes the number of energy terms whi h need to be omputed at ea h stage. Spe i(cid:28) ally, for ea h fragment, a statisti al library (cid:22) i.e., an ensemble of on(cid:28)gurations and their energies (cid:22) is 31 stored; we have also used su h libraries in Monte Carlo sampling . Additionally, for ea h ovalently bonded fragment pair, we store the full intera tion energy (based on all atoms) for every possible pair of on(cid:28)gurations. Su h storage is quite pra ti al on typi al modern omputers with > 1 GB of RAM. During produ tion simulations it is only ne essary to 3 omputeintera tionsbetween fragmentsseparatedbyoneormoreotherfragments. Needless to say, the stored libraries and intera tion tables an be re-used in future simulations of the same or di(cid:27)erent mole ules. The pre- al ulation strategy, whi h has early on eptual 32,33,34 roots , appears to represent a signi(cid:28) ant pra ti al advan e over earlier polymer-growth al ulations. The use of non-statisti allibraries has been popularized in the Rosetta protein 35 folding program. There is substantial (cid:29)exibility in the division of a mole ule into fragments. We have used single amino a ids as fragments in this study, but larger segments and even di(cid:27)erent intera tion subsets as detailed below (cid:21) may also be pra ti al. The fragment-based approa h ould also be used to study protein-ligand binding, by growing smallmole ules into re eptor bindingpo kets andestimatingthefree energies. This anbe seen asastatisti alme hani al 29,30 generalization of fragment-based ideas developed earlier. Our results, whi h employ single amino-a id fragments, are extremely en ouraging. The dataindi atethat absolutefree energiesforsmallpeptides anbe al ulatedrapidlyandreli- ∼ ably. Spe i(cid:28) ally, high-pre ision free energy estimates, with (cid:29)u tuations of 0.3 k al/mole, are obtained for 52-atom tetra-alanine in less than an hour of single-pro essor omputing time, with a simple diele tri "solvent". We he k our data by omparing the equilib- rium ensembles (obtained simultaneously with the free energy estimates) with independent Langevin simulations. As a further he k, in one ase, the free energy results are veri(cid:28)ed by an independent al ulation using di(cid:27)erent fragments. The remaining se tions of the paper des ribe the methods, results, and our on lusions. Our methods se tion provides full details for performing the al ulations, in luding the generation of fragment libraries and intera tion tables. We also orre t a te hni al error 3 in our earlier study. Our results des ribe both the free energy values and the analysis of 4 the equilibrium ensembles. Our dis ussion se tion des ribes possible improvements to the method and extension to the estimation of relative binding a(cid:30)nities using absolute free energies. II. METHODS Our basi approa h is to al ulate the free energy of the physi al system of interest based on the di(cid:27)eren e from a known referen e system, as in Eq. (1), and also to stage the al ulation using mole ular fragments.The fragments not only permit the gradual staging of the al ulation but also a tremendous savings of omputer time based on the storage of (i) fragment on(cid:28)gurations, (ii) energies internal to ea h fragment on(cid:28)guration, and (iii) intera tion energies between ovalently bonded fragments. The low ost and high pre ision of the resulting estimates suggests we are far from the pra ti al limit of the approa h in the present implementation. However, a number of improvements to the implementationappear to be within easy rea h, as des ribed in our Dis ussion. All fragment libraries used in the present al ulations are available at our website (www. bb.pitt.edu/Zu kerman). A. Model and systems T = 298K 36 All al ulationsemploy a standard atomisti for e(cid:28)eld, OPLS-AA at . In the present report, our fragments are individual amino a ids and apping groups. For simpli ity in this initial investigation, we model the solvent by a simple uniform diele tri onstant ǫ = 60 . We ompute free energy estimates for alanine dipeptide (A e-Ala-Nme), di-alanine 2 4 (A e-(Ala) -Nme), and tetra-alanine (A e-(Ala) -Nme). Following standard onventions, 3 3 A e is A etyl (CH -CO), Ala is Alanine (NH-CH(CH )-CO), and Nme is N-methylamide 5 3 (NH-CH ). B. A simple example Consider the al ulation of the on(cid:28)gurational free energy of alanine dipeptide based on a divisioninto three fragments (A e, Ala, Nme) whi h an be denoted (A, B, C) respe tively (see Fig.1). In advan e, we al ulate statisti al libraries of on(cid:28)gurations for ea h fragment, Figure 1: whi h are onstant-temperature OPLS-AA ensembles based only on the atoms within the given fragment. The libraries additionally in lude the six degrees of freedom ne essary for joining the fragments, based on the use of (cid:16)dummy atoms(cid:17) as des ribed below. During the librarygeneration pro ess, the absolute free energy forea h fragment is also al ulated using 3 a referen e system as des ribed previously. A typi al library will ontain 10,000 on(cid:28)gu- rations. We also pre- al ulate every possible intera tion energy between ovalently bound 6 108 fragments (cid:22) i.e., a table of intera tion energies for the A-B and B-C fragment pairs. The al ulationpro eeds as s hematized in Fig.1, where the presen e of a line onne ting two fragments indi ates that all intera tions between the fragments is in luded. The refer- en e system (not shown) onsists of fully independent oordinates, so that the fragments are not yet onstru ted. The (cid:28)rst intermediate onsists of the three non-intera ting fragments, whi h in lude, however, all intera tions within ea h fragment. Thus the fragment free ener- gies, whi h are al ulated and stored in advan e, properly in lude the intera tions among all degrees of freedom internal to ea h fragment. Other intera tions are added in three stages: A-B intera tions (cid:28)rst, followed by B-C, and ompleted by A-C ouplings. In the (cid:28)rst intermediate stage, the absolute free energies for the individual fragments are 3 retrieved from disk. (They are initially al ulated following referen e as detailed below.) Next, A-B intera tions are added by a standard free energy di(cid:27)eren e al ulation. Spe i(cid:28)- ally,anensembleofnon-intera tingA-B on(cid:28)gurationsisgeneratedbyrandom ombination of fragments from the A and B libraries, and the resulting energy hange is exponentially averaged in the usual way (cid:22) via Eq. (6) below. The energy hanges due to the ombination do not need to be al ulated in our s heme, however, be ause they have been tabulated in 37 advan e. Additionally, the now intera ting A-B fragments are (cid:16)resampled(cid:17) to orrespond to the full potential for all degrees of freedom in both fragments. The details of resampling aregivenbelow(cid:22)see Eq. (7)(cid:22)butthebottomlineisthatoneobtains10,000- on(cid:28)guration ensemble of the partially grown mole ule onsisting of the A-B fragments. The al ulationsthen pro eedsasiftherearetwo fragments, A-BandC.Thetwo libraries are joined ombinatorially but only a ounting for the B-C intera tions at this stage. The A-C intera tions will be handled at a later stage. On e again, the free energy hange is al ulated and the ensemble isresampled to re(cid:29)e t B-C intera tions. The resultingensemble 7 ontains 10,000 on(cid:28)gurations of the full mole ule re(cid:29)e ting all intera tions ex ept those between fragments A and C. In the (cid:28)nal stage sket hed in Fig. 1, the A-C intera tions are added in a standard free energy di(cid:27)eren e al ulation based on the the ensemble of the previous stage. However, a standard energy all (for the whole mole ule) is not required to save CPU time. Rather, our ode only omputes energy terms spe i(cid:28) to the A and C fragments (cid:22) i.e., ele trostati and van der Waals intera tions between atoms of A and those of C. On e the energy hanges have been obtained, the full free energy is rapidly estimated. Resampling into the fully intera ting ensemble an also be performed rapidly without additional energy al ulations. It is not di(cid:30) ult to imagine generalizing this example to systems with more fragments. It is also worth noting that, stri tly speaking, the (cid:28)nal stage was not ne essary. That is, we ould have added the A-C intera tions simultaneously with the B-C ombination sin e the full mole ular on(cid:28)gurations were onstru ted at that point. These hoi es illustrate the (cid:29)exibility intrinsi to staging the al ulation with fragments, as we detail further in our Dis ussion. Additional staging (cid:29)exibility results, of ourse, from the initial hoi e of the fragments (cid:22) i.e., smaller fragments lead to staging in (cid:28)ner in rements. C. Basi formalism Fphys Our al ulation of the absolute free energy for a mole ule divided into fragments is based on standard, straightforward equations. The only novel aspe t of the formalism is our parti ular hoi e of stages based on the addition of fragments and/or inter-fragment intera tions. Although our heavy relian e on pre- al ulated information has very signi(cid:28) ant pra ti al impli ations, it does not a(cid:27)e t the formalism. 8 De(cid:28)nition of the free energy Fphys The fundamental obje t of interest is the absolute lassi al free energy for an N impli itly solvated mole ule. The mole ule is taken to onsist of atoms, and its internal- x oordinate on(cid:28)gurationsaredenoted by . The potentialenergy fun tionwillbe astandard 36 for e(cid:28)eld (here, OPLS-AA ), possibly augmented by an impli it solvation model; the full Uphys(x) potential energy in ludingany solvation will be denoted by . The free energy, whi h Uphys is a fun tional of , is de(cid:28)ned by the dimensionless on(cid:28)gurational partition fun tion at T = 1/k β B temperature via 1 Fphys[Uphys] = −k T ln dxe−βUphys(x) , B 3N−3 (2) ( ) Z 1 Å (cid:0) (cid:1) dx where the measure of integration is understood to in lude any ne essary Ja obians. Ki- neti energy terms have already been integrated out. Both the dimensionless hara ter of the partition fun tion in Eq. (2) and the angstrom-based normalizationresult from a parti - C◦ 3,10 ular hoi e for the standard on entration (de(cid:28)ned in referen es ) (cid:22) or equivalently, for the volume ontaining our impli itly solvated mole ule. In parti ular, we have hosen C◦ ≡ 8π2( )3N−3Q /σ Q = N (2πm k T/h2)3/2 1 Å p , where p i=1 i B results from the momentum Q h σ integrals, is Plan k's onstant, and is the mole ule's symmetry number. We note that our hosen standard on entration varies based on the mole ule (i.e., based on the number Q p and masses of its atoms), and also eliminates the temperature dependen e of . However, in almost every appli ation of interest (see Dis ussion, below), the absolute free energy al- ulated here ultimately will be used to estimate a free energy di(cid:27)eren e and eliminate any C◦ artifa ts due to . Oursingle-mole uleformulation,asnoted, allowsfor(cid:16)impli itsolvation(cid:17) usingane(cid:27)e tive Uphys x solvent term in that is solely a fun tion of the internal mole ular oordinates . The 9 ǫ = 60 present al ulationsemploy a simple uniformdiele tri onstant ( ). In our Dis ussion, we address the minor te hni al issues involved with using a more realisti impli it solvent model. 3N − 6 One issue of dimensionality is worth emphasizing. Although there are internal N oordinates for a mole ule onsisting of atoms, the integral of Eq. (2) has dimensionality 3N −3 N −1 of length to the power . This is be ause bond lengths remain in the full set x of internal oordinates , ea h of whi h ontributes three powers of length. Put another way, of the six ex luded rigid-body/ enter-of-mass oordinates, the three orientation angles 8π2 are dimensionless; more spe i(cid:28) ally, the angles integrate to the fa tor of in luded in the C◦ de(cid:28)nition of Staging the free energy al ulation As illustrated in the example of Se . IIB, we will al ulate the free energy in a series of stages. These an be understood most easily by adding and subtra ting the free energies k orresponding to intermediate models, Fphys = Fphys −F +(F −F )+···+ F −Fref +Fref k k k−1 1 (3) (cid:0) (cid:1) (cid:0) (cid:1) = ∆F +∆F +···+∆F +Fref , k→phys k 1 (4) ∆F = F − F F [U ] j j j−1 j j where and is de(cid:28)ned in analogy to Eq. (2) for the intermediate U U j j models de(cid:28)ned by . The potentials will be spe i(cid:28)ed below. All free energy di(cid:27)eren e al ulations will be performed here using the (cid:16)perturbation(cid:17) 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.