Multi-Layer Free Energy Perturbation Ying-Chih Chiang and Frank Otto Department of Physics, Chinese University of Hong Kong, Shatin, N.T., Hong Kong (Dated: January 31, 2017) Abstract Free energy perturbation (FEP) is frequently used to evaluate the free energy change of a bio- 7 1 logical process, e.g. the drug binding free energy or the ligand solvation free energy. Due to the 0 2 sampling inefficiency, FEP is often employed together with computationally expensive enhanced n a sampling methods. Here we show that this sampling inefficiency, which stems from not accounting J 0 for theenvironmental reorganization, is an inherent property of the single-ensemble ansatz of FEP, 3 and hence simply prolonging the MD simulation can hardly alleviate the problem. Instead, we ] h p propose a new, multi-ensemble ansatz – the multi-layer free energy perturbation (MLFEP), which - m allows environmental reorganization processes (relaxation) to occur automatically during the MD e h sampling. Our study paves the way toward a fast but accurate free energy calculation that can be c . s employed in computer-aided drug design. c i s y h Accurately evaluating the free energy and β = 1/k T, with the Boltzmann fac- B p [ change of a ligand binding to its receptor has tor denoted by k and temperature by T. B 1 v a very practical use in computational drug The term u denotes the perturbation intro- 0 9 design, i.e. determining the relative bind- duced to the initial reference state, and its 4 8 ingfreeenergiesbetween twodrugcandidates value is given by the potential difference be- 0 . for lead- optimization. One of the most fre- tween the target state T and the reference 1 0 quently employed method for this purpose is state, u = U − U . Finally, the symbol 7 T R 1 the so-called free energy perturbation (FEP) h···i represents that the canonical ensem- : R v i method [1], which states that the free energy ble average is performed over the reference X r change between the final target state T and state R. In other words, the sampling is per- a the initial reference state R can be evaluated formed using the Hamiltonian of the refer- via a single ensemble average, i.e. ence state. Similarly, one can also sample the target state T for calculating ∆A, this leads e−β∆A = he−βui , (1) R to the so-called backward FEP calculation, i.e. eβ∆A = heβui . Although Eq. 1 is the- T where ∆A denotes the free energy change 1 oretically exact, numerically evaluating the lower the system (free) energy. Clearly, the ensemble average often suffers from a prob- nuclear motion in the electronic decay pro- lemofthesamplinginefficiency. Whileplenty cess is always governed by the corresponding of methods, e.g. stratification (multi-step Hamiltonianofaspecific electronic state[16]. FEP) [2, 3], confine-and-release method [4– Similarly, in the classical molecular dynam- 6], or replica-exchange molecular dynamics ics, the molecular motion is also governed (with solute tempering) [7–12], have been de- by the Hamiltonian of the simulated system. veloped to improve the sampling efficiency The only difference is that the classical sys- and hence advance the FEP convergence, the temisdescribedbyNewtonianmechanics[26] current computationalcostofusing enhanced with force fields [27]. sampling methods combined withFEPisstill Let us now consider a common illustrative rather prohibitive to be regularly applied in example in free energy calculations, namely, drug design [13, 14]. Hence, further pursu- the ligand solvation process. According to ing an accurate but fast free energy method Eq. 1, collecting the ensemble governed by is still desirable. the Hamiltonian of reference state R (lig- Previously we have shown that the in- and and water solvent are separated) is suf- sufficient sampling comes from missing the ficient for correctly evaluating ∆A. How- environmental reorganization [15], e.g. al- ever, as illustrated in Fig. 1, the two end lowing the water to move or reorient to ac- states can have very different potential en- commodate the inserted ligand (perturba- ergylandscapessothattheirassociatedprob- tion). This process is a type of relaxation ability distributions center at different ge- process, which is well studied in gas phase re- ometries, as indicated by the dotted curves actions. For instance, consider the quantum in Fig. 1. Consequently, when sampling the nuclear dynamics [16–18] during the inter- distribution via MD simulation in order to atomic/intermolecular Coulombic decay pro- sample all possible conformations of the ref- cess(ICD)[19–24], intheneondimer [17,25]. erence state R, one faces the sampling inef- After introducing a strong perturbation to ficiency because the relevant microstates be- the system (ionizing an inner valence elec- longing to the target state T are generally tron on Ne), the system quickly responds to missed, leading to a non-converged free en- this perturbation by emitting one electron on ergyresult. Thisproblemcanbesolvedbyin- the neighboring Ne, resulting in a Ne+-Ne+ troducing the reorganization process (relax- state that undergoes Coulomb explosion to ation) into the sampling procedure by start- 2 mational space but continue using Eq. 1 to evaluate ∆A. Rather, we believe that the y sit n insufficient sampling problem is an inherent T R e D ntial ob. property such that the best way to solve it e r ot P is to use a different working equation than P Eq. 1. Coordinate Does such a new equation, which allows FIG. 1: Local potential trap and the the system to relax automatically during the insufficient sampling problem. The reference simulation, exist? Exploiting the fact that state R and the target state T have their own potential landscapes (solid curves) that confine e−β∆A is a constant under the given NVT en- the sampled probability distributions (dotted semble, further imposing one additional sam- curves). When any conformation that belongs to the distribution of R is placed on the pling over a normalized distribution will not landscape of T, it will move according to the change its value, e.g. he−β∆Ai = e−β∆A, as Hamiltonian of T, leading to the reorganization T process. long as the sampling is sufficient. Hence we have, ing at the same conformation as reference state R but performing the MD simulation e−β∆A = hhe−βui i , (2) R T based on the Hamiltonian of target state T, where the definitions of all symbols are iden- see e.g. the orange dotted curve in Fig. 1. tical with Eq. 1. In Eq. 2, we further im- While this idea may not be so familiar to the posed the sampling over the distribution of nativebiophysicssociety, itsquantumversion target state T, which does not affect ∆A, is regularly performed in studying gas phase since its value is already determined at the molecular dynamics involving multiple elec- samplingofthereferencestateR.WhileEq.2 tronic states [16–18, 28, 29]. Furthermore, seems to introduce more effort in MD sam- our approach is very different from contem- pling to evaluate ∆A, this equation actually porary enhanced sampling schemes, e.g. in- allows the environmental reorganization. Let creasing temperature to overcome the poten- us explain. When evaluating Eq. 2, one first tialbarrier, addingabiasing potential toflat- performs a short equilibrium sampling to col- tenthepotential landscape, oreven using the lect the microstates that belongs to state T, “adiabatic” potential (black curve) for sam- and then from each microstate (each frame pling [30]. These schemes focus on forcing of the collected trajectory) one performs an the MD sampling to explore a larger confor- 3 MD sampling using the Hamiltonian of state to distinguish it from the virtual substitution R to evaluate the free energy change within scan(VSS)[15,31]whichispurelybasedona this simulation. Thus, each microstate of single-ensemble approach but also has a dual state T gives one e−β∆A that will later par- sampling format. ticipate in the ensemble average over state T.Interestingly, foreachmicrostate, thesam- pling now always begins at a non-equilibrium [1] R. W. Zwanzig, J. Chem. Phys. 22, 1420 high energy conformation. This conforma- (1954). tion will then undergo a relaxation process [2] J. P. Valleau and D. N. Card, J. Chem. automatically due to the governing Hamil- Phys. 57, 5457 (1972). tonian, and hence the sampling is more ef- [3] A. Pohorille, C. Jarzynski, and C. Chipot, ficient than waiting for rare events to hap- J. Phys. Chem. B 114, 10235 (2010). pen. For practical purposes, Eq. 2 can also [4] S. Boresch, F. Tettinger, M. Leitgeb, and beexpressed inareversedsamplingformthat M. Karplus, J. Phys. Chem. B 107, 9535 reads, (2003). eβ∆A = heβ∆Ai = hheβui i . (3) [5] H.-J. Woo and B. Roux, Proc. Natl. Acad. R T R Sci. USA 102, 6825 (2005). This new format describes the process in [6] D. L. Mobley, J. D. Chodera, and K. A. Fig. 1: start the sampling under the refer- Dill, J. Chem. Theory Comput. 3, 1231 ence state R, and then introduce the envi- (2007). ronmental reorganization via the relaxation [7] Y. Sugita and Y. Okamoto, Chem. Phys. process governed by the target state T. One Lett. 314, 141 (1999). additional advantage of Eq. 3 is that we can [8] Y. Sugita, A. Kitao, and Y. Okamoto, J. now assign a common reference state R and Chem. Phys. 113, 6042 (2000). save the trajectory for evaluating ∆A be- [9] P. Liu, B. Kim, R. A. Friesner, and B. J. tween the reference state and different tar- Berne, Proc. Natl. Acad. Sci. USA 102, get states. This can further save some com- 13749 (2005). putational effort. Finally, since Eqs. 2-3 al- [10] L. Wang, R. A. Friesner, and B. J. Berne, ready go beyond the usual FEP theory, we J. Phys. Chem. B 115, 9431 (2011). will termournewapproachasthemulti-layer [11] S. L. C. Moors, S. Michielssens, and free energy perturbation (MLFEP), in order A. Ceulemans, J. Chem. Theory Comput. 4 7, 231 (2011). Phys. 6, 143 (2010). [12] L. Wang, B. J. Berne, and R. A. Fries- [23] K. Gokhberg, P. Kolorenˇc, A. Kuleff, and ner, Proc. Natl. Acad. Sci. USA 109, 1937 L. Cederbaum, Nature 505, 661 (2014). (2012). [24] V. Stumpf, K. Gokhberg, and L. Ceder- [13] L. Wang et al., J. Am. Chem. Soc. 137, baum, Nat. Chem. 8, 237 (2016). 2695 (2015). [25] T. Jahnke et al., Phys. Rev. Lett. 93, [14] N. M. Lin, L. Wang, R. Abel, and D. L. 163401 (2004). Mobley,J.Chem.TheoryComput.12,4620 [26] J.C.Phillips,R.Braun,W.Wang, J.Gum- (2016). bart, E. Tajkhorshid, E. Villa, C. Chipot, [15] Y.-C. Chiang, Y. T. Pang, and Y. Wang, R. D. Skeel, L. Kale, and K. Schulten, J. J. Chem. Phys. 145, 234109 (2016). Comput. Chem. 26, 1781 (2005). [16] E. Pahl, H.-D. Meyer, and L. S. Ceder- [27] K. Vanommeslaeghe, E. Hatcher, baum, Z. Phys. D 38, 215 (1996). C. Acharya, S. Kundu, S. Zhong, [17] S. Scheit et al., J. Chem. Phys. 121, 8393 J. Shim, E. Darian, O. Guvench, P. Lopes, (2004). I. Vorobyov, and A. D. M. Jr., J. Comput. [18] Y.-C. Chiang, F. Otto, H.-D. Meyer, and Chem. 31, 671 (2010). L. S. Cederbaum, Phys. Rev. Lett. 107, [28] H. Ko¨ppel, W. Domcke, and L. S. Ceder- 173001 (2011). baum, Adv. Chem. Phys. 57, 59 (1984). [19] L.S.Cederbaum,J.Zobeley, andF.Taran- [29] W. Domcke, D. R. Yarkony, and telli, Phys. Rev. Lett. 79, 4778 (1997). H. Ko¨ppel, eds., Conical Intersections: [20] N. Sisourat, N. V. Kryzhevoi, P. Kolorenˇc, Electronic Structure, Dynamics & Spec- S.Scheit, T.Jahnke, andL.S.Cederbaum, troscopy (World Scientific Publishing Co. Nature Phys. 6, 508 (2010). Pte. Ltd., Singapore, 2004). [21] T. Jahnke et al., Nature Phys. 6, 139 [30] C. D. Christ and W. F. van Gunsteren, J. (2010). Chem. Phys. 126, 184110 (2007). [22] M.Mucke,M.Braune,S.Barth,M.F¨orstel, [31] Y.-C. Chiang and Y. Wang, Biopolymers T. Lischke, V. Ulrich, T. Arion, U. Becker, 105 (2016). A. Bradshaw, and U. Hergenhahn, Nature 5 MD trajectory of the reference state Perturbation s n o i t a l u m i s D M t r o h S