ebook img

Development and testing of a general amber force field PDF

18 Pages·2004·0.31 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Development and testing of a general amber force field

Development and Testing of a General Amber Force Field JUNMEI WANG,1 ROMAIN M. WOLF,2 JAMES W. CALDWELL, PETER A. KOLLMAN, DAVID A. CASE3 1EncysivePharmaceuticalsInc.,7000Fannin,Houston,Texas77030 2NovartisInstitutesforBiomedicalResearch,Basle,WSJ-88.10.14,P.O.Box, CH-4002Basle,Switzerland 3DepartmentofMolecularBiology,TPC15,TheScrippsResearchInstitute, 10550N.TorreyPinesRd.,LaJolla,California92037 Received26August2003;Accepted16February2004 Abstract: We describe here a general Amber force field (GAFF) for organic molecules. GAFF is designed to be compatible with existing Amber force fields for proteins and nucleic acids, and has parameters for most organic and pharmaceuticalmoleculesthatarecomposedofH,C,N,O,S,P,andhalogens.Itusesasimplefunctionalformanda limited number of atom types, but incorporates both empirical and heuristic models to estimate force constants and partialatomiccharges.TheperformanceofGAFFintestcasesisencouraging.IntestI,74crystallographicstructures werecomparedtoGAFFminimizedstructures,witharoot-mean-squaredisplacementof0.26Å,whichiscomparable to that of the Tripos 5.2 force field (0.25 Å) and better than those of MMFF 94 and CHARMm (0.47 and 0.44 Å, respectively). In test II, gas phase minimizations were performed on 22 nucleic acid base pairs, and the minimized structuresandintermolecularenergieswerecomparedtoMP2/6-31G*results.TheRMSofdisplacementsandrelative energieswere0.25Åand1.2kcal/mol,respectively.ThesedataarecomparabletoresultsfromParm99/RESP(0.16Å and1.18kcal/mol,respectively),whichwereparameterizedtothesebasepairs.TestIIIlookedattherelativeenergies of71conformationalpairsthatwereusedindevelopmentoftheParm99forcefield.TheRMSerrorinrelativeenergies (compared to experiment) is about 0.5 kcal/mol. GAFF can be applied to wide range of molecules in an automatic fashion,makingitsuitableforrationaldrugdesignanddatabasesearching. ©2004WileyPeriodicals,Inc. JComputChem25:1157–1174,2004 Keywords: generalAMBERforcefield;additiveforcefield;forcefieldparameterization;restrainedelectrostatic potential(RESP) Introduction The desirability of a general purpose force field to describe a wide variety of organic molecules has been recognized before. Molecular mechanics force fields are a key component under- Several widely used general force fields include MMFF94,9 lying many investigations of the protein–ligand structure for MM3,10MM4,11Tripos5.2,12andtheAMBER*parametersinthe rational drug design and other tasks.1–4 The use of empirical Macromodelprogram.13,14Inaddition,theOPLSforcefields15are parameters enables them (in favorable cases) to model confor- based on functional groups, so that extensions to many organic mational changes and noncovalent interaction energies quite molecules are straightforward. For various reasons, those force accurately.Asuccessfulforcefieldindrugdesignshouldwork fieldshavenotbeenaswidelyusedinstudyingbiologicalsystems well both for biological molecules and the organic molecules as more specifically parameterized force fields such as AMBER, that interact with them. For example, the “Amber” force CHARMM,andOPLS;thelatter,inturn,havenoautomaticway fields5–8wereprimarilydevelopedforproteinandnucleicacid of assigning parameters for arbitrary organic molecules. There is systems. The fact that Amber only has limited parameters for arguably a significant advantage in developing an organics force organic molecules has prevented it from being widely used in fieldthatisconsistentinitsformandparameterizationwiththose drugdesignandotherstudiesofligand–proteinorligand–DNA interactions.Here,wedescribeageneralAmberforcefieldthat works for most of the pharmaceutical molecules, and which is Correspondenceto: D.A.Case;e-mail:[email protected] designed to be as compatible as possible to the traditional Contract/grantsponsor:NIH;contract/grantnumber:GM56531(to Amber protein force fields. P.A.K.andD.A.C.) ©2004WileyPeriodicals,Inc. 1158 Wangetal. • Vol.25,No.9 • JournalofComputationalChemistry usedforproteinsandnucleicacids.Herewelookatextendingthe force field parameterization and on the performance of GAFF in AMBERforcefieldtocovermoreorganicspecies.Thisispossible thethreetestcases. because an extensible strategy was used for development of the Itisworthnotingthattheproceduredescribedhererequiresas biomolecular parameters, and some extension to organics has inputacompletethree-dimensionalstructure(includinghydrogen alreadybeendescribed.8Here,wedescribesuchanextensiontoa atoms)thatiscloseenoughtoanenergyminimumtobeoptimized muchwiderclassofmolecules. by standard semiempirical quantum mechanical methods. Gener- Asimplefunctionalform,suchastheso-called“ClassI”model ating such structures for libraries of potential ligands is itself a of eq. (1), is often adequate to describe the structures and non- nontrivialtask,andisbeyondthescopeofthisarticle. bondedenergiesfororganicandbioorganicsystems: (cid:1) (cid:1) (cid:1) Generation of Parameters v E (cid:1) k(cid:1)r(cid:2)r (cid:2)2(cid:3) k (cid:1)(cid:4)(cid:2)(cid:4) (cid:2)2(cid:3) n pair r eq (cid:4) eq 2 bonds angles dihedrals AtomTypeDefinition (cid:1)(cid:2) (cid:3) (cid:5)(cid:3)1(cid:3)cos(cid:1)n(cid:6)(cid:2)(cid:7)(cid:2)(cid:4)(cid:3) Aij (cid:2)Bij(cid:3)qiqj (1) In molecular mechanics force fields, “similar” chemical environ- R12 R6 (cid:8)R mentsareencodedasatomtypes,andthesedefinitionsarecrucial i(cid:5)j ij ij ij tosuccessandtransferability.Byusingmoreatomtypes,onecan describe subtle chemical environments more accurately, but this Here, r and (cid:4) are equilibration structural parameters; K , K , also leads to a bigger parameterization burden. Therefore, atom eq eq r (cid:4) V areforceconstants;nismultiplicityand(cid:7)isthephaseanglefor types should be used as economical as possible, and new atom n thetorsionalangleparameters.TheA, B, andq parameterschar- types introduced only to significantly improve the force field acterizethenonbondedpotentials. performance. Parameterizationplaysacrucialroleinmolecularmechanics.A For GAFF, we have introduced 35 basic atom types: five good parameter set should reproduce the experimental data not carbon,eightnitrogen,threeoxygen,fivesulfur,fourphosphorus, only for the molecules in a training set, but also for molecules six hydrogen, and one atom type for each of the four common outsideofthetrainingset.Moreover,theparametersetshouldbe halogens;thesearelistedastypes1–35inTable1.Forthesebasic completeandself-consistent.InGAFF,thebasicparameterization atom types, GAFF is a complete force field, which means all the philosophy,whichfollowsideasoutlinedearlierforthe“Parm99” parameters are available or can be estimated according to our parameters,8isasfollows.Forthenonbondedpart,vanderWaals empiricalrules.Atomtypeswerechosenaccordingtothefollow- parameters are inherited from traditional Amber force fields di- ingatomproperties,frommoregeneraltomorespecific:element rectly;partialchargesareassignedusingarestrainedelectrostatic type, hybridization, aromaticity, and chemical environment. For potential fit (RESP) model,16,17 because of its clear physical pic- example, fluorine, chlorine, bromine, and iodine have only one tureandstraightforwardimplementationscheme.Fortheinternal atomtype(elementtype),whereascarbonhasfiveatomtypes:c3 terms[thefirstthreetermsineq.(1)],parameterizationswerefirst (sp3 hybridization), c1 (sp1 hybridization), c (sp2 hybridization, performed on bond lengths and bond angles that are weakly chemicalenvironment,i.e.,bondedtoSorO),ca(sp2hybridiza- coupled to other parts in the energy function. Typically, equilib- tion,aromaticproperty),andc2(sp2hybridization). rium bond lengths and bond angles come from experiment and We have then included another 22 special atom types (Nos. high-levelabinitiocalculations;theforceconstantsareestimated 36–57),asdescribedinTable1.Specificatomtypesweredefined throughanempiricalapproach(presentedbelow)andoptimizedto mainlybasedonthechemicalenvironment.Thespecialatomtypes reproduce experimental and high-level ab initio vibrational fre- are classified into four groups, and were introduced for different quencies. purposes. Unlike the “hard parameters” of bond length and bond angle, GroupI(Nos.53–57)consistsfivehydrogentypes(h1–h5)that torsionalangleparametersaresoft,andarehighlycoupledtothe are attached to carbons in different chemical environments, and nonbondedenergyterms.Torsionalangleparametersareparame- have different van der Waals parameters. This need for multiple terized last, to cover other effects that cannot be considered in a vanderWaalsparametersforhydrogens,dependingonwhatthey simplefunctionalform(suchaspolarization,chargetransfer,and are bonded to, is inherited from AMBER force fields, and its many body effects), in addition to intrinsic bond torsion prefer- justificationhasbeendiscussedearlier.8 ences.Inpractice,torsionalangleparametersarederivedtorepro- Group II (Nos. 39–42) contains cx and cy (sp3 carbons in duce the energy differences of two conformations and rotational three- and four-membered rings), and cu and cv (sp2 carbons in profiles,basedonexperimentalorhigh-levelabinitiodata.Parm- three- and four-membered rings). When the Group II atom types scan, a force field parameterization program developed in our occurinabondangle,theequilibriumparameterisusuallymuch group,18 was extensively used to derive the bond length, bond different from that of the corresponding angle parameter for the angle, and torsional angle parameters in this work. Some of our generalcarbontypes.Forexample,cx–cx–cxhasanequilibrium modelmolecules(suchasNo.90inFig.4),lookstrangebecause bondangleof60°comparedto109.5°forc3–c3–c3.Althoughthe theyarenotstablespecies;however,suchexamplesarenecessary minimizedstructureswithandwithoutspecialatomtypesarequite to make our force field complete (for the basic atom types) and similar (the dominant driving force are bond length parameters, make our empirical rules for deriving missing parameters work which are similar for the special and general atom types), the smoothly.Inthefollowingpartsofthisarticle,detailsaregivenon calculatedvibrationalfrequenciesarequitedifferent. DevelopmentofaGeneralAmberForceField 1159 Table1. AtomTypesandTheirDefinitionsinGAFF. Atom Atom No. type Description No. type Description 1 c sp2carboninCAO,CAS 2 c1 sp1carbon 3 c2 sp2carbon,aliphatic 4 c3 sp3carbon 5 ca sp2carbon,aromatic 6 n sp2nitrogeninamides 7 n1 sp1nitrogen 8 n2 sp2nitrogenwith2subst.,realdoublebonds 9 n3 sp3nitrogenwith3subst. 10 n4 sp3nitrogenwith4subst. 11 na sp2nitrogenwith3subst. 12 nh aminenitrogen connectedtoaromaticrings 13 no Nitrogeninnitrogroups 14 o sp2oxygeninCAO,COO(cid:6) 15 oh sp3oxygeninhydroxylgroups 16 os sp3oxygeninethersandesters 17 s2 sp2sulfur(pAS,CAS,etc.) 18 sh sp3sulfurinthiolgroups 19 ss sp3sulfurin—SRandS—S 20 s4 hypervalentsulfur,3subst. 21 s6 hypervalentsulfur,4subst. 22 p2 sp2phosphorus(CAP,etc.) 23 p3 sp3phosphorus,3subst. 24 p4 hypervalentphosphorus,3subst. 25 p5 hypervalentphosphorus,4subst. 26 hc hydrogenonaliphaticcarbon 27 ha hydrogenonaromaticcarbon 28 hn hydrogenonnitrogen 29 ho hydrogenonoxygen 30 hs hydrogenonsulfur 31 hp hydrogenonphosphorus 32 f anyfluorine 33 cl anychlorine 34 br anybromine 35 i anyiodine 36 cc(cd) innersp2carboninconjugatedring 37 ce(cf) innersp2carboninconjugatedchain systems systems 38 cp(cq) bridgearomaticcarboninbiphenyl 39 cu sp2carboninthree-memberedrings systems 40 cv sp2carboninfour-memberedrings 41 cx sp3carboninthree-memberedrings 42 cy sp3carboninfour-memberedrings 43 nb aromaticnitrogen 44 nc(nd) innersp2nitrogeninconjugatedring 45 ne(nf) innersp2nitrogeninconjugatedchain systems,2subst. systems,2subst. 46 pb aromaticphosphorus 47 pc(pd) innersp2phosphorusinconjugatedring systems,2subst. 48 pe(pf) innersp2phosphorusinconjugatedchain 49 px conjugatedphosphorus,3subst. systems,2subst. 50 py conjugatedphosphorus,4subst. 51 sx conjugatedsulfur,3subst. 52 sy conjugatedsulfur,4subst. 53 h1 hydrogenonaliphaticcarbonwith1 electron-withdrawalgroup 54 h2 hydrogenonaliphaticcarbonwith2 55 h3 hydrogenonaliphaticcarbonwith3 electron-withdrawalgroups electron-withdrawalgroups 56 h4 hydrogenonaromaticcarbonwith1 57 h5 hydrogenonaromaticcarbonwith2 electron-withdrawalgroup electron-withdrawalgroups Nos.1–35arethebasicatomtypesandNos.36–55arethespecialatomtypes. InGroupIII(Nos.36,37,44,45,47,48),cc(cd),nc(nd),and av of11.2kcal/mol,whereasthece—ceorcf—cfsinglebonds 2 pc(pd)areinnersp2atomsinconjugatedringsystems;ce(cf),ne have a v of 5.6 kcal/mol. Figure 1 shows some examples of 2 (nf), and pe (pf) are inner sp2 atom in conjugated chain systems; moleculeshavingatomtypesinthisgroup. and cg (ch) are inner sp carbons in conjugated systems. Because In Group IV (Nos. 38, 43, 46, 49–52), cp (cq) is the bridge both cc and cd are inner sp2 carbons in conjugated ring systems, atoms of two aromatic rings. A rule similar to that for conju- they are assigned according to a simple rule: both cc—cc and gated systems is also applied for cp(cq): both cp—cp and cd—cd are conjugated single bonds, and cc—cd is conjugated cq—cq are single bond and other types (cp—cq, ca—cp, ca— double bond. Other pairs of atom types in Group III behave in a cq) are aromatic bonds. Atom types nb and pb are aromatic similar fashion: bonds between the same type are single bonds, nitrogen and phosphate; sx and sy are sulfur in conjugated whereasbondsbetweendifferenttypesaredoublebonds.GroupIII sulfoxide and sulfone systems; px and py are phosphorus in atoms are designed to describe alternating conjugated systems, conjugated phosphite and phosphate systems. Figure 1 shows without adding an explicit concept of bond order. The bond pa- some examples of molecules having atom types in Group IV. rametersandtorsionalpotentialsofthetwobondtypesarediffer- TheatomtypesinGroupIVhavesimilarfunctionasatomtypes ent.Forexample,c2—ceorc2—cforce—cfdoublebondshave inGroupIII.Forexample,withoutpy,itwouldbeverydifficult 1160 Wangetal. • Vol.25,No.9 • JournalofComputationalChemistry to discriminate single vs. double p—p bonds in compounds 24 and 25 in Figure 1. Bond type information (such as bond orders) can be very helpful in classifying and discriminating among similar chemical environments. For various reasons, many force fields, including AMBER,onlyapplyatomtypeinformation,anddonotseparately nameorkeeptrackofbondordersortypes.Tobeconsistentwith theexistingAMBERforcefieldsandcodes,wehaveusedthesets of identical atom type pairs described above (cc/cd, cp/cq, ce/cf, etc.) instead of explicit bond orders to discriminate conjugated/ aromatic single and double bonds. It is notable that although our scheme works for most of the molecules, there are still some specialmoleculesthatcannotbeproperlyhandled.Wethinkthat mostofthefailureshappentoconjugated/aromaticringsattached tolargealiphaticrings[10 (cid:7) 4n (n (cid:8) 0, 1, 2) memberedrings]. Figure1(c)liststwoexamplesofthekindofmoleculesforwhich ourcurrentschemewouldfail.Inourexperience,suchfailuresare onlyrarelyencountered,butfutureextensionsoftheGAFFforce fieldwillhavetoconsiderthesesortsofmolecules. Wehavedevelopedanatom-typeperceptionprogram,whichis partoftheantechambersuiteofAmber,toassigntheatomtypes described here, based only on an input geometry. Details of the algorithmsinvolvedwillbepresentedinaseparatearticle.19 Charges To accurately fit conformational and nonbonded energies in a transferable fashion, one should choose consistent charge ap- proach.Therestrainedelectrostaticpotential(RESP)16,20atHF/6- 31G*isthedefaultchargeapproachappliedintheAmberprotein force fields. Although RESP is expensive compared to empirical schemessuchasGasteigercharges,ithasmanydesirablefeatures, andallowsonetousefewertorsionaltermsthanmightotherwise berequired.8Ithasworkedwellintestsofsmallmolecules21,22as well as proteins. This is the default charge scheme in GAFF parameterization. Unfortunately, the fact that this charge scheme needs to run ab initio optimization at the HF/6-31G* level has preventeditfrombeingwidelyusedinhandlinglargenumbersof molecules. In this situation, one may apply an alternative charge scheme called AM1-BCC (bond charge correction),23,24 which is much cheaper than HF/6-31G* RESP. The basic idea of AM1- BCC is to first carry out a semiempirical AM1 calculation to get Mullikencharges,followedbyabondchargecorrectionschemeto obtainresultsthatarecompatiblewithRESPcharges.Weusethe BCCparametersderivedbyJakalianetal.,24whicharedesignedto make AM1-BCC charges match the electrostatic potential at the HF/6-31G*level. Figure 1. Example molecules that elucidate the definitions of atom types introduced in GAFF. (a) basic atom types; (b) special atom types;(c)examplesoffailedmoleculesthatcannotbeproperlyhan- dledwithouratomtypescheme.InI(b),unmarkedaromaticcarbonin No.11–15haveanatomtypeof“ca”;inI(c),atomtypesthatcauses failurearemarkedwithbolditalicfont. DevelopmentofaGeneralAmberForceField 1161 Figure2. Least-squaresfitsofbondstretchingforceconstants(K )andequilibriumbondlengths(r).For r eachbondtypebroadlydefinedbyelements,nisthenumberofdata,misthenegativeofslopeandcis theintercept.(a)C—Cbonds(n (cid:8) 25, m (cid:8) 4.96, c (cid:8) 7.82); (b)C—Nbonds(n (cid:8) 24, m (cid:8) 3.99, c (cid:8) 7.34); (c) C—O bonds (n (cid:8) 7, m (cid:8) 3.98, c (cid:8) 7.20). Lennard–JonesParameters BondParameters ThevanderWaalsparametersofGAFFareassameasthoseused GAFF uses three sources of information about equilibrium bond bytheAmberparm94orparm99forcefields.Thereisonlyoneset lengths r : the Amber protein force fields, ab initio calculations eq of parameters (the internuclear separation of the ij pair at the (MP2/6-31G*), and crystal structures. Most of the experimental potential minimum, r*, and the potential well, (cid:8) ) for each ele- data comes from mean values of r obtained from X-ray and ij ij eq ment except hydrogen. Therefore, we believe the van der Waals neutrondiffraction.25,26Datawerecarefullyclassifiedbasedonthe parameters can be reliably transferred to new introduced atom atom types, and the most precise mean values from these three typesinGAFF. compilationswerechosenasther inGAFF. eq 1162 Wangetal. • Vol.25,No.9 • JournalofComputationalChemistry Table2. ComparisonsofVibrationalFrequencies. Scaled Experimental Abinitio abinitio GAFF frequencies frequencies frequencies frequencies No Compoundname Vibrationalmode (cm(cid:6)1) (cm(cid:6)1) (cm(cid:6)1) (cm(cid:6)1) 1 H H—Hstretch 4401 4401 2 2 F F—Fstretch 916 917 2 3 Cl Cl—Clstretch 560 559 2 4 Br Br—Brstretch 325 326 2 5 I I—Istretch 214 215 2 6 CH—NH—NH—CH N—N—Csymbend 482 458 394 3 3 7 N—Nstretch 857 814 868 8 C—Nstretch 1152 1094 1174 9 CH—O—O—CH C—Otorsion 273 259 273 3 3 10 O—O—CsymBend 489 464 405 11 O—OsymStretch 820 779 891 12 C—OsymStretch 1094 1039 1011 13 C—Ostretch(cid:7)C—H 1274 1210 1175 asymstretch 14 CH—PH—PH—CH P—P—CsymBend 256 243 295 3 3 15 P—Pstretch 463 440 477 16 C—Pstretch 703 668 775 17 CH—S—S—CH S—S—CsymBend 240 246 234 223 3 3 18 S—S—CasymBend 272 275 261 255 19 S—Sstretch 509 512 486 490 20 S—CasymStretch 691 746 708 706 21 S—Csymstretch 694 748 710 711 AbinitiofrequencieswerecomputedattheMP /6-311G(cid:7)(d,p)levelandthenscaleddownby0.9496assuggestedby 2 ScottandRadom(Scott,A.P.;Radom,L.JPhysChem1996,100,16502).Experimentalfrequenciesaretakenfrom ref.9. As does MMFF 94,9 GAFF applies an empirical rule to esti- throughlinearleast-squaresfitting.Weperformedsuchfittingsfor mate the missing parameters of bond length force constants. C—C(25data),C—N(24data),andC—O(7data)bondsusing MMFF94appliedaninversesixth-powerdependenceinBadger’s theAmberbondlengthparametersastrainingset.TakingC—Oas formulatorelatethedesiredforceconstantstotabulatedreference anexample,7outof11C—Ouniquebondlengthparametersare values: represented by the following bond length types: C—O, C—O2, (cid:4) (cid:5) C—OH,C2—OH,C2—OS,C3—OH,CT—OS. K (cid:1)Kref rrijef 6 (2) Figure2showstheresults,withregressioncoefficientsmof5.0 r ij r forC—C,and4.0forC—NandC—O.Basedontheseresults,a ij mean value of m of 4.5 was chosen. The three ln(K ) constants ij BecausethefunctionalformofGAFFisdifferentfromthatof werethenrefitusingthefinalvalueofm. MMFF94,weexperimentedwithamoregeneralpowerlaw: ToderivetheK parametersforX—X,(X(cid:8)H,F,Cl,Br,I,N, ij (cid:4) (cid:5) O,andP),wehavedesignedsomemodelmoleculesandperformed 1 m high-level ab initio vibrational frequencies analysis, as described K (cid:1)K (3) r ij r in the Methods section. The force constants were then optimized ij usingParmscantoreproducethevibrationfrequencies.Forsulfur, where K ofS—Swasestimatedusingbondlengthparametersofdisul- ij fidesintheAmberproteinforcefields.[Ifoneknowsr andK , ij r K (cid:6)rref(cid:2)rref(cid:6)(cid:3)K (cid:6)rref(cid:2)rref(cid:6) thenK canbecomputedaccordingtoeq.(3).]Table2liststheab K (cid:1) ii ij jj jj ij ii (4) ij ij (cid:6)rref(cid:2)rref(cid:6)(cid:3)(cid:6)rref(cid:2)rref(cid:6) initioaswellasGAFFvibrationalfrequenciesfortheeightmodel ij jj ij ii molecules and CH SSCH . For the 21 frequencies, the unsigned 3 3 Here, m is the new power order for GAFF; K is the calculated average error (UAE) and the root-mean-square error (RMSE) r force constant; r is the actual bond length; K is empirical betweenabinitioandGAFFare37and48cm(cid:6)1,respectively.If ij ij parameterofelementi andj, anditisequivalenttoKreftimesthe the comparisons were made to scaled ab initio frequencies, the ij sixthpowerofrrefineq.(2).m andln(K ) canbeparameterized UAEandRMSEare48and70cm(cid:6)1. ij ij DevelopmentofaGeneralAmberForceField 1163 Table3. ParameterstoEstimatetheBondStretchingForceConstants Table4. ParameterstoEstimatetheBondAngleBendingForce inGAFF. ConstantsinGAFF. No. i j rref lnK Element C Z ij ij 1 H H 0.738 4.661 H — 0.784 2 C C 1.526 7.643 C 1.339 1.183 3 N N 1.441 7.634 N 1.300 1.212 4 O O 1.460 7.561 O 1.249 1.219 5 F F 1.406 7.358 F — 1.166 6 Cl Cl 2.031 8.648 Cl — 1.272 7 Br Br 2.337 9.012 Br — 1.378 8 I I 2.836 9.511 I — 1.398 9 P P 2.324 8.805 P 0.906 1.620 10 S S 2.038 8.316 S 1.448 1.280 11 H C 1.090 6.217 12 H N 1.010 6.057 13 H O 0.960 5.794 14 H F 0.920 5.600 15 H Cl 1.280 6.937 Finally,forceconstantswerecalculatedfor103bondlengthsin 16 H Br 1.410 7.301 AMBER force field using the above model. The average percent 17 H I 1.600 7.802 error is 1.6%, which shows a good consistency with the Amber 18 H P 1.410 7.257 bondstretchingparametersthatwereempiricallyfittovibrational 19 H S 1.340 7.018 spectraofpeptidefragments.Table3liststhefinal47K param- 20 C N 1.470 7.504 ij 21 C O 1.440 7.347 eters. This strategy of using the force constants in the AMBER 22 C F 1.370 7.227 protein force field as a references has the obvious limitation that 23 C Cl 1.800 8.241 theresultsaredependentuponthequalityofthepreviousparam- 24 C Br 1.940 8.478 eterization; on the other hand, it helps to ensure that GAFF is 25 C I 2.160 8.859 consistent with the AMBER parameters. Further studies are 26 C P 1.830 8.237 planned, aimed at improving both small molecule and protein 27 C S 1.820 8.117 values. 28 N O 1.420 7.526 29 N F 1.420 7.475 30 N Cl 1.750 8.266 31 N Br 1.930 8.593 32 N I 2.120 8.963 33 N P 1.720 8.212 34 N S 1.690 8.073 35 O F 1.410 7.375 36 O Cl 1.700 8.097 37 O Br 1.790 8.276 38 O I 2.110 8.854 39 O P 1.640 7.957 40 O S 1.650 7.922 41 F Cl 1.648 7.947 42 Cl I 2.550 9.309 43 Br I 2.671 9.380 44 F P 1.500 7.592 45 F S 1.580 7.733 46 Cl P 2.040 8.656 47 Cl S 2.030 8.619 48 Br P 2.240 8.729 49 Br S 2.210 8.728 Figure 3. Principle of torsional angle parameterization using Parm- 50 I P 2.490 9.058 scan:redandbluecurvesarerotationalprofilesproducedbyMP4/6- 51 I S 2.560 9.161 311G(d,p)andGAFF,respectively.ThemodelcompoundisMolecule 52 P S 2.120 8.465 1showninFigure4(a)andthetorsionalangleinquestionisX–c–c–X. For this molecule, the ab initio torsional angle scanning was carried outfrom0to330°withastepof30°.WithaV termof1.2(phase 2 angleis180°andmultiplicityis4)kcal/mol,thetwocurvesarewell matched,andtheRMSDofrelativeenergiesis0.24kcal/mol.[Color figure can be viewed in the online issue, which is available at www.interscience.wiley.com] Figure4 DevelopmentofaGeneralAmberForceField 1165 AngleParameters out at the MP4/6-311G(d,p)//MP2/6-31G* level; then Parmscan wasappliedtoderivetorsionalanglepotentialV toreproducethe GAFFusesthesamethreesourcesofinformation(previousforce i abinitiorotationalprofile.ThebasicideawasshowninFigure3. fields,MP2/6-31G*calculations,andcrystaldata)forequilibrium The red curve is the MP4/6-311G(d,p) rotational profile and the bondanglesasforbondlengths.Experimentaldatamainlycomes blueoneisthemolecularmechanicalprofile.AV termwithforce fromref.25.Datafromallthreeresourceswerecarefullyclassified 2 constantof1.2(phaseangleis180°andmultiplicityof4)makes based on the atom types, and the most precise mean values were the MM curve match the ab initio one perfectly except at range chosenfor(cid:4)eq. (cid:6)15to15°. It is a very heavy burden to work out all the bond angle Intotal,wehavederived200torsionalangleparametersusing parameters. If all the combinations are considered, there are this strategy. Figure 4 shows the 200 model molecules and the roughly 20,000 bond angle parameters only for the basic atom torsionalangleusedforscanningwascoloredinred.Table5lists types, and most of them are not available. We have found an the unsigned average error as well as the RMS error for each empirical approach that works well in most situations. For bond torsional angle parameters. Overall, the average UAE and RMS angle (cid:4) , the equilibrium bond angle is taken as the mean A–B–C error of 200 torsional angles are 0.54 and 0.72 kcal/mol, respec- valueof(cid:4) and(cid:4) .Therefore,ifwehaveparameterized A–B–A C–B–C tively. allbondanglesoftheform(cid:4) , (cid:4) ofanybondanglecanbe A–B–A eq Tomakeparametersmoretransferable,itisnotagoodideato estimated from this simple rule. In total we have carried out includemoretermsjustforthepurposeofreducingUAEandRMS minimizationson1800modelmoleculesattheMP2/6-31G*level, error.Typically,mostofthe200parametershaveonlyonegeneral withanaimtogettheequilibriumangleof(cid:4) inallkindsof A–B–A term, either V or V or V . Special torsional angle parameters combinations. 1 2 3 were added only when UAE and RMS can be significantly re- As with bond lengths, GAFF applies empirical formulas to duced. estimate bond angle force constants. A similar formula was also used by MMFF 94 to estimate the missing bond angle force constants. Test Cases K(cid:4) (cid:1)143.9ZCZ(cid:1)req(cid:3)req(cid:2)(cid:6)1(cid:1)(cid:4)eq(cid:2)(cid:6)2exp(cid:1)(cid:6)2D(cid:2) (5) ijk i j k ij jk ijk TheperformanceofGAFFisillustratedherewiththreetestcases. TestIwasdesignedtotesthowwellGAFFpredictsthemolecular (cid:1)req(cid:2)req(cid:2)2 D(cid:1) ij jk (6) structures,andTestsIIandIIIwereusedtotesthowwellGAFF (cid:1)req(cid:3)req(cid:2)2 ij jk predictstheinter-andintramolecularenergies. Here,Z andZ areempiricalparametersforthefirstandthethird i k TestI:MolecularStructures atomsinaangle;C isempiricalparameterforthesecondatomin j aangle,reqandreqaretheequilibriumbondlengths,and(cid:4)eq isthe ij jk ijk In this test, we have performed minimizations for 74 molecules equilibrium bond angle. Table 4 lists the parameters of C and Z that have crystallographic structures. Those molecules were also derivedusing252bondangleparametersinAMBERforcefields. studiedbyotherwidelyusedforcefieldsuchasMMFF94,Tripos Withthismodel,wehavecalculatedbondangleforceconstants 6.8, etc. None of the 74 molecules were in the training set to for252bondanglesintheAmberproteinforcefield.Theaverage parameterize GAFF. Minimization were performed using the percent error is 9.0%. One should realize that bond angle force AM1-BCCschemebothingasphase(dielectricconstants(cid:8)(cid:8)1) constants in Amber force fields are really simple. Typically for and a medium with (cid:8) (cid:8) 4. The minimized structures were than bondangle(cid:4) ,ifAandCarebothhydrogenatoms,theforce A–B–C comparedtothecrystalones.Table6liststheroot-mean-squareof constantisroughly30–35kcal/mol(cid:1)rad(cid:6)2;ifAorCishydrogen, displacement of the two structures, the root-mean-square devia- the force constant is roughly 50 kcal/mol (cid:1) rad(cid:6)2; in other cases, tionsofbondlengthsandbondangles. the force constant is roughly 70 kcal/mol (cid:1) rad(cid:6)2. As with the For gas phase dielectric constant, a RMS displacement of comparableparameterizationofbondlengthforceconstantsabove, 0.256 Å was achieved. This performance is comparable to that ourproceduremainlyhelpstoensureaconsistentpatternofbond of Tripos 5.2 force field12 (0.25 Å) and better than those of angle distortion energies, compared to the AMBER protein force MMFF 94 and CHARMm (0.474 and 0.438 Å, respectively).9 field. Theroot-mean-squaredeviationofbondlengthandbondangle were 0.023 Å and 2.201°, which are comparable to those of TorsionalAngleParameters MMFF 94 (0.021 Å and 1.97°, respectively) and better than InGAFF,torsionalangleparameterizationswereperformedusing those of Tripos 5.2 (0.025 Å and 2.5°) and DREIDING (0.035 thefollowingstrategy.First,torsionalanglescanningwascarried Å and 3.22°, respectively).9 MostofthelargedeviationshappeninmoleculescontainingS andP.WefoundtheatomicchargesofSandPseemtoolargein Figure4. Modelmoleculesforusedtoderiveangleparametersusing GAFF. Some molecules (Compounds 6, 66, and 68) give poor Parmscan. The torsional angles in question are colored in red. (a) geometries, presumably because of so strong an electrostatic in- Molecule1–30;(b)Molecule31–60;(c)Molecule61–90;(d)Mole- teraction in the gas phase. However, applying a larger dielectric cule91–120;(e)Molecule121–145;(f)Molecule146–175;(g)Mol- constantcanscaledowntheelectrostaticinteraction,andwefound ecule176–200. adielectricconstantof4workswellforsuchsituations(datanot 1166 Wangetal. • Vol.25,No.9 • JournalofComputationalChemistry Table5. TorsionAngleParameterizations. No.of No.of No. Torsionalangle conf. UAE RMSE No. Torsionalangle conf. UAE RMSE 1 X–c–c–X 14 0.1649 0.2394 47 X–ce–sy–Xor 7 0.5964 0.8844 2 X–c–na–X 7 1.6509 2.6026 X–cf–sy–X 3 X–c–ne–Xor 7 2.3056 2.783 48 X–n–n–X 7 1.2296 1.7180 X–c–nf–X 49 X–n–n2–X 7 1.4559 1.9685 4 X–c–no–X 7 0.1759 0.2486 50 X–n–n3–X 7 0.6168 0.9310 5 X–c–p3–X 7 0.3915 0.5918 51 X–n–n4–X 7 0.3400 0.4919 6 X–c–pe–Xor 7 0.8514 1.1017 52 X–n–na–X 7 0.5733 0.7916 X–c–pf–X 53 X–n–nh–X 6 0.7868 0.8980 7 X–c–px–X 7 0.071 0.0969 54 X–n–no–X 7 0.7202 0.8854 8 X–c–py–X 3 0.2970 0.3883 55 X–n–oh–X 7 1.5664 1.8988 9 X–c–sh–X 7 0.3682 0.5417 56 X–n–os–X 7 1.0857 1.2498 c3–c–sh–hs 57 X–n–p2–X 7 0.4518 0.5484 10 X–c–ss–X 7 0.5854 0.7267 58 X–n–p3–X 7 1.0356 1.5099 11 X–c–sx–X 7 1.4433 1.7906 59 X–n–p4–X 5 0.3747 0.4949 12 X–c–sy–X 5 0.3786 0.4713 60 X–n–p5–X 5 0.3143 0.4094 13 X–c2–n–X 7 0.8837 1.2102 61 X–n–s4–X 7 0.8008 1.0519 c–n–c2–c2 62 X–n–s6–X 7 0.8440 1.1573 14 X–c2–n4–X 7 0.8027 1.018 63 X–n–sh–X 7 0.6584 0.8607 15 X–c2–na–X 5 0.3833 0.6314 64 X–n–ss–X 7 0.7577 0.9103 16 X–c2–nh–X 7 0.3964 0.4654 65 X–n2–n3–X 7 1.1041 1.5665 17 X–c2–no–X 7 0.205 0.2525 66 X–n2–n4–X 5 0.2192 0.2728 18 X–c2–p3–X 7 1.4872 2.2239 67 X–n2–na–X 7 0.2445 0.2949 19 X–c2–sh–X 7 0.3988 0.5041 68 X–n2–nh–X 7 1.0014 1.2859 20 X–c2–ss–X 7 0.4856 0.6310 69 X–n2–no–X 7 0.2609 0.2963 c2–c2–ss–c3 70 X–n2–oh–X 7 0.3967 0.4742 21 X–c3–p2–X 7 0.4531 0.5784 71 X–n2–os–X 7 0.6858 0.8952 22 X–c3–p3–X 5 0.4177 0.5743 72 X–n2–p3–X 6 0.7170 0.8680 23 X–c3–p4–X 4 0.0814 0.0982 73 X–n2–sh–X 7 0.5857 0.8611 24 X–c3–p5–X 4 0.2564 0.348 74 X–n2–ss–X 7 0.7068 0.9469 25 X–c3–s4–X 7 0.0228 0.0324 75 X–n3–n3–X 6 0.6755 0.8119 26 X–c3–s6–X 7 0.0407 0.0527 76 X–n3–n4–X 6 0.3593 0.4909 27 X–ca–n–X 7 0.0877 0.1050 77 X–n3–na–X 7 0.3112 0.3781 28 X–ca–n3–X 7 0.2633 0.3230 78 X–n3–nh–X 7 0.6448 0.7711 29 X–ca–n4–X 5 0.3677 0.56 79 X–n3–no–X 3 0.6028 0.7623 30 X–ca–na–X 7 0.3222 0.4074 80 X–n3–oh–X 7 0.7214 0.8752 31 X–ca–ne–Xor 7 1.0585 1.3077 81 X–n3–os–X 7 0.8600 1.0886 X–ca–nf–X 82 X–n3–p2–X 7 0.7801 1.1569 32 X–ca–nh–X 7 0.1854 0.2691 83 X–n3–p3–X 7 0.4412 0.5675 33 X–ca–no–X 7 0.098 0.1572 84 X–n3–p4–X 7 0.7148 1.3278 34 X–ca–p3–X 6 0.0485 0.0696 85 X–n3–p5–X 7 1.0947 1.3306 35 X–ca–pe–Xor 7 0.3187 0.4453 86 X–n3–s4–X 7 0.7126 0.8621 X–ca–pf–X 87 X–n3–s6–X 4 0.1986 0.2672 36 X–ca–px–X 6 0.1520 0.2532 88 X–n3–sh–X 7 1.1239 1.6974 37 X–ca–py–X 7 0.1082 0.1520 89 X–n3–ss–X 7 1.0435 1.3692 38 X–ca–sh–X 7 0.1724 0.2463 90 X–n4–n4–X 7 0.5497 0.66430 39 X–ca–ss–X 7 0.0679 0.0943 91 X–n4–na–X 6 0.0845 0.1140 40 X–ca–sx–X 7 0.9675 1.1595 92 X–n4–nh–X 7 0.3832 0.4680 41 X–ca–sy–X 7 0.1924 0.2154 93 X–n4–no–X 7 0.5606 0.7750 42 X–ce–ne–Xor 7 0.7232 1.018 94 X–n4–oh–X 7 0.4798 0.5384 X–cf–nf–X 95 X–n4–os–X 7 0.7807 0.9072 43 X–ce–pe–Xor 7 0.4658 0.7620 96 X–n4–p2–X 7 0.3801 0.4473 X–cf–pf–X 97 X–n4–p3–X 7 0.3761 0.4583 44 X–ce–px–Xor 3 0.5512 0.7518 98 X–n4–p4–X 7 0.0245 0.0344 X–cf–px–X 99 X–n4–p5–X 7 0.2688 0.3755 45 X–ce–py–Xor 5 0.5660 0.7028 100 X–n4–s4–X 7 1.0123 1.3855 X–cf–py–X 101 X–n4–s6–X 7 0.7639 1.2253 46 X–ce–sx–Xor 7 0.6322 1.0069 102 X–n4–sh–X 7 0.4657 0.5475 X–cf–sx–X 103 X–n4–ss–X 7 0.3767 0.4407

Description:
Abstract: We describe here a general Amber force field (GAFF) for organic molecules . parameters are inherited from traditional Amber force fields di-.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.