The Amber Biomolecular Simulation Programs DAVID A. CASE,1 THOMAS E. CHEATHAM, III,2 TOM DARDEN,3 HOLGER GOHLKE,4 RAY LUO,5 KENNETH M. MERZ, JR.,6 ALEXEY ONUFRIEV,7 CARLOS SIMMERLING,8 BING WANG,1 ROBERT J. WOODS9 1DepartmentofMolecularBiology,TheScrippsResearchInstitute,10550NorthTorreyPines Raod,TPC15,LaJolla,California92037 2DepartmentsofMedicinalChemistry,Pharmaceutics&PharmaceuticalChemistry,and Bioengineering,2000East,30South,SkaggsHall201,UniversityofUtah,SaltLakeCity, Utah84112 3NationalInstituteofEnvironmentalHealthSciences,LaboratoryofStructuralBiology,Mail DropF3-02,POBox12233ResearchTrianglePark,NorthCarolina27709 4FachbereichBiologieundInformatik,J.W.Goethe-Universita¨t,Marie-Curie-Str.9, 60439Frankfurt/Main,Germany 5DepartmentofMolecularBiologyandBiochemistry,UniversityofCalifornia, 5.3206NaturalScienceI,Irvine,California92697 6DepartmentofChemistry,PennsylvaniaStateUniversity,UniversityPark,Pennsylvania16802 7DepartmentofComputerScience,VirginaTech,600McBryde(MC0106)Blacksburg, Virginia24061 8DepartmentofChemistry,StonyBrookUniversity,StonyBrook,NewYork11794 9ComplexCarbohydrateResearchCenter,UniversityofGeorgia,315RiverbendRd., Athens,Georgia30602 Received7January2005;Accepted5March2005 DOI10.1002/jcc.20290 PublishedonlineinWileyInterScience(www.interscience.wiley.com). Abstract: Wedescribethedevelopment,currentfeatures,andsomedirectionsforfuturedevelopmentoftheAmber package of computer programs. This package evolved from a program that was constructed in the late 1970s to do Assisted Model Building with Energy Refinement, and now contains a group of programs embodying a number of powerful tools of modern computational chemistry, focused on molecular dynamics and free energy calculations of proteins,nucleicacids,andcarbohydrates. ©2005WileyPeriodicals,Inc. JComputChem26:1668–1688,2005 Keywords: Amber;biomolecularsimulationprograms Introduction icalforcefieldsthatareimplementedhere.Itshouldberecognized, however,thatthecodeandforcefieldsareseparate;severalother Moleculardynamicssimulationsofproteins,whichbeganabout25 computerpackageshaveimplementedtheamberforcefields,and yearsago,arebynowwidelyusedastoolstoinvestigatestructure otherforcefieldscanbeusedwithintheAmberprograms. and dynamics under a variety of conditions; these range from A history of Amber’s early development was presented about studies of ligand binding and enzyme reaction mechanisms to 10 years ago;2 here we give an overview of more recent efforts. problems of denaturation and protein refolding to analysis of Our goal is to provide scientific background for the simulation experimental data and refinement of structures. “Amber” is the techniques that are implemented in the Amber programs and to collectivenameforasuiteofprogramsthatallowsuserstocarry illustratehowcertaincommontasksarecarriedout.Wecannotbe out and analyze molecular dynamics simulations, particularly for proteins, nucleic acids and carbohydrates. None of the individual programscarriesthisname,butthevariouspartsworkreasonably Correspondenceto: D.A.Case;e-mail:[email protected] welltogether,providingapowerfulframeworkformanycommon Contract/grantsponsor:NIH;contract/grantnumber:RR12255(to calculations.1Thetermambersometimesalsoreferstotheempir- D.A.C.) ©2005WileyPeriodicals,Inc. AmberBiomolecularSimulationPrograms 1669 others. For example, the NAMD simulation program8 can take advantage of Amber tools and force fields by virtue of knowing how to interpret the information in Amber’s prmtop files; simi- larly, the VMD graphics tools9 can read our trajectory format to prepare animations. The toolkit from the Multiscale Modeling Tools for Structural Biology (MMTSB) project10 can be used to control many aspects of Amber simulations (see Fig. 1). Other users have written tools to improve upon LeAP’s preparation interface,sometimesusingaportionoftheLEaPcodeandsome- timesbypassingitentirely. Ofcourse,therearealsodisadvantagesinthecodefragmenta- tion implied by Figure 1. There is generally no consistent user interfaceforthevariouscomponents,whichmakesitmoredifficult to learn. Furthermore, there is no (easy) way for code in one sectiontomodifytheresultsofanothersection.Forexample,atom types or charges (which are established in the preparation phase) cannot be modified as the simulation proceeds; similarly, the Figure1. InformationflowintheAmberprogramsuite. simulationcannotdecidewhichtrajectorydatatoarchivebasedon the sort of analysis to be done, because it does not have any information about that. Despite these limitations, breaking up of exhaustivehere(theUsers’Manualis310pageslong!),butwedo the code into distinct pieces has generally served the user and trytogiveasenseofthetrade-offsthatareinevitablyinvolvedin developercommunitieswell. maintainingalargesetofprogramsassimulationprotocols,force fields,andcomputerpowerrapidlyevolve.Therearecertaintasks PreparationPrograms thatAmbertriestosupportwell,andtheseareourfocushere.We hope that this account will help potential users decide whether The main preparation programs are antechamber (which assem- Ambermightsuittheirneeds,andtohelpcurrentusersunderstand blesforcefieldsforresiduesororganicmoleculesthatarenotpart whythingsareimplementedthewaytheyare. ofthestandardlibraries)andLEaP(whichconstructsbiopolymers It should be clear that this is not a primer on biomolecular from the component residues, solvates the system, and prepares simulations, for there are many excellent books that cover this lists of force field terms and their associated parameters). The subject at various levels of detail.3–7 More information about result of this preparation phase is contained in two text files: a Amberitself,includingtutorialsandaUsers’Manual,isavailable coordinate (prmcrd) file that contains just the Cartesian coordi- athttp://amber.scripps.edu. natesofallatomsinthesystem,andaparameter-topology(prm- top) file that contains all other information needed to compute energies and forces; this includes atom names and masses, force Overview of Amber Simulations field parameters, lists of bonds, angles, and dihedrals, and addi- tional bookkeeping information. Since version 7, the prmtop file Amberisnotasingleprogram,butisratheracollectionofcodes hasbeenwritteninanextensibleformatthatallowsnewfeatures thataredesignedtoworktogether.Theprincipalflowofinforma- oruser-suppliedinformationtobeincludedifrequired. tionisshowninFigure1.Therearethreemainsteps,showntopto LEaPincorporatesafairlyprimitiveX-windowgraphicsinter- bottominthefigure:systempreparation,simulation,andtrajectory face that allows for visual checking of results and for some analysis.Encodingtheseoperationsinseparateprogramshassome interactivestructuremanipulation.Mostusers,however,findother important advantages. First, it allows individual pieces to be up- programsbettersuitedtoinspectionofPDBfilesandconstruction graded or replaced with minimal impact on other parts of the of initial coordinates, and primarily use LEaP in a text mode programsuite;thishashappenedseveraltimesinAmber’shistory. (tLEaP) to assemble the system from a “clean” PDB-format file Second, it allows different programs to be written with different that contains acceptable starting coordinates. The nucleic acid codingpractices:LEaPiswritteninCusingX-windowlibraries, builder (NAB) program integrates well with Amber to construct ptrajandantechamberaretext-basedCcodes,mm-pbsaisimple- initialmodelsfornucleicacids,11andtheAmber/GLYCAMcon- mented in Perl, and the main simulation programs are coded in figuratortool(http://glycam.ccrc.uga.edu/Amber)servesasimilar Fortran 90. Third, this separation often eases porting to new purpose for carbohydrates. Tools for manipulating protein struc- computingplatforms:onlytheprincipalsimulationcodes(sander tures(e.g.,forconstructinghomologymodels)arewidespread,and and pmemd) need to be coded for parallel operation or need to theresultingPDB-formatfilescangenerallybeprocessedbyLEaP know about optimized (perhaps vendor-supplied) libraries. Typi- withlittleornomodification. cally, the preparation and analysis programs are carried out on localmachinesonauser’sdesktop,whereastime-consumingsim- SimulationPrograms ulation tasks are sent to a batch system on a remote machine; having stable and well-defined file formats for these interfaces The main molecular dynamics program is called sander; as with facilitates this mode of operation. Finally, the code separation “Amber,”theoriginalacronymisnolongerverydescriptive.This showninFigure1facilitatesinteractionwithprogramswrittenby codeiswritteninFortran90,andusestheFortrannamelistsyntax 1670 Caseetal. • Vol.26,No.16 • JournalofComputationalChemistry toreaduser-definedparametersaslabel-valuepairs.Asonemight AnalysisPrograms imagine,therearemanypossibleoptions,andabout150possible The task of analyzing MD trajectories faces two main obstacles. input variables. Of these, only 32 (identified in boldface in the First, trajectory files may become very large, and the total time Users’Manual)generallyneedtobechangedformostsimulations. coursemayneedtobeassembledfrompiecesthatwerecomputed Asmuchaspossible,thedefaultoptionshavebeenchosentogive in different runs of the simulation program. Second, the types of goodqualitysimulations. analysesthatcanbecarriedoutarequitevaried,andchangewith Sanderisaparallelprogram,usingtheMPIprograminginter- time, as simulation science progresses and as new ideas are de- face to communicate among processors. It uses a replicated data veloped. The ptraj analysis program (see http://www.chpc. structure, in which each processor “owns” certain atoms, but utah.edu/(cid:1)cheatham/software.html) was designed with these ob- where all processors know the coordinates of all atoms. At each stacles in mind, although it provides only a partial resolution to step, processors compute a portion of the potential energy and them. It can process both Amber and CHARMM trajectories, correspondinggradients.Abinarytreeglobalcommunicationthen parsing their respective prmtop or psf files to atom and residue sums the force vector, so that each processor gets the full force namesandconnectivity,andcanassembletrajectoriesfrompartial vector components for its “owned” atoms. The processors then ones,oftenstrippingoutparts(suchassolvent)thatmightnotbe performamoleculardynamicsupdatestepforthe“owned”atoms, needed for a particular analysis. After this, a variety of common anduseasecondbinarytreetocommunicatetheupdatedpositions analysis tasks may be carried out; some of these are illustrated to all processors, in preparation for the next molecular dynamics step.Detailsofthisprocedurehavebeengivenelsewhere.2,12 below.Thecommandsyntaxisdesignedtoallowuserstoaddnew tasks,butthisrequiresknowledgeofCprogramming,andisnotas Because all processors know the positions of all atoms, this straightforward as it might be. But it does provide a powerful modelprovidesaconvenientprogrammingenvironment,inwhich framework for adding new commands, and the ptraj repertoire thedivisionofforce-fieldtasksamongtheprocessorscanbemade continuestogrow;recentadditionsincludeavarietyofclustering inavarietyofways.Themainproblemisthatthecommunication algorithmsandvariantsofprincipalcomponentanalysis. required at each step is roughly constant with the number of Our eventual goal is that ptraj will support analyses based on processors, which inhibits parallel scaling. In practice, this com- energiesaswellasstructures,butthisisnotthecaseinthecurrent municationoverheadmeansthattypicalexplicitsolventmolecular codes.Rather,avarietyofprogramsareusedtoestimateenergies dynamics simulations do not scale well beyond about eight pro- andentropiesfromthesnapshotscontainedwithintrajectoryfiles. cessorsforatypicalclusterwithgigabitethernet,orbeyond16–32 The calculations are organized and spawned by a Perl script, clusters for machines with more efficient (and expensive) inter- mm-pbsa, which also collects statistics and formats the output in connection hardware. Implicit solvent simulations, which have tabularform.Asitsnamesuggests,theanalysisisprimarilybased manyfewerforcesandcoordinatestocommunicate,scalesignif- oncontinuumsolvationmodels.13–15 icantly better. For these relatively small numbers of processors, inequitiesinload-balancingandserialportionsofthecodearenot limiting factors, although more work would have to be done for OverallStrengthsandWeaknesses largerprocessorcounts. To improve performance, Bob Duke has prepared an exten- An overall view of our perception of Amber’s strengths and sivelyrevisedversionofsander,calledpmemd,whichcommuni- weaknesses is given in Table 1. In brief, the suite is good at catestoeachprocessoronlythecoordinateinformationnecessary carrying out and analyzing “standard” MD simulations for pro- for computing the pieces of the potential energy assigned to it. teins, nucleic acids, and carbohydrates, using either explicit sol- Many other optimizations were also made to improve single- vent with periodic boundary conditions or an implicit solvent processor performance. This code does not support all of the model. Our aim is to make it easy for users to carry out good- optionsfoundinsander,buthasasignificantperformanceadvan- qualitysimulations,andtokeepthecodesuptodateasstandards tageforthemostcommonlyusedsimulationoptions.Somepmemd and expectations evolve. However, some desirable options are performance numbers may be found at http://amber.scripps.edu/ missing, and changing the way in which the calculations are amber8.bench2.html. A similar communications strategy is being performed can require the user to understand and modify a core addedtodevelopmentversionsofsanderbyMikeCrowley. codethathasvaryingstandardsofreadability.Aswepreparenew The normal mode analysis code, nmode, is by now quite old, versionsofthecode,ourgoalistoamelioratesomeoftheweak- but is still useful for certain types of analysis. It is limited to nesses,andtocontinuetoincorporatenewsimulationtechniques, nonperiodicsimulationsand,asapracticalmatter,tosystemswith especially those that accelerate convergence of conformational fewer than 3000 atoms. Its primary use now is to compute esti- sampling,orwhichallowustouseforcefieldsthathaveagreater mates of thermodynamic quantities (in particular, vibrational en- underlyingphysicalrealism. tropies) for configurations extracted from molecular dynamics No program can hope to do all tasks well, and subjective simulations,asinthemm-pbsaschemediscussedbelow.Itsorig- choicesusuallyneedtobemadeaboutwhichcapabilitiesaremost inalpurpose,tocomputevibrationalpropertiesofsmallmolecules needed,andwhichonescanbesupportedbythedevelopers.These used in force-field parameterization, is still relevant as well. The choices may be expected to change with time, as computers code supports the Amber polarizable potentials, but not (yet) the becomemorepowerful,andasalgorithmsevolveandexperience generalized Born model; second derivatives of GB energies are is gained about which sorts of calculations provide the greatest available in the NAB program (see http://www.scripps.edu/case/ amountofphysicalrealism.Overtheyears,Amberhasdiscarded nab.html), which has a GB parameterization identical to that of anumberoffeaturesthatwereonceviewedasimportantpartsof Amber. itsfeatureset.Somecommentsaboutwhatwenolongersupport AmberBiomolecularSimulationPrograms 1671 Table1. StrongandWeakPointsoftheAmberBiomolecularSimulationPrograms. Strengths Weaknesses Amberimplementsefficientsimulationswithperiodicboundary Onecannotdogoodsimulationsofjustpartofasystem,suchasthe conditions,usingthePMEmethodforelectrostaticinteractions activesiteofanenzyme:stochasticboundaryconditionsforthe andacontinuummodelforlong-rangevanderWaalsinteractions. water-continuuminterfacearemissing,asareefficientmeansfor handlinglong-rangeelectrostaticsandareactionfield. Non-periodicsimulationsaresupported,usingageneralizedBornor Thecomponentprogramslackaconsistentuserinterface;thereis numericalPoisson-Boltzmannimplicitsolventmodel. onlylimitedscriptingcapabilitytosupporttypesofcalculations notanticipatedbytheauthors. Explicitsupportisprovidedforcarbohydratesimulations,aswellas Thereislimitedsupportforforcefieldsotherthanthosedeveloped forproteins,nucleicacidsandsmallorganicmolecules. byAmbercontributors. Free-energycalculationsusethermodynamicintegrationorumbrella Missingfeaturesinclude:“dualtopology”freeenergycalculations, samplingtechniques,andarenotlimitedtopairwisedecomposable reaction-pathanalysis,MonteCarlosampling,torsionangle potentials. dynamics,andinteractivesteeredmoleculardynamics. Convergenceaccelerationcanuselocally-enhancedsamplingor QM/MMsimulationsarelimitedtosemiempiricalHamiltonians,and replicaexchangetechniques. cannotcurrentlybecombinedwiththePMEorgeneralizedBorn solvationoptions. Thereisaextensivesupportfortrajectoryanalysisandenergetic Thecodeswerewrittenbymanyauthorsovermanyyears,andmuch post-processing. ofitisdifficulttounderstandormodify. Restraintscanbeveryflexible,andcanbebasedonmanytypesof Efficientparallelscalingbeyondaboutadozenprocessorsmay NMRdata. requiredaccesstospecialhardwareortheadoptionofanimplicit solventmodel. Thereisalargeandactiveusercommunity,plustutorialsanda Usersarerequiredtocompiletheprogramsthemselves,anditcanbe User’sManualtoguidenewusers.Thesourcecodeisportable tedioustoassembletheneededcompilersandlibraries. andisavailableforinspectionandmodification. (orsupportwell)mayhelpusersbetterknowwhattoexpectfrom classicalmolecularmechanicspotentialsallowtheenergytobe Amber.Someofthefeaturesthatarenowmissingarethese: writtenassumsoftermsthatprimarilydependupontwoatoms atatime.Freeenergycalculationsthatinvolvechangestoonly 1. “Vacuum”simulations:Amberisstronglyaimedatsimulations a small part of the system (such as a single amino acid side ofbiomoleculesinwater.Althoughitisstillpossibletousethe chain) can exploit this feature by calculating only the small codes in situations where there is no solvent environment, we number of terms that change when accumulating free energy no longer implement this capability in a manner that is as differences. This efficiency comes at the expense of some efficient or robust as in earlier versions of the code. The complexbookkeeping,andinanyeventbreaksdownwhenthe generalized Born (or numerical Poisson–Boltzmann) contin- energyisnotwritteninthispairwiseform;thelatter,however, uum solvent models allow for more realistic (albeit more ex- isthecaseforGBorPBtheories,forpolarizablepotentials,and pensive) simulations for cases where an explicit treatment of forthe“reciprocalspace”portionofPMEsimulations.Because solvation is not appropriate. This lack of full support for sol- thesemoremoderntechniquesaretheoneswerecommend,we vent-free simulations can be a genuine limitation for cases have dropped development of the gibbs module that imple- wherecoarse-grained,statistical,orother“effective”potentials mented free energy calculations for pairwise potentials. With areappropriate,andfutureversionsofAmbermayreintroduce perhapslessjustification,wehavealsodroppedsupportforfree someofthisolderfunctionality. energy perturbation (FEP) calculations, in favor of thermody- 2. Simulations with nonbonded cutoffs: simulations that ignore namicintegration(TI)techniquesthatare(inmostcases)more long-rangenonbondedinteractions(particularforelectrostatics) robust and efficient, and are certainly easier to program and often exhibit biased behavior and artifacts that are difficult to debug. predict or correct for. For many types of simulations, it is feasible to include all long-range electrostatics by means of a particle-meshEwald(PME)scheme(discussedbelow),witha Force Fields for Biomolecular Simulations simulation cost that is comparable to (or even less than) schemes that do include cutoffs. Given this, Amber neither It has long been recognized that the accuracy of the force field needs nor implements cutoff-based switching or smoothing model is fundamental to successful application of computational functions. In the generalized Born model, electrostatic effects methods. The Amber-related force fields are among the most aremuchlesslongranged(especiallywhenanonzeroconcen- widely used for biomolecular simulation; the original 1984 arti- trationofaddedsaltismodeled),andAmbercanusecutoffsfor cle16iscurrentlythe10thmost-citedinthehistoryoftheJournal this situation, although recommended cutoffs are still fairly of the American Chemical Society, and the 1995 revision17 has large(ca.15Å). beenthatjournal’smost-citedarticlepublishedinthelastdecade. 3. Freeenergysimulationsforpairwise-decomposablepotentials: Butthiswidespreadusealsomeansthatsomesignificantdeficien- 1672 Caseetal. • Vol.26,No.16 • JournalofComputationalChemistry ApplicationstoCarbohydrates In eukaryotes, the majority of proteins are glycosylated, that is, theyhavecarbohydratescovalentlylinkedtotheirsurfaces,either throughtheamidonitrogenatomofasparaginesidechains(known asN-linkedglycosylation)orthroughthehydroxyloxygenatoms ofserineorthreoninesidechains(O-linked).Thesemodifications serveamultitudeofroles,spanningthepurelystructural,suchas protecting the protein from proteolytic degradation, to the func- tional,suchasenablingcelladhesionthroughcarbohydrate-protein binding.Computationalmethodscanassistintheinterpretationof Figure2. Illustrationof3ofthe10possibledisaccharidesgenerated otherwise insufficient experimental data, and can provide models from linking two (cid:1)-D-glucopyranosyl residues ((cid:1)-D-glcp): (a) (cid:1)-D- for the structure of oligosaccharides and insight into the mecha- glcp-(136)-(cid:1)-D-glcp (iso-maltose), (b) (cid:1)-D-glcp-(134)-(cid:1)-D-glcp nismsofcarbohydraterecognition. (maltose),and(c)(cid:1)-D-glcp-(131)-(cid:1)-D-glcp((cid:1),(cid:1)-trehalose).Confor- Oligosaccharides frequently populate multiple conformational mation-determiningglycosidictorsionanglesareindicatedin(a). families arising from rotation about the glycosidic linkages (see Fig. 2). As a result, they do not generally exhibit well-defined tertiary structures. A simulation time scale that may be adequate forestablishingtheperformanceofaforcefieldforfoldedproteins cies are known, especially in terms of the relative stabilities of cannotbeexpectedtobeappropriateforoligosaccharides,whose (cid:1)-helical vs. extended conformations of peptides.18–20 There are conformationallifetimesareontheorderof5–10ns.Inthisregard, good recent reviews of the Amber force fields for proteins21 and oligo- and polysaccharides behave more like peptides, and force fornucleicacids.22Thesediscusssomeofthetrade-offsthatwere fieldvalidationmustbebasedonpropertiescomputedfromstruc- madeinconstructingtheforcefields,andprovidecomparisonsto turalensembles. other potentials in common use. For this reason, we will not The GLYCAM force field introduced to Amber all of the summarize this information here. Rather, we will focus on two featuresthatwerenecessaryforcarbohydrateconformationalsim- more recent aspects of the Amber force fields: applications to ulations, with the key focal points being treatment of glycosidic carbohydrates,andthefacilitiesfordefiningamodelinwhichpart torsionanglesandnonbondedinteractions.28Manyvalenceterms, of the system is treated as a quantum subsystem, embedded in a withtheexceptionofthosedirectlyassociatedwiththeanomeric (generally)largerenvironmentdescribedbymolecularmechanics carbon atom [notably R(C1–O5), R(C1–O1) and (cid:2)(O5–C1–O1)], forcefields. were taken from the parm94 parameter set, as were all van der Amber also supports a more generic force field for organic Waalsterms.Theforceconstantsforthenewlyintroducedvalence molecules, called GAFF (the general Amber force field).23 The terms were derived by fitting to quantum data at the HF/6-31G* antechamberprogramtakesathree-dimensionalstructureasinput, level. and automatically assigns charges, atom types, and force field Toaddresstheelectrostaticpropertiesuniquetoeachmonosac- parameters. Given the wide diversity of functionality in organic charide, as fully as possible in a nonpolarizable framework, all molecules, it is not surprising that the resulting descriptions are releases of GLYCAM have employed residue-specific partial sometimes not optimal, and “hand-built” force fields are often atomic charges. In versions of GLYCAM up to and including requiredfordetailedstudies.Nevertheless,thetablesthatdrivethe GLYCAM04,thesechargeswerecomputedbyfittingtothequan- antechamberprogramcontinuebeimprovedaswegainexperience tummechanicalelectrostaticpotentialcomputedattheHF/6-31G* withtheirweakpoints.TheprimaryapplicationsofaroftheGAFF level for the methyl glycoside of each residue in each anomeric potentials has been to the analysis of complexes of small mole- configuration.Forexample,thepartialchargesontheatomsinthe cules with proteins or nucleic acids, especially for the automated glucopyranose methyl (cid:1)-D-glcp are distinct from those on the analysisofdiverselibrariesofsuchligands.TheQM/MMfacility atomsinmethyl(cid:3)-D-glcp(seeFig.3). described below can also be used to describe fairly arbitrary Similarly, the structures and partial charges in glucopyranose organicmoleculesinthepresenceofareceptordescribedinterms are distinct from those in mannopyranose (manp) and galactopy- ofmolecularmechanicspotentials. It is worth noting that the next generation of force fields for proteins and nucleic acids is likely to be significantly more com- plex than the ones we use today. These coming implementations will certainly have some representation of electronic polarizabil- ity,24–26andarealsolikelytoincludefixedatomicmultipolesor off-center charges to provide more realistic descriptions of the electrondensity.Theremayalsobemorecomplexdescriptionsof internaldistortions(especiallyforthepeptidegroup27)andterms beyond Lennard–Jones 6–12 interactions to represent exchange– Figure3. Numberingandanomericconfigurationin(a)methyl(cid:1)-and repulsion and dispersion. A key goal for future development of (b)methyl(cid:3)-D-glucopyranoside.InGLYCAM,anomericcarbonatom Amber is to support efficient and parallel simulations that track C1isatomtypeACin(a)andatomtypeECin(b);inGLYCAM04 thesenewdevelopments. bothareatomtypeCG. AmberBiomolecularSimulationPrograms 1673 ranose(galp)etc.Duetothepotentialforrotationsofthehydroxyl tionhasbeenassociatedwiththemoregeneralgaucheeffect.32In groups to influence the electrostatic properties, beginning in thegasphaseitarisesprimarilyfromhydrogenbondingbetween 200029 the partial charges were no longer computed from the thevicinalhydroxylgroups;however,thereisasmallcontribution neutrondiffractionstructuresofthemethylglucosides,butfroman fromhyperconjugationthatisfurtherstabilizing.Inthecondensed ensembleof100configurationsextractedfromsolvatedMDsim- phase,itsoriginismorecomplex.33Insolution,inpyranosidesin ulationsoftheglycosides.Thechargeswerecomputedatthesame whichbothC6andO4areequatorial(mostnotablyingluco-and quantum level as earlier, consistent with the philosophy of AM- mannopyranose),theO6–C6–C5–O5torsionangleadoptsnearly BER,butwerebasedontheHF/6-31G*optimizedgeometriesof exclusivelyoneortheotherofthetwogaucheorientations,known theensembleofconformations.Itshouldbenotedthatthetorsion as gauche–gauche (gg) or gauche–trans (gt); labels that refer to anglesoftheexocyclicgroupswererestrainedduringthegeometry theorientationoftheO6–C6–C5–O5andO6–C6–C5–C4angles, optimizationsinthesolvationpreferredorientations.Althoughthe respectively. Remarkably, when O4 is axial (as in galactopyr- charges were derived at the HF/6-31G* level, the fitting was anose)allthreerotamersarepopulated,withasignificantamount performed with a larger RESP restraint weight (0.01) than that ofthetrans(tg)rotamerbeingobserved.34Toaddresstheseissues employed in fitting the charges for the amino acids in AMBER muchcomputationalfocushasbeengiventotheO–C–C–Olink- (0.001).Simulationsofthecrystallatticesofmonosaccharidesled age.35–37 In gas phase quantum calculations it is only when in- to the inescapable conclusion that the HF/6-31G* ESP charges tramolecular hydrogen bonding is disallowed that the rotational were too polar and required the larger damping afforded by the energycurvesagreeevenqualitativelywiththeexperimentaldata higherrestraintweight.30Althoughaffectingthestrengthsofdirect (seeFig.4).37Inacombinedquantumandsimulationstudyitwas hydrogen bonds, this damping has negligible effect on molecular concluded that the gauche effect in carbohydrates arose solely dipolemomentsandensuinglong-rangeelectrostatics. fromtheattenuationofinternalhydrogenbondingbyinteractions Asinthecaseofpeptides,theinterresiduerotationalproperties withexplicitsolvent.37 (thepositionsofminima,theirrelativeenergies,andinterconver- ToparameterizethispropertyintotheGLYCAMforcefieldit sion barriers) have a direct influence on the 3D structure and was necessary to ensure that the quantum mechanical rotational dynamicsofanoligosaccharide.Inthemajorityofcarbohydrates propertiesoftheO–C–C–Olinkage,inthepresenceandabsenceof these rotational properties are associated with only three atomic internalhydrogenbonding,couldbereproduced.Thispresenteda linkages,emanatingfromtheanomericcarbonatom(Fig.2). more serious challenge than might first be expected, largely be- The first of these is known as the (cid:4)-angle, and has been the causeofthedifficultyinachievingabalancebetweenthestrengths subjectofextensivetheoreticalandexperimentalexaminationbe- of the internal nonbonded energies associated with interactions causeofitsuniquerotamericproperties.Inpyranosides,the(cid:4)-an- betweenO6,O5,andO4.Inpyranosides,theO6...O4interaction gle refers to the orientation of the exocyclic O5–C1–O1–Cx se- isa1–5type,whiletheO6...O5isa1–4.Yet,geometrically,the quence. The hyperconjugation associated with the O–C–O relevantinteratomicdistances,bothinthepresenceandabsenceof sequence imparts a preference for this linkage to adopt gauche internal hydrogen bonding are similar for each case. Thus, the orientations, rather than populating all three potential rotamers. practiceinAMBERofdampingthemagnitudeof1–4non-bonded This preference is known as the exo-anomeric effect.31 The pop- interactions relative to all others by applying a scale factor (of ulation of the gauche rotamers differs in (cid:1)- and (cid:3)-pyranosides between 0.5–0.83) introduced an artificial imbalance in the because of further steric effects introduced by the pyran ring strengths of the O6 ... O4 and O6 ... O5 interactions in carbo- system.Forthisreason,inGLYCAMtherotationalpropertiesof hydrates.ItwasonlywhentheGLYCAMparameterswerederived the(cid:4)-anglewereincorporatedwithaseparatetorsiontermfor(cid:1)- intheabsenceof1–4scalingthatquantitativeagreementwiththe and(cid:3)-pyranosides.Eachtermwasindividuallyderivedbyfitting experimental solution rotamer populations for the (cid:6)-angle could to the rotational energy curves for 2-methoxytetrahydropyran (2- beachieved.37 axial serving as a model for (cid:1)-pyranosides and 2-equatorial for Althoughadequateforagreatvarietyofcarbohydratesystems, (cid:3)-pyranosides). All torsion terms associated with the (cid:4)-angle, thepresenceinGLYCAMofauniqueatomtypefortheanomeric otherthanthosefortheO5–C1–O1–Cxsequence,weresettozero. carbon in (cid:1)- and (cid:3)-glycosides meant that it was not possible to Thisapproachnecessitatedtheintroductionoftwonewatomtypes simulateprocesses,suchasringflipping,inwhichthesubstituents for the anomeric carbon in pyranosides, one for use when the attheanomericcenterchangedconfiguration.Thisisirrelevantto linkagewaspresentinanequatorialconfiguration(atomtypeEC, mostpyranosides,whichpopulateonlyonechairforminsolution, generallycorrespondingto(cid:3)-pyranosides)andoneforaxial(AC, but some (like the monosaccharide idopyranose, present in hepa- generallycorrespondingto(cid:1)-). rin)existasanequilibriumbetweenthe4C and1C chairforms. 1 4 Thesecondimportanttorsionangle,the(cid:5)-angle,isassociated Further, many carbohydrate processing enzymes are believed to with the C1–O1–Cx–Hx sequence and does not exhibit any pro- distortthe4C chairsoastolowertheactivationenergy.Tostudy 1 foundstereoelectronicpropertiesthatareuniquetocarbohydrates. these systems it was necessary to derive a parameter set that As such, it was treated with the parameters applicable to ether employed the same atom type for the anomeric carbon in all linkagesofthistype. pyranosides. This was the motivation behind the development of The most problematic carbohydrate-specific torsion term for GLYCAM04.Recallingthatnewanomericcarbonatomtypeshad hexopyranosesisthatassociatedwiththeexocyclicrotationofthe been introduced to facilitate parameterization of the rotational hydroxymethyl group. This O6–C6–C5–O5 linkage is known as (cid:4)-angleenergycurves,itwastheseparametersthatwereagainthe the(cid:6)-angle,andplaysacrucialroleindefiningthe3Dstructures focus of the parameter development. To achieve adequate repro- ofanyoligosaccharidecontaininga16linkage.Thetendencyfor ductionofthequantumrotationalcurves,itwasnolongerpossible the O–C–C–O sequence to preferentially adopt a gauche orienta- to employ only torsion term for the O5–C1–O1–Cx linkage. To 1674 Caseetal. • Vol.26,No.16 • JournalofComputationalChemistry Figure 4. HF/6-31G(d), B3-LYP/6-31(cid:2)(cid:2)G(2d,2p), and GLYCAM (cid:6)-angle rotational curvesformethyl(cid:1)-D-glucopyranoside,without(a)andwith(b)internalhydrogenbonding; and methyl 6-O-methyl-(cid:1)-D-glucopyranoside, without (c) and with (d) internal hydrogen bonding. employasingleatomtypeforC1inGLYCAM04,itwasnecessary ducedthatgreatlyfacilitatesthepreparationofthefilesnecessary to introduce torsion terms for each of the relevant linkages asso- forrunningMDsimulationswithAMBERofproteins,glycopro- ciated with the (cid:4)-angle, namely for the O5–C1–O1–Cx, C2–C1– teins,andoligosaccharides.Furtherintergrationofthisfacilityinto O1–Cx, and H1–C1–O1–Cx linkages. Unambiguous partitioning the Amber codes is planned, as is continued work on testing and of the rotational energies into contributions from each of these improvementoftheGLYCAMforcefields. sourcesnecessitatedadetailedstudyoftheunderlyingsequences in model structures. This undertaking ultimately led to a com- TheQM/MMApproach pletelynewsetofvalenceandtorsionterms,eachfittoquantum data, largely now at the B3LYP/cc-pVTZ level. Partial atomic Prior to AMBER 8, the QM/MM module was called Roar. This chargesremainascomputedearlier.Inthecourseofthedevelop- coupledanearlierversionoftheAmberenergyminimization/MD mentofGLYCAM04,itwasdecidedtoremovealldefaulttorsion module with Mopac 6.0. As a part of modernization efforts, we parameters and introduce quantum-derived torsion terms for all wanted to merge more things into the sander program, and we linkagesfoundincarbohydrates.Thisisaparadigmshiftthathas recognized that semiempirical technology had advanced well be- many advantages; most notable being that it results in a transfer- yondwhatwasavailableinMopac6.0,particularlyintheareaof ableandgeneralizableforcefieldthatenablestheintroductionof linear-scaling approaches. Because the theory and application of functionalgroupsandchemicalmodificationsintoacarbohydrate, the QM/MM approach38,39 has been extensively reviewed,40–42 without the need to develop a large number of new parameters. weonlygiveabriefdescriptionofourapproach. However,foragivenlinkage,thesumofthegeneraltorsionterms The combination of quantum mechanics and molecular me- employed in GLYCAM04 may not be as precise as the explicit chanicsisanaturalapproachforthestudyofenzymereactionsand fitting employed in GLYCAM. Nevertheless, the accuracy of the protein–ligand interactions. The active site or binding site is GLYCAM04 rotational curves can be rationally tuned, and the treatedbytheabinitiodensityfunctionaltheoryorsemiempirical parameters extended in ways that were all but impossible in potentials,whereastherestofthesystemiscalculatedbytheforce GLYCAM. Whereas the GLYCAM parameters augmented the fields based on molecular mechanics. In the current version of PARM94 AMBER parameters, the GLYCAM04 parameters are sander, one can use the MNDO, AM1, or PM3 semiempirical self-contained. To maintain orthogonality with the AMBER pro- Hamiltonian for the quantum mechanical region. Interaction be- teinparametersets,atomtypeCTwasrenamedinGLYCAM04to tween the QM and MM regions includes electrostatics (based on CG. In principle, the GLYCAM04 parameters could be extended partial charges in the MM part) and Lennard–Jones terms, de- togenerateacompletelygeneralizableforcefieldforproteins. signed to mimic the exchange-repulsion terms that keep QM and Currentlyunderdevelopment,inboththeGLYCAMandGLY- MMatomsfromoverlapping. CAM04 formats, are TIP5P compatible parameters that include Standard semiempirical molecular orbital methods are widely TIP5P-like lone pairs on the oxygen atoms, and polarizability usedinQM/MMapplicationsbecausetheyareabletoprovidefast parameters that employ quantum-derived atomic polarizabilities. and reliable QM calculations for energies and molecular proper- An interactive tool (http://glycam.ccrc.uga.edu) has been intro- ties. However, these methods are still hampered by the need for AmberBiomolecularSimulationPrograms 1675 Figure5. Timingsforsemiempiricalcalculationsofchymotrypsin,asafunctionofthesizeoftheQM region. repeated global matrix diagonalizations in the SCF procedure, and quantum subsystems. Although many approaches have been resulting in computational expense scaling as O(N3), where N is proposedtoanswerthisquestion,47,48thelinkatommethodisstill thenumberofbasisfunctions.IftheQMpartisextendedtoseveral themostcommonlyusedduetoitssimplicity.Generallyhydrogen hundred atoms, it becomes increasingly difficult to apply the atoms are added in the QM part at the covalent bonds cut by the standardsemiempiricalMOmethodforQM/MMcalculation. QM/MMinterface.Althoughthelinkatomapproachissubjectto This expense can be greatly reduced with a linear scaling thecriticismthatitpartitionsthesystemsinanunphysicalmanner, approach43,44 based on the density matrix divide and conquer resultsfromthelinkatomapproach,ifcarefullyselected,arenot (D&C) method.45 In this approach, global Fock matrix diago- so different from other approaches.49 Moreover, because the lin- nalization and cubic scaling are avoided by decomposing a ear-scalingD&CQMtreatmentisavailable,wecanreadilymake large-scale electronic structure calculation into a series of rel- theQMpartsolargethatthelinkatomeffectisnegligibleforthe atively inexpensive calculations involving a set of small, over- regionofinterest. lapping subregions of a system. Each subsystem consists of a Amber versions under development will allow QM regions to core, surrounded by inner and outer buffer regions, and an bepresentwhenthePMEorgeneralizedBornoptionsarechosen, accurate global description of the system is obtained by com- andwillbesignificantlyfaster,atleastforsmallquantumregions. bining information from all subsystem density matrices. When Forprotein–ligandcomplexes,itisnaturaltotreattheligandand the system is large enough, the so-called crossover point is associatedresiduesinthebindingsiteastheQMpart,andtherest reached,theD&Ccalculationbecomesfasterthanthestandard as the MM part,50 and this facility will be further integrated into MO calculation. The accuracy of the D&C approximation can thecodes. be controlled by the two buffer sizes. According to our expe- rience, the subsetting scheme with the inner and outer buffer layers of 4.0 (cid:3) 0.5 Å and 2.0 (cid:3) 0.5 Å, respectively appear to Treating Solvent Effects be a good compromise between speed and accuracy.46 Figure 5 shows SCF cycle timings for standard and D&C ExplicitSolventModels calculations for different QM region sizes studied in the bovine (cid:1)-chymotrypsinsystemcomplexedwith4-fluorobenzylamine.The Amber provides support for the TIP3P,51 TIP4P and TIP4P- crossover point occurs at about 170 QM atoms. After this point, Ew,52–54TIP5P,55SPC/E,56andPOL357modelsforwater,aswell the D&C calculation is significantly faster than the standard cal- as solvent models for methanol, chloroform, N-methylacetamide, culation. and urea/water mixtures. General triclinic unit cells can be used, In some cases, the partitioning of the whole system into QM althoughsomeoftheanalysisandvisualizationtoolsarelimitedto andMMpartsinvolvesthecuttingofcovalentbondsandraisesthe themostcommon(rectangularandtruncatedoctahedron)shapes; questionofhowtobestmodeltheinterfacebetweentheclassical there is no support for symmetry elements (such as screw axes) 1676 Caseetal. • Vol.26,No.16 • JournalofComputationalChemistry that involve rotations. By default, electrostatic interactions are cell.BecausethecomputationalcostofcalculatingS(m)isoforder handled by a particle-mesh Ewald (PME) procedure, and long- Nforeachsuchvectorm,thecostofthereciprocalsumisthusof rangeLennard–Jonesattractionsaretreatedbyacontinuummodel. orderN.2Unfortunately,thiscostbecomesprohibitiveforsystems This gives densities and cohesive energies of simple liquids that containingtensofthousandsofparticlesasistypicaltoday. are in excellent agreement with more elaborate methods,54 at a WhatthePMEalgorithmdoesistoaccuratelyapproximatethe computational cost that is often less than that of cutoff based structure factors S(m) using the 3DFFT. The essential idea is to simulations. first note that exp(2(cid:7)im (cid:1) r) can be factored into three one- j ThePMEis,asthenamesuggests,amodifiedformofEwald dimensionaltrigonometricterms(evenintriclinicunitcells).One summation that is inspired by and closely related to the original canthensimplyapplytablelookuptotheseterms,approximating Hockney–Eastwood PPPM method.58 Ewald summation is a thetrigonometricfunctions(evaluatedatthecrystallographicfrac- methodtoefficientlycalculatetheinfiniterangeCoulombinterac- tional coordinates of particle j) in terms of their values at nearby tion under periodic boundary conditions (PBC), and PME is a gridpoints.Bythismeansthestructurefactorsareapproximatedas modificationtoacceleratetheEwaldreciprocalsumtonearlinear sums over regular grid points, that is as a discrete Fourier trans- scaling, using the three dimensional fast Fourier transform formthatcanberapidlycalculatedusingthe3DFFT,deliveringall (3DFFT). theneededstructurefactorsatorderNlog(N)computationalcost. BecausetheCoulombinteractionhasinfiniterange,underPBC The original version62 of the PME utilized Lagrange interpo- particle i within the unit cell interacts electrostatically with all lationtodothetablelookupoftrigonometricfunctions.TheEwald other particles j within the cell, as well as with all the periodic reciprocal sum forces were approximated separately from the images of j. It also interacts with all of its own periodic images. energy. The smooth PME (SPME)63 replaced Lagrange polyno- Theelectrostaticenergyoftheunitcell,andrelatedquantitiessuch mialswithcardinalB-splines,whicharebetterbehavedathigher asforcesonindividualparticles,arefoundbysummingtheresult- order, and which can be differentiated via recursion, allowing ing infinite series. This latter converges (slowly) to a finite limit forces to be derived analytically. Thus, in SPME, the forces are only if the unit cell is electrically neutral, and furthermore, the obtained from the gradient of the energy, unlike in the original limit is found to depend on the order of summation (conditional PME.ItisalsopossibletouseB-splinestoseparatelyapproximate ratherthanabsoluteconvergence).Ewald59appliedaJacobitheta the forces, and used in this mode, PME becomes essentially transformtoconvertthisslowly,conditionallyconvergentseriesto identicaltoPPPMasimplementedbyPollockandGlosli64(orig- a pair of rapidly, absolutely convergent series, called the Ewald inally we used B-spline interpolation, but least-squares B-spline direct and reciprocal sums. The conditional convergence of the approximation, advocated by Pollock and Glosli, yields superior originalseriesisexpressedinathirdterm60asaquadraticfunction accuracy,soweadopteditbeginningwithversion6ofAmber65). ofthedipolemomentoftheunitcell,whoseformdependsonthe Separateforceapproximationhastheadvantagethattheresulting order of summation. It is standard to assume that the whole forcessatisfyNewton’s2ndlaw,thusconservingmomentum.The assemblyofunitcellsisimmersedinanexternaldielectric.Most accuracyisalsobetterthanthatofSPME.However,thediscrep- commonlythisdielectricisassumedtobefullyconducting(“tin- ancyinmomentumconservationwithSPMEisoftheorderofthe foil”boundaryconditions),inwhichcasethethirdtermvanishes force error, which so far appears to be adequate for typical bio- and the order of summation becomes irrelevant. A more elemen- molecularsimulations(theremayyetbesituationsinwhichmore tary derivation of the Ewald sum, using compensating Gaussian precise momentum conservation is important). Furthermore, charge densities together with Poisson’s equation under PBC in SPMErequireshalfasmany3DFFTs,andthuswhenusedwiththe placeoftheJacobithetatransform,canbefoundintheappendix standard defaults in sander or pmemd is more efficient than the toKittel.61 method of separate force approximation. Finally, the SPME ap- The Ewald direct sum resembles the standard Coulomb inter- proach of differentiating the approximate reciprocal sum electro- action,butwiththetermqq/r ,representingtheCoulombenergy static potential using the properties of B-splines is critical for i j ij ofinteractionbetweenparticlesiandj,replacedbyqq erfc((cid:3)r )/ efficient approximation of Ewald sums involving atomic dipoles i j ij r , where (cid:3)is the so-called Ewald convergence parameter. This andmultipoles(seebelow).Forthesereasonswehavepursuedthe ij latterterminvolvingerfcconvergesrapidlytozeroasafunctionof SPMEapproachinrecentsanderversionsandinpmemd. theinterparticledistancer ,allowingtheuseofafinitecutoff.In Over the last 25–30 years objections have sometimes been ij the sander and pmemd programs the Ewald direct sum is calcu- raisedtotheuseofEwaldsummationinliquidstatesimulations. lated together with the van der Waals interactions. The default Intheearly1980s,itwasdemonstratedthatsimulationsofliquids valueofthedirectsumcutoffis8Å,independentofsystemsize. suchaswater,usingEwaldsummation,ledtocalculatedproperties Accordingly, (cid:3)is chosen to be (cid:4)0.35 Å(cid:5)1, leading to a relative in quantitative agreement with those of similar simulations using RMSforceerrorduetotruncationbelow5(cid:6)10(cid:5)4. reaction field boundary conditions.60 This alleviated many of the TheEwaldreciprocalsumisthesum,overallreciprocallattice earlyconcerns.Morerecently,Hummeretal.66demonstratedthat vectorsm,(m(cid:7)0),ofaGaussian-likeweightfactorexp((cid:5)(cid:7)2m2/ ionic charging free energies could be calculated accurately using (cid:3)2)/(2(cid:7)m2) multiplied by (cid:1)S(m)(cid:1),2 where the so-called structure Ewald summation, and that the finite size effects, due to limited factor S(m) is given by the sum of qexp(2(cid:7)im (cid:1) r) over all unit cells, could be accounted for. The picture emerged that sim- j j particlesjintheunitcell(r istheCartesiancoordinatevectorof ulations of biomolecules in water, using Ewald summation in j particlej).Acutoffcanalsobeappliedtothereciprocalsum.With sufficiently large unit cells, accurately represented the idealized theabovechoiceof(cid:3),thenumberofreciprocalvectorsneededso state of the isolated biomolecule in solution. Poisson–Boltzmann that the relative RMS force error due to truncation is below 5 (cid:6) calculationsunderPBCvs.nonperiodicboundaryconditionshave 10(cid:5)4istypicallyseveraltimesthenumberofparticlesintheunit been used67,68 to estimate the artifacts in conformational free AmberBiomolecularSimulationPrograms 1677 energyduetofiniteunitcellsize(thatis,artificialpreferencefor modified in reciprocal space, and passed to the inverse 3DFFT certainconformationsduetotheimpositionofstrictPBCwithina precisely as in the standard charge-based SPME algorithm. The limitedunitcell).Fromtheseitisclearthatsmallbutnonnegligible reciprocal electrostatic potential is then obtained at an atom i finitesizeartifactsexist,whichwouldbeparticularlyimportantin preciselyasinthechargesonlycase,andderivativesofthiswith protein folding studies and systems containing a low dielectric respect to fractional coordinates of i are multiplied against trans- region(suchaslipidbilayers).ThecurewithinEwaldsummation formedCartesianatomicmultipolemomentsofiforallatomsito (oranymodifiedEwaldmethodsuchasPME)istoenlargetheunit get the energy and forces. Due to the above mentioned factoriz- cell,which,ofcourse,incursmorecomputationalcost.Anumber abilityoftheB-splineapproximation,theSPMEforatomicmul- of alternative treatments of long-range electrostatics have been tipoles is very efficient. Indeed, for the same grid density and proposed. These typically involve a modified form of cutoff of splineorder,calculatingreciprocalsuminteractionsuptohexade- standard Coulombic interactions, possibly including an approxi- capole–hexadecapole order is only twice as expensive as calcu- matetreatmentofthefieldduetochargesoutsidethecutoff,asin latingcharge–chargeinteractions(note,however,thatthehexade- reactionfieldstypemethods.Noneofthesealternativeapproaches capole–hexadecapolecasegenerallyrequiresagreatergriddensity havebeendemonstratedtobereliablysuperiortoEwaldsumma- andhighersplineorderthanthecharge–chargecase).TheSPME tion.BecauseEwaldsummationinalargeunitcellisknowntobe for atomic multipoles up to quadrupole is currently being imple- correct, superiority would mean that the alternative method gave mented into sander and pmemd along with the AMOEBA force the same results as Ewald summation in a large unit cell, using, fieldofRenandPonder.75,79 however,asmallerunitcell.Notably,thesealternativetreatments Calculating the inductive response of the electron density, all lead to spherically isotropic interaction potentials [An excep- modeledbyasetofinducibleatomicdipoles,usuallyrequiresthe tion,forslabboundaryconditions,istherecentmethodofWuand solutionofasetoflinearequationsrelatingtheelectricfieldtothe Brooks(X.W.Wu,personalcommunication)].Thissphericalisot- induceddipoles.Inthesandercodewehaveimplementedseveral ropycanbeexpectedtobeproblematicinmembranesimulations variants of an iterative scheme for solving these, as well as andothercasesinvolvingnonisotropicdielectricmedia.69,70 Car–Parrinello-likeextendedLagrangianscheme,whereinthein- Force fields designed for macromolecules have until recently duceddipolesaregivenafictitiousmassandevolveddynamically modeled the electrostatic interaction between molecules using along with the atoms. In practice, this latter method is quite fixedatomicpointcharges.Thisapproximationisthoughtbymany efficient, requiring, however, a 1-fs time step and gentle temper- to be the principal limitation in current macromolecular force ature control for stability. We have found that the AMBER ff02 fields.RecentAMBERforcefieldshavebeguntoadvancebeyond forcefield,havingatomicpointchargesandinduceddipolesleads this electrostatic model by introducing off-center charges and tostabletrajectoriesusingtheextendedLagrangianscheme,with inducibleatomicdipoles71,72toaccountforthenonisotropyofthe some improvement over previous force fields, at least for DNA electrondensitynearatomsandfortheinductiveresponseofthat simulations.80 However, stability problems can occur with force densitytothelocalelectricfield,whichchangesdynamicallyasa fieldsinvolvingextrapointsinadditiontoinducibledipoles,dueto functionofthechangingconfigurationofthesystem. occasionalcloseapproachesoftheunshieldedextrapointcharges. Theintroductionofatomicdipolesnecessitatesageneralization ofthestandardEwaldsum,andhence,ofthePMEalgorithm.The ImplicitSolventModels generalizationofEwaldsummationwasprovidedbySmith,73who gaveexplicitformulasforEwaldsummationofatomicmultipoles Anaccuratedescriptionoftheaqueousenvironmentisessentialfor up to quadrupole level. Smith’s methodology was utilized by realistic biomolecular simulations, but may become very expen- Toukmaji et al.74 for performing the Ewald sum involving fixed sivecomputationally.Forexample,anadequaterepresentationof atomic charges and inducible dipoles, and is implemented in this thesolvationofamedium-sizeproteintypicallyrequiresthousands formintosander.Morerecently,itwasusedbyRenandPonder75 ofdiscretewatermoleculestobeplacedaroundit.Analternative fortheAMOEBAforcefieldthatincludesfixedatomicmultipoles replacesthediscretewatermoleculesby“virtualwater”—aninfi- uptoquadrupolelevelaswellasinducibleatomicdipolesusinga nite continuum medium with some of the dielectric and “hydro- Tholedampingmodel.76WehavenowimplementedEwaldsum- phobic”propertiesofwater. mation as well as SPME for atomic Cartesian multipoles up to These continuum implicit solvent models have several advan- hexadecapolelevel,77usingtheMcMurchieDavidsonrecursion78 tagesovertheexplicitwaterrepresentation,especiallyinmolecu- fortheEwalddirectsuminplaceofSmith’sformalism. lardynamicssimulations: Generalizing the SPME to the case of Cartesian multipoles is straightforward.OnenotesthattheEwaldreciprocalsumstructure 1. Implicitsolventmodelsareoftenlessexpensive,andgenerally factors now involve a sum of products of Cartesian multipole scalebetteronparallelmachines. momentsofparticlejmultipliedbytheappropriatederivative,with 2. There is no need for the lengthy equilibration of water that is respecttor,ofthefunctionexp(2(cid:7)im(cid:1)r).Again,thesemoments typically necessary in explicit water simulations; implicit sol- j j and derivatives can all be reexpressed in terms of the crystallo- vent models correspond to instantaneous solvent dielectric re- graphic fractional coordinates of particle j. The function sponse. exp(2(cid:7)im(cid:1)r)aswellasitssuccessivederivativeswithrespectto 3. Continuumsimulationsgenerallygiveimprovedsampling,due j fractionalcoordinatesareapproximatedintermsofB-splinesand to the absence of viscosity associated with the explicit water their successive derivatives. In practice these derivatives, multi- environment; hence, the macromolecule can more quickly ex- plied by the corresponding transformed moments, are summed ploretheavailableconformationalspace. onto the PME grid. This grid is then passed to the 3DFFT, 4. There are no artifacts of periodic boundary conditions; the
Description: