Lecture Notes 49 in Computational Science and Engineering Editors TimothyJ.Barth MichaelGriebel DavidE.Keyes RistoM.Nieminen DirkRoose TamarSchlick Benedict Leimkuhler Christophe Chipot Ron Elber Aatto Laaksonen Alan Mark Tamar Schlick Christophe Schütte Robert Skeel (Eds.) New Algorithms for Macromolecular Simulation With90Figuresand22Tables ABC Editors BenedictLeimkuhler AlanMark DepartmentofMathematics LaboratoryofBiophysicalChemistry UniversityofLeicester UniversityofGroningen UniversityRoad Nijenborgh4 LeicesterLE17RH,U.K. 9747AG,Groningen,TheNetherlands email:[email protected] email:[email protected] ChristopheChipot TamarSchlick Institutnancéien DepartmentofChemistry dechimiemoléculaire CourantInstituteofMathematicalSciences UniversitéHenriPoincaré-NancyI NewYorkUniversity B.P.239 MercerStreet251 54506Vandoeuvre-lès-Nancy,France NewYork,NY10012,U.S.A. email:[email protected] email:[email protected] RonElber ChristophSchütte DepartmentofComputerScience FBMathematikundInformatik CornellUniversity FreieUniversitätBerlin 4130UpsonHall Arnimallee2-6 Ithaca,NY14853-7501,U.S.A. 14195Berlin,Germany email:[email protected] email:[email protected] AattoLaaksonen RobertSkeel ArrheniusLaboratory DepartmentofComputerScience DivisionofPhysicalChemistry PurdueUniversity StockholmUniversity N.UniversityStreet250 10691Stockholm,Sweden WestLafayette,IN47907-2066,U.S.A. email:[email protected] email:[email protected] LibraryofCongressControlNumber:2005932829 MathematicsSubjectClassification:92C40,92C05 ISBN-10 3-540-25542-7SpringerBerlinHeidelbergNewYork ISBN-13 978-3-540-25542-0SpringerBerlinHeidelbergNewYork Thisworkissubjecttocopyright.Allrightsarereserved,whetherthewholeorpartofthematerialis concerned,specificallytherightsoftranslation,reprinting,reuseofillustrations,recitation,broadcasting, reproductiononmicrofilmorinanyotherway,andstorageindatabanks.Duplicationofthispublication orpartsthereofispermittedonlyundertheprovisionsoftheGermanCopyrightLawofSeptember9, 1965,initscurrentversion,andpermissionforusemustalwaysbeobtainedfromSpringer.Violationsare liableforprosecutionundertheGermanCopyrightLaw. SpringerisapartofSpringerScience+BusinessMedia springer.com (cid:1)c Springer-VerlagBerlinHeidelberg2006 PrintedinTheNetherlands Theuseofgeneraldescriptivenames,registerednames,trademarks,etc.inthispublicationdoesnotimply, evenintheabsenceofaspecificstatement,thatsuchnamesareexemptfromtherelevantprotectivelaws andregulationsandthereforefreeforgeneraluse. Typesetting:bytheauthorsandTechBooksusingaSpringerLATEXmacropackage Coverdesign:design&productionGmbH,Heidelberg Printedonacid-freepaper SPIN:11360575 46/TechBooks 543210 Preface ThisvolumeconsistsofacollectionofpapersgivenatthefourtheditionoftheAl- gorithms for Macromolecular Modelling meeting held in Leicester, UK in August 2004.Thepurposeofthemeetingseriesistofosterahighleveldiscussionofmath- ematicalformulation,algorithmictools,andsimulationmethodologyforimproving computationalstudyof(primarily)biologicalmolecules. With the advent of supercomputers, workstation clusters, and parallel comput- ing software, molecular modelling and simulation has become the essential coun- terparttoexperiment.Computersimulationsarehelpingtolinkthestaticstructural informationonproteinsandnucleicacidsobtainedbyX-raycrystallographyandnu- clearmagneticresonancewiththedynamicbehaviorintherealisticenvironmentof thecell.Moleculardynamics(MD)andothermodellingandsimulationtechniques can offer an effective tool for refining experimental models and for probing sys- tematically how molecules fold and reshape to perform the basic functions of life. Some problems that are beyond experiment can also be tackled by modelling and simulation. Indeed, experimentalists are becoming increasingly interested in mod- ellingtodecipherdetailedconformationalandstructuralaspectsofmacromolecular function. Yet many fundamental and practical limitations face biomolecular mod- elers. They include the approximate nature of the governing force fields as well as simulation protocols, the limited range of configurational sampling and relatively shorttrajectorytimes,theneglectofquantumeffectsinclassicalmoleculardynam- ics, and the enormous computational requirements (needed to simulate a solvated macro-molecularsystemwithfulldetailsoftheenvironment).Newalgorithmicap- proaches,hierarchicalspatialrepresentations,andimprovedcomputingplatformsare thus continuously in demand to enhance the reliability of macromolecular simula- tions, enhance the scope of theoretical work, and address biological problems with greatspecificity.Inlightofthethehighlymultidisciplinarynatureofmacromolecu- larmodelling,educationaleffortsarealsocrucialfortrainingthecurrentgeneration ofyoungbiomolecularmodelers. The purpose of this volume is to help shape a dialog between developers of computational tools for biomolecular simulations and those biological and chemi- cal scientists interested in modelling applications. In keeping with the spirit of the VI Preface AM3 meetings, the authorship is very broad, including chemists, physicists, biolo- gists,mathematiciansandcomputerscientists.Thebookisdividedintotosixparts basedonthecontentofsubmissions.Theopeningsectionconsidersnovelmodelling paradigms for biomolecules, including articles on challenging applications such as membrane proteins ( Bond et al), enzyme simulation (Ma et al), RNA modelling ( Laserson, Gan and Schlick), and sequence alignment (Joachims, Galor and Elber). Part II presents the cornerstone of most molecular modelling: the classical picture of the potential energy function and the minimization problem for molecular land- scapes (Wales, Car and James), followed by an up to date overview of the protein foldingproblem(Scheragaetal). Computingtrajectoriesontheclassicalenergylandscapeisthetraditionalgoalof moleculardynamics,althoughtheultimatepurposeofthesecomputationsisusually toconstructanefficientsamplingofthelowenergybasinsandtodeterminetherel- ativelikelihoodoftransitionsamongstthem;mathematiciansarepayingincreasing attentiontotheseissues.InPartIII,wepresentavarietyofperspectivesonefficient samplingbasedondynamicsandstochastic-dynamics,beginningwithanoverview ( Hampton et al) of biomolecular sampling methodology. Schemes for sampling based on extended Hamiltonians are discussed in the article by Barth, Leimkuhler and Sweet. Advances in hybrid Monte Carlo methods (Akhmatskaya and Reich) and Langevin formulation (Akkermans) are then presented, followed by a study of metastability(HuisingaandSchmidt)basedoneigenstructureofthetransferopera- tor. Increasingly,molecularsimulationisusedtoevaluatefreeenergiesorpotentials ofmeanforce,forexampletothedetermineprotein-ligandbindingaffinities,anes- sential challenge in de novo drug design. This topic is discussed in Part IV. In the articles of Chipot and Darve, free energy calculation methods based on evolving a system from one state to another are presented and compared. An alternative ap- proach, based on replica-exchange methods, is considered in the article of Woods, KingandEssex. The final two parts of the volume focus on modification of the foundations of molecular simulation. In Part V, articles by Baker, Bashford and Case and Sagui, Roland , Pedersen and Darden consider alternative models of solvation, in the one case using a simplified implicit solvent model to increase efficiency and facilitate treatment of more complex macromolecules, in the other case focussing on more accurate electrostatic models based on higher order multipoles than are tradition- allyused.Thelastsection,includingarticlesbyTuandLaaksonen,andLee,Sagui andRolandfocussesonincorporationofevenmoreaccuratetreatmentofmolecular interactionsbyuseofquantum-mechanicalcalculations. Authors have been asked to explain terminology and notation carefully and to provide complete bibliographies, enhancing the usefulness of the book as an ad- vancedtextbookforgraduatestudyandmultidisciplinaryresearchpreparation. DuringtheAM3meeting,anpaneldiscussionwasheldatwhichcertainquestions regardingthedirectionofthefieldwereputtoasmallgroupofleadingpractitioners drawn from the various fields covered by the conference. A summary of this dis- cussionisincludedattheendofthebook.Theircandidresponses,alongwithsome Preface VII additionalcommentsandquestionsraisedbytheassembledaudience,aremeantto betakeninthespiritoffriendlyandopenexchange. In2004,theAM3meetingwasfundedbygrantsfromtheEngineeringandPhys- ical Sciences Research Council (UK), the National Science Foundation (US), the NationalInstitutesofHealth(US),theBurroughsWellcomeFoundation(US).Itwas anofficialcooperativeactivityoftheSocietyforIndustrialandAppliedMathemat- ics.Awebsiteforthemeetingseriesismaintainedatwww.am-3.org. Leicester,UK BenedictLeimkuhler Nancy,France ChristopheChipot Ithaca,NewYork RonElber Stockholm,Sweden AattoLaaksonen Groningen,TheNetherlands AlanMark NewYork,NewYork TamarSchlick Berlin,Germany ChristofSchütte WestLafayette,Indiana RobertSkeel August2005 Contents PartI MacromolecularModels:FromTheories toEffectiveAlgorithms MembraneProteinSimulations:ModellingaComplexEnvironment P.J.Bond,J.Cuthbertson,S.S.Deol,L.R.Forrest,J.Johnston,G.Patargias, M.S.P.Sansom.................................................... 3 1 Introduction–MembraneProteinsandTheirImportance............... 3 2 MembraneProteinEnvironmentsinVivoandinVitro.................. 5 3 SimulationMethodsforMembranes ................................ 6 4 UsingSimulationstoExploreMembraneProteinSystems.............. 6 5 ComplexSolvents ............................................... 7 6 DetergentMicelles............................................... 10 7 LipidBilayers................................................... 12 8 Self-AssemblyandComplexSystems............................... 14 References ......................................................... 16 ModelingandSimulationBasedApproaches forInvestigatingAllostericRegulationinEnzymes M.Q.Ma,K.Sugino,Y.Wang,N.Gehani,A.V.Beuve ..................... 21 1 Introduction .................................................... 21 2 ModelingandSimulationofsGC .................................. 25 3 FutureWork ................................................... 29 References ......................................................... 31 ExploringtheConnectionBetweenSyntheticandNaturalRNAs inGenomesviaaNovelComputationalApproach U.Laserson,H.H.Gan,T.Schlick .................................... 35 1 Introduction:ImportanceofRNAStructureandFunction .............. 35 2 ExploringtheConnectionbetweenSynthetic andNaturalRNAs ............................................... 36 3 Methods ....................................................... 37 X Contents 4 Results ........................................................ 41 5 ConclusionsandFutureDirections ................................. 48 References ......................................................... 53 LearningtoAlignSequences: AMaximum-MarginApproach T.Joachims,T.Galor,R.Elber....................................... 57 1 Introduction .................................................... 57 2 SequenceAlignment ............................................. 58 3 InverseSequenceAlignment ...................................... 59 4 AMaximum-MarginApproachtoLearning theCostParameters.............................................. 60 5 TrainingAlgorithm .............................................. 63 6 Experiments .................................................... 66 7 Conclusions .................................................... 68 References ......................................................... 68 PartII MinimizationofComplexMolecularLandscapes OvercomingEnergeticandTimeScaleBarriersUsing thePotentialEnergySurface D.J.Wales,J.M.Carr,T.James ...................................... 73 1 Introduction .................................................... 73 2 DiscretePathSampling........................................... 74 3 Basin-HoppingGlobalOptimisation................................ 79 References ......................................................... 83 TheProteinFoldingProblem H.A.Scheraga,A.Liwo,S.Oldziej,C.Czaplewski,J.Pillardy,J.Lee,D.R. Ripoll,J.A.Vila,R.Kazmierkiewicz,J.A.Saunders,Y.A.Arnautova,K.D. Gibson,A.Jagielska,M.Khalili,M.Chinchio,M.Nanias,Y.K.Kang,H. Schafroth,A.Ghosh,R.Elber,M.Makowski ............................ 89 1 Introduction .................................................... 89 2 EarlyApproachestoStructurePrediction............................ 89 3 GlobalOptimizationofCrystalStructures ........................... 90 4 All-atomTreatmentofProteinAandtheVillinHeadpiece ............. 91 5 HierarchicalApproachtoPredictStructures ofLargeProteinMolecules ....................................... 91 6 PerformanceinCASPTests ....................................... 92 7 ComputationofFoldingPathways.................................. 93 8 MolecularDynamicswiththeUNRESModel ........................ 95 9 Conclusions .................................................... 96 References ......................................................... 97 Contents XI PartIII DynamicalandStochastic-DynamicalFoundations forMacromolecularModelling BiomolecularSampling:Algorithms, TestMolecules,andMetrics S.S.Hampton,P.Brenner,A.Wenger,S.Chatterjee,J.A.Izaguirre ...........103 1 Introduction .................................................... 103 2 SamplingAlgorithms ............................................ 105 3 TestSystems,Methods,andMetrics ................................ 111 4 SimulationResults............................................... 114 5 Discussion ..................................................... 119 References ......................................................... 121 ApproachtoThermalEquilibrium inBiomolecularSimulation E.Barth,B.Leimkuhler,C.Sweet.....................................125 1 Introduction .................................................... 125 2 MolecularDynamicsFormulation ................................. 128 3 Thermostatting using Nosé-Hoover Chains, Nosé-Poincaré andRMTMethods .............................................. 130 4 Conclusions .................................................... 137 References ......................................................... 139 TheTargetedShadowingHybridMonteCarlo(TSHMC)Method E.Akhmatskaya,S.Reich ...........................................141 1 Introduction .................................................... 141 2 DescriptionoftheBasicHMCMethod.............................. 142 3 Störmer-VerletTime-SteppingMethod andModifiedHamiltonian ........................................ 143 4 TargetedShadowingHMC ........................................ 144 5 ComputerExperiment............................................ 148 6 Conclusion ..................................................... 151 References ......................................................... 151 TheLangevinEquationforGeneralizedCoordinates R.L.C.Akkermans .................................................155 1 Introduction .................................................... 155 2 GeneralizedCoordinates.......................................... 157 3 GeneralizedLangevinEquation.................................... 158 4 PointTransformations............................................ 160 5 Conclusions .................................................... 163 References ......................................................... 164 XII Contents MetastabilityandDominantEigenvaluesofTransferOperators W.Huisinga,B.Schmidt ............................................167 1 Introduction .................................................... 167 2 MarkovianMolecularDynamics ................................... 169 3 MarkovChains,TransferOperators, andMetastability ................................................ 170 4 UpperandLowerBounds......................................... 172 5 IllustrativeExamples............................................. 176 6 Outlook........................................................ 180 References ......................................................... 180 PartIV ComputationoftheFreeEnergy FreeEnergyCalculationsinBiologicalSystems.HowUsefulAreThey inPractice? C.Chipot........................................................185 1 Introduction .................................................... 185 2 MethodologicalBackground ...................................... 186 3 FreeEnergyCalculationsandDrugDesign .......................... 194 4 FreeEnergyCalculationsandSignalTransduction .................... 199 5 FreeEnergyCalculationsandPeptideFolding........................ 201 6 FreeEnergyCalculationsandMembraneProteinAssociation........... 204 7 Conclusion ..................................................... 206 References ......................................................... 207 NumericalMethodsforCalculatingthePotentialofMeanForce E.Darve ........................................................213 1 Introduction .................................................... 213 2 GeneralizedCoordinatesandLagrangianFormulation ................. 218 3 DerivativeoftheFreeEnergy...................................... 223 4 PotentialofMeanConstraintForce................................. 226 5 AdaptiveBiasingForce........................................... 233 6 NumericalResults ............................................... 238 7 Conclusion ..................................................... 244 References ......................................................... 246 Replica-Exchange-BasedFree-EnergyMethods C.J.Woods,M.A.King,J.W.Essex ....................................251 1 Introduction .................................................... 251 2 FreeEnergyCalculations ......................................... 251 3 HydrationofWaterandMethane................................... 253 4 HalideBindingtoaCalix[4]PyrroleDerivative ....................... 255 5 Conclusion ..................................................... 257 References ......................................................... 257