Interdisciplinary Applied Mathematics
Volume 21

Editors: S.S. Antman, J.E. Marsden, L. Sirovich, S. Wiggins

Geophysics and Planetary Sciences
Mathematical Biology: L. Glass, J.D. Murray
Mechanics and Materials: R.V. Kohn
Systems and Control: S.S. Sastry, P.S. Krishnaprasad

Problems in engineering, computational science, and the physical and biological sciences are using increasingly sophisticated mathematical techniques. Thus, the bridge between the mathematical sciences and other disciplines is heavily traveled. The correspondingly increased dialog between the disciplines has led to the establishment of the series: Interdisciplinary Applied Mathematics.

The purpose of this series is to meet the current and future needs for the interaction between various science and technology areas on the one hand and mathematics on the other. This is done, firstly, by encouraging the ways that mathematics may be applied in traditional areas, as well as pointing towards new and innovative areas of applications; and, secondly, by encouraging other scientific disciplines to engage in a dialog with mathematicians outlining their problems to both access new methods and suggest innovative developments within mathematics itself.

The series will consist of monographs and high-level texts from researchers working on the interplay between mathematics and other fields of science and technology.

Interdisciplinary Applied Mathematics volumes published are listed at the end of this book.

Springer Science+Business Media, LLC

Tamar Schlick
Molecular Modeling and Simulation
An Interdisciplinary Guide
With 147 Full-Color Illustrations
Springer

Tamar Schlick
Department of Mathematics and Chemistry
Courant Institute of Mathematical Sciences
New York University
New York, NY 10012
USA
[email protected]

Editors

J.E. Marsden
Control and Dynamical Systems
Mail Code 107-81
California Institute of Technology
Pasadena, CA 91125
USA
marsden@cds.caltech.edu

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA
chico@camelot.mssm.edu

S. Wiggins
School of Mathematics
University of Bristol
Bristol, BS8 1TW
United Kingdom
[email protected]

S.S. Antman
Department of Mathematics and Institute for Physical Science and Technology
University of Maryland
College Park, MD 20742-4015
USA
[email protected]

Cover illustration: © Wayne Thiebaud/Licensed by VAGA, New York, NY. Courtesy of the Allan Stone Gallery, NYC.

Mathematics Subject Classification (2000): 92-01, 92Exx, 92C40

Library of Congress Cataloging-in-Publication Data
Schlick, Tamar.
Molecular modeling and simulation: an interdisciplinary guide / Tamar Schlick.
p. cm. - (Interdisciplinary applied mathematics ; 21)
Includes bibliographical references and index.
ISBN 978-1-4757-5893-1
ISBN 978-0-387-22464-0 (eBook)
DOI 10.1007/978-0-387-22464-0
1. Biomolecules-Models. 2. Biomolecules-Models-Computer simulation. I. Title. II. Interdisciplinary applied mathematics ; v. 21.
QD480 .S37 2002
572'.33'015118-dc21
2002016003

Printed on acid-free paper.

© 2002 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 2002
Softcover reprint of the hardcover 1st edition 2002

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Typesetting: Pages created by author using a Springer TEX macro package.

www.springer-ny.com

About the Cover

Molecular modelers are artists in some respects. Their subjects are complex, irregular, multiscaled, highly dynamic, and sometimes multifarious, with diverse states and functions. To study these complex phenomena, modelers must apply computer programs based on precise algorithms that stem from solid laws and theories from mathematics, physics, and chemistry.

Many of Wayne Thiebaud's landscape paintings, like Reservoir Study shown on the cover, embody this productive blend of nonuniformity with orderliness. Thiebaud's body of water - organic, curvy, and multilayered (reminiscent of a cell) - is surrounded by apparently ordered fields and land sections. Upon close inspection, the multiplicity in perspectives and interpretations emerges. This artwork thus mirrors the challenging crossdisciplinary interplay, as well as blend of science and art, central to biomolecular modeling.

To my grandparents - Fanny and Iancu Iosupovici, Lucy and Charles Schlick - whose love and courage I carry forever.

Book URLs

For Text: monod.biomath.nyu.edu/index/book.html
For Course: monod.biomath.nyu.edu/index/course/IndexMM.html

Preface

Science is a way of looking, reverencing. And the purpose of all science, like living, which amounts to the same thing, is not the accumulation of gnostic power, the fixing of formulas for the name of God, the stockpiling of brutal efficiency, accomplishing the sadistic myth of progress. The purpose of science is to revive and cultivate a perpetual state of wonder. For nothing deserves wonder so much as our capacity to experience it.
Roald Hoffmann and Shira Leibowitz Schmidt, in Old Wine, New Flasks: Reflections on Science and Jewish Tradition (W.H. Freeman, 1997).

Challenges in Teaching Molecular Modeling

This textbook evolved from a graduate course termed Molecular Modeling introduced in the fall of 1996 at New York University. The primary goal of the course is to stimulate excitement for molecular modeling research - much in the spirit of Hoffmann and Leibowitz Schmidt above - while providing grounding in the discipline. Such knowledge is valuable for research dealing with many practical problems in both the academic and industrial sectors, from developing treatments for AIDS (via inhibitors to the protease enzyme of the human immunodeficiency virus, HIV-1) to designing potatoes that yield spot-free potato chips (via transgenic potatoes with altered carbohydrate metabolism). In the course of writing this text, the notes have expanded to function also as an introduction to the field for scientists in other disciplines by providing a global perspective into problems and approaches, rather than a comprehensive survey. As a textbook, my intention is to provide a framework for teachers rather than a rigid guide, with material to be supplemented or substituted as appropriate for the audience. As a reference book, scientists who are interested in learning about biomolecular modeling may view the book as a broad introduction to an exciting new field with a host of challenging, interdisciplinary problems.

The intended audience for the course is beginning graduate students in medical schools and in all scientific departments: biology, chemistry, physics, mathematics, computer science, and others. This interdisciplinary audience presents a special challenge: it requires a broad presentation of the field but also good coverage of specialized topics to keep experts interested.
Ideally, a good grounding in basic biochemistry, chemical physics, statistical and quantum mechanics, scientific computing (i.e., numerical methods), and programming techniques is desired. The rarity of such a background required me to offer tutorials in both biological and mathematical areas.

The introductory chapters on biomolecular structure are included in this book (after much thought) and are likely to be of interest to physical and mathematical scientists. Chapters 3 and 4 on proteins, together with Chapters 5 and 6 on nucleic acids, are thus highly abbreviated versions of what can be found in numerous texts specializing in these subjects. The selections in these tutorials also reflect some of my group's areas of interest. Because many introductory and up-to-date texts exist for protein structure, only the basics in protein structure are provided, while a somewhat more expanded treatment is devoted to nucleic acids.

Similarly, the introductory material on mathematical subjects such as basic optimization theory (Chapter 10) and random number generators (Chapter 11) is likely to be of use more to readers in the biological/chemical disciplines. General readers, as well as course instructors, can skip around this book as appropriate and fill in necessary gaps through other texts (e.g., in protein structure or programming techniques).

Text Limitations

By construction, this book is very broad in scope and thus no subjects are covered in great depth. References to the literature are only representative. The material presented is necessarily selective, unbalanced in parts, and reflects some of my areas of interest and expertise. This text should thus be viewed as an attempt to introduce the discipline of molecular modeling to students and to scientists from disparate fields, and should be taken together with other related texts, such as those listed in Appendix C, and the representative references cited.
The book format is somewhat unusual for a textbook in that it is nonlinear in parts. For example, protein folding is introduced early (before protein basics are discussed) to illustrate challenging problems in the field and to interest more advanced readers; the introduction to molecular dynamics incorporates illustrations that require more advanced techniques for analysis; some specialized topics are also included throughout. For this reason, I recommend that students reread certain parts of the book (e.g., first two chapters) after covering others (e.g., the biomolecular tutorial chapters). Still, I hope most of all to grab the reader's attention with exciting and current topics.

Given the many caveats of introducing and teaching such a broad and interdisciplinary subject as molecular modeling, the book aims to introduce selected biomolecular modeling and simulation techniques, as well as the wide range of biomolecular problems being tackled with these methods. Throughout these presentations, the central goal is to develop in students a good understanding of the inherent approximations and errors in the field so that they can adequately assess modeling results. Diligent students should emerge with basic knowledge in modeling and simulation techniques, an appreciation of the fundamental problems - such as force field approximations, nonbonded evaluation protocols, size and timestep limitations in simulations - and a healthy critical eye for research. A historical perspective and a discussion of future challenges are also offered.

Dazzling Modeling Advances Demand Perspective

The topics I chose for this course are based on my own unorthodox introduction to the field of modeling. As an applied mathematician, I became interested in the field during my graduate work, hearing from Professor Suse Broyde - whose path I crossed thanks to Courant Professor Michael Overton - about the fascinating problem of modeling carcinogen/DNA adducts.
The goal was to understand some structural effects induced by certain compounds on the DNA (deduced by energy minimization); such alterations can render DNA more sensitive to replication errors, which in turn can eventually lead to mutagenesis and carcinogenesis. I had to roam through many references to obtain a grasp of some of the underlying concepts involving force fields and simulation protocols, so many of which seemed so approximate and not fully physically grounded.

By now, however, I have learned to appreciate the practical procedures and compromises that computational chemists have formulated out of sheer necessity to obtain answers and insights into important biological processes that cannot be tackled by instrumentation. In fact, approximations and simplifications are not only tolerated when dealing with biomolecules; they often lead to insights that cannot easily be obtained from more detailed representations. Furthermore, it is often the neglect of certain factors that teaches us their importance, sometimes in subtle ways.

For example, when Suse Broyde and I viewed in the mid 1980s her intriguing carcinogen/modified DNA models, we used a large Evans and Sutherland computer while wearing special stereoviewers; the hard-copy drawings were ball and