Biomolecular chemistry
4. From amino acids to proteins

Required reading: Sections 14.5 to 14.8 (not 14.7.3 and 14.8.1) of Mikkelsen and Cortón, Bioanalytical Chemistry

Primary Source Material
• Chapters 4 and 12 of Introduction to Genetic Analysis: Anthony J.F. Griffiths, Jeffrey H. Miller, David T. Suzuki, Richard C. Lewontin, William M. Gelbart (courtesy of the NCBI bookshelf).
• Chapters 4, 4 and 6 of Biochemistry: Berg, Jeremy M.; Tymoczko, John L.; and Stryer, Lubert (courtesy of the NCBI bookshelf).
• Chapters 3 and 7 of Molecular Cell Biology: Lodish, Harvey; Berk, Arnold; Zipursky, S. Lawrence; Matsudaira, Paul; Baltimore, David; Darnell, James E. (courtesy of the NCBI bookshelf).
• ExPASy: online course on Principles of Protein Structure
• Many figures and the descriptions for the figures are from the educational resources provided at the Protein Data Bank (http://www.pdb.org/)
• Most of these figures and their accompanying legends were produced by David S. Goodsell of the Scripps Research Institute and are used with permission.
I highly recommend browsing the Molecule of the Month series at the PDB (http://www.pdb.org/pdb/101/motm_archive.do)

Some objectives for this section:
• You will be familiar with the 20 common amino acids
• You will understand how these amino acids are linked together in a polypeptide
• You will know the meaning of 1°, 2°, 3°, and 4° structure
• You will have a basic understanding of the molecular basis for protein structural and functional diversity
• You will have a basic understanding of the common elements of protein structure and how these elements can be arranged in three dimensions
• You will have a basic understanding of how non-covalent forces and covalent bonds hold a protein together
• You will be familiar with some of the common graphical representations of proteins
• You will have an overview of how protein purifications are typically carried out, and of the major considerations in designing a purification protocol.

Where are we and how did we get here?
We are here!
• We are done with the Central Dogma and now we move into the realms of protein structure and function. The Central Dogma only relates to the flow of genetic information, not to the function of biological macromolecules.

Proteins come in all shapes and sizes
http://www.rcsb.org/pdbstatic/education_discussion/molecule_of_the_month/poster_quickref.pdf
• Proteins are diverse and versatile ‘nano’ structures and machines
• Large number of potential combinations
  • There is a relatively large number of amino acids (a.a.) which you can use to construct a protein.
  • Includes 20 common a.a.’s plus numerous post-translational modifications.
  • A 200 amino-acid protein could have 20 to the 200th power possible sequences.
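To get a feel for how large 20 to the 200th power is, the count can be worked out directly. The following is a minimal sketch that only considers the 20 common amino acids (no post-translational modifications); the function and constant names are made up for this illustration.

```python
import math

AMINO_ACID_COUNT = 20   # the 20 common amino acids
CHAIN_LENGTH = 200      # residues in the example protein

def possible_sequences(length, alphabet_size=AMINO_ACID_COUNT):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length

n = possible_sequences(CHAIN_LENGTH)
digits = len(str(n))                                      # decimal digits in the exact count
exponent = CHAIN_LENGTH * math.log10(AMINO_ACID_COUNT)    # 200 * log10(20)

print(f"20^200 has {digits} digits (about 10^{exponent:.0f})")
# prints: 20^200 has 261 digits (about 10^260)
```

In other words, the sequence space of even a modest-sized protein is astronomically larger than anything that could ever be sampled exhaustively.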
• Structurally versatile
  • Polypeptide backbone can adopt a variety of conformations
  • Many conformers of side chains
  • Secondary structural elements can pack together in a wide variety of orientations
  • Various states of homo- and hetero-oligomerization
  • Proteins can bind prosthetic groups or cofactors (non-protein)
    • Heme
    • Metal ions
    • Flavins
• Structurally dynamic
  • Allosteric activation
  • Active and inactive forms

www.rcsb.org
• The Research Collaboratory for Structural Bioinformatics (RCSB) is a non-profit consortium dedicated to improving our understanding of the function of biological systems through the study of the 3-D structure of biological macromolecules. RCSB members work cooperatively and equally through joint grants and subsequently provide free public resources and publications to assist others and further the fields of bioinformatics and biology.
• The RCSB maintains the Protein Data Bank, the freely available database of all protein structures.
• Follow the education links (bottom left-hand corner) to learn more about ‘Methods for determining structures’ and ‘Software programs for looking at structures’:
  • http://www.rcsb.org/pdb/static.do?p=education_discussion/Looking-at-Structures/methods.html
  • http://www.rcsb.org/pdb/static.do?p=education_discussion/Looking-at-Structures/graphics.html

www.rcsb.org
• The Protein Data Bank is the repository of all atomic structures for proteins as determined by X-ray crystallography, NMR, and (more rarely) electron microscopy.
• Each structure is identified by a 4-character code. In this example the code is 2i5j.
• The real fun begins when you start looking at protein structures in 3 dimensions. To do this you require a molecular visualization tool. There is a wide variety of these currently available. It seems as though everyone has their favorite one, and the choice is strongly influenced by what they need to use it for and what operating system they are running.
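The 4-character code is all you need to locate a structure file programmatically. The sketch below checks that a string looks like a PDB code and builds a download address for it. Two things here are assumptions and not taken from these notes: the rule that a code is one digit (1 to 9) followed by three letters or digits, and the `files.rcsb.org` address pattern, which may change over time. The function names are hypothetical.

```python
import re

# Assumption: a PDB code is a digit 1-9 followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"[1-9][A-Za-z0-9]{3}")

# Assumption: download address pattern of the RCSB file server (not from the notes).
DOWNLOAD_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.pdb"

def is_valid_pdb_id(code):
    """Return True if the string has the shape of a 4-character PDB code."""
    return PDB_ID_PATTERN.fullmatch(code) is not None

def structure_file_url(code):
    """Build the (assumed) download address for a PDB entry."""
    if not is_valid_pdb_id(code):
        raise ValueError(f"{code!r} does not look like a PDB code")
    return DOWNLOAD_TEMPLATE.format(pdb_id=code.upper())

print(structure_file_url("2i5j"))
```

The resulting address can be pasted into a browser or handed to any download tool; the file it points to can then be opened in the visualization program of your choice.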
• You can view 3-D structures online using interactive tools such as Jmol, which is now built into every PDB webpage.
• Alternatively, a structure file can be downloaded to your computer by clicking on the link on the left-hand side of the page.
• The classic visualizer is RasMol. It is very simple and sufficient for most purposes. However, I don’t recommend it for creating publication-quality images. I think that RasMol is still available for all platforms, but it is no longer updated and has been superseded by other programs.
• For example, in terms of ease of use and versatility, RasMol has now been surpassed by Jmol, which is a Java-based visualizer that works in your web browser. Essentially it does everything that RasMol does and looks a lot nicer. I like to use iMol for simple visualization on the Mac. Once again, it is not great for publication-quality images.

www.rcsb.org
• PyMOL is currently the most popular software for making rendered images of proteins.
• Some other promising visualization programs for producing publication-quality images are QuteMol (it really is Cute!!) and VMD (looks intimidating...). MolScript is probably the most powerful program for creating figures, but it has a very steep learning curve. There are probably many others out there that I don’t know about (e.g., because they are Windows only).
• Using either online tools or programs on your computer, the protein can be represented in a variety of different styles.
• Typical options include:
  • protein backbone representation style and color
  • atom representation (ball-and-stick vs. spacefill) and colors
  • showing/hiding various parts
  • zoom, rotation
  • molecular surfaces
• By combining these options, attractive and informative figures can be created.
• Some software will allow you to create animations in which the structure is rotating.
• To get ‘publication quality’ pictures it is typically necessary to use proper rendering or ray-tracing software.
Popular options include Raster3D (http://skuld.bmsc.washington.edu/raster3d/) and POV-Ray. PyMOL has a built-in rendering function.
• Rendering is the process of generating a high-quality image in which aspects such as light source, reflections, transparencies, and shadows are all taken into account.
• Ray tracing is a rendering technique that produces some of the most photo-realistic images.

Various representations of 3° structure
Ras, a guanine nucleotide-binding protein.
• The simplest way to represent three-dimensional structure is to trace the course of the backbone atoms with a solid line; the most complex model shows the location of every atom.
• The former shows the overall organization of the polypeptide chain without consideration of the amino acid side chains; the latter details the interactions among atoms that form the backbone and that stabilize the protein’s conformation. Even though both views are useful, the elements of secondary structure are not easily discerned in them.
• Another type of representation uses common shorthand symbols for depicting secondary structure: cylinders or fancy cartoon helices for α-helices, arrows for β-strands, and a flexible string-like form for parts of the backbone without any regular structure. This type of representation emphasizes the organization of the secondary structure of a protein, and various combinations of secondary structures are easily seen.
• Computer analysis in which a water molecule is rolled around the surface of a protein can identify the atoms that are in contact with the watery environment. On this water-accessible surface, regions having a common chemical (hydrophobicity or hydrophilicity) and electrical (basic or acidic) character can be mapped. Such models show the texture of the protein surface and the distribution of charge, both of which are important parameters of binding sites. This view represents a protein as seen by another molecule.
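The "rolling water molecule" idea can be approximated numerically. The sketch below is not taken from the notes: it uses a point-sampling scheme (commonly called the Shrake-Rupley method) in place of literally rolling a probe. Each atom's sphere is expanded by the probe radius and dotted with test points, and a point counts as water-accessible if it does not fall inside any neighbouring expanded sphere. The 1.4 Å probe radius and the 1.7 Å atom radius are illustrative assumptions.

```python
import math

PROBE_RADIUS = 1.4  # Å, assumed approximate radius of a water molecule

def sphere_points(n):
    """Spread n points roughly evenly over a unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

def accessible_area(atoms, probe=PROBE_RADIUS, n_points=960):
    """Water-accessible area (Å^2) of each atom; atoms are (x, y, z, radius) tuples."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        shell = ri + probe  # the probe centre rides on this expanded sphere
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + shell * ux, yi + shell * uy, zi + shell * uz
            # A test point is buried if it lies inside another atom's expanded sphere.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms)
                if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * shell ** 2 * exposed / n_points)
    return areas

# One isolated atom versus the same atom with a close neighbour (made-up coordinates).
lone = accessible_area([(0.0, 0.0, 0.0, 1.7)])
pair = accessible_area([(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)])
print(f"isolated atom: {lone[0]:.1f} Å^2, with a neighbour: {pair[0]:.1f} Å^2")
```

The neighbour shields part of the first atom from the probe, so its accessible area drops. Real programs do the same thing for every atom in the protein and then color the resulting surface by hydrophobicity or charge.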
• Question: What do you mean by "rendered images"? I remember you said something about high quality in class, but could you explain this in more detail?
• Answer: Compare this image: http://www.pymolwiki.org/index.php/File:No_ray_trace.png and this image: http://www.pymolwiki.org/index.php/File:Ray_traced.png. You could read the Wikipedia entry on Rendering (http://en.wikipedia.org/wiki/Rendering_(computer_graphics)), but this is way more information than you need. Essentially, 'rendering' refers to generating high (that is, publication) quality images: images that could be printed in high resolution on the glossy cover of a journal and still look good. Every visualization program will handle things a bit differently. I recommend using PyMOL, which you can learn more about here: http://www.pymol.org/ and http://www.pymolwiki.org/index.php/Main_Page. Here is the information about ray tracing: http://www.pymolwiki.org/index.php/Ray

The structure of a protein is determined by the linear sequence of amino acids (1° structure)
Ribonuclease
http://www.users.csbsju.edu/~hjakubow/classes/rasmolchime/01ch331finproj/Rnase/templateprot.htm
• The classic work of Christian Anfinsen in the 1950s on the enzyme ribonuclease revealed the relation between the amino acid sequence of a protein and its conformation. For this work he was awarded the Nobel Prize in Chemistry in 1972. Anfinsen discovered that:
  • Ribonuclease is a single polypeptide chain consisting of 124 amino acid residues cross-linked by four disulfide bonds.
  • Agents such as urea or guanidinium chloride effectively disrupt the noncovalent bonds.
  • The disulfide bonds can be cleaved reversibly by reducing them with a reagent such as β-mercaptoethanol.
  • When ribonuclease was treated with β-mercaptoethanol in 8 M urea, the product was a fully reduced, randomly coiled polypeptide chain devoid of enzymatic activity. In other words, ribonuclease was denatured by this treatment.
• Anfinsen then made the critical observation that the denatured ribonuclease, freed of urea and β-mercaptoethanol by dialysis, slowly regained enzymatic activity. All the measured physical and chemical properties of the refolded enzyme were virtually identical with those of the native enzyme.
• These experiments showed that the information needed to specify the catalytically active structure of ribonuclease is contained in its amino acid sequence.

The 20 common amino acids
Ala showing L-stereochemistry
http://www.neb.com/neb/sitemap/sitemap_5-1-10.html
• 20 different common amino acids, differing only in the side chain
• Note that stereochemistry at Cα has not been indicated in this figure.
• All natural a.a.’s are in the L-configuration
• A more general system of stereochemical designation is the R/S system. The L-configuration nearly always corresponds to S in the R/S system. The exception is L-cysteine, which is R.
• You might want to keep this sheet handy as a reference.
• I will often use the one-letter codes and you should learn these.
• Most are easy, but I find D, E, N and Q the trickiest to remember

The 20 common amino acids
• You should know the name, 3-letter code, and 1-letter code for all 20 amino acids.
• This page should serve as a study guide and hopefully help you get familiar with the preferred conformations.
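As a study aid, the 3-letter and 1-letter codes can be kept in a small lookup table. The sketch below is only one way to organize them; the table and function names are made up for this example, and the hyphen-separated input format is an assumption.

```python
# Standard 3-letter to 1-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_sequence):
    """Convert a hyphen-separated sequence such as 'Ala-Gly-Ser' to 'AGS'."""
    residues = three_letter_sequence.split("-")
    return "".join(THREE_TO_ONE[r.strip().capitalize()] for r in residues)

# The four codes singled out above as the hardest to remember.
print(to_one_letter("Asp-Glu-Asn-Gln"))  # prints DENQ
```

Covering one column of the table and reciting the other is a quick way to drill the codes, especially the D/E/N/Q group.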
Description: