ebook img

Prediction of Protein Secondary Structures From Conformational Biases PDF

9 Pages·0.23 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Prediction of Protein Secondary Structures From Conformational Biases

Prediction of Protein Secondary Structures From Conformational Biases Trinh Xuan Hoang1,2, Marek Cieplak3,4, Jayanth R. Banavar3, and Amos Maritan1,2 (1) International School for Advanced Studies (SISSA/ISAS) and INFM, via Beirut 2-4, 34014 Trieste, Italy (2) The Abdus Salam International Center for Theoretical Physics (ICTP), Strada Costiera 11, 34100 Trieste, Italy (3) 104 Davey Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802 2 0 (4) Institute of Physics, Polish Academy of Sciences, 02-668 Warsaw, Poland 0 2 n in the CASP competitions,1–3 wherein the structures WeuseLINUS,aproceduredevelopedbySrinivasan and a of large protein fragments, comprising as many as 100 J Rose, to provide a physical interpretation of and to predict residues, were predicted with an accuracy of 4-7 ˚A in the secondary structures of proteins. The secondary struc- 7 rmsd. A notable success reported was that of the Baker 1 ture type at a given site is identified by the largest confor- group and entailed the assembly of protein conforma- mational bias during short time simulations. We examine tions from fragments of known structures in the protein ] therateof successful prediction as afunction of temperature h database, which have local sequences similar to that of and the interaction window. At high temperatures, there is c the target sequence, using statistically derived scoring alargepropensityfortheestablishmentofβ-strandswhereas e functions.4–6 In Levitt’s approach,7,8 secondary struc- m α-helices appear only when the temperature is lower than a tures,whichwerepredictedbyusingseveralexistingsec- certain threshold value. It is found that there exists an op- - ondary structure prediction methods,9–11 are fitted to t timal temperature at which the correct secondary structures a arepredictedmostaccurately. Wefindthatthistemperature best scoring compact conformations obtained on a sim- st is close to thepeak temperature of thespecific heat. Chang- plified tetrahedral lattice. Scheraga and coworkers12,13 t. ingtheinteractionwindoworcarryingoutlongersimulations use an off-lattice Cα-based model with interactions im- a approaching equilibrium lead to little change in the optimal posed on virtual side-chains and virtual peptide groups. m success rate. Our findings are in accord with the observa- The lowest-energy Cα trace obtained by extensive con- - tion by Srinivasan and Rose that the secondary structures formational space annealing is then converted to an all- d are mainly determined by local interactions and they appear atom backbone for further refinements. Skolnick et al.14 n in theearly stage of folding. built discretized protein conformations using predicted o c secondary structures and a number of tertiary restraints [ derived from multiple sequence alignments. The suc- Keywords: protein folding; secondary structures; cess of these ab initio methods relies to a large extent 1 α-helix; β-strand; protein structure prediction on knowledge-based information, i.e. data derived from v 1 known protein structures, such as that used in the scor- 1 ing functions, secondary structure prediction or in the INTRODUCTION 3 choice of fragments to incorporate in the model. 1 Our work deals with secondary structure prediction 0 A knowledge of the three dimensional structure of a and builds on a truly ab initio protein structure pre- 2 protein is crucial for understanding its biological func- diction procedure called LINUS developed by Srinivasan 0 tionality. Unfortunately,the rateatwhichproteinstruc- and Rose.15 LINUS does not use any knowledge-based / at turescanbeexperimentallysolvedisfarbehindthespeed information and thus provides a clear picture of the role m at which the sequences are determined. With progress played by the different factors in folding. Furthermore, in the Human Genome Project, a good computer-based the algorithm for determining the structure is not based - d methodforthepredictionofproteinstructuresfromtheir on energy minimization – LINUS captures the interplay n sequenceswouldbeaninvaluabletoolformodernmicro- betweenenergyandentropyindeterminingthelocalsec- o biology as well as for drug design. The existing methods ondary structure. c for structure prediction can be divided into two classes: The most powerful aspect of LINUS is its simplicity – : v 1) template-based methods which compare a sequence it is based on just 4 essential aspects of protein behav- Xi with unknown structure against the library of solved ior: (1) excluded volume, (2) preferred occupancies of structuresand2)abinitiomethodswhichseektoidentify the dihedral angles in certain regions in the Ramachan- r a thenativefoldusuallydefinedasthelowestenergypoint dran plot,17 (3) hydrophobic interactions and hydrogen in conformational space. The latter are specially useful bonding and (4) the hierarchical organization of protein when a target sequence has a low similarity with the ex- structures.18,19 In spite of this simplicity, LINUS has al- isting protein sequences of known structures. It should ready proved to be effective in predicting the secondary be noted though that many so called ab initio methods and super-secondary structures of protein fragments.15 do use information derived from the protein database as Note that the hierarchicalalgorithm steers folding along input. some specific pathways and the resulting structure does Significant progress has been achieved in the ab ini- notnecessarilycorrespondtotheglobalenergyminimum. tioapproachtoproteinstructurepredictionaswitnessed 1 In a subsequent study,16 Srinivasanand Rose used LI- atoms: hydrogen bonding (H-bond), hydrophobic inter- NUS to propose a physical basis for secondary struc- action, and salt bridges. All backbone nitrogens, except tures, which showed that protein secondary structures for those that belong to a proline, are considered to be aremainly determinedbystericeffects andlocalinterac- H-bond donors and participate in no more than one H- tions. This conclusion recently obtained strong support bondbutthenitrogenattheN-terminusmayparticipate from experimental evidence that unfolded protein con- in up to three H-bonds. The backbone oxygens and the formations,under highlydenaturing conditionsandthus sidechainsofsomeaminoacids(Ser,Thr,Asn,Asp,Gln, in the absence of long-range contacts, are still charac- Glu) are acceptors. A backbone-to-backbone hydrogen terized by local native-like topology.20,21 In LINUS, the bond is assumed to be formed between residues i and j conformational bias towards a type of secondary struc- whentheyareatleastthreeresidueapartinthesequence ture is determined through the probability of being in and when the distance between a donor and an acceptor this conformation during simulation. issmallerthan5˚A. Anenergyof−0.5ǫisassigned,where Wehavefoundthisideatobeintriguingandworthyof ǫisanenergyunit,andtheenergyisscaledquasi-linearly acarefulreexamination. Here,wemakeanassessmentof from 0 to its minimal value as the distance decreases to howwellthesecondarystructurescanbepredictedbased 3.5˚A. It is also required that the out-of-plane dihedral ontheanalysisofconformationalbiases. Inparticularwe angle O(j)–N(i)–C (i)–C(i − 1) should be larger than α concentrateonthe role playedby the temperature, T,in 140o. A sidechain-to-backbone hydrogen bond is formed determining the success rate and find that there is an when the donor-to-acceptor distance is smaller than 4˚A optimalT atwhichthe secondarystructurepredictionis and the acceptor must be not further than four residues the best. For most of the proteins studied, this temper- away from the donor in the sequence. In this case an ature coincides with the one that Rose and Srinivasan energy of −1.0ǫ is assigned and no scaling of the energy used in their studies and is found to be near the peak is involved. in the specific heat where the conformational conversion Hydrophobicattractionis postulatedtooccur forcon- in the system is the largest. The optimal conditions for tacts between the sidechain atoms of hydrophobic (Cys, the structurepredictiondonotdependmuchonwhether Ile,Leu,Met, Phe,Trp,Val)andamphipathic (Ala,His, the window in the interactions allowed for purely local Thr, Tyr) residues. The minimal value of the contact or also for non-local interactions. They are also insensi- energyis −0.5ǫ when both residues are hydrophobicand tivetothedurationofthesimulations. Weobtainedvery −0.25ǫ when one of them is hydrophobic and the other similarresultswhenlong,nearlyequilibrium,simulations is amphipathic. A contact between two atoms i and j is were considered. said to form when the distance between them is smaller TheaimofourstudyistoelucidatehowLINUSworks than R(i)+R(j)+1.4˚A, where R(i) and R(j) are the andwhatitsstrengthsandweaknessesare. Theultimate contact radii of the two atoms. The contact radii of the goal would be to determine what kinds of improvements atoms16 dependonthekindofatomsandarelargerthan couldbe made in this physically appealing frameworkto their hard sphere radii. The energy of a contact scales move towards first principles tertiary structure predic- from 0 to its minimal value as the distance between two tion. atoms decreases from its cut-off value to R(i) + R(j). A salt bridge is assigned to contacts between oppositely chargedgroups(namelythesidechainsofArgorLyswith METHODS Glu or Asp). The minimal energy of a salt bridge is −0.5ǫ. InLINUSthereisalsoanenergyfunctiontochase A detailed description of LINUS can be found in the residuesawayfromtherighthandsideoftheRamachan- original papers of Srinivasan and Rose.15,16 We have de- dran plot. When a residue has a positive torsional angle veloped our own version of LINUS that strictly follows φ it is punished with an energy of −1.0ǫ if the residue is the improved development as described in the PNAS not a glycine, otherwise it is rewarded with an energy of paper16. Briefly, in LINUS, the coordinates of all back- −1.0ǫ. bone atoms are considered whereas a sidechain is repre- The main degrees of freedom used in LINUS are the sented in a simplified manner. Specifically, glycine has Ramachandrantorsionalanglesφandψandthetorsional no sidechain, alanine’s sidechain is made of a C and χ which corresponds to rotation of the sidechains. Ad- β the remaining amino acids are represented by C and ditionally the torsional angle ω about the peptide bond β one or two pseudo Cγ atoms, depending on whether the and the N-Cα-C bond angle are allowed to be perturbed sidechain is branched out or not. The atoms are mod- slightly during the simulation. All the other bond an- eledashardspheresthatarenotallowedtooverlap. The gles and bond lengths are kept fixed. Three consecu- sizes of the spheres depend on the type of the atom and tive residues (i,i+1,i+2) are perturbed at a time and the sizes of the pseudo-atoms depend on the size of the the movements advance from the N-terminus to the C- sidechains that they represent. terminus. The moves at an ith residue are repeatedly Apart from steric interactions, the Hamiltonian con- chosen until a move is obtained in which there are no sists of just a few terms that provide attraction between steric clashes within the three residue fragment consid- ered. Atthenextstagethewholeproteinchainischecked 2 forthepresenceofstericclashes. Upto50suchattempts The conformational bias, P, of a given type of sec- are performed in order to find a conformation without ondarystructure is defined as the probabilityofbeing in any steric clashes. If the new conformation found still this structure during the simulation. P is usually com- has steric clashes it is rejected, otherwise it is accepted putedasafunctionofresidueinthe sequence. The com- with a probability P =min(cid:8)1,e−∆E/kBT(cid:9), where kB is putation of P requires a procedure of secondary struc- the Boltzmann constant,T is the temperature measured ture assignment, which allows one to determine which in the units of ǫ/k , and ∆E is the energy difference. A type of secondary motifs a residue belongs to at a given B complete progressionfrom N to C is called a cycle. instant. We use an assignment procedure in the most LINUS uses a smart move set that consists of the fol- recent unpublished development of LINUS,22 which pro- lowing, equally probable, move types ceeds through the following steps: 1. α-helix : threeconsecutiveresidues(i−1,i,i+1)are 1. Set all residues to the coil conformation (c). set to having φ=−64±7o, ψ =−43±7o. 2. For i running from 1 through N −3, where N is the 2. β-strand : three residues (i−1,i,i+1) are set to numberofresidues,computethetorsionΘbetween havingφ=−130±15o,ψ =135±15o. If aresidue four consecutive Cα’s (i,i+1,i+2,i+3). is a proline then φ is reset to 70±15o. a. If |Θ| ≥ 135o then residues (i+1) and (i+2) 3. turn : there are4 types ofturns, namely I, I’, II and are set to the strand conformation (s). II’. For each turn type there are two possibilities: b. If45o ≤Θ≤65o thenresidues(i+1)and(i+2) a) setting residues (i − 1,i) to have turn φ and are set to the helix conformation (h). ψ values while residue i+1 is set to random coil c. If−50o ≤Θ<45othenresidue(i+1)and(i+2) and b) setting i−1 to random coil and (i,i+1) are set to the turn conformation (t). to have turn φ and ψ values. Overall there are 8 suchpossibilities. The turn φ and ψ values for two 3. Check again all residues from 1 through N: consecutive residues are given below for each type ofa turn move (the notations used for the residues a. If a segment of less than 5 residues with a h are i−1 and i but they can also be i and i+1). assignment is found then all residues in this segment are set to t. i. Type I: residue(i−1): φ=−60±15o, ψ =−30±15o b. If a segment of less then 3 residues with a s residue (i): φ=−90±15o, ψ =0±15o assignment is found then all residues in this segment are set to c. ii. Type I’: residue (i−1): φ=55±15o, ψ =40±15o To compute P, one starts from an open conformation residue (i): φ=80±15o, ψ =5±15o andmakesasimulationof1000cycles. Aftereachcyclea iii. Type II: conformation assignment is determined to gather statis- residue (i−1): φ=−60±15o, ψ =110±15o tics on P. The average is taken over 10 simulations for residue (i): φ=90±15o, ψ =−5±15o each T. In order to make comparisons with the DSSP-based iv. Type II’: nativeassignments26usedinthePDB,23 weadoptasim- residue (i−1): φ=60±15o, ψ =−120±15o plified correspondence in which the 310, π and α- helix residue (i): φ=−80±15o, ψ =0±15o correspond to h, the isolated β-bridges and extended β- strands to s, the hydrogen bonded turn to t, and bends For all turn moves if a residue is a proline then its φ is reset to 70±15o. and undefined segments to c. It should be noted that the native state secondary structure assignment used by 4. random coil : φ and ψ are chosen randomly in Srinivasan and Rose16 for the proteins studied does not one of the favorite regions of the Ramachandran fully agree with the one used in the PDB. In the follow- plot. For non-glycine and non-proline residues ing,our resultsare benchmarkedagainstthe PDB-based (φ,ψ)∈{(−135±45o,135±45o),(−75±30o,−30± assignment. 30o),(75±15o,30±15o)}. For glycine φ ∈ {90± Inordertoexploretheroleoflocalandnonlocalinter- 30o,180±30o} and ψ ∈ {0±30o,180±30o}. For actions we consider two choices for the interaction win- proline φ = −70±15o and ψ ∈ {135±30o,−45± dow, ∆, of 6 and N. The interaction window restricts 30o}. interactions along the sequence. ∆ = 6 means that all interactionsbetweentworesiduesiandj with|i−j|>6 For the first three move types, ω =180±5o whereas for are switched off, whereas in the case of ∆=N all inter- the coil move ω = 180±10o. For all move types, the actions are present. sidechain torsional angles (χs) are chosen at random in 10o windows around −60o, 60o and 180o. 3 RESULTS AND DISCUSSION η are the largest around T = 0.5ǫ/k . At this tempera- B ture the rate of prediction is the highest for helices and We begin our discussions with protein G (PDB code exceeds 90%(it is 100%at T =0.55ǫ/kB), while strands 1GB1), the protein showing the best conformational bi- and turns are predicted at 74% and 23% levels respec- ases towards native secondary structures in the set of tively. The values of η were obtained with the reference proteinsstudied by SrinivasanandRose. Figure 1 shows to PDB assignment. If the Srinivasan and Rose assign- P as a function of residue at three different tempera- mentisusedinstead,the correspondingsuccessratesare turesT =0.8ǫ/k ,0.5ǫ/k and0.2ǫ/k . T =0.5ǫ/k is 90%, 63% and 35% respectively. As T increases the rate B B B B thetemperaturewhichSrinivasanandRoseusedintheir for the strands first decreases and then remains roughly simulations. The interaction window is set to 6. The ataconstantvaluewhiletherateforhelicesdropsrapidly conformational biases towards α-helices (h), β-strands andvanishesatT =0.7ǫ/kB. Anearlyoppositescenario (s), turns (t) and coils (c) are shown. Note that at is observed as T becomes smaller than 0.5 – the rate for T = 0.8ǫ/k the strands dominate over all other struc- strands drops rapidly while it remains high for the he- B tures. Thus the whole protein chain prefers to be in the lices. strand conformation at high temperatures. This follows Figure 3 shows η as a function of T for protein G but from the simple observation that the entropy is largest as calculated with ∆ = N. We still observe the same inthe strandconformationandisthe dominantfactorin pictureasfor∆=6,exceptthatatlowtemperaturesthe the free energy at high temperatures. At low tempera- predictionratesforhelicesandstrandsbecomesomewhat tures, such as T =0.2ǫ/k , the bias towardsthe strands higher. Theoptimaltemperature,however,remainsclose B vanishes while the highest biases belong to helices and to 0.5ǫ/kB. turns. This is because helices and turns involve favor- Figures 4 and 5 show η as functions of T for 6 other able interactions, which are predominantly local and are proteins in the set studied by Rose for ∆ = 6 and N thus stabilized at low temperatures. At the intermedi- respectively. For ∆ = 6, as in protein G, the best pre- ate temperature, T =0.5ǫ/kB, the dominating structure diction rates are obtained at about T = 0.5ǫ/kB for all variesasoneproceedsalongthesequence. Somepartsof of the proteins except for plastocyanin(6PCY) and myo the protein prefer to be in a strand conformation while hemerythrin(2HMQ).Thelatterproteinsarespecialbe- others form helices and turns. causetheyconsistofonlyonekindofsecondarystructure Becausethe biasesstronglydependonT,one mayask inadditiontotheturn. Thenativeconformationofplas- what the temperature is at which the native secondary tocyanin is built only of β-strands and that of hemery- structure can be most reliably predicted. In order to thrinisafour-helixbundle. Thebestpredictionratesare answer this question we have carried out an extensive obtained for a range of temperatures which corresponds analysis of the biases overa wide range of temperatures. to T ≥0.4ǫ/kB for plastocyanin and to T ≤0.4ǫ/kB for Figure 2a shows two sequences of secondary structure hemerythrin. For∆=N,theoptimaltemperaturevaries assignments. The first corresponds to the known native alittlebutformostoftheproteinsitremainsintherange conformation of protein G, and the second is obtained from 0.4ǫ/kB to 0.6ǫ/kB. It should also be noted that, fromthebiasesgiveninthemiddlepanelofFigure1,i.e. when ∆ = N, the rates for the strands at low tempera- at T = 0.5ǫ/k . In the latter case an assignment at a tures become significantly larger than in the ∆=6 case B givensite is set to the type of secondary structure show- for all proteins. The reason for this behavior is that, at ing the highest bias. We introduce a parameter η which lowT,thestrandscanbestabilizedonlybynon-localin- estimates overlaps between the two sets of assignments teractions, which are absent when ∆=6. The results in for each kind of secondarystructure. For a giventype of Figures 2 through 5 are generally similar even when the conformation,x(x∈{s,h,t,c}),ηisdefinedasthenum- Srinivasan-Rose secondary structure assignment is used ber of sites at which both assignments (from PDB and underscoringthe robustness of our results. The only dif- from the biases) are x divided by the number of sites at ferenceisthatthepredictionspertainingtotheturnsare which at least one of the assignments is x. Specifically improved compared to the PDB-based secondary struc- if A is a set of sites of type x in the PDB assignment ture assignment. and B is a set of sites of the same type of conformation What is the principle that governs the choice of the predicted by the biases then optimal temperature? Figure 6 shows how conforma- tionalchangesoccurwithrespecttotemperatureforeach f(A∩B) residue of protein G for the case of ∆ = 6. It is seen η ≡ , (1) f(A∪B) clearly that most of the strands are destabilized at low temperatureswhereashelicesareabsentathightempera- where f(X) is a function which returns the number of tures. Thusinordertohavebothkindsofstructurespre- elements in X. Thus, if A ≡ B then η = 1. We call η dictedthe optimaltemperatureforthe predictionshould the rate of successful prediction. be ina rangeofintermediate temperatureswherehelices Figure 2b shows η as function of T for the strands, have started to form but strands have not vanished. It helices and turns for protein G. Note that the values of can be seen in Figure 6 that as the temperature is low- ered,the strands undergoa transitionto helices or other 4 kindsofstructuressuchasturnsorcoils. Ahelixcanalso CONCLUSIONS beformedfromacoilasthetemperaturecontinuestode- crease. Becausehelicesandturnsareassociatedwiththe Ourresultsconfirmthattheanalysisofconformational establishment of H-bonds while strands and coils usu- biases is a fast and useful tool to get information about ally have no contacts such transitions entail a change in native protein secondary structures. We find that the energy which is reflected in the specific heat. This sug- mostcommonsecondarymotifs,α-helicesandβ-strands, gests that the optimal temperature for secondary struc- can be predicted with an accuracy ranging from roughly ture prediction ought to be in the vicinity of the peak of 40% up to 100%. Our analysis shows that while the thespecificheat,andlikelyabithigherthanthetemper- rate of successful prediction is insensitive to the interac- atureofitsmaximuminordernottodiscriminateagainst tion window as well as to the length of the simulations, the strands. the choice of temperature appears to be critical. The The connection between the thermodynamics of the optimal run temperature is found to be related to the system and the optimal temperature for the prediction peak temperature in the specific heat. Unlike commonly are shown in Figures 7 and 8 for protein G and plasto- used algorithms in which one attempts to minimize an cyanin and hemerythrin respectively. The specific heat energy function to determine the native state structure, C as a function of temperature is calculated using the LINUS is analgorithmthat relies on a delicate interplay 24 histogram technique. For each protein we performed a between the entropy favoring the strands and energetic long simulation of 200000 cycles at T = 0.5ǫ/kB to de- considerations favoring turns and helices. Because there duce the thermodynamic behavior at that temperature is no procedure in LINUS that allows for an assembly and other temperatures in its vicinity. The results are ofstrandsthroughtheappropriatenon-localinteractions shown for the two values of ∆. For ∆ = 6 the maxi- intoasheet,secondarystructurepredictioninessencede- mum in C occurs roughly at T = 0.4ǫ/kB for the three pends on the persistence of a strand conformation down proteins. For ∆ = N the magnitude of the peak in C is tointermediatetemperaturesinregionscorrespondingto higheranditalsooccursataslightlyhighertemperature strandsinthenativestructure,whileotherregionsadopt due to the presence of the long range interactions. (It the helix and the turn conformations due to the energy isinterestingtonotethattheexperimentallydetermined gain through the local contacts. An improvement in the specific heat of plastocyanin25 shows a sharp maximum prediction might be expected on extending the ”Local around 68oC.) Note that for protein G the optimal tem- IndependentlyNucleatedUnitsofStructure”tosomeju- perature for the secondary structure prediction is found diciously chosen non-local interactions for the assembly inthevicinityofthepeakinthespecificheat–justtothe of β-sheets. rightof the maximum. A similar behavior is observedin the caseofplastocyaninexceptthatthe optimaltemper- atureat∆=6isfartherfromthemaximuminC. Thisis ACKNOWLEDGMENTS duetothefactthatplastocyaninconsistsonlyofβ-sheets and the strands are more favored at high temperatures. WeareindebtedtoRajSrinivasanandGeorgeRosefor However, in the case of myo hemerythrin, whose native inspiring this work, giving us the source code of LINUS state contains mainly α-helices, the behavior is just the andforprovidinguswonderfulsupportinunderstanding opposite. The optimal temperature for the prediction is the implementation of LINUS. This work was supported now on the low temperature side of the peak in the spe- by NASA and INFM. cific heat. A comparison of Figures 4 and 5 shows that atthe temperaturecorrespondingto themaximuminC, theratesofsuccessfulpredictionarealreadyclosetotheir best values. The results described so far are based on short time simulations which last for 1000 cycles at each T. Figure 9showsη asfunctionofT forproteinGwhentheconfor- 1Moult J, Hubbard T, Fidelis K, Pedersen JT. Critical mational biases are determined from nearly equilibrium assessment of methods of protein structure prediction simulations. These simulations are performed similar to (CASP):round III. Proteins Suppl1999;3:2-6. thecalculationofthespecificheat: wemakealongsimu- 2Orengo CA, Bray JE, Hubbard T, LoConte L, Sillitoe I. lationof 200000cycles at T =0.5ǫ/kB and then use the Analysisandassessmentofabinitiothree-dimensionalpre- histogrammethodtoobtainthebiasesatothertempera- diction,secondarystructure,andcontactsprediction.Pro- tures. The profilesofη overT aresurprisinglysimilar to teins Suppl1999;3:2-6. those shown in Figure 3 and the peaks seem to be even 3Bonneau R, Baker D. Ab initio protein structure predic- more pronounced. The predictions given by the confor- tion: progress and prospects. Annu Rev Biophys Biomol mational biases are found to be insensitive to the length Struct 2001;30:173-189. ofsimulations,atleastforarangeoftemperatureswhich 4Simons KT, Bonneau R, Ruczinski I, Baker D. Ab ini- are close to the optimal value. tio protein structure prediction of CASP III targets using ROSETTA.Proteins Suppl1999;3:2-6. 5 5Simons KT, Kooperberg C, Huang E, Baker D. Assem- J Comp Chem 1992;13:1011-1021. blyofproteintertiarystructuresfromfragmentswithsim- 25MilardiD,LaRosaC,GrassoD,GuzziR,SportelliL,Fini ilarlocalsequencesusingsimulatedannealingandBayesian C. Thermodynamics and kinetics of thethermal unfolding scoring functions. J Mol Biol 1997;268:209-225. of plastocyanin. Euro BiophysJ 1998;27:273-282. 6SimonsKT,StraussC,BakerD.Prospectsforabinitiopro- 26Kabsch W, Sander C. Dictionary of protein secondary tein structuralgenomics. J Mol Biol 2001;306:1191-1199. structure: patternrecognitionofhydrogen-bondedandge- 7Park B, Levitt M. Energy functions that discriminate X- ometrical features. 1983;22:2577-2637. ray and near-native folds from well-constructed decoys. J Mol Biol 1996;258:367-392. 8Samudrala R, Xia Y, Huang E, Levitt M. Ab initio pro- teinstructurepredictionusingacombinedhierarchicalap- proach. Proteins Suppl1999;3:2-6. 9Rost B, Sander C. Prediction of protein secondary struc- tureatbetterthan70%accuracy.JMolBiol1993;232:584- 599. 10RossD,SternbergM.Identificationand application of the concepts important for accurate and reliable protein sec- ondary structureprediction. Protein Sci 1996;5:2298-2310. 11Frishman D, Argos P. Knowledge-based secondary struc- tureassignment. Proteins 1995;23:566-579. 12LeeJ,LiwoA,ScheragaHA.Energy-baseddenovoprotein foldingbyconformationalspaceannealingandanoff-lattice united-residueforcefield: applicationtothe10-55fragment ofstaphylococcalproteinAandtoapocalbindinD9K.Proc Natl Acad Sci USA1999;96:2025-2030. 13Lee J, Liwo A, Ripoll DR, Pillardy J, Scheraga HA. Cal- culation of protein conformation by global optimization of a potential energy function. Proteins Suppl1999;3:2-6. 14OrtizAR,KolinskiA,RotkiewiczP,IlkowskiB,SkolnickJ. Ab initio folding of proteins using restraints derived from evolutionary information. Proteins Suppl1999;3:2-6. 15Srinivasan R, Rose GD. LINUS:a hierarchic procedure to predict the fold of a protein. Proteins: Struct. Func. Gen. 1995;22:81-99. 16Srinivasan R, Rose GD. A physical basis for pro- tein secondary structure. Proc. Nat. Acad. Sci. USA 1999;96:14258-14263. 17Ramachandran GN, Sasisekharan V. Conformation of polypeptides and proteins. Adv Prot Chem 1968;28:283- 437. 18Crippen GM. The treestructuralorganization of proteins. J Mol Biol 1978;126:315-332. 19Rose GD. Hierarchic organization of domains in globular proteins. J Mol Biol 1979;134:447-470. 20Shortle D, Ackerman MS. Persistence of native-like topol- ogy in a denatured protein in 8 M urea. Science 2001;293:487-489. 21Plaxco KW, Gross M. Unfolded, yes, but random? Never! Nat Struct Biol 2001;8:659-660. 22Srinivasan R,Rose GD. Private communication. 23Bernstein FC, Koetzle TF, Williams GJB, Meyer Jr. EF, Brice MD, Rodgers JR, Kennard O, Shimanouchi T, Tasumi M. The Protein Data Bank: a computer-based archival file for macromolecular structures. J Mol Biol 1977;112:535-542. 24FerrenbergAM,SwendsenRH.NewMonteCarlotechnique forstudyingphasetransitions.PhysRevLett1988;61:2635- 2638. Kumar S, Bouzida D, Swendsen RH, Kollman PA, Rosenberg JM. The weighted histogram analysis method forfree-energycalculationsonbiomolecules.I.Themethod. 6 FIGURE CAPTIONS temperatures. The lower peak (continuous line) and the higher one (dashed line) correspondto the Fig. 1. Conformationalbiases towards secondarystruc- interaction window equal to 6 and N respectively. tures, P, as functions of residues determined for The arrows show the temperatures at which the protein G at three different temperatures T = secondarystructuresarebestpredictedforthe two 0.8ǫ/kB, 0.5ǫ/kB and 0.2ǫ/kB. The types of sec- values of ∆. ondary structures are denoted as h for α-helices (continuous line), s for β-strands (dotted line), t Fig. 8. SameasFigure7butforplastocyanin(top)and for turns (dashed line) and c for coils (long-dashed myo hemerythrin (bottom). line). The biases are computed by averaging over Fig. 9. SameasFigure3buttheratesofsuccessfulpre- 10trajectorieseachof1000cycles startingfroman diction are computed from an equilibrium simula- open conformation. The interaction window is set tion. The secondary structure biases as a function to 6. of temperature are computed using the histogram Fig. 2. a) The assignment of the secondary structures method. A long simulation of 200000 cycles at for protein G extracted from the PDB structure T = 0.5ǫ/kB is performed to extract quantities at using the DSSP method of Kabsch and Sander26 other nearby temperatures. (box) and predicted from the analysis of the con- formational biases at T = 0.5ǫ/k and for ∆ = 6. B b)Therateofsuccessinprediction,η,asafunction of temperature for three kinds of secondary struc- tures: helix (h), strand (s) and turn (t). The sim- ulations are performed for protein G with ∆ = 6. At each temperature studied, the conformational biases are computed by averaging over 10 trajec- tories each of 1000 cycles starting from an open conformation. The error bars are determined from three simulations at each temperature. Fig. 3. Therateofsuccessinprediction,η,asafunction of temperature for three kinds of secondary struc- tures: helix (h), strand(s) and turn (t) for protein G with ∆=N or with no restriction on the range of interactions. The details are the same as in the lower part of Figure 2. Fig. 4. The rate of successful prediction of secondary structures as a function of temperature for plasto- cyanin(6PCY),myohemerythrin(2HMQ),staphy- lococcal nuclease (1STG), ubiquitin (1UBQ), ri- bonuclease A (7RSA) and ribonuclease H (2RN2). The simulations are performed in the same way as for protein G as described in the caption of Figure 2. The interaction window is set equal to 6. Fig. 5. Same as Figure 4 but with ∆=N. FIG.1. Fig. 6. Top: Conformationaldiagramplottedasafunc- tion of temperature and residue in protein G. The dark and light grey areas correspond to the helix (h) and strand (s) conformations respectively. In these calculations, ∆ = 6. Bottom: strand (thin box) and helix (thick box) fragments found in the native conformation of protein G. Fig. 7. The specific heat as a function of temperature for protein G. The thermodynamic averages were carried out by performing a long simulation of 200000 cycles at T = 0.5ǫ/k and then using the B histogram method to extract quantities at other 7 a) 1GB1, T=0.5ε / k , ∆=6 B PDB −ssssssscc ccsssssssc cchhhhhhhh Predicted −−sssssstt ttttssssss sshhhhhhhh hhhhhtttcc cssssstttt sssss− hhhhhttctt tsssssstts ssss−− b) FIG.2. FIG.4. FIG.3. FIG.5. 8 FIG.6. FIG.8. FIG.7. FIG.9. 9

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.