
Data mining for important amino acid residues in multiple sequence alignments and protein ... PDF

141 Pages·2014·6.21 MB·English


Data mining for important amino acid residues in multiple sequence alignments and protein structures

Dissertation for the award of the doctoral degree in natural sciences (Dr. rer. nat.) from the Faculty of Biology and Preclinical Medicine of the Universität Regensburg

Submitted by Jan-Oliver Janda from Linz, Austria, in 2014

The application for the doctorate was submitted on: 18.02.2014
The work was supervised by: apl. Prof. Dr. Rainer Merkl
Signature:

Contents

List of figures
List of tables
Abbreviations
Abstract
Zusammenfassung (German summary)
1 Introduction
  1.1 Proteins and enzymes
  1.2 Machine learning
  1.3 Protein structures
  1.4 Multiple sequence alignments
  1.5 Aim of this work
2 Summary and discussion
  2.1 Classification of highly conserved residue positions
    2.1.1 CLIPS-1D: A solely sequence-based classifier
    2.1.2 CLIPS-3D: A solely structure-based classifier
    2.1.3 CLIPS-4D: A sequence- and structure-based classifier
  2.2 Identification of correlated mutations
    2.2.1 Statistical analysis
    2.2.2 Case studies that illustrate classification performance
3 Bibliography
4 List of publications and personal contribution
5 Publications
  5.1 Publication A
  5.2 Publication B
  5.3 Publication C
6 Acknowledgement

List of Figures

1.1 Venn diagram of the properties of the 20 amino acids
1.2 Conformation of a peptide bond
1.3 Simplified reaction coordinate diagram
1.4 Principle of a support vector machine
1.5 Representation of a protein structure
1.6 Schematics of the solvent accessible surface area
1.7 Example of a multiple sequence alignment
1.8 Schematics of a correlated mutation
2.1 Location of predicted STRUC_sites in TrpC

List of Tables

2.1 Performance of CLIPS-1D, the binary SVMs, and FRpred
2.2 Residue and class-specific performance values of CLIPS-1D
2.3 Classification of catalytic and binding sites for TrpC
2.4 Performance of CLIPS-1D, CLIPS-3D, CLIPS-4D, and Consurf
2.5 Classification performance of firestar, CLIPS-4D and an ensemble classifier on ligand-binding sites of six CASP targets
2.6 Residue and class-specific performance values of CLIPS-4D
2.7 Performance of H2rs
2.8 Overlapping predictions of H2rs with H2r and PSICOV on five case studies

Description:
2.1 Classification of highly conserved residue positions. The datasets with the greatest information content for such algorithms are.