
Bioinformatics and Molecular Evolution - EVA PDF

398 Pages·2009·3.67 MB·English

Preview Bioinformatics and Molecular Evolution - EVA

BAMA01 09/03/2009 16:31 Page ii

BIOINFORMATICS AND MOLECULAR EVOLUTION

Bioinformatics and Molecular Evolution
Paul G. Higgs and Teresa K. Attwood

© 2005 by Blackwell Science Ltd
a Blackwell Publishing company

BLACKWELL PUBLISHING
350 Main Street, Malden, MA 02148-5020, USA
108 Cowley Road, Oxford OX4 1JF, UK
550 Swanston Street, Carlton, Victoria 3053, Australia

The right of Paul G. Higgs and Teresa K. Attwood to be identified as the Authors of this Work has been asserted in accordance with the UK Copyright, Designs, and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs, and Patents Act 1988, without the prior permission of the publisher.

First published 2005 by Blackwell Science Ltd

Library of Congress Cataloging-in-Publication Data
Higgs, Paul G.
Bioinformatics and molecular evolution / Paul G. Higgs and Teresa K. Attwood.
p. ; cm.
Includes bibliographical references and index.
ISBN 1–4051–0683–2 (pbk. : alk. paper)
1. Molecular evolution--Mathematical models. 2. Bioinformatics.
[DNLM: 1. Computational Biology. 2. Evolution, Molecular. 3. Genetics. QU 26.5 H637b 2005]
I. Attwood, Teresa K. II. Title.
QH371.3.M37H54 2005
572.8--dc22
2004021066

A catalogue record for this title is available from the British Library.

Set in 9½/12pt Photina by Graphicraft Limited, Hong Kong
Printed and bound in the United Kingdom by TJ International, Padstow, Cornwall

The publisher’s policy is to use permanent paper from mills that operate a sustainable forestry policy, and which has been manufactured from pulp processed using acid-free and elementary chlorine-free practices. Furthermore, the publisher ensures that the text paper and cover board used have met acceptable environmental accreditation standards.

For further information on Blackwell Publishing, visit our website: www.blackwellpublishing.com

Short Contents

Preface
Chapter plan
1 Introduction: The revolution in biological information
2 Nucleic acids, proteins, and amino acids
3 Molecular evolution and population genetics
4 Models of sequence evolution
5 Information resources for genes and proteins
6 Sequence alignment algorithms
7 Searching sequence databases
8 Phylogenetic methods
9 Patterns in protein families
10 Probabilistic methods and machine learning
11 Further topics in molecular evolution and phylogenetics
12 Genome evolution
13 DNA Microarrays and the ’omes
Mathematical appendix
List of Web addresses
Glossary
Index

Full Contents

Preface

1 INTRODUCTION: THE REVOLUTION IN BIOLOGICAL INFORMATION
1.1 Data explosions
1.2 Genomics and high-throughput techniques
1.3 What is bioinformatics?
1.4 The relationship between population genetics, molecular evolution, and bioinformatics
Summary • References • Problems

2 NUCLEIC ACIDS, PROTEINS, AND AMINO ACIDS
2.1 Nucleic acid structure
2.2 Protein structure
2.3 The central dogma
2.4 Physico-chemical properties of the amino acids and their importance in protein folding
Box 2.1 Polymerase chain reaction (PCR)
2.5 Visualization of amino acid properties using principal component analysis
2.6 Clustering amino acids according to their properties
Box 2.2 Principal component analysis in more detail
Summary • References • Self-test Biological background

3 MOLECULAR EVOLUTION AND POPULATION GENETICS
3.1 What is evolution?
3.2 Mutations
3.3 Sequence variation within and between species
3.4 Genealogical trees and coalescence
3.5 The spread of new mutations
3.6 Neutral evolution and adaptation
Box 3.1 The influence of selection on the fixation probability
Box 3.2 A deterministic theory for the spread of mutations
Summary • References • Problems

4 MODELS OF SEQUENCE EVOLUTION
4.1 Models of nucleic acid sequence evolution
Box 4.1 Solution of the Jukes–Cantor model
4.2 The PAM model of protein sequence evolution
Box 4.2 PAM distances
4.3 Log-odds scoring matrices for amino acids
Summary • References • Problems • Self-test Molecular evolution

5 INFORMATION RESOURCES FOR GENES AND PROTEINS
5.1 Why build a database?
5.2 Database file formats
5.3 Nucleic acid sequence databases
5.4 Protein sequence databases
5.5 Protein family databases
5.6 Composite protein pattern databases
5.7 Protein structure databases
5.8 Other types of biological database
Summary • References

6 SEQUENCE ALIGNMENT ALGORITHMS
6.1 What is an algorithm?
6.2 Pairwise sequence alignment – The problem
6.3 Pairwise sequence alignment – Dynamic programming methods
6.4 The effect of scoring parameters on the alignment
6.5 Multiple sequence alignment
Summary • References • Problems

7 SEARCHING SEQUENCE DATABASES
7.1 Similarity search tools
7.2 Alignment statistics (in theory)
Box 7.1 Extreme value distributions
Box 7.2 Derivation of the extreme value distribution in the word-matching example
7.3 Alignment statistics (in practice)
Summary • References • Problems • Self-test Alignments and database searching

8 PHYLOGENETIC METHODS
8.1 Understanding phylogenetic trees
8.2 Choosing sequences
8.3 Distance matrices and clustering methods
Box 8.1 Calculation of distances in the neighbor-joining method
8.4 Bootstrapping
8.5 Tree optimization criteria and tree search methods
8.6 The maximum-likelihood criterion
Box 8.2 Calculating the likelihood of the data on a given tree
8.7 The parsimony criterion
8.8 Other methods related to maximum likelihood
Box 8.3 Calculating posterior probabilities
Summary • References • Problems • Self-test Phylogenetic methods

9 PATTERNS IN PROTEIN FAMILIES
9.1 Going beyond pairwise alignment methods for database searches
9.2 Regular expressions
9.3 Fingerprints
9.4 Profiles and PSSMs
9.5 Biological applications – G protein-coupled receptors
Summary • References • Problems • Self-test Protein families and databases

10 PROBABILISTIC METHODS AND MACHINE LEARNING
10.1 Using machine learning for pattern recognition in bioinformatics
10.2 Probabilistic models of sequences – Basic ingredients
Box 10.1 Dirichlet prior distributions
10.3 Introducing hidden Markov models
Box 10.2 The Viterbi algorithm
Box 10.3 The forward and backward algorithms
10.4 Profile hidden Markov models
10.5 Neural networks
Box 10.4 The back-propagation algorithm
10.6 Neural networks and protein secondary structure prediction
Summary • References • Problems
