
Microarray Image Analysis: An Algorithmic Approach (Chapman & Hall/CRC Computer Science & Data Analysis)

300 Pages·2010·4.802 MB·English


Chapman & Hall/CRC
Computer Science and Data Analysis Series

The interface between the computer and statistical sciences is increasing, as each discipline seeks to harness the power and resources of the other. This series aims to foster the integration between the computer sciences and statistical, numerical, and probabilistic methods by publishing a broad range of reference works, textbooks, and handbooks.

SERIES EDITORS
David Blei, Princeton University
David Madigan, Rutgers University
Marina Meila, University of Washington
Fionn Murtagh, Royal Holloway, University of London

Proposals for the series should be sent directly to one of the series editors above, or submitted to:
Chapman & Hall/CRC
4th Floor, Albert House
1-4 Singer Street
London EC2A 4BQ
UK

Published Titles
Bayesian Artificial Intelligence, Kevin B. Korb and Ann E. Nicholson
Clustering for Data Mining: A Data Recovery Approach, Boris Mirkin
Computational Statistics Handbook with MATLAB®, Second Edition, Wendy L. Martinez and Angel R. Martinez
Correspondence Analysis and Data Coding with Java and R, Fionn Murtagh
Design and Modeling for Computer Experiments, Kai-Tai Fang, Runze Li, and Agus Sudjianto
Exploratory Data Analysis with MATLAB®, Wendy L. Martinez and Angel R. Martinez
Introduction to Data Technologies, Paul Murrell
Introduction to Machine Learning and Bioinformatics, Sushmita Mitra, Sujay Datta, Theodore Perkins, and George Michailidis
Microarray Image Analysis: An Algorithmic Approach, Karl Fraser, Zidong Wang, and Xiaohui Liu
Pattern Recognition Algorithms for Data Mining, Sankar K. Pal and Pabitra Mitra
R Graphics, Paul Murrell
R Programming for Bioinformatics, Robert Gentleman
Semisupervised Learning for Computational Linguistics, Steven Abney
Statistical Computing with R, Maria L. Rizzo

Microarray Image Analysis
An Algorithmic Approach
Karl Fraser
Zidong Wang
Xiaohui Liu

Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2010 by Taylor and Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number: 978-1-4200-9153-3 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Fraser, Karl.
Microarray image analysis : an algorithmic approach / Karl Fraser, Zidong Wang, Xiaohui Liu.
p. cm. -- (Computer science and data analysis series)
Includes bibliographical references and index.
ISBN 978-1-4200-9153-3 (hardcover : alk. paper)
1. DNA microarrays. 2. Image processing--Digital techniques. I. Wang, Zidong. II. Liu, Xiaohui. III. Title.
QP624.5.D726F73 2010
572.8’636--dc22
2009044676

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Table of Contents

List of Figures
List of Algorithms
Preface and Acknowledgments
Biographies

1 Introduction
  1.1 Overview
  1.2 Current state of art
  1.3 Experimental approach
  1.4 Key issues
    1.4.1 Noise reduction
    1.4.2 Gene spot identification
    1.4.3 Gene spot quantification
    1.4.4 Slide and experiment normalization
  1.5 Contribution to knowledge
  1.6 Structure of the book

2 Background
  2.1 Introduction
  2.2 Molecular biology
    2.2.1 Inheritance and the structure of DNA
    2.2.2 Central dogma
  2.3 Microarray technology
    2.3.1 Gene expression
    2.3.2 Microarrays
    2.3.3 Process summary
    2.3.4 Final output
  2.4 Microarray analysis
    2.4.1 Addressing
    2.4.2 Segmentation
    2.4.3 Feature extraction
    2.4.4 GenePix interpretation
    2.4.5 Gene morphology
  2.5 Copasetic microarray analysis framework overview
  2.6 Summary

3 Data Services
  3.1 Introduction
  3.2 Image transformation engine
    3.2.1 Surface artifacts
    3.2.2 ITE precursor
    3.2.3 The method
  3.3 Evaluation
    3.3.1 Experiment results
    3.3.2 Strengths and weaknesses
  3.4 Summary

4 Structure Extrapolation I
  4.1 Introduction
  4.2 Pyramidic contextual clustering
    4.2.1 The algorithm
    4.2.2 Analysis
  4.3 Evaluation
    4.3.1 Search grid analysis
    4.3.2 Synthetic data
    4.3.3 Real-world data
    4.3.4 Strengths and weaknesses
  4.4 Summary

5 Structure Extrapolation II
  5.1 Introduction
  5.2 Image layout - master blocks
    5.2.1 The algorithm
    5.2.2 Evaluation
  5.3 Image structure - meta-blocks
    5.3.1 Stage one - create meta-block
    5.3.2 Stage two - external gene spot locations (phase I)
    5.3.3 Stage three - internal gene spot locations (phase I)
    5.3.4 Stage four - external gene spot locations (phase II)
    5.3.5 Stage five - internal gene spot locations (phase II)
  5.4 Summary

6 Feature Identification I
  6.1 Introduction
  6.2 Spatial binding
    6.2.1 Pyramidic contextual clustering - revisited
    6.2.2 The method
  6.3 Evaluation of feature identification
    6.3.1 Finding a gene spot’s location and morphology
    6.3.2 Recovering weak genes
    6.3.3 Strengths and weaknesses
  6.4 Evaluation of copasetic microarray analysis framework
    6.4.1 Peak signal-to-noise ratio for validation
    6.4.2 Strengths and weaknesses
  6.5 Summary

7 Feature Identification II
  7.1 Background
  7.2 Proposed approach - subgrid detection
    7.2.1 Step 1: Filter the image
    7.2.2 Step 2: Spot spacing calculation
    7.2.3 Step 3: Subgrid shape detection
    7.2.4 Step 4: Subgrid detection
  7.3 Experimental results
  7.4 Conclusions

8 Chained Fourier Background Reconstruction
  8.1 Introduction
  8.2 Existing techniques
  8.3 A new technique
    8.3.1 Description
    8.3.2 Example and pseudo-code
  8.4 Experiments and results
    8.4.1 Dataset characteristics
    8.4.2 Synthetic data
    8.4.3 Real data
  8.5 Conclusions

9 Graph-Cutting for Improving Microarray Gene Expression Reconstructions
  9.1 Introduction
  9.2 Existing techniques
  9.3 Proposed technique
    9.3.1 Description
    9.3.2 Pseudo-code and example
  9.4 Experiments and results
    9.4.1 Dataset characteristics
    9.4.2 Synthetic data
    9.4.3 Real data
  9.5 Conclusions

10 Stochastic Dynamic Modeling of Short Gene Expression Time Series Data
  10.1 Introduction
  10.2 Stochastic dynamic model for gene expression data
  10.3 An EM algorithm for parameter identification
  10.4 Simulation Results
    10.4.1 Modeling of yeast gene expression time series
    10.4.2 Modeling of virus gene expression time series
    10.4.3 Modeling of human malaria and worm gene expression time series
  10.5 Discussions
    10.5.1 Model quality evaluation
    10.5.2 Comparisons with existing modeling methods
  10.6 Conclusions and Future Work

11 Conclusions
  11.1 Introduction
  11.2 Achievements
    11.2.1 Noise reduction
    11.2.2 Gene spot identification
    11.2.3 Gene spot quantification
    11.2.4 Slide and experiment normalization
  11.3 Contributions to microarray biology domain
    11.3.1 Technical
    11.3.2 Practical
  11.4 Contributions to computer science domain
    11.4.1 Technical
    11.4.2 Practical
  11.5 Future research topics
    11.5.1 Image transformation engine
    11.5.2 Pyramidic contextual clustering
    11.5.3 Image layout and image structure
    11.5.4 Spatial binding
    11.5.5 Combining microarray image channel data
    11.5.6 Other image sets
    11.5.7 Distributed communication subsystems

12 Appendices
  12.1 Appendix A: Microarray variants
    12.1.1 Building the chips
    12.1.2 Digital generation
  12.2 Appendix B: Basic transformations
    12.2.1 Linear transform generation
    12.2.2 Square root transform generation
    12.2.3 Inverse transform generation
    12.2.4 Movement transform generation
  12.3 Appendix C: Clustering
    12.3.1 K-means algorithm
    12.3.2 Fuzzy c-means algorithm
    12.3.3 Hierarchical clustering
    12.3.4 Distances
  12.4 Appendix D: A glance on mining gene expression data
    12.4.1 Data analysis
    12.4.2 New challenges and opportunities
    12.4.3 Data mining methods for gene expression analysis
  12.5 Appendix E: Autocorrelation and GHT
    12.5.1 Autocorrelation
    12.5.2 Generalized “circular” Hough transform

References
Index
