Next-generation Sequencing and Bioinformatics for Plant Science

Edited by Vijai Bhadauria
Crop Development Centre and Department of Plant Sciences
University of Saskatchewan
Saskatoon, SK
Canada

https://doi.org/10.21775/9781910190654

Caister Academic Press

Copyright © 2017
Caister Academic Press
Norfolk, UK
www.caister.com

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-1-910190-65-4 (paperback)
ISBN: 978-1-910190-66-1 (ebook)

Description or mention of instrumentation, software, or other products in this book does not imply endorsement by the author or publisher. The author and publisher do not assume responsibility for the validity of any products or procedures mentioned or described in this book or for the consequences of their use.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher. No claim to original U.S. Government works.

Cover design adapted from an image provided by Dr Xifeng Chen and Dr Bojun Ma.

Ebooks
Ebooks supplied to individuals are single-user only and must not be reproduced, copied, stored in a retrieval system, or distributed by any means, electronic, mechanical, photocopying, email, internet or otherwise. Ebooks supplied to academic libraries, corporations, government organizations, public libraries, and school libraries are subject to the terms and conditions specified by the supplier.

Contents

Editorial

1 Status and Prospects of Next-generation Sequencing Technologies in Crop Plants
Tilak R. Sharma, Basavantraya N. Devanna, Kanti Kiran, Pankaj K. Singh, Kirti Arora, Priyanka Jain, Ila M. Tiwari, Himanshu Dubey, Banita K. Saklani, Mandeep Kumari, Jyoti Singh, Rajdeep Jaswal, Ritu Kapoor, Deepak V. Pawar, Shruti Sinha, Deepak S. Bisht, Amolkumar U. Solanke and Tapan K. Mondal

2 Next-generation Sequencing Promoted the Release of Reference Genomes and Discovered Genome Evolution in Cereal Crops
Yong Huang, Haiyang Liu and Yongzhong Xing

3 Advanced Applications of Next-generation Sequencing Technologies to Orchid Biology
Chuan-Ming Yeh, Zhong-Jian Liu and Wen-Chieh Tsai

4 Bioinformatics Resources for Plant Genomics: Opportunities and Bottlenecks in the -omics Era
Luca Ambrosino, Chiara Colantuono, Francesco Monticolo and Maria Luisa Chiusano

5 Applications of Bioinformatics to Plant Biotechnology
Diego F. Gomez-Casati, María V. Busi, Julieta Barchiesi, Diego A. Peralta, Nicolás Hedin and Vijai Bhadauria

6 Quantitative Genetics of Disease Resistance in Wheat
Vijai Bhadauria and Lucia Popescu

7 A Rice Genetic Improvement Boom by Next-generation Sequencing
Xiangchun Zhou, Xufeng Bai and Yongzhong Xing

8 Dual RNA-seq to Elucidate the Plant–Pathogen Duel
Sanushka Naidoo, Erik Andrei Visser, Lizahn Zwart, Yves du Toit, Vijai Bhadauria and Louise Simone Shuey

9 Next-generation Sequencing Sheds New Light on Small RNAs in Plant Reproductive Development
Xiaobai Li

10 ChIP-seq: A Powerful Tool for Studying Protein–DNA Interactions in Plants
Xifeng Chen, Vijai Bhadauria and Bojun Ma

11 Cataloguing Plant Genome Structural Variations
Xingtan Zhang, Xuequn Chen, Pingping Liang and Haibao Tang

Editorial

With the advent of high-throughput sequencing platforms, such as Illumina’s Genome Analyzer, HiSeq, MiSeq and NextSeq, Roche/454’s Genome Sequencer FLX, Thermo-Fisher Scientific’s SOLiD, Ion Torrent and Ion Proton, PacBio’s Real-Time Sequencer and, more recently, Oxford Nanopore’s MinION, it has become feasible to sequence entire genomes and transcriptomes at an exponential pace. This has huge implications for plant breeding and genetics. The reviews presented in this volume summarize recent developments in next-generation sequencing (NGS) and bioinformatics tools and their application in understanding and improving agronomic traits.

Next-generation sequencing (NGS) coupled with high-performance computing has revolutionized the field of plant breeding and genetics (Bhadauria and Banniza, 2014; Bhadauria et al., 2016). This volume compiles recent advances in NGS as well as the application of NGS in understanding and improving agronomic traits such as yield, drought tolerance and disease resistance.

In this volume, the review by Sharma et al. (Chapter 1) outlines the evolution of DNA sequencing techniques and platforms, including the first generation (Sanger’s chain-termination method and Maxam–Gilbert’s chemical cleavage method), the second generation (Illumina’s GA, HiSeq and MiSeq as well as Roche/454’s GS FLX), third-generation single-molecule real-time sequencing (PacBio’s SMRT RS) and the more recent fourth-generation sequencing platform, Oxford Nanopore’s MinION. The applications of NGS to genome sequencing and evolutionary studies of rice, maize and sorghum (Chapter 2) and orchids (Chapter 3) shed light on the challenges in the post-NGS era, such as genome assembly and annotation. Gomez-Casati and colleagues, in Chapter 5, describe the application of omics (gene expression and regulation as well as quantitative proteomics approaches, such as iTRAQ) in fruit development and ripening, and plant disease resistance. Bioinformatics resources for plant genomics are detailed and discussed in two reviews (Chapters 4 and 5). The NGS-based genotyping-by-sequencing approaches are useful for mapping polymorphisms in experimental populations and germplasm, which can then be used to track genomic regions controlling quantitative traits, such as fusarium head blight and stripe rust resistance in wheat (Chapter 6) and rice (Chapter 7). Sequencing of the transcriptome from infected plant tissues (dual RNA-seq) can provide molecular insight into host defence and pathogen virulence during incompatible and compatible interactions, thereby facilitating the design of crops with improved resistance (Chapter 8). In addition to genome and transcriptome sequencing, NGS can also be used to sequence small RNAs (20–24 nucleotides; Chapter 9) and transcription factor binding sites (ChIP-seq; Chapter 10) in genomes. Structural variations, such as abnormal chromosome number, chromosomal rearrangement, copy number variation, presence or absence variation, mobile element insertion and deletion, and homologous exchange, play a key role in the phenotypic diversity of agronomic traits, such as biotic and abiotic stress tolerance. Zhang and colleagues (Chapter 11) looked into the application of NGS in mapping such structural variations.

References

Bhadauria, V., and Banniza, S. (2014). What lies ahead in post-genomics era: a perspective on genetic improvement of crops for fungal disease resistance? Plant Signal Behav. 9, e28503.
Bhadauria, V., Wong, M.M., Bett, K.E., and Banniza, S. (2016). Wild help for enhancing genetic resistance in lentil against fungal diseases. Curr. Issues Mol. Biol. 19, 3–6.

Vijai Bhadauria
Crop Development Centre and Department of Plant Sciences, University of Saskatchewan, Saskatoon, SK, Canada

Lucia Popescu
Department of Soil Science, University of Saskatchewan, Saskatoon, SK, Canada

https://doi.org/10.21775/9781910190654.01

1 Status and Prospects of Next-generation Sequencing Technologies in Crop Plants

Tilak R. Sharma*, Basavantraya N. Devanna, Kanti Kiran, Pankaj K. Singh, Kirti Arora, Priyanka Jain, Ila M. Tiwari, Himanshu Dubey, Banita K.
Saklani, Mandeep Kumari, Jyoti Singh, Rajdeep Jaswal, Ritu Kapoor, Deepak V. Pawar, Shruti Sinha, Deepak S. Bisht, Amolkumar U. Solanke and Tapan K. Mondal

ICAR-National Research Centre on Plant Biotechnology, Pusa Campus, New Delhi, India.

*Correspondence: [email protected] and [email protected]

https://doi.org/10.21775/9781910190654.02

Abstract

The history of DNA sequencing dates back to the 1970s. During this period, the two first-generation nucleotide sequencing techniques were developed. Subsequently, Sanger’s dideoxy method of sequencing gained popularity over Maxam and Gilbert’s chemical method of sequencing. However, in the last decade, we have observed revolutionary changes in DNA sequencing technologies leading to the emergence of next-generation sequencing (NGS) techniques. NGS technologies have enhanced the throughput and speed of sequencing while bringing down the overall cost of the process over time. The major applications of NGS technologies are genome sequencing and resequencing, transcriptomics, metagenomics in relation to plant–microbe interactions, exon and genome capturing, development of molecular markers and evolutionary studies. In this review, we present a broader picture of the evolution of NGS tools, their various applications in crop plants, and future prospects of the technology for crop improvement.

Introduction

The overall growth, development and behavioural characteristics of every living creature are largely determined by its genetic constitution. Subsequent to the famous double-helix model of DNA, proposed by Watson and Crick (1953), scientists began to seek ways and means to determine the nucleotide sequence of DNA. The first significant breakthrough in this area was achieved in the late 1970s, when two groups working independently reported two different approaches for DNA sequencing (Maxam and Gilbert, 1977; Sanger et al., 1977). Though Maxam and Gilbert’s approach for DNA sequencing was preferred initially, it was Sanger’s sequencing technology that subsequently became popular among the scientific community. The classical genome sequencing projects, such as the Human Genome Project (HGP), the Arabidopsis Genome Initiative and the International Rice Genome Sequencing Project, were successfully completed using Sanger’s sequencing approach. Subsequently, many plant genomes were sequenced using this sequencing technology. Though Sanger’s dideoxy sequencing method is considered the gold standard with respect to genome sequencing, there