Methods in Molecular Biology 1533

Aalt D.J. van Dijk, Editor

Plant Genomics Databases
Methods and Protocols

Methods in Molecular Biology

Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, AL10 9AB, UK

For further volumes: http://www.springer.com/series/7651

Plant Genomics Databases
Methods and Protocols

Edited by
Aalt D.J. van Dijk
PRI Bioscience, Biometris, and Bioinformatics, Wageningen University & Research, Wageningen, The Netherlands

Editor
Aalt D.J. van Dijk
PRI Bioscience, Biometris and Bioinformatics
Wageningen University & Research
Wageningen, The Netherlands

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-6656-1
ISBN 978-1-4939-6658-5 (eBook)
DOI 10.1007/978-1-4939-6658-5
Library of Congress Control Number: 2016958617

© Springer Science+Business Media New York 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication.
Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Humana Press imprint is published by Springer Nature
The registered company is Springer Science+Business Media LLC
The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

Plant genomics has witnessed a dramatic increase in data production, in particular due to the revolution in sequencing technologies. This volume of Methods in Molecular Biology introduces databases containing the results of this data explosion. Chapters describe database contents as well as typical use cases, written in the spirit of the Series, which aims to provide practical guidance and troubleshooting advice.

Clearly, an assembled genome sequence is simply a foundation. The challenge for any researcher interested in the biology of a particular plant is to identify the features of the genome that describe this biology. Chapters 1–10 describe databases that primarily present genome sequences, integrated with various features relevant for biology. These include large databases covering various species, as well as databases focusing on one or a few related species. Expression and co-expression data are particularly useful for adding biological value to genomes. Databases presenting these data are described in Chapters 11–13. Finally, Chapters 14–19 present more specific and focused databases.

This volume focuses on “databases” as distinct from “analysis tools.” Hence, several tools are not included, because they do not present data but aim to analyze data provided by users. Other inclusion criteria were that the resource should be up to date and of sufficient size. Small databases obviously can be extremely relevant but would not make for a useful chapter in this volume.
However, a use case is included in Chapter 9 in which various small species-specific databases are compared. It should also be noted that this volume focuses on plant-specific resources. For that reason, various more general resources have not been included. Finally, the focus of this volume on genomics databases means that databases presenting purely other types of omics data, e.g., purely metabolomics data, are not included.

The data explosion mentioned above is ongoing. Much more data (de novo genome sequencing, resequencing of individuals, transcriptomics, epigenomics, etc.) will be added to the databases described in this volume in the near future. That notwithstanding, the chapters presented here provide clear guidance in accessing an important collection of plant databases which can be used to add biological value to genomics data.

Wageningen, The Netherlands
Aalt-Jan van Dijk

Contents

Preface
Contributors

1 Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomic Data
    Dan M. Bolser, Daniel M. Staines, Emily Perry, and Paul J. Kersey
2 PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data
    Manuel Spannagl, Thomas Nussbaumer, Kai Bader, Heidrun Gundlach, and Klaus F.X. Mayer
3 Plant Genome DataBase Japan (PGDBj)
    Akihiro Nakaya, Hisako Ichihara, Erika Asamizu, Sachiko Shirasawa, Yasukazu Nakamura, Satoshi Tabata, and Hideki Hirakawa
4 FLAGdb++: A Bioinformatic Environment to Study and Compare Plant Genomes
    Jean Philippe Tamby and Véronique Brunaud
5 Mining Plant Genomic and Genetic Data Using the GnpIS Information System
    A.-F. Adam-Blondon, M. Alaux, S. Durand, T. Letellier, G. Merceron, N. Mohellibi, C. Pommier, D. Steinbach, F. Alfama, J. Amselem, D. Charruaud, N. Choisne, R. Flores, C. Guerche, V. Jamilloux, E. Kimmel, N. Lapalu, M. Loaec, C. Michotey, and H. Quesneville
6 The Bio-Analytic Resource for Plant Biology
    Jamie Waese and Nicholas J. Provart
7 The Evolution of Soybean Knowledge Base (SoyKB)
    Trupti Joshi, Jiaojiao Wang, Hongxin Zhang, Shiyuan Chen, Shuai Zeng, Bowei Xu, and Dong Xu
8 Using TropGeneDB: A Database Containing Data on Molecular Markers, QTLs, Maps, Genotypes, and Phenotypes for Tropical Crops
    Manuel Ruiz, Guilhem Sempéré, and Chantal Hamelin
9 Species-Specific Genome Sequence Databases: A Practical Review
    Aalt D.J. van Dijk
10 A Guide to the PLAZA 3.0 Plant Comparative Genomic Database
    Klaas Vandepoele
11 Exploring Plant Co-Expression and Gene-Gene Interactions with CORNET 3.0
    Michiel Van Bel and Frederik Coppens
12 PlaNet: Comparative Co-Expression Network Analyses for Plants
    Sebastian Proost and Marek Mutwil
13 Practical Utilization of OryzaExpress and Plant Omics Data Center Databases to Explore Gene Expression Networks in Oryza sativa and Other Plant Species
    Toru Kudo, Shin Terashima, Yuno Takaki, Yukino Nakamura, Masaaki Kobayashi, and Kentaro Yano
14 Pathway Analysis and Omics Data Visualization Using Pathway Genome Databases: FragariaCyc, A Case Study
    Sushma Naithani and Pankaj Jaiswal
15 CSGRqtl: A Comparative Quantitative Trait Locus Database for Saccharinae Grasses
    Dong Zhang and Andrew H. Paterson
16 Plant Genome Duplication Database
    Tae-Ho Lee, Junah Kim, Jon S. Robertson, and Andrew H. Paterson
17 Variant Effect Prediction Analysis Using Resources Available at Gramene Database
    Sushma Naithani, Matthew Geniza, and Pankaj Jaiswal
18 Plant Promoter Database (PPDB)
    Kazutaka Kusunoki and Yoshiharu Y. Yamamoto
19 Construction of the Leaf Senescence Database and Functional Assessment of Senescence-Associated Genes
    Zhonghai Li, Yi Zhao, Xiaochuan Liu, Zhiqiang Jiang, Jinying Peng, Jinpu Jin, Hongwei Guo, and Jingchu Luo

Index

Contributors

A.-F. Adam-Blondon • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
M. Alaux • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
F. Alfama • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
J. Amselem • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Erika Asamizu • Department of Plant Life Sciences, Faculty of Agriculture, Ryukoku University, Otsu, Shiga, Japan
Kai Bader • Plant Genome and Systems Biology, Helmholtz Center Munich, Neuherberg, Germany
Michiel Van Bel • Department of Plant Systems Biology, VIB, Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
Dan M.
Bolser • European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Véronique Brunaud • Institute of Plant Sciences Paris-Saclay IPS2, CNRS, INRA, University Paris-Sud, University Evry, Univ Paris-Saclay, Orsay, France; Institute of Plant Sciences Paris-Saclay IPS2, Univ Paris-Diderot, Sorbonne Paris Cité, Orsay, France
D. Charruaud • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France; ADRINORD Espace Recherche Innovation, Lille, France
Shiyuan Chen • Department of Computer Science, Christopher S. Bond Life Science Center, University of Missouri, Columbia, MO, USA
N. Choisne • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Frederik Coppens • Department of Plant Systems Biology, VIB, Ghent, Belgium; Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
Aalt D.J. van Dijk • Applied Bioinformatics, Plant Sciences Group, Wageningen University & Research Centre (WUR), Wageningen, The Netherlands; Laboratory of Bioinformatics, Plant Sciences Group, Wageningen University & Research Centre (WUR), Wageningen, The Netherlands; Biometris, Plant Sciences Group, Wageningen University & Research Centre (WUR), Wageningen, The Netherlands
S. Durand • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
R. Flores • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Matthew Geniza • Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA; Molecular and Cellular Biology Graduate Program, Oregon State University, Corvallis, OR, USA
C.
Guerche • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Heidrun Gundlach • Plant Genome and Systems Biology, Helmholtz Center Munich, Neuherberg, Germany
Hongwei Guo • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
Chantal Hamelin • UMR Amélioration Génétique et Adaptation des Plantes Méditerranéennes et Tropicales (AGAP), CIRAD, Montpellier, France
Hideki Hirakawa • Department of Technology Development, Kazusa DNA Research Institute, Kisarazu, Chiba, Japan
Hisako Ichihara • Department of Technology Development, Kazusa DNA Research Institute, Kisarazu, Chiba, Japan
Pankaj Jaiswal • Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
V. Jamilloux • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Zhiqiang Jiang • Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA; State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences and Center for Bioinformatics, Peking University, Beijing, China
Jinpu Jin • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences and Center for Bioinformatics, Peking University, Beijing, China
Trupti Joshi • Department of Molecular Microbiology and Immunology, Medical Research Office School of Medicine, Informatics Institute, University of Missouri, Columbia, MO, USA; Department of Computer Science, Christopher S. Bond Life Science Center, University of Missouri, Columbia, MO, USA
Paul J. Kersey • European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Junah Kim • Genomics Division, Department of Agricultural Bio-resource, National Academy of Agricultural Science, Rural Development Administration (RDA), Jeonju, South Korea
E.
Kimmel • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Masaaki Kobayashi • Bioinformatics Laboratory, School of Agriculture, Meiji University, Kawasaki, Kanagawa, Japan
Toru Kudo • Bioinformatics Laboratory, School of Agriculture, Meiji University, Kawasaki, Kanagawa, Japan
Kazutaka Kusunoki • United Graduate School of Agricultural Science, Gifu University, Gifu City, Gifu, Japan
N. Lapalu • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France; UMR BIOGER, UMR1290, INRA, AgroParisTech, Thiverval-Grignon, France
Tae-Ho Lee • Genomics Division, Department of Agricultural Bio-Resource, National Academy of Agricultural Science, Rural Development Administration (RDA), Jeonju, South Korea; Plant Genome Mapping Laboratory, University of Georgia, Athens, GA, USA
T. Letellier • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Zhonghai Li • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
Xiaochuan Liu • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China; Department of Microbiology, Biochemistry, and Molecular Genetics, Rutgers University, New Brunswick, NJ, USA
M. Loaec • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Jingchu Luo • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences and Center for Bioinformatics, Peking University, Beijing, China
Klaus F.X. Mayer • Plant Genome and Systems Biology, Helmholtz Center Munich, Neuherberg, Germany
G. Merceron • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
C.
Michotey • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
N. Mohellibi • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Marek Mutwil • Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
Sushma Naithani • Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
Yasukazu Nakamura • Department of Technology Development, Kazusa DNA Research Institute, Kisarazu, Chiba, Japan
Yukino Nakamura • Bioinformatics Laboratory, School of Agriculture, Meiji University, Kawasaki, Kanagawa, Japan
Akihiro Nakaya • Department of Genome Informatics, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan
Thomas Nussbaumer • Plant Genome and Systems Biology, Helmholtz Center Munich, Neuherberg, Germany
Andrew H. Paterson • Plant Genome Mapping Laboratory (Dept #398), University of Georgia, Athens, GA, USA
Jinying Peng • State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
Emily Perry • European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
C. Pommier • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France
Sebastian Proost • Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
Nicholas J. Provart • Department of Cell and Systems Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada
H. Quesneville • Research Unit in Genomics-Info UR1164, INRA, Université Paris-Saclay, Versailles, Versailles Cedex, France