Trends and Applications of Text Summarization Techniques

Alessandro Fiori
Candiolo Cancer Institute – FPO, IRCCS, Italy

A volume in the Advances in Data Mining and Database Management (ADMDM) Book Series

Published in the United States of America by
IGI Global
Engineering Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue
Hershey PA, USA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: [email protected]
Web site: http://www.igi-global.com

Copyright © 2020 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

Names: Fiori, Alessandro, 1982- editor.
Title: Trends and applications of text summarization techniques / Alessandro Fiori, editor.
Description: Hershey, PA : Engineering Science Reference, [2020] | Includes bibliographical references.
Identifiers: LCCN 2019001873 | ISBN 9781522593737 (hardcover) | ISBN 9781522593744 (softcover) | ISBN 9781522593751 (ebook)
Subjects: LCSH: Automatic abstracting. | Electronic information resources--Abstracting and indexing. | Text processing (Computer science)
Classification: LCC Z695.92 .T74 2020 | DDC 025.4/10285635--dc23
LC record available at https://lccn.loc.gov/2019001873

This book is published in the IGI Global book series Advances in Data Mining and Database Management (ADMDM) (ISSN: 2327-1981; eISSN: 2327-199X)

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material.
The views expressed in this book are those of the authors, but not necessarily of the publisher.

For electronic access to this publication, please contact: [email protected].

Advances in Data Mining and Database Management (ADMDM) Book Series
ISSN: 2327-1981
EISSN: 2327-199X

Editor-in-Chief: David Taniar, Monash University, Australia

Mission
With the large amounts of information available to organizations in today’s digital world, there is a need for continual research surrounding emerging methods and tools for collecting, analyzing, and storing data. The Advances in Data Mining & Database Management (ADMDM) series aims to bring together research in information retrieval, data analysis, data warehousing, and related areas in order to become an ideal resource for those working and studying in these fields. IT professionals, software engineers, academicians and upper-level students will find titles within the ADMDM book series particularly useful for staying up-to-date on emerging research, theories, and applications in the fields of data mining and database management.

Coverage
• Web Mining
• Association Rule Learning
• Data Mining
• Decision Support Systems
• Information Extraction
• Data Analysis
• Database Testing
• Quantitative Structure–Activity Relationship
• Cluster Analysis
• Predictive Analysis

IGI Global is currently accepting manuscripts for publication within this series. To submit a proposal for a volume in this series, please contact our Acquisition Editors at [email protected] or visit: http://www.igi-global.com/publish/.

The Advances in Data Mining and Database Management (ADMDM) Book Series (ISSN 2327-1981) is published by IGI Global, 701 E. Chocolate Avenue, Hershey, PA 17033-1240, USA, www.igi-global.com. This series is composed of titles available for purchase individually; each title is edited to be contextually exclusive from any other title within the series.
For pricing and ordering information please visit http://www.igi-global.com/book-series/advances-data-mining-database-management/37146. Postmaster: Send all address changes to above address. Copyright © 2020 IGI Global. All rights, including translation in other languages reserved by the publisher. No part of this series may be reproduced or used in any form or by any means – graphics, electronic, or mechanical, including photocopying, recording, taping, or information and retrieval systems – without written permission from the publisher, except for non commercial, educational use, including classroom teaching purposes. The views expressed in this series are those of the authors, but not necessarily of IGI Global.

Titles in this Series
For a list of additional titles in this series, please visit: https://www.igi-global.com/book-series/advances-data-mining-database-management/37146

Emerging Perspectives in Big Data Warehousing
David Taniar (Monash University, Australia) and Wenny Rahayu (La Trobe University, Australia)
Engineering Science Reference • copyright 2019 • 348pp • H/C (ISBN: 9781522555162) • US $245.00 (our price)

Emerging Technologies and Applications in Data Processing and Management
Zongmin Ma (Nanjing University of Aeronautics and Astronautics, China) and Li Yan (Nanjing University of Aeronautics and Astronautics, China)
Engineering Science Reference • copyright 2019 • 458pp • H/C (ISBN: 9781522584469) • US $265.00 (our price)

Online Survey Design and Data Analytics Emerging Research and Opportunities
Shalin Hai-Jew (Kansas State University, USA)
Engineering Science Reference • copyright 2019 • 226pp • H/C (ISBN: 9781522585633) • US $215.00 (our price)

Handbook of Research on Big Data and the IoT
Gurjit Kaur (Delhi Technological University, India) and Pradeep Tomar (Gautam Buddha University, India)
Engineering Science Reference • copyright 2019 • 568pp • H/C (ISBN: 9781522574323) • US $295.00 (our price)

Managerial Perspectives on Intelligent Big Data Analytics
Zhaohao Sun (Papua New Guinea University of Technology, Papua New Guinea)
Engineering Science Reference • copyright 2019 • 335pp • H/C (ISBN: 9781522572770) • US $225.00 (our price)

Optimizing Big Data Management and Industrial Systems With Intelligent Techniques

Editorial Advisory Board
Silvia Chiusano, Politecnico di Torino, Italy
Paolo Garza, Politecnico di Torino, Italy
George Giannakopoulos, NCSR Demokritos, Greece
Saima Jabeen, University of Wah, Pakistan
Jochen L. Leidner, Refinitiv Labs, UK & University of Sheffield, UK
Andrea Mignone, Candiolo Cancer Institute (FPO IRCCS), Italy
Josef Steinberger, University of West Bohemia, Czech Republic

Table of Contents

Foreword
Preface
Acknowledgment

Section 1
Concepts and Methods

Chapter 1
Combining Machine Learning and Natural Language Processing for Language-Specific, Multi-Lingual, and Cross-Lingual Text Summarization: A Wide-Ranging Overview
Luca Cagliero, Politecnico di Torino, Italy
Paolo Garza, Politecnico di Torino, Italy
Moreno La Quatra, Politecnico di Torino, Italy

Chapter 2
The Development of Single-Document Abstractive Text Summarizer During the Last Decade
Amal M. Al-Numai, King Saud University, Saudi Arabia
Aqil M. Azmi, King Saud University, Saudi Arabia

Chapter 3
Mining Scientific and Technical Literature: From Knowledge Extraction to Summarization
Junsheng Zhang, Institute of Scientific and Technical Information of China, China
Wen Zeng, Institute of Scientific and Technical Information of China, China

Chapter 4
Data Text Mining Based on Swarm Intelligence Techniques: Review of Text Summarization Systems
Mohamed Atef Mosa, Institute of Public Administration, Department of Information Technology, Riyadh, Saudi Arabia

Chapter 5
Named Entity Recognition in Document Summarization
Sandhya P., Vellore Institute of Technology, Chennai Campus, Tamil Nadu, India
Mahek Laxmikant Kantesaria, Vellore Institute of Technology, Chennai Campus, Tamil Nadu, India

Section 2
Domain Applications

Chapter 6
Text Classification and Topic Modeling for Online Discussion Forums: An Empirical Study From the Systems Modeling Community
Xin Zhao, University of Alabama, USA
Zhe Jiang, University of Alabama, USA
Jeff Gray, University of Alabama, USA

Chapter 7
Summarization in the Financial and Regulatory Domain
Jochen L. Leidner, Refinitiv Labs, UK & University of Sheffield, UK

Chapter 8
Opinion Mining and Product Review Summarization in E-Commerce
Enakshi Jana, Pondicherry University, India
V. Uma, Pondicherry University, India

Chapter 9
Scaling and Semantically-Enriching Language-Agnostic Summarization
George Giannakopoulos, NCSR Demokritos, Greece & SciFY PNPC, Greece
George Kiomourtzis, SciFY PNPC, Greece & NCSR Demokritos, Greece
Nikiforos Pittaras, NCSR Demokritos, Greece & National and Kapodistrian University of Athens, Greece
Vangelis Karkaletsis, NCSR Demokritos, Greece

Compilation of References
About the Contributors
Index

Detailed Table of Contents

Foreword
Preface
Acknowledgment

Section 1
Concepts and Methods

Chapter 1
Combining Machine Learning and Natural Language Processing for Language-Specific, Multi-Lingual, and Cross-Lingual Text Summarization: A Wide-Ranging Overview
Luca Cagliero, Politecnico di Torino, Italy
Paolo Garza, Politecnico di Torino, Italy
Moreno La Quatra, Politecnico di Torino, Italy

The recent advances in multimedia and web-based applications have eased the accessibility to large collections of textual documents. To automate the process of document analysis, the research community has put relevant efforts into extracting short summaries of the document content. However, most of the early proposed summarization methods were tailored to English-written textual corpora or to collections of documents all written in the same language. More recently, the joint efforts of the machine learning and the natural language processing communities have produced more portable and flexible solutions, which can be applied to documents written in different languages. This chapter first overviews the most relevant language-specific summarization algorithms. Then, it presents the most recent advances in multi- and cross-lingual text summarization. The chapter classifies the presented methodology, highlights the main pros and cons, and discusses the perspectives of the extension of the current research towards cross-lingual summarization systems.