
Knowledge Patterns for the Web: extraction, transformation and reuse

201 Pages·2014·8.33 MB·English


Alma Mater Studiorum - Università di Bologna

PhD Programme in Computer Science
Cycle XXVI

Competition sector: 01/B1
Scientific disciplinary sector: INF01

Knowledge Patterns for the Web: extraction, transformation and reuse

Presented by: Andrea Giovanni Nuzzolese
PhD Coordinator: Maurizio Gabbrielli
Supervisor: Paolo Ciancarini

Final examination year 2014

To my family.

Abstract

This thesis investigates methods and software architectures for discovering the typical, frequently occurring structures used to organize knowledge in the Web. We identify these structures as Knowledge Patterns (KPs), i.e., small, well-connected units of meaning which are task-based, well-grounded, and cognitively sound. KPs are an abstraction of frames as introduced by Fillmore [51] and Minsky [101]. KP discovery needs to address two main research problems: the heterogeneity of sources, formats and semantics in the Web (i.e., the knowledge soup problem) and the difficulty of drawing a relevant boundary around data that captures the meaningful knowledge with respect to a certain context (i.e., the knowledge boundary problem). Hence, we introduce two methods that provide different solutions to these two problems by tackling KP discovery from two different perspectives: (i) the transformation of KP-like artifacts (i.e., top-down defined artifacts that can be compared to KPs, such as FrameNet frames [11] or Ontology Design Patterns [65]) to KPs formalized as OWL2 ontologies; (ii) the bottom-up extraction of KPs by analyzing how data are organized in Linked Data. The two methods address the knowledge soup and boundary problems in different ways. The first method addresses both problems with a purely syntactic transformation of the original source to RDF, followed by a refactoring step that adds semantics to the RDF by selecting meaningful RDF triples.
The second method draws boundaries around RDF in Linked Data by analyzing type paths. A type path is a possible route through an RDF graph that takes into account the types associated with the nodes of a path. Unfortunately, type paths are not always available. In fact, Linked Data is a knowledge soup because of the heterogeneous semantics of its datasets and because of the limited intensional as well as extensional coverage of ontologies (e.g., the DBpedia ontology (http://dbpedia.org/ontology), YAGO [133]) or other controlled vocabularies (e.g., SKOS [99], FOAF [28], etc.). Thus, we propose a solution for enriching Linked Data with additional axioms (e.g., rdf:type axioms) by exploiting the natural language available, for example, in annotations (e.g., rdfs:comment) or in corpora on which datasets in Linked Data are grounded (e.g., DBpedia is grounded on Wikipedia). Then we present K∼ore, a software architecture conceived to be the basis for developing KP discovery systems and designed according to two software architectural styles, i.e., Component-based and REST. K∼ore is the architectural binding of a set of tools, i.e., K∼tools, which implement the methods for KP transformation and extraction. Finally, we provide an example of KP reuse based on Aemoo, an exploratory search tool which exploits KPs for performing entity summarization.

Acknowledgements

Now that my Ph.D. is coming to an end and this dissertation is finalized, it is time to write the acknowledgements. I know that, as usually happens when writing acknowledgements, I will miss someone whose support has been very important during these years, but I am sure that they will understand that these acknowledgements are also for them.

First of all, I would like to thank Pamela for her love, which has made my life marvelous. This achievement is mine and yours as well.

I would like to thank my parents, who always and unconditionally endured, supported and encouraged me in everything.

A big thanks to my brother Paolo, who introduced me to Computer Science some years ago and gave me, together with his wife Erika, my wonderful nephew Aurora.

I would like to express my deep gratitude to my tutors, prof. Aldo Gangemi and dr. Valentina Presutti, who have involved me in their extraordinary research group and who have patiently guided and encouraged me during my Ph.D.

I would like to offer my special thanks to my advisor, prof. Paolo Ciancarini, for his frank, valuable and constructive suggestions and useful critiques of my research activities.

I wish to acknowledge prof. Paola Mello for having always been ready to discuss my Ph.D. topics with me.

My grateful thanks are also extended to the referees, i.e., prof. Enrico Motta, prof. Lora Aroyo and prof. Robert Tolksdorf, for their precious and careful comments and advice.

I would also like to extend my thanks to all the people who during these years have been part of my research group, namely, Alberto Musetti, Francesco Draicchio, Silvio Peroni, Angelo Di Iorio, Enrico Daga, Alessandro Adamou, Eva Blomqvist, Diego Reforgiato, Sergio Consoli, Daria Spampinato and Stefania Capotosti.

Special thanks go to the people who shared this hard but amazing Ph.D. program with me, namely, Ornela Dardha, Alexandru Tudor Lascu, Giulio Pellitta, Francesco Poggi, Roberto Amadini and Gioele Barabucci.

Another big thanks to Michele, Katia and Alfonso for their cheerfulness and for the great fun we have had so far and will still have.

Last but not least, I would like to thank all my friends who during these years have been simply friends.

Contents

Abstract
Acknowledgements
List of Tables
List of Figures
List of Publications

1 Introduction
2 Background
  2.1 The Semantic Web
  2.2 Ontologies and Ontology Design Patterns
    2.2.1 Ontology Design Patterns
    2.2.2 Pattern-based methodologies
  2.3 Ontology Mining
  2.4 Knowledge patterns
3 Knowledge Patterns for the Web
  3.1 A definition for Knowledge Pattern
  3.2 Knowledge Patterns in literature
  3.3 Sources of Knowledge Patterns
    3.3.1 KP-like repositories
    3.3.2 The Web of Data
4 Knowledge Pattern transformation from KP-like sources
  4.1 Method
  4.2 A case study: transforming KPs from FrameNet
    4.2.1 FrameNet
    4.2.2 Result
    4.2.3 Evaluation
5 Knowledge Pattern extraction from the Web of Data
  5.1 Method
    5.1.1 Data analysis
    5.1.2 Boundary induction
    5.1.3 KP formalization
  5.2 A case study: extracting KPs from Wikipedia links
    5.2.1 Material
    5.2.2 Obtained results
    5.2.3 KP discovery
    5.2.4 Evaluation

Description:
4.1 The result of the reengineering applied to the sample database shown
4.6 Example of reengineering of the frame “Abounding with” with its
