ONTOLOGY-DRIVEN SEMI-SUPERVISED MODEL FOR CONCEPTUAL ANALYSIS OF DESIGN SPECIFICATIONS

BY ARUNPRASATH SHANKAR

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Thesis Advisor: Dr. Christos Papachristou

Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
August 2014

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis of ARUNPRASATH SHANKAR, candidate for the MASTER OF SCIENCE degree.

Chair of the committee: DR. CHRISTOS PAPACHRISTOU
DR. FRANCIS MERAT
DR. FRANCIS WOLFF

Date: May 15, 2014

*We also certify that written approval has been obtained for any proprietary material contained therein.

“And, when you want something, all the universe conspires in helping you to achieve it.”
Paulo Coelho, The Alchemist

Dedicated to my family and friends.

CONTENTS

1 Introduction
2 Background and Related Work
  2.1 Information Extraction
  2.2 Ontology Generation
  2.3 Component Retrieval
3 Specification Mining
  3.1 The Proposed Model: NEFCIS
  3.2 Preprocessing and Frequency Analysis
  3.3 Attribute Clustering
    3.3.1 Forward Chaining
    3.3.2 Seed Template
    3.3.3 Spec Rules
  3.4 Fuzzy Sets Generation
  3.5 Estimating Priority Indices
  3.6 Estimation of Fuzzy Parameter Scores
  3.7 Estimation of Feature Proximity Scores
  3.8 Fuzzy Concept Based Information Retrieval
4 Conceptual Analysis
  4.1 Candidate Entity Extraction
  4.2 Relation Mining
  4.3 Conceptualization Process
  4.4 Ontology Building Process
5 Component Retrieval
  5.1 Estimation of Spec Coverage
  5.2 Bridging SPARQL
6 Experimental Results
7 Conclusion and Future Work
Appendix
  A Fuzzy Matching Implementation
  B UAG to DAG Algorithm Implementation
Bibliography

LIST OF FIGURES

Figure 1 NEFCIS: Neuro-Fuzzy Concept based Inference System
Figure 2 Forward Chaining
Figure 3 Pattern Matching
Figure 4 Entity Segregation
Figure 5 N-grams Generation
Figure 6 Attributes
Figure 7 Attribute Discovery
Figure 8 Word Clusters before Fuzzification
Figure 9 Word Clusters after Fuzzification
Figure 10 NEFCIS: Fuzzy Sets Generation
Figure 11 NEFCIS: Prioritization of keywords
Figure 12 Specification Analysis System
Figure 13 Concept UAG Representation
Figure 14 Example Concept UAG
Figure 15 Example Concept DAG
Figure 16 Spec to Spec mapping using multiple domain ontologies
Figure 17 Component Retrieval
Figure 18 Source Model
Figure 19 Target Model
Figure 20 Generated RDF Triples
Figure 21 Ontology FPU
Figure 22 NEFCIS Precision across Design Domains
Figure 23 Comparative Analysis based on retrieval performance metrics
Figure 24 Precision Vs Recall
Figure 25 Iteration: Saturation Point
Figure 26 Ontology Construction: Statistics of Generated Suggestions
Figure 27 OWL

LIST OF TABLES

Table 1 UAG to DAG transformation
Table 2 Spec Analysis using FPU Domain Ontology

ACRONYMS

SoC     System on Chip
NEFCIS  Neuro-fuzzy Concept based Inference System for Specification Mining
URI     Uniform Resource Identifier
UAG     Undirected Acyclic Graph
DAG     Directed Acyclic Graph
OWL     Web Ontology Language
RDF     Resource Description Framework

ACKNOWLEDGEMENTS

First and foremost, I would like to thank my advisor Dr. Christos Papachristou for his great patience and careful guidance through the past three years. He is a knowledgeable person with great passion for everything he works on. It has been a pleasure to meet with him weekly and to discuss both research and life.
He always granted me enough flexibility in scheduling my time and gave me insightful advice on the difficulties I encountered. I appreciate this a lot and have learned to treat other people the same way. It has been great fun to listen to and discuss artificial intelligence and other interesting topics with him.

I am thankful to the two people with whom I worked on this project, Dr. Francis Wolff and Bhanu Singh. This work would not have been possible without their invaluable help and suggestions.

Special gratitude and love go to my family for their unfailing support. I thank my parents for their abiding love. Lastly, I would like to thank all my friends; without them I would not have had the chance to enjoy these wonderful three years at Case Western!