
SCALABLE INFERENCE TECHNIQUES FOR MARKOV LOGIC by Deepak Venugopal

243 Pages·2015·1.54 MB·English

SCALABLE INFERENCE TECHNIQUES FOR MARKOV LOGIC

by Deepak Venugopal

APPROVED BY SUPERVISORY COMMITTEE:

Vibhav Gogate, Chair
Gopal Gupta
Sanda M. Harabagiu
Raymond J. Mooney
Vincent Ng

Copyright © 2015
Deepak Venugopal
All rights reserved

SCALABLE INFERENCE TECHNIQUES FOR MARKOV LOGIC

by

DEEPAK VENUGOPAL, BE, MS

DISSERTATION
Presented to the Faculty of The University of Texas at Dallas in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY IN COMPUTER SCIENCE

THE UNIVERSITY OF TEXAS AT DALLAS

August 2015

ACKNOWLEDGMENTS

I wish to thank my advisor Vibhav Gogate, without whom this dissertation would not have been possible. Vibhav has been a great advisor who inspired me to work hard and maintain high standards of research by setting a fine example. He has shown me how to think and write with clarity, and always seemed to have an idea whenever I was stuck with seemingly unsolvable problems. Best of all, he has always been patient, friendly and approachable. I certainly hope to emulate some of his traits as I embark on my own academic career.

Next, I wish to thank the members of my dissertation committee, Dr. Gopal Gupta, Dr. Sanda Harabagiu, Dr. Ray Mooney and Dr. Vincent Ng, for the time they devoted not only to my dissertation but also to my job search. In particular, Dr. Mooney was kind enough to serve on my committee from UT-Austin, and I am grateful for this. I also wish to thank Somdeb, Chen, Dr. Parag Singla and Dr. Vincent Ng for collaborating with me on various projects, through which I gained a great deal of insight into several different research areas.

Needless to say, my family has supported me enormously in completing this dissertation. It is hard to thank Krithika enough for the love, patience and faith she has shown in me during the numerous ups and downs of Ph.D. life. Were it not for her support, I certainly would not have made it this far. Special thanks to Esha for all the laughs and joy she gave me while I was writing this dissertation. My parents have given me every possible opportunity and let me pursue my interests at all times. I am extremely thankful for their wishes, love and encouragement at all stages of my life. I am also deeply indebted to my extended family in India, who have given me much needed support during my Ph.D.

I will also cherish the friendships I made at UT-Dallas. Somdeb, David, Tahrima, Li, Chen and several others have been great friends, and I have enjoyed countless interesting conversations with them about work and life in general. I thank them and will certainly miss their company as I finish my studies.

Last but not least, I thank the funding agencies ARO, AFRL and DARPA for providing financial support through grant numbers W911NF-08-1-0242, FA8750-14-C-0021 and FA8750-14-C-0005.

June 2015

ABSTRACT

SCALABLE INFERENCE TECHNIQUES FOR MARKOV LOGIC

Publication No.
Deepak Venugopal, PhD
The University of Texas at Dallas, 2015

Supervising Professor: Vibhav Gogate

In this dissertation, we focus on Markov logic networks (MLNs), an advanced modeling language that combines first-order logic, the cornerstone of traditional Artificial Intelligence (AI), with probabilistic graphical models, the cornerstone of modern AI. MLNs are routinely used in a wide variety of application domains, including natural language processing and computer vision, and are preferred over propositional representations because, unlike the latter, they yield compact, interpretable models that can be easily modified and tuned.
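To make these semantics concrete, the following minimal Python sketch (our own illustrative code using the standard smokers example, not code from this dissertation) grounds a single weighted first-order formula over a three-person domain and computes the partition function by brute-force enumeration of all possible worlds:

    import itertools, math

    # Illustrative sketch only: one weighted formula,
    # Smokes(x) => Cancer(x) with weight 1.5, grounded over a
    # three-person domain. An MLN assigns every possible world a
    # probability proportional to exp(weight * #true groundings).
    domain = ["Ana", "Bob", "Carl"]
    w = 1.5

    def n_true_groundings(world):
        # Count the people for whom the ground implication holds.
        return sum(1 for p in domain
                   if (not world[("Smokes", p)]) or world[("Cancer", p)])

    atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in domain]
    Z = sum(math.exp(w * n_true_groundings(dict(zip(atoms, bits))))
            for bits in itertools.product([False, True], repeat=len(atoms)))
    print("partition function over", 2 ** len(atoms), "worlds:", Z)

Even this toy model has 2^6 = 64 possible worlds; realistic domains rule out such enumeration, and often even explicit grounding, which is the scalability challenge addressed below.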
Unfortunately, even though the MLN representation is compact and efficient, inference in MLNs is notoriously difficult, and despite great progress, several inference tasks in complex real-world MLNs are beyond the reach of existing technology. In this dissertation, we greatly advance the state of the art in MLN inference, enabling it to solve much harder and larger problems than existing approaches can. We develop several domain-independent principles, techniques and algorithms for fast, scalable and accurate inference that fully exploit both probabilistic and logical structure.

This dissertation makes the following five contributions. First, we propose two approaches that address two fundamental problems with Gibbs sampling, a popular approximate inference algorithm: it does not converge in the presence of determinism, and it exhibits poor accuracy when the MLN contains a large number of strongly correlated variables. Second, we lift sampling-based approximate inference algorithms to the first-order level, enabling them to take full advantage of symmetries and relational structure in MLNs. Third, we develop novel approaches for exploiting approximate symmetries. These approaches help scale up inference to large, complex MLNs that are not amenable to conventional lifting techniques, which exploit only exact symmetries. Fourth, we propose a new, efficient algorithm that removes a major bottleneck in all inference algorithms for MLNs: counting the number of true groundings of each formula. We demonstrate empirically that our new counting approach yields orders-of-magnitude improvements in both the speed and quality of inference. Finally, we demonstrate the power and promise of our approaches on biomedical event extraction, a challenging real-world information extraction task, on which our system achieved state-of-the-art results.
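The convergence problem with determinism mentioned in the first contribution is easy to reproduce. The sketch below (illustrative code, not the GiSS algorithm of Chapter 3) runs a textbook single-site Gibbs sampler on two binary variables tied by a hard equality constraint:

    import random

    # Illustrative sketch of the determinism problem. Two binary
    # variables with a hard constraint X == Y and a factor favouring
    # (1, 1). Moving from (0, 0) to (1, 1) one variable at a time must
    # pass through (0, 1) or (1, 0), both of which have probability
    # zero, so the chain never leaves its initial state.
    def weight(x, y):
        if x != y:                # hard constraint violated
            return 0.0
        return 2.0 if x == 1 else 1.0

    state = [0, 0]
    for _ in range(10000):
        for i in (0, 1):
            w0 = weight(*(state[:i] + [0] + state[i + 1:]))
            w1 = weight(*(state[:i] + [1] + state[i + 1:]))
            state[i] = 1 if random.random() < w1 / (w0 + w1) else 0

    print(state)  # always [0, 0], even though (1, 1) is twice as likely

Because every single-variable move out of (0, 0) passes through a zero-probability state, the chain is reducible and never mixes; handling such logical dependencies is the subject of Chapter 3.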
TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1  INTRODUCTION
    1.1  Contributions

CHAPTER 2  BACKGROUND
    2.1  Representation
        2.1.1  Propositional Logic
        2.1.2  First-Order Logic
        2.1.3  Discrete Graphical Models
        2.1.4  Markov Logic Networks
    2.2  Inference
        2.2.1  Exact Inference in Markov Networks
        2.2.2  Sampling Based Approximate Inference
        2.2.3  Lifted Inference
    2.3  Learning
        2.3.1  Weight Learning in MLNs

CHAPTER 3  HANDLING LOGICAL DEPENDENCIES IN MCMC BASED INFERENCE
    3.1  GiSS: Sampling in PGMs with Determinism
        3.1.1  The GiSS Algorithm
        3.1.2  Computing the Sample Weights
        3.1.3  Related Work and Discussion
        3.1.4  Experiments
    3.2  Dynamic Blocking and Collapsing
        3.2.1  Combining Blocking and Collapsing
        3.2.2  Optimally Selecting Blocked and Collapsed Variables
        3.2.3  Dynamic Blocked-Collapsed Gibbs Sampling
        3.2.4  Related Work
        3.2.5  Experiments
    3.3  Summary

CHAPTER 4  LIFTING SAMPLING BASED INFERENCE ALGORITHMS
    4.1  Lifted Blocked Gibbs
        4.1.1  Our Approach
        4.1.2  PTP-Tree
        4.1.3  Lifted Blocked Gibbs
        4.1.4  Lifted Messages
        4.1.5  Clustering
        4.1.6  Experiments
    4.2  Lifted Importance Sampling
        4.2.1  PTP-based Importance Sampling
        4.2.2  New Lifting Rule
        4.2.3  Constructing the Proposal Distribution
        4.2.4  Experiments
    4.3  Summary

CHAPTER 5  EXPLOITING APPROXIMATE SYMMETRIES FOR SCALABLE INFERENCE
    5.1  Grounding and Evidence Problems
    5.2  Approximate Lifting using Evidence-based Clustering
        5.2.1  Input Specification
        5.2.2  Problem Formulation
        5.2.3  Evidence Approximation
        5.2.4  Algorithm Specification
        5.2.5  Evidence Based Distance Function
        5.2.6  Related Work
        5.2.7  Experiments
    5.3  Application: Scalable Importance Sampling
        5.3.1  Constructing and Sampling the Proposal Distribution
        5.3.2  Computing the Importance Weight
        5.3.3  Rao-Blackwellisation
        5.3.4  Experiments
    5.4  Summary

CHAPTER 6  EXPLOITING EFFICIENT COUNTING STRATEGIES FOR SCALABLE INFERENCE
    6.1  Introduction
    6.2  Encoding the Counting Problem
        6.2.1  CSP Formulation
        6.2.2  Counting the Number of Solutions of the CSP
        6.2.3  Junction Trees for Solution Counting
    6.3  Application I: Gibbs Sampling
    6.4  Application II: MaxWalkSAT
    6.5  Extensions
        6.5.1  Existential Quantifiers
        6.5.2  Lifted Inference
    6.6  Experiments
        6.6.1  Setup
        6.6.2  Results for Gibbs Sampling
        6.6.3  Results for MaxWalkSAT
    6.7  Summary

CHAPTER 7  JOINT INFERENCE FOR EXTRACTING BIOMEDICAL EVENTS
    7.1  Introduction
    7.2  Background
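To make the counting bottleneck of Chapter 6 concrete, here is a brute-force baseline (a hedged sketch with hypothetical evidence, not the dissertation's implementation):

    import itertools

    # Counting the true groundings of
    # Smokes(x) ^ Friends(x, y) => Smokes(y) under a fixed truth
    # assignment. The naive loop below touches |domain|^2 groundings;
    # a formula with k logical variables costs |domain|^k, which is
    # exactly the blow-up that the CSP and junction-tree formulation
    # of Chapter 6 is designed to avoid.
    domain = range(100)
    smokes = {x: x % 3 == 0 for x in domain}
    friends = {(x, y): abs(x - y) == 1 for x in domain for y in domain}

    count = sum(1 for x, y in itertools.product(domain, repeat=2)
                if not (smokes[x] and friends[(x, y)]) or smokes[y])
    print(count, "true groundings out of", len(domain) ** 2)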
