
Approximation Methods for Efficient Learning of Bayesian Networks PDF

148 Pages·2008·1.27 MB·English

Preview Approximation Methods for Efficient Learning of Bayesian Networks

APPROXIMATION METHODS FOR EFFICIENT LEARNING OF BAYESIAN NETWORKS

Frontiers in Artificial Intelligence and Applications, Volume 168
Published in the subseries Dissertations in Artificial Intelligence

Recently published in this series:

Vol. 167. P. Buitelaar and P. Cimiano (Eds.), Ontology Learning and Population: Bridging the Gap between Text and Knowledge
Vol. 166. H. Jaakkola, Y. Kiyoki and T. Tokuda (Eds.), Information Modelling and Knowledge Bases XIX
Vol. 165. A.R. Lodder and L. Mommers (Eds.), Legal Knowledge and Information Systems – JURIX 2007: The Twentieth Annual Conference
Vol. 164. J.C. Augusto and D. Shapiro (Eds.), Advances in Ambient Intelligence
Vol. 163. C. Angulo and L. Godo (Eds.), Artificial Intelligence Research and Development
Vol. 162. T. Hirashima et al. (Eds.), Supporting Learning Flow Through Integrative Technologies
Vol. 161. H. Fujita and D. Pisanelli (Eds.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the Sixth SoMeT_07
Vol. 160. I. Maglogiannis et al. (Eds.), Emerging Artificial Intelligence Applications in Computer Engineering – Real World AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies
Vol. 159. E. Tyugu, Algorithms and Architectures of Artificial Intelligence
Vol. 158. R. Luckin et al. (Eds.), Artificial Intelligence in Education – Building Technology Rich Learning Contexts That Work
Vol. 157. B. Goertzel and P. Wang (Eds.), Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms – Proceedings of the AGI Workshop 2006
Vol. 156. R.M. Colomb, Ontology and the Semantic Web
Vol. 155. O. Vasilecas et al. (Eds.), Databases and Information Systems IV – Selected Papers from the Seventh International Baltic Conference DB&IS'2006
Vol. 154. M. Duží et al. (Eds.), Information Modelling and Knowledge Bases XVIII
Vol. 153. Y. Vogiazou, Design for Emergence – Collaborative Social Play with Online and Location-Based Media
Vol. 152. T.M. van Engers (Ed.), Legal Knowledge and Information Systems – JURIX 2006: The Nineteenth Annual Conference
Vol. 151. R. Mizoguchi et al. (Eds.), Learning by Effective Utilization of Technologies: Facilitating Intercultural Understanding

ISSN 0922-6389

Approximation Methods for Efficient Learning of Bayesian Networks

Carsten Riggelsen
Institute of Geosciences, University of Potsdam, Golm b. Potsdam, Germany

Amsterdam • Berlin • Oxford • Tokyo • Washington, DC

© 2008 The author and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-58603-821-2
Library of Congress Control Number: 2007942192

Publisher: IOS Press, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands
fax: +31 20 687 0019; e-mail: [email protected]

Distributor in the UK and Ireland: Gazelle Books Services Ltd., White Cross Mills, Hightown, Lancaster LA1 4XS, United Kingdom
fax: +44 1524 63232; e-mail: [email protected]

Distributor in the USA and Canada: IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA
fax: +1 703 323 3668; e-mail: [email protected]

LEGAL NOTICE: The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS

Contents

Foreword

1. INTRODUCTION

2. PRELIMINARIES
   1 Random variables and conditional independence
   2 Graph theory
   3 Markov properties
     3.1 The Markov blanket
     3.2 Equivalence of DAGs
   4 Bayesian networks
   5 Bayesian network specification
     5.1 Bayesian parameter specification

3. LEARNING BAYESIAN NETWORKS FROM DATA
   1 The basics
   2 Learning parameters
     2.1 The Maximum-Likelihood approach
     2.2 The Bayesian approach
   3 Learning models
     3.1 The penalised likelihood approach
     3.2 The Bayesian approach
       3.2.1 Learning via the marginal likelihood
       3.2.2 Determining the hyperparameter
     3.3 Marginal and penalised likelihood
     3.4 Search methodologies
       3.4.1 Model search space
       3.4.2 Traversal strategy

4. MONTE CARLO METHODS AND MCMC SIMULATION
   1 Monte Carlo methods
     1.1 Importance sampling
       1.1.1 Choice of the sampling distribution
   2 Markov chain Monte Carlo—MCMC
     2.1 Markov chains
       2.1.1 The invariant target distribution
       2.1.2 Reaching the invariant distribution
       2.1.3 Metropolis-Hastings sampling
       2.1.4 Gibbs sampling
       2.1.5 Mixing, burn-in and convergence of MCMC
       2.1.6 The importance of blocking
   3 Learning models via MCMC
     3.1 Sampling models
     3.2 Sampling edges
     3.3 Blocking edges
       3.3.1 Blocks and Markov blankets
       3.3.2 Sampling blocks
       3.3.3 Validity of the sampler
       3.3.4 The MB-MCMC model sampler
       3.3.5 Evaluation
       3.3.6 Conclusion

5. LEARNING FROM INCOMPLETE DATA
   1 The concept of incomplete data
     1.1 Missing data mechanisms
   2 Learning from incomplete data
     2.1 Likelihood decomposition
       2.1.1 Complications for learning parameters
       2.1.2 Bayesian sequential updating
       2.1.3 Complications for learning models
   3 Principled iterative methods
     3.1 Expectation Maximisation—EM
       3.1.1 Structural EM—SEM
     3.2 Data Augmentation—DA
       3.2.1 DA and eliminating the P-step—DA-P
     3.3 DA-P and model learning—MDA-P
     3.4 Efficiency issues of MDA-P
       3.4.1 Properties of the sub-MCMC samplers
       3.4.2 Interdependence between samplers
     3.5 Imputation via importance sampling
       3.5.1 The general idea
       3.5.2 Importance sampling in the I-step—ISMDA-P
       3.5.3 Generating new population vs. re-weighing
       3.5.4 The marginal likelihood as predictive distribution
       3.5.5 The eMC4 sampler
       3.5.6 Evaluation—proof of concept
       3.5.7 Conclusion
   4 Ad-hoc and heuristic methods
     4.1 Available cases analysis
     4.2 Bound and Collapse—BC
     4.3 Markov Blanket Predictor—MBP
       4.3.1 The general idea
       4.3.2 Approximate predictive distributions
       4.3.3 Parameter estimation
       4.3.4 Prediction and missing parents
       4.3.5 Predictive quality
       4.3.6 Selecting predictive variables
       4.3.7 Implementation of MBP
       4.3.8 Parameter estimation
       4.3.9 Model learning
       4.3.10 Conclusion and discussion

6. CONCLUSION

References

Foreword

This dissertation is the result of four years at Utrecht University as a Ph.D. student at the Department of Information and Computing Sciences. The work presented in this thesis is mainly based on the research published in various papers during that time. However, it is not merely a bundle of research articles. In order to provide a coherent treatment of matters, thereby helping the reader to gain a thorough understanding of the whole concept of learning Bayesian networks from (in)complete data, this thesis combines in a clarifying way all the issues presented in the papers with previously unpublished work. I hope that my efforts have been worthwhile!

My gratitude goes to my supervisor and co-promotor, dr. Ad Feelders. I am thankful to my promotor, prof.dr. Arno Siebes, and to Jeroen De Knijf, Edwin de Jong and all the other colleagues in the Algorithmic Data Analysis group, and at the computer science department. I would like to thank the members of the outstanding reading committee: prof.dr.ir. Linda van der Gaag, prof.dr. Richard D. Gill, prof.dr. Finn V. Jensen, prof.dr. Bert Kappen and prof.dr. Pedro Larrañaga.

The present dissertation was successfully defended on October 23, 2006 at Utrecht University. For publication in the series "Frontiers in Artificial Intelligence and Applications", IOS Press, only minor details have been corrected and very few additions have been made.

Carsten Riggelsen
Berlin, October 2007

Description:
This publication develops and investigates efficient Monte Carlo simulation methods for realising a Bayesian approach to approximate learning of Bayesian networks from both complete and incomplete data. For large amounts of incomplete data, where Monte Carlo methods become inefficient, approximations …
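As a minimal illustration of the Monte Carlo machinery the book builds on, the sketch below shows plain self-normalised importance sampling (covered in Chapter 4, Section 1.1): samples are drawn from a tractable proposal distribution and reweighted by the target-to-proposal density ratio to estimate an expectation under the target. The specific Normal target and proposal here are illustrative choices, not taken from the book.

```python
import math
import random

random.seed(0)

def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma) distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Target p = Normal(3, 1); proposal q = Normal(0, 3).
# q is deliberately wide so it covers the region where p has mass.
n = 200_000
xs = [random.gauss(0.0, 3.0) for _ in range(n)]
ws = [normal_pdf(x, 3.0, 1.0) / normal_pdf(x, 0.0, 3.0) for x in xs]

# Self-normalised importance-sampling estimate of E_p[X]; the true value is 3.
estimate = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
print(f"estimated mean: {estimate:.2f}")
```

The quality of the estimate hinges on the choice of proposal: if q places little mass where p is concentrated, a few huge weights dominate and the estimator's variance explodes, which is exactly the inefficiency for incomplete data that motivates the approximations investigated in the book.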
