Structural Model Discovery in Temporal Event Data Streams

Chreston Allen Miller

Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of

Doctor of Philosophy
in
Computer Science and Applications

Francis Quek, Chair
Christopher L. North
Narendran Ramakrishnan
Denis Gracanin
Louis-Philippe Morency

March 25, 2013
Blacksburg, Virginia

Keywords: Structural Model Learning, Temporal Behavior Models, Model Evolution, Human-Machine Cooperation, Temporal Event Data

Copyright 2013, Chreston Allen Miller

ABSTRACT

This dissertation presents a unique approach to human behavior analysis based on expert guidance and intervention through interactive construction and modification of behavior models. Our focus is to introduce the research area of behavior analysis, the challenges this field faces, and the approaches currently available, and to present a new analysis approach: Interactive Relevance Search and Modeling (IRSM).

More intelligent ways of conducting data analysis have been explored in recent years. Machine learning and data mining systems that utilize pattern classification and discovery in non-textual data promise to bring new generations of powerful "crawlers" for knowledge discovery, e.g., face detection and crowd surveillance. Many aspects of data can be captured by such systems, e.g., temporal information and extractable visual information (color, contrast, shape, etc.). However, these captured aspects may not uncover all salient information in the data or provide adequate models/patterns of phenomena of interest. This is a challenging problem for social scientists who are trying to identify high-level, conceptual patterns of human behavior from observational data (e.g., media streams).

The presented research addresses how social scientists may derive patterns of human behavior captured in media streams.
Currently, media streams are being segmented into sequences of events describing the actions captured in the streams, such as the interactions among humans. This segmentation creates a data space that is challenging to search, characterized by non-numerical, temporal, descriptive data, e.g., Person A walks up to Person B at time T. This dissertation presents an approach that allows one to interactively search, identify, and discover temporal behavior patterns within such a data space.

Therefore, this research addresses supporting exploration and discovery in behavior analysis through a formalized method of assisted exploration. The model evolution presented supports the refining of the observer's behavior models into representations of their understanding. The benefit of the new approach is shown through experiments on its identification accuracy and through work with fellow researchers to verify the approach's legitimacy in the analysis of their data.

GRANT INFORMATION

This research has been partially supported by: FODAVA grant CCF-0937133, NSF grant IIS-1053039, and NSF IIS-1118018.

Dedication

To my beloved wife, Christa
To my daughter, Hannah
To my parents, Keith and Joyce
To my brother, Justin
To all my friends in Blacksburg

Acknowledgments

This research was partially funded by FODAVA grant CCF-0937133, NSF IIS-1053039, and NSF IIS-1118018. I want to thank Dr. Francis Quek for his guiding support, my committee members for their input, my family for their love and support, and my wife, Christa Hixson Miller, for her never-ending support. It is because of her that I was able to finish.

Due to copyright permissions, the video frame in Figure 3.5 was replaced with one owned by me.

Contents

Contents
List of Figures
List of Tables

1 Introduction
    1.1 Challenges Faced by Behavior Analysis
    1.2 Overview of Behavior Analysis Approaches
    1.3 Interactive Model Evolution
    1.4 Research Questions
    1.5 Dissertation Organization

2 Analysis Approaches
    2.1 Behavior Analysis
    2.2 Discourse Segmentation
    2.3 Temporal Reasoning and Relational Ordering

3 Temporal Data Modeling
    3.1 Parametric vs Structural Learning
    3.2 Temporal Data Models
    3.3 “Music Score” Representation

4 Exploration of a Temporal Event Data-Space
    4.1 Event Ranking
        4.1.1 Introduction
        4.1.2 Related Approaches
        4.1.3 Proposed Approach
        4.1.4 Experiment
        4.1.5 Conclusion and Future Work
        4.1.6 Acknowledgments
    4.2 Situated Analysis
        4.2.1 Introduction
        4.2.2 Approach Overview
        4.2.3 Background and Related Work
        4.2.4 Multimodal Data to Events
        4.2.5 Model Aspects
        4.2.6 Assisted Situated Analysis
        4.2.7 Implementation and Use
        4.2.8 Conclusion and Future Work
        4.2.9 Acknowledgments
    4.3 Search Strategy
        4.3.1 Introduction
        4.3.2 Related Work
        4.3.3 STIS method
        4.3.4 Experiments
        4.3.5 Conclusions and Future Work
        4.3.6 Acknowledgments
    4.4 Interactive Process
        4.4.1 Introduction
        4.4.2 Data Domain
        4.4.3 Related Work
        4.4.4 Model Evolution
        4.4.5 Experiments
        4.4.6 Conclusion and Future Work
        4.4.7 Acknowledgments

5 Evaluation
    5.1 Phase 1 Evaluation
    5.2 Phase 2 and 3 Evaluation
        5.2.1 Demographics
        5.2.2 Datasets
        5.2.3 Methodology
        5.2.4 Gathering Results
        5.2.5 Results
        5.2.6 Strategies of Analysis
        5.2.7 Search Strategies Developed
        5.2.8 Strategies Summary
        5.2.9 Aid in Discovery
        5.2.10 Feature Use
        5.2.11 Problems, Challenges, and Criticisms
        5.2.12 Discussion
    5.3 Phase 4 Evaluation

6 Software Versions
    6.1 Prototype
    6.2 Version 1
        6.2.1 Temporal Relation Processing Library
        6.2.2 Temporal Relation Viewer
    6.3 Version 2
        6.3.1 Database and Parameter Adjustment Improvements
        6.3.2 Predicate Mode
        6.3.3 Event Sequence Overview
        6.3.4 Video View
        6.3.5 Features Added during Use-Cases

7 Conclusions
    7.1 Contributions
    7.2 Addressing Research Questions
    7.3 Future Work
        7.3.1 Continuing Use-Cases
        7.3.2 Journals
        7.3.3 Unified Representation
        7.3.4 Further Research Pursuits
    7.4 Conclusions

A Theorem 1 Proof
B Model Occurrence Likelihood
C Use-Case Documents

Bibliography