Data Mining and Predictive Analysis
Intelligence Gathering and Crime Analysis
Second Edition

Colleen McCue

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Butterworth-Heinemann is an imprint of Elsevier

Acquiring Editor: Sara Scott
Editorial Project Manager: Marisa LaFleur
Project Manager: Punithavathy Govindaradjane
Designer: Mark Rogers

The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK
225 Wyman Street, Waltham, MA 02451, USA

Copyright © 2015, 2007 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
McCue, Colleen.
Data mining and predictive analysis : intelligence gathering and crime analysis / Colleen McCue. -- Second edition.
pages cm
ISBN 978-0-12-800229-2
1. Crime analysis. 2. Data mining in law enforcement. 3. Law enforcement--Data processing. 4. Criminal behavior, Prediction of. I. Title.
HV7936.C88M37 2015
363.250285’6312--dc23
2014031816

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-800229-2

For information on all Butterworth-Heinemann publications visit our website at http://store.elsevier.com/

Dedication

This book is dedicated to Naval Criminal Investigative Service, Supervisory Special Agent (Ret.) Richard J. McCue, my partner in crime and everything else that matters.

Foreword

If ever there was any doubt about the presence of evil in our world, one need only conduct a quick Internet search for Joseph Kony and his Lord’s Resistance Army. Over a horrific generation, Kony has marauded his way across central Africa (Uganda, South Sudan, Democratic Republic of Congo, and the Central African Republic), killing and maiming uncounted thousands, many of them children, almost all of them innocents. To its great credit, the government of Uganda, along with its neighbors and a dedicated cohort of international and nongovernmental organizations, has been relentlessly pursuing Kony and his henchmen in an effort to protect local populations from the LRA’s attacks and to aid survivors and escapees.
The United States had long, but inconsistently, supported anti-Kony efforts, but that changed with President Obama’s signing into law the Lord’s Resistance Army Disarmament and Northern Uganda Recovery Act of 2009. And in 2011 Mr. Obama ordered the deployment of approximately 100 U.S. Special Operations personnel to aid in the multinational effort to bring Kony and his top leaders to justice.

Among the many challenges, including language, culture, logistical support in remote regions, and many more, was the fundamental difficulty of finding Kony and his top lieutenants in a vast area of Africa, an area roughly the size of the state of Colorado that is densely forested with little infrastructure and even less governmental reach. At United States Africa Command, intelligence analysts and seasoned Foreign Service Officers aggressively sought methodologies and processes to more quickly and accurately predict where and when LRA activities might occur. Traditional pattern analysis and tracking procedures just weren’t working. Enter Dr. Colleen McCue.

Dr. McCue’s work with the Richmond, Virginia Police Department had demonstrated the value of more detailed, refined predictive analysis. It appeared that her approach might prove useful in a vastly different region and in a military, vice law enforcement, endeavor. That approach was quickly proven accurate.

Dr. McCue, using the same methodologies she had so successfully applied in Virginia, was able to help military analysts sift through mounds of data and incident reports in the effort to find the real nuggets of information that would allow the forces in pursuit of the LRA to predict future attacks and even their heretofore clandestine routes of travel. Within just a few months, using Dr. McCue’s methods, Ugandan and American forces were able to interdict LRA routes, deter village attacks, and capture or cause the surrender of several key Kony associates.
While Joseph Kony himself remains at large, the results of Dr. McCue’s work mean that this notoriously vicious warlord is operating largely in survival mode rather than roaming the region with impunity.

In this new book, an update from her initial 2007 publication, Dr. McCue makes a compelling case for the effectiveness of predictive analysis in a widening array of functional communities. She clearly and concisely lays out the processes she has developed, affording analysts and academics the opportunity to thoroughly assess and examine her work. But she does so in a way easily understood by operators (like me) who possess neither the academic nor research credentials of those who normally work in this space. It is this aspect of Dr. McCue’s writing that appeals to me and, I have found, to others across a wide variety of operational interests: police work, to be sure, but also disaster preparedness and relief specialists, the counter-illicit trafficking community, even those who focus on agricultural and medical trends. One can see ready applicability for commercial enterprises as well. Essentially, what Dr. McCue offers is a now well-tested and proven method for decision-makers, private or governmental, to choose how to most effectively apply scarce resources to address a given problem.

Dr. McCue’s well-crafted second edition not only provides additional and more current examples of how her processes have been applied operationally in an ever-expanding array of activities, but also addresses how developing concepts and capabilities aid in data mining and predictive analysis. The art of her work lies in the manner in which she takes complex analytical capabilities from the scientific and academic worlds and translates them into real-world issues of understanding and predicting human behavior in support of operational decision makers.
It is this blending of analytics with operational experience and expertise that will be of greatest interest to those in law enforcement, military, or other security fields. In short, when operators gain an appreciation of the power of data mining and predictive analysis and when analysts better understand the needs of operators, a synergy is obtained that benefits all (well, maybe not criminals or terrorists). The essence of Dr. McCue’s work is to translate science into meaningful action and she makes a powerful case for doing so.

General Carter Ham
U.S. Army (Retired)
Former Commander, U.S. Africa Command

Preface

So many things have changed since the first edition of this textbook, particularly as relates to data, technology, and tradecraft. Some things have not changed, however, including my love of science and desire to develop innovative solutions to some of the really difficult public safety and security challenges. Operational security analytics, at its core, is designed to effectively characterize bad behavior in support of information-based approaches to anticipation and influence. Whether “influence” entails prevention, thwarting, mitigation, response, or consequence management, we are trying to change outcomes for the better.

In the beginning of my operational security analytics journey, I became profoundly intrigued by how many of the seasoned detectives I worked with were often able to generate quick yet accurate hypotheses about their cases, sometimes only moments after they had arrived at the scene. Like the “profilers” on television and in the movies, many of them seemed to have an uncanny ability to accurately describe a likely motive and related suspect based merely on a review of the crime scene and some preliminary knowledge regarding the victim’s lifestyle and related risk factors. Over time, I started to acquire this ability as well, although to a lesser degree.
It became much easier to read a report and link a specific incident to others, predict future related crimes, or even calculate the likelihood that a particular case would be solved based on the nature of the incident. Drawing on my training as a scientist, I frequently found myself looking for some order in the chaos of crime, trying to generate testable hypotheses regarding emerging trends and patterns, as well as investigative outcomes. Sometimes I was correct. However, even when I was not, I was able to include the information in my ever-expanding internal rule sets regarding crime and criminal behavior.

Prior to working for the Richmond Police Department, I spent several years working with that organization. Perhaps one of the most interesting aspects of this early relationship with the Department was my weekly meeting with the officer in charge of violent crimes. Each week we would discuss the homicides from the previous week, particularly any unique or unusual behavioral characteristics. Over time, we began to generate casual predictions of violent crime trends and patterns that proved to be surprisingly accurate. During the same time period, I also began to examine intentional injuries among incarcerated offenders. As I probed the data and drilled down in an effort to identify potentially actionable patterns of risk, it became apparent that many of the individuals I looked at were not just in the wrong place at the wrong time, as they frequently indicated. Rather, they were in the wrong place at the wrong time doing the wrong things with the wrong people and were assaulted as a result of their involvement in these high-risk activities. As I explored the data further, I found that different patterns of offending were associated with different patterns of risk. This work had immediate implications for violence reduction efforts. It also had implications for the analysis of crime and intelligence data.
Fortunately, the field of data mining and predictive analytics had evolved to the point that many of the most sophisticated algorithms were available in a PC environment, so that everyone from a software-challenged psychologist like myself to a beat cop could begin to not only understand but also use these incredibly powerful tools.

Although I did not realize it at the time, a relatively new approach to marketing and business intelligence was emerging at the same time we were engaging in this lively speculation about crime and criminals at the police department. Professionals in the business community were exploiting artificial intelligence and machine learning to characterize and retain customers, increase sales, focus marketing campaigns, and perform a variety of other business-related tasks. For example, each time I went through the checkout counter at my local supermarket, my purchasing habits were coded, collected, and analyzed. This information was aggregated with data from other shoppers and employed in the creation of models about purchasing behavior and how to turn a shopper into a buyer. These models were then used to gently mold my future behavior through everything from direct marketing based on my existing preferences to the strategic stocking of shelves in an effort to encourage me to make additional purchases during my next trip down the aisle. Similarly, data and information were collected and analyzed each time I perused the Internet. As I skipped through web pages, I left cookies, letting the analysts behind the scenes know where I went and when and in what sequence I moved through their sites. All of this information was analyzed and used to make their sites more friendly and easier to navigate or to subtly guide my behavior in a manner that would benefit the online businesses that I visited.
The examples of data mining and predictive analytics in our lives are almost endless, but the contrast between my professional and personal lives was profound. Directly comparing the state of public safety analytical capacity to that of the business community only served to underscore this shortcoming. Throughout almost every aspect of my life, data and information were being collected on me and analyzed using sophisticated data mining algorithms; however, the use of these very powerful tools was severely limited or nonexistent in the public safety arena in which I worked. With very few exceptions, data mining and predictive analytics were not readily available for the analysis of crime or intelligence data, particularly at the state and local levels.

Like most Americans, I was profoundly affected by the events of September 11th. In the week of September 10th, 2001, I was attending a specialized course in intelligence analysis in northern Virginia. Like many, I can remember exactly what I was doing that Tuesday morning when I saw the first plane hit the World Trade Center and how I felt as the horror continued to unfold throughout the day. As I drove back to Richmond, Virginia that afternoon (the training had been postponed indefinitely), I saw the smoke rise up over the Beltway from the fire at the Pentagon, which was still burning. Those of us working in the public safety community were inundated with information over the next several days, some of it reliable, much of it not. Like many agencies, we were swamped with the intelligence reports and BOLOs (be on the lookout reports) that came in over the teletype, many of which were duplicative or contradictory. Added to that were the numerous suspicious situation reports from concerned citizens and requests for assistance from the other agencies pursuing the most promising leads.
Described as the “volume challenge” by former CIA director George Tenet, the amount of information threatened to overwhelm us. Because of this, it lost its value. There was no way to effectively manage the information, let alone analyze it. In many cases, the only viable option was to catalog the reports in three-ring binders, with the hope that they could be reviewed thoroughly at some later date.

Like others in law enforcement, our lives as analysts changed dramatically that day. Our professional work would never again be the same. In addition to violent crimes and vice, we now have the added responsibility of analyzing data related to the war on terrorism and the protection of homeland security, regardless of whether we work at the state, local, or federal level. Moreover, if there was one take-home message from that day as an analyst, particularly in Virginia, it was that the terrorists had been hiding in plain sight among us, sometimes for years, and they had been engaging in a variety of other crimes in an effort to further their terrorist agenda, including identity theft, forgery, and smuggling; not to mention the various immigration laws they violated. Many of these crimes fall within the purview of local law enforcement.

As we moved through the days and weeks following the attacks, I realized that we could do much better as analysts. The subsequent discussions regarding “connecting the dots” highlighted the sad fact that quite a bit of information had been available before the attacks; however, flaws in the sharing and analysis of information resulted in tragic consequences. Although meaningful information sharing remains an important goal, advanced analytical techniques are available now.
The same tools that were being used to prevent people from switching their mobile telephone service provider and to stock shelves at our local supermarkets on September 10th can be used to create safer, healthier communities and enhance homeland security. The good news is that these techniques and tools are being used widely in the business community. The key is to apply them to questions or challenges in public safety, law enforcement, and intelligence analysis.

Again, I thoroughly enjoy science and particularly like the new concept of “data science,” which really captures the creative aspects of analysis and associated promise of transdisciplinary approaches. As someone who likes to color outside the lines and explore novel approaches to analysis, I am intrigued by the use of advanced analytics to improve other aspects of my life and see data science as a means to an end; as a means by which to better understand behavior (good, bad, and otherwise) so that we can use it to anticipate and influence outcomes, particularly in support of enhanced public safety and security. Almost everything in my professional life for the previous 20 years has been in direct support of that mission. The second edition of this textbook is no exception.

Although I say “I” quite a bit in this book, it certainly was not created in a vacuum. Countless individuals have helped me throughout my career, and a few have truly inspired me. What follows is a very brief list of those who contributed directly to this effort in some way.

I am tremendously honored by General Carter Ham’s willingness to write the Foreword to the second edition. General Ham has been a great mentor and guide, particularly as relates to improving my understanding of the challenges facing the people of Africa. Our work modeling violent extremism in Africa has been some of the most rewarding for me professionally.
The ability to successfully apply western models of crime analysis to the Lord’s Resistance Army (LRA) underscores the importance of foundation-level concepts in understanding violent crime and other predatory behavior; concepts that will enable us to effectively respond to other challenging situations, including those that have not yet emerged. This particular problem space is complex and there will be no easy solutions; however, the saying “African solutions to African problems” reinforces the importance of a local approach in support of meaningful and sustainable answers to some of our hardest problems. Moreover, the more that I learn about Africa, the more that I see parallels, not only in our understanding of challenging behavior, but also in the importance of local solutions to problems in other communities struggling with violence, including those in the United States.

I would like to thank Pam Chester from Elsevier for originally approaching me about a second edition. Marisa LaFleur, my new editor, has brought a fresh perspective and approach, which has been a great benefit. Nancy Coleman and Turner Brinton from DigitalGlobe, and Brian Wagner from McBee Strategic Consulting, have provided great insight and guidance regarding the importance of narrative and context in conveying the critical points in the new case material.

Most of the early work referenced came out of some very lively discussions that began several years ago with my colleagues at the Federal Bureau of Investigation. In particular, Supervisory Special Agents Charlie Dorsey and Dr. Wayne Lord provided considerable guidance to my early research. Over time, they have become both colleagues and friends, and my work definitely reflects a level of quality that is attributable directly to their input. Also with the FBI, Mr. Art Westveer taught me almost everything that I know about death investigation.
I learned a tremendous amount from his lectures, which were punctuated with his dry sense of humor and wonderful anecdotes from a very successful career with the Baltimore Police Department. His untimely passing was a significant loss to our community. Rich Weaver and Tim King graciously allowed me to attend their lectures and training at International Training, Inc. on surveillance detection in support of my research. They also provided some unique opportunities for field testing many of my ideas in this area to see how well they would play in the real world.

Although many of my former employers merely tolerated my analytical proclivities, the Project Safe Neighborhoods folks provided funding, as well as ongoing support and encouragement for much of the early work outlined in this book. In particular, Paul McNulty, the United States Attorney for the Eastern District of Virginia, carried the message of our success far beyond the audience that I could reach alone.

I also would like to thank Dr. Harvey Sugerman. I still remember the day when he called me out of the blue and told me that he thought that I should be paid for the work I had been doing. A single mother, I had been responding to homicide calls on my own time in the evenings in an effort to gain additional knowledge and insight into violent crime and the investigative process. That particular act made a tremendous positive impact on my life. I gained invaluable experience through my affiliation with the university, but his gentle mentoring and decision to offer me compensation for my work only begins to underscore the kindness in his heart.

I owe a tremendous debt of gratitude to the software and consulting companies that provided me with excellent case study material, without which the second edition would be very thin and not terribly interesting.
In particular, David Korn, Allen Sackadorf, and John Tomaselli from SAP NS2; Kevin Mergruen and Ted Desaussure from Information Builders; Bill Wall from Praescient Analytics; Dr. Rick Adderly from A E Solutions (BI) Ltd.; Dave Roberts from the