Data Mining Techniques
For Marketing, Sales, and Customer Relationship Management
Second Edition

Michael J.A. Berry
Gordon S. Linoff

Vice President and Executive Group Publisher: Richard Swadley
Vice President and Executive Publisher: Bob Ipsen
Vice President and Publisher: Joseph B. Wikert
Executive Editorial Director: Mary Bednarek
Executive Editor: Robert M. Elliott
Editorial Manager: Kathryn A. Malm
Senior Production Editor: Fred Bernardi
Development Editor: Emilie Herman, Erica Weinstein
Production Editor: Felicia Robinson
Media Development Specialist: Laura Carpenter VanWinkle
Text Design & Composition: Wiley Composition Services

Copyright 2004 by Wiley Publishing, Inc., Indianapolis, Indiana
All rights reserved.
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8700. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4447, E-mail: [email protected].
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Trademarks: Wiley, the Wiley Publishing logo, are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data:
Berry, Michael J. A.
Data mining techniques : for marketing, sales, and customer relationship management / Michael J.A. Berry, Gordon Linoff.— 2nd ed.
p. cm.
Includes index.
ISBN 0-471-47064-3 (paper/website)
1. Data mining. 2. Marketing—Data processing. 3. Business—Data processing. I. Linoff, Gordon. II. Title.
HF5415.125 .B47 2004
658.8’02—dc22
2003026693

ISBN: 0-471-47064-3
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

To Stephanie, Sasha, and Nathaniel. Without your patience and understanding, this book would not have been possible.
— Michael

To Puccio. Grazie per essere paziente con me. Ti amo.
— Gordon

Acknowledgments

We are fortunate to be surrounded by some of the most talented data miners anywhere, so our first thanks go to our colleagues at Data Miners, Inc. from whom we have learned so much: Will Potts, Dorian Pyle, and Brij Masand. There are also clients with whom we work so closely that we consider them our colleagues as well: Harrison Sohmer and Stuart E. Ward, III are in that category. Our Editor, Bob Elliott, Editorial Assistant, Erica Weinstein, and Development Editor, Emilie Herman, kept us (more or less) on schedule and helped us maintain a consistent style. Lauren McCann, a graduate student at M.I.T. and intern at Data Miners, prepared the census data used in some examples and created some of the illustrations.

We would also like to acknowledge all of the people we have worked with in scores of data mining engagements over the years. We have learned something from every one of them.
The many whose data mining projects have influenced the second edition of this book include:

Al Fan, Alan Parker, Anne Milley, Brian Guscott, Bruce Rylander, Corina Cortes, Daryl Berry, Daryl Pregibon, Doug Newell, Ed Freeman, Erin McCarthy, Herb Edelstein, Jill Holtz, Joan Forrester, John Wallace, Josh Goff, Karen Kennedy, Kurt Thearling, Lynne Brennen, Mark Smith, Mateus Kehder, Michael Patrick, Nick Gagliardo, Nick Radcliffe, Patrick Surry, Ronny Kohavi, Sheridan Young, Susan Hunt Stevens, Ted Browne, Terri Kowalchuk, Victor Lo, Yasmin Namini, Zai Ying Huang

And, of course, all the people we thanked in the first edition are still deserving of acknowledgement:

Bob Flynn, Bryan McNeely, Claire Budden, David Isaac, David Waltz, Dena d’Ebin, Diana Lin, Don Peppers, Ed Horton, Edward Ewen, Fred Chapman, Gary Drescher, Gregory Lampshire, Janet Smith, Jerry Modes, Jim Flynn, Kamran Parsaye, Karen Stewart, Larry Bookman, Larry Scroggins, Lars Rohrberg, Lounette Dyer, Marc Goodman, Marc Reifeis, Marge Sherold, Mario Bourgoin, Prof. Michael Jordan, Patsy Campbell, Paul Becker, Paul Berry, Rakesh Agrawal, Ric Amari, Rich Cohen, Robert Groth, Robert Utzschnieder, Roland Pesch, Stephen Smith, Sue Osterfelt, Susan Buchanan, Syamala Srinivasan, Wei-Xing Ho, William Petefish, Yvonne McCollin

About the Authors

Michael J. A. Berry and Gordon S. Linoff are well known in the data mining field. They have jointly authored three influential and widely read books on data mining that have been translated into many languages. They each have close to two decades of experience applying data mining techniques to business problems in marketing and customer relationship management.

Michael and Gordon first worked together during the 1980s at Thinking Machines Corporation, which was a pioneer in mining large databases. In 1996, they collaborated on a data mining seminar, which soon evolved into the first edition of this book.
The success of that collaboration gave them the courage to start Data Miners, Inc., a respected data mining consultancy, in 1998. As data mining consultants, they have worked with a wide variety of major companies in North America, Europe, and Asia, turning customer databases, call detail records, Web log entries, point-of-sale records, and billing files into useful information that can be used to improve the customer experience. The authors’ years of hands-on data mining experience are reflected in every chapter of this extensively updated and revised edition of their first book, Data Mining Techniques.

When not mining data at some distant client site, Michael lives in Cambridge, Massachusetts, and Gordon lives in New York City.