SPSS® Statistics for Data Analysis and Visualization SPSS® Statistics for Data Analysis and Visualization Keith McCormick Jesus Salcedo with Jon Peck and Andrew Wheeler SPSS® Statistics for Data Analysis and Visualization Published by John Wiley & Sons, Inc. 10475 Crosspoint Boulevard Indianapolis, IN 46256 www.wiley.com Copyright © 2017 by John Wiley & Sons, Inc., Indianapolis, Indiana Published simultaneously in Canada ISBN: 978-1-119-00355-7 ISBN: 978-1-119-00557-5 (ebk) ISBN: 978-1-119-00366-3 (ebk) Manufactured in the United States of America 10 9 8 7 6 5 4 3 2 1 No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permis- sion of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley .com/go/permissions. Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or war- ranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read. For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002. Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com. Library of Congress Control Number: 2017936609 Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permis- sion. SPSS is a registered trademark of International Business Machine Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book. We would like to dedicate this book to Jon Peck, who retired from more than 30 years with SPSS and IBM while this book was in its final stages. We wish him the best of retirements even though he probably won’t be able to resist staying in the SPSS community in some form. About the Authors Keith McCormick is a data mining consultant, trainer, and speaker. A passionate user of SPSS for 25 years, he has trained thousands on how to effectively use SPSS Statistics and SPSS Modeler. He blogs at keithmccormick.com. Jesus Salcedo is an independent statistical consultant. He is a former SPSS Curriculum Team Lead and Senior Education Specialist, who has written numer- ous SPSS training courses and trained thousands of users. Jon Peck, recently retired from IBM and SPSS, was instrumental in developing and introducing the R and Python connections to the SPSS community. This expertise made him uniquely qualified to produce Chapter 18. He is the author of all the extension commands discussed in that chapter and has a patent pend- ing on the algorithm in SPSSINC TURF procedure discussed there. He can be reached at [email protected]. Andrew Wheeler is a professor of criminology at the University of Texas at Dallas and a former crime analyst. The application of geospatial techniques in his research created the opportunity for a powerful real world example in Chapter 8. He has used SPSS for over 10 years, and often blogs SPSS tutorials at andrewpwheeler.wordpress.com. vii About the Technical Editors Jon Peck, now retired from IBM, was a senior engineer, statistician, and product strategy person for SPSS and IBM for 32 years. He earned a Ph.D in economics from Yale University, and taught econometrics and statistics there for 13 years before joining SPSS. He designed and contributed to many features of SPSS Statistics and has consulted with and trained many users. He remains active on social media and in consulting. Terry Taerum has fifteen years’ experience as a statistician at the University of Alberta, fifteen years as a data analyst at SPSS Inc., and five years as a predictive analyst and consultant with IBM Inc. ix Credits Project Editor Business Manager Tom Dinse Amy Knies Technical Editors Executive Editor Jon Peck Jim Minatel Terry Taerum Project Coordinator, Cover Production Editor Brent Savage Dassi Zeidel Proofreader Copy Editor Nancy Carrasco Kim Cofer Indexer Production Manager Johnna VanHoose Dinse Katie Wisor Cover Designer Manager of Content Development Wiley & Assembly Cover Image Mary Beth Wakefield iStock.com/agsandrew Marketing Manager Christie Hilbrich Professional Technology & Strategy Director Barry Pruett xi Acknowledgments Keith and Jesus are especially proud to have worked with Bob Elliot before he retired. Our good friend Dean Abbott recommended Keith to Bob when Bob was seeking out a follow up to Dean’s excellent Applied Predictive Analytics, but specifically in SPSS Statistics. Without both of them, this book would not have been created. Terry’s and Jon’s contribution extended well beyond technical reviewing. We consider both of them mentors and friends. Jon took over technical reviewing when Terry took on a new role with a return to IBM. Jon, in particular, was an interlocutor and trusted advisor, and we produced a better book as a result. Tom, our project editor, had to be patient with us. Deadlines slipped, con- tributors became unavailable, and Bob retired before the book was complete. Whenever it seemed that something wasn’t quite as it should be, it was often Tom that ultimately made it right. He deserves credit for multiple roles, and we thank him. We would also like to thank all of the many SPSSers that we turn to when we have a question even if they haven’t heard from us in a while. We love the sense of community that we have all managed to maintain even when so many have moved on to other roles. And we thank Jason for capturing that sense of community in his foreword. xiii Contents at a Glance Foreword xxiii Introduction xxvii Part I Advanced Statistics 1 Chapter 1 Comparing and Contrasting IBM SPSS AMOS with Other Multivariate Techniques 3 Chapter 2 Monte Carlo Simulation and IBM SPSS Bootstrapping 43 Chapter 3 Regression with Categorical Outcome Variables 71 Chapter 4 Building Hierarchical Linear Models 101 Part II Data Visualization 129 Chapter 5 Take Your Data Visualizations to the Next Level 131 Chapter 6 The Code Behind SPSS Graphics: Graphics Production Language 147 Chapter 7 Mapping in IBM SPSS Statistics 173 Chapter 8 Geospatial Analytics 193 Chapter 9 Perceptual Mapping with Correspondence Analysis, GPL, and OMS 217 Chapter 10 Display Complex Relationships with Multidimensional Scaling 249 Part III Predictive Analytics 271 Chapter 11 SPSS Statistics versus SPSS Modeler: Can I Be a Data Miner Using SPSS Statistics? 275 Chapter 12 IBM SPSS Data Preparation 303 xv xvi Contents at a Glance Chapter 13 Model Complex Interactions with IBM SPSS Neural Networks 325 Chapter 14 Powerful and Intuitive: IBM SPSS Decision Trees 355 Chapter 15 Find Patterns and Make Predictions with K Nearest Neighbors 379 Part IV Syntax, Data Management, and Programmability 393 Chapter 16 Write More Efficient and Elegant Code with SPSS Syntax Techniques 395 Chapter 17 Automate Your Analyses with SPSS Syntax and the Output Management System 421 Chapter 18 Statistical Extension Commands 441 Index 473