INFORMS Analytics Body of Knowledge

Wiley Essentials in OPERATIONAL RESEARCH AND MANAGEMENT SCIENCE

Edited by James J. Cochran

This edition first published 2019
© 2019 John Wiley and Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of James J. Cochran to be identified as the author of this work has been asserted in accordance with law.

Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make.
This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data
ISBN: 9781119483212

Set in 10/12 pt WarnockPro-Regular by Thomson Digital, Noida, India

Contents

Preface
List of Contributors

1 Introduction to Analytics
  Philip T. Keenan, Jonathan H. Owen, and Kathryn Schumacher
  1.1 Introduction
  1.2 Conceptual Framework
    1.2.1 Data-Centric Analytics
    1.2.2 Decision-Centric Analytics
    1.2.3 Combining Data- and Decision-Centric Approaches
  1.3 Categories of Analytics
    1.3.1 Descriptive Analytics
      Data Modeling
      Reporting
      Visualization
      Software
    1.3.2 Predictive Analytics
      Data Mining and Pattern Recognition
      Predictive Modeling, Simulation, and Forecasting
      Leveraging Expertise
    1.3.3 Prescriptive Analytics
  1.4 Analytics Within Organizations
    1.4.1 Projects
    1.4.2 Communicating Analytics
    1.4.3 Organizational Capability
  1.5 Ethical Implications
  1.6 The Changing World of Analytics
  1.7 Conclusion
  References

2 Getting Started with Analytics
  Karl G. Kempf
  2.1 Introduction
  2.2 Five Manageable Tasks
    2.2.1 Task 1: Selecting the Target Problem
    2.2.2 Task 2: Assemble the Team
      Executive Sponsor
      Project Manager
      Domain Expert
      IT Expert
      Data Scientist
      Stakeholders
    2.2.3 Task 3: Prepare the Data
    2.2.4 Task 4: Selecting Analytics Tools
      Analytical Specificity or Breadth
      Access to Data
      Execution Performance
      Visualization Capability
      Data Scientist Skillset
      Vendor Pricing
      Team Budget
      Sharing and Collaboration
    2.2.5 Task 5: Execute
  2.3 Real Examples
      Case 1: Sensor Data and High-Velocity Analytics to Save Operating Costs
      Case 2: Social Media and High-Velocity Analytics for Quick Response to Customers
      Case 3: Sensor Data and High-Velocity Analytics to Save Maintenance Costs
      Case 4: Using Old Data and Analytics to Detect New Fraudulent Claims
      Case 5: Using Old and New Data Plus Analytics to Decrease Crime
      Case 6: Collecting the Data and Applying the Analytics Is the Business
  References
  Further Reading: Papers
  Further Reading: Books

3 The Analytics Team
  Thomas H. Davenport
  3.1 Introduction
  3.2 Skills Necessary for Analytics
    3.2.1 More Advanced or Recent Analytical and Data Science Skills
    3.2.2 The Larger Team
  3.3 Managing Analytical Talent
    3.3.1 Developing Talent
    3.3.2 Working with the HR Organization
  3.4 Organizing Analytics
    3.4.1 Goals of a Particular Analytics Organization
    3.4.2 Basic Models for Organizing Analytics
    3.4.3 Coordination Approaches
      Program Management Office
      Federation
      Community
      Matrix
      Rotation
      Assigned Customers
      What Model Fits Your Business?
    3.4.4 Organizational Structures for Specific Analytics Strategies and Scenarios
    3.4.5 Analytical Leadership and the Chief Analytics Officer
  3.5 To Where Should Analytical Functions Report?
    Information Technology
    Strategy
    Shared Services
    Finance
    Marketing or Other Specific Function
    Product Development
    3.5.1 Building an Analytical Ecosystem
    3.5.2 Developing the Analytical Organization over Time
  References

4 The Data
  Brian T. Downs
  4.1 Introduction
  4.2 Data Collection
    4.2.1 Data Types
    4.2.2 Data Discovery
  4.3 Data Preparation
  4.4 Data Modeling
    4.4.1 Relational Databases
    4.4.2 Nonrelational Databases
  4.5 Data Management

5 Solution Methodologies
  Mary E. Helander
  5.1 Introduction
    5.1.1 What Exactly Do We Mean by “Solution,” “Problem,” and “Methodology?”
    5.1.2 It’s All About the Problem
    5.1.3 Solutions versus Products
    5.1.4 How This Chapter Is Organized
    5.1.5 The “Descriptive–Predictive–Prescriptive” Analytics Paradigm
    5.1.6 The Goals of This Chapter
  5.2 Macro-Solution Methodologies for the Analytics Practitioner
    5.2.1 The Scientific Research Methodology
    5.2.2 The Operations Research Project Methodology
    5.2.3 The Cross-Industry Standard Process for Data Mining (CRISP-DM) Methodology
    5.2.4 Software Engineering-Related Solution Methodologies
    5.2.5 Summary of Macro-Methodologies
  5.3 Micro-Solution Methodologies for the Analytics Practitioner
    5.3.1 Micro-Solution Methodology Preliminaries
    5.3.2 Micro-Solution Methodology Description Framework
    5.3.3 Group I: Micro-Solution Methodologies for Exploration and Discovery
      Group I: Problems of Interest
      Group I: Relevant Models
      Group I: Data Considerations
      Group I: Solution Techniques
      Group I: Relationship to Macro-Methodologies
      Group I: Takeaways
    5.3.4 Group II: Micro-Solution Methodologies Using Models Where Techniques to Find Solutions Are Independent of Data
      Group II: Problems of Interest
      Group II: Relevant Models
      Group II: Data Considerations
      Group II: Solution Techniques
      Group II: Relationship to Macro-Methodologies
      Group II: Takeaways
    5.3.5 Group III: Micro-Solution Methodologies Using Models Where Techniques to Find Solutions Are Dependent on Data
      Group III: Problems of Interest
      Group III: Relevant Models
      Group III: Data Considerations
      Group III: Solution Techniques
      Group III: Relationship to Macro-Methodologies
      Group III: Takeaways
    5.3.6 Micro-Methodology Summary
  5.4 General Methodology-Related Considerations
    5.4.1 Planning an Analytics Project
    5.4.2 Software and Tool Selection
    5.4.3 Visualization
    5.4.4 Fields with Related Methodologies
  5.5 Summary and Conclusions
    5.5.1 “Ding Dong, the Scientific Method Is Dead!”
    5.5.2 “Methodology Cramps My Analytics Style”
    5.5.3 “There Is Only One Way to Solve This”
    5.5.4 Perceived Success Is More Important Than the Right Answer
  5.6 Acknowledgments
  References

6 Modeling
  Gerald G. Brown
  6.1 Introduction
  6.2 When Are Models Appropriate
    6.2.1 What Is the Problem with This System?
    6.2.2 Is This Problem Important?
    6.2.3 How Will This Problem Be Solved Without a New Model?
    6.2.4 What Modeling Technique Will Be Used?
    6.2.5 How Will We Know When We Have Succeeded?
      Who Are the System Operator Stakeholders?
  6.3 Types of Models
    6.3.1 Descriptive Models
    6.3.2 Predictive Models
    6.3.3 Prescriptive Models
  6.4 Models Can Also Be Characterized by Whether They Are Deterministic or Stochastic (Random)
  6.5 Counting
  6.6 Probability
  6.7 Probability Perspectives and Subject Matter Experts
  6.8 Subject Matter Experts
  6.9 Statistics
    6.9.1 A Random Sample
    6.9.2 Descriptive Statistics
    6.9.3 Parameter Estimation with a Confidence Interval
    6.9.4 Regression
  6.10 Inferential Statistics
  6.11 A Stochastic Process