International Series in Operations Research & Management Science
Volume 264

Bhimasankaram Pochiraju • Sridhar Seshadri
Editors

Essentials of Business Analytics
An Introduction to the Methodology and its Applications

Series Editor
Camille C. Price
Stephen F. Austin State University, TX, USA

Associate Series Editor
Joe Zhu
Worcester Polytechnic Institute, MA, USA

Founding Series Editor
Frederick S. Hillier, Stanford University, CA, USA

More information about this series at http://www.springer.com/series/6161

Editors

Bhimasankaram Pochiraju
Applied Statistics and Computing Lab
Indian School of Business
Hyderabad, Telangana, India

Sridhar Seshadri
Gies College of Business
University of Illinois at Urbana Champaign
Champaign, IL, USA

ISSN 0884-8289
ISSN 2214-7934 (electronic)
ISBN 978-3-319-68836-7
ISBN 978-3-319-68837-4 (eBook)
https://doi.org/10.1007/978-3-319-68837-4

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Professor Bhimasankaram: With the divine blessings of Bhagawan Sri Sri Sri Satya Sai Baba, I dedicate this book to my parents, Sri Pochiraju Rama Rao and Smt. Venkata Ratnamma.

Sridhar Seshadri: I dedicate this book to the memory of my parents, Smt. Ranganayaki and Sri Desikachari Seshadri, my father-in-law, Sri Kalyana Srinivasan Ayodhyanath, and my dear friend, collaborator and advisor, Professor Bhimasankaram.

Contents

1 Introduction
  Sridhar Seshadri

Part I Tools

2 Data Collection
  Sudhir Voleti
3 Data Management: Relational Database Systems (RDBMS)
  Hemanth Kumar Dasararaju and Peeyush Taori
4 Big Data Management
  Peeyush Taori and Hemanth Kumar Dasararaju
5 Data Visualization
  John F. Tripp
6 Statistical Methods: Basic Inferences
  Vishnuprasad Nagadevara
7 Statistical Methods: Regression Analysis
  Bhimasankaram Pochiraju and Hema Sri Sai Kollipara
8 Advanced Regression Analysis
  Vishnuprasad Nagadevara
9 Text Analytics
  Sudhir Voleti

Part II Modeling Methods

10 Simulation
  Sumit Kunnumkal
11 Introduction to Optimization
  Milind G. Sohoni
12 Forecasting Analytics
  Konstantinos I. Nikolopoulos and Dimitrios D. Thomakos
13 Count Data Regression
  Thriyambakam Krishnan
14 Survival Analysis
  Thriyambakam Krishnan
15 Machine Learning (Unsupervised)
  Shailesh Kumar
16 Machine Learning (Supervised)
  Shailesh Kumar
17 Deep Learning
  Manish Gupta

Part III Applications

18 Retail Analytics
  Ramandeep S. Randhawa
19 Marketing Analytics
  S. Arunachalam and Amalesh Sharma
20 Financial Analytics
  Krishnamurthy Vaidyanathan
21 Social Media and Web Analytics
  Vishnuprasad Nagadevara
22 Healthcare Analytics
  Maqbool (Mac) Dada and Chester Chambers
23 Pricing Analytics
  Kalyan Talluri and Sridhar Seshadri
24 Supply Chain Analytics
  Yao Zhao
25 Case Study: Ideal Insurance
  Deepak Agrawal and Soumithri Mamidipudi
26 Case Study: AAA Airline
  Deepak Agrawal, Hema Sri Sai Kollipara, and Soumithri Mamidipudi
27 Case Study: InfoMedia Solutions
  Deepak Agrawal, Soumithri Mamidipudi, and Sriram Padmanabhan
28 Introduction to R
  Peeyush Taori and Hemanth Kumar Dasararaju
29 Introduction to Python
  Peeyush Taori and Hemanth Kumar Dasararaju
30 Probability and Statistics
  Peeyush Taori, Soumithri Mamidipudi, and Deepak Agrawal

Index

Disclaimer

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Acknowledgements

This book is the outcome of a truly collaborative effort amongst many people who have contributed in different ways. We are deeply thankful to all the contributing authors for their ideas and support. The book belongs to them. This book would not have been possible without the help of Deepak Agrawal. Deepak helped in every way, from editorial work, solution support, programming help, to coordination with authors and researchers, and many more things. Soumithri Mamidipudi provided editorial support, helped with writing summaries of every chapter, and proof-edited the probability and statistics appendix and cases. Padmavati Sridhar provided editorial support for many chapters. Two associate alumni of the Certificate Programme in Business Analytics (CBA) at Indian School of Business (ISB), Ramakrishna Vempati and Suryanarayana Ambatipudi, helped with locating contemporary examples and references. They suggested examples for the Retail Analytics and Supply Chain Analytics chapters. Ramakrishna also contributed to the draft of the Big Data chapter.
Several researchers in the Advanced Statistics and Computing Lab (ASC Lab) at ISB helped in many ways. Hema Sri Sai Kollipara provided support for the cases and exercises, as well as technical and statistics support for various chapters. Aditya Taori helped with examples for the machine learning chapters and exercises. Saurabh Jugalkishor contributed examples for the machine learning chapters. The ASC Lab's researchers and Hemanth Kumar provided technical support in preparing solutions for various examples referred to in the chapters. Ashish Khandelwal, a Fellow Program student at ISB, helped with the chapter on Linear Regression. Dr. Kumar Eswaran and Joy Mustafi provided additional thoughts for the Unsupervised Learning chapter. The editorial team comprising Faith Su, Mathew Amboy and series editor Camille Price gave immense support during the book proposal stage and guidance during editing, production, etc. The ASC Lab provided the research support for this project.

We thank our families for the constant support during the 2-year-long project. We thank each and every person associated with us during the beautiful journey of writing this book.