
Sports Analytics and Data Science: Winning the Game with Methods and Models PDF

352 Pages·2015·9.14 MB·English

Preview Sports Analytics and Data Science: Winning the Game with Methods and Models

Sports Analytics and Data Science
Winning the Game with Methods and Models
THOMAS W. MILLER

Publisher: Paul Boger
Editor-in-Chief: Amy Neidlinger
Executive Editor: Jeanne Glasser Levine
Cover Designer: Alan Clements
Managing Editor: Kristy Hart
Project Editor: Andy Beaster
Manufacturing Buyer: Dan Uhrig

© 2016 by Thomas W. Miller
Published by Pearson Education, Inc.
Old Tappan, New Jersey 07675

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact [email protected] (800) 382-3419. For government sales inquiries, [email protected]. For questions about sales outside the U.S., [email protected].

Company and product names mentioned herein are the trademarks or registered trademarks of their respective owners.

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America
First Printing November 2015
ISBN-10: 0-13-388643-3
ISBN-13: 978-0-13-388643-6

Pearson Education LTD.
Pearson Education Australia PTY, Limited.
Pearson Education Singapore, Pte. Ltd.
Pearson Education Asia, Ltd.
Pearson Education Canada, Ltd.
Pearson Educación de Mexico, S.A. de C.V.
Pearson Education, Japan
Pearson Education Malaysia, Pte. Ltd.
Library of Congress Control Number: 2015954509

Contents

Preface
Figures
Tables
Exhibits
1 Understanding Sports Markets
2 Assessing Players
3 Ranking Teams
4 Predicting Scores
5 Making Game-Day Decisions
6 Crafting a Message
7 Promoting Brands and Products
8 Growing Revenues
9 Managing Finances
10 Playing What-if Games
11 Working with Sports Data
12 Competing on Analytics
A Data Science Methods
A.1 Mathematical Programming
A.2 Classical and Bayesian Statistics
A.3 Regression and Classification
A.4 Data Mining and Machine Learning
A.5 Text and Sentiment Analysis
A.6 Time Series, Sales Forecasting, and Market Response Models
A.7 Social Network Analysis
A.8 Data Visualization
A.9 Data Science: The Eclectic Discipline
B Professional Leagues and Teams
Data Science Glossary
Baseball Glossary
Bibliography
Index

Preface

“Sometimes you win, sometimes you lose, sometimes it rains.”
- TIM ROBBINS AS EBBY CALVIN LALOOSH in Bull Durham (1988)

Businesses attract customers, politicians persuade voters, websites cajole visitors, and sports teams draw fans. Whatever the goal or target, data and models rule the day.

This book is about building winning teams and successful sports businesses. Winning and success are more likely when decisions are guided by data and models. Sports analytics is a source of competitive advantage.

This book provides an accessible guide to sports analytics. It is written for anyone who needs to know about sports analytics, including players, managers, owners, and fans. It is also a resource for analysts, data scientists, and programmers. The book views sports analytics in the context of data science, a discipline that blends business savvy, information technology, and modeling techniques.

To use analytics effectively in sports, we must first understand sports: the industry, the business, and what happens on the fields and courts of play.
We need to know how to work with data: identifying data sources, gathering data, organizing and preparing them for analysis. We also need to know how to build models from data. Data do not speak for themselves. Useful predictions do not arise out of thin air. It is our job to learn from data and build models that work.

The best way to learn about sports analytics and data science is through examples. We provide a ready resource and reference guide for modeling techniques. We show programmers how to solve real world problems by building on a foundation of trustworthy methods and code.

The truth about what we do is in the programs we write. The code is there for everyone to see and for some to debug. Data sets and computer programs are available from the website for the Modeling Techniques series at http://www.ftpress.com/miller/. There is also a GitHub site at https://github.com/mtpa/.

When working on sports problems, some things are more easily accomplished with R, others with Python. And there are times when it is good to offer solutions in both languages, checking one against the other.

One of the things that distinguishes this book from others in the area of sports analytics is the range of data sources and topics discussed. Many researchers focus on numerical performance data for teams and players. We take a broader view of sports analytics: the view of data science. There are text data as well as numeric data. And with the growth of the World Wide Web, the sources of data are plentiful. Much can be learned from public domain sources through crawling and scraping the web and utilizing application programming interfaces (APIs).

I learn from my consulting work with professional sports organizations. Research Publishers LLC with its ToutBay division promotes what can be called “data science as a service.” Academic research and models can take us only so far. Eventually, to make a difference, we need to implement our ideas and models, sharing them with one another.
Many have influenced my intellectual development over the years. There were those good thinkers and good people, teachers and mentors for whom I will be forever grateful. Sadly, no longer with us are Gerald Hahn Hinkle in philosophy and Allan Lake Rice in languages at Ursinus College, and Herbert Feigl in philosophy at the University of Minnesota. I am also most thankful to David J. Weiss in psychometrics at the University of Minnesota and Kelly Eakin in economics, formerly at the University of Oregon.

My academic home is the Northwestern University School of Professional Studies. Courses in sports research methods and quantitative analysis, marketing analytics, database systems and data preparation, web and network data science, web information retrieval and real-time analytics, and data visualization provide inspiration for this book. Thanks to the many students and fellow faculty from whom I have learned. And thanks to colleagues and staff who administer excellent graduate programs, including the Master of Science in Predictive Analytics, Master of Arts in Sports Administration, Master of Science in Information Systems, and the Advanced Certificate in Data Science.

Lorena Martin reviewed this book and provided valuable feedback while she authored a companion volume on sports performance measurement and analytics (Martin 2016). Adam Grossman and Tom Robinson provided valuable feedback about coverage of topics in sports business management. Roy Sanford provided advice on statistics. Amy Hendrickson of TeXnology Inc. applied her craft, making words, tables, and figures look beautiful in print, another victory for open source. Candice Bradley served dual roles as a reviewer and copyeditor for all books in the Modeling Techniques series. And Andy Beaster helped in preparing this book for final production. I am grateful for their guidance and encouragement.

Thanks go to my editor, Jeanne Glasser Levine, and publisher, Pearson/FT Press, for making this book possible. Any writing issues, errors, or items of unfinished business, of course, are my responsibility alone.
My good friend Brittney and her daughter Janiya keep me company when time permits. And my son Daniel is there for me in good times and bad, a friend for life. My greatest debt is to them because they believe in me.

Thomas W. Miller
Glendale, California
October 2015

Figures

1.1 MLB, NBA, and NFL Average Annual Salaries
1.2 MLB Team Payrolls and Win/Loss Performance (2014 Season)
1.3 A Perceptual Map of Seven Sports
2.1 Multitrait-Multimethod Matrix for Baseball Measures
3.1 Assessing Team Strength: NBA Regular Season (2014–2015)
4.1 Work of Data Science
4.2 Data and Models for Research
4.3 Training-and-Test Regimen for Model Evaluation
4.4 Training-and-Test Using Multi-fold Cross-validation
4.5 Training-and-Test with Bootstrap Resampling
4.6 Predictive Modeling Framework for Team Sports
6.1 How Sports Fit into the Entertainment Space (Or Not)
6.2 Indices of Dissimilarity Between Pairs of Binary Variables
6.3 Consumer Preferences for Dodger Stadium Seating
6.4 Choice Item for Assessing Willingness to Pay for Tickets
6.5 The Market: A Meeting Place for Buyers and Sellers
7.1 Dodgers Attendance by Day of Week
7.2 Dodgers Attendance by Month
7.3 Dodgers Weather, Fireworks, and Attendance
7.4 Dodgers Attendance by Visiting Team
7.5 Regression Model Performance: Bobbleheads and Attendance
8.1 Competitive Analysis for an NBA Team: Golden State Warriors
9.1 Cost-Volume-Profit Analysis
9.2 Higher Profits Through Increased Sales
9.3 Higher Profits Through Lower Fixed Costs
9.4 Higher Profits Through Increased Efficiency
9.5 Decision Analysis: Investing in a Sports Franchise (Or Not)
10.1 Game-day Simulation (Offense Only)
