Algorithmic Aspects of Machine Learning

This book bridges theoretical computer science and machine learning by exploring what the two sides can teach each other. It emphasizes the need for flexible, tractable models that better capture not what makes machine learning hard but what makes it easy. Theoretical computer scientists will be introduced to important models in machine learning and to the main questions within the field. Machine learning researchers will be introduced to cutting-edge research in an accessible format and will gain familiarity with a modern algorithmic toolkit, including the method of moments, tensor decompositions, and convex programming relaxations.

The treatment goes beyond worst-case analysis to build a rigorous understanding about the approaches used in practice and to facilitate the discovery of exciting new ways to solve important, long-standing problems.

Ankur Moitra is the Rockwell International Associate Professor of Mathematics at the Massachusetts Institute of Technology. He is a principal investigator in the Computer Science and Artificial Intelligence Lab (CSAIL) and a core member of the Theory of Computation Group, Machine Learning@MIT, and the Center for Statistics. The aim of his work is to bridge the gap between theoretical computer science and machine learning by developing algorithms with provable guarantees and foundations for reasoning about their behavior. He is the recipient of a Packard Fellowship, a Sloan Fellowship, a National Science Foundation CAREER Award, an NSF Computing and Innovation Fellowship, and a Hertz Fellowship.

To Diana and Olivia, the sunshine in my life

Algorithmic Aspects of Machine Learning
ANKUR MOITRA
Massachusetts Institute of Technology

University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314-321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.
It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781107184589
DOI: 10.1017/9781316882177

© Ankur Moitra 2018

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2018
Printed in the United States of America by Sheridan Books, Inc.
A catalogue record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Names: Moitra, Ankur, 1985– author.
Title: Algorithmic aspects of machine learning / Ankur Moitra, Massachusetts Institute of Technology.
Description: Cambridge, United Kingdom ; New York, NY, USA : Cambridge University Press, 2018. | Includes bibliographical references.
Identifiers: LCCN 2018005020 | ISBN 9781107184589 (hardback) | ISBN 9781316636008 (paperback)
Subjects: LCSH: Machine learning–Mathematics. | Computer algorithms.
Classification: LCC Q325.5 .M65 2018 | DDC 006.3/1015181–dc23
LC record available at https://lccn.loc.gov/2018005020

ISBN 978-1-107-18458-9 Hardback
ISBN 978-1-316-63600-8 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

Preface

1 Introduction

2 Nonnegative Matrix Factorization
  2.1 Introduction
  2.2 Algebraic Algorithms
  2.3 Stability and Separability
  2.4 Topic Models
  2.5 Exercises

3 Tensor Decompositions: Algorithms
  3.1 The Rotation Problem
  3.2 A Primer on Tensors
  3.3 Jennrich’s Algorithm
  3.4 Perturbation Bounds
  3.5 Exercises

4 Tensor Decompositions: Applications
  4.1 Phylogenetic Trees and HMMs
  4.2 Community Detection
  4.3 Extensions to Mixed Models
  4.4 Independent Component Analysis
  4.5 Exercises

5 Sparse Recovery
  5.1 Introduction
  5.2 Incoherence and Uncertainty Principles
  5.3 Pursuit Algorithms
  5.4 Prony’s Method
  5.5 Compressed Sensing
  5.6 Exercises

6 Sparse Coding
  6.1 Introduction
  6.2 The Undercomplete Case
  6.3 Gradient Descent
  6.4 The Overcomplete Case
  6.5 Exercises

7 Gaussian Mixture Models
  7.1 Introduction
  7.2 Clustering-Based Algorithms
  7.3 Discussion of Density Estimation
  7.4 Clustering-Free Algorithms
  7.5 A Univariate Algorithm
  7.6 A View from Algebraic Geometry
  7.7 Exercises

8 Matrix Completion
  8.1 Introduction
  8.2 Nuclear Norm
  8.3 Quantum Golfing

Bibliography
Index

Preface

The monograph is based on the class Algorithmic Aspects of Machine Learning taught at MIT in fall 2013, spring 2015, and fall 2017. Thank you to all the students and postdocs who participated in this class and made teaching it a wonderful experience.