Introduction to Machine Learning
Second Edition

Adaptive Computation and Machine Learning
Thomas Dietterich, Editor
Christopher Bishop, David Heckerman, Michael Jordan, and Michael Kearns, Associate Editors

A complete list of books published in the Adaptive Computation and Machine Learning series appears at the back of this book.

Ethem Alpaydın

The MIT Press
Cambridge, Massachusetts
London, England

© 2010 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

For information about special quantity discounts, please email [email protected].

Typeset in 10/13 Lucida Bright by the author using LaTeX 2ε.
Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Information

Alpaydin, Ethem.
Introduction to machine learning / Ethem Alpaydin. — 2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-262-01243-0 (hardcover : alk. paper)
1. Machine learning. I. Title
Q325.5.A46 2010
006.3’1—dc22
2009013169 CIP

10 9 8 7 6 5 4 3 2 1

Brief Contents

1 Introduction
2 Supervised Learning
3 Bayesian Decision Theory
4 Parametric Methods
5 Multivariate Methods
6 Dimensionality Reduction
7 Clustering
8 Nonparametric Methods
9 Decision Trees
10 Linear Discrimination
11 Multilayer Perceptrons
12 Local Models
13 Kernel Machines
14 Bayesian Estimation
15 Hidden Markov Models
16 Graphical Models
17 Combining Multiple Learners
18 Reinforcement Learning
19 Design and Analysis of Machine Learning Experiments
A Probability

Contents

Series Foreword
Figures
Tables
Preface
Acknowledgments
Notes for the Second Edition
Notations

1 Introduction
  1.1 What Is Machine Learning?
  1.2 Examples of Machine Learning Applications
    1.2.1 Learning Associations
    1.2.2 Classification
    1.2.3 Regression
    1.2.4 Unsupervised Learning
    1.2.5 Reinforcement Learning
  1.3 Notes
  1.4 Relevant Resources
  1.5 Exercises
  1.6 References

2 Supervised Learning
  2.1 Learning a Class from Examples
  2.2 Vapnik-Chervonenkis (VC) Dimension
  2.3 Probably Approximately Correct (PAC) Learning
  2.4 Noise
  2.5 Learning Multiple Classes
  2.6 Regression
  2.7 Model Selection and Generalization
  2.8 Dimensions of a Supervised Machine Learning Algorithm
  2.9 Notes
  2.10 Exercises
  2.11 References

3 Bayesian Decision Theory
  3.1 Introduction
  3.2 Classification
  3.3 Losses and Risks
  3.4 Discriminant Functions
  3.5 Utility Theory
  3.6 Association Rules
  3.7 Notes
  3.8 Exercises
  3.9 References

4 Parametric Methods
  4.1 Introduction
  4.2 Maximum Likelihood Estimation
    4.2.1 Bernoulli Density
    4.2.2 Multinomial Density
    4.2.3 Gaussian (Normal) Density
  4.3 Evaluating an Estimator: Bias and Variance
  4.4 The Bayes’ Estimator
  4.5 Parametric Classification
  4.6 Regression
  4.7 Tuning Model Complexity: Bias/Variance Dilemma
  4.8 Model Selection Procedures
  4.9 Notes
  4.10 Exercises
  4.11 References

5 Multivariate Methods
  5.1 Multivariate Data