
Machine Learning: a Concise Introduction PDF

268 pages · 2018 · 8.86 MB · English

Preview Machine Learning: a Concise Introduction

Machine Learning: a Concise Introduction

WILEY SERIES IN PROBABILITY AND STATISTICS
Established by Walter A. Shewhart and Samuel S. Wilks
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods. Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

Machine Learning: a Concise Introduction
Steven W. Knox

This edition first published 2018. This work is a U.S. Government work and is in the public domain in the U.S.A. Published 2018 by John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Steven W. Knox to be identified as the author of this work has been asserted in accordance with law.

Registered Office(s): John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
Editorial Office: 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty: In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data
Names: Knox, Steven W., author.
Title: Machine learning : a concise introduction / by Steven W. Knox.
Description: Hoboken, New Jersey : John Wiley & Sons, 2018. | Series: Wiley series in probability and statistics
Identifiers: LCCN 2017058505 (print) | LCCN 2018004509 (ebook) | ISBN 9781119439073 (pdf) | ISBN 9781119438984 (epub) | ISBN 9781119439196 (cloth)
Subjects: LCSH: Machine learning.
Classification: LCC Q325.5 (ebook) | LCC Q325.5 .K568 2018 (print) | DDC 006.3/1–dc23
LC record available at https://lccn.loc.gov/2017058505

Cover image: © Vertical array/Shutterstock
Cover design by Wiley
Set in 10/12pt Times Std by Aptara Inc., New Delhi, India

Contents

Preface xi
Organization—How to Use This Book xiii
Acknowledgments xvii
About the Companion Website xix

1 Introduction—Examples from Real Life 1

2 The Problem of Learning 3
2.1 Domain 4
2.2 Range 4
2.3 Data 4
2.4 Loss 6
2.5 Risk 8
2.6 The Reality of the Unknown Function 12
2.7 Training and Selection of Models, and Purposes of Learning 12
2.8 Notation 13

3 Regression 15
3.1 General Framework 16
3.2 Loss 17
3.3 Estimating the Model Parameters 17
3.4 Properties of Fitted Values 19
3.5 Estimating the Variance 22
3.6 A Normality Assumption 23
3.7 Computation 24
3.8 Categorical Features 25
3.9 Feature Transformations, Expansions, and Interactions 27
3.10 Variations in Linear Regression 28
3.11 Nonparametric Regression 32

4 Survey of Classification Techniques 33
4.1 The Bayes Classifier 34
4.2 Introduction to Classifiers 37
4.3 A Running Example 38
4.4 Likelihood Methods 40
4.4.1 Quadratic Discriminant Analysis 41
4.4.2 Linear Discriminant Analysis 43
4.4.3 Gaussian Mixture Models 45
4.4.4 Kernel Density Estimation 47
4.4.5 Histograms 51
4.4.6 The Naive Bayes Classifier 54
4.5 Prototype Methods 54
4.5.1 k-Nearest-Neighbor 55
4.5.2 Condensed k-Nearest-Neighbor 56
4.5.3 Nearest-Cluster 56
4.5.4 Learning Vector Quantization 58
4.6 Logistic Regression 59
4.7 Neural Networks 62
4.7.1 Activation Functions 62
4.7.2 Neurons 64
4.7.3 Neural Networks 65
4.7.4 Logistic Regression and Neural Networks 73
4.8 Classification Trees 74
4.8.1 Classification of Data by Leaves (Terminal Nodes) 74
4.8.2 Impurity of Nodes and Trees 75
4.8.3 Growing Trees 76
4.8.4 Pruning Trees 79
4.8.5 Regression Trees 81
4.9 Support Vector Machines 81
4.9.1 Support Vector Machine Classifiers 81
4.9.2 Kernelization 88
4.9.3 Proximal Support Vector Machine Classifiers 92
4.10 Postscript: Example Problem Revisited 93

5 Bias–Variance Trade-off 97
5.1 Squared-Error Loss 98
5.2 Arbitrary Loss 101

6 Combining Classifiers 107
6.1 Ensembles 107
6.2 Ensemble Design 110
6.3 Bootstrap Aggregation (Bagging) 112
6.4 Bumping 115
6.5 Random Forests 116
6.6 Boosting 118
6.7 Arcing 121
6.8 Stacking and Mixture of Experts 121

7 Risk Estimation and Model Selection 127
7.1 Risk Estimation via Training Data 128
7.2 Risk Estimation via Validation or Test Data 128
7.2.1 Training, Validation, and Test Data 128
7.2.2 Risk Estimation 129
7.2.3 Size of Training, Validation, and Test Sets 130
7.2.4 Testing Hypotheses About Risk 131
7.2.5 Example of Use of Training, Validation, and Test Sets 132
7.3 Cross-Validation 133
7.4 Improvements on Cross-Validation 135
7.5 Out-of-Bag Risk Estimation 137
7.6 Akaike's Information Criterion 138
7.7 Schwartz's Bayesian Information Criterion 138
7.8 Rissanen's Minimum Description Length Criterion 139
7.9 R2 and Adjusted R2 140
7.10 Stepwise Model Selection 141
7.11 Occam's Razor 142

8 Consistency 143
8.1 Convergence of Sequences of Random Variables 144
8.2 Consistency for Parameter Estimation 144
8.3 Consistency for Prediction 145
8.4 There Are Consistent and Universally Consistent Classifiers 145
8.5 Convergence to Asymptopia Is Not Uniform and May Be Slow 147

9 Clustering 149
9.1 Gaussian Mixture Models 150
9.2 k-Means 150
9.3 Clustering by Mode-Hunting in a Density Estimate 151
9.4 Using Classifiers to Cluster 152
9.5 Dissimilarity 153
9.6 k-Medoids 153
9.7 Agglomerative Hierarchical Clustering 154
9.8 Divisive Hierarchical Clustering 155
9.9 How Many Clusters Are There? Interpretation of Clustering 155
9.10 An Impossibility Theorem 157

10 Optimization 159
10.1 Quasi-Newton Methods 160
10.1.1 Newton's Method for Finding Zeros 160
10.1.2 Newton's Method for Optimization 161
10.1.3 Gradient Descent 161
10.1.4 The BFGS Algorithm 162
10.1.5 Modifications to Quasi-Newton Methods 162
10.1.6 Gradients for Logistic Regression and Neural Networks 163
10.2 The Nelder–Mead Algorithm 166
10.3 Simulated Annealing 168
10.4 Genetic Algorithms 168
10.5 Particle Swarm Optimization 169
10.6 General Remarks on Optimization 170
10.6.1 Imperfectly Known Objective Functions 170
10.6.2 Objective Functions Which Are Sums 171
10.6.3 Optimization from Multiple Starting Points 172
10.7 The Expectation-Maximization Algorithm 173
10.7.1 The General Algorithm 173
10.7.2 EM Climbs the Marginal Likelihood of the Observations 173
10.7.3 Example—Fitting a Gaussian Mixture Model Via EM 176
10.7.4 Example—The Expectation Step 177
10.7.5 Example—The Maximization Step 178

11 High-Dimensional Data 179
11.1 The Curse of Dimensionality 180
11.2 Two Running Examples 187
11.2.1 Example 1: Equilateral Simplex 187
11.2.2 Example 2: Text 187
11.3 Reducing Dimension While Preserving Information 190
11.3.1 The Geometry of Means and Covariances of Real Features 190
11.3.2 Principal Component Analysis 192
11.3.3 Working in "Dissimilarity Space" 193
11.3.4 Linear Multidimensional Scaling 195
11.3.5 The Singular Value Decomposition and Low-Rank Approximation 197
11.3.6 Stress-Minimizing Multidimensional Scaling 199
11.3.7 Projection Pursuit 199
11.3.8 Feature Selection 201
11.3.9 Clustering 202
11.3.10 Manifold Learning 202
11.3.11 Autoencoders 205
11.4 Model Regularization 209
11.4.1 Duality and the Geometry of Parameter Penalization 212
11.4.2 Parameter Penalization as Prior Information 213

12 Communication with Clients 217
12.1 Binary Classification and Hypothesis Testing 218
12.2 Terminology for Binary Decisions 219
12.3 ROC Curves 219
12.4 One-Dimensional Measures of Performance 224
12.5 Confusion Matrices 225
12.6 Multiple Testing 226
12.6.1 Control the Familywise Error 226
12.6.2 Control the False Discovery Rate 227
12.7 Expert Systems 228

13 Current Challenges in Machine Learning 231
13.1 Streaming Data 231
13.2 Distributed Data 231
13.3 Semi-supervised Learning 232
13.4 Active Learning 232
13.5 Feature Construction via Deep Neural Networks 233
13.6 Transfer Learning 233
13.7 Interpretability of Complex Models 233

14 R Source Code 235
14.1 Author's Biases 236
14.2 Libraries 236
14.3 The Running Example (Section 4.3) 237
14.4 The Bayes Classifier (Section 4.1) 241
14.5 Quadratic Discriminant Analysis (Section 4.4.1) 243
14.6 Linear Discriminant Analysis (Section 4.4.2) 243
14.7 Gaussian Mixture Models (Section 4.4.3) 244
14.8 Kernel Density Estimation (Section 4.4.4) 245
14.9 Histograms (Section 4.4.5) 248
14.10 The Naive Bayes Classifier (Section 4.4.6) 253
14.11 k-Nearest-Neighbor (Section 4.5.1) 255
14.12 Learning Vector Quantization (Section 4.5.4) 257
14.13 Logistic Regression (Section 4.6) 259
14.14 Neural Networks (Section 4.7) 260
14.15 Classification Trees (Section 4.8) 263

Description:
An introduction to machine learning that includes the fundamental techniques, methods, and applications. Machine Learning: a Concise Introduction offers a comprehensive introduction to the core concepts, approaches, and applications of machine learning. The author—an expert in the field—presents

