Data Science and Machine Learning: Mathematical and Statistical Methods (PDF)

536 pages · 2020 · 14.575 MB · English

Chapman & Hall/CRC Machine Learning & Pattern Recognition Series

- Introduction to Machine Learning with Applications in Information Security, Mark Stamp
- A First Course in Machine Learning, Simon Rogers, Mark Girolami
- Statistical Reinforcement Learning: Modern Machine Learning Approaches, Masashi Sugiyama
- Sparse Modeling: Theory, Algorithms, and Applications, Irina Rish, Genady Grabarnik
- Computational Trust Models and Machine Learning, Xin Liu, Anwitaman Datta, Ee-Peng Lim
- Regularization, Optimization, Kernels, and Support Vector Machines, Johan A.K. Suykens, Marco Signoretto, Andreas Argyriou
- Machine Learning: An Algorithmic Perspective, Second Edition, Stephen Marsland
- Bayesian Programming, Pierre Bessiere, Emmanuel Mazer, Juan Manuel Ahuactzin, Kamel Mekhnacha
- Multilinear Subspace Learning: Dimensionality Reduction of Multidimensional Data, Haiping Lu, Konstantinos N. Plataniotis, Anastasios Venetsanopoulos
- Data Science and Machine Learning: Mathematical and Statistical Methods, Dirk P. Kroese, Zdravko I. Botev, Thomas Taimre, Radislav Vaisman

For more information on this series please visit: https://www.crcpress.com/Chapman--HallCRC-Machine-Learning--Pattern-Recognition/book-series/erie

Data Science and Machine Learning: Mathematical and Statistical Methods
Dirk P. Kroese, Zdravko I. Botev, Thomas Taimre, Radislav Vaisman

Front cover image reproduced with permission from J. A. Kroese.

CRC Press, Taylor & Francis Group, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

© 2020 by Taylor & Francis Group, LLC. CRC Press is an imprint of Taylor & Francis Group, an Informa business. No claim to original U.S. Government works. Printed on acid-free paper.

International Standard Book Number-13: 978-1-138-49253-0 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged, please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

To my wife and daughters: Lesley, Elise, and Jessica — DPK
To Sarah, Sofia, and my parents — ZIB
To my grandparents: Arno, Harry, Juta, and Maila — TT
To Valerie — RV

CONTENTS

Preface xiii
Notation xvii

1 Importing, Summarizing, and Visualizing Data 1
1.1 Introduction 1
1.2 Structuring Features According to Type 3
1.3 Summary Tables 6
1.4 Summary Statistics 7
1.5 Visualizing Data 8
1.5.1 Plotting Qualitative Variables 9
1.5.2 Plotting Quantitative Variables 9
1.5.3 Data Visualization in a Bivariate Setting 12
Exercises 15

2 Statistical Learning 19
2.1 Introduction 19
2.2 Supervised and Unsupervised Learning 20
2.3 Training and Test Loss 23
2.4 Tradeoffs in Statistical Learning 31
2.5 Estimating Risk 35
2.5.1 In-Sample Risk 35
2.5.2 Cross-Validation 37
2.6 Modeling Data 40
2.7 Multivariate Normal Models 44
2.8 Normal Linear Models 46
2.9 Bayesian Learning 47
Exercises 58

3 Monte Carlo Methods 67
3.1 Introduction 67
3.2 Monte Carlo Sampling 68
3.2.1 Generating Random Numbers 68
3.2.2 Simulating Random Variables 69
3.2.3 Simulating Random Vectors and Processes 74
3.2.4 Resampling 76
3.2.5 Markov Chain Monte Carlo 78
3.3 Monte Carlo Estimation 85
3.3.1 Crude Monte Carlo 85
3.3.2 Bootstrap Method 88
3.3.3 Variance Reduction 92
3.4 Monte Carlo for Optimization 96
3.4.1 Simulated Annealing 96
3.4.2 Cross-Entropy Method 100
3.4.3 Splitting for Optimization 103
3.4.4 Noisy Optimization 105
Exercises 113

4 Unsupervised Learning 121
4.1 Introduction 121
4.2 Risk and Loss in Unsupervised Learning 122
4.3 Expectation–Maximization (EM) Algorithm 128
4.4 Empirical Distribution and Density Estimation 131
4.5 Clustering via Mixture Models 135
4.5.1 Mixture Models 135
4.5.2 EM Algorithm for Mixture Models 137
4.6 Clustering via Vector Quantization 142
4.6.1 K-Means 144
4.6.2 Clustering via Continuous Multiextremal Optimization 146
4.7 Hierarchical Clustering 147
4.8 Principal Component Analysis (PCA) 153
4.8.1 Motivation: Principal Axes of an Ellipsoid 153
4.8.2 PCA and Singular Value Decomposition (SVD) 155
Exercises 160

5 Regression 167
5.1 Introduction 167
5.2 Linear Regression 169
5.3 Analysis via Linear Models 171
5.3.1 Parameter Estimation 171
5.3.2 Model Selection and Prediction 172
5.3.3 Cross-Validation and Predictive Residual Sum of Squares 173
5.3.4 In-Sample Risk and Akaike Information Criterion 175
5.3.5 Categorical Features 177
5.3.6 Nested Models 180
5.3.7 Coefficient of Determination 181
5.4 Inference for Normal Linear Models 182
5.4.1 Comparing Two Normal Linear Models 183
5.4.2 Confidence and Prediction Intervals 186
5.5 Nonlinear Regression Models 188
5.6 Linear Models in Python 191
5.6.1 Modeling 191
5.6.2 Analysis 193
5.6.3 Analysis of Variance (ANOVA) 195
5.6.4 Confidence and Prediction Intervals 198
5.6.5 Model Validation 198
5.6.6 Variable Selection 199
5.7 Generalized Linear Models 204
Exercises 207

6 Regularization and Kernel Methods 215
6.1 Introduction 215
6.2 Regularization 216
6.3 Reproducing Kernel Hilbert Spaces 222
6.4 Construction of Reproducing Kernels 224
6.4.1 Reproducing Kernels via Feature Mapping 224
6.4.2 Kernels from Characteristic Functions 225
6.4.3 Reproducing Kernels Using Orthonormal Features 227
6.4.4 Kernels from Kernels 229
6.5 Representer Theorem 230
6.6 Smoothing Cubic Splines 235
6.7 Gaussian Process Regression 238
6.8 Kernel PCA 242
Exercises 245

7 Classification 251
7.1 Introduction 251
7.2 Classification Metrics 253
7.3 Classification via Bayes' Rule 257
7.4 Linear and Quadratic Discriminant Analysis 259
7.5 Logistic Regression and Softmax Classification 266
7.6 K-Nearest Neighbors Classification 268
7.7 Support Vector Machine 269
7.8 Classification with Scikit-Learn 277
Exercises 279

8 Decision Trees and Ensemble Methods 287
8.1 Introduction 287
8.2 Top-Down Construction of Decision Trees 289
8.2.1 Regional Prediction Functions 290
8.2.2 Splitting Rules 291
8.2.3 Termination Criterion 292
8.2.4 Basic Implementation 294
8.3 Additional Considerations 298
8.3.1 Binary Versus Non-Binary Trees 298
8.3.2 Data Preprocessing 298
8.3.3 Alternative Splitting Rules 298
8.3.4 Categorical Variables 299
8.3.5 Missing Values 299
8.4 Controlling the Tree Shape 300
8.4.1 Cost-Complexity Pruning 303
