Machine Learning: A Bayesian and Optimization Perspective

1146 Pages·2020·16.764 MB·English

Preview Machine learning. A Bayesian and optimization perspective

Machine Learning
A Bayesian and Optimization Perspective
2nd Edition

Sergios Theodoridis
Department of Informatics and Telecommunications
National and Kapodistrian University of Athens
Athens, Greece
Shenzhen Research Institute of Big Data
The Chinese University of Hong Kong
Shenzhen, China

Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

Copyright © 2020 Elsevier Ltd. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-818803-3

For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Mara Conner
Acquisitions Editor: Tim Pitts
Editorial Project Manager: Charlotte Rowley
Production Project Manager: Paul Prasad Chandramohan
Designer: Greg Harris
Typeset by VTeX

στο Δεσποινάκι
For Everything
All These Years

Contents

About the Author
Preface
Acknowledgments
Notation

CHAPTER 1 Introduction
1.1 The Historical Context
1.2 Artificial Intelligence and Machine Learning
1.3 Algorithms Can Learn What Is Hidden in the Data
1.4 Typical Applications of Machine Learning
    Speech Recognition
    Computer Vision
    Multimodal Data
    Natural Language Processing
    Robotics
    Autonomous Cars
    Challenges for the Future
1.5 Machine Learning: Major Directions
    1.5.1 Supervised Learning
1.6 Unsupervised and Semisupervised Learning
1.7 Structure and a Road Map of the Book
References

CHAPTER 2 Probability and Stochastic Processes
2.1 Introduction
2.2 Probability and Random Variables
    2.2.1 Probability
    2.2.2 Discrete Random Variables
    2.2.3 Continuous Random Variables
    2.2.4 Mean and Variance
    2.2.5 Transformation of Random Variables
2.3 Examples of Distributions
    2.3.1 Discrete Variables
    2.3.2 Continuous Variables
2.4 Stochastic Processes
    2.4.1 First- and Second-Order Statistics
    2.4.2 Stationarity and Ergodicity
    2.4.3 Power Spectral Density
    2.4.4 Autoregressive Models
2.5 Information Theory
    2.5.1 Discrete Random Variables
    2.5.2 Continuous Random Variables
2.6 Stochastic Convergence
    Convergence Everywhere
    Convergence Almost Everywhere
    Convergence in the Mean-Square Sense
    Convergence in Probability
    Convergence in Distribution
Problems
References

CHAPTER 3 Learning in Parametric Modeling: Basic Concepts and Directions
3.1 Introduction
3.2 Parameter Estimation: the Deterministic Point of View
3.3 Linear Regression
3.4 Classification
    Generative Versus Discriminative Learning
3.5 Biased Versus Unbiased Estimation
    3.5.1 Biased or Unbiased Estimation?
3.6 The Cramér–Rao Lower Bound
3.7 Sufficient Statistic
3.8 Regularization
    Inverse Problems: Ill-Conditioning and Overfitting
3.9 The Bias–Variance Dilemma
    3.9.1 Mean-Square Error Estimation
    3.9.2 Bias–Variance Tradeoff
3.10 Maximum Likelihood Method
    3.10.1 Linear Regression: the Nonwhite Gaussian Noise Case
3.11 Bayesian Inference
    3.11.1 The Maximum a Posteriori Probability Estimation Method
3.12 Curse of Dimensionality
3.13 Validation
    Cross-Validation
3.14 Expected Loss and Empirical Risk Functions
    Learnability
3.15 Nonparametric Modeling and Estimation
Problems
    MATLAB® Exercises
References

CHAPTER 4 Mean-Square Error Linear Estimation
4.1 Introduction
4.2 Mean-Square Error Linear Estimation: the Normal Equations
    4.2.1 The Cost Function Surface
4.3 A Geometric Viewpoint: Orthogonality Condition
4.4 Extension to Complex-Valued Variables
    4.4.1 Widely Linear Complex-Valued Estimation
    4.4.2 Optimizing With Respect to Complex-Valued Variables: Wirtinger Calculus
4.5 Linear Filtering
4.6 MSE Linear Filtering: a Frequency Domain Point of View
    Deconvolution: Image Deblurring
4.7 Some Typical Applications
    4.7.1 Interference Cancelation
    4.7.2 System Identification
    4.7.3 Deconvolution: Channel Equalization
4.8 Algorithmic Aspects: the Levinson and Lattice-Ladder Algorithms
    Forward and Backward MSE Optimal Predictors
    4.8.1 The Lattice-Ladder Scheme
4.9 Mean-Square Error Estimation of Linear Models
    4.9.1 The Gauss–Markov Theorem
    4.9.2 Constrained Linear Estimation: the Beamforming Case
4.10 Time-Varying Statistics: Kalman Filtering
Problems
    MATLAB® Exercises
References

CHAPTER 5 Online Learning: the Stochastic Gradient Descent Family of Algorithms
5.1 Introduction
5.2 The Steepest Descent Method
5.3 Application to the Mean-Square Error Cost Function
    Time-Varying Step Sizes
    5.3.1 The Complex-Valued Case
5.4 Stochastic Approximation
    Application to the MSE Linear Estimation
5.5 The Least-Mean-Squares Adaptive Algorithm
    5.5.1 Convergence and Steady-State Performance of the LMS in Stationary Environments
    5.5.2 Cumulative Loss Bounds
5.6 The Affine Projection Algorithm
    Geometric Interpretation of APA
    Orthogonal Projections
    5.6.1 The Normalized LMS
5.7 The Complex-Valued Case
    The Widely Linear LMS
    The Widely Linear APA
5.8 Relatives of the LMS
    The Sign-Error LMS
    The Least-Mean-Fourth (LMF) Algorithm
    Transform-Domain LMS
5.9 Simulation Examples
5.10 Adaptive Decision Feedback Equalization
5.11 The Linearly Constrained LMS
5.12 Tracking Performance of the LMS in Nonstationary Environments
5.13 Distributed Learning: the Distributed LMS
    5.13.1 Cooperation Strategies
    5.13.2 The Diffusion LMS
    5.13.3 Convergence and Steady-State Performance: Some Highlights
    5.13.4 Consensus-Based Distributed Schemes
5.14 A Case Study: Target Localization
5.15 Some Concluding Remarks: Consensus Matrix
Problems
    MATLAB® Exercises
References

CHAPTER 6 The Least-Squares Family
6.1 Introduction
6.2 Least-Squares Linear Regression: a Geometric Perspective
6.3 Statistical Properties of the LS Estimator
    The LS Estimator Is Unbiased
    Covariance Matrix of the LS Estimator
    The LS Estimator Is BLUE in the Presence of White Noise
    The LS Estimator Achieves the Cramér–Rao Bound for White Gaussian Noise
    Asymptotic Distribution of the LS Estimator
6.4 Orthogonalizing the Column Space of the Input Matrix: the SVD Method
    Pseudoinverse Matrix and SVD
6.5 Ridge Regression: a Geometric Point of View
    Principal Components Regression
6.6 The Recursive Least-Squares Algorithm
    Time-Iterative Computations
    Time Updating of the Parameters
6.7 Newton’s Iterative Minimization Method
    6.7.1 RLS and Newton’s Method
6.8 Steady-State Performance of the RLS
6.9 Complex-Valued Data: the Widely Linear RLS
6.10 Computational Aspects of the LS Solution
    Cholesky Factorization
    QR Factorization
    Fast RLS Versions
6.11 The Coordinate and Cyclic Coordinate Descent Methods
6.12 Simulation Examples
6.13 Total Least-Squares
    Geometric Interpretation of the Total Least-Squares Method
Problems
