
Introduction to Semi-supervised Learning (Synthesis Lectures on Artificial Intelligence and Machine Learning) PDF

130 Pages·2009·1.12 MB·English

Preview Introduction to Semi-supervised Learning (Synthesis Lectures on Artificial Intelligence and Machine Learning)

Introduction to Semi-Supervised Learning

Synthesis Lectures on Artificial Intelligence and Machine Learning

Editors
Ronald J. Brachman, Yahoo! Research
Thomas Dietterich, Oregon State University

Introduction to Semi-Supervised Learning
Xiaojin Zhu and Andrew B. Goldberg
2009

Action Programming Languages
Michael Thielscher
2008

Representation Discovery using Harmonic Analysis
Sridhar Mahadevan
2008

Essentials of Game Theory: A Concise Multidisciplinary Introduction
Kevin Leyton-Brown, Yoav Shoham
2008

A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence
Nikos Vlassis
2007

Intelligent Autonomous Robotics: A Robot Soccer Case Study
Peter Stone
2007

Copyright © 2009 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopy, recording, or any other, except for brief quotations in printed reviews, without the prior permission of the publisher.

Introduction to Semi-Supervised Learning
Xiaojin Zhu and Andrew B. Goldberg
www.morganclaypool.com

ISBN: 9781598295474 paperback
ISBN: 9781598295481 ebook
DOI 10.2200/S00196ED1V01Y200906AIM006

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
Lecture #6
Series Editors: Ronald J. Brachman, Yahoo! Research; Thomas Dietterich, Oregon State University
Series ISSN
Synthesis Lectures on Artificial Intelligence and Machine Learning
Print 1939-4608 Electronic 1939-4616

Introduction to Semi-Supervised Learning
Xiaojin Zhu and Andrew B. Goldberg
University of Wisconsin, Madison
SYNTHESIS LECTURES ON ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING #6
Morgan & Claypool Publishers

ABSTRACT

Semi-supervised learning is a learning paradigm concerned with the study of how computers and natural systems such as humans learn in the presence of both labeled and unlabeled data. Traditionally, learning has been studied either in the unsupervised paradigm (e.g., clustering, outlier detection) where all the data is unlabeled, or in the supervised paradigm (e.g., classification, regression) where all the data is labeled. The goal of semi-supervised learning is to understand how combining labeled and unlabeled data may change the learning behavior, and design algorithms that take advantage of such a combination. Semi-supervised learning is of great interest in machine learning and data mining because it can use readily available unlabeled data to improve supervised learning tasks when the labeled data is scarce or expensive. Semi-supervised learning also shows potential as a quantitative tool to understand human category learning, where most of the input is self-evidently unlabeled.

In this introductory book, we present some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi-supervised support vector machines. For each model, we discuss its basic mathematical formulation. The success of semi-supervised learning depends critically on some underlying assumptions. We emphasize the assumptions made by each model and give counterexamples when appropriate to demonstrate the limitations of the different models. In addition, we discuss semi-supervised learning for cognitive psychology. Finally, we give a computational learning theoretic perspective on semi-supervised learning, and we conclude the book with a brief discussion of open questions in the field.

KEYWORDS

semi-supervised learning, transductive learning, self-training, Gaussian mixture model, expectation maximization (EM), cluster-then-label, co-training, multiview learning, mincut, harmonic function, label propagation, manifold regularization, semi-supervised support vector machines (S3VM), transductive support vector machines (TSVM), entropy regularization, human semi-supervised learning

To our parents
Yu and Jingquan
Susan and Steven Goldberg
with much love and gratitude.
Contents

Preface

1 Introduction to Statistical Machine Learning
  1.1 The Data
  1.2 Unsupervised Learning
  1.3 Supervised Learning

2 Overview of Semi-Supervised Learning
  2.1 Learning from Both Labeled and Unlabeled Data
  2.2 How is Semi-Supervised Learning Possible?
  2.3 Inductive vs. Transductive Semi-Supervised Learning
  2.4 Caveats
  2.5 Self-Training Models

3 Mixture Models and EM
  3.1 Mixture Models for Supervised Classification
  3.2 Mixture Models for Semi-Supervised Classification
  3.3 Optimization with the EM Algorithm*
  3.4 The Assumptions of Mixture Models
  3.5 Other Issues in Generative Models
  3.6 Cluster-then-Label Methods

4 Co-Training
  4.1 Two Views of an Instance
  4.2 Co-Training
  4.3 The Assumptions of Co-Training
  4.4 Multiview Learning*

5 Graph-Based Semi-Supervised Learning
  5.1 Unlabeled Data as Stepping Stones
  5.2 The Graph
  5.3 Mincut
  5.4 Harmonic Function
  5.5 Manifold Regularization*
  5.6 The Assumption of Graph-Based Methods*

6 Semi-Supervised Support Vector Machines
  6.1 Support Vector Machines
  6.2 Semi-Supervised Support Vector Machines*
  6.3 Entropy Regularization*
  6.4 The Assumption of S3VMs and Entropy Regularization

7 Human Semi-Supervised Learning
  7.1 From Machine Learning to Cognitive Science
  7.2 Study One: Humans Learn from Unlabeled Test Data
  7.3 Study Two: Presence of Human Semi-Supervised Learning in a Simple Task
  7.4 Study Three: Absence of Human Semi-Supervised Learning in a Complex Task
  7.5 Discussions

8 Theory and Outlook
  8.1 A Simple PAC Bound for Supervised Learning*
  8.2 A Simple PAC Bound for Semi-Supervised Learning*
  8.3 Future Directions of Semi-Supervised Learning

A Basic Mathematical Reference
B Semi-Supervised Learning Software
C Symbols

Biography
