ebook img

Automated Machine Learning: Methods, Systems, Challenges PDF

222 Pages·2019·6.54 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Automated Machine Learning: Methods, Systems, Challenges

The Springer Series on Challenges in Machine Learning Frank Hutter Lars Kotthoff Joaquin Vanschoren Editors Automated Machine Learning Methods, Systems, Challenges The Springer Series on Challenges in Machine Learning Serieseditors HugoJairEscalante,AstrofisicaOpticayElectronica,INAOE,Puebla,Mexico IsabelleGuyon,ChaLearn,Berkeley,CA,USA SergioEscalera ,UniversityofBarcelona,Barcelona,Spain Thebooksinthisinnovativeseriescollectpaperswritteninthecontextofsuccessful competitions in machine learning. They also include analyses of the challenges, tutorial material, dataset descriptions, and pointersto data and software. Together with the websites of the challenge competitions, they offer a complete teaching toolkitandavaluableresourceforengineersandscientists. Moreinformationaboutthisseriesathttp://www.springer.com/series/15602 Frank Hutter • Lars Kotthoff • Joaquin Vanschoren Editors Automated Machine Learning Methods, Systems, Challenges 123 Editors FrankHutter LarsKotthoff DepartmentofComputerScience UniversityofWyoming UniversityofFreiburg Laramie,WY,USA Freiburg,Germany JoaquinVanschoren EindhovenUniversityofTechnology Eindhoven,TheNetherlands ISSN2520-131X ISSN2520-1328 (electronic) TheSpringerSeriesonChallengesinMachineLearning ISBN978-3-030-05317-8 ISBN978-3-030-05318-5 (eBook) https://doi.org/10.1007/978-3-030-05318-5 ©TheEditor(s)(ifapplicable)andTheAuthor(s)2019.Thisbookisanopenaccesspublication. Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicateifchangesweremade. Theimages or other third party material in this book are included in the book’s Creative Commons licence,unlessindicatedotherwiseinacreditlinetothematerial.Ifmaterialisnotincludedinthebook’s CreativeCommonslicenceandyourintendeduseisnotpermittedbystatutoryregulationorexceedsthe permitteduse,youwillneedtoobtainpermissiondirectlyfromthecopyrightholder. Theuseofgeneraldescriptivenames,registerednames,trademarks,servicemarks,etc.inthispublication doesnotimply,evenintheabsenceofaspecificstatement,thatsuchnamesareexemptfromtherelevant protectivelawsandregulationsandthereforefreeforgeneraluse. Thepublisher,theauthors,andtheeditorsaresafetoassumethattheadviceandinformationinthisbook arebelievedtobetrueandaccurateatthedateofpublication.Neitherthepublishernortheauthorsor theeditorsgiveawarranty,expressorimplied,withrespecttothematerialcontainedhereinorforany errorsoromissionsthatmayhavebeenmade.Thepublisherremainsneutralwithregardtojurisdictional claimsinpublishedmapsandinstitutionalaffiliations. ThisSpringerimprintispublishedbytheregisteredcompanySpringerNatureSwitzerlandAG. Theregisteredcompanyaddressis:Gewerbestrasse11,6330Cham,Switzerland ToSophiaandTashia.–F.H. ToKobe,Elias,Ada,andVeerle.–J.V. TotheAutoMLcommunity,forbeingawesome.–F.H.,L.K.,andJ.V. Foreword “I’dliketousemachinelearning,butIcan’tinvestmuchtime.”Thatissomething you hear all too often in industry and from researchers in other disciplines. The resulting demand for hands-free solutions to machine learning has recently given rise to the field of automatedmachine learning (AutoML), and I’m delighted that withthisbook,thereisnowthefirstcomprehensiveguidetothisfield. I have been very passionate about automating machine learning myself ever since our Automatic Statistician project started back in 2014. I want us to be really ambitious in this endeavor; we should try to automate all aspects of the entire machine learning and data analysis pipeline. This includes automatingdata collectionandexperimentdesign;automatingdatacleanupandmissingdataimputa- tion;automatingfeatureselectionandtransformation;automatingmodeldiscovery, criticism, and explanation; automating the allocation of computational resources; automating hyperparameter optimization; automating inference; and automating model monitoring and anomaly detection. This is a huge list of things, and we’d optimallyliketoautomateallofit. There is a caveat of course. While full automation can motivate scientific researchandprovidealong-termengineeringgoal,inpractice,weprobablywantto semiautomatemostoftheseandgraduallyremovethehumanintheloopasneeded. Along the way, whatis going to happenif we try to do all this automationis that wearelikelytodeveloppowerfultoolsthatwillhelpmakethepracticeofmachine learning, first of all, more systematic (since it’s very ad hoc these days) and also moreefficient. Theseareworthygoalsevenifwedidnotsucceedinthefinalgoalofautomation, butasthisbookdemonstrates,currentAutoMLmethodscanalreadysurpasshuman machinelearningexpertsinseveraltasks.Thistrendislikelyonlygoingtointensify aswe’remakingprogressandascomputationbecomesevercheaper,andAutoML is therefore clearly one of the topics that is here to stay. It is a great time to get involvedinAutoML,andthisbookisanexcellentstartingpoint. Thisbookincludesveryup-to-dateoverviewsofthebread-and-buttertechniques we need in AutoML (hyperparameter optimization, meta-learning, and neural architecturesearch),providesin-depthdiscussionsofexistingAutoMLsystems,and vii viii Foreword thoroughlyevaluatesthestateoftheartinAutoMLinaseriesofcompetitionsthat ran since 2015. As such, I highly recommend this book to any machine learning researcher wanting to get started in the field and to any practitioner looking to understandthemethodsbehindalltheAutoMLtoolsoutthere. SanFrancisco,USA ZoubinGhahramani Professor,UniversityofCambridgeand ChiefScientist,Uber October2018 Preface The past decade has seen an explosion of machine learning research and appli- cations; especially, deep learning methods have enabled key advances in many applicationdomains,suchascomputervision,speechprocessing,andgameplaying. However, the performance of many machine learning methods is very sensitive to a plethora of design decisions, which constitutes a considerable barrier for new users. This is particularly true in the booming field of deep learning, where humanengineersneedto select the rightneuralarchitectures,trainingprocedures, regularizationmethods,andhyperparametersofallofthesecomponentsinorderto maketheirnetworksdowhattheyaresupposedtodowithsufficientperformance. This process has to be repeated for every application. Even experts are often left withtediousepisodesoftrialanderroruntiltheyidentifyagoodsetofchoicesfor aparticulardataset. Thefieldofautomatedmachinelearning(AutoML)aimstomakethesedecisions in a data-driven, objective, and automated way: the user simply provides data, andtheAutoMLsystemautomaticallydeterminestheapproachthatperformsbest for this particular application. Thereby, AutoML makes state-of-the-art machine learningapproachesaccessibletodomainscientistswhoareinterestedinapplying machine learning but do not have the resources to learn about the technologies behinditindetail.Thiscanbeseenasademocratizationofmachinelearning:with AutoML,customizedstate-of-the-artmachinelearningisateveryone’sfingertips. As we show in this book, AutoML approaches are already mature enough to rivalandsometimesevenoutperformhumanmachinelearningexperts.Putsimply, AutoML can lead to improved performance while saving substantial amounts of time and money,as machinelearningexpertsare both hardto find andexpensive. Asaresult,commercialinterestinAutoMLhasgrowndramaticallyinrecentyears, andseveralmajortechcompaniesarenowdevelopingtheirownAutoMLsystems. Wenote,though,thatthepurposeofdemocratizingmachinelearningisservedmuch betterbyopen-sourceAutoMLsystemsthanbyproprietarypaidblack-boxservices. This book presents an overview of the fast-moving field of AutoML. Due to the community’s current focus on deep learning, some researchers nowadays mistakenly equate AutoML with the topic of neural architecture search (NAS); ix x Preface but of course, if you’re reading this book, you know that – while NAS is an excellent example of AutoML – there is a lot more to AutoML than NAS. This book is intended to provide some background and starting points for researchers interestedindevelopingtheirownAutoMLapproaches,highlightavailablesystems for practitioners who want to apply AutoML to their problems, and provide an overview of the state of the art to researchers already working in AutoML. The bookisdividedintothreepartsonthesedifferentaspectsofAutoML. Part I presents an overview of AutoML methods. This part gives both a solid overviewfornovicesandservesasareferencetoexperiencedAutoMLresearchers. Chap.1discussestheproblemofhyperparameteroptimization,thesimplestand most common problemthat AutoML considers, and describes the wide variety of differentapproachesthatareapplied,withaparticularfocusonthemethodsthatare currentlymostefficient. Chap.2showshowtolearntolearn,i.e.,howtouseexperiencefromevaluating machinelearningmodelsto informhow to approachnew learningtasks with new data. Such techniques mimic the processes going on as a human transitions from a machine learning novice to an expert and can tremendously decrease the time requiredtogetgoodperformanceoncompletelynewmachinelearningtasks. Chap.3providesacomprehensiveoverviewofmethodsforNAS.Thisisoneof themostchallengingtasksinAutoML,sincethedesignspaceisextremelylargeand asingleevaluationofaneuralnetworkcantakeaverylongtime.Nevertheless,the areaisveryactive,andnewexcitingapproachesforsolvingNASappearregularly. PartIIfocusesonactualAutoMLsystemsthatevennoviceuserscanuse.Ifyou aremostinterestedinapplyingAutoMLtoyourmachinelearningproblems,thisis thepartyoushouldstartwith.Allofthe chaptersinthispartevaluatethe systems theypresenttoprovideanideaoftheirperformanceinpractice. Chap. 4 describes Auto-WEKA, one of the first AutoML systems. It is based on the well-known WEKA machine learning toolkit and searches over different classification and regression methods, their hyperparameter settings, and data preprocessing methods. All of this is available through WEKA’s graphical user interfaceattheclickofabutton,withouttheneedforasinglelineofcode. Chap. 5 givesan overview of Hyperopt-Sklearn,an AutoML framework based on the popularscikit-learn framework.Italso includesseveralhands-onexamples forhowtousesystem. Chap. 6 describes Auto-sklearn, which is also based on scikit-learn. It applies similar optimization techniques as Auto-WEKA and adds several improvements over other systems at the time, such as meta-learning for warmstarting the opti- mization and automatic ensembling. The chapter compares the performance of Auto-sklearntothatofthetwosystemsinthepreviouschapters,Auto-WEKAand Hyperopt-Sklearn.In two different versions, Auto-sklearn is the system that won thechallengesdescribedinPartIIIofthisbook. Chap. 7 gives an overview of Auto-Net, a system for automated deep learning thatselectsboththearchitectureandthehyperparametersofdeepneuralnetworks. AnearlyversionofAuto-Netproducedthefirstautomaticallytunedneuralnetwork thatwonagainsthumanexpertsinacompetitionsetting.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.