Learning From Data PDF

215 Pages·2015·4.98 MB·English
by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin

Preview Learning From Data

LEARNING FROM DATA

The book website AMLbook.com contains supporting material for instructors and readers.

LEARNING FROM DATA
A SHORT COURSE

Yaser S. Abu-Mostafa, California Institute of Technology
Malik Magdon-Ismail, Rensselaer Polytechnic Institute
Hsuan-Tien Lin, National Taiwan University

AMLbook.com

Yaser S. Abu-Mostafa
Departments of Electrical Engineering and Computer Science
California Institute of Technology
Pasadena, CA 91125, USA
yaser@caltech.edu

Malik Magdon-Ismail
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, NY 12180, USA
[email protected]

Hsuan-Tien Lin
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, 106, Taiwan
htlin@csie.ntu.edu.tw

ISBN 10: 1-60049-006-9
ISBN 13: 978-1-60049-006-4

©2012 Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. 1.10

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the authors. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means - electronic, mechanical, photocopying, scanning, or otherwise - without prior written permission of the authors, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act.

Limit of Liability/Disclaimer of Warranty: While the authors have used their best efforts in preparing this book, they make no representation or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose.
No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. The authors shall not be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

The use in this publication of tradenames, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

This book was typeset by the authors and was printed and bound in the United States of America.

To our teachers, and to our students

Preface

This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title 'learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover.

Learning from data has distinct theoretical and practical tracks. If you read two books that focus on one track or the other, you may feel that you are reading about two different subjects altogether. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. Strengths and weaknesses of the different parts are spelled out. Our philosophy is to say it like it is: what we know, what we don't know, and what we partially know.
The book can be taught in exactly the order it is presented. The notable exception may be Chapter 2, which is the most theoretical chapter of the book. The theory of generalization that this chapter covers is central to learning from data, and we made an effort to make it accessible to a wide readership. However, instructors who are more interested in the practical side may skim over it, or delay it until after the practical methods of Chapter 3 are taught.

You will notice that we included exercises (in gray boxes) throughout the text. The main purpose of these exercises is to engage the reader and enhance understanding of a particular topic being covered. Our reason for separating the exercises out is that they are not crucial to the logical flow. Nevertheless, they contain useful information, and we strongly encourage you to read them, even if you don't do them to completion. Instructors may find some of the exercises appropriate as 'easy' homework problems, and we also provide additional problems of varying difficulty in the Problems section at the end of each chapter.

To help instructors with preparing their lectures based on the book, we provide supporting material on the book's website (AMLbook.com). There is also a forum that covers additional topics in learning from data. We will discuss these further in the Epilogue of this book.

Acknowledgment (in alphabetical order for each group): We would like to express our gratitude to the alumni of our Learning Systems Group at Caltech who gave us detailed expert feedback: Zehra Cataltepe, Ling Li, Amrit Pratap, and Joseph Sill. We thank the many students and colleagues who gave us useful feedback during the development of this book, especially Chun-Wei Liu. The Caltech Library staff, especially Kristin Buxton and David McCaslin, have given us excellent advice and help in our self-publishing effort. We also thank Lucinda Acosta for her help throughout the writing of this book.
Last, but not least, we would like to thank our families for their encouragement, their support, and most of all their patience as they endured the time demands that writing a book has imposed on us.

Yaser S. Abu-Mostafa, Pasadena, California.
Malik Magdon-Ismail, Troy, New York.
Hsuan-Tien Lin, Taipei, Taiwan.
March, 2012.

Contents

Preface

1 The Learning Problem
  1.1 Problem Setup
    1.1.1 Components of Learning
    1.1.2 A Simple Learning Model
    1.1.3 Learning versus Design
  1.2 Types of Learning
    1.2.1 Supervised Learning
    1.2.2 Reinforcement Learning
    1.2.3 Unsupervised Learning
    1.2.4 Other Views of Learning
  1.3 Is Learning Feasible?
    1.3.1 Outside the Data Set
    1.3.2 Probability to the Rescue
    1.3.3 Feasibility of Learning
  1.4 Error and Noise
    1.4.1 Error Measures
    1.4.2 Noisy Targets
  1.5 Problems

2 Training versus Testing
  2.1 Theory of Generalization
    2.1.1 Effective Number of Hypotheses
    2.1.2 Bounding the Growth Function
    2.1.3 The VC Dimension
    2.1.4 The VC Generalization Bound
  2.2 Interpreting the Generalization Bound
    2.2.1 Sample Complexity
    2.2.2 Penalty for Model Complexity
    2.2.3 The Test Set
    2.2.4 Other Target Types
  2.3 Approximation-Generalization Tradeoff

Description:
©2012 Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin. 1.10 Learning from data has distinct theoretical and practical tracks. If you.