Solving the Quasigroup Problem using Simulated Annealing
Samuel Amin

Quasigroup Problem Definition
!Given a partial assignment of colors, can the partial quasigroup be completed to obtain a full quasigroup?
!No color should be repeated in any row or column
!10 by 10 grid with 10 possible colors for each square

Simulated Annealing Algorithm
!An approach that resembles simple hill climbing, but occasionally a non-optimal step is taken to avoid local minima.
!The probability of taking a non-optimal step decreases over time.
!Function SIMULATED-ANNEALING(problem, schedule) returns a solution state
    current <- initial state of problem
    for t <- 1 to infinity do
        T <- schedule[t]
        if T = 0 then return current
        next <- randomly selected successor of current
        ΔE <- VALUE[next] - VALUE[current]
        if ΔE > 0 then current <- next
        else current <- next only with probability e^(ΔE/T)

Adjusting the Quasigroup Problem for Simulated Annealing
!Initial State
    !Set the predefined values on the grid and mark them as predefined. These squares will not be altered.
    !Randomly fill out the remaining squares on the grid while ensuring that there are exactly 10 instances of each color.
!To get the next state, randomly swap two squares on the grid that are not predefined
!Value of the grid is 100 - number of repeated squares (this search loop is sketched in code below)

Progress and Problems Faced
!Tweaking the schedule of T
!Local minima
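A minimal Java sketch of the annealing loop described above, specialized to the grid representation on these slides (swap two non-predefined squares; score = 100 minus repeated squares). Class and method names such as QuasigroupAnnealer are illustrative and not taken from the project; counting "repeated squares" is one reasonable interpretation, not necessarily the author's exact scoring code.

    import java.util.Random;

    // Illustrative sketch of the simulated-annealing loop described on the slides.
    // The grid is N x N with colors 0..N-1; cells marked predefined are never swapped.
    public class QuasigroupAnnealer {
        static final int N = 10;
        static final Random rng = new Random();

        // Value of the grid: 100 minus the number of squares whose color already
        // appears earlier in the same row or column (one simple way to count repeats).
        static int value(int[][] grid) {
            int repeats = 0;
            for (int r = 0; r < N; r++)
                for (int c = 0; c < N; c++)
                    for (int k = 0; k < c; k++)          // earlier in the same row
                        if (grid[r][k] == grid[r][c]) { repeats++; break; }
            for (int c = 0; c < N; c++)
                for (int r = 0; r < N; r++)
                    for (int k = 0; k < r; k++)          // earlier in the same column
                        if (grid[k][c] == grid[r][c]) { repeats++; break; }
            return 100 - repeats;
        }

        // One annealing run; schedule[t] plays the role of the temperature T.
        static int[][] anneal(int[][] grid, boolean[][] predefined, double[] schedule) {
            for (double T : schedule) {
                if (T <= 0) break;
                // Next state: swap two random squares that are not predefined.
                int r1, c1, r2, c2;
                do { r1 = rng.nextInt(N); c1 = rng.nextInt(N); } while (predefined[r1][c1]);
                do { r2 = rng.nextInt(N); c2 = rng.nextInt(N); } while (predefined[r2][c2]);
                int before = value(grid);
                int tmp = grid[r1][c1]; grid[r1][c1] = grid[r2][c2]; grid[r2][c2] = tmp;
                int deltaE = value(grid) - before;
                // Accept improvements; accept worse moves only with probability e^(deltaE/T).
                if (deltaE < 0 && rng.nextDouble() >= Math.exp(deltaE / T)) {
                    tmp = grid[r1][c1]; grid[r1][c1] = grid[r2][c2]; grid[r2][c2] = tmp; // undo
                }
            }
            return grid;
        }
    }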
Handwritten Character Recognition using Neural Networks
CSE 592 Project
Samer Arafeh

System Architecture
•Image (bitmap) object
    –16x16 bitmap scaling
    –I/O
•Neural network object
    –Training and learning
    –Recognition
•User interface
    –Hand-write characters
    –Controls learning rate
    –Save learned data

Neural Network
•Multi-layer: 3-layer neural network
    –256 input nodes (one node for each input pixel)
    –Variable number of hidden nodes (currently set to 25)
    –36 output nodes (0-9 and 'A' to 'Z')

Network Nodes Evaluation
•256 input nodes: 0.5 if the pixel is on, otherwise -0.5
•Hidden nodes and output nodes are calculated using the sigmoid threshold unit as:
    o = 1/(1 + e^(-net)), where net = ∑ w_i x_i (over all incoming edges)

Backpropagation Training (the update rules are sketched in code after the Demo slide)
•Hidden and output weights are initialized to random values between [-0.5, 0.5]
•For each output node, calculate the error term δ_k as: δ_k = (t_k - o_k)
•Back-propagate the error term to the hidden nodes: for each hidden node, calculate the error term δ_h as: δ_h = ∑ w_kh δ_k (over all hidden-node edges)
•For each hidden node, re-evaluate each of the output-node weight edges (w_o) as: w_o(new) = w_o(old) + (η δ_k h); h is the hidden node value, η is the learning rate
•For each input node, re-evaluate each of the hidden-node weight edges (w_h) as: w_h(new) = w_h(old) + (η δ_h x); x is the input node value, η is the learning rate

Recognition
•Run the evaluation algorithm again with the new set of weighted edges and find the output node with the largest value, which corresponds to the recognized character.

Demo
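A compact Java sketch of the evaluation and weight-update rules as they are written on these slides. Note the δ terms here omit the sigmoid-derivative factors o(1-o) used in textbook backpropagation; the sketch deliberately follows the slides' formulas. Array sizes match the slides; the class name CharNet and all field names are illustrative.

    import java.util.Random;

    // Sketch of the 256-25-36 network evaluation and the update rules as given on the slides.
    public class CharNet {
        static final int IN = 256, HID = 25, OUT = 36;
        double[][] wIH = new double[IN][HID];   // input -> hidden weights
        double[][] wHO = new double[HID][OUT];  // hidden -> output weights
        double[] hidden = new double[HID];      // last hidden activations

        CharNet() {
            Random r = new Random();
            for (double[] row : wIH) for (int j = 0; j < HID; j++) row[j] = r.nextDouble() - 0.5;
            for (double[] row : wHO) for (int k = 0; k < OUT; k++) row[k] = r.nextDouble() - 0.5;
        }

        static double sigmoid(double net) { return 1.0 / (1.0 + Math.exp(-net)); }

        // Forward pass: x[i] is 0.5 if the pixel is on, -0.5 otherwise.
        double[] evaluate(double[] x) {
            double[] out = new double[OUT];
            for (int h = 0; h < HID; h++) {
                double net = 0;
                for (int i = 0; i < IN; i++) net += wIH[i][h] * x[i];
                hidden[h] = sigmoid(net);
            }
            for (int k = 0; k < OUT; k++) {
                double net = 0;
                for (int h = 0; h < HID; h++) net += wHO[h][k] * hidden[h];
                out[k] = sigmoid(net);
            }
            return out;
        }

        // One training step with learning rate eta, following the slide formulas:
        // deltaK = t - o, deltaH = sum(wHO * deltaK), w += eta * delta * (input of the edge).
        void train(double[] x, double[] target, double eta) {
            double[] o = evaluate(x);
            double[] deltaK = new double[OUT];
            for (int k = 0; k < OUT; k++) deltaK[k] = target[k] - o[k];
            double[] deltaH = new double[HID];
            for (int h = 0; h < HID; h++)
                for (int k = 0; k < OUT; k++) deltaH[h] += wHO[h][k] * deltaK[k];
            for (int h = 0; h < HID; h++)
                for (int k = 0; k < OUT; k++) wHO[h][k] += eta * deltaK[k] * hidden[h];
            for (int i = 0; i < IN; i++)
                for (int h = 0; h < HID; h++) wIH[i][h] += eta * deltaH[h] * x[i];
        }
    }

Recognition then amounts to calling evaluate and taking the index of the largest output value.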
IBM's Robocode: an AI Playground
by Diana Bullion

Robocode
!IBM's RoboCode
!Virtual platform to test AI concepts
!Little tanks battle each other
!Each tank has a gun and radar
!Each tank is allotted the same resources (energy, ammunition)

Robots Battle
!Built 5 robots with different strategies
    –Diana's First … simple tutorial-like robot
    –BumperBot … brute-force tank
    –ThirdTimeCharmer … focused attack
    –TheGreatX … stays out of the way
    –MasterEvader … predicts aiming point
!Implement multiple robots with varying levels of intelligence
!Wanted to prove that intelligence and strategy win over brute force

BumperBot
!Basic robot; scans for other robots
!Bumps into them and repeatedly shoots
!Brute force - low intelligence
    –Does not predict where a robot will be
    –Does not stay focused on the closest robot when a different robot is scanned
!Results were surprising - the original objective was for the more intelligent robots to win against BumperBot

MasterEvader
!Advanced robot
!Evasive movements … random figure-eight-ish pattern
!Predicts the best path to fire a bullet … taking into account future speed and location of both target and source robots, time to turn the gun, and time for the bullet to travel
!Fire power relative to target distance

Results
!Survival – 50 pts for everyone that died before it
!Last Survivor – 10 pts for every robot in the battle
!Bullet Damage – 1 pt for each pt of inflicted damage
!Bullet Damage Bonus – 20% kill bonus of all the damage it did
!Ram Damage – 2 pts for every pt of ram damage
!Ram Damage Bonus – 30% kill bonus of all ram damage it did

The Rest
!ThirdTimeCharmer
    –Advanced robot
    –Maintains a focused attack
    –Standard movement pattern
!TheGreatX
    –Travels great distances
    –Rarely shoots
    –Lets others run out of energy
!Diana's First
    –My first robot … modified tutorial

Robocode Rules
!Environment loop
    –Robot code executed, time incremented, bullets move, robots move, robots scan
!Bullets
    –Bullet damage = 4*firepower (plus 2*(firepower-1) if firepower > 1)
    –Bullet speed = 20 - 3*firepower
    –Energy returned on hit = 3*firepower
!Robot collision = 0.6 damage each
!Advanced robots take a wall-collision penalty

Learning Go with TD(λ)
Todd Detwiler
CSE 592
Winter 2003

What is Go?
•One of the oldest and most popular board games in the world (around 4000 yrs old)
•A game of territory acquisition
•Deterministic, perfect-information, zero-sum, 2-player strategy game
•A "grand challenge" in AI (Rivest 1993)

The Rules
•Players alternate placing stones on open intersections of the board (a 19x19 grid)
•Adjacent stones form groups
•Empty intersections adjacent to a group form its liberties
•A group is captured when all of its liberties are removed (see the code sketch below)
•2 passes signify the end of the game
•Ko

Captures
[Figure: capture example] If white plays at the location indicated by the red circle, they will capture the black stone by removing its last liberty.

Why is Go so Hard?
•PSPACE-complete
    –Average branching factor of the game tree around 200
    –Size of the game tree on the order of 10^170 (compared to around 10^50 for Chess)
    –Too large for look-ahead evaluation
•No good evaluation function for game states
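The capture rule above (a group dies when its last liberty is removed) can be expressed as a flood fill over the group. Below is a small, illustrative Java sketch (not code from the project) that counts the liberties of the group containing a given stone on a 19x19 board.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Illustrative liberty counter for the capture rule described above.
    // board[r][c]: 0 = empty, 1 = black, 2 = white.
    public class GoLiberties {
        static final int SIZE = 19;

        // Counts the liberties (adjacent empty points) of the group containing (row, col).
        static int countLiberties(int[][] board, int row, int col) {
            int color = board[row][col];
            if (color == 0) return -1;                 // no stone here
            boolean[][] inGroup = new boolean[SIZE][SIZE];
            boolean[][] libertySeen = new boolean[SIZE][SIZE];
            Deque<int[]> stack = new ArrayDeque<>();
            stack.push(new int[]{row, col});
            inGroup[row][col] = true;
            int liberties = 0;
            int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
            while (!stack.isEmpty()) {
                int[] p = stack.pop();
                for (int[] d : dirs) {
                    int r = p[0] + d[0], c = p[1] + d[1];
                    if (r < 0 || r >= SIZE || c < 0 || c >= SIZE) continue;
                    if (board[r][c] == 0 && !libertySeen[r][c]) {
                        libertySeen[r][c] = true;      // empty neighbor = one liberty
                        liberties++;
                    } else if (board[r][c] == color && !inGroup[r][c]) {
                        inGroup[r][c] = true;          // same-color neighbor joins the group
                        stack.push(new int[]{r, c});
                    }
                }
            }
            return liberties;                          // the group is captured when this reaches 0
        }
    }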
TD(λ) Approach
•Learn an evaluation function
    –Use a neural network as a function estimator
•Temporal credit assignment
•Nonlinear TD/Backprop pseudo C-code
    –Allen Bonde Jr. and Richard Sutton
    –I have extended this to be an actual C++ object

The Pieces that I Started With
•OpenGo 5.1 beta
    –A set of pre-written Go objects as well as an environment for playing in
    –Very buggy, not as useful as I initially suspected

Player Design
•Like TD-Gammon, games (state sequences) are generated by pitting my Go player against itself
•Unlike TD-Gammon, I am using off-line learning
•Initially give the player rules only, no strategy
•Later augmented with one rudimentary extension to reduce plies/game

One Problem
•Super Ko

The Extension
•Don't fill in simple, size-1 eyes

Current Status
•Player
    –Identifies all legal moves
    –Plays against itself
    –Detects win
    –Black tracks game states for learning
•TD(λ) network is implemented, but not fully tested
    –Currently testing load/save functionality
•Learning has not yet been achieved

Questions?

Letter Recognition by Using Multi-Layer Neural Network
Meng Tat Fong
03/13/2003

Problem Domain
!Create a classifier to identify the 26 capital letters in the English alphabet
!Extensible
!Create electronic documents from scanned documents, newspapers, etc.

Data Set
!Donated by David Slate to the UCI machine learning repository
!20,000 samples
!Letter images from black-and-white displays
!20 different fonts, randomly distorted (all unique samples)
!16 integer attributes, normalized to 0.0 - 1.0
!26 output classes (A-Z), 750-800 samples each
!Example rows:
    2,4,4,3,2,7,8,2,9,11,7,7,1,8,5,6,Z
    4,7,5,5,5,5,9,6,4,8,7,9,2,9,7,10,P

Backgrounds
!Not using any existing machine learning libraries
!Java

Algorithms
!Separate the sample data set into two sets (~16,000 and ~4,000)
!Network is trained and then verified
!Stochastic gradient descent version of the Backpropagation algorithm
!Unit weight is updated after each sample
!Sigmoid units to learn non-linear functions
!Weight update: Wji = Wji + ΔWji, where ΔWji = µ Ej Xji
!Based on the idea that each unit is partially responsible for the error of its parent (unit J at level N, unit I at level N-1)

Network Topology
[Figure: fully connected network with an input layer of 16 units, a hidden layer of 45 units, and an output layer of 26 units (A, B, C, …, Z)]

Improvements
!Momentum – the nth weight update partially depends on the previous update (see the code sketch after the results)
    ΔWji(N) = µ Ej Xji + α ΔWji(N-1)
!Helps to escape local minima
!Moves along flat regions during the search
!Increased my network accuracy by 2.2% (75.1% to 77.3%) with momentum 0.58

Improvements
!Learn from mistakes
!Train the network with all the training samples once
!Feed the same samples into the network again, but only use the incorrectly classified samples
!Give the network chances to correct its mistakes
!Accuracy improved from 72.0% to 77.3%

Improvements
!Ensemble
!Use multiple networks to perform classification
!Each network will predict an outcome and the majority will win
!Improved the accuracy to >80%

Results
!Slate's Adaptive Classifiers (1990) – ~80%
!Weka's J48 decision tree – 87.75%
!Weka's Naive Bayes network – 64.23%
!Weka's neural network – no result after 10 hours
!My network – up to 85% (alpha 0.60, momentum 0.58, 1 hidden layer, 45 hidden units, >300,000 training examples)

Results
!Start small
!Build a small network to solve a simple problem (no hidden unit, one output class, trivial problem domain)
!Add more output classes
!Add more hidden layers
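A small Java sketch of the momentum-augmented update ΔWji(N) = µ Ej Xji + α ΔWji(N-1) described in the Improvements slides. The surrounding class and its field names are illustrative, not taken from the project.

    // Illustrative per-edge weight update with momentum, as in
    // deltaW(N) = mu * Ej * Xji + alpha * deltaW(N-1).
    public class MomentumUpdate {
        final double mu;            // learning rate
        final double alpha;         // momentum coefficient (e.g., 0.58 on the slides)
        double[][] w;               // weights w[j][i]
        double[][] prevDelta;       // previous update, one per weight

        MomentumUpdate(int outUnits, int inUnits, double mu, double alpha) {
            this.mu = mu;
            this.alpha = alpha;
            this.w = new double[outUnits][inUnits];
            this.prevDelta = new double[outUnits][inUnits];
        }

        // error[j] = Ej for unit j; x[i] = Xji, the input carried by edge (j, i).
        void update(double[] error, double[] x) {
            for (int j = 0; j < w.length; j++) {
                for (int i = 0; i < w[j].length; i++) {
                    double delta = mu * error[j] * x[i] + alpha * prevDelta[j][i];
                    w[j][i] += delta;
                    prevDelta[j][i] = delta;   // remembered for the next sample
                }
            }
        }
    }

Keeping the previous delta per weight is what lets the search coast across flat regions and out of shallow local minima, which is the 2.2% accuracy gain reported above.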
Results
!Hard to create a generic neural network
!Need to adjust the network topology, learning rates, momentum, etc.
!Once you have a working network, it will perform very well

Thank You!

Random Sampling in Mixtures of Bayes Nets
Manish Goyal

Basic Idea
•Bayesian networks serve as compact representations of data
•The data is represented in terms of conditional distributions
•Draw random samples from these conditional distributions to generate data which can then be used for a variety of purposes

Base System
•Random sampling has been applied to a problem relating to recognition of single characters
•The base system consists of a model for each character
•Models have been trained for each of the 99 supported characters
•The training set consists of approximately 200 samples of each character

Explanation of the Base System
•The model for each character consists of a mixture of Bayes nets (BN1, …, BNn) with weighting factors w1, …, wn
[Figure: mixture of networks BN1, BN2, BN3 with weights w1, w2, w3]

Explanation of the Bayesian Nets
•For each handwritten character we extract 64 features
•These features are a mixture of Fourier transforms, contour features, and OCR features; for the purposes of this talk the exact nature of these features is not important
•Each of these features is represented as a node in the graph; hence, given that there are 64 features, there are 64 nodes in each Bayesian network
•Each node is specified in terms of its conditional distribution, i.e. P(node | all the parents of the node)

Method of Sampling within Each Model (see the code sketch after the results)
•First, randomly select which Bayesian network to use. The Bayesian networks are selected with probability (w1, …, wn)
•Once the Bayesian network is selected, we need to generate an observation from the network
•For this we need to traverse the graph in the correct order; for example, in the figure the order of traversal would be 1, 2, …, n
•Each node is specified in terms of its conditional mean and covariance, given by Mean = M = µ + ∑ σp (Xp - µp) and Covariance = C
•As you traverse the graph, generate the observation for the particular node by sampling from a Gaussian distribution with Mean = M and Covariance = C
•Once the observations of all the parents are known, the conditional mean can be computed for that node, and hence an observation can be obtained for the node
•Iterate through this, generating as many samples as are required
[Figure: example graph with nodes 1-7 illustrating the traversal order]

Verification
•Use the generated data to train a feed-forward neural network (fully connected, 1 hidden layer)
•Compare the error rate of a net trained using the generated data to a net trained using the original data
•See if these two error rates are comparable

Results
•Original training set contains approx. 200 samples per code point
•Generated 200 and 500 samples for each code point using the random sampling method
•Test set used consists of 17,000 samples

Error on test set:
–NN trained using original data: 20.32%
–NN trained using generated data (200 samp/code pt): 24.30%
–NN trained using generated data (500 samp/code pt): 23.97%
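A condensed Java sketch of the sampling procedure described above: pick one Bayes net of the mixture with probability w_i, then walk its nodes in parent-before-child order, form the conditional mean M = µ + ∑ σp (Xp - µp), and draw from a Gaussian with mean M and variance C. All class and field names are illustrative, and per-node parameters are simplified to scalars rather than full covariance structure.

    import java.util.Random;

    // Illustrative mixture-of-Bayes-nets sampler following the slides.
    // Nodes are assumed to be indexed in a valid traversal order (parents before children).
    public class MixtureSampler {
        static final Random rng = new Random();

        static class Node {
            int[] parents;        // indices of parent nodes (already sampled)
            double mu;            // unconditional mean of this feature
            double[] sigma;       // one regression coefficient per parent
            double[] parentMu;    // parent means
            double c;             // conditional variance C
        }

        static class BayesNet { Node[] nodes; }

        // Select one network with probability (w1, ..., wn).
        static int pickNet(double[] w) {
            double u = rng.nextDouble(), acc = 0;
            for (int i = 0; i < w.length; i++) { acc += w[i]; if (u <= acc) return i; }
            return w.length - 1;
        }

        // Generate one 64-dimensional observation from the mixture.
        static double[] sample(BayesNet[] nets, double[] weights) {
            BayesNet net = nets[pickNet(weights)];
            double[] x = new double[net.nodes.length];
            for (int i = 0; i < net.nodes.length; i++) {
                Node n = net.nodes[i];
                double mean = n.mu;                       // M = mu + sum sigma_p * (X_p - mu_p)
                for (int p = 0; p < n.parents.length; p++)
                    mean += n.sigma[p] * (x[n.parents[p]] - n.parentMu[p]);
                x[i] = mean + Math.sqrt(n.c) * rng.nextGaussian();  // draw from Gaussian(M, C)
            }
            return x;
        }
    }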
Results contd.
•The previous results were from sampling from a distribution with mean = M and covariance = C
•When sampling, we can increase or decrease the covariance by using h*C, where h is a heuristic factor
•Different nets have been trained for different values of the heuristic factor
•Samples generated per code point = 200
•As can be seen, h = 1 gives the best result (as would be expected theoretically)

Error on test set:
–NN trained using h=1: 24.30%
–NN trained using h=0.1: 41.96%
–NN trained using h=2: 24.45%

Pipe Dream
•Rather than using the generated data separately, could we use it to supplement the original training data? If used in this manner, will we be able to improve the base accuracy of the neural network?
[Figure: the Bayes-net mixture (BN) produces generated data G, which is combined with the original samples S to form G+S]

Error on test set:
–NN trained using original data: 20.32%
–NN trained using generated samples + original data: 20.8%
–NN trained using 200 generated samples per code pt + original data: 22.26%
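The heuristic above simply scales the sampling variance: draw from a Gaussian with mean M and variance h*C instead of C. A one-method Java sketch (names illustrative), which reduces to the base sampler when h = 1:

    import java.util.Random;

    // Sample a node value with the covariance scaled by the heuristic factor h,
    // i.e. draw from a Gaussian with mean M and variance h*C.
    public class HeuristicSampling {
        static double sampleNode(double conditionalMean, double c, double h, Random rng) {
            return conditionalMean + Math.sqrt(h * c) * rng.nextGaussian();
        }
    }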