ebook img

Introduction to Learning Classifier Systems PDF

132 Pages·2017·1.405 MB·english
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Introduction to Learning Classifier Systems

Ryan J. Urbanowicz · Will N. Browne Introduction to Learning Classifier Systems 1 3 Ryan J. Urbanowicz Will N. Browne Department of Biostatistics, E pidemiology, School of Engineering and Computer and Informatics, Perelman School of Science Medicine Victoria University of Wellington University of Pennsylvania Wellington Philadelphia, PA New Zealand USA ISSN 2196-548X ISSN 2196-5498 (electronic) SpringerBriefs in Intelligent Systems ISBN 978-3-662-55006-9 ISBN 978-3-662-55007-6 (eBook) DOI 10.1007/978-3-662-55007-6 Library of Congress Control Number: 2017950033 © The Author(s) 2017 This Springer imprint is published by Springer Nature The registered company is Springer-Verlag GmbH Germany The registered company address is: Heidelberger Platz 3, 14197 Berlin, Germany Preface This textbook provides an accessible introduction to Learning Classifier Systems (LCSs)forundergraduate/postgraduatestudents,dataanalysts,andmachinelearn- ing practitioners alike. We aim to tackle the following questions through the lens of simple example problems: (1) How do LCSs work and how can they be imple- mented?(2)Whattypesofproblemshaveorcantheybeappliedto?(3)Whatmakes LCSalgorithmsuniqueandadvantageouscomparedtoothermachinelearners?(4) Whatchallengesordisadvantagesexist?(5)WhatvariationsofLCSalgorithmsex- ist?(6)WhatresourcesexisttosupportthedevelopmentofnewLCSalgorithms,or theapplicationofexistingones? The term LCS has been used to describe a family of machine learning (ML) algorithmsthatemergedfromafoundingconceptdesignedtomodelcomplexadap- tive systems (e.g. the economy, weather, or the human brain). The LCS concept has enjoyed over 40 years of active research from a small but dedicated commu- nityandLCSalgorithmshaverepeatedlydemonstratedtheiruniquevalueacrossa diverse and growing set of applications. Despite this history, LCS algorithms are stilllargelyunknownandunderutilisedamongMLapproaches.Thisistrueforboth typesofproblemstowhichLCSalgorithmsaremostcommonlyapplied(i.e.rein- forcementlearningandsupervisedlearning). Reinforcementlearningproblemsonlyprovidethelearnerwithoccasionalfeed- back in the form of reward/punishment. This type of learning has close ties to the broaderfieldofartificialintelligenceandhasbeenappliedtotaskssuchasbehavior modeling, maze navigation, and game play. On the other hand, supervised learn- ing problems provide the learner with the correct decision as input. This type of learningiscommonlyappliedtodatasciencetaskssuchaspredictivemodeling(i.e. classificationorregression). The name Learning Classifier System is a bit odd/misleading since there are manyMLalgorithmsthatlearntoclassify(suchasdecisiontreesorsupportvector machines)butarenotLCSs.AslightlymoregeneraltermthatbetterrepresentsLCS algorithms is Rule-Based Machine Learning (RBML). This term encompasses the twomainfamiliesofLCSalgorithms(i.e.Michigan-styleandPittsburgh-style),as well as Association Rule Learning and Artificial Immune Systems, which are not LCS algorithms. The LCS concept was developed by John Holland in the 1970s around the same time that he popularised what is now known as a Genetic Algo- rithm(GA).LCSstypicallyincorporateaGA,andassucharesometimesevenmore generallyreferredtoasGenetics-BasedMachineLearning(GBML).Dependingon context,thesethreeterms(LCS,RBML,andGBML)canbeusedinterchangeably. At a high level, LCS algorithms all combine a discovery component (typically driven by Evolutionary Computation (EC) methods, such as a genetic algorithm) and a learning component that tracks accuracy and/or handles credit assignment in order to improve performance through the acquisition of experience. A basic understandingofECwouldbeausefulprerequisiteforthisbook.Inshort,ECisa fieldthatstudiesalgorithmsinspiredbytheprinciplesofDarwinianevolution. Inouropinion,themostdistinguishingfeatureofLCSs,andRBMLingeneral, is that the ‘model’ output by the system is a set of rules that each ‘cover’ (i.e. are relevant to) only a subset of the possible inputs to the problem at hand. Each rule represents an IF:THEN expression that links specific state conditions with an ac- tion/class.Forexample,IF‘red’AND‘octagon’THEN‘stop-sign’mightbearule learnedforclassifyingtypesoftrafficsigns.Noticethatthisruledoesnotconstitute acompletemodel,butratherisonlypartofthecollaborativerulesetrequiredtoac- curatelyandgenerallyclassifythevarietyofroadsignsbasedonavailablefeatures such as color, shape, or size. This property allows LCS algorithms to challenge a nearlyubiquitousparadigmofmachinelearning,i.e.thatasingle‘best’modelwill be found. LCS algorithms instead learn a ‘distributed solution’ or a ‘map’ of the problem space represented by a ruleset that implicitly breaks complex problems into simpler pieces. This property is the main reason why LCS algorithms can (1) flexibly handle very different problem domains, (2) adapt to new input, (3) model complexpatternssuchasnon-linearfeatureinteractions(i.e.epistasis)andhetero- geneousassociations,and(4)beappliedtoeithersingle-stepormulti-stepproblems. Additionally,LCSalgorithmshavethefollowingadvantages:(1)Theyarefun- damentallymodelfree,i.e.theymakefewornoassumptionsabouttheunderlying problemdomain,(2)theirrulesareintuitivelyhumaninterpretable,unlikeso-called ‘blackbox’MLalgorithmssuchasartificialneuralnetworksorrandomforests,(3) they produce solutions that are implicitly multi-objective, with evolutionary pres- suresdrivingmaximallyaccurateandgeneralrules,and(4)theyresembleensemble learners, which tend to make more accurate and reliable predictions, particularly whenpriorproblemknowledgeisunavailable.Thisbookwillhighlightthetypesof problemstowhichLCSalgorithmshavebeenshowntobeparticularlywellsuited, e.g.thosewithepistasisandheterogeneity. Theoretical understanding of the LCS approach has improved, but an accepted theory does not yet exist. This is probably due to the interactive complexity and underlying stochastic nature of LCSs. Whether it is even possible to include con- vergence proofs is debatable, although such a proof would be beneficial in cross- disciplinaryacceptanceandadoption. Thisbookisintendedasajumping-offpoint,anddoesnotincludeadetailedhis- toryofLCSs,nordoesitexploremanyofthecutting-edgeadvancementsavailable in the field today. Many great researchers, papers, and ideas will not be cited. In- stead,itaddressesanoutstandingneedforasimpleintroductiontotheLCSconcept, whichcanseemabittrickytograspcomparedtootherMLalgorithms.Thisisdue totheunusuallearningparadigmofferedbyLCSs,aswellasthemultipleinteract- ingcomponentsthatmakeupthesealgorithms.Conveniently,thecomponentsofan LCScanbeexchanged,added,orremoved(likealgorithmicLegobuildingblocks), yieldingaframeworkwiththeproblemversatilityofaSwissArmyknife.Tofacil- itate comprehension of how LCSs operate, and how they can be implemented, we havepairedthisbookwithaneducationalversionofLCS,namedeLCS,codedsim- plyinPython.GrantsupportfromtheNationalInstitutesofHealth(R01AI116794), andtheVictoriaUniversityofWellington(204021)helpedmakethisbookpossible. Pleaseenjoy! RyanUrbanowicz,UniversityofPennsylvania,USA WillBrowne,VictoriaUniversityofWellington,NZ Contents 1 LCSsinaNutshell ............................................. 1 1.1 ANon-trivialExampleProblem:TheMultiplexer ............... 1 1.2 KeyElements.............................................. 4 1.2.1 Environment ........................................ 4 1.2.2 Rules,Matching,andClassifiers........................ 4 1.2.3 DiscoveryComponent-EvolutionaryComputation........ 6 1.2.4 LearningComponent ................................. 7 1.3 LCSFunctionalCycle....................................... 7 1.4 Post-training............................................... 10 1.4.1 RuleCompaction .................................... 10 1.4.2 Prediction .......................................... 11 1.4.3 Evaluation .......................................... 11 1.4.4 Interpretation........................................ 12 1.5 CodeExercises(eLCS)...................................... 13 2 LCSConcepts ................................................. 21 2.1 Learning.................................................. 22 2.1.1 ModelingwithaRuleset .............................. 23 2.2 Classifier ................................................. 24 2.2.1 Rules .............................................. 24 2.2.2 RepresentationandAlphabet .......................... 27 2.2.3 Generalisation....................................... 28 2.3 System ................................................... 30 2.3.1 InteractionwithProblems ............................. 30 2.3.2 CooperationofClassifiers ............................. 33 2.3.3 CompetitionBetweenClassifiers ....................... 34 2.4 ProblemProperties ......................................... 35 2.4.1 ProblemComplexity ................................. 35 2.4.2 ApplicationsOverview ............................... 37 2.5 Advantages................................................ 39 2.6 Disadvantages ............................................. 40 3 FunctionalCycleComponents ................................... 41 3.1 EvolutionaryComputationandLCSs .......................... 42 3.2 InitialConsiderations ....................................... 44 3.3 BasicAlphabetsforRuleRepresentation ....................... 45 3.3.1 EncodingforBinaryAlphabets......................... 45 3.3.2 Interval-Based....................................... 48 3.4 Matching ................................................. 51 3.5 Covering.................................................. 52 3.6 FormaCorrectSetorSelectanAction......................... 53 3.6.1 Explorevs.Exploit................................... 54 3.6.2 ActionSelection..................................... 56 3.7 PerformingtheAction ...................................... 57 3.8 Update ................................................... 58 3.8.1 NumerosityofRules ................................. 58 3.8.2 FitnessSharing ...................................... 59 3.9 SelectionforRuleDiscovery ................................. 59 3.9.1 ParentSelectionMethods ............................. 60 3.10 RuleDiscovery ............................................ 62 3.10.1 WhentoInvokeRuleDiscovery........................ 63 3.10.2 IdentifyingBuildingBlocksofKnowledge............... 64 3.10.3 Mutation ........................................... 65 3.10.4 Crossover........................................... 66 3.10.5 InitialisingOffspringClassifiers........................ 67 3.10.6 OtherRuleDiscovery................................. 68 3.11 Subsumption .............................................. 68 3.12 Deletion .................................................. 69 3.13 Summary ................................................. 70 4 LCSAdaptability .............................................. 71 4.1 LCSPressures ............................................. 71 4.2 Michigan-Stylevs.Pittsburgh-StyleLCSs ...................... 74 4.3 Michigan-StyleApproaches.................................. 76 4.3.1 Michigan-StyleSupervisedLearning(UCS).............. 76 4.3.2 UpdateswithTime-WeightedRecencyAverages .......... 78 4.3.3 Michigan-StyleReinforcementLearning(e.g.XCS) ....... 79 4.4 Pittsburgh-StyleApproaches ................................. 87 4.4.1 GAssistandBioHEL ................................. 87 4.4.2 GABIL,GALE,andA-PLUS.......................... 88 4.5 Strength-vs.Accuracy-BasedFitness.......................... 89 4.5.1 Strength-Based ...................................... 89 4.5.2 Accuracy-Based ..................................... 90 4.6 Niche-BasedRuleDiscovery................................. 90 4.7 Single-vs.Multi-stepLearning ............................... 92 4.7.1 Sense,Plan,Act ..................................... 93 4.7.2 DelayedReward ..................................... 95 4.7.3 AnticipatoryClassifierSystems ........................ 97 4.8 ComputedAlphabets........................................ 98 4.8.1 S-ExpressionandGeneticProgramming................. 98 4.8.2 ArtificialNeuralNetworks ............................ 99 4.8.3 ComputedPrediction ................................. 99 4.8.4 ComputedAction ....................................100 4.8.5 CodeFragments .....................................100 4.9 EnvironmentConsiderations .................................101 5 ApplyingLCSs ................................................103 5.1 LCSSetup ................................................104 5.1.1 RunParameter‘SweetSpots’ ..........................110 5.1.2 HybridiseorDie.....................................113 5.2 Tuning ...................................................114 5.3 Troubleshooting............................................115 5.3.1 LackofConvergence .................................116 5.4 WheretoNow? ............................................117 5.4.1 WorkshopsandConferences...........................117 5.4.2 Books,Journals,andSelectReviews ....................118 5.4.3 WebsitesandSoftware................................121 5.4.4 Collaborate .........................................122 5.5 ConcludingRemarks........................................123 Acronyms and Glossary Acronyms Acronymsusedbyrelatedmethodsandfieldsofstudy. AI ArtificialIntelligence-intelligenceexhibitedbymachines.Aflex- iblerationalagentthatperceivesitsenvironmentandtakesactions tomaximiseitschanceofsuccess. DM Data Mining - A main component of knowledge discovery in databases,wherepatternsandknowledgecontainedindataaredis- covered. EC Evolutionary Computation - Global search method based on the principlesofDarwinianselection. EDA EstimationofDistributionAlgorithm-Buildsaprobabilisticmodel ofthesolution. EML Evolutionary Machine Learning - Machine learning using Dar- winianprinciples:Modernnamefortheoverarchingfieldthatin- cludesLCS(replacingthetermGBML). GA Genetic Algorithm - Both the name of the field of genetic algo- rithmsandthemethodusedtodiscoverhypothesisedbetterrulesin LCS(seeRD). GBMLGenetics-BasedMachineLearning-Thetermusedtoidentifyany ML approach with a genetics-based evolutionary component (su- persededbyEML). LCS LearningClassifierSystem-Theoriginalnamefortheconceptof anartificialintelligencetechniquethatbuildsmodelsofpatternsin- herentinthedatathroughglobalsearchofpatternstructures(e.g. rulescreatedthroughevolutionarycomputation)andlocallearning of pattern utility (e.g. pattern fitness through supervised or rein- forcementlearning).Notethat‘LCSs’willbeusedthroughoutthis booktorefertothepluralof‘LCS’. ML MachineLearning-Asubfieldofcomputerscienceemergingfrom patternrecognitionandartificialintelligenceexploringalgorithms thatcanlearnfromandmakepredictionsaboutdata. RBML Rule-BasedMachineLearning-Atermformachinelearningalgo- rithmsthatemployrulesorclassifiersinmodeling.RBMLincludes alltypesofLCS,alongwithAssociationRuleLearning,andArtifi- cialImmuneSystems. RD RuleDiscovery-MechanismsthatcanintroducenewrulesinLCS. RL Reinforcement Learning - Learning from environmental reward payoffonly. SL SupervisedLearning-Learningfromateacherintheenvironment. KeyLCSTerms ThereareafewtermsthataretailoredtothefieldofLCSsthatarenotincommon useinotherbranchesofAI. action The endpoint, consequent, output, or THEN portion of the ‘if ... then...’ruleexpressionevolvedbyanLCS. classifier Aruleplussupportingstatisticslearnedfromenvironmentalinter- action. condition Thespecifiedfeaturestates,antecedent,input,orIFportionofthe ‘if...then...’ruleexpressionevolvedbyanLCS. coverage Thesamplespacematchedbytheconditionportionofarule. niche Anareaofthesamplespaceinthedomainwheretheneighboring instancesshareacommonproperty,e.g.sameclass. rule An‘if<condition>then<action>’expressionevolvedbyanLCS. samplespaceTheuniqueinstances(states)availablefromtheproblemdomain. searchspace Theuniquerulesthatcanbecreated;thesearchspaceislinkedto thesamplespacetogetherwiththechosenalphabet/rulerepresen- tationoftheLCS. Rulesets SetsofrulesinLCSareconventionallyindicatedbysquarebrackets‘[]’ratherthan curlybrackets‘{}’.Thesymbolforeachsetisitalicised,indicatingavariable, giventhatthecontentofthesesetswillvaryduringevolution.Notethatwhile italicsoughttobeused,theyareoftenomittedbyconventionintheLCSliterature. [P] Population-thesetofallclassifiersintheLCS(utilisedinbothreinforce- mentandsupervisedlearning). [M]MatchSet-thesetofallclassifierswithconditionsthatmatchtheenviron- mentalstate.Matchingrulescanrecommenddifferentactions,thus[M]is asupersetof[A](inreinforcementlearning),and[M]isasupersetofboth [C]and[I](insupervisedlearning). [A] ActionSet-thesetofallclassifierswithmatchingconditions(i.e.thatare in[M])thatalsorecommendtheactionthatthesystemselectedtoaffectthe environment(usedinreinforcementlearning). [C] CorrectSet-thesetofallclassifierswithmatchingconditions(i.e.thatarein [M])thatalsorecommendthecorrectactionobtainedfromtheenvironment (usedinsupervisedlearning). [I] IncorrectSet-thesetofallclassifierswithmatchingconditions(i.e.thatare in[M])thatalsorecommendanincorrectactionobtainedfromtheenviron- ment(usedinsupervisedlearning). VariantsofLCSAlgorithms NamesofLCSvariants(therehavebeenmanyinthehistoryofLCS,sothisisonly asubsample). BioHEL Bioinformatics-orientedHierarchicalEvolutionaryLearn- ing-BioHELisdesignedtohandlelarge-scale,e.g.bioin- formatic,datasetsusingameta-representation. eLCS EducationalLCS-Abare-bonesLCSspecificallydesigned to complement this book and facilitate understanding of LCS implementation. Not intended to function optimally on real-world problems. (Users are welcome to develop eLCS further and advance the academic field under the CreativeCommonslicense). ExSTraCSExtended Supervised Tracking and Classifying System - ExSTraCS extends the UCS framework adding expert- knowledge-guided learning, attribute tracking for hetero- geneous subgroup identification, and a number of other heuristicstohandlecomplex,noisy,andlarger-scale(e.g. bioinformatic)datamining. GAssist Genetic clASSIfier sySTem - Accuracy-based Pittsburgh learningclassifiersystemwithdefaultaction,incremental updateandnovelrepresentation.NowsupersededbyBio- HEL. SCS Simple Classifier System - now outdated simple LCS in Goldberg’s1989book. UCS sUpervisedClassifierSystem-UCSisbasedontheXCS frameworkbutspecificallyadaptedtosupervisedlearning. UnfortunatelySCShadalreadybeentakenandSuCSmight nothavebeensosuccessful(note:UnsupervisedLearning ClassifierSystemsareveryrareintheLCSfield). XCS XCS - Michigan-style, accuracy-based, niche-based LCS thathasbecomethemostwidelyusedsysteminthefield. Designedtohandlebothreinforcementlearningandsuper- visedlearningproblems.BackacronymedtobeeXtended ClassifierSystem. ZCS Zeroth-level Classifier System - Revolutionary strength- basedclassifiersystemwithimplicitbucketbrigadeupdate mechanism.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.