Advances in Intelligent Systems Research and Innovation
489 pages · 2021 · English
Studies in Systems, Decision and Control 379

Vassil Sgurev · Vladimir Jotsov · Janusz Kacprzyk (Editors)

Advances in Intelligent Systems Research and Innovation

Series Editor: Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

The series "Studies in Systems, Decision and Control" (SSDC) covers both new developments and advances, as well as the state of the art, in the various areas of broadly perceived systems, decision making and control: quickly, up to date and with a high quality. The intent is to cover the theory, applications, and perspectives on the state of the art and future developments relevant to systems, decision making, control, complex processes and related areas, as embedded in the fields of engineering, computer science, physics, economics, social and life sciences, as well as the paradigms and methodologies behind them. The series contains monographs, textbooks, lecture notes and edited volumes in systems, decision making and control spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication time frame and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output.

Indexed by SCOPUS, DBLP, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/13304

Editors:
Vassil Sgurev, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
Vladimir Jotsov, University of Library Studies and Information Technologies, Sofia, Bulgaria
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

ISSN 2198-4182, ISSN 2198-4190 (electronic)
Studies in Systems, Decision and Control
ISBN 978-3-030-78123-1, ISBN 978-3-030-78124-8 (eBook)
https://doi.org/10.1007/978-3-030-78124-8

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Contents

Balancing Exploration and Exploitation in Forward Model Learning ... 1
  Alexander Dockhorn and Rudolf Kruse
Universal Adversarial Perturbation Generated by Using Attention Information ... 21
  Zifei Wang, Xiaolin Huang, Jie Yang, and Nikola Kasabov
Framework and Development Process for IoT Data Gathering ... 41
  Mika Saari, Petri Rantanen, Sami Hyrynsalmi, and David Hästbacka
Automated Environmental Mapping with Behavior Trees ... 61
  Francisco Garcia Rosas, Frank Hoeller, and Frank E. Schneider
Multi-objective Automatic Clustering with Gene Rearrangement and Cluster Merging ... 87
  Hongchun Qu, Li Yin, and Xiaoming Tang
Assessment of Deep Learning Models for Human Activity Recognition on Multi-variate Time Series Data and Non-targeted Adversarial Attack ... 129
  Mahbuba Tasmin, Sharif Uddin Ruman, Taoseef Ishtiak, Arif-ur-Rahman Chowdhury Suhan, Redwan Hasif, Shahnawaz Zulminan, and Rashedur M. Rahman
Complex Multivalued Hierarchical Logic (HS-Logic) ... 161
  Vassil Sgurev
Design Considerations for a Multiple Smart-Device-Based Personalized Learning Environment ... 173
  Masashi Katsumata
Intelligent Assistive Sensors and Smart Systems for the Control and Analysis of Driver Reaction Times ... 185
  David Sanders, Malik Haddad, Giles Tewkesbury, Tom Barker, Martin Langner, and Alex Gegov
Implementation and Experimental Validation of the IEC 63047 Standard for Data Transfer in Radiological and Nuclear Robotic Applications ... 205
  Frank E. Schneider, Dennis Wildermuth, Francisco Garcia Rosas, and Jan Paepen
Data Science Modeling and Constraint-Based Data Selection for EEG Signals Denoising Using Wavelet Transforms ... 241
  Magdalena Garvanova, Ivan Garvanov, and Vladimir Jotsov
Intuitionistic Fuzziness, Standard and Extended Modality ... 269
  Krassimir Atanassov
Introduction to Octopus-Inspired Soft Robots: Pipe-Climbing Robot TAOYAKA-SII and Ladder-Climbing Robot MAMEYAKA ... 287
  Kazuyuki Ito, Yoshihiro Homma, Hiroaki Shimizu, and Yusei Sakuhara
Automatic Log Analysis to Prevent Cyber Attacks ... 315
  Andre Brandao and Petia Georgieva
Generalized Net Model for Collecting, Evaluating and Including of Facts in the Educational Content ... 341
  Krassimir Atanassov, Anthony Shannon, Evdokia Sotirova, Valentin Vasilev, and Sotir Sotirov
Virtualization of Things in a Smart Agriculture Space ... 349
  Stanimir Stoyanov, Todorka Glushkova, Ivan Popchev, and Lyubka Doukovska
Turbine Hill Chart Generation Using Artificial Neural Network ... 369
  Arlindo Rodrigues Galvão Filho, Filipe de S. L. Ribeiro, Rafael Viana de Carvalho, and Clarimar José Coelho
Stimuli-Based Control of Negative Emotions in a Digital Learning Environment ... 385
  Rossitza Kaltenborn, Mincho Hadjiski, and Stefan Koynov
Intelligent Network-Flow Solutions with Risks at Transportation of Products ... 417
  Vassil Sgurev, Lyubka Doukovska, and Stanislav Drangajov
Embedded Intelligence in a System for Automatic Test Generation for Smoothly Digital Transformation in Higher Education ... 441
  Pepa Petrova, Iva Kostadinova, and Majid H. Alsulami
Simulation Algorithms to Assess the Impact of Aging on the Reliability of Standby Systems with Switching Failures ... 463
  Kiril Tenekedjiev, Natalia Nikolova, Guixin Fan, Mark Symes, and Oanh Nguyen

Balancing Exploration and Exploitation in Forward Model Learning

Alexander Dockhorn and Rudolf Kruse

Abstract  Forward model learning algorithms enable the application of simulation-based search methods in environments for which the forward model is unknown. Multiple studies have shown great performance in game-related and motion control applications. In these, forward model learning agents often required less training time while achieving performance similar to that of state-of-the-art reinforcement learning methods. However, several problems can emerge when replacing the environment's true model with a learned approximation. While the true forward model allows the accurate prediction of future time-steps, a learned forward model may always be inaccurate in its predictions. These inaccuracies become problematic when planning long action sequences, since the confidence in predicted time-steps decreases with increasing depth of the simulation. In this work, we explore methods for balancing risk and reward in decision-making using inaccurate forward models. To this end, we propose methods for measuring the variance of a forward model and the confidence in the predicted outcome of planned action sequences. Based on these metrics, we define methods for learning and using forward models under consideration of their current prediction accuracy. The proposed methods have been tested in various motion control tasks of the OpenAI Gym framework. Results show that information on the model's accuracy can be used to increase the efficiency of the agent's training and the agent's performance during evaluation.

A. Dockhorn, Queen Mary University, London, UK, e-mail: [email protected]
R. Kruse, Otto von Guericke University, Magdeburg, Germany, e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. V. Sgurev et al. (eds.), Advances in Intelligent Systems Research and Innovation, Studies in Systems, Decision and Control 379, https://doi.org/10.1007/978-3-030-78124-8_1

1 Introduction

Forward model learning describes the process of learning a model of an a priori unknown environment. This enables an agent to anticipate the outcome of planned action sequences and to use simulation-based search techniques to optimize its behavior.

This chapter builds upon our recent work on forward model learning in games [1] and directly extends our work on forward model learning for motion control tasks [12]. Previous studies have shown that reliable forward models can be learned by observation. Since no information on the environment is available, we have chosen to implement an uninformed training process in which the agent uses random actions to explore its environment. This results in many interactions being wasted, due to (1) repeating a known interaction and (2) neither focusing on improving the model nor on improving the agent's performance. As a result, we have observed that the model may be unreliable since large parts of the state space remain unobserved. In turn, the agent's performance during the evaluation will be limited by the prediction accuracy of the trained forward model.
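To make this uninformed training process concrete, the following is a minimal sketch of collecting random-action transitions in an OpenAI Gym task and fitting a forward model on them. The choice of CartPole-v1, the classic Gym API, and a random forest that predicts state differences (in the spirit of the decomposed differential model discussed later) are illustrative assumptions, not the exact setup used in this study.

```python
import gym  # assumes the classic Gym API: reset() -> obs, step() -> 4-tuple
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Collect transitions through uninformed (random-action) exploration.
env = gym.make("CartPole-v1")  # illustrative task choice
inputs, deltas = [], []
state = env.reset()
for _ in range(5000):
    action = env.action_space.sample()
    next_state, reward, done, _ = env.step(action)
    inputs.append(np.append(state, action))  # model input: state and action
    deltas.append(next_state - state)        # differential target: state change
    state = env.reset() if done else next_state

# Fit a forward model predicting the state difference; adding its output
# to the current state yields the predicted next state.
model = RandomForestRegressor(n_estimators=50)
model.fit(np.array(inputs), np.array(deltas))

# The spread across the ensemble's individual trees can serve as a simple
# confidence estimate for a prediction (one of the ideas developed later).
x = np.array(inputs[:1])
per_tree = np.stack([tree.predict(x) for tree in model.estimators_])
uncertainty = per_tree.std(axis=0)
```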
Motivated by these shortcomings, we study methods for improving the efficiency of the agent's training process. To this end, we formulate two learning goals: (1) the exploration of the environment to train a reliable forward model, and (2) the exploitation of promising action sequences to become more proficient in the given task. The main contributions of this work can be summarized as follows:

• Taxonomy of Learning Algorithms: We present a unifying view on reinforcement learning, search, and forward model learning algorithms in the context of the agent-environment interface.
• Decomposed Differential Forward Model: We review the definitions of previously proposed model building heuristics and their generalization in the form of the decomposed forward model. On this basis, we propose the decomposed differential forward model, which will be used throughout this study.
• Risk Awareness: We propose several methods for measuring the agent's confidence in the learned forward model's predictions. Furthermore, we propose several optimization goals to balance exploration and exploitation during the agent's training process.

The remainder of this chapter is structured as follows: In Sect. 2, we compare the concepts of reinforcement learning, search-based algorithms, and forward model learning for decision-making. The following sections (Sects. 3 and 4) introduce forward model learning and types of forward model representations. Section 5 exemplifies the use of Gaussian process regression and of ensemble regression models such as random forest regression. For both, we propose methods for incorporating the variance of the model's predictions into the agent's decision-making process (Sect. 5.2). We evaluate the proposed methods based on their resulting performance in three simple motion control tasks in Sect. 6. For a more detailed analysis, we compare the impact of the proposed measures on the agent's training process in Sect. 7. We conclude in Sect. 8 and discuss opportunities for further studies.

2 Taxonomy of Learning Algorithms

The agent-environment interface represents a general description of a learning scenario. Here, the agent is in continuous interaction with an environment. Each of the agent's executed actions has the potential to update the state of the environment. In return, the agent can be offered a numerical reward that is typically associated with the given task.

In reinforcement learning, the agent focuses on picking actions to maximize its expected reward over time. Given observations of all its previous interactions, the agent tries to estimate the expected value of state-action pairs. Algorithms such as Temporal Difference Learning (TDL) [32], Q-learning [35], and the Monte Carlo (MC) method [30] update the expected value after a reward signal has been observed or after the result of an episode has been observed, respectively. These techniques are called model-free since they do not try to build a model of their environment.

In contrast to these simpler reinforcement learning algorithms, which store the value of each state-action pair, methods in deep reinforcement learning use a neural network to approximate the value based on the input, thus drastically decreasing the storage used for the model [37]. While deep reinforcement learning algorithms have achieved impressive performance on many game-related benchmarks, the number of training examples required to fit all the network's parameters is often high, which results in long training times.
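As a concrete instance of such a tabular value update, here is a minimal sketch of the one-step Q-learning rule; the learning rate, discount factor, and dictionary-based table are illustrative choices rather than details taken from this chapter.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.99      # illustrative learning rate and discount factor

Q = defaultdict(float)        # maps (state, action) -> estimated value

def q_update(state, action, reward, next_state, actions):
    """Move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a'),
    called once per observed transition."""
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```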
Simulation-based search algorithms, such as minimax [23] or Monte Carlo tree search (MCTS) [5], utilize the environment's forward model to simulate the outcome of planned action sequences. Each action sequence represents a candidate solution. Its simulation is called a rollout, and the result of such a simulation can be used to estimate the value of actions in the simulated sequence. In contrast to reinforcement learning algorithms, simulation-based search algorithms estimate the value of an action at run-time. Therefore, these algorithms require knowledge of the current state and the game's model, and return the action with the highest expected value regarding the current state. In return, they can be applied without prior training. Due to this capability, search algorithms such as MCTS and Open Loop Search [26] have performed well in general game-playing tasks [27].

Dynamic programming algorithms [21] require knowledge of the environment's forward model to calculate the true value of any state-action pair. This optimization process can yield a perfect policy for small state and action spaces. However, the high breadth or depth of the search tree often renders this method infeasible.

Next to traditional search schemes, the rolling horizon evolutionary algorithm (RHEA) [15, 16] uses mutation and crossover to optimize the agent's action sequence. A heuristic value of simulated action sequences can be used as a fitness measure to guide the evolutionary optimization. The design of such a heuristic can drastically impact the resulting behavior.

In the following, we compare the reviewed methods based on their knowledge of the environment. Given the agent's action and the current state, the agent will observe the next state and a reward. Both updates can be encoded in a separate model,
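To illustrate how such a decomposed pair of models (one predicting the state update, one predicting the reward) supports simulation-based search, here is a minimal random-rollout sketch; the model interface with a predict(state, action) method and the purely random action sampling are assumptions made for illustration, not the chapter's actual implementation.

```python
import random

def rollout(state, horizon, actions, next_state_model, reward_model):
    """Simulate one random action sequence with a learned, decomposed forward
    model and return the sequence together with its accumulated predicted reward."""
    total, sequence = 0.0, []
    for _ in range(horizon):
        action = random.choice(actions)
        total += reward_model.predict(state, action)   # predicted reward
        state = next_state_model.predict(state, action)  # predicted next state
        sequence.append(action)
    return sequence, total

def plan(state, horizon, actions, next_state_model, reward_model, n_rollouts=100):
    """Evaluate many candidate sequences and return the first action of the best one."""
    best_sequence, _ = max(
        (rollout(state, horizon, actions, next_state_model, reward_model)
         for _ in range(n_rollouts)),
        key=lambda result: result[1])
    return best_sequence[0]
```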
