Studies in Computational Intelligence 770

Andrzej Bielecki
Models of Neurons and Perceptrons: Selected Problems and Challenges

Studies in Computational Intelligence, Volume 770
Series editor: Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.
More information about this series at http://www.springer.com/series/7092

Andrzej Bielecki
Faculty of Electrical Engineering, Automation, Computer Science and Biomedical Engineering
AGH University of Science and Technology
Cracow, Poland

ISSN 1860-949X    ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-319-90139-8    ISBN 978-3-319-90140-4 (eBook)
https://doi.org/10.1007/978-3-319-90140-4
Library of Congress Control Number: 2018938784

© Springer International Publishing AG, part of Springer Nature 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer International Publishing AG, part of Springer Nature.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

1 Introduction

Part I Preliminaries
2 Biological Foundations
3 Foundations of Artificial Neural Networks
  3.1 Models of Neurons and Synaptic Transmission
  3.2 Artificial Neural Networks and Their Applications
    3.2.1 Taxonomy of Neural Networks
    3.2.2 Taxonomy of Training Methods
    3.2.3 Applications of Neural Networks

Part II Mathematical Foundations
4 General Foundations
5 Foundations of Dynamical Systems Theory
  5.1 Preliminaries
  5.2 The Euler Method on a Riemannian Manifold
  5.3 Linear Dynamical Systems
  5.4 Weakly Nonlinear Dynamical Systems
  5.5 Gradient Dynamical Systems
  5.6 Topological Conjugacy
  5.7 Pseudo-orbit Tracing Property
  5.8 Dynamical Systems with Control
  5.9 Bibliographic Remarks

Part III Mathematical Models of the Neuron
6 Models of the Whole Neuron
  6.1 Bibliographic Remarks
7 Models of Parts of the Neuron
  7.1 Model of Dendritic Conduction
  7.2 Model of Axonal Transport
  7.3 Models of Transport in the Presynaptic Bouton
    7.3.1 The A-G Model of Fast Transport Based on ODEs
    7.3.2 The Model of Fast Synaptic Transport Based on PDEs
    7.3.3 Model of Neuropeptide Slow Transport
  7.4 Model of the Synapse
  7.5 Bibliographic Remarks

Part IV Mathematical Models of the Perceptron
8 General Model of the Perceptron
  8.1 Model of a Structure of a Neural Network
  8.2 Supervised Deterministic Training Process
  8.3 Gradient Learning Process
  8.4 Bibliographic Remarks
9 Linear Perceptrons
  9.1 Basic Properties of Linear Perceptrons
  9.2 Dynamics of Training Process of Linear Perceptrons
  9.3 Stability of the Learning Process of Linear Perceptrons
  9.4 Bibliographic Remarks
10 Weakly Nonlinear Perceptrons
  10.1 Bibliographic Remarks
11 Nonlinear Perceptrons
  11.1 Bibliographic Remarks
12 Concluding Remarks and Comments

Part V Appendix
13 Approximation Properties of Perceptrons
  13.1 Bibliographic Remarks
14 Proofs
  14.1 Estimation of Constants in Fečkan Theorem
  14.2 Estimation of the Euler Method Error on a Manifold

References

Chapter 1
Introduction

In contemporary natural sciences strong mutual relations can be observed. Thus, biology is associated with chemistry and physics. It is connected, furthermore, with computer science, electronics, mathematics, cybernetics and philosophy, which contribute significantly to biological studies. Let us specify the aforementioned relations in detail (see Fig. 1.1).

Biology, as such, deals with structures and processes in living systems. Both are studied by means of observations and experiments, often sophisticated and technologically advanced. The obtained results are, on the one hand, the starting point for biological theories, for instance, for the paradigm that specific features are inherited genetically. On the other hand, in contemporary biology, formal models of structures and processes are created. In order to create an adequate model, the properties which are crucial for the modelled phenomenon have to be specified. A semi-formal description, which is in a way analogous to axiom systems in mathematics, is introduced on the basis of the specified properties; see, for example, [41], Sect. 2. This description is the starting point for creating either a formal model (arrow 4 in Fig. 1.1), for instance a mathematical one, or an implementation. Such an implementation can take two forms: a software algorithm or an electronic system.
The latter should be functionally similar to the modelled process or structure (arrow 5). There is a reciprocal relation between a formal model and its implementation: each can be the starting point for the other. For instance, if the ordinary differential equation that models the dynamics of the studied biological process can be easily obtained on the basis of the semi-formal description, then an electronic circuit whose dynamics is described by this equation can be constructed (arrow 6). Conversely, if the structure of the electronic circuit can be derived directly from the semi-formal description, then this circuit can be implemented and the differential equation that describes its dynamics can be derived (arrow 7). Regardless of the order in which the model and the implementation are created, the model can be analyzed by using a formal approach (arrow 8). This analysis allows the researcher to study the properties of the formal model and, as a consequence, the properties of the investigated phenomenon. The results of this analysis can point out directions for further observations and experiments (arrow 9) as well as the necessity to modify the set of crucial properties (arrow 10). In this way the formal models and both software and electronic implementations become a significant part of the methodology of biological studies and generate specific methodological and philosophical problems [34].

Fig. 1.1 The general schema of relations in modelling biological structures and processes

© Springer International Publishing AG, part of Springer Nature 2019. A. Bielecki, Models of Neurons and Perceptrons: Selected Problems and Challenges, Studies in Computational Intelligence 770, https://doi.org/10.1007/978-3-319-90140-4_1

This monograph deals with models of neural networks in the context of their modelling.
The artificial systems, which are modelled after biological neural cells and the structures constituted by them, are created for two reasons. On the one hand, such an approach enables researchers to study biological phenomena indirectly by investigating artificial models. On the other hand, artificial neural networks (ANNs) are computational systems of artificial intelligence. These systems enable researchers to solve a wide class of problems; pattern recognition, control, classification and diagnostics are examples. Modelling of neural systems, hardware and software implementations of these models and, first of all, analysis of the models by using mathematical tools are the main topics of this monograph. Thus, referring to Fig. 1.1, the problems which correspond to the frames E, F, G and, partially, D, as well as the relations symbolized by arrows 4, 5, 6, 7 and 8, are the topics of this monograph. It should be stressed, however, that only some selected problems are discussed.

In scientific investigations, studies concerning mathematical modelling of biological neural structures and studies of artificial neural networks are usually regarded as separate topics. Nevertheless, all types of ANNs, as well as their training algorithms, are based on models of biological prototypes. Therefore, in this monograph, the intention of the author is to unify these two topics, all the more so because it seems that there are numerous models of biological neural structures that could be the basis for artificial systems and that have not been utilized yet. For these reasons, the way the problems are presented in this monograph may prove promising.

This monograph consists of five parts. The first part, the preliminary one, covers the foundations of both neuroscience (Chap. 2) and ANNs (Chap. 3). It should be stressed that the biological foundations are presented in more detail than is usual in books that concern neurocybernetics and neuromathematics. In Chap. 3 all types of ANNs are discussed. In the second part (Chaps. 4 and 5) the foundations of the mathematical tools used in the sequel are specified. Chapter 4 covers general mathematical foundations. It deals with basic issues and as such can be omitted by mathematicians. In Chap. 5 very special topics of dynamical systems theory are presented, which may be of interest even to professional mathematicians. Mathematical models of the neuron, both of the whole neuron (Chap. 6) and of its parts (Chap. 7), are discussed in the third part of the monograph. The models discussed in Chap. 7 are based on differential equations and describe the processes of signal transmission inside the neuron. Electronic implementations of these models are also discussed at length. Sections 7.3.2 and 7.3.3 present results obtained by the author. They concern fast and slow transport phenomena in the presynaptic bouton. The mathematical models of the perceptron are discussed in the fourth part. This part of the monograph also refers to results obtained by the author. In Chap. 8 the model of the perceptron structure, as well as the general model of the gradient training process of the perceptron, is presented. Dynamical aspects of the training process of linear, weakly nonlinear and nonlinear perceptrons are analyzed in Chaps. 9, 10 and 11, respectively. The analysis is based on dynamical systems theory and refers to the stability of a dynamical system, the flow discretization, the topological conjugacy of cascades and the shadowing property. Concluding remarks are presented in Chap. 12. The Appendix (Chaps. 13 and 14) is the fifth part of the book. Dynamical models are the topic of the monograph. The approximation capabilities of perceptrons, however, are a very classical and well-studied topic in the mathematical analysis of perceptrons. Thus, the basic and classical results are presented in Chap. 13. In the main text of this monograph only the proofs of the theorems that concern the topic directly are presented. The proofs of other theorems that are not widely known but have been utilized in this monograph are presented in Chap. 14.