Jonathan Amezcua • Patricia Melin • Oscar Castillo

New Classification Method Based on Modular Neural Networks with the LVQ Algorithm and Type-2 Fuzzy Logic

Jonathan Amezcua, Patricia Melin, and Oscar Castillo: Division of Graduate Studies, Tijuana Institute of Technology, Tijuana, Baja California, Mexico

ISSN 2191-530X, ISSN 2191-5318 (electronic), SpringerBriefs in Applied Sciences and Technology
ISSN 2520-8551, ISSN 2520-856X (electronic), SpringerBriefs in Computational Intelligence
ISBN 978-3-319-73772-0, ISBN 978-3-319-73773-7 (eBook)
https://doi.org/10.1007/978-3-319-73773-7
Library of Congress Control Number: 2017962995
© Author(s), under licence to Springer International Publishing AG, part of Springer Nature 2018

Preface

In this book, a new model for data classification is developed. The model is based on the competitive neural network learning vector quantization (LVQ) and type-2 fuzzy logic. It hybridizes these two techniques by using a fuzzy logic system within the competitive layer of the LVQ network to determine the shortest distance between a centroid and an input vector. The model uses a modular LVQ architecture to further improve its performance on complex classification problems. It also implements a data-similarity process for preprocessing the datasets, in order to build dynamic architectures in which the classes with the highest degree of similarity are placed in different modules. Several architectures were developed to work mainly with two datasets: an arrhythmia dataset (using ECG signals) for classifying 15 different types of arrhythmias, and a satellite image segment dataset used for classifying six different types of soil. Both datasets have features that make them good test cases for new classification methods.
First, this work started with the optimization of some parameters of a modular LVQ network architecture: the number of cluster centers, the number of training epochs, and the learning rate of the LVQ algorithm. The bio-inspired metaheuristic called particle swarm optimization (PSO) was used for this purpose and performed well on this problem.

Afterward, a fuzzy inference system (FIS) was designed and adapted to the competitive layer of the LVQ network. This fuzzy system determines the cluster center closest to an input vector, based on the distances computed by the LVQ algorithm itself. Finally, this FIS was extended into an interval type-2 fuzzy inference system (IT2FIS). Even though the obtained results are not statistically conclusive, the hybridization in this new model produced favorable results under certain conditions. The results obtained with this new model also depend on the complexity of the datasets used.

This research work was partially funded by CONACYT and Tijuana Institute of Technology, and we would like to express our gratitude to both institutions. In addition, we would like to thank Prof. Janusz Kacprzyk for always supporting and encouraging us to perform good research in the computational intelligence area.

Tijuana, Mexico, November 2017
Dr. Jonathan Amezcua
Prof. Patricia Melin
Prof. Oscar Castillo

Contents

1 Introduction
  References
2 Theory and Background
  2.1 Artificial Neural Networks
  2.2 History of Artificial Neural Networks
  2.3 Neural Networks Architecture
    2.3.1 Input Function
    2.3.2 Activation Function
    2.3.3 Output Function
    2.3.4 Learning
  2.4 Supervised Learning Neural Networks
    2.4.1 Perceptron
    2.4.2 Multilayer Perceptron
    2.4.3 MLPs Backpropagation Algorithm
  2.5 Unsupervised Learning Neural Networks
    2.5.1 Competitive Learning
    2.5.2 Learning Vector Quantization
  2.6 Modular Neural Networks
    2.6.1 Characteristics of Modular Neural Networks
  2.7 Fuzzy Inference Systems
    2.7.1 Fuzzy Sets
    2.7.2 Membership Functions
    2.7.3 Fuzzy If-Then Rules
    2.7.4 Components of a Fuzzy Inference System
  2.8 Interval Type-2 Fuzzy Inference Systems
  References
3 Problem Statement
  3.1 Datasets
    3.1.1 Arrhythmia Dataset
    3.1.2 Satellite Images Dataset
  References
4 Proposed Classification Method
  4.1 Fuzz LVQ
  4.2 Model Architectures
    4.2.1 Data Similarity Process
    4.2.2 Model Architectures for the Arrhythmia Dataset
    4.2.3 Model Architectures for the Satellite Images Dataset
  References
5 Simulation Results
  5.1 Arrhythmia Dataset Methods Description
    5.1.1 Arrhythmia Dataset Simulation Results
    5.1.2 Arrhythmia Dataset Statistical Analysis
  5.2 Satellite Images Dataset Methods Description
    5.2.1 Satellite Images Dataset Simulation Results
    5.2.2 Satellite Images Dataset Statistical Analysis
  Reference
6 Conclusions
  6.1 Future Work
  Reference
Appendix
Index

Chapter 1 Introduction

A classification problem consists of categorizing an object based on certain attributes, with the aim of identifying the class to which it belongs. For instance, a fruit could be classified based on its size, color, or shape; the same applies to an automobile, a flower, or an animal. All these objects have their own attributes, and which attributes are considered for classifying an object (or event) depends on the problem at hand. For example, a heart disease could be classified using data obtained from a Holter device, and a tumor or a cancer cell could be classified based on image data.

The list of classification problems is endless, and this is where many classification algorithms emerge [1, 2] to solve the majority of these problems. Most of these algorithms work with feature vectors, which describe the attributes of the objects so that they can be learned by the classification algorithm. Depending on the algorithm, the features in these vectors can be binary, real-valued, categorical, etc. For instance, to classify a tumor based on an image, the feature vector would be composed of the pixel values of the image. According to [3], the classification process is composed of four basic components:

• Class, represented by a label and assigned to the object after its classification.
• Attributes of the object to be classified (defined in the feature vectors).
• Training dataset, which is used for training the classification model to recognize the appropriate class based on the available attributes.
• Testing dataset, containing the new data that should be classified by the classification model.

Some of the proposed and commonly used algorithms for classification tasks include Naïve-Bayes classifiers, Support Vector Machines (SVM), Neural Networks, and Learning Vector Quantization (LVQ), among others.
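
The four components listed above (class, attributes, training dataset, testing dataset) can be made concrete with a small sketch. The fruit measurements, the `Sample` type, and the nearest-class-mean rule below are all assumptions chosen for illustration; they are not taken from the book, and the rule is only a stand-in for whichever classification algorithm is actually used.

```python
from dataclasses import dataclass
from math import dist
from typing import Optional, Tuple


@dataclass
class Sample:
    attributes: Tuple[float, ...]   # feature vector, e.g. (size in cm, redness)
    label: Optional[str] = None     # class label; None until the object is classified


def class_means(training):
    """Average the feature vectors of each class in the training dataset."""
    sums, counts = {}, {}
    for s in training:
        acc = sums.setdefault(s.label, [0.0] * len(s.attributes))
        for i, value in enumerate(s.attributes):
            acc[i] += value
        counts[s.label] = counts.get(s.label, 0) + 1
    return {c: tuple(v / counts[c] for v in acc) for c, acc in sums.items()}


def classify(means, sample):
    """Assign the class whose mean vector is closest (Euclidean distance)."""
    return min(means, key=lambda c: dist(means[c], sample.attributes))


# Training dataset: labeled feature vectors (hypothetical values).
training = [
    Sample((7.0, 0.9), "apple"),
    Sample((7.5, 0.8), "apple"),
    Sample((18.0, 0.2), "banana"),
    Sample((19.0, 0.1), "banana"),
]
# Testing dataset: new, unlabeled feature vectors.
testing = [Sample((7.2, 0.85)), Sample((18.5, 0.15))]

means = class_means(training)
predicted = [classify(means, s) for s in testing]
print(predicted)
```

Keeping the label optional mirrors the first component in the list: the class is attached to an object only after it has been classified.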

Naïve-Bayes classifiers have been thoroughly studied and are based on Bayes' theorem, which describes the probability of an event based on prior knowledge related to the event. These classifiers assume that, within a given class, the value of a feature is independent of the value of any other feature, regardless of possible correlations between the features of an object. Some works based on this classifier can be found in [4, 5].

Support Vector Machines (SVMs) are supervised-learning models also used for classification problems. For example, given a set of feature vectors for training, each one labeled with one of two possible classes, an SVM training algorithm builds a model that assigns new feature vectors to one of the two classes; this makes it a non-probabilistic linear classifier. An SVM represents the feature vectors as points in space, mapped so that vectors in different categories are separated by a gap as wide as possible. New feature vectors are then mapped into the same space and predicted to belong to a class depending on which side of the gap they fall on. Research works based on SVM can be found in [6, 7].

Neural Networks have proven to be successful for problem solving; inspired by biological neural networks in nature, they loosely imitate the way the human brain works, in a more abstract form. Neural networks have been widely used in areas such as recognition, clustering, classification, etc. Some related research works can be found in [8, 9]. However, this book focuses on a special neural network approach called Learning Vector Quantization.

Learning Vector Quantization (LVQ) is an adaptive method for data classification, a prototype-based supervised classification algorithm. It applies a winner-take-all learning approach.
Classes are represented by prototypes defined in the feature space; for each data point, the LVQ winner-take-all training algorithm determines the prototype closest to the input vector according to a distance measure, in this case the Euclidean distance. The position of the winning prototype is then adapted: it is moved closer to the data point if it classifies that point correctly, and moved away otherwise. Some works related to LVQ are described in [10–12].

Fuzzy Logic is another discipline in computer science. It is based on fuzzy set theory, fuzzy if-then rules, and fuzzy reasoning. It has been successfully applied in a variety of areas including automatic control, robotics, time series prediction, classification, and many more [13–15]. Fuzzy logic based systems are useful for solving problems with variable answers (uncertainty). For instance, when a group of people is asked to judge the water temperature in a tank, a fuzzy system provides flexibility in answering this question through the use of membership functions, so that expressions like “the water is warm” or “the water is cold” hold with a certain degree of belonging (membership degree).

References

1. Farhad, P., Choo, J., Chee, P., & Junita, M. (2017). A Q-learning-based multi-agent system for data classification. Applied Soft Computing, 52, 519–531.
2. Jagapriya, J., & Annapoorani, G. (2011). Neural network based classification for orthopedic conditions diagnosis using grey level co-occurrence probabilities. In 2011 3rd International Conference on Electronics Computer Technology, Kanyakumari, pp. 89–93.
3. Gorunescu, F. (2011). Data mining, concepts, models and techniques (pp. 15–19). Berlin, Heidelberg: Springer.
4. Fouladi, R. F., Kayatas, C. E., & Anarim, E. (2016). Frequency based DDoS attack detection approach using naive Bayes classification. In 2016 39th International Conference on Telecommunications and Signal Processing (TSP) (pp. 104–107), Vienna.
5. Liu, J., Tian, Z., Liu, P., Jiang, J., & Li, Z.
(2016). An approach of semantic web service classification based on Naive Bayes. In 2016 IEEE International Conference on Services Computing (SCC) (pp. 356–362), San Francisco, CA.
6. Davis, P., Creusere, C. D., & Kroger, J. (2014). Classification of human viewers using high-resolution EEG with SVM. In 2014 48th Asilomar Conference on Signals, Systems and Computers (pp. 184–188), Pacific Grove, CA.
7. Li, H., Chung, F., & Wang, S. (2015). A SVM based classification method for homogeneous data. Applied Soft Computing, 36, 228–235.
8. Maglogiannis, I., Sarimveis, H., Kiranoudis, C. T., Chatziioannou, A. A., Oikonomou, N., & Aidinis, V. (2008). Radial basis function neural networks classification for the recognition of idiopathic pulmonary fibrosis in microscopic images. IEEE Transactions on Information Technology in Biomedicine, 12(1), 42–54.
9. Thulasidasan, S., & Bilmes, J. (2017). Acoustic classification using semi-supervised deep neural networks and stochastic entropy-regularization over nearest-neighbor graphs. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2731–2735), New Orleans, LA, USA.
10. Ramesh, P., Katagiri, S., & Lee, C. H. (1991). A new connected word recognition algorithm based on HMM/LVQ segmentation and LVQ classification. In ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing (Vol. 1, pp. 113–116), Toronto, Ontario.
11. Salloum, R., & Kuo, C. C. J. (2017). ECG-based biometrics using recurrent neural networks. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2062–2006), New Orleans, LA, USA.
12. Zhang, Y., & Li, M. (2016). An evaluation model of water quality based on learning vector quantization neural network. In 2016 35th Chinese Control Conference (CCC) (pp. 3685–3689), Chengdu.
13. Castillo, O., & Melin, P. (1999). Modelling complex dynamical systems with a new fuzzy inference system for differential equations: The case of robotic dynamic systems. In Fuzzy Systems Conference Proceedings, 1999. FUZZ-IEEE '99 (Vol. 2, pp. 662–667). Seoul, South Korea.
14. Castillo, O., & Melin, P.
(1998). A new fuzzy-fractal-genetic method for automated mathematical modelling and simulation of robotic dynamic systems. In 1998 IEEE International Conference on Fuzzy Systems Proceedings. IEEE World Congress on Computational Intelligence (Cat. No. 98CH36228) (Vol. 2, pp. 1182–1187), Anchorage, AK.
15. Teng, T., Wang, Y., Cai, W., & Li, H. (2017). Fuzzy model predictive control of discrete systems with time-varying delay and disturbances. IEEE Transactions on Fuzzy Systems, PP(99), 1–1.

Chapter 2 Theory and Background

Computer science embraces a variety of areas such as Computer Graphics, Computational Complexity, Computer Cryptography, and Computational Intelligence, among others. The area of interest in this book is Computational Intelligence, which includes fields such as Artificial Neural Networks, Fuzzy Inference Systems, Computer Vision, Data Mining, etc. Hence, this chapter covers in detail the concepts from two of these fields: neural networks and fuzzy logic systems.

2.1 Artificial Neural Networks

An Artificial Neural Network (ANN) is an information-processing model with a learning algorithm, inspired by the biological nervous system. It is composed of layers of artificial neurons, which are connected with each other. These connections transmit activation signals of different strengths; if the combination of the incoming signals is strong enough, the neuron is activated and the signal moves on to the neurons connected to it. These networks can be trained, and have been used to solve problems in a wide variety of areas.

According to [1], a neural network is a system composed of parallel processors connected to each other as in a directed graph, where each processor (artificial neuron) is represented as a graph node. These connections between neurons define a hierarchical structure which tries to imitate the physiology of the human brain in order to find new processing models for solving real-world problems [2–4]. Figure 2.1 shows a representation of a neural network structure.
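
The activation behaviour described in this section, where a neuron fires only if the combination of its incoming signals is strong enough, can be sketched as a simple threshold unit. The function name `neuron_output`, the weights, and the threshold value are hypothetical and chosen only for illustration; the book's own input, activation, and output functions are presented in later sections and may differ from this sketch.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted combination of incoming signals reaches
    the threshold (the neuron is activated), otherwise return 0."""
    combined = sum(x * w for x, w in zip(inputs, weights))
    return 1 if combined >= threshold else 0


# Connection strengths of three incoming connections (assumed values).
weights = [0.6, 0.4, 0.3]

# Signals on connections 1 and 3: combined strength is about 0.9, above 0.8.
fires = neuron_output([1.0, 0.0, 1.0], weights, threshold=0.8)

# Signal on connection 2 only: combined strength is 0.4, below 0.8.
quiet = neuron_output([0.0, 1.0, 0.0], weights, threshold=0.8)

print(fires, quiet)
```

In a layered network, the output of such a unit would in turn be one of the incoming signals of the neurons connected to it.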

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by humans or by other computer techniques. A trained neural network can be thought of as an expert on the information it has been given to analyze. Neural networks have been used in a wide variety of areas, among them robotics, data analysis, pattern recognition, and classification [5]. Next, the history, architecture, and learning rules of neural networks are covered.

Fig. 2.1 Fully connected neural network structure

2.2 History of Artificial Neural Networks

McCulloch and Pitts were the pioneers who proposed a computational model based on a simple neuron as a logical element. Later, Donald Hebb proposed an incremental learning rule, called the Hebbian rule, for adapting the connection strengths between neurons. This rule became the basis of many artificial models in neural network research [6]. Since the 1980s there has been renewed interest in the field of artificial neural networks, which can be attributed to several factors. The shortcomings of the initial neural network models were overcome by the introduction of more sophisticated artificial neural networks, along with new and improved training techniques. The availability of high-speed computers made it possible to simulate more complex and more convenient artificial models. Significant research efforts from several scientists have helped restore lost confidence in this field.

This confidence was strengthened by the research of Rumelhart, Hinton and Williams, who developed a generalization of Widrow's delta rule, followed by a series of demonstrations of how an artificial neural network could learn difficult tasks in areas such as speech recognition, control systems, and pattern recognition, among others. Thus, research in this area has experienced extremely rapid growth due to its interdisciplinary applicability [7].

