SpringerBriefs in Applied Sciences and Technology
Computational Intelligence

Jonathan Amezcua, Patricia Melin, Oscar Castillo

New Classification Method Based on Modular Neural Networks with the LVQ Algorithm and Type-2 Fuzzy Logic

Series editor
Janusz Kacprzyk, Polish Academy of Sciences, Systems Research Institute, Warsaw, Poland

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.
More information about this series at http://www.springer.com/series/10618

Jonathan Amezcua, Division of Graduate Studies, Tijuana Institute of Technology, Tijuana, Baja California, Mexico
Patricia Melin, Division of Graduate Studies, Tijuana Institute of Technology, Tijuana, Baja California, Mexico
Oscar Castillo, Division of Graduate Studies, Tijuana Institute of Technology, Tijuana, Baja California, Mexico

ISSN 2191-530X, ISSN 2191-5318 (electronic): SpringerBriefs in Applied Sciences and Technology
ISSN 2520-8551, ISSN 2520-856X (electronic): SpringerBriefs in Computational Intelligence
ISBN 978-3-319-73772-0, ISBN 978-3-319-73773-7 (eBook)
https://doi.org/10.1007/978-3-319-73773-7
Library of Congress Control Number: 2017962995

© The Author(s), under exclusive licence to Springer International Publishing AG, part of Springer Nature 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication.
Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer International Publishing AG, part of Springer Nature.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

This book presents a new model for data classification. The model is based on the competitive neural network known as learning vector quantization (LVQ) and on type-2 fuzzy logic. It hybridizes these two techniques by using a fuzzy logic system within the competitive layer of the LVQ network to determine the shortest distance between a centroid and an input vector. The model is built on a modular LVQ architecture to further improve its performance on complex classification problems. It also implements a data-similarity process for preprocessing the datasets, in order to build dynamic architectures in which the classes with the highest degree of similarity are placed in different modules.

Several architectures were developed to work mainly with two datasets: an arrhythmia dataset (using ECG signals) for classifying 15 different types of arrhythmias, and a satellite image segment dataset for classifying six different types of soil. Both datasets have features that make them interesting for testing new classification methods.

The work began with the optimization of some parameters of a modular LVQ network architecture: the number of cluster centers, the number of training epochs, and the learning rate of the LVQ algorithm. The bio-inspired metaheuristic called particle swarm optimization (PSO) was used for this purpose and showed good performance on this problem.
Afterward, a fuzzy inference system (FIS) was designed and developed in order to adapt it to the competitive layer of the LVQ network. This fuzzy system determines the closest cluster center to an input vector, based on the distances computed by the LVQ algorithm itself. Finally, this FIS was extended to an interval type-2 fuzzy inference system (IT2FIS). Even though the results obtained are not statistically conclusive, the hybridization in this new model generated favorable results under certain conditions. The results obtained with this new model also depend on the complexity of the datasets used.

This research work was partially funded by CONACYT and Tijuana Institute of Technology, and we would like to express our gratitude to both institutions. In addition, we would like to thank Prof. Janusz Kacprzyk for always supporting and encouraging us to perform good research in the computational intelligence area.

Tijuana, Mexico, November 2017
Dr. Jonathan Amezcua
Prof. Patricia Melin
Prof. Oscar Castillo

Contents

1 Introduction
  References
2 Theory and Background
  2.1 Artificial Neural Networks
  2.2 History of Artificial Neural Networks
  2.3 Neural Networks Architecture
    2.3.1 Input Function
    2.3.2 Activation Function
    2.3.3 Output Function
    2.3.4 Learning
  2.4 Supervised Learning Neural Networks
    2.4.1 Perceptron
    2.4.2 Multilayer Perceptron
    2.4.3 MLPs Backpropagation Algorithm
  2.5 Unsupervised Learning Neural Networks
    2.5.1 Competitive Learning
    2.5.2 Learning Vector Quantization
  2.6 Modular Neural Networks
    2.6.1 Characteristics of Modular Neural Networks
  2.7 Fuzzy Inference Systems
    2.7.1 Fuzzy Sets
    2.7.2 Membership Functions
    2.7.3 Fuzzy If-Then Rules
    2.7.4 Components of a Fuzzy Inference System
  2.8 Interval Type-2 Fuzzy Inference Systems
  References
3 Problem Statement
  3.1 Datasets
    3.1.1 Arrhythmia Dataset
    3.1.2 Satellite Images Dataset
  References
4 Proposed Classification Method
  4.1 Fuzz LVQ
  4.2 Model Architectures
    4.2.1 Data Similarity Process
    4.2.2 Model Architectures for the Arrhythmia Dataset
    4.2.3 Model Architectures for the Satellite Images Dataset
  References
5 Simulation Results
  5.1 Arrhythmia Dataset Methods Description
    5.1.1 Arrhythmia Dataset Simulation Results
    5.1.2 Arrhythmia Dataset Statistical Analysis
  5.2 Satellite Images Dataset Methods Description
    5.2.1 Satellite Images Dataset Simulation Results
    5.2.2 Satellite Images Dataset Statistical Analysis
  Reference
6 Conclusions
  6.1 Future Work
  Reference
Appendix
Index

Chapter 1
Introduction

A classification problem consists of categorizing an object based on certain attributes, with the aim of identifying the class to which it belongs. For instance, a fruit could be classified based on its size, color, or shape, and the same holds for an automobile, a flower, or an animal.
All these objects have their own attributes, and which attributes are considered for classifying an object (or event) depends on the problem at hand. For example, a heart disease could be classified using data obtained from a Holter device, and a tumor or a cancer cell could be classified based on image data.

The list of classification problems is endless, and many classification algorithms have emerged [1, 2] to solve most of these kinds of problems. Most of these algorithms work with feature vectors, which describe the attributes of the objects so that they can be learned by the classification algorithm. Depending on the algorithm, the features in these vectors can be binary, real-valued, categorical, etc. For instance, to classify a tumor based on an image, the feature vector would be composed of the pixel values of the image. According to [3], the classification process is composed of four basic components:

• Class, represented by a label, which is assigned to the object after its classification.
• Attributes of the object to be classified (defined in the feature vectors).
• Training dataset, which is used for training the classification model to recognize the appropriate class based on the available attributes.
• Testing dataset, containing the new data that should be classified by the classification model.

© The Author(s), under exclusive licence to Springer International Publishing AG, part of Springer Nature 2018
J. Amezcua et al., New Classification Method Based on Modular Neural Networks with the LVQ Algorithm and Type-2 Fuzzy Logic, SpringerBriefs in Computational Intelligence, https://doi.org/10.1007/978-3-319-73773-7_1
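As an illustration of the four components of the classification process (class, attributes, training dataset, testing dataset), the following minimal Python sketch uses a nearest-centroid rule as a stand-in classifier. It is not the method proposed in this book. The fruit samples, the two features (diameter and redness), and the function names train_centroids and classify are assumptions made only for this example.

```python
import math

# Toy data, invented for illustration: each sample is (feature_vector, class_label).
# Hypothetical features: [diameter_cm, redness in the range 0..1].
training_set = [
    ([7.0, 0.9], "apple"),
    ([7.5, 0.8], "apple"),
    ([3.0, 0.2], "lime"),
    ([3.5, 0.1], "lime"),
]
testing_set = [
    ([7.2, 0.85], "apple"),
    ([3.2, 0.15], "lime"),
]


def train_centroids(samples):
    """Average the feature vectors of each class into one centroid per class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest (Euclidean distance) to features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Training uses only the training dataset; evaluation uses only the testing dataset.
centroids = train_centroids(training_set)
predictions = [classify(centroids, features) for features, _ in testing_set]
correct = sum(p == label for p, (_, label) in zip(predictions, testing_set))
accuracy = correct / len(testing_set)
print(predictions, accuracy)
```

The labels play the role of the class, each list of numbers is a feature vector holding the attributes, and the model is fitted on the training dataset and then evaluated on the separate testing dataset. A nearest-centroid rule was chosen here only because it is short; any other classifier could take its place.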