Table Of ContentStudies in Computational Intelligence 712
Sankar K. Pal
Shubhra S. Ray
Avatharam Ganivada
Granular Neural
Networks,
Pattern
Recognition and
Bioinformatics
Studies in Computational Intelligence
Volume 712
Series editor
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: kacprzyk@ibspan.waw.pl
About this Series
The series “Studies in Computational Intelligence” (SCI) publishes new develop-
mentsandadvancesinthevariousareasofcomputationalintelligence—quicklyand
with a high quality. The intent is to cover the theory, applications, and design
methods of computational intelligence, as embedded in the fields of engineering,
computer science, physics and life sciences, as well as the methodologies behind
them. The series contains monographs, lecture notes and edited volumes in
computational intelligence spanning the areas of neural networks, connectionist
systems, genetic algorithms, evolutionary computation, artificial intelligence,
cellular automata, self-organizing systems, soft computing, fuzzy systems, and
hybrid intelligent systems. Of particular value to both the contributors and the
readership are the short publication timeframe and the worldwide distribution,
which enable both wide and rapid dissemination of research output.
More information about this series at http://www.springer.com/series/7092
Sankar K. Pal Shubhra S. Ray
(cid:129)
Avatharam Ganivada
Granular Neural Networks,
Pattern Recognition
and Bioinformatics
123
Sankar K.Pal Avatharam Ganivada
Centerfor Soft Computing Research Centerfor Soft Computing Research
Indian Statistical Institute Indian Statistical Institute
Kolkata Kolkata
India India
Shubhra S.Ray
Centerfor Soft Computing Research
Indian Statistical Institute
Kolkata
India
ISSN 1860-949X ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN978-3-319-57113-3 ISBN978-3-319-57115-7 (eBook)
DOI 10.1007/978-3-319-57115-7
LibraryofCongressControlNumber:2017937261
©SpringerInternationalPublishingAG2017
Thisworkissubjecttocopyright.AllrightsarereservedbythePublisher,whetherthewholeorpart
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
orinformationstorageandretrieval,electronicadaptation,computersoftware,orbysimilarordissimilar
methodologynowknownorhereafterdeveloped.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publicationdoesnotimply,evenintheabsenceofaspecificstatement,thatsuchnamesareexemptfrom
therelevantprotectivelawsandregulationsandthereforefreeforgeneraluse.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authorsortheeditorsgiveawarranty,expressorimplied,withrespecttothematerialcontainedhereinor
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictionalclaimsinpublishedmapsandinstitutionalaffiliations.
Printedonacid-freepaper
ThisSpringerimprintispublishedbySpringerNature
TheregisteredcompanyisSpringerInternationalPublishingAG
Theregisteredcompanyaddressis:Gewerbestrasse11,6330Cham,Switzerland
To our parents
Preface
The volume “Granular Neural Networks, Pattern Recognition and Bioinformatics”
isanoutcomeofthegranularcomputingresearchinitiatedin2005attheCenterfor
Soft Computing Research: A National Facility, Indian Statistical Institute (ISI),
Kolkata. The center was established in 2005 by the Department of Science and
Technology, Govt. of India under its prestigious IRHPA (Intensification of
Research in High Priority Area) program. Now it is an Affiliated Institute of ISI.
Granulation is a process like self-production, self-organization, functioning of
brain, Darwinian evolution, group behavior and morphogenesis—which are
abstracted from natural phenomena. Accordingly, it has become a component of
natural computing. Granulation is inherent in human thinking and reasoning pro-
cess,andplaysanessentialroleinhumancognition.Granularcomputing(GrC)isa
problem-solving paradigm dealing with the basic elements, called granules.
Agranulemaybedefinedastheclumpofindistinguishableelementsthataredrawn
together, for example, by indiscernibility, similarly, proximity or functionality.
Granules with different levels of granularity, as determined by its size and shape,
may represent a system differently. Since in GrC, computations are performed on
granules,ratherthanonindividualdatapoints,computationtimeisgreatlyreduced.
ThismadeGrCaveryusefulframeworkfor designing scalablepattern recognition
and data mining algorithms for handling large data sets.
Thetheoryofroughsetsthatdealswithaset(concept)definedoveragranulated
domain provides an effective tool for extracting knowledge from databases. Two
oftheimportant characteristics of thistheorythat drew theattentionof researchers
inpatternrecognitionanddecisionscienceareitscapabilityofuncertaintyhandling
andgranularcomputing.Whiletheconceptofgranularcomputingisinherentinthis
theory where the granules are defined by equivalence relations, uncertainty arising
from the indiscernibility in the universe of discourse can be handled using the
concept of lower and upper approximations of the set. Lower and upper approxi-
mate regions respectively denote the granules which definitely, and definitely and
possiblybelongtotheset.Inreal-lifeproblemsthesetandgranules,eitherorboth,
could be fuzzy; thereby resulting in fuzzy-lower and fuzzy-upper approximate
regions, characterized by membership functions.
vii
viii Preface
Granular neural networks described in the present book are pivoted on the
characteristics of lower approximate regions of classes demonstrating its signifi-
cance. The basic principle of design is—detect lower approximations of classes
(regions where the class belonging of samples is certain); find class information
granules, called knowledge; form basic networks based on those information, i.e.,
by knowledge encoding; and then grow the network with samples belonging to
upperapproximateregions(i.e.,samplesofpossibleaswellasdefinitebelonging).
Informationgranulesconsideredarefuzzytodealwithreal-lifeproblems.Theclass
boundaries generated in this way provide optimum error rate. The networks thus
developedarecapableofefficientandspeedylearningwithenhancedperformance.
These systems have a strong promise to Big data analysis.
The volume, consisting of seven chapters, provides a treatise in a unified
framework in this regard, and describes how fuzzy rough granular neural network
technologies can be judiciously formulated and used in building efficient pattern
recognition and mining models. Formation of granules in the notion of both fuzzy
and rough sets is stated. Judicious integration in forming fuzzy-rough information
granules based on lower approximate regions enables the network in determining
the exactness in class shape as well as handling the uncertainties arising from
overlapping regions. Layered network and self-organizing map are considered as
basic networks.
Basedontheexistingaswellasnewresults,thebookisstructuredaccordingto
themajorphasesofapatternrecognitionsystem(e.g.,classification,clustering,and
feature selection) with a balanced mixture of theory, algorithm and application.
Chapter 1 introduces granular computing, pattern recognition and data mining for
the convenience of readers. Beginning with the concept of natural computing, the
chapter describes in detail the various characteristics and facets of granular com-
puting, granular information processing aspects of natural computing, its different
components such as fuzzy sets, rough sets and artificial networks, relevance of
granular neural networks, different integrated granular information processing
systems, and finally the basic components of pattern recognition and data mining,
andbigdataissues.Chapter2dealswithclassificationtask,Chaps.3and5address
clustering problems, and Chap. 4 describes feature selection methodologies, all
from the point of designing fuzzy rough granular neural network models. Special
emphasis has been given to dealing with problems in bioinformatics, e.g., gene
analysis and RNA secondary structure prediction, with a possible use of the
granular computing paradigm. These are described in Chaps. 6 and 7 respectively.
New indices for cluster evaluation and gene ranking are defined. Extensive
experimental results have been provided to demonstrate the salient characteristics
of the models.
Most of the texts presented in this book are from our published research work.
The related and relevant existing approaches or techniques are included wherever
necessary. Directions for future research in the concerned topic are provided.
A comprehensive bibliography on the subject is appended in each chapter, for the
convenienceofreaders.Referencestosomeofthestudiesintherelatedareasmight
have been omitted because of oversight or ignorance.
Preface ix
The book, which is unique in its character, will be useful to graduate students
and researchers in computer science, electrical engineering, system science, data
science, medical science, bioinformatics and information technology both as a
textbookandareferencebookforsomepartsofthecurriculum.Theresearchersand
practitioners in industry and R&D laboratories working in the fields of system
design, pattern recognition, big data analytics, image analysis, data mining, social
network analysis, computational biology, and soft computing or computational
intelligence will also be benefited.
Thanks to the co-authors, Dr. Avatharam Ganivada for generating various new
ideasindesigninggranularnetworkmodelsandDr.ShubhraS.Rayforhisvaluable
contributions to bioinformatics. It is the untiring hard work and dedication of
Avatharamduringthelasttendaysthatmadeitpossibletocompletethemanuscript
and submit to Springer in time.
We take this opportunity to acknowledge the appreciation of Prof. Janusz
KacprzykinacceptingthebooktopublishundertheSCI(StudiesinComputational
Intelligence) series of Springer, and Prof. Andrzej Skowron, Warsaw University,
Poland for his encouragement and support in the endeavour. We owe a vote of
thankstoDr.ThomasDitzingerandDr.LavanyaDiazofSpringerforcoordinating
the project, as well as the office staff of our Soft Computing Research Center for
theirsupport.ThebookwaswrittenwhenProf.S.K.PalheldJ.C.BoseFellowship
and Raja Ramanna Fellowship of the Govt. of India.
Kolkata, India Sankar K. Pal
January 2017 Principal Investigator
Center for Soft Computing Research
Indian Statistical Institute
Contents
1 Introduction to Granular Computing, Pattern Recognition
and Data Mining.... .... ..... .... .... .... .... .... ..... .... 1
1.1 Introduction ... .... ..... .... .... .... .... .... ..... .... 1
1.2 Granular Computing. ..... .... .... .... .... .... ..... .... 2
1.2.1 Granules .... ..... .... .... .... .... .... ..... .... 2
1.2.2 Granulation.. ..... .... .... .... .... .... ..... .... 3
1.2.3 Granular Relationships .. .... .... .... .... ..... .... 3
1.2.4 Computation with Granules... .... .... .... ..... .... 4
1.3 Granular Information Processing Aspects of Natural
Computing .... .... ..... .... .... .... .... .... ..... .... 4
1.3.1 Fuzzy Set ... ..... .... .... .... .... .... ..... .... 4
1.3.2 Rough Set... ..... .... .... .... .... .... ..... .... 8
1.3.3 Fuzzy Rough Sets.. .... .... .... .... .... ..... .... 14
1.3.4 Artificial Neural Networks ... .... .... .... ..... .... 15
1.4 Integrated Granular Information Processing Systems . ..... .... 21
1.4.1 Fuzzy Granular Neural Network Models. .... ..... .... 21
1.4.2 Rough Granular Neural Network Models .... ..... .... 22
1.4.3 Rough Fuzzy Granular Neural Network Models.... .... 23
1.5 Pattern Recognition . ..... .... .... .... .... .... ..... .... 23
1.5.1 Data Acquisition... .... .... .... .... .... ..... .... 24
1.5.2 Feature Selection/Extraction .. .... .... .... ..... .... 25
1.5.3 Classification. ..... .... .... .... .... .... ..... .... 26
1.5.4 Clustering ... ..... .... .... .... .... .... ..... .... 27
1.6 Data Mining and Soft Computing.... .... .... .... ..... .... 29
1.7 Big Data Issues. .... ..... .... .... .... .... .... ..... .... 30
1.8 Scope of the Book .. ..... .... .... .... .... .... ..... .... 31
References.. .... .... .... ..... .... .... .... .... .... ..... .... 34
xi