Lecture Notes in Computer Science 7255
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Moshe Y. Vardi, Rice University, Houston, TX, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany

James F. Peters, Andrzej Skowron (Eds.)
Transactions on Rough Sets XV

Volume Editors
James F. Peters, University of Manitoba, Winnipeg, MB, Canada. E-mail: [email protected]
Andrzej Skowron, University of Warsaw, Poland. E-mail: [email protected]

ISSN 0302-9743 (LNCS), e-ISSN 1611-3349 (LNCS)
ISSN 1861-2059 (TRS), e-ISSN 1861-2067 (TRS)
ISBN 978-3-642-31902-0, e-ISBN 978-3-642-31903-7
DOI 10.1007/978-3-642-31903-7
Springer Heidelberg Dordrecht London New York
CR Subject Classification (1998): I.5.1-3, I.2.4, I.2.6, I.2, F.4.1, G.1.2, I.4, H.3

© Springer-Verlag Berlin Heidelberg 2012
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Volume XV of the Transactions on Rough Sets (TRS XV) offers a number of research streams that have grown out of the seminal work by Zdzisław Pawlak¹ during the first decade of the twenty-first century.

These research streams include work on a promising rough set approach in machine learning by A. Janusz, the introduction of multi-valued near set theory by M.E. Abd El-Monsef, H.M. Abu-Donia and E.A. Marei, the advent of a complete system that supports a rough-near set approach to digital image analysis by C.J. Henry, and an exhaustive study of the mathematics of vagueness by A. Mani.

The first of these research streams focuses on the dynamic rule-based similarity (DRBS) framework, an extension of the rule-based similarity (RBS) model by A. Janusz. RBS is an extension of the A. Tversky feature contrast model, where an object is represented by a set of features, object comparison depends on a feature matching function, and the representation of similarity is based on the contrast of the measures of object features². DRBS represents a significant step forward in machine learning inasmuch as DRBS makes it possible to learn a similarity relation from high-dimensional data. A significant application of DRBS is in DNA microarray data mining.

The second of the research streams, represented by M.E. Abd El-Monsef, H.M. Abu-Donia and E.A. Marei in TRS XV, considers the nearness of objects in terms of an extended approximation space model³ and a new approach to near sets based on several types of neighborhoods that takes its cue from topological rough sets⁴.
The main results in this paper are that a right (left) lower neighborhood coverage is near the corresponding right (left) upper neighborhood coverage and that topologies are generated from families of neighborhoods.

The third research stream in TRS XV is represented by C.J. Henry’s near set evaluation and recognition (NEAR) system. It can be observed that the approach to describing the nearness of objects in terms of feature vectors is actually an alternative to A. Tversky’s view of object similarity defined by sets of features in representing objects. Henry carries forward the feature vector approach to describing objects with the introduction of visual rough sets and an approach to measuring the similarity of disjoint rough sets. Henry’s proposed approach is useful in digital image analysis as well as in content-based image retrieval.

¹ See, e.g., Pawlak, Z., A Treatise on Rough Sets, Transactions on Rough Sets IV, (2006), 1-17. See, also, Pawlak, Z., Skowron, A.: Rudiments of rough sets, Information Sciences 177 (2007) 3-27; Pawlak, Z., Skowron, A.: Rough sets: Some extensions, Information Sciences 177 (2007) 28-40; Pawlak, Z., Skowron, A.: Rough sets and Boolean reasoning, Information Sciences 177 (2007) 41-73.
² A. Tversky, Features of similarity, Psych. Review 84 (1977), 327–352, especially A. Tversky, D.H. Krantz, The dimensional representation and metric structure of similarity data, J. Math. Psych. 7 (1970), 572-597.
³ J.F. Peters, A. Skowron, J. Stepaniuk, Nearness of objects: Extension of approximation space model, Fund. Info. 79 (/4) (2007), 497-512.
⁴ A. Wiweger, On topological rough sets, Bull. Pol. Akad., Math. 37 (1989), 89-93.

The fourth research stream is represented by A. Mani in TRS XV in a study of the mathematics of vagueness. Mani introduces a structure called a rough Y-system that captures a minimum common fragment of different rough set theories. The article by Mani in this volume is broad in scope inasmuch as it considers the category Rough, Y.
Yao’s information granule model, contamination of object perception by meta-level consideration of objects of all types vs. classes of equivalent objects, objectivity in the computation by rough inclusion methods, axiomatic approach to information granules, various forms of rough set theory considered relative discernibility, and classification of rough set theories.

The editors of this special issue would like to express gratitude to the authors of all submitted papers. Special thanks are due to the following reviewers: Mohua Banerjee, Jan Bazan, Jerzy Grzymała-Busse, Davide Ciucci, Ivo Düntsch, Homa Fashandi, Anna Gomolińska, Christopher Henry, Jouni Järvinen, Andrzej Janusz, Marcin Wolski, Wei-Zhi Wu and Wojciech Ziarko.

The editors and authors of this volume extend their gratitude to Alfred Hofmann, Anna Kramer, Ursula Barth, Christine Reiss, and the LNCS staff at Springer for their support in making this volume of the TRS possible.

The Editors-in-Chief were supported by the State Committee for Scientific Research of the Republic of Poland (KBN) research grant N N516 077837, grant 2011/01/D/ST6/06981 from the Polish National Science Centre, the Polish National Centre for Research and Development (NCBiR) under grant SP/I/1/77065/10 by the strategic scientific research and experimental development program: “Interdisciplinary System for Interactive Scientific and Scientific-Technical Information,” an individual research grant by the program Homing Plus, edition 3/2011, from the Foundation for Polish Science and the Natural Sciences and Engineering Research Council of Canada (NSERC) research grant 185986, Canadian Network of Excellence (NCE), and a Canadian Arthritis Network (CAN) grant SRI-BIO-05.

March 2012
James F. Peters
Andrzej Skowron

LNCS Transactions on Rough Sets

The Transactions on Rough Sets series has as its principal aim the fostering of professional exchanges between scientists and practitioners who are interested in the foundations and applications of rough sets.
Topics include foundations and applications of rough sets as well as foundations and applications of hybrid methods combining rough sets with other approaches important for the development of intelligent systems. The journal includes high-quality research articles accepted for publication on the basis of thorough peer reviews. Dissertations and monographs up to 250 pages that include new research results can also be considered as regular papers. Extended and revised versions of selected papers from conferences can also be included in regular or special issues of the journal.

Editors-in-Chief: James F. Peters, Andrzej Skowron
Managing Editor: Sheela Ramanna
Technical Editor: Marcin Szczuka

Editorial Board
Mohua Banerjee, Jan Bazan, Gianpiero Cattaneo, Mihir K. Chakraborty, Davide Ciucci, Chris Cornelis, Ivo Düntsch, Anna Gomolińska, Salvatore Greco, Jerzy W. Grzymała-Busse, Masahiro Inuiguchi, Jouni Järvinen, Richard Jensen, Bożena Kostek, Churn-Jung Liau, Pawan Lingras, Victor Marek, Mikhail Moshkov, Hung Son Nguyen, Ewa Orłowska, Sankar K. Pal, Lech Polkowski, Henri Prade, Sheela Ramanna, Roman Słowiński, Jerzy Stefanowski, Jarosław Stepaniuk, Zbigniew Suraj, Marcin Szczuka, Dominik Ślęzak, Roman Świniarski, Shusaku Tsumoto, Guoyin Wang, Marcin Wolski, Wei-Zhi Wu, Yiyu Yao, Ning Zhong, Wojciech Ziarko

Table of Contents

Dynamic Rule-Based Similarity Model for DNA Microarray Data
  Andrzej Janusz
Multi-valued Approach to Near Set Theory
  M.E. Abd El-Monsef, H.M. Abu-Donia, and E.A. Marei
Perceptual Indiscernibility, Rough Sets, Descriptively Near Sets, and Image Analysis
  Christopher J. Henry
Dialectics of Counting and the Mathematics of Vagueness
  A. Mani
Author Index

Dynamic Rule-Based Similarity Model for DNA Microarray Data

Andrzej Janusz
Faculty of Mathematics, Informatics, and Mechanics, The University of Warsaw, Banacha 2, 02-097 Warszawa, Poland
[email protected]

Abstract. Rules-based Similarity (RBS) is a framework in which concepts from rough set theory are used for learning a similarity relation from data. This paper presents an extension of RBS called the Dynamic Rules-based Similarity (DRBS) model, which is designed to boost the quality of the learned relation in the case of high-dimensional data. Rules-based Similarity utilizes the notion of a reduct to construct new features which can be interpreted as important aspects of a similarity in the classification context. Having defined such features, it is possible to utilize the idea of Tversky’s feature contrast similarity model in order to design an accurate and psychologically plausible similarity relation for a given domain of objects. DRBS tries to incorporate a broader array of aspects of the similarity into the model by constructing many heterogeneous sets of features from multiple decision reducts. To ensure diversity, the reducts are computed on random subsets of objects and attributes. This approach is particularly well-suited for dealing with the “few-objects-many-attributes” problem, as in the mining of DNA microarray data. The induced similarity relation and the resulting similarity function can be used to perform an accurate classification of previously unseen objects in a case-based fashion. Experiments, whose results are also presented in the paper, show that the proposed model can successfully compete with other state-of-the-art algorithms such as Random Forest or SVM.

1 Introduction

A notion of similarity plays an important role in both rough set theory and data analysis in general. Since their introduction in [1], rough sets have been used in conjunction with the concept of similarity and numerous similarity functions to perform classification (e.g. [2], [3]) or clustering of data (e.g. [4], [5]).
The similarity itself has been used for generalization of rough sets by defining more natural lower and upper approximations ([2], [6], [7]). It has also been applied in the process of data granulation for the purpose of granular computing approaches ([8]). In statistical learning, clustering algorithms group together objects that are similar in some sense. For this purpose, many methods make use of predefined similarity models (e.g. distance-based similarity as in the classic k-means algorithm). Moreover, in domains such as information retrieval or case-based reasoning the concept of similarity is essential, as it is used in every phase of the case-based reasoning cycle ([9]).

J.F. Peters and A. Skowron (Eds.): Transactions on Rough Sets XV, LNCS 7255, pp. 1–25, 2012.
© Springer-Verlag Berlin Heidelberg 2012

In most of those examples, the utilized models of the similarity are given a priori and their properties are dictated by the design. However, researchers who investigated human perception of similar objects noticed that in real life the features of the similarity are strongly dependent on a domain of considered objects and also on a context ([10], [11]). Due to this fact, when confronted with many real-life problems, the similarity model cannot be given by an expert but has to be learned from available data ([12]).

There have been many attempts to develop a similarity learning model that would fit a wide range of applications. Among them, a huge share consists of methods that try to optimize weights of distance-based local similarities and aggregate them into a global similarity function ([13], [14], [15], [16], [17]). Although the distance-based similarity learning models differ in the way they hone a similarity function, they all enforce certain properties on the resulting similarity relation (such as symmetry or the triangle inequality), and that may be undesirable.
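
To make the point about enforced properties concrete, the following is a minimal sketch of a distance-based global similarity built from weighted local similarities. It is an illustration under stated assumptions, not a method taken from [13]-[17]: the exponential local similarity, the function names, and the weight and scale values are all hypothetical.

```python
import math


def local_similarity(a, b, scale):
    """Hypothetical distance-based local similarity on one attribute, in (0, 1]."""
    return math.exp(-abs(a - b) / scale)


def global_similarity(x, y, weights, scales):
    """Weighted average of local similarities.

    In a similarity learning setting the weights would be the tuned part;
    here they are fixed, made-up values.
    """
    total = sum(weights)
    weighted = sum(
        w * local_similarity(a, b, s)
        for a, b, w, s in zip(x, y, weights, scales)
    )
    return weighted / total


# Two objects described by three numeric attributes (made-up values).
x = [1.0, 5.0, 0.2]
y = [2.0, 3.0, 0.9]
weights = [0.5, 0.3, 0.2]  # assumed "learned" weights
scales = [1.0, 2.0, 0.5]   # assumed per-attribute scales

print(global_similarity(x, y, weights, scales))
# Symmetry holds for any choice of weights, because |a - b| == |b - a|.
print(global_similarity(x, y, weights, scales) == global_similarity(y, x, weights, scales))
```

Whatever weights are chosen, swapping the two objects cannot change the result, which is exactly the kind of built-in property the paragraph above describes.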

The Rules-based Similarity (RBS) model was developed as an alternative to the distance-based approaches ([18]). It may be seen as a rough set extension to the psychologically plausible feature contrast model (see [10]) proposed by Tversky. He argued that distance metrics are not suitable for modeling human perception of resemblance since they enforce some undesirable properties and they neglect the context in which the objects are compared. Instead, he proposed to measure the similarity by examining whether the objects share some binary features.

Fig. 1. A general construction schema of the RBS model

In RBS, a similarity between two objects in a context set by a decision attribute is also expressed in terms of their common and distinctive features. Those features correspond to higher-level characteristics of examined samples and are automatically derived from data using a decision rule mining algorithm. This approach is different from other rough set case-based classification models (e.g. [2], [3]) as it does not need to consider all pairs of available training samples and it does not assume the existence of any predefined local similarity measures. Experiments conducted with the use of the original RBS showed that it can