Petra Perner (Ed.)

Advances in Data Mining
Applications and Theoretical Aspects

13th Industrial Conference, ICDM 2013
New York, NY, USA, July 16-21, 2013
Proceedings

Lecture Notes in Artificial Intelligence 7987
Subseries of Lecture Notes in Computer Science

LNAI Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Yuzuru Tanaka, Hokkaido University, Sapporo, Japan
Wolfgang Wahlster, DFKI and Saarland University, Saarbrücken, Germany

LNAI Founding Series Editor
Joerg Siekmann, DFKI and Saarland University, Saarbrücken, Germany

Volume Editor
Petra Perner
Institute of Computer Vision and Applied Computer Sciences, IBaI
Kohlenstraße 2
04107 Leipzig, Germany
E-mail: [email protected]

ISSN 0302-9743, e-ISSN 1611-3349
ISBN 978-3-642-39735-6, e-ISBN 978-3-642-39736-3
DOI 10.1007/978-3-642-39736-3
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2013943124
CR Subject Classification (1998): I.2.6, I.2, H.2.8, J.3, H.3, I.4-5, J.1
LNCS Sublibrary: SL 7 – Artificial Intelligence

© Springer-Verlag Berlin Heidelberg 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The 13th event of the Industrial Conference on Data Mining ICDM was held in New York (www.data-mining-forum.de), running under the umbrella of the World Congress “The Frontiers in Intelligent Data and Signal Analysis, DSA 2013.”

For this edition, the Program Committee received 112 submissions. After the peer-review process, we accepted 33 high-quality papers for oral presentation, of which 22 are included in this proceedings book. The topics range from theoretical aspects of data mining to applications of data mining, such as in multimedia data, in marketing, finance and telecommunication, in medicine and agriculture, and in process control, industry and society. Extended versions of selected papers will appear in the international journal Transactions on Machine Learning and Data Mining (www.ibai-publishing.org/journal/mldm).

In all, 30 papers were selected for poster presentations and six for industry paper presentations; these are published in the ICDM Poster and Industry Proceedings by ibai-publishing (www.ibai-publishing.org).
In conjunction with ICDM, four workshops were run focusing on special hot application-oriented topics in data mining: the Workshop on Case-Based Reasoning (CBR-MD), Data Mining in Marketing (DMM), and the Workshop on Data Mining in Agriculture (DMA). All workshop papers are published in the workshop proceedings by ibai-publishing (www.ibai-publishing.org).

A tutorial on Data Mining, a tutorial on Case-Based Reasoning, a tutorial on Intelligent Image Interpretation and Computer Vision in Medicine, Biotechnology, Chemistry & Food Industry, a tutorial on Big Data and Text Analysis, and a tutorial on Standardization in Immunofluorescence were held before the conference.

We were pleased to give out the best paper award for ICDM for the seventh time this year. There are four announcements mentioned at www.data-mining-forum.de. The final decision was made by the Best Paper Award Committee based on the presentation by the authors and the discussion with the audience. The ceremony took place at the end of the conference. This prize is sponsored by ibai solutions (www.ibai-solutions.de), one of the leading companies in data mining for marketing, Web mining and e-commerce.

The conference was rounded off by an outlook session on new challenging topics in data mining before the Best Paper Award Ceremony.

We would like to thank all reviewers for their highly professional work and their effort in reviewing the papers.

We also thank the members of the Institute of Applied Computer Sciences, Leipzig, Germany (www.ibai-institut.de), who handled the conference as secretariat. We appreciate the help and understanding of the editorial staff at Springer Verlag, and in particular Alfred Hofmann, who supported the publication of these proceedings in the LNAI series.
Last, but not least, we wish to thank all the speakers and participants who contributed to the success of the conference. We hope to see you in 2014 in Sankt Petersburg at the next World Congress “The Frontiers in Intelligent Data and Signal Analysis, DSA 2014” (www.worldcongressdsa.com), which combines under its roof the following three events: the International Conference on Machine Learning and Data Mining MLDM, the Industrial Conference on Data Mining ICDM, and the International Conference on Mass Data Analysis of Signals and Images in Medicine, Biotechnology, Chemistry and Food Industry MDA.

July 2013
Petra Perner

Organization

Chair
Petra Perner, IBaI, Leipzig, Germany

Program Committee
Ajith Abraham, Machine Intelligence Research Labs, USA
Andrea Ahlemeyer-Stubbe, ENBIS, Amsterdam, The Netherlands
Eva Armengol, IIA CSIC, Spain
Brigitte Bartsch-Spörl, BSR Consulting GmbH, Germany
Orlando Belo, University of Minho, Portugal
Isabelle Bichindaritz, State University of New York, USA
Leon Bobrowski, Bialystok Technical University, Poland
Marc Boullé, France Télécom, France
Shirley Coleman, University of Newcastle, UK
Juan M. Corchado, Universidad de Salamanca, Spain
Antonio Dourado, University of Coimbra, Portugal
Jeroen de Bruin, Medical University of Vienna, Austria
Peter Funk, Mälardalen University, Sweden
Geert Gins, KU Leuven, Belgium
Warwick Graco, ATO, Australia
Osman Hegazy, Cairo University, Egypt
Gary F. Holness, Delaware State University, USA
Pedro Isaias, Universidade Aberta, Portugal
Piotr Jedrzejowicz, Gdynia Maritime University, Poland
Martti Juhola, University of Tampere, Finland
Janusz Kacprzyk, Polish Academy of Sciences, Poland
Mineichi Kudo, Hokkaido University, Japan
Mehmed Kantardzic, University of Louisville, USA
David Manzano Macho, Ericsson Research Spain, Spain
Dunja Mladenic, Jozef Stefan Institute, Slovenia
Eduardo F. Morales, INAOE, Ciencias Computacionales, Mexico
Stefania Montani, Università del Piemonte Orientale, Italy
Jerry Oglesby, SAS Institute Inc., USA
Wieslaw Paja, University of Information Technology and Management in Rzeszow, Poland
Eric Pauwels, CWI Amsterdam, The Netherlands
Mykola Pechenizkiy, Eindhoven University of Technology, The Netherlands
Jonas Poelmans, KU Leuven, Belgium
Georg Ruß, Otto-von-Guericke-Universität Magdeburg, Germany
Rainer Schmidt, University of Rostock, Germany
Kaoru Shimada, Fukuoka Dental College, Japan
Yanbo J. Wang, China Minsheng Banking Corporation Ltd., China
Claus Weihs, University of Dortmund, Germany
Yong Zheng, DePaul University, USA

Table of Contents

Mining and Information Integration Practice for Chinese Bibliographic Database of Life Sciences
    Heng Chen, Yi Jin, Yan Zhao, Yongjuan Zhang, Chengcai Chen, Jilin Sun, and Shen Zhang

An Automated Search Space Reduction Methodology for Large Databases
    Angel Fernando Kuri-Morales

Towards a High Productivity Automatic Analysis Framework for Classification: An Initial Study
    Thomas Ludescher, Thomas Feilhauer, Anton Amann, and Peter Brezany

Extending Statistical Models for Batch-End Quality Prediction to Batch Control
    Geert Gins, Jef Vanlaer, Pieter Van den Kerkhof, and Jan F.M. Van Impe

Pattern-Based Solution Risk Model for Strategic IT Outsourcing
    Robert Gwadera

Mining Semantic Relationships between Concepts across Documents Incorporating Wikipedia Knowledge
    Peng Yan and Wei Jin

Estimating Risk Management in Software Engineering Projects
    Jaime Santos and Orlando Belo

Wastewater Treatment Plant Performance Prediction with Support Vector Machines
    Daniel Ribeiro, António Sanfins, and Orlando Belo

Mining Floating Train Data Sequences for Temporal Association Rules within a Predictive Maintenance Framework
    Wissam Sammouri, Etienne Côme, Latifa Oukhellou, and Patrice Aknin

Online Shopping Customer Data Analysis by Using Association Rules and Cluster Analysis
    Serhat Güden and Umman Tuğba Gursoy

A Study on Multi-label Classification
    Clifford A. Tawiah and Victor S. Sheng

Robust Feature Selection for SVMs under Uncertain Data
    Hoai An Le Thi, Xuan Thanh Vo, and Tao Pham Dinh

A Hybrid Machine Learning Method and Its Application in Municipal Waste Prediction
    Emadoddin Livani, Raymond Nguyen, Jörg Denzinger, Günther Ruhe, and Scott Banack

BiETopti-BiClustering Ensemble Using Optimization Techniques
    Geeta Aggarwal and Neelima Gupta

Multiple Buying Behavior as an Indicator of Brand Loyalty: An Association Rule Application
    Diren Bulut, Umman Tuğba Gursoy, and Kemal Kurtulus

Matching Semi-structured Documents Using Similarity of Regions through Fuzzy Rule-Based System
    Alireza Ensan and Yevgen Biletskiy

Data Mining Application for Cyber Credit-Card Fraud Detection System
    John Akhilomen

Feature Representation for Customer Attrition Risk Prediction in Retail Banking
    Yanbo J. Wang, Gang Di, Junxuan Yu, Juan Lei, and Frans Coenen

An Evolutionary Method for Associative Local Distribution Rule Mining
    Kaoru Shimada and Takashi Hanioka

Application of Data Mining Techniques on EMG Registers of Hemiplegic Patients
    Ana Aguilera, Alberto Subero, and Ramón Mata-Toledo

Configurations and Couplings: An Exploratory Study
    Warwick Graco and Hari Koesmarno

Author Index