Studies in Classification, Data Analysis, and Knowledge Organization

Managing Editors
H.-H. Bock, Aachen
W. Gaul, Karlsruhe
M. Vichi, Rome

Editorial Board
Ph. Arabie, Newark
D. Baier, Cottbus
F. Critchley, Milton Keynes
R. Decker, Bielefeld
E. Diday, Paris
M. Greenacre, Barcelona
C.N. Lauro, Naples
J. Meulman, Leiden
P. Monari, Bologna
S. Nishisato, Toronto
N. Ohsumi, Tokyo
O. Opitz, Augsburg
G. Ritter, Passau
M. Schader, Mannheim
C. Weihs, Dortmund

For further volumes: http://www.springer.com/series/1564

Wolfgang Gaul, Andreas Geyer-Schulz, Lars Schmidt-Thieme, Jonas Kunze
Editors

Challenges at the Interface of Data Analysis, Computer Science, and Optimization

Proceedings of the 34th Annual Conference of the Gesellschaft für Klassifikation e. V., Karlsruhe, July 21-23, 2010

Editors

Prof. Dr. Wolfgang Gaul
Karlsruhe Institute of Technology (KIT)
Institute of Decision Theory and Operations Research
Kaiserstr. 12
76128 Karlsruhe
Germany

Prof. Andreas Geyer-Schulz
Karlsruhe Institute of Technology (KIT)
Institute for Information Systems and Management (IISM)
Kaiserstr. 12
76131 Karlsruhe
Baden-Württemberg
Germany

Prof. Dr. Dr. Lars Schmidt-Thieme
University of Hildesheim
Institute of Computer Science
Marienburger Platz 22
31141 Hildesheim
Germany

Jonas Kunze
Karlsruhe Institute of Technology (KIT)
Institute for Information Systems and Management (IISM)
Kaiserstraße 12
76128 Karlsruhe
Germany

ISSN 1431-8814
ISBN 978-3-642-24465-0
e-ISBN 978-3-642-24466-7
DOI 10.1007/978-3-642-24466-7
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2012930643

© Springer-Verlag Berlin Heidelberg 2012

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Revised versions of selected papers presented at the 34th Annual Conference of the German Classification Society (GfKl (Gesellschaft für Klassifikation), a member of IFCS (International Federation of Classification Societies)), held at KIT (Karlsruhe Institute of Technology) in July 2010, are contained in this volume of “Studies in Classification, Data Analysis, and Knowledge Organization”.

One aim of the conference was to provide a platform for discussions on results concerning the interface that data analysis has in common with other areas such as computer science, operations research, and statistics from a scientific perspective, as well as with various application areas when “best” interpretations of data that describe underlying problem situations need knowledge from different research directions.

Practitioners and researchers, interested in data analysis in the broad sense, had the opportunity to discuss recent developments and to establish cross-disciplinary cooperation in their fields of interest. More than 200 persons attended the conference, and 158 talks (including plenary and semiplenary lectures) were presented. The audience of the conference was quite international, with about 90 contributions from participants from abroad; the largest groups of foreign presenters came from Italy, Japan, and Poland. Additionally, a program for librarians (14 presentations) was organized.
In parallel with the conference, a German-Japanese workshop sponsored by GfKl and JCS (Japanese Classification Society) took place with an additional 24 talks.

Sixty of the papers presented at the conference are contained in this volume (the contributions given at the German-Japanese workshop will be published elsewhere). As an unambiguous assignment of topics addressed in single papers is sometimes difficult, the contributions are grouped in a way that the editors found appropriate. Within (sub)chapters the presentations are listed in alphabetical order with respect to the authors’ names. At the end of this volume an index is included that, additionally, should help the interested reader.

Last but not least, we would like to thank all participants of the conference for their interest and various activities which, again, made the 34th annual GfKl conference and this volume an interdisciplinary possibility for scientific discussion, in particular all authors and all colleagues who reviewed papers, chaired sessions, or were otherwise involved. Here, the teams of Prof. Gaul, Prof. Geyer-Schulz, and Prof. Schmidt-Thieme have provided support in a way that outside persons cannot fully appreciate. Additionally, we gratefully take the opportunity to acknowledge support by Deutsche Forschungsgemeinschaft (DFG) as well as the Fakultät für Wirtschaftswissenschaften of KIT.

This volume was put together at Lehrstuhl “Information Services & Electronic Markets”, and Jonas Kunze in particular should be explicitly mentioned for the final editing tasks. As always we thank Springer Verlag, Heidelberg, especially Dr. Martina Bihn, for excellent cooperation in publishing this volume.

Karlsruhe and Hildesheim (Germany)
Wolfgang Gaul
Andreas Geyer-Schulz
Lars Schmidt-Thieme
Jonas Kunze

Conference Organization

Local Organizers

Prof. Dr. Wolfgang Gaul (Chair)
Prof. Dr. Andreas Geyer-Schulz
Prof. Dr. Martin E. Ruckes
Prof. Dr. Detlef Seese
Prof. Dr. Karl-Heinz Waldmann

Fakultät für Wirtschaftswissenschaften
Kollegium am Schloss, Universität Karlsruhe (TH), KIT

Scientific Program Committee

Y. Baba (Tokyo, Japan)
D. Baier (Cottbus)
H.-H. Bock (Aachen)
A.-L. Boulesteix (München)
M.P. Brito (Porto, Portugal)
J.M. Buhmann (Zurich, Switzerland)
A. Cerioli (Parma, Italy)
R. Decker (Bielefeld)
L. de Raedt (Leuven, Belgium)
W. Esswein (Dresden)
W. Gaul (Karlsruhe)
M. Greenacre (Barcelona, Spain)
P.J.F. Groenen (Rotterdam, The Netherlands)
M. Grötschel (Berlin)
Ch. Hennig (London, Great Britain)
K. Hornik (Vienna, Austria)
O. Hudry (Paris, France)
R. Klein (Augsburg)
P. Kuntz (Nantes, France)
G. McLachlan (Brisbane, Australia)
M. Mizuta (Sapporo, Japan)
A. Okada (Tokyo, Japan)
S.T. Rachev (Karlsruhe)
M. Schader (Mannheim)
L. Schmidt-Thieme (Hildesheim, Chair)
W. Seidel (Hamburg)
M. Spiliopoulou (Magdeburg)
M. Vichi (Rome, Italy)
C. Weihs (Dortmund)

Contents

Part I Classification, Cluster Analysis, and Multidimensional Scaling

Fuzzification of Agglomerative Hierarchical Crisp Clustering Algorithms
  Mathias Bank and Friedhelm Schwenker

An EM Algorithm for the Student-t Cluster-Weighted Modeling
  Salvatore Ingrassia, Simona C. Minotti, and Giuseppe Incarbone

Analysis of Distribution Valued Dissimilarity Data
  Masahiro Mizuta and Hiroyuki Minami

An Overall Index for Comparing Hierarchical Clusterings
  I. Morlini and S. Zani

An Accelerated K-Means Algorithm Based on Adaptive Distances
  Hans-Joachim Mucha and Hans-Georg Bartel

Bias-Variance Analysis of Local Classification Methods
  Julia Schiffner, Bernd Bischl, and Claus Weihs

Effect of Data Standardization on the Result of k-Means Clustering
  Kensuke Tanioka and Hiroshi Yadohisa

A Case Study on the Use of Statistical Classification Methods in Particle Physics
  Claus Weihs, Olaf Mersmann, Bernd Bischl, Arno Fritsch, Heike Trautmann, Till Moritz Karbach, and Bernhard Spaan

Problems of Fuzzy c-Means Clustering and Similar Algorithms with High Dimensional Data Sets
  Roland Winkler, Frank Klawonn, and Rudolf Kruse