Studies in Classification, Data Analysis,
and Knowledge Organisation
Managing Editors
H.-H. Bock, Aachen
W. Gaul, Karlsruhe
M. Vichi, Rome
Editorial Board
Ph. Arabie, Newark
D. Baier, Cottbus
F. Critchley, Milton Keynes
R. Decker, Bielefeld
E. Diday, Paris
M. Greenacre, Barcelona
C. Lauro, Naples
J. Meulman, Leiden
P. Monari, Bologna
S. Nishisato, Toronto
N. Ohsumi, Tokyo
O. Opitz, Augsburg
G. Ritter, Passau
M. Schader, Mannheim
C. Weihs, Dortmund
Springer
Berlin
Heidelberg
New York
Hong Kong
London
Milan
Paris
Tokyo
Titles in the Series
H.-H. Bock and P. Ihm (Eds.) Classification, Data Analysis, and Knowledge Organization. 1991 (out of print)
M. Schader (Ed.) Analyzing and Modeling Data and Knowledge. 1992
O. Opitz, B. Lausen, and R. Klar (Eds.) Information and Classification. 1993 (out of print)
H.-H. Bock, W. Lenski, and M. M. Richter (Eds.) Information Systems and Data Analysis. 1994 (out of print)
E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, and B. Burtschy (Eds.) New Approaches in Classification and Data Analysis. 1994 (out of print)
W. Gaul and D. Pfeifer (Eds.) From Data to Knowledge. 1995
H.-H. Bock and W. Polasek (Eds.) Data Analysis and Information Systems. 1996
E. Diday, Y. Lechevallier, and O. Opitz (Eds.) Ordinal and Symbolic Data Analysis. 1996
R. Klar and O. Opitz (Eds.) Classification and Knowledge Organization. 1997
C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H.-H. Bock, and Y. Baba (Eds.) Data Science, Classification, and Related Methods. 1998
I. Balderjahn, R. Mathar, and M. Schader (Eds.) Classification, Data Analysis, and Data Highways. 1998
A. Rizzi, M. Vichi, and H.-H. Bock (Eds.) Advances in Data Science and Classification. 1998
M. Vichi and O. Opitz (Eds.) Classification and Data Analysis. 1999
W. Gaul and H. Locarek-Junge (Eds.) Classification in the Information Age. 1999
H.-H. Bock and E. Diday (Eds.) Analysis of Symbolic Data. 2000
H.A.L. Kiers, J.-P. Rasson, P.J.F. Groenen, and M. Schader (Eds.) Data Analysis, Classification, and Related Methods. 2000
W. Gaul, O. Opitz, and M. Schader (Eds.) Data Analysis. 2000
R. Decker and W. Gaul Classification and Information Processing at the Turn of the Millennium. 2000
S. Borra, R. Rocci, M. Vichi, and M. Schader (Eds.) Advances in Classification and Data Analysis. 2001
W. Gaul and G. Ritter (Eds.) Classification, Automation, and New Media. 2002
K. Jajuga, A. Sokolowski, and H.-H. Bock (Eds.) Classification, Clustering and Data Analysis. 2002
M. Schwaiger, O. Opitz (Eds.) Exploratory Data Analysis in Empirical Research. 2003
M. Schader, W. Gaul, and M. Vichi (Eds.) Between Data Science and Applied Data Analysis. 2003
H.-H. Bock, M. Chiodi, and A. Mineo (Eds.) Advances in Multivariate Data Analysis. 2004
David Banks · Leanna House ·
Frederick R. McMorris · Phipps Arabie ·
Wolfgang Gaul
Editors
Classification, Clustering,
and Data Mining Applications
Proceedings of the Meeting of the International Federation
of Classification Societies (IFCS),
Illinois Institute of Technology, Chicago, 15-18 July 2004
With 156 Figures and 86 Tables
Springer
Dr. David Banks
Leanna House
Institute of Statistics and Decision Sciences
Duke University
27708 Durham, NC
U.S.A.
banks@stat.duke.edu
house@stat.duke.edu
Dr. Frederick R. McMorris
Illinois Institute of Technology
Department of Mathematics
10 West 32nd Street
60616-3793 Chicago, IL
U.S.A.
mcmorris@iit.edu
Dr. Phipps Arabie
Faculty of Management
Rutgers University
180 University Avenue
07102-1895 Newark, NJ
U.S.A.
arabie@andromeda.rutgers.edu
Prof. Dr. Wolfgang Gaul
Institute of Decision Theory
University of Karlsruhe
Kaiserstr. 12
76128 Karlsruhe
Germany
wolfgang.gaul@Wiwi.uni-karlsruhe.de
ISSN 1431-8814
ISBN 3-540-22014-3 Springer-Verlag Berlin Heidelberg New York
Library of Congress Control Number: 2004106890
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this
publication or parts thereof is permitted only under the provisions of the German Copyright Law of
September 9, 1965, in its current version, and permission for use must always be obtained from
Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
Springer-Verlag is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag Berlin · Heidelberg 2004
Printed in Germany
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
Softcover design: Erich Kirchner, Heidelberg
SPIN 11008729 88/3130-543210 - Printed on acid-free paper
This book is dedicated to the memory of
Professor Chikio Hayashi
who served as president of the International Federation of
Classification Societies from 1998 until 2000. His leadership,
his scholarship, and his humanity are an example to us all.
Preface
This book presents some of the key research undertaken by the members of
the International Federation of Classification Societies during the two years
since our last symposium. If the past is a guide to the future, these papers
contain the seeds of new ideas that will invigorate our field.
The editors are grateful to the community of classification scientists. Even
those whose work does not appear here have contributed through previous
research, teaching, and mentoring. It is a great joy to participate in this kind of
global academy, which is only possible because its members work to cultivate
courtesy along with creativity, and friendship together with scholarship.
The editors particularly thank the referees who reviewed these papers:
Simona Balbi Lynne Billard Hans Bock Paul De Boeck
Jaap Brand Peter Bryant Doug Carroll Edwin Diday
Adrian Dobra Vincenzo Esposito Anuska Ferligoj Ernest Fokoue
John Gower Andre Hardy Stephen Hirtle Krzysztof Jajuga
Bart Jan van Os Mel Janowitz Henk Kiers Mike Larsen
Carlo Lauro Michael Lavine Ludovic Lebart Bruno Leclerc
Herbie Lee Taerim Lee Scotland Leman Walter Liggett
Masahiro Mizuta Clive Moncrieff Fionn Murtagh Noboru Ohsumi
Jennifer Pittman Alfredo Rizzi Pascale Rousseau Ashish Sanil
Bill Shannon Javier Trejos Mark Vangel Rosanna Verde
Kert Viele Kiri Wagstaff Stan Wasserman Stan Young
Durham, North Carolina David Banks
Durham, North Carolina Leanna House
New Brunswick, New Jersey Phipps Arabie
Chicago, Illinois F. R. McMorris
Karlsruhe, Germany Wolfgang Gaul
March, 2004
Contents
Part I New Methods in Cluster Analysis
Thinking Ultrametrically
Fionn Murtagh
Clustering by Vertex Density in a Graph
Alain Guenoche
Clustering by Ant Colony Optimization
Javier Trejos, Alex Murillo, Eduardo Piza
A Dynamic Cluster Algorithm Based on Lr Distances for
Quantitative Data
Francisco de A. T. de Carvalho, Yves Lechevallier, Renata M.C.R. de
Souza
The Last Step of a New Divisive Monothetic Clustering
Method: the Gluing-Back Criterion
Jean-Yves Pirçon, Jean-Paul Rasson
Standardizing Variables in K-means Clustering
Douglas Steinley
A Self-Organizing Map for Dissimilarity Data
Aïcha El Golli, Brieuc Conan-Guez, Fabrice Rossi
Another Version of the Block EM Algorithm
Mohamed Nadif, Gerard Govaert
Controlling the Level of Separation of Components in Monte
Carlo Studies of Latent Class Models
Jose G. Dias
Description: This volume describes new methods with special emphasis on classification and cluster analysis. These methods are applied to problems in information retrieval, phylogeny, medical diagnosis, microarrays, and other active research areas.