Multidimensional Nonlinear Descriptive Analysis

Shizuhiko Nishisato

Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2007 by Taylor & Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-10: 1-58488-612-9 (Hardcover)
International Standard Book Number-13: 978-1-58488-612-9 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

To Lorraine, Ira, Samantha
and
In memory of George William Ford

Preface

This book is intended for those involved in data analysis in diverse areas of research. Unlike in a well-controlled and well-designed statistical experiment, many of us face data to which the notion of “data being a random sample from the normal population” does not apply. “Normal distribution” means that data must be continuous, but most data we deal with in the social sciences are categorical or non-numerical. Furthermore, we know that the relation between two normally distributed variables is by design linear. In practice, however, we encounter many nonlinear relations such as “the strength of the body is generally a concave function of age.” Such a phenomenon exists and cannot be ignored for the sake of using the normal distribution assumption.

Quantification of categorical or non-numerical data is indeed a ubiquitous problem. Yet most courses for data analysis are devoted to the traditional statistics based on normal theory, which affords an elegant and sophisticated inferential framework such as a confidence interval for a parameter and hypothesis testing. The current book is intended to serve the needs of those who must face the reality of typical data analysis: data are discrete or non-numerical, not necessarily sampled randomly from a population, and involve not only linear but also nonlinear relations.

The book starts with some discussion of why and how this book can respond to typical needs of data analysis. The discussion is hopefully illuminating. Then the readers are exposed to conceptual preliminaries, again in non-technical terms, and to technical preliminaries needed for data analysis in the remaining chapters. Part One of the book covers these topics, which hopefully are useful for understanding the data analysis discussed in subsequent chapters.
Part Two and Part Three contain applications of what we call “multidimensional nonlinear descriptive analysis,” abbreviated as MUNDA, to diverse data types, with each chapter being devoted to one type of categorical data, a brief historical comment and basic skills peculiar to the data types. The book will end with the chapter entitled “Further Perspectives,” which presents several problems that need to be solved in the near future for the current methodology. Thus, this book is an overview of MUNDA of discrete data with suggestions for future developments. The main part of the book is similar in topics to Nishisato (1994), but the topics are discussed with much more insight than in the 1994 book, as well as with new results since then.

This book is for students in the social and biological sciences and researchers in such fields as marketing research, education, health and medical sciences, psychology, sociology, biology, ecology, agriculture, economics, political science, criminology, archaeology, geology and geography. The level of the book is intermediate, but the writing style is straightforward so that one may read the book with relative ease, and at the same time it is intended to be beneficial for well-seasoned researchers engaged in data analysis.

Work related to MUNDA started in the early years of the twentieth century in ecology and biology under the name of gradient and ordination methods. The contributions to this area by ecologists and biologists from many countries are overwhelming, and they have led to an enormous number of applications to problems in their disciplines up to the present moment. Outside ecology and biology, we see familiar formulations of such methods as the method of reciprocal averages, simultaneous linear regressions and appropriate scoring by statisticians and psychologists, starting in the 1930s and 1940s, particularly in the United Kingdom and the United States.
After the 1950s, the number of publications on MUNDA and related studies increased, resulting in its wide applications, particularly in Japan and France, led by eminent researchers Chikio Hayashi in Japan and Jean Paul Benzécri in France. The research spread not only to many other countries but also to diverse disciplines. It is thus timely now to pause and review the general area of MUNDA.

Finally, a personal note. I have devoted my entire research career to MUNDA, which I coined as “dual scaling.” R. Darrell Bock was my mentor at the University of North Carolina who introduced me to his optimal scaling (Bock, 1960). Also instrumental to my successful graduate work at the Psychometric Laboratory in Chapel Hill were Lyle V. Jones, Director, Masanao Toda, mentor in Japan, the late Mr. and Mrs. Arthur Ringwalt, host family, the Fulbright Commission and many dear fellow students and other faculty members. Michael W. Browne introduced me to the French correspondence analysis in the early 1970s and drew my attention to the problem that the joint plot of row weights and column weights in the same space was not mathematically sound. At my retirement six years ago, John C. Gower encouraged me with the words “Life exists after retirement.”

During my career, I have had the pleasure of personally knowing researchers in many countries: Austria, Australia, Belgium, Brazil, Britain, Bulgaria, Canada, China, France, Germany, Greece, Hong Kong, India, Italy, Japan, The Netherlands, Pakistan, Russia, Singapore, South Korea, Spain, Sweden, Switzerland, Taiwan, USA and The West Indies. Together with many colleagues, among others, Ross Traub, Gila Hanna, Richard Wolfe, Merlin Wahlstrom, Leslie McLean, Donald Burrill, Vincent D’Oyley, Roderick McDonald, Philip Nagy, Glen Evans, Dennis Roberts, Ruth Childs, Tony Lam, Joel Weiss, Alexander Even, Domer Ellis, Sabir Alvi, Tahany Gaddala, Joan Preston, the late Howard Russell, the late Raghu Bhargava, the late Shmuel Avital, the late Sar Khan and the late Dorothy Horn, I also taught students from such countries as Argentina, Australia, Bahrain, Brazil, Canada, Central Africa,
China (Mainland, Hong Kong), Egypt, Ethiopia, Greece, India, Iran, Israel, Japan, Malawi, Malaysia, New Zealand, Nigeria, Pakistan, Peru, Philippines, Singapore, South Africa, South Korea, Sri Lanka, Taiwan, Thailand, USA, Venezuela and The West Indies. Other than my career at McGill University in Montreal and the University of Toronto, I was a visiting professor at the University of Karlsruhe, Germany (Host Professor Wolfgang Gaul), Institute of Statistical Mathematics, Japan (Host Professor Yasumasa Baba), Kwansei Gakuin University, Japan (Host Professors Takehiro Fijihara, Masao Nakanishi, Akihiro Yagi (twice), and Shoji Yamamoto) and Doshisha University, Japan (Host Professors Shigeo Tatsuki and Hirotsugu Yamauchi). I owe all of these dear friends, colleagues and students a great deal for their many years of association and friendship.

For the current book, I have three regrets. The first regret is about the partial coverage of the general topics of MUNDA. During the literature search, I was overwhelmed by the enormous contributions by ecologists and biologists since the early part of the twentieth century all the way to date, and I was aware of only some of their contributions until recently. Because of this late discovery, I could not cover their contributions in a deserving way in the current book. In addition, joint correspondence analysis (Greenacre, 1988; Tateneni and Browne, 2000) is not discussed. The departure of the procedure from correspondence analysis is comparable to that of factor analysis from principal component analysis, and as such it is very important for the development of MUNDA. The second regret is about the notation. Since the current book is a summary of what I have been working on under the name dual scaling, I used my own notation of some antiquity. This was done in spite of the fact that Michael Greenacre and Jörg Blasius have been trying hard to unify the notation for MUNDA-related publications. I found it difficult to change my notation in many of my old papers to the new one.
My third and greatest regret is the fact that in spite of my final statement in my 1980 book that “Inferential aspects of dual scaling deserve immediate attention by researchers,” I did not include inference-related topics in this book, notably work on confidence regions, association models (loglinear