Fuzzy Clustering Algorithms and their Application to Medical Image Analysis

A dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy of the University of London.

Ahmed Ismail Shihab

Department of Computing
Imperial College of Science, Technology and Medicine
University of London, London SW7 2AZ.

December 2000

Abstract

The general problem of data clustering is concerned with the discovery of a grouping structure within a finite number of data points. Fuzzy clustering algorithms provide a fuzzy description of the discovered structure. The main advantage of this description is that it captures the imprecision encountered when describing real-life data. Thus, the user is provided with more information about the structure in the data compared to a crisp, non-fuzzy scheme.

During the early part of our research, we investigated the popular Fuzzy c-Means (FCM) algorithm and in particular its problem of being unable to correctly identify clusters with grossly different populations. We devised a suite of benchmark data sets to investigate the reasons for this shortcoming. We found that the shortcoming originates from the formulation of the objective function of FCM, which allows clusters with relatively large population and extent to dominate the solution. This led to a search for a new objective function, which we have indeed formulated. Subsequently, we derived a new so-called Population Diameter Independent (PDI) algorithm. PDI was tested on the same benchmark data used to study FCM and was found to perform better than FCM. We have also analysed PDI's behaviour and identified how it can be further improved.

Since image segmentation is fundamentally a clustering problem, the next step was to investigate the use of fuzzy clustering techniques for image segmentation. We have identified the main decision points in this process. Furthermore, we have used fuzzy clustering to detect the left ventricular blood pool in cardiac cine images. Specifically, the images were of the Magnetic Resonance (MR) modality, containing blood velocity data as well as tissue density data. We have analysed the relative impact of the velocity data with the goal of achieving better accuracy. Our work would typically be used for qualitative analysis of anatomical structures and quantitative analysis of anatomical measures.

DEDICATION

To my parents and sisters, with much love, appreciation, and affection.

Acknowledgments

While a thesis has a single author by definition, many people are responsible for its existence. Dr Peter Burger, my supervisor, is perhaps the most important of these people. Peter provided me with regular weekly meetings and many ideas. I wish to thank him sincerely for being very supportive and friendly throughout the whole of my PhD. I would also like to thank my examiners: Professor Michael Fairhurst of the Electronic Engineering Department, University of Kent, Canterbury, and Professor Xiaohui Liu of the Department of Computer Science, Brunel University. I am grateful to Dr Daniel Rückert for commenting extensively on an earlier draft of this thesis. I would also like to acknowledge Dr Guang-Zhong Yang for his help in my mock viva.

During my PhD journey I met a number of excellent people with whom I have become good friends and who therefore made the journey particularly enjoyable. I would hope that we remain friends after we have all gone separate ways. Moustafa Ghanem: thank you for your wise and light-hearted chats. Daniel Rückert and Gerardo Ivar Sanchez-Ortiz: thank you for being special friends with whom boundaries faded. Ioannis Akrotirianakis: thank you for our many shared magical moments. Tarkan Tahseen: thank you for your friendship, inspiration, and all those netmaze sessions. Khurrum Sair: thank you for putting up with all sorts of inconveniences from me and for our new friendship. Outside of College, I would like to thank Atif Sharaf and Walid Zgallai for being my good (half-Egyptian!) Arab friends with whom I shared many a good time.

On a more personal level, I would like to thank my parents, Amaal and Ismail, whose hard work gave me the opportunity to choose the path that led here, and my sisters, Fatima and Iman, for their continuous support and encouragement.

Last but not least, I must thank the Department of Computing, Imperial College, for kindly allowing me to use its facilities even after the expiry of my registration period.

Contents

1 Introduction
    1.1 Clustering
        1.1.1 Clustering Applications
        1.1.2 Clustering Paradigms
        1.1.3 Fuzzy Clustering
    1.2 Image Analysis
    1.3 General Framework and Motivation
    1.4 Research Aims
    1.5 Main Research Contributions
    1.6 Outline of this Dissertation

2 The Basics of Data Clustering
    2.1 Notation and Terminology
    2.2 Examples of Data Sets
    2.3 Hierarchical and Partitional Clustering
        2.3.1 Hierarchical Clustering
        2.3.2 Example: Single link algorithm
        2.3.3 Partitional Clustering
        2.3.4 Example: Hard c-Means (HCM)
    2.4 Remarks

3 Fuzzy Clustering
    3.1 Fuzzy Set Theory
    3.2 The Fuzzy c-Means Algorithm
        3.2.1 FCM Optimisation Model
        3.2.2 Conditions for Optimality
        3.2.3 The Algorithm
        3.2.4 An Example
        3.2.5 Analysis of FCM Model
        3.2.6 Notes on Using FCM
        3.2.7 Strengths and Weaknesses
    3.3 Extensions of FCM
        3.3.1 Fuzzy Covariance Clustering
        3.3.2 Fuzzy c-Elliptotypes Clustering
        3.3.3 Shell Clustering
    3.4 Modifications to the FCM Model
        3.4.1 Possibilistic c-Means (PCM) Clustering
        3.4.2 High Contrast
        3.4.3 Competitive Agglomeration
        3.4.4 Credibilistic Clustering
    3.5 Remarks

4 A New Algorithm for Fuzzy Clustering
    4.1 The Experimental Framework
        4.1.1 Tests for Clustering Algorithms
        4.1.2 Designing the Synthetic Data
    4.2 The Behaviour of FCM
        4.2.1 FCM's Results on the Synthetic Data
        4.2.2 Discussion of FCM's Results