
Correspondence Analysis PDF

562 Pages·2014·2.054 MB·english

Preview Correspondence Analysis

Correspondence Analysis
Theory, Practice and New Strategies

Eric J. Beh
School of Mathematics & Physical Sciences, University of Newcastle, Australia

Rosaria Lombardo
Department of Economics, Second University of Naples, Italy

This edition first published 2014
© 2014 John Wiley & Sons, Ltd

Library of Congress Cataloging-in-Publication Data
Beh, Eric J., author.
Correspondence analysis : theory, practice and new strategies / Eric Beh, Rosaria Lombardo.
pages cm
Includes bibliographical references and index.
ISBN 978-1-119-95324-1 (hardback)
1. Correspondence analysis (Statistics) I. Lombardo, Rosaria, author. II. Title.
QA278.5.B44 2014
519.5’37--dc23
2014017301

A catalogue record for this book is available from the British Library.
ISBN: 978-1-119-95324-1
Set in 10/12pt Times LT Std-Roman by Thomson Digital, Noida, India.
1 2014

Contents

Foreword
Preface

Part One  Introduction

1 Data Visualisation
  1.1 A Very Brief Introduction to Data Visualisation
    1.1.1 A Very Brief History
    1.1.2 Introduction to Visualisation Tools for Numerical Data
    1.1.3 Introduction to Visualisation Tools for Univariate Categorical Data
  1.2 Data Visualisation for Contingency Tables
    1.2.1 Fourfold Displays
  1.3 Other Plots
  1.4 Studying Exposure to Asbestos
    1.4.1 Asbestos and Irving J. Selikoff
    1.4.2 Selikoff’s Data
    1.4.3 Numerical Analysis of Selikoff’s Data
    1.4.4 A Graphical Analysis of Selikoff’s Data
    1.4.5 Classical Correspondence Analysis of Selikoff’s Data
    1.4.6 Other Methods of Graphical Analysis
  1.5 Happiness Data
  1.6 Correspondence Analysis Now
    1.6.1 A Bibliographic Taste
    1.6.2 The Increasing Popularity of Correspondence Analysis
    1.6.3 The Growth of the Correspondence Analysis Family Tree
  1.7 Overview of the Book
  1.8 R Code
  References

2 Pearson’s Chi-Squared Statistic
  2.1 Introduction
  2.2 Pearson’s Chi-Squared Statistic
    2.2.1 Notation
    2.2.2 Measuring the Departure from Independence
    2.2.3 Pearson’s Chi-Squared Statistic
    2.2.4 Other χ² Measures of Association
    2.2.5 The Power Divergence Statistic
    2.2.6 Dealing with the Sample Size
  2.3 The Goodman–Kruskal Tau Index
    2.3.1 Other Measures and Issues
  2.4 The 2×2 Contingency Table
    2.4.1 Yates’ Continuity Correction
  2.5 Early Contingency Tables
    2.5.1 The Impact of Adolph Quetelet
    2.5.2 Gavarret’s (1840) Legitimate Children Data
    2.5.3 Finley’s (1884) Tornado Data
    2.5.4 Galton’s (1892) Fingerprint Data
    2.5.5 Final Comments
  2.6 R Code
    2.6.1 Expectation and Variance of the Pearson Chi-Squared Statistic
    2.6.2 Pearson’s Chi-Squared Test of Independence
    2.6.3 The Cressie–Read Statistic
  References

Part Two  Correspondence Analysis of Two-Way Contingency Tables

3 Methods of Decomposition
  3.1 Introduction
  3.2 Reducing Multidimensional Space
  3.3 Profiles and Cloud of Points
  3.4 Property of Distributional Equivalence
  3.5 The Triplet and Classical Reciprocal Averaging
    3.5.1 One-Dimensional Reciprocal Averaging
    3.5.2 Matrix Form of One-Dimensional Reciprocal Averaging
    3.5.3 M-Dimensional Reciprocal Averaging
    3.5.4 Some Historical Comments
  3.6 Solving the Triplet Using Eigen-Decomposition
    3.6.1 The Decomposition
    3.6.2 Example
  3.7 Solving the Triplet Using Singular Value Decomposition
    3.7.1 The Standard Decomposition
    3.7.2 The Generalised Decomposition
  3.8 The Generalised Triplet and Reciprocal Averaging
  3.9 Solving the Generalised Triplet Using Gram–Schmidt Process
    3.9.1 Ordered Categorical Variables and a priori Scores
    3.9.2 On Finding Orthogonalised Vectors
    3.9.3 A Recurrence Formulae Approach
    3.9.4 Changing the Basis Vector
    3.9.5 Generalised Correlations
  3.10 Bivariate Moment Decomposition
  3.11 Hybrid Decomposition
    3.11.1 An Alternative Singly Ordered Approach
  3.12 R Code
    3.12.1 Eigen-Decomposition in R
    3.12.2 Singular Value Decomposition in R
    3.12.3 Singular Value Decomposition for Matrix Approximation
    3.12.4 Generating Emerson’s Polynomials
  3.13 A Preliminary Graphical Summary
  3.14 Analysis of Analgesic Drugs
  References

4 Simple Correspondence Analysis
  4.1 Introduction
  4.2 Notation
  4.3 Measuring Departures from Complete Independence
    4.3.1 The ‘Duplication Constant’
    4.3.2 Pearson Ratios
  4.4 Decomposing the Pearson Ratio
  4.5 Coordinate Systems
    4.5.1 Standard Coordinates
    4.5.2 Principal Coordinates
    4.5.3 Biplot Coordinates
  4.6 Distances
    4.6.1 Distance from the Origin
    4.6.2 Intra-Variable Distances and the L_p Metric
    4.6.3 Inter-Variable Distances
  4.7 Transition Formulae
  4.8 Moments of the Principal Coordinates
    4.8.1 The Mean of f_im
    4.8.2 The Variance of f_im
    4.8.3 The Skewness of f_im
    4.8.4 The Kurtosis of f_im
    4.8.5 Moments of the Asbestos Data
  4.9 How Many Dimensions to Use?
  4.10 R Code
  4.11 Other Theoretical Issues
  4.12 Some Applications of Correspondence Analysis
  4.13 Analysis of a Mother’s Attachment to Her Child
  References

5 Non-Symmetrical Correspondence Analysis
  5.1 Introduction
  5.2 The Goodman–Kruskal Tau Index
    5.2.1 The Tau Index as a Measure of the Increase in Predictability
    5.2.2 The Tau Index in the Context of ANOVA
    5.2.3 The Sensitivity of τ
    5.2.4 A Demonstration: Revisiting Selikoff’s Asbestos Data
  5.3 Non-Symmetrical Correspondence Analysis
    5.3.1 The Centred Column Profile Matrix
    5.3.2 Decomposition of τ
  5.4 The Coordinate Systems
    5.4.1 Standard Coordinates
    5.4.2 Principal Coordinates
    5.4.3 Biplot Coordinates
  5.5 Transition Formulae
    5.5.1 Supplementary Points
    5.5.2 Reconstruction Formulae
  5.6 Moments of the Principal Coordinates
    5.6.1 The Mean of f_im
    5.6.2 The Variance of f_im
    5.6.3 The Skewness of f_im
    5.6.4 The Kurtosis of f_im
  5.7 The Distances
    5.7.1 Column Distances
    5.7.2 Row Distances
  5.8 Comparison with Simple Correspondence Analysis
  5.9 R Code
  5.10 Analysis of a Mother’s Attachment to Her Child
  References

6 Ordered Correspondence Analysis
  6.1 Introduction
  6.2 Pearson’s Ratio and Bivariate Moment Decomposition
  6.3 Coordinate Systems
    6.3.1 Standard Coordinates
    6.3.2 The Generalised Correlations
    6.3.3 Principal Coordinates
    6.3.4 Location, Dispersion and Higher Order Components
    6.3.5 The Correspondence Plot and Generalised Correlations
    6.3.6 Impact on the Choice of Scores
  6.4 Artificial Data Revisited
    6.4.1 On the Structure of the Association
    6.4.2 A Graphical Summary of the Association
    6.4.3 An Interpretation of the Axes and Components
    6.4.4 The Impact of the Choice of Scores
  6.5 Transition Formulae
  6.6 Distance Measures
    6.6.1 Distance from the Origin
    6.6.2 Intra-Variable Distances
  6.7 Singly Ordered Analysis
  6.8 R Code
    6.8.1 Generalised Correlations and Principal Inertias
    6.8.2 Doubly Ordered Correspondence Analysis
  References

7 Ordered Non-Symmetrical Correspondence Analysis
  7.1 Introduction
  7.2 General Considerations
    7.2.1 Orthogonal Polynomials Instead of Singular Vectors
  7.3 Doubly Ordered Non-Symmetrical Correspondence Analysis
    7.3.1 Bivariate Moment Decomposition
    7.3.2 Generalised Correlations in Bivariate Moment Decomposition
  7.4 Singly Ordered Non-Symmetrical Correspondence Analysis
    7.4.1 Hybrid Decomposition for an Ordered Predictor Variable
    7.4.2 Hybrid Decomposition in the Case of Ordered Response Variables
    7.4.3 Generalised Correlations in Hybrid Decomposition
  7.5 Coordinate Systems for Ordered Non-Symmetrical Correspondence Analysis
    7.5.1 Polynomial Plots for Doubly Ordered Non-Symmetrical Correspondence Analysis
    7.5.2 Polynomial Biplot for Doubly Ordered Non-Symmetrical Correspondence Analysis
    7.5.3 Polynomial Plot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Predictor Variable
    7.5.4 Polynomial Biplot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Predictor Variable
    7.5.5 Polynomial Plot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Response Variable
    7.5.6 Polynomial Biplot for Singly Ordered Non-Symmetrical Correspondence Analysis with an Ordered Response Variable
  7.6 Tests of Asymmetric Association
  7.7 Distances in Ordered Non-Symmetrical Correspondence Analysis
    7.7.1 Distances in Doubly Ordered Non-Symmetrical Correspondence Analysis
    7.7.2 Distances in Singly Ordered Non-Symmetrical Correspondence Analysis
  7.8 Doubly Ordered Non-Symmetrical Correspondence of Asbestos Data
    7.8.1 Trends
  7.9 Singly Ordered Non-Symmetrical Correspondence Analysis of Drug Data
    7.9.1 Predictability of Ordered Rows Given Columns
  7.10 R Code for Ordered Non-Symmetrical Correspondence Analysis
  References

8 External Stability and Confidence Regions
  8.1 Introduction
  8.2 On the Statistical Significance of a Point
  8.3 Circular Confidence Regions for Classical Correspondence Analysis
  8.4 Elliptical Confidence Regions for Classical Correspondence Analysis
    8.4.1 The Information in the Optimal Correspondence Plot
    8.4.2 The Information in the First Two Dimensions
    8.4.3 Eccentricity of Elliptical Regions
    8.4.4 Comparison of Confidence Regions
  8.5 Confidence Regions for Non-Symmetrical Correspondence Analysis
    8.5.1 Circular Regions in Non-Symmetrical Correspondence Analysis
    8.5.2 Elliptical Regions in Non-Symmetrical Correspondence Analysis
  8.6 Approximate p-values and Classical Correspondence Analysis
    8.6.1 Approximate p-values Based on Confidence Circles
    8.6.2 Approximate p-values Based on Confidence Ellipses
  8.7 Approximate p-values and Non-Symmetrical Correspondence Analysis
  8.8 Bootstrap Elliptical Confidence Regions
  8.9 Ringrose’s Bootstrap Confidence Regions
    8.9.1 Confidence Ellipses and Covariance Matrix
  8.10 Confidence Regions and Selikoff’s Asbestos Data
  8.11 Confidence Regions and Mother–Child Attachment Data
  8.12 R Code
    8.12.1 Calculating the Path of a Confidence Ellipse
    8.12.2 Constructing Elliptical Regions in a Correspondence Plot
  References

9 Variants of Correspondence Analysis
  9.1 Introduction
  9.2 Correspondence Analysis Using Adjusted Standardised Residuals
  9.3 Correspondence Analysis Using the Freeman–Tukey Statistic
  9.4 Correspondence Analysis of Ranked Data
  9.5 R Code
    9.5.1 Adjusted Standardised Residuals
    9.5.2 Freeman–Tukey Statistic
  9.6 The Correspondence Analysis Family
    9.6.1 Detrended Correspondence Analysis
    9.6.2 Canonical Correspondence Analysis
    9.6.3 Inverse Correspondence Analysis
    9.6.4 Ordered Correspondence Analysis
    9.6.5 Grade Correspondence Analysis
    9.6.6 Symbolic Correspondence Analysis
    9.6.7 Correspondence Analysis of Proximity Data
    9.6.8 Residual (Scaling) Correspondence Analysis
    9.6.9 Log-Ratio Correspondence Analysis
    9.6.10 Parametric Correspondence Analysis
    9.6.11 Subset Correspondence Analysis
    9.6.12 Foucart’s Correspondence Analysis
  9.7 Other Techniques
  References

Part Three  Correspondence Analysis of Multi-Way Contingency Tables

10 Coding and Multiple Correspondence Analysis
  10.1 Introduction to Coding
  10.2 Coding Data
    10.2.1 B-Splines
    10.2.2 Crisp Coding
    10.2.3 Fuzzy Coding
  10.3 Coding Ordered Categorical Variables by Orthogonal Polynomials
  10.4 Burt Matrix
  10.5 An Introduction to Multiple Correspondence Analysis
  10.6 Multiple Correspondence Analysis
    10.6.1 Notation
    10.6.2 Decomposition Methods
    10.6.3 Coordinates, Transition Formulae and Adjusted Inertia
  10.7 Variants of Multiple Correspondence Analysis
    10.7.1 Joint Correspondence Analysis
    10.7.2 Stacking and Concatenation
  10.8 Ordered Multiple Correspondence Analysis
    10.8.1 Orthogonal Polynomials in Multiple Correspondence Analysis
    10.8.2 Hybrid Decomposition of Multiple Indicator Tables
    10.8.3 Two Ordered Variables and Their Contingency Table
    10.8.4 Test of Statistical Significance
    10.8.5 Properties of Ordered Multiple Correspondence Analysis
    10.8.6 Graphical Displays in Ordered Multiple Correspondence Analysis
  10.9 Applications
    10.9.1 Customer Satisfaction in Health Care Services
    10.9.2 Two Quality Aspects
  10.10 R Code
    10.10.1 B-Spline Function
    10.10.2 Crisp and Fuzzy Coding Using B-Splines in R
    10.10.3 Crisp Coding and the Burt Table by Indicator Functions in R
    10.10.4 Classical and Multiple Correspondence Analysis in R
  References

11 Symmetrical and Non-Symmetrical Three-Way Correspondence Analysis
  11.1 Introduction
  11.2 Notation
  11.3 Symmetric and Asymmetric Association in Three-Way Contingency Tables
  11.4 Partitioning Three-Way Measures of Association
    11.4.1 Partitioning Pearson’s Three-Way Statistic
    11.4.2 Partitioning Marcotorchino’s and Gray–William’s Three-Way Indices
    11.4.3 Marcotorchino’s Index
    11.4.4 Partitioning the Three-Way Delta Index
    11.4.5 Three-Way Delta Index
  11.5 Formal Tests of Predictability
    11.5.1 Testing Pearson’s Statistic
    11.5.2 Testing the Marcotorchino’s Index
    11.5.3 Testing the Delta Index
    11.5.4 Discussion
  11.6 Tucker3 Decomposition for Three-Way Tables
  11.7 Correspondence Analysis of Three-Way Contingency Tables
    11.7.1 Symmetrically Associated Variables
    11.7.2 Asymmetrically Associated Variables
    11.7.3 Additional Property
  11.8 Modelling of Partial and Marginal Dependence
  11.9 Graphical Representation
    11.9.1 Interactive Plot
    11.9.2 Interactive Biplot
    11.9.3 Category Contribution
  11.10 On the Application of Partitions
    11.10.1 Olive Data: Partitioning the Asymmetric Association
    11.10.2 Job Satisfaction Data: Partitioning the Asymmetric Association
  11.11 On the Application of Three-Way Correspondence Analysis
    11.11.1 Job Satisfaction and Three-Way Symmetrical Correspondence Analysis
    11.11.2 Job Satisfaction and Three-Way Non-Symmetrical Correspondence Analysis
  11.12 R Code
  References

Part Four  The Computation of Correspondence Analysis

12 Computing and Correspondence Analysis
  12.1 Introduction
  12.2 A Look Through Time
    12.2.1 Pre-1990
    12.2.2 From 1990 to 2000
    12.2.3 The Early 2000s
  12.3 The Impact of R
    12.3.1 Overview of Correspondence Analysis in R
    12.3.2 MASS
    12.3.3 Nenadić and Greenacre’s (2007) ca
    12.3.4 Murtagh (2005)
    12.3.5 ade4
  12.4 Some Stand-Alone Programs
    12.4.1 JMP
    12.4.2 SPSS
    12.4.3 PAST
    12.4.4 DtmVic5.6+
  References

Index
