F I F T H E D I T I O N Using Multivariate Statistics Barbara G. Tabachnick California State University, Northridge Linda S. Fidell California State University, Northridge Boston rn New York rn San Francisco Mexico City Montreal rn Toronto rn London rn Madrid rn Munich Paris Hong Kong rn Singapore rn Tokyo rn CapeTown rn Sydney Executive Editor: Siisczrz Hlzrttnan Editorial Assistant< Tizerese Felser Marketing Manager: Karen Nat~lle Senior Production Administrator: Donna Simons Editorial Production Service: Omegatype Typogmph~In c. Composition Buyer: Andrew Erryo Manufacturing Buyer: Andrew T~lrso Electronic Composition: Omegcztype Typography, Inc. Cover Administrator: Joel Geizdron For related titles and support materials, visit our online catalog at www.ablongman.com. Copyright 0 2007,2001, I996 Pearson Education. Inc. All rights reserved. No part of the material protected by th~sco pyright notice may be reproduced or utilized in any form or by any means. electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright holder. To obtain permissionis) to use materials from this work, please submit a written request to Allyn and Bacon, Permissions Department, 75 Arlington Street, Boston, MA 02 1 16, or fax your request to 6 17-848-7320. Between the time web site information is gathered and published. it is not unusual for some sites to have closed. Also, the transcription of URLs can result in typographical errors. The publisher would appreciate notification where these occur so that they may be corrected in subsequent editions. ISBN 0-205-45938-2 Punted In the Un~redS tate\ ot Amerlca 1 0 9 8 7 6 5 4 3 2 1 RRD 10 09 08 07 06 C O N T E N T S Preface xxvii 1.1 Multivariate Statistics: Why? 1 1.1.1 The Domain of Multivariate Statistics: Numbers of IVs and DVs 1 1.1.2 Experimental and Nonexperimental Research 2 1.1.3 Computers and Multivariate Statistics 4 1.1.4 Garbage In, Roses Out? 5 1.2 Some Useful Definitions 5 1.2.1 Continuous. Discrete, and Dichotomous Data 5 1.2.2 Samples and Populations 7 1.2.3 Descriptive and Inferential Statistics 7 1.2.4 Orthogonality: Standard and Sequential Analyses 8 1.3 Linear Combinations of Variables 10 1.4 Number and Nature of Variables to Include 11 1.5 Statistical Power 11 1.6 Data Appropriate for Multivariate Statistics 12 1.6.1 The Data Matrix 12 1.6.2 The Correlation Matrix 1 3 1.6.3 The Variance-Covariance Matrix 14 1.6.4 The SUG-of-Sc;uaresa nd Cr~ss-ProductsM atrix l4 1.6.5 Residuals 16 1.7 Organization of the Book 16 3 A Guide to Statistical Techniques: Using the Book 17 li 2.1 Research Questions and Associated Techniques 17 2.1.1 Degree of Relationship among Variables 17 2.1.1.1 Bivariate r 17 2.1.1.2 Multiple R 18 2.1.1.3 Sequential R 18 2.1.1.4 Canonical R I8 2.1.1.5 Multiway Frequency Analysis 19 7.1.1.6 Multilevel Modeling 19 iii iv C O N T E N T S 2.1.2 Significance ot'(;i-oup Difference\ I9 3. I .Z. i One-Way XKOL'A and r Test 19 2.1.1.2 One-Way ANCOVA 30 2.1.2.3 Factorial ANOVA 20 2.1.2.4 Factorial ANCOVA 20 2.1.2.5 Hotelling's T' 21 2.1.2.6 One-Way MANOVA 2 1 2.1.2.7 One-Way MANCOVA 2 1 2.1.2.8 Factorial MANOVA 22 2.1.2.9 Factorial MANCOVA 22 2.1.2.10 Profile Analysis of Repeated Measures 23 2.1.3 Prediction of Group Membership 23 2.1.3.1 One-Way Discriminant 23 2.1.3.2 Sequential One-Way Discriminant 24 2.1.3.3 Multiway Frequency Analysis (Logit) 24 2.1.3.4 Logistic Regression 24 2. I .3.5 Sequential Logistic Regression 25 2.1.3.6 Factorial Discriminant Analysis 25 2.1.3.7 Sequential Factorial Discriminant Analysis 25 2.1.4 Structure 25 2.1.4.1 Principal Components 25 2.1.3.2 Factor Analysis 26 2. !. 4.? Structural Eqtiation Modeling 26 2.1.5 Time Course of Events 26 2.1.5.1 Survival/Failure Analysis '26 2.1.5.2 Time-Series Analysis 27 2.2 Some Further Comparisons 27 2.3 A Decision Tree 28 2.4 Technique Chapters 31 2.5 Preliiiiinary Check of the Data 32 3 Review of Univariate and Bivariate Statistics 33 3.1 Hypothesis Testing 33 3.1.1 One-Sample z Test as Prototype 33 3.1.2 Power 36 3.1.3 Extensions of the Model 37 3.1.4 Controversy Surrounding Significance Testing 37 3.2 Analysis of Variance 37 3.2.1 One-Way Between-Subjects ANOVA 39 3.2.2 Factorial Between-Subjects ANOVA 42 3.2.3 Within-Subjects ANOVA 43 3.2.4 Mixed Between-Within-Subjects ANOVA 46 3.2.5 De\ign Complexity 47 3.2.5.1 Nesting 47 3.2.5.1 Latin-Square Designs 47 3.2.5.3 Unequal 11 and Nonorthogonal~ty 48 3.2.5.3 Fixed and Random Effects 49 3.2.6 Specific Comparisons 49 3.2.6.1 Weighting Coefficients for Comparisons 50 3.2.6.2 Orthogonality of Weighting Coefficients 50 3.2.6.3 Obtained F for Comparisons 5 1 3.2.6.4 Critical F for Planned Comparisons 52 3.2.6.5 Critical F for Post Hoc Comparisons 53 3.3 Parameter Estimation 53 3.4 Effect Size 54 3.5 Bivariate Statistics: Correlation and Regression 56 3.5.1 Correlation 56 3.5.2 Regression 57 3.6 Chi-Square Analysis 58 Cleaning Up Your Act: Screening Data Prior to Analysis 60 4.1 Important Issues in Data Screening 61 4.1 . 1 Accuracy of Data File 6 1 4.1.2 Honest Correlations 61 4.1.2.1 Inflated Correlation 6 1 4.1.2.2 Deflated Correlation 61 4.1.3 Missing Data 62 4. !. 3. ! Deleting Cases or Variables 63 4.1.3.2 Estimating Missing Data 66 4.1.3.3 Using a Missing Data Correlation Matrix 70 4.1.3.4 Treating Missing Data as Data 71 4.1.3.5 Repeating Analyses with and without Missing Data 71 4.1.3.6 Choosing among Methods for Dealing with Missing Data 7 1 4.1.4 Outliers 72 4.1.4.1 Detecting Univariate and Multivariate Outliers 73 4.1.4.2 Describing Outliers 76 4.1.4.3 Reducing the Influence of Outliers 77 4.1.4.4 Outliers in a Solution 77 4.1.5 Normality, Linearity, and Hornoscedasticity 78 4.1.5.1 Normality 79 4.1.5.2 Linearity 83 4.1.5.3 Homoscedasticity, Homogeneity of Variance. and Homogeneity of Variance-Covariance Matrices 85 vi C O N T E N T S 4.1 Cornnion Data Tr;tn(forlndti~n\ 86 (7 1.1.7 M~~lticollinedrit'i>nd Sing~11~1-1ty 88 4.1.8 A Checklist and Some Pract~calR ecommendation\ 9 I 4.2 Complete Examples of Data Screening 92 4.2.1 Screening Ungrouped Data 92 4.2.1.1 Accuracy of Input. Missing Data, Distributions, and Univariate Outliers 93 4.2.1.2 Linearity and Homoscedasticity 96 4.2.1.3 Transformation 98 4.2.1.4 Detecting Multivariate Outliers 99 4.2.1.5 Variables Causing Cases to Be Outliers 100 4.2.1.6 Multicollinearity 104 4.2.2 Screening Grouped Data 105 4.2.2.1 Accuracy of Input, Missing Data, Distributions, Homogeneity of Variance, and Univariate Outliers 105 4.2.2.2 Linearity 110 4.2.2.3 Multivariate Outliers 11 1 4.2.2.4 Variables Causing Cases to Be Outliers 1 13 4.2.2.5 Multicollinearity 1 14 Ir 3 Multiple Regression 117 5.1 General Purpose and Description 117 5.2 Kinds of Research Questions 118 5.2.1 Degree of Relationship 1 19 5.2.2 Irr~portanceo f i'v'a 119 5.2.3 Adding IVs 119 5.2.4 Changing 1Vs 120 5.2.5 Contingencies among IVs 120 5.2.6 Comparing Sets of IVs 120 5.2.7 Predicting DV Scores for Members of a New Sample 120 5.2.8 Parameter Estimates 121 5.3 Limitations to Regression Analyses 121 5.3.1 Theoretical Issues 122 5.3.2 Practical Issues 123 5.3.2.1 Ratio of Cases to IVs 123 5.3.2.2 Absence of Outliers among the IVs and on the DV 124 5.3.2.3 Absence of Multicollinearity and Singularity 124 5.3.2.4 Normality, Linearity, Homoscedasticity of Residuals 125 5.3.2.5 Independence of Errors 128 5.3.2.6 Absence of Outliers in the Solution 128 5.4 Fundamental Equations for Multiple Regression 128 5.4.1 General Linear Equations 129 5.4.2 Matrix Equations 13 1 5.4.3 Computer Analyses of Small-Sample Example 134 C O \ \,EL r‘j vii 5.5 Major Types of klultiple Regression 136 5.5. l Stancl~rctM ultiple Regression 136 5.5.2 Sequential Multiple Regression 138 5.5.3 Statistical (Stepwise) Regression 138 5.5.4 Choosing among Regression Strategies 143 . 5.6 Some Important Issues 144 5.6.1 Importance of IVs 144 5.6.1.l Standard Multiple Regression 146 5.6.1.2 Sequential or Statistical Regression 146 5.6.2 Statistical Inference 146 5.6.2.1 Test for Multiple R 147 5.6.2.2 Test of Regression Components 148 5.6.2.3 Test of Added Subset of IVs 149 5.6.2.4 Confidence Limits around B and Multip le" R 50 5.6.2.5 Comparing Two Sets of Predictors 152 5.6.3 Adjustment of R~ 153 5.6.4 Suppressor Variables 154 5.6.5 Regression Approach to ANOVA 155 5.6.6 Centering when Interactions and Powers of IVs Are Included 157 5.6.7 Mediation in Causal Sequences 159 5.7 Complete Examples of Regression Analysis 161 5.7,1 Evaluation of Assumptions 16 1 5.7.1.1 Ratio of Cases to IVs 16 1 5.7.1.2 Normality, Linearity, Homoscedasticity, and Independence of Residuals 16 1 5.7.1.3 Outliers 165 5.7.1.4 Multicollinearity and Singularity 167 5.7.2 Standard Multiple Regression 167 5.7.3 Sequential Regression 174 5.7.4 Example of Standard Multiple Regression with Missing Values Multiply Imputed 179 5.8 Comparison of Programs 188 5.8.1 SPSS Package 188 5.8.2 SAS System 191 5.8.3 SYSTAT System 194 6 Analysis of Covariance 195 6.1 General Purpose and Description 195 6.2 Kinds of Research Questions 198 6.2.1 Main Effects of IV4 I98 6.2.2 Interactions among IVs 198 6.2.3 Specific Comparisons and Trend Analysis 199 6.2.4 Effects of Covariates 199 CONTENTS 6.2.5 Effect S~re I99 6.2.6 Parameter E\t~rnates 199 6.3 Limitations to Analysis of Covariance 200 6.3.1 Theoretical Issues 200 6.3.2 Practical Issues 20 1 6.3.2.1 Unequal Sample Sizes, Missing Data, and Ratio of Cases to IVs 201 6.3.2.2 Absence of Outliers 201 6.3.2.3 Absence of Multicollinearity and Singularity 20 1 6.3.2.4 Normality of Sampling Distributions 202 6.3.2.5 Homogeneity of Variance 202 6.3.2.6 Linearity 202 6.3.2.7 Homogeneity of Regression 202 6.3.2.8 Reliability of Covariates 203 6.4 Fundamental Equations for Analysis of Covariance 203 6.4.1 Sums of Squares and Cross Products 204 6.4.2 Significance Test and Effect Size 208 6.4.3 Computer Analyses of Small-Sample Example 209 6.5 Some Important Issues 211 6.5.1 Choosing Covariates 2 1 1 6.5.2 Evaluation of Covariates 2 12 6.5.3 Test for Homogeneity of Regression 2 13 6.5.4 Design Complexity 2 13 6.5.4.1 Within-Subjects and Mixed Within-Between Designs 2 13 6.5.4.2 Unequal Sample Si7es 2 17 6.5.4.3 Specific Comparisons and Trend Analysis 2 18 6.5.4.4 Effect Size 22 1 6.5.5 Alternatives to ANCOVA 22 1 6.6 Complete Example of Analysis of Covariance 223 6.6.1 Evaluation of Assumptions 223 6.6. I. 1 Unequal n and Missing Data 224 6.6.1.2 Normality 224 6.6.1.3 Linearity 224 6.6.1.4 Outliers 224 6.6.1.5 Multicollinearity and Singularity 227 6.6.1.6 Homogeneity of Variance 228 6.6.1.7 Homogeneity of Regression 230 6.6.1.8 Reliability of Covariates 230 6.6.2 Analysis of Covariance 230 6.6.2.1 Main Analysis 230 6.6.2.2 Evaluation of Covariates 235 6.6.2.3 Homogeneity of Regression Run 237 6.7 Comparison of Programs 240 6.7.1 SPSSPackage 240 C O N T E N T S ix 6 7 2 SXS Sqstem 710 6.7.3 SYSTAT Systern 310 7 Multivariate Analysis of Variance and Covariance 243 7.1 General Purpose and Description 243 7.2 Kinds of Research Questions 247 7.2.1 Main Effects of IVs 247 7.2.2 Interactions among IVs 247 7.2.3 Importance of DVs 247 7.2.4 Parameter Estimates 248 7.2.5 Specific Comparisons and Trend Analysis 248 7.2.6 Effect Size 248 7.2.7 Effects of Covariates 248 7.2.8 Repeated-Measures Analysis of Variance 249 7.3 Limitations to Multivariate Analysis of Variance and Covariance 249 7.3.1 Theoretical Issues 249 7.3.2 Practical Issues 250 7.3.2.1 Unequal Sample Sizes. Missing Data, and Power 250 7.3.2.2 Multivariate Normality 25 : 7.3.2.3 Absence of Outliers 25 1 7.3.2.4 Homogeneity of Variance-Covariance Matrices 25 1 7.3.2.5 Linearity 252 7.3.2.6 Homogeneity of Regression 252 7.3.2.7 Reliability of Covariates 253 7.3.2.8 Absence of Multicollinearity and Singular~ty 253 7.4 Fundamental Equations for Multivariate Analysis of Variance and Covariance 253 7.4.1 Multivariate Analysis of Variance 253 7.4.2 Computer Analyses of Small-Sample Example 261 7.4.3 Multivariate Analysis of Covariance 264 7.5 Some Important Issues 268 7.5.1 MANOVA vs. ANOVAs 268 7.5.2 Criteria for Statistical Inference 269 7.5.3 Assessing DVs 270 7.5.3.1 Univariate F 270 7.5.3.2 Roy-Bargmann Stepdown Analysis 27 1 7.5.3.3 Using Discriminant Analysis 272 7.5.3.4 Choosing among Strategies for Assessing DVs 273 7.5.4 Specific Comparisons and Trend Analysis 273 7.5.5 Design Complexity 274 7.5.5,l Within-Subjects and Between-Within Designs 273 7.5.5.3- Unequal Sample Sizes 276 X CONTENTS 7.6 Complete Examples of 3Iultivariate .Analysis of Variance and Covariance 277 7.6.1 Evaluation of Assulnptions 177 7.6. I. I Unequal Sample Sizes and Missing Data 277 7.6. 1.2 Multivariate Normality 279 7.6.1.3 Linearity 279 7.6.1.4 Outliers 279 7.6.1.5 Homogencity of Variance-Covariance Matrices 280 7.6.1.6 Homogeneity of Regression 28 1 7.6.1.7 Reliability of Covariates 284 7.6.1.8 Multicollinearity and Singularity 285 7.6.2 Multivariate Analysis of Variance 285 7.6.3 Multivariate Analysis of Covariance 296 7.6.3.1 Assessing Covariates 296 7.6.3.2 Assessing DVs 296 7.7 Comparison of Programs 307 7.7.1 SPSS Package 307 7.7.2 SAS System 310 7.7.3 SYSTAT System 3 10 8 Profile Analysis: The Multivariate Approach to Repeated Measures 311 8.1 General Purpose and Description 31 1 8.2 Kinds of Research Questions 312 8.2. I Parallelism of Profiles 3 12 8.2.2 Overali D~t'ferencea mong Groups 3 13 8.2.3 Flatness of Profiles 3 13 8.2.4 Contrasts Following Profile Analysis 3 13 8.2.5 Parameter Estimates 3 !3 8.2.6 Effect Size 314 8.3 Limitations to Profile Analysis 314 8.3.1 Theoretical Issues 3 14 8.3.2 Practical Issues 3 15 8.3.2.1 Sample Size, Missing Data, and Power 3 15 8.3.2.2 Multivariate Normality 3 15 8.3.2.3 Absence of Outliers 3 15 8.3.2.4 Homogeneity of Variance-Covariance Matrices 3 15 8.3.2.5 Linearity 3 16 8.3.2.6 Absence of Multicollinearity and Singularity 3 16 8.4 Fundamental Equations for Profile Analysis 316 8.4.1 Differences in Levels 3 16 8.4.2 Parallelism 3 18