The correct bibliographic citation for this manual is as follows: Stroup, Walter W., George A. Milliken, Elizabeth A. Claassen, and Russell D. Wolfinger . 2018. SAS® for Mixed Models: Introduction and Basic Applications. Cary, NC: SAS Institute Inc. SAS® for Mixed Models: Introduction and Basic Applications Copyright © 2018, SAS Institute Inc., Cary, NC, USA 978-1-63526-135-6 (Hardcopy) 978-1-63526-154-7 (Web PDF) 978-1-63526-152-3 (epub) 978-1-63526-153-0 (mobi) All Rights Reserved. Produced in the United States of America. For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc. For a web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication. The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated. U.S. Government License Rights; Restricted Rights: The Software and its documentation is commercial computer software developed at private expense and is provided with RESTRICTED RIGHTS to the United States Government. Use, duplication, or disclosure of the Software by the United States Government is subject to the license terms of this Agreement pursuant to, as applicable, FAR 12.212, DFAR 227.7202-1(a), DFAR 227.7202-3(a), and DFAR 227.7202-4, and, to the extent required under U.S. federal law, the minimum restricted rights as set out in FAR 52.227-19 (DEC 2007). If FAR 52.227-19 is applicable, this provision serves as notice under clause (c) thereof and no other notice is required to be affixed to the Software or documentation. The Government’s rights in Software and documentation shall be only those set forth in this Agreement. SAS Institute Inc., SAS Campus Drive, Cary, NC 27513-2414 December 2018 SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. SAS software may be provided with certain third-party software, including but not limited to open-source software, which is licensed under its applicable third-party software license agreement. For license information about third-party software distributed with SAS software, refer to http://support.sas.com/thirdpartylicenses. Contents About This Book ........................................................................................................ vii Dedication and Acknowledgments .............................................................................. xi Chapter 1: Mixed Model Basics ...................................................................................1 1.1 Introduction .......................................................................................................................................................1 1.2 Statistical Models ..............................................................................................................................................2 1.3 Forms of Linear Predictors ................................................................................................................................6 1.4 Fixed and Random Effects ................................................................................................................................8 1.5 Mixed Models ....................................................................................................................................................9 1.6 Typical Studies and Modeling Issues That Arise ............................................................................................. 10 1.7 A Typology for Mixed Models .......................................................................................................................... 15 1.8 Flowcharts to Select SAS Software to Run Various Mixed Models ................................................................. 16 Chapter 2: Design Structure I: Single Random Effect ................................................ 19 2.1 Introduction ..................................................................................................................................................... 19 2.2 Mixed Model for a Randomized Block Design ................................................................................................. 20 2.3 The MIXED and GLIMMIX Procedures to Analyze RCBD Data ...................................................................... 23 2.4 Unbalanced Two-Way Mixed Model: Examples with Incomplete Block Design .............................................. 32 2.5 Analysis with a Negative Block Variance Estimate: An Example .................................................................... 41 2.6 Introduction to Mixed Model Theory ................................................................................................................ 44 2.7 Summary ......................................................................................................................................................... 47 Chapter 3: Mean Comparisons for Fixed Effects ........................................................ 49 3.1 Introduction ..................................................................................................................................................... 49 3.2 Comparison of Two Treatments ...................................................................................................................... 50 3.3 Comparison of Several Means: Analysis of Variance ...................................................................................... 52 3.4 Comparison of Quantitative Factors: Polynomial Regression ......................................................................... 58 3.5 Mean Comparisons in Factorial Designs ......................................................................................................... 67 3.6 Summary ......................................................................................................................................................... 99 Chapter 4: Power, Precision, and Sample Size I: Basic Concepts ............................ 101 4.1 Introduction ................................................................................................................................................... 101 4.2 Understanding Essential Background for Mixed Model Power and Precision ............................................... 102 4.3 Computing Precision and Power for CRD: An Example ................................................................................ 104 4.4 Comparing Competing Designs I—CRD versus RCBD: An Example ........................................................... 108 4.5 Comparing Competing Designs II—Complete versus Incomplete Block Designs: An Example .................... 112 4.6 Using Simulation for Precision and Power .................................................................................................... 117 4.7 Summary ....................................................................................................................................................... 129 Chapter 5: Design Structure II: Models with Multiple Random Effects ..................... 131 5.1 Introduction ................................................................................................................................................... 131 5.2 Treatment and Experiment Structure and Associated Models ...................................................................... 132 5.3 Inference with Factorial Treatment Designs with Various Mixed Models ....................................................... 146 5.4 A Split-Plot Semiconductor Experiment: An Example ................................................................................... 150 5.5 A Brief Comment about PROC GLM ............................................................................................................. 161 5.6 Type × Dose Response: An Example ........................................................................................................... 162 5.7 Variance Component Estimates Equal to Zero: An Example ........................................................................ 171 iv SAS for Mixed Models 5.8 A Note on PROC GLM Compared to PROC GLIMMIX and PROC MIXED: Incomplete Blocks, Missing Data, and Spurious Non-Estimability ........................................................................................................................... 177 5.9 Summary ...................................................................................................................................................... 179 Chapter 6: Random Effects Models ......................................................................... 181 6.1 Introduction: Descriptions of Random Effects Models .................................................................................. 181 6.2 One-Way Random Effects Treatment Structure: Influent Example .............................................................. 188 6.3 A Simple Conditional Hierarchical Linear Model: An Example ..................................................................... 196 6.4 Three-Level Nested Design Structure: An Example ..................................................................................... 202 6.5 A Two-Way Random Effects Treatment Structure to Estimate Heritability: An Example .............................. 210 6.6 Modern ANOVA with Variance Components ................................................................................................ 214 6.7 Summary ...................................................................................................................................................... 216 Chapter 7: Analysis of Covariance .......................................................................... 217 7.1 Introduction ................................................................................................................................................... 217 7.2 One-Way Fixed Effects Treatment Structure with Simple Linear Regression Models .................................. 218 7.3 One-Way Treatment Structure in an RCB Design Structure—Equal Slopes Model: An Example ................ 223 7.4 One-Way Treatment Structure in an Incomplete Block Design Structure: An Example ................................ 237 7.5 One-Way Treatment Structure in a BIB Design Structure: An Example ....................................................... 246 7.6 One-Way Treatment Structure in an Unbalanced Incomplete Block Design Structure: An Example ............ 254 7.7 Multilevel or Split-Plot Design with the Covariate Measured on the Large-Size Experimental Unit or Whole Plot: An Example ..................................................................................................................................... 256 7.8 Summary ...................................................................................................................................................... 267 Chapter 8: Analysis of Repeated Measures Data ..................................................... 269 8.1 Introduction ................................................................................................................................................... 269 8.2 Mixed Model Analysis of Data from Basic Repeated Measures Design: An Example .................................. 272 8.3 Covariance Structures .................................................................................................................................. 283 8.4 PROC GLIMMIX Analysis of FEV1 Data ...................................................................................................... 293 8.5 Unequally Spaced Repeated Measures: An Example .................................................................................. 301 8.6 Summary ...................................................................................................................................................... 305 Chapter 9: Best Linear Unbiased Prediction (BLUP) and Inference on Random Effects ................................................................................................................... 307 9.1 Introduction ................................................................................................................................................... 307 9.2 Examples Motivating BLUP .......................................................................................................................... 308 9.3 Obtainment of BLUPs in the Breeding Random Effects Model ..................................................................... 311 9.4 Machine-Operator Two-Factor Mixed Model ................................................................................................ 316 9.6 Matrix Notation for BLUP .............................................................................................................................. 334 9.7 Summary ...................................................................................................................................................... 337 Chapter 10: Random Coefficient Models ................................................................. 339 10.1 Introduction ................................................................................................................................................. 339 10.2 One-Way Random Effects Treatment Structure in a Completely Randomized Design Structure: An Example ....................................................................................................................................................... 342 10.3 Random Student Effects: An Example ....................................................................................................... 348 10.4 Repeated Measures Growth Study: An Example ....................................................................................... 353 10.5 Prediction of the Shelf Life of a Product ..................................................................................................... 364 10.6 Summary .................................................................................................................................................... 376 Chapter 11: Generalized Linear Mixed Models for Binomial Data ............................ 377 11.1 Introduction ................................................................................................................................................. 377 11.2 Three Examples of Generalized Linear Mixed Models for Binomial Data ................................................... 378 11.3 Example 1: Binomial O-Ring Data .............................................................................................................. 380 11.4 Generalized Linear Model Background ...................................................................................................... 383 11.5 Example 2: Binomial Data in a Multicenter Clinical Trial ............................................................................. 394 v 11.6 Example 3: Binary Data from a Dairy Cattle Breeding Trial ......................................................................... 412 11.7 Summary ..................................................................................................................................................... 417 Chapter 12: Generalized Linear Mixed Models for Count Data ................................. 419 12.1 Introduction ................................................................................................................................................. 419 12.2 Three Examples Illustrating Generalized Linear Mixed Models with Count Data ........................................ 420 12.3 Overview of Modeling Considerations for Count Data ................................................................................. 421 12.4 Example 1: Completely Random Design with Count Data .......................................................................... 424 12.5 Example 2: Count Data from an Incomplete Block Design .......................................................................... 429 12.6 Example 3: Linear Regression with a Discrete Count Dependent Variable ................................................. 445 12.7 Blocked Design Revisited: What to Do When Block Variance Estimate is Negative ................................... 453 12.8 Summary ..................................................................................................................................................... 456 Chapter 13: Generalized Linear Mixed Models for Multilevel and Repeated Measures Experiments ............................................................................................................ 457 13.1 Introduction ................................................................................................................................................. 457 13.2 Two Examples Illustrating Generalized Linear Mixed Models with Complex Data ...................................... 457 13.3 Example 1: Split-Plot Experiment with Count Data...................................................................................... 458 13.4 Example 2: Repeated Measures Experiment with Binomial Data................................................................ 473 Chapter 14: Power, Precision, and Sample Size II: General Approaches .................. 487 14.1 Introduction ................................................................................................................................................. 487 14.2 Split Plot Example Suggesting the Need for a Follow-Up Study ................................................................. 487 14.3 Precision and Power Analysis for Planning a Split-Plot Experiment ............................................................ 489 14.4 Use of Mixed Model Methods to Compare Two Proposed Designs ............................................................ 492 14.5 Precision and Power Analysis: A Repeated Measures Example ................................................................. 495 14.6 Precision and Power Analysis for Non-Gaussian Data: A Binomial Example .............................................. 501 14.7 Precision and Power: Example with Incomplete Blocks and Count Data .................................................... 505 14.8 Summary ..................................................................................................................................................... 508 Chapter 15: Mixed Model Troubleshooting and Diagnostics..................................... 509 15.1 Introduction ................................................................................................................................................. 509 15.2 Troubleshooting........................................................................................................................................... 510 15.3 Residuals .................................................................................................................................................... 514 15.4 Influence Diagnostics .................................................................................................................................. 520 15.5 Two Diagnostic Plots Useful for Non-Gaussian Data .................................................................................. 538 15.5 Summary ..................................................................................................................................................... 541 Appendix A: Linear Mixed Model Theory .................................................................. 543 A.1 Introduction ................................................................................................................................................... 543 A.2 Matrix Notation .............................................................................................................................................. 543 A.3 Formulation of the Mixed Model .................................................................................................................... 544 A.4 Estimating Parameters, Predicting Random Effects ..................................................................................... 549 A.5 Statistical Properties ..................................................................................................................................... 557 A.6 Model Selection ............................................................................................................................................ 558 A.7 Inference and Test Statistics ......................................................................................................................... 559 Appendix B: Generalized Linear Mixed Model Theory .............................................. 563 B.1 Introduction ................................................................................................................................................... 563 B.2 Formulation of the Generalized Linear Model ............................................................................................... 563 B.3 Formulation of the Generalized Linear Mixed Model..................................................................................... 566 B.4 Conditional versus Marginal Models and Inference Space ........................................................................... 569 B.5 Integral Approximation .................................................................................................................................. 572 References .............................................................................................................. 577 Index ....................................................................................................................... 585 vi SAS for Mixed Models About This Book What Does This Book Cover? During the past 25 years, mixed models have become an integral part of statistical methodology. Nearly all areas of application that use statistics use mixed models in some form. Mixed models are taught in graduate-level statistics courses, as well as disciplines outside traditional statistics. Mixed models are familiar to most statisticians. Nonetheless, many persons who are engaged in analyzing mixed model data have questions about the appropriate implementation of the methodology. In addition, given the rapid growth of degree programs in data science, as well as statistics, those who are new to the discipline and ready to extend their knowledge of statistical methods to mixed models need a place to start. Being an area of active research and methodological development, mixed models have ever-increasing new applications capabilities available. Those who studied the topic several years ago may not be aware of these developments and need a resource that makes these advances accessible. This edition is intended to address the needs of this diverse audience. Like the first two editions of SAS for Mixed Models, this third publication presents mixed model methodology in a setting that is driven by applications. The scope is both broad and deep. Examples represent numerous areas of application and range from introductory examples to technically advanced case studies. The book is intended to be useful to as diverse an audience as possible, although persons with some knowledge of analysis of variance and regression analysis will benefit most. The first chapter provides important definitions and categorizations and delineates mixed models from other classes of statistical models. Chapters 2 through 10 cover specific forms of mixed models and the situations in which they arise. Randomized block designs (Chapter 2) give rise to models with fixed treatment and random block effects—among the simplest mixed models. These enable us to introduce elementary mixed model concepts and operations, and to demonstrate the use of SAS mixed model procedures in this simple setting. An overview of mean comparison procedures for various treatment designs is presented in Chapter 3. The topic of “power and sample size” often means doing a power calculation for a designated design at the end of the planning process. However, power involves more than sample size—different designs with the same sample size can yield very different power characteristics. Mixed models provide a powerful methodology for comprehensive assessment of competing plausible designs. Mixed model power and precision analysis is introduced in Chapter 4. Studies with multiple levels, such as split-plot and hierarchical designs, are common in many areas of application. These give rise to models with multiple random effects. The analysis of the associated models is discussed in Chapter 5. Chapter 6 considers models in which all effects are random, and it covers variance component estimation and inference on random effects. Chapter 7 covers analysis of covariance in the mixed model setting. Repeated measures in time or space and longitudinal data give rise to mixed models in which the serial dependency among observations can be modeled directly; this is the topic of Chapter 8. Chapter 9 continues with inference on random effects, a topic begun in Chapter 6. Chapter 9 is devoted to statistical inference based on best linear unbiased prediction of random effects. This naturally leads us to random coefficient and multilevel linear models (Chapter 10). The second edition of SAS for Mixed Models was published when the earliest version of the GLIMMIX procedure had just been released. Since then, new releases of PROC GLIMMIX have greatly expanded SAS capability to handle generalized linear mixed models (GLMMs), mixed models for non-Gaussian data. Although the first two editions of SAS for Mixed Models devoted a single chapter to GLMMs, this edition devotes three. The GLMM is introduced in Chapter 11 with binomial data. Chapter 12 introduces GLMMs for count data. Chapter 13 covers multilevel and repeated measures designs in a GLMM context. In Chapter 14 we revisit power and precision. Chapter 4 concerns simple designs and Gaussian data only, whereas Chapter 14 considers more complex designs and non-Gaussian data. Chapter 14 showcases the full potential of GLMM-based precision and power analysis. Chapter 15 covers mixed model diagnostics based on residuals and influence analysis, as well as some trouble-shooting strategies. Good statistical applications require a certain amount of theoretical knowledge. The more advanced the application, the more an understanding of mixed models’ theoretical foundations will help. Although this book focuses on applications, theoretical developments are presented as well. Appendix A covers linear mixed model theory. Appendix B covers generalized linear mixed model theory. These appendices describe how mixed model methodology works, provide essential detail about the algorithms used by SAS mixed model software, and cover the assumptions underlying mixed model analysis. In addition to viii describing how mixed models work, these appendices should help readers understand why things are not working in cases (hopefully few) where problems arise. Topics included in SAS for Mixed Models, Second Edition, but not appearing in this volume are as follows: ● Bayesian analysis ● spatial variability ● heterogeneous variance models ● the NLMIXED procedure ● additional case studies The authors have reserved these topics for a planned subsequent publication. What’s New in This Edition? SAS for Mixed Models, Second Edition, has been the go-to book for practitioners, students, researchers and instructors on mixed model methodology for more than a decade. PROC GLIMMIX is the most comprehensive and sophisticated mixed model software on the market. The current version of PROC GLIMMIX was released in 2008, two years after the publication of the second edition. This publication will be a worthy update incorporating developments over the past decade, building on the SAS for Mixed Models go-to status and fully taking advantage of PROC GLIMMIX capabilities. Some topics have been rearranged to provide a more logical flow, and new examples are introduced to broaden the scope of application areas. Nearly all examples have been updated to use PROC GLIMMIX as the “one-stop shop” for linear modeling, whether fixed effect only, linear mixed models, or generalized linear mixed models. The chapters on GLMMs greatly expand on SAS for Mixed Models, Second Edition, as knowledge and software capability have both improved over the past decade. Expanded power and precision chapters enhance the researcher’s ability to plan experiments for optimal outcomes. Statistical graphics now utilize the modern SGPLOT procedures. Is This Book for You? SAS for Mixed Models: Introduction and Basic Applications is useful to anyone wanting to use SAS for analysis of mixed model data. It is meant to be a comprehensive reference book for data analysts working with mixed models. It is a good supplementary text for a statistics course in mixed models, or a course in hierarchical modeling or applied Bayesian statistics. Mixed model applications have their roots in agricultural research, the behavioral sciences, and medical research—aspects of mixed model methodology arose somewhat independently in these three areas. But the same or similar methodology has proven to be useful in other subject areas, such as the pharmaceutical, natural resource, engineering, educational, and social science disciplines. We assert that almost all data sets have features of mixed models. Not everyone will want to read the book from cover to cover. Readers who have little or no exposure to mixed models will be interested in the early chapters and can progress through later chapters as their needs require. Readers with good basic skills may want to jump into the chapters on topics of specific interest and refer to earlier material to clarify basic concepts. To gain the most benefit from this book, ideally readers will have intermediate knowledge of SAS. More importantly, knowledge of some statistical ideas, such as multiple regression, analysis of variance, and experimental design, will ensure that the reader gains the most value from the book. ix What Should You Know about the Examples? This book includes examples for you to follow to gain hands-on experience with SAS. Software Used to Develop the Book's Content The software products used to develop the content for this book are as follows: Base SAS 9.4 SAS/STAT 14.3 SAS/GRAPH 9.4 Example Code and Data You can access the example code and data for this book by linking to its author pages at https://support.sas.com/authors. Output and Figures The tabular and graphical output in this book was generated with a SAS Output Delivery System style customized for optimal book print quality; therefore, your output will differ in appearance. Color versions of Figures 3.11 and 3.13 are included with the example code and data: https://support.sas.com/authors. SAS University Edition This book is compatible with SAS University Edition. If you are using SAS University Edition, then begin here: https://support.sas.com/ue-data . SAS Press Wants to Hear from You Do you have questions about a SAS Press book that you are reading? Contact us at [email protected]. SAS Press books are written by SAS Users for SAS Users. Please visit sas.com/books to sign up to request information on how to become a SAS Press author. We welcome your participation in the development of new books and your feedback on SAS Press books that you are using. Please visit sas.com/books to sign up to review a book Learn about new books and exclusive discounts. Sign up for our new books mailing list today at https://support.sas.com/en/books/subscribe-books.html. Learn more about these authors by visiting their author pages, where you can download free book excerpts, access example code and data, read the latest reviews, get updates, and more: https://support.sas.com/en/books/authors/walter-stroup.html https://support.sas.com/en/books/authors/george-milliken.html https://support.sas.com/en/books/authors/elizabeth-claassen.html https://support.sas.com/en/books/authors/russell-wolfinger.html x