Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

(MVY)
9 8 7 6 5 4 3 2 1    SPIN 10935354
springeronline.com

To Jano, for all the ways that you help to make things work, and for all the ways that you make things worthwhile (MW)

Preface

This edited volume gives a new and integrated introduction to item response models (predominantly used in measurement applications in psychology, education, and other social science areas) from the viewpoint of the statistical theory of generalized linear and nonlinear mixed models. Moreover, this new framework allows the domain of item response models to be co-ordinated and broadened to emphasize their explanatory uses beyond their standard descriptive uses.

The basic explanatory principle is that item responses can be modeled as a function of predictors of various kinds. The predictors can be (a) characteristics of items, of persons, and of combinations of persons and items; they can be (b) observed or latent (of either items or persons); and they can be (c) latent continuous or latent categorical. Thus, a broad range of models can be generated, including a wide range of extant item response models as well as some new ones. Within this range, models with explanatory predictors are given special attention, but we also discuss descriptive models. Note that the 'item responses' that we are referring to are not just the traditional 'test data,' but are broadly conceived as categorical data from a repeated observations design. Hence, data from studies with repeated-observations experimental designs, or with longitudinal designs, may also be modeled.

The intended audience for this volume is rather broad. First, the volume is meant to provide an introduction to item response models, starting from regression models, although the introduction is at a quite compact and general level. Second, since the approach is so general, many different kinds of models are discussed, well-known models as well as less well-known models, and even previously-unknown models, all from the same perspective, so that those already well familiar with psychometrics may also find the volume of interest to them. Third, the volume also has practical purposes for those already practicing in the field: (a) the regression-based framework that is presented makes it easier to see how models can be estimated with software that was not originally designed with item response models in mind, and (b) one can formulate and estimate new models, tailor-made to the measurement and explanatory purposes one has in mind. In this way, we hope to give practitioners a flexible tool for their work.

We see this volume as being suitable for a semester-long course for advanced graduate students in measurement and psychometrics, as well as a reference for the practitioner. Each chapter is followed by a set of exercises designed (a) to give the reader a chance to practice some of the computer analyses and (b) to point out some interesting perspectives and extensions arising in the chapter. In order to make the task easier for the reader, a unified approach to notation and model description is followed throughout the chapters, and a single data set is used in most examples to make it easier to see how the many models are related. The volume includes a chapter that describes the principal computer programs used in the analyses, and at the end of most chapters one can find command files and enough detail for a representative set of analyses, with the intent that the reader can carry out all computer analyses shown in this volume.
A website associated with this volume has been installed (http://bear.soe.berkeley.edu/EIRM/); it contains all data sets used in the chapters, the command files for all analyses, sample output, and (sample) answers to the exercises.

Part I of the volume gives an introduction to the framework. In Chapter 1, starting from the linear regression model, two basic ideas are explained: how linear models can be generalized using a nonlinear link function, and how individual differences can be incorporated, leading to generalized linear and nonlinear mixed models (GLMMs and NLMMs). In Chapter 2 we illustrate the concepts of descriptive and explanatory measurement using four basic item response models: the Rasch model, the latent regression Rasch model, the linear logistic test model (LLTM), and the latent regression LLTM. Chapter 3 describes the extension to models for polytomous data. The general statistical background of the models is explained in more depth in Chapter 4.

In Part II, these models are generalized to other and more complicated models that illustrate different ways that models can be explanatory by incorporating external factors. In this part, we concentrate on three types of predictors: (a) models with explanatory person predictors, including multilevel models with person groups as predictors (Chapter 5); (b) models with explanatory item predictors, including multilevel models with item groups as predictors (Chapter 6); and (c) models with explanatory person-by-item predictors, including models for differential item functioning (DIF) and so-called dynamic models with responses from one or more other items as predictors (Chapter 7).

In some situations it can make sense to consider models that deal with 'unknown' predictors or predictors with values that are 'not known a priori.' These are together called internal factors, because the values of the predictors are derived from the data, instead of being given as external information.
This is the basis for Part III. In this part, Chapter 8 and Chapter 9 deal with models with so-called latent item predictors. In Chapter 8 bilinear models with item slopes ('discrimination' parameters) are discussed, for example the two-parameter logistic model. Multidimensional models are also discussed in this chapter. In Chapter 9 bilinear models where item parameters are a function of other item parameters are discussed: the so-called models with internal restrictions on difficulty (MIRID). In general, independent of the model under consideration, some dependence between the item responses may remain. This is the issue of local item dependence. In Chapter 10, different ways to model remaining dependence are presented. An assumption in all models in the previous chapters is that, if predictor weights are random, a normal distribution applies to these weights. This assumption is relaxed in Chapter 11 on mixture models.

The volume closes with a final part where there is a chapter on estimation methods and software (Chapter 12). This chapter includes examples of how to use a wide variety of computer programs to estimate models in the chapters.

There are some topics that the reader might have expected to be included in this volume that we have not included. For example, in pursuit of our theme of explanatory rather than descriptive item response models, we have not explored the topic of the estimation of person parameters, a topic that is mainly of interest in descriptive measurement. In a similar vein, we have not discussed issues in conditional maximum likelihood estimation, as, at present, this seems less useful to explanatory measurement than other formulations.
Except in passing, we have not considered response formats involving response times and counts: We see these as being most promising forms of response data for response modeling, but did not include them at this point due to (a) the relative rarity of models for such data in the item response modeling literature, and (b) our own relative inexperience with such data formats. Although we make frequent use of some statistical model testing techniques, we do not include an in-depth account of such techniques, although a general discussion is given in Chapter 4.

Paul De Boeck, Leuven, Belgium
Mark Wilson, Berkeley, California, USA
December 29, 2003

Contents

Preface
Notation

Part I: Introduction to the framework

1 A framework for item response models
  1.1 Introduction
    1.1.1 Measurement or explanation?
    1.1.2 Test data, repeated observations data, and longitudinal data
    1.1.3 Categorical data
    1.1.4 A broader statistical perspective
  1.2 Example data set on verbal aggression
  1.3 The person side of the data
    1.3.1 Classical test theory
    1.3.2 Item analysis
  1.4 The other side of the data - the item side
  1.5 A joint analysis of the two sides
  1.6 The linear regression perspective
    1.6.1 Individual linear regressions
    1.6.2 Results of individual regressions
    1.6.3 An alternative: linear mixed models
    1.6.4 Formulation of the linear mixed model
    1.6.5 Application of the linear mixed model
    1.6.6 Multilevel modeling
    1.6.7 Analysis of variance
    1.6.8 Two points of view
  1.7 Modeling binary data
    1.7.1 The linear random-intercepts model as an underlying model for binary data
    1.7.2 The normal-ogive random-intercepts model for binary data
    1.7.3 The logistic random-intercepts model
    1.7.4 Scaling issues
    1.7.5 Item response models
  1.8 Generalized linear mixed models
  1.9 Philosophical orientation
  1.10 Exercises
  1.11 References

2 Descriptive and explanatory item response models
  2.1 Introduction
    2.1.1 The intercept or person parameter
    2.1.2 The weights or item parameters
    2.1.3 Resulting models
  2.2 Four item response models
    2.2.1 Summary and notation
  2.3 A doubly descriptive model: the Rasch model
    2.3.1 Formulation of the model
    2.3.2 Application of the Rasch model
  2.4 A person explanatory model: the latent regression Rasch model
    2.4.1 Formulation of the model
    2.4.2 Application of the latent regression Rasch model
  2.5 An item explanatory model: the LLTM
    2.5.1 Formulation of the model
    2.5.2 Application of the LLTM
  2.6 A doubly explanatory model: the latent regression LLTM
    2.6.1 Formulation of the model
    2.6.2 Application of the latent regression LLTM
  2.7 Enlarging the perspective
  2.8 Software
    2.8.1 Rasch model (verbal aggression data)
    2.8.2 Latent regression Rasch model (verbal aggression data)
  2.9 Exercises
  2.10 References

3 Models for polytomous data
  3.1 Introduction
  3.2 The multivariate generalized linear mixed model
    3.2.1 Data
    3.2.2 Multivariate extension of the generalized linear model
    3.2.3 Multivariate extension of the generalized linear mixed model
  3.3 Predictor matrices and model building
  3.4 Specifying the link function
    3.4.1 Adjacent-categories logits
    3.4.2 Cumulative logits
    3.4.3 Baseline-category logits
  3.5 Application of models for polytomous data
    3.5.1 Adjacent-categories logit models
    3.5.2 Cumulative logit models
  3.6 Choice of a logit link function

