
ERIC ED373069: Linear Models for Item Scores: Reliability, Covariance Structure, and Psychometric Inference. PDF

38 Pages·1993·0.63 MB·English
by  ERIC


DOCUMENT RESUME

ED 373 069    TM 021 791

AUTHOR: Woodruff, David
TITLE: Linear Models for Item Scores: Reliability, Covariance Structure, and Psychometric Inference.
INSTITUTION: American Coll. Testing Program, Iowa City, Iowa.
REPORT NO: ACT-RR-93-4
PUB DATE: 93
NOTE: 39p.
AVAILABLE FROM: ACT Research Report Series, P.O. Box 168, Iowa City, IA 52243.
PUB TYPE: Reports - Evaluative/Feasibility (142)
EDRS PRICE: MF01/PC02 Plus Postage.
DESCRIPTORS: Analysis of Covariance; *Analysis of Variance; Comparative Analysis; Factor Analysis; *Psychometrics; *Scores; *Statistical Inference; Statistical Studies; *Test Items; Test Reliability
IDENTIFIERS: Alpha Coefficient; *Covariance Structure Models

ABSTRACT: Two analyses of variance (ANOVA) models for item scores are compared. The first is an items by subject random effect ANOVA. The second is a mixed effects ANOVA with items fixed and subjects random. Comparisons regarding reliability, Cronbach's alpha coefficient, psychometric inference, and inter-item covariance structure are made between the models. When considering the inter-item covariance structures for the two ANOVA models, brief comparisons with factor analysis models are also made. It is concluded that inference from a sample of items to a population of items requires homogenous inter-item covariances, that reliability has different meanings under the two models, and that while coefficient alpha is a lower bound for reliability under the second model, it is not under the first. (Contains 51 references and two tables). (Author/SLD)

Reproductions supplied by EDRS are the best that can be made from the original document.

ACT Research Report Series 93-4

Linear Models for Item Scores: Reliability, Covariance Structure, and Psychometric Inference

David Woodruff

August 1993

U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it. Minor changes have been made to improve reproduction quality. Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.

"PERMISSION TO REPRODUCE THIS MATERIAL HAS BEEN GRANTED BY [signature illegible] TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)."

For additional copies write: ACT Research Report Series, P.O. Box 168, Iowa City, Iowa 52243

© 1993 by The American College Testing Program. All rights reserved.

David Woodruff
American College Testing

Running Head: LINEAR MODELS

Abstract

Two ANOVA models for item scores are compared. The first is an items by subject random effects ANOVA. The second is a mixed effects ANOVA with items fixed and subjects random. Comparisons regarding reliability, Cronbach's α coefficient, psychometric inference, and inter-item covariance structure are made between the models. When considering the inter-item covariance structures for the two ANOVA models, brief comparisons with factor analysis models are also made. It is concluded that inference from a sample of items to a population of items requires homogeneous inter-item covariances, that reliability has different meanings under the two models, and that while coefficient α is a lower bound for reliability under the second model, it is not under the first.

Key Words: Coefficient Alpha, Covariance Structure, Generalizability, Linear Models, Psychometric Inference, Reliability

Introduction

This paper compares two different ANOVA models for items. The first model is the two-way items by examinees random effects (Model II) ANOVA. The second model is the two-way items by examinees mixed effects (Model III) ANOVA. Very careful and complete statistical derivations of these models are given by Scheffé (1956a, 1956b, and 1959). This paper draws heavily from Scheffé's work. The two ANOVA models are compared to each other in detail and briefly to factor analysis models. Factor analysis models are extensively discussed by Harman (1976) and Mulaik (1972). As considered here, the factor analysis model is statistically more similar to the mixed ANOVA model than to the random ANOVA model. Under the factor analysis model, items are considered fixed and non-random, while subjects are randomly sampled from a population of subjects. See Mulaik and McDonald (1978), Williams (1978), and McDonald and Mulaik (1979) for an alternative formulation of the factor analysis model.

All of the models under consideration are linear models. A model is defined as linear if an examinee's expected score on an item is a linear function of item characteristics. Item characteristics may be fixed parameters as in the mixed ANOVA model or random variables as in the random ANOVA model. The factor analysis model is here considered to be linear in its item parameters, which are usually called factor loadings, even though these linear coefficients are applied to factor scores, which are unobserved random variables associated with examinees. An example of a nonlinear model is the logistic ogive item characteristic curve model (Lord and Novick, 1968). From a theoretical viewpoint, linear models usually do not accurately describe dichotomously scored items, and most items are so scored.
However, for carefully constructed tests, linear models for item scores are often sufficiently accurate to provide useful approximations. [See Feldt (1965), Hsu and Feldt (1969), Hakstian and Whalen (1976), Seeger and Gabrielsson (1968), Gabrielsson and Seeger (1976), McDonald and Ahlawat (1974), McDonald (1981, 1985), and Collins, Cliff, McCormick, and Zatkin (1986).]

The discussion of the models presented here will focus on three characteristics useful in psychometrics. The first is reliability. Under the three models reliability is defined as the squared correlation between an observed and a true score. A few relevant references regarding reliability are Guttman (1945), Novick and Lewis (1967), Bentler (1972), Jackson and Agunwamba (1977), and Bentler and Woodward (1980, 1983). Parametric expressions for reliability and Cronbach's (1951) coefficient alpha are given, and the sampling distribution for the sample alpha coefficient is discussed. The second characteristic is the inter-item covariance matrix. For each model, the assumed or resulting covariance structure is discussed and compared with factor analysis models. Finally, psychometric inference is discussed. Psychometric inference is considered as statistical inference to a population of items from a sample of items randomly drawn from the population. The more general term generalizability is not used since it connotes statistical inference for a wide array of facets, not just items. There is a large body of literature on psychometric inference. A few references are Hotelling (1933), Tryon (1957), Lord and Novick (1968), Cronbach, Gleser, Nanda, and Rajaratnam (1972), Mulaik (1972), Kaiser and Michael (1975), Rozeboom (1978), McDonald (1978), and Brennan (1983). Both the approach and the results presented here, while most similar to those developed by Lord and Novick (1968) and Cronbach et al. (1972), differ from them in part.
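
The inter-item covariance matrix named above as the second characteristic can be estimated directly from a table of item scores. The following is a minimal sketch, not taken from the report: the function names, the row-per-examinee layout, and the toy data are all assumptions made for illustration. It computes the sample covariance matrix and reports the range of its off-diagonal elements, which gives a rough look at how far the covariances are from being homogeneous.

```python
from statistics import mean

def inter_item_covariance(scores):
    """Return the n x n sample covariance matrix of item scores.

    `scores` is a list of examinee rows, one score per item
    (a hypothetical layout chosen for this sketch).
    """
    n_examinees = len(scores)
    n_items = len(scores[0])
    item_means = [mean(row[i] for row in scores) for i in range(n_items)]
    cov = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for k in range(n_items):
            cross = sum(
                (row[i] - item_means[i]) * (row[k] - item_means[k])
                for row in scores
            )
            cov[i][k] = cross / (n_examinees - 1)  # unbiased sample estimate
    return cov

def off_diagonal_range(cov):
    """Return (min, max) of the off-diagonal covariances."""
    size = len(cov)
    off = [cov[i][k] for i in range(size) for k in range(size) if i != k]
    return min(off), max(off)

# Toy data: 6 examinees by 3 dichotomously scored items (illustrative only).
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
cov = inter_item_covariance(scores)
low, high = off_diagonal_range(cov)
print("off-diagonal covariances range from", low, "to", high)
```

A wide range is only a descriptive hint that the covariances are heterogeneous; it is not a formal test of homogeneity.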

Brief descriptions of seven conclusions original to this paper are:

1. Conditional variances for interaction effects may be heterogeneous in the random ANOVA model.

2. The random ANOVA model requires the inter-item covariance matrix to have homogeneous off-diagonal elements, while the mixed ANOVA model places no restrictions on the inter-item covariance matrix except positive semi-definiteness. Hence, any factor analysis model may be subsumed under the mixed ANOVA model but not the random ANOVA model.

3. Interaction effects in the random ANOVA model are analogous to specific factors in a certain single common factor factor analysis model, while the examinee main effect is analogous to the single common factor.

4. The squared correlation between observed scores and true scores is a useful definition of reliability under the random ANOVA model as well as under the mixed ANOVA model, but the definition of true score differs under the two models.

5. Reliability as defined in 4 has different meanings under the two models. In the mixed ANOVA model, interaction (specific) variance is included in true score variance, while in the random ANOVA model it is not.

6. The parametric value of Cronbach's alpha coefficient is a lower bound to the parametric value of reliability (as defined in 4) under the mixed ANOVA model but not under the random ANOVA model.

7. Given certain normality assumptions, a transformation of the sample alpha coefficient has an F distribution under the random ANOVA model. For the mixed ANOVA model, the F distribution only holds if in addition to certain normality assumptions there are either no interactions or the inter-item covariance matrix has special restricted forms.

The practical implications of these conclusions for the analysis of test data will be discussed in the last section of this paper.
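
Several of the conclusions above concern the sample alpha coefficient. As a point of reference, here is a minimal sketch of how the sample coefficient is commonly computed from an examinees-by-items score table, using the standard textbook formula alpha = n/(n-1) * (1 - sum of item variances / variance of total scores). The function name, the data layout, and the toy data are assumptions for illustration; the report's own parametric expressions and the F transformation it mentions are not reproduced here.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Sample coefficient alpha for a list of examinee rows.

    Each row holds one score per item (hypothetical layout for this sketch).
    """
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across examinees.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each examinee's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)

# Toy data: 6 examinees by 3 dichotomously scored items (illustrative only).
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
alpha = cronbach_alpha(toy_scores)
print("sample alpha:", alpha)
```

When every item column is identical, the item variances sum to 1/n of the total-score variance and the function returns 1, which is a convenient sanity check on the arithmetic.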

The Items by Examinees Random ANOVA Model

The model presented here is essentially the same model developed by Scheffé (1959, chap. 7). It assumes that a random sample of n items chosen from a countably infinite population of items is administered to a random sample of N examinees chosen from a countably infinite population of examinees. The sampling of items and examinees is assumed to be completely independent. Let $x_{ij}$ represent subject j's observed score on item i. A preliminary form of the model is

\[ x_{ij} = t_{ij} + e_{ij}, \qquad i = 1, \dots, n, \quad j = 1, \dots, N. \qquad (1) \]

The quantities $t_{ij}$ and $e_{ij}$ are, respectively, the true score and the error score of examinee j on item i. Different definitions for true and error scores under the random ANOVA model will be admitted later. Within the present context, true and error scores are not absolutes; their definitions may vary depending on the inferences being made. The various true and error scores considered in this paper are not necessarily an exhaustive set of possible true and error scores under the models presented.

If examinee j responds independently and repeatedly to item i, these random replications are indexed by the subscript k. For cognitive tests such replications are rarely available, though they occasionally may be obtained for affective scales. The present development assumes that such replications are not available from the data. In the theoretical development of the model, these replications are allowed to be present. In particular, the model assumes that for the sequences of independent random variables $e_{ij1}, e_{ij2}, \dots, e_{ijk}, \dots$,

\[ E(e_{ijk}) = 0 \quad \text{and} \quad \mathrm{Var}(e_{ijk}) = E(e_{ijk}^2) = \sigma^2(e_{ij}), \]

i.e., that the error variances are heterogeneous over the domains of i and j. For notational simplicity, the subscript k will usually be suppressed, since for the remainder of the paper it will usually take the value of one.
The above imply that $E_i(e_{ij}) = 0$ and that $E_j(e_{ij}) = 0$, where notation such as $E_i$ and $\mathrm{Var}_i$ means that the expectation and variance are taken over the population whose members are indexed by the subscript i. When no subscript is present the expectation is over random replications. The above also imply that the true and error scores are uncorrelated, i.e.,

\[ \mathrm{Cov}_i(t_{ij}, e_{ij'}) = 0 \quad \text{and} \quad \mathrm{Cov}_j(t_{ij}, e_{i'j}) = 0 \]

for all $j, j'$ and $i, i'$, respectively. It is further assumed that all errors are independent within and across all populations.
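
To see why uncorrelated true and error scores make the squared-correlation definition of reliability tractable, consider a single fixed item i and take variances over the examinee population. This is an item-level illustration only; the symbol $\rho^2$ is introduced here as an assumption, and the report's own reliability expressions under each model are not reproduced. Using $x_{ij} = t_{ij} + e_{ij}$ and the uncorrelatedness of true and error scores for the same item:

```latex
\begin{align*}
\mathrm{Var}_j(x_{ij})
  &= \mathrm{Var}_j(t_{ij}) + \mathrm{Var}_j(e_{ij})
     + 2\,\mathrm{Cov}_j(t_{ij}, e_{ij})
   = \mathrm{Var}_j(t_{ij}) + \mathrm{Var}_j(e_{ij}), \\
\mathrm{Cov}_j(x_{ij}, t_{ij})
  &= \mathrm{Var}_j(t_{ij}) + \mathrm{Cov}_j(e_{ij}, t_{ij})
   = \mathrm{Var}_j(t_{ij}), \\
\rho^2(x_{ij}, t_{ij})
  &= \frac{\mathrm{Cov}_j(x_{ij}, t_{ij})^2}
          {\mathrm{Var}_j(x_{ij})\,\mathrm{Var}_j(t_{ij})}
   = \frac{\mathrm{Var}_j(t_{ij})}
          {\mathrm{Var}_j(t_{ij}) + \mathrm{Var}_j(e_{ij})}.
\end{align*}
```

The squared correlation therefore reduces to the share of observed-score variance that is true-score variance, so any change in what counts as the true score changes the value, and the meaning, of reliability.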
