Revised Comprehensive Norms for an Expanded Halstead-Reitan Battery: Demographically Adjusted Neuropsychological Norms for African American and Caucasian Adults

Professional Manual

Robert K. Heaton, PhD, S. Walden Miller, PhD, Michael J. Taylor, PhD, and Igor Grant, MD

Copyright © 1991, 1992, 2004 by PAR. All rights reserved. May not be reproduced in whole or in part in any form or by any means without written permission of PAR.

Reorder #RO-5273. Printed in the U.S.A.

ACKNOWLEDGMENTS

The authors are most grateful to several people who provided assistance in this work. The late Charles G. Matthews and Harold Goodglass generously contributed to the normative data sets used in the previous “Comprehensive Norms” publications; these data also were included in the analyses used to generate the current norms for Caucasians. Nanci Avitable put together several data files in support of the current project. We wish to thank Ralph M. Reitan for allowing us to present here our adaptation of his Story Memory Test; in a broader sense, we are indebted to him for originating many of the neuropsychological methods and concepts discussed in this volume. Finally, Carol Demong provided valuable editorial assistance in the preparation of this manuscript.

Much of the test data used to develop these norms were obtained in research supported by several NIH grants awarded to the current authors. These include the NIMH-funded HIV Neurobehavioral Research Center (Dr. Grant, PI; Dr. Heaton, PI Neuropsychology Core), the NIDA-funded NeuroAIDS: Effects of Methamphetamine (Dr. Grant, PI; Dr. Heaton, PI Neuropsychology Core), the NHLBI-funded Nocturnal Oxygen Therapy Trial (Dr. R. Timms, PI; Drs. Heaton & Grant, Co-Investigators), the NIDA-funded Collaborative Neuropsychological Study of Polydrug Users (Dr. Grant, PI), and the VA Merit Review-funded Alcohol Abuse and Neuropsychological Impairment Study (Dr. Grant, PI). The majority of African American data were collected as parts of Dr. Miller’s NIA-funded Senior African American Neuropsychological Assessment, and the NIMH-funded African-American Neuropsychological Test Norms.

We thank the many co-investigator colleagues who collaborated with us over the years in acquiring these data; and we especially thank the many volunteers who contributed their time and energy as control participants in these various research programs.

Robert K. Heaton
S. Walden Miller
Michael J. Taylor
Igor Grant

TABLE OF CONTENTS

ACKNOWLEDGMENTS
CHAPTER 1: INTRODUCTION
CHAPTER 2: THE NEED FOR DEMOGRAPHIC CORRECTIONS IN NEUROPSYCHOLOGICAL ASSESSMENT
CHAPTER 3: DATA COLLECTION FOR THE NORMATIVE PROJECT
    Participant Samples
    Neuropsychological Testing
        Third Editions of the Wechsler Adult Intelligence Scale (WAIS-III) and Wechsler Memory Scale (WMS-III)
        Halstead-Reitan Battery (HRB)
        Wisconsin Card Sorting Test (WCST)
        Thurstone (Written) Word Fluency Test
        Letter and Category (Oral) Fluency Tests
        Paced Auditory Serial Addition Test (PASAT)
        Digit Vigilance Test
        Boston Naming Test (BNT)
        Complex Ideational Material Subtest of the Boston Diagnostic Aphasia Examination (BDAE)
        Peabody Individual Achievement Test (PIAT)
        Story Memory Test
        California Verbal Learning Test (CVLT)
        Figure Memory Test
        Global Deficit Score (GDS)
    Summary of Measures
CHAPTER 4: DEVELOPMENT AND VALIDATION OF THE DEMOGRAPHIC CORRECTION SYSTEM
    Procedure
        Years of Education: Definition
        Demographic Corrections
        Evaluating the T Scores
    Results
    Further Refinement and Validation of the Global Deficit Score (GDS)
CHAPTER 5: USE OF THE DEMOGRAPHIC CORRECTION SYSTEM
    Converting Raw Scores to T Scores
    Clinical Case Examples
        Patient 1 (P.E.)
        Patient 2 (T.J.)
        Patient 3 (M.P.)
        Patient 4 (M.T.)
    Review of Interpretive Issues
REFERENCES
APPENDIX A. Scoring Guidelines for the Story Memory Test
APPENDIX B. Scoring Guidelines for the Figure Memory Test
APPENDIX C. Raw Score to Scaled Score Equivalents
APPENDIX D. Caucasian-Based Norms: Scaled Score to T-Score Conversions by Sex, Education, and Age Group
APPENDIX E. African American-Based Norms: Scaled Score to T-Score Conversions by Sex, Education, and Age Group
APPENDIX F. Raw Score to T-Score Conversions for the Wisconsin Card Sorting Test (WCST) Perseverative Errors Score
APPENDIX G. Raw Score to T-Score Conversions for the Global Deficit Score (GDS)

Chapter 1
INTRODUCTION

This manual describes a normative system designed to assist clinicians and researchers in their interpretation of the tests in an expanded Halstead-Reitan Neuropsychological Test Battery. A major purpose of this system is to improve neurodiagnostic accuracy by providing simultaneous corrections for four demographic variables that relate significantly to test performance: age, education, sex, and ethnicity (Caucasian or African American). Another purpose of the system is to convert raw scores on diverse tests to standard scores, all of which have the same distribution in groups of neurologically normal adults.

The ability to convert raw test scores to demographically corrected scores has a number of advantages for the neuropsychologist. First, it facilitates comparisons of an individual’s test results with normal expectations based on that person’s demographic characteristics. For example, the fact that a patient’s raw test score is below average for the general adult population does not mean that it is a poor score for persons at all levels of age and education. The score may be above average in comparison to some adult samples and in the “impaired” range in comparison to others. Using the corresponding standard score, the clinician can establish more precisely what percentage of normal individuals with the same demographic characteristics perform at or below the patient’s level of performance. Second, because standard scores in this normative system are more directly comparable, use of these scores facilitates the analysis of an individual’s patterns of strengths and deficits across tests. Instead of trying to compare a time score on one test with an error score on another, comparisons are made between scores that have the same units of measurement. Similarly, in research involving groups of individuals, the use of demographically corrected scores facilitates comparisons of strengths and deficits both within and between groups.

Chapter 2 briefly discusses the importance of considering age, education, sex, and ethnicity in neuropsychological test interpretation. Chapter 3 describes the methods used to collect the data employed in the present normative system. Chapter 4 describes the development of the system, which corrects for the influence of age (20 through 85 years), education (0 through 20 years), sex, and ethnicity (Caucasian vs. African American) on the raw scores. This chapter also examines the demographic influences that are apparent in both the raw scores and the corrected scores for all tests in the battery. The distributional properties of the corrected scores are examined, and the performance of the correction system is checked at different levels of age and education. Because this revision of the authors’ previously published, regression-based norms is based on larger samples and incorporates nonlinear, as well as linear, effects of demographic variables, some differences in the T scores produced by the two systems were anticipated. Therefore, chapter 4 also examines the degree of similarity between T scores from the 1991 norms and the current ones for each test in the battery; these analyses used only Caucasian participants, because virtually all of the participants in the 1991 normative project were Caucasian.

Chapter 5 describes and illustrates the appropriate use of Appendixes C through G in converting raw scores to demographically corrected scores. In addition, four case examples are presented in this chapter, and several issues related to the use of the corrected scores in clinical interpretation and research applications are addressed.

Chapter 2
THE NEED FOR DEMOGRAPHIC CORRECTIONS IN NEUROPSYCHOLOGICAL ASSESSMENT

Over the last several decades, the field of clinical neuropsychology has produced a large number of standardized tests and test batteries that are sensitive to cerebral disorders (Boller, Grafman, & Rizzolatti, 2000; Grant & Adams, 1996; Lezak, 1995). These instruments are widely used for neurodiagnostic purposes and have unique value in identifying the effects of brain disorders on patients’ basic adaptive abilities (Heaton & Marcotte, 2000). However, neuropsychological tests are not exclusively sensitive to brain pathology. It has long been realized that performances on most of these instruments are strongly related to age and education, and that significant male-female differences are observed with a few of these tests (Finlayson, Johnson, & Reitan, 1977; Heaton, Ryan, Grant, & Matthews, 1996; Matarazzo, 1972; Parsons & Prigatano, 1978; Reitan, 1955). The scope of this problem is illustrated by a study of the relationships between the neuropsychological test scores and demographic characteristics of 553 neurologically normal adults (Heaton, Grant, & Matthews, 1986). The tests included the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1955) and the Halstead-Reitan Battery (HRB; Reitan & Wolfson, 1985). A great deal of variability was found in the strength of the associations between individual test measures and particular demographic variables; that is, some test scores were found to be strongly related to age, whereas others were found to be more strongly related to education. For a few tests (especially tests of motor speed and strength in the upper extremities), scores were found to be most strongly related to sex. A significant amount of variability in almost all test scores (more than 40% in some cases) could be accounted for by a single demographic variable. Even more of the test score variance, however, can be accounted for by multiple demographic variables in combination (Barona, Reynolds, & Chastain, 1984; Karzmark, Heaton, Grant, & Matthews, 1984; Karzmark, Heaton, Lehman, & Crouch, 1985; Wilson et al., 1978).

In addition to age, education, and sex, there is increasing evidence that ethnicity accounts for substantial amounts of variance in cognitive test performances of normal adults (Evans, Miller, Byrd, & Heaton, 2000). Although other ethnic minority groups have shown performance differences versus Caucasians on such tests (e.g., Arnold, Montgomery, Castaneda, & Longoria, 1994; Heaton, Taylor, & Manly, 2003), much of the research in this area has focused on African Americans. For example, African Americans in the Wechsler Adult Intelligence Scale-III/Wechsler Memory Scale-III (WAIS-III/WMS-III; The Psychological Corporation, 1997) national standardization sample performed significantly worse than their Caucasian counterparts, even when effects of other demographic variables were controlled (Heaton et al., 2003). This phenomenon is not test-specific: Similar ethnicity-related differences have been observed on earlier versions of the Wechsler batteries (Kaufman, McLean, & Reynolds, 1988; Reynolds, Chastain, Kaufman, & McLean, 1987) and on other tests covering a broad range of cognitive functions. These have included tests of letter and category fluency (Gladsjo et al., 1999; Johnson-Selfridge, Zalewski, & Aboudarham, 1998), naming/word-finding (Roberts & Hamsher, 1984), the Category Test from the HRB (Bernard, 1989), the California Verbal Learning Test (Norman, Evans, Miller, & Heaton, 2000), and the Paced Auditory Serial Addition Test (Diehr, Heaton, Miller, & Grant, 1998). A variety of cultural, educational, socioeconomic, and other factors probably contribute to these ethnicity-related differences in cognitive test performance, including the experience during test taking that one is being judged in terms of a negative racial stereotype (“stereotype threat”). The latter phenomenon is especially likely to be part of the cognitive test-taking experience of African Americans and has been experimentally manipulated to influence test performance of Caucasians as well (Steele & Aronson, 1995, 1998). In any event, the noted ethnic differences in test performance are reliably observed and, although they may not accurately reflect differences in the ultimate potential of the examinees, they should be considered when interpreting cognitive tests within a neurodiagnostic context. Failure to do so typically results in a substantial (sometimes up to three-fold) increase in the probability of misclassifying normal African Americans as having brain disorders, as compared to misclassification rates for Caucasians (e.g., Heaton et al., 2003; Norman et al., 2000).

Despite the apparent importance of demographic variables in predicting neuropsychological test performances of normal adults, most such tests have normative standards that either ignore demographic effects entirely or correct only for age. These norms may take the form of a single cutoff score (defining impaired and unimpaired ranges) or may define a normal range of performance and several levels of impairment. In either case, the accuracy and appropriateness of the norms will be best when they are applied to new individuals who are similar to the average person in the normative sample (typically, a middle-aged Caucasian male with 1 or 2 years of college). On the other hand, the neurodiagnostic sensitivity and specificity of such norms are likely to vary considerably when applied to individuals and groups that differ from the average person in the normative sample (e.g., older or younger people with high or low levels of education, ethnic minorities; see Heaton et al., 1986, and Heaton et al., 2003).

To illustrate these points, the authors show how a single set of norms for the HRB (Russell, Neuringer, & Goldstein, 1970) performs with demographically stratified subgroups of the current large normative sample. More specifically, the analyses consider the percentages of neurologically normal participants in these subgroups who are correctly classified as normal by the Russell et al. standards for the Average Impairment Rating (AIR) and 11 of the component HRB measures. For these analyses, the Caucasians and African Americans will be further divided into subgroups at three age levels (<40 years, 40-59 years, 60+ years) and three educational levels (<12 years, 12-15 years, 16+ years).

The results are presented in Table 1. Note that the sample sizes vary somewhat for the different test variables. Although the same core group of Caucasian and African American participants took all of the tests, not all of the tests in the HRB were administered to all of the participants. Nevertheless, more than 1,000 participants provided data for each of the HRB test scores. The only large amount of missing data occurred for African Americans on Finger Tapping, and this was because the test was added to the battery after the African American norm development study was underway. As a result, 150 of the 578 AIRs of African Americans were prorated based upon 11 of the 12 component scores; no differences in the pattern of results were seen when these participants were excluded from the analyses.

Turning now to the results for the Caucasian cohort (see left half of Table 1), consider first the Russell et al. (1970) norms for the various test measures for the entire group. Although most of these correct classification rates are fairly high (≥74% for 9 of the 13 variables), a few are not, most notably for the Tactual Performance Test–Location and Finger Tapping–Worst Hand. Moreover, Table 1 shows that the overall correct classification rates for the 13 measures vary by more than 50 percentage points (i.e., from 40% for Finger Tapping–Worst Hand in females to 93% for Sensory-Perceptual–Total). This indicates that the suggested test score cutoff points for defining “impairment” on the different measures are not at all comparable in their ability to correctly classify nonimpaired individuals. Use of such cutoff scores could cause the clinician or researcher to mistakenly infer that a given pattern of impaired or unimpaired test results reflects dysfunction of specific brain regions or functional circuits, when the pattern actually represents inconsistencies within the normative system (i.e., differences in the diagnostic precision of the cutoff scores for the various tests). For the same reason, such normative systems can obscure the presence of diagnostically meaningful patterns that do exist in the test results.

The inconsistent overall performance of these normative standards might be less surprising if one considers the relatively imprecise manner in which they were developed. Although the Russell et al. (1970) norms were arguably the best available at the time they were published and continue to be widely used by clinicians and researchers, they were originally based upon a combination of clinical judgment and the test results of a very small sample of neurologically normal individuals (n = 26).

As Table 1 also indicates, this variability in the performance of the normative data is even more dramatic when one considers the way different demographic subgroups are classified by the same test measure. The AIR is a representative example of this point. Although the overall correct classification rate of the AIR for the total normal group is a respectable 82%, it correctly classified