
Psychological Testing: Principles and Applications



Pearson New International Edition

Psychological Testing: Principles and Applications
Kevin R. Murphy, Charles O. Davidshofer
Sixth Edition

ISBN 10: 1-292-04002-5
ISBN 13: 978-1-292-04002-8

Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsoned.co.uk

© Pearson Education Limited 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners.

ISBN 10: 1-269-37450-8
ISBN 13: 978-1-269-37450-7

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Printed in the United States of America

PEARSON CUSTOM LIBRARY

Table of Contents

(All chapters, appendices, and references: Kevin R. Murphy/Charles O. Davidshofer)

1. Tests and Measurements
2. Defining and Measuring Psychological Attributes: Ability, Interests, and Personality
3. Testing and Society
4. Basic Concepts in Measurement and Statistics
5. Scales, Transformations, and Norms
6. Reliability: The Consistency of Test Scores
7. Using and Interpreting Information About Test Reliability
8. Validity of Measurement: Content and Construct-Oriented Validation Strategies
9. Validity for Decisions: Criterion-Related Validity
10. Item Analysis
11. The Process of Test Development
12. Computerized Test Administration and Interpretation
13. Ability Testing: Individual Tests
14. Ability Testing: Group Tests
15. Issues in Ability Testing
16. Interest Testing
17. Personality Testing
Appendix: Forty Representative Tests
Appendix: Ethical Principles of Psychologists and Code of Conduct
References
Index

Tests and Measurements

The term psychological test brings to mind a number of conflicting images. On the one hand, the term might make one think of the type of test so often described in television, movies, and the popular literature, wherein a patient answers questions like, “How long have you hated your mother?” and in doing so reveals hidden facets of his or her personality to the clinician. On the other hand, the psychological test might refer to a long series of multiple-choice questions such as those answered by hundreds of high school students taking college entrance examinations.
Another type of “psychological test” is the self-scored type published in the Reader’s Digest, which purports to tell you whether your marriage is on the rocks, whether you are as anxious as the next fellow, or whether you should change your job or your lifestyle.

In general, psychological tests are neither mysterious, as our first example might suggest, nor frivolous, as our last example might suggest. Rather, psychological tests represent systematic applications of a few relatively simple principles in an attempt to measure personal attributes thought to be important in describing or understanding individual behavior. The aim of this text is to describe the basic principles of psychological measurement and to describe the major types of tests and their applications. We will not present test theory in all its technical detail, nor will we describe (or even mention) all the different psychological tests currently available. Rather, our goal is to provide the information needed to make sensible evaluations of psychological tests and their uses within education, industry, and clinical practice.

The first question that should be addressed in a psychological testing text is, “Why is psychological testing important?” There are several possible answers to this question, but we believe that the best answer lies in the simple statement that forms the central theme of this text: Tests are used to make important decisions about individuals. College admissions officers consult test scores before deciding whether to admit or reject applicants. Clinical psychologists use a variety of objective and projective tests in the process of choosing a course of treatment for individual clients. The military uses test scores as aids in deciding which jobs an individual soldier might be qualified to fill. Tests are used in the world of work, both in personnel selection and in professional certification and licensure. Almost everyone reading this text has taken at least one standardized psychological test. Scores on such a test may have had some impact on an important decision that has affected your life. The area of psychological testing is therefore one of considerable practical importance.

From Chapter 1 of Psychological Testing: Principles and Applications, Sixth Edition. Kevin R. Murphy, Charles O. Davidshofer. Copyright © 2005 by Pearson Education, Inc. All rights reserved.

Psychological tests are used to measure a wide variety of attributes: intelligence, motivation, mastery of seventh-grade mathematics, vocational preferences, spatial ability, anxiety, form perception, and countless others. Unfortunately, one feature that all psychological tests share in common is their limited precision. They rarely, if ever, provide exact, definitive measures of variables that are believed to have important effects on human behavior. Thus, psychological tests do not provide a basis for making completely accurate decisions about individuals. In reality, no method guarantees complete accuracy. Thus, although psychological tests are known to be imperfect measures, a special panel of the National Academy of Sciences concluded that psychological tests generally represent the best, fairest, and most economical method of obtaining the information necessary to make sensible decisions about individuals (Wigdor & Garner, 1982a, 1982b). The conclusions reached by the National Academy panel form another important theme that runs through this text. Although psychological tests are far from perfect, they represent the best, fairest, and most accurate technology available for making many important decisions about individuals.

Psychological testing is highly controversial.
Public debate over the use of tests, particularly standardized tests of intelligence, has raged since at least the 1920s (Cronbach, 1975; Haney, 1981; Scarr, 1989). [Note 1: Special issues of American Psychologist in November 1965 and October 1981 provide excellent summaries of many of the issues in this debate.] An extensive literature, both popular and technical, deals with issues such as test bias and test fairness. Federal and state laws have been passed calling for minimum competency testing and for truth in testing, terms that refer to a variety of efforts to regulate testing and to increase public access to information on test development and use. Tests and testing programs have been challenged in the courts, often successfully.

Psychological testing is not only important and controversial, but it is also a highly specialized and somewhat technical enterprise. In many of the natural sciences, measurement is a relatively straightforward process that involves assessing the physical properties of objects, such as height, weight, or velocity. [Note 2: Note, however, that physical measurement is neither static nor simple. Proposals to redefine the basic unit of length, the meter, in terms of the time that light takes to travel from point to point (Robinson, 1983) provide an example of continuing progress in redefining the bases of physical measurement.] However, for the most part, psychological attributes, such as intelligence and creativity, cannot be measured by the same sorts of methods as those used to measure physical attributes. Psychological attributes are not manifest in any simple, physical way; they are manifest only in the behavior of individuals. Furthermore, behavior rarely reflects any one psychological attribute, but rather a variety of physical, psychological, and social forces. Hence, psychological measurement is rarely as simple or direct as physical measurement.
To sensibly evaluate psychological tests, therefore, it is necessary to become familiar with the specialized methods of psychological measurement.

This chapter provides a general introduction to psychological measurement. First, we define the term test and discuss several of the implications of that definition. We then briefly describe the types of tests available and discuss the ways in which tests are used to make decisions in educational, industrial, and clinical settings. We also discuss sources of information about tests and the standards, ethics, and laws that govern testing.

PSYCHOLOGICAL TESTS: A DEFINITION

The diversity of psychological tests is staggering. Thousands of different psychological tests are available commercially in English-speaking countries, and doubtlessly hundreds of others are published in other parts of the world. These tests range from personality inventories to self-scored IQ tests, from scholastic examinations to perceptual tests. Yet, despite this diversity, several features are common to all psychological tests and, taken together, serve to define the term test.

A psychological test is a measurement instrument that has three defining characteristics:

1. A psychological test is a sample of behavior.
2. The sample is obtained under standardized conditions.
3. There are established rules for scoring or for obtaining quantitative (numeric) information from the behavior sample.

Behavior Sampling

Every psychological test requires the respondent to do something. The subject’s behavior is used to measure some specific attribute (e.g., introversion) or to predict some specific outcome (e.g., success in a job training program). Therefore, a variety of measures that do not require the respondent to engage in any overt behavior (e.g., an X-ray) or that require behavior on the part of the subject that is clearly incidental to whatever is being measured (e.g., a stress electrocardiogram) fall outside the domain of psychological tests.
The use of behavior samples in psychological measurement has several implications. First, a psychological test is not an exhaustive measurement of all possible behaviors that could be used in measuring or defining a particular attribute. Suppose, for example, that you wished to develop a test to measure a person’s writing ability. One strategy would be to collect and evaluate everything that person had ever written, from term papers to laundry lists. Such a procedure would be highly accurate, but impractical. A psychological test attempts to approximate this exhaustive procedure by collecting a systematic sample of behavior. In this case, a writing test might include a series of short essays, sample letters, memos, and the like.

The second implication of using behavior samples to measure psychological variables is that the quality of a test is largely determined by the representativeness of this sample. For example, one could construct a driving test in which each examinee was required to drive the circuit of a race track. This test would certainly sample some aspects of driving but would omit others, such as parking, following signals, or negotiating in traffic. It would therefore not represent a very good driving test. The behavior elicited by the test also must somehow be representative of behaviors that would be observed outside the testing situation. For example, if a scholastic aptitude test were administered in a burning building, it is unlikely that students’ responses to that test would tell us much about their scholastic aptitude. Similarly, a test that required highly unusual or novel types of responses might not be as useful as a test that required responses to questions or situations that were similar in some way to those observed in everyday life.

Standardization

A psychological test is a sample of behavior collected under standardized conditions.
The Scholastic Assessment Tests (SAT), which are administered to thousands of high school juniors and seniors, provide a good example of standardization. The test supervisor reads detailed instructions to all examinees before starting, and each portion of the test is carefully timed. In addition, the test manual includes exhaustive instructions dealing with the appropriate seating patterns, lighting, provisions for interruptions and emergencies, and answers to common procedural questions. The test manual is written in sufficient detail to ensure that the conditions under which the SAT is given are substantially the same at all test locations.

The conditions under which a test is administered are certain to affect the behavior of the person or persons taking the test. You would probably give different answers to questions on an intelligence test or a personality inventory administered in a quiet, well-lit room than you would if the same test were administered at a baseball stadium during extra innings of a play-off game. A student is likely to do better on a test that is given in a regular classroom environment than he or she would if the same test were given in a hot, noisy auditorium. Standardization of the conditions under which a test is given is therefore an important feature of psychological testing.

It is not possible to achieve the same degree of standardization with all psychological tests. A high degree of standardization might be possible with many written tests, although even within this class of tests the conditions of testing might be difficult to control precisely.
For example, tests that are given relatively few times a year in a limited number of locations by a single testing agency (e.g., the Graduate Record Examination Subject Tests) probably are administered under more standard conditions than are written employment tests, which are administered in hundreds of personnel offices by a variety of psychologists, personnel managers, and clerks. The greatest difficulty in standardization, however, probably lies in the broad class of tests that are administered verbally on an individual basis. For example, the Wechsler Adult Intelligence Scale (WAIS-III), which represents one of the best individual tests of intelligence, is administered verbally by a psychologist. It is likely that an examinee will respond differently to a friendly, calm examiner than to one who is threatening or surly.

Individually administered tests are difficult to standardize because the examiner is an integral part of the test. The same test given to the same subject by two different examiners is certain to elicit a somewhat different set of behaviors. Nevertheless, through specialized training, a good deal of standardization in the essential features of testing can be achieved. Strict adherence to standard procedures for administering various psychological tests helps to minimize the effects of extraneous variables, such as the physical conditions of testing, the characteristics of the examiner, or the subject’s confusion regarding the demands of the test.

Scoring Rules

The immediate aim of testing is to measure or to describe in a quantitative way some attribute or set of attributes of the person taking the test. The final, defining characteristic of a psychological test is that there must be some set of rules or procedures for describing in quantitative or numeric terms the subject’s behavior in response to the test.
These rules must be sufficiently comprehensive and well defined that different examiners will assign scores that are at least similar, if not identical, when scoring the same set of responses. For a classroom test, these rules may be simple and well defined; the student earns a certain number of points for each item answered correctly, and the total score is determined by adding up the points. For other types of tests, the scoring rules may not be so simple or definite.

Most mass-produced standardized tests are characterized by objective scoring rules. In this case, the term objective should be taken to indicate that two people, each applying the same set of scoring rules to an individual’s responses, will always arrive at the same score for that individual. Thus, two teachers who score the same multiple-choice test will always arrive at the same total score. On the other hand, many psychological tests are characterized by subjective scoring rules. Subjective scoring rules typically rely on the judgment of the examiner and thus cannot be described with sufficient precision to allow for their automatic application. The procedures a teacher follows in grading an essay exam provide an example of subjective scoring rules. It is important to note that the term subjective does not necessarily imply inaccurate or unreliable methods of scoring responses to tests, but simply that human judgment is an integral part of the scoring of a test.

Tests vary considerably in the precision and detail of their scoring rules. For multiple-choice tests, it is possible to state beforehand the exact score that will be assigned to every possible combination of answers. For an unstructured test, such as the Rorschach inkblot test, in which the subject describes his or her interpretation of an ambiguous abstract figure, general principles for scoring can be described, but it may be impossible to arrive at exact, objective scoring rules.
The same is true of essay tests in the classroom; although general scoring guidelines can be established, in most cases, it is
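
The objective scoring rule described above for classroom and multiple-choice tests (a fixed number of points for each item answered correctly, summed into a total) can be sketched in a few lines of Python. This is only an illustration and does not come from the text: the function name `score_multiple_choice`, the item labels, the answer key, and the one-point-per-item weighting are all assumptions made for the example.

```python
from typing import Mapping


def score_multiple_choice(
    responses: Mapping[str, str],
    answer_key: Mapping[str, str],
    points_per_item: int = 1,  # assumed weighting; real tests may weight items differently
) -> int:
    """Return the total score: fixed points for each item answered correctly."""
    total = 0
    for item, correct_option in answer_key.items():
        # An omitted or incorrect item earns no points.
        if responses.get(item) == correct_option:
            total += points_per_item
    return total


# Hypothetical three-item test and one examinee's answers.
answer_key = {"q1": "b", "q2": "d", "q3": "a"}
responses = {"q1": "b", "q2": "c", "q3": "a"}

# Two scorers applying the same rule to the same responses.
scorer_one = score_multiple_choice(responses, answer_key)
scorer_two = score_multiple_choice(responses, answer_key)
print(scorer_one, scorer_one == scorer_two)  # prints "2 True"
```

Because the rule is fully specified in advance, any two scorers who apply it to the same responses obtain the same total, which is the sense of "objective" used here. A subjective rule, such as grading an essay, depends on examiner judgment and could not be written down this completely.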
