DOCUMENT RESUME

ED 435 192    FL 026 045

AUTHOR: Basci, Pelin
TITLE: Beyond Classroom Achievement: Standardized Turkish Tests.
ISSN: ISSN-1095-8096
PUB DATE: 1999-00-00
NOTE: 6p.
AVAILABLE FROM: AATT Bulletin, Portland State University, Dept. of Foreign Languages and Literatures, P.O. Box 751, Portland, OR 97207.
PUB TYPE: Journal Articles (080); Reports, Descriptive (141)
JOURNAL CIT: AATT Bulletin; n23-24 p14-17 Spr-Fall 1999
EDRS PRICE: MF01/PC01 Plus Postage.
DESCRIPTORS: Higher Education; *Language Tests; Professional Associations; Program Descriptions; Rating Scales; Scoring; *Standardized Tests; *Test Construction; Test Use; *Turkish; Uncommonly Taught Languages

ABSTRACT
The goals, concerns, and issues addressed by a committee of the American Association of Teachers of Turkic Languages (AATT) in the process of developing a standardized Turkish language proficiency test at the college level are examined. The testing committee aimed at incorporating recent scholarship on second language acquisition, teaching methods, and assessment into development of a test for intermediate and advanced proficiency levels. Scoring was designed to indicate both a competitive (rating) score and a proficiency level, to meet both diagnostic and placement needs. The test incorporates visual and authentic materials. Considerations in test administration and in test reliability and validity are discussed briefly. (MSE)
AATT Bulletin, Spring 1999 - Fall 1999

BEYOND CLASSROOM ACHIEVEMENT: STANDARDIZED TURKISH TESTS

Pelin Basci
Portland State University

An increasing separation of tasks has taken place in many "commonly taught" second or foreign languages such as English and Spanish, where the test builder is no longer necessarily the instructor. This is, of course, especially true for standardized tests, although the same principle also applies to oral proficiency assessment.¹ As a matter of fact, for some of these language areas, assessment is both an independent field of research and a full-fledged industry. There are advantages and disadvantages to such expansions in any field: the advantage stems from the professional work committed to assessment research and implementation, while the disadvantage arises most notably from the commercial interests built around highly standardized tests such as the Test of English as a Foreign Language (TOEFL). In some of these cases, special crash courses, study books, and exercise tapes all come to function as parts of a vast enterprise.

One doubts seriously that standardized tests in Turkish will ever become commercialized to the extent of, for example, TOEFL. While this may indeed be the good news, the bad news is that testing and assessment of Turkish within the US context has hardly reached the widespread professionalism that underlies assessment research, test preparation, and implementation in some of the commonly taught language areas. No doubt, this is also due, at least in part, to the status of Turkish as a non-cognate critical language in a largely English-speaking world: there is no comparison between the number of students who take Turkish and those who take, for example, Spanish or other Romance languages in the US. Both the demand for standardized tests and the corpus of data that result from such tests and would inform assessment research are smaller. Moreover, in many US academic institutions, Turkish instructors are asked to strike a difficult balance between their increasingly diversified academic and teaching responsibilities and limited resources.

One of the main goals of the testing committee for Turkish, which was founded in the late 1990s, has been to broaden the scope of interest and research in the testing and assessment of Turkish, and thereby promote professional awareness of the topic. It is with these broader goals in mind that I would like to expand further on the questions that the testing committee had to address in preparing model intermediate and advanced tests for the 1998 program of the American Research Institute in Turkey (ARIT) for language study at Bogazici University.

Second language acquisition research, particularly the debate that centered around the distinction between learning and acquisition,² regardless of its shortcomings, particularly for non-cognate languages,³ provided useful analytic categories for rethinking classroom instruction. Debates about the place of comprehensible input produced in meaningful authentic contexts, the presence or absence of the "monitor" as a "grammar police" in learners' minds, and the delicate balance between accuracy and communicative competence have been informing foreign language methodology as well as assessment.⁴ The preparation of multi-dimensional tests rather than one-dimensional paper-and-pen examinations, the incorporation of authentic material that has some appeal and relevance to the test-taker's needs, and the design of assessment models that are user-friendly and of tests that help students rather than keeping them in their place have accompanied debates on instructional methodology.⁵ Moreover, with the national move towards clearly defined proficiency goals as the organizing principle in teaching, "knowing" a second or foreign language has been re-defined as the ability to perform, "to do" things, to carry out diverse tasks using all four language skills in the authentic environment of the target language and culture.

The AATT testing committee aimed at incorporating much of the useful recent scholarship on acquisition, methods, and assessment into the process of developing model tests for two different proficiency levels. In this process, the Turkish proficiency guidelines set the framework for developing the model tests. However, as the committee was charged with the task of creating models for the ARIT test, the question of intent had to be addressed: to what degree were models, like those prepared for ARIT, proficiency assessment tests? For the models to be successful, other significant variables had to be accounted for. Prepared within the framework of proficiency guidelines, the models had to become a point of departure for reliable standardized tests that best suited the needs of ARIT and reflected the educational context that produced the intermediate and advanced students of Turkish who would take these tests. For the testing committee, the identification and recognition of all the variables and different criteria in the construction of models were of great significance, for they helped to define what Bachman calls "the abilities we wish to measure and the means of measurement."⁵

Tests, while by no means the only criterion of evaluation, help ARIT decide which students to select for an in-country study program and whom to reward with a scholarship. In this sense, the models had to address simultaneously the "selection," "readiness," and "entrance" goals. Meanwhile, there was also a greater need for the tests to inform instructors and program administrators with increased accuracy about the students' readiness to pursue intermediate and advanced study in the program, which in turn required that the tests "diagnose" student levels and reduce the burden of further "placement" procedures.

In addition to selection, readiness, and entrance goals, the tests' ability to diagnose and place students into expected levels of proficiency implied that the contexts, functions, and levels of accuracy to be tested should reflect the expectations of the receiving program. This brought to the fore further issues regarding instructional articulation, the implied continuity of levels, and expectations among different programs both within and outside the US context, an issue that plays a part in decision-making but is beyond the scope of any testing committee's immediate work.

Having listed some of the goals, major concerns, and issues to be addressed, I should add that it would be thoroughly unrealistic to claim with any level of certainty that all of these issues were resolved and that "perfect" models for standardized national tests were in fact created by the Turkish testing committee. Yet the following may shed light on how the committee handled some of these concerns.

(1) Addressing the objectives: The committee designed the scoring to indicate both a competitive score and a proficiency level in order to meet the different sets of goals mentioned above. This meant that the tests would indicate both a demonstrated level of proficiency (as in intermediate-mid or intermediate-high) and a numerical score showing a ranking of the student in relation to other competitors for selection purposes.

In designing the questions, the committee made a serious effort to address different proficiency levels for different skills. For example, in the section that tested speaking, intermediate students were asked to communicate concrete, descriptive information about themselves and their immediate environment; deliver this information through a range of speech conventions such as questions and commands, using grammatical structures such as "var/yok" and the future, present continuous, past, and aorist tenses; and display a certain amount of control and appropriateness in their use of these basic structures. Their functional skills were put to the test in well-defined tasks that related to the expected benchmarks in intermediate proficiency levels, such as inviting friends over for dinner, providing them with directions, and ordering at a restaurant. Advanced students were asked to display the ability to talk about the self in a more detailed and specialized fashion, incorporating their career options, future plans, and a discussion of cultural differences. The situational contexts in which their functional skills were tested include, but are not limited to, retrieving stolen valuables such as a passport and lodging complaints at a hotel, while employing appropriate structures and showing increased awareness of the socio-linguistic rules of Turkish.

(2) Constructing a multi-dimensional test while addressing issues of scoring: In addition to written text, the tests included a variety of visuals (pictures, photographs, and newspaper clippings) and an audio section, which contained questions read on tape and required answers spoken onto tape. The model tests were constructed to measure student skills at advanced and intermediate levels, separately examining all four areas (speaking, listening, reading, and writing), even though the committee felt strongly that each of these seemingly separate receptive and productive skills contributed to one another. The committee therefore designed the scoring guides accordingly, at least in the case of speaking and listening comprehension. A certain percentage of the scores received in listening contributed, for example, to the student's scores in speaking and vice versa.⁶

(3) The use of authentic material: In the testing of different skills, including reading and writing, the committee tried to provide students with communicative contexts such as filling in a subscription form (intermediate) and processing information from a film review (advanced). In all of the questions in each sub-section, the committee relied to a large extent on realia: photos and cartoons from newspapers rather than drawings, newspaper clippings rather than composed or altered written texts, and unscripted conversations depicting a certain negotiation of meaning such as a rendezvous, to name a few.⁷

(4) Reliability and validity: In order to increase the reliability of the model tests, attempts were made, even if limited in scope, to administer "pre-testing" to groups who could not themselves participate in the "competition." Often, time constraints more than anything else hampered further work in this direction. Moreover, despite all that is said and done, any assessment process can itself only be assessed when there is additional input on implementation, which is to say the actual administration process.

Finally, it is the input coming from the implementation process that will help to improve the quality and reliability of such models for future use. Were the instructions clear enough to the students and to the proctor? How did the students perform with the time restrictions, since the tests were not constructed as "speed-tests"? What role did the physical conditions and limits of the testing environment play? Even the identity of the proctor could play a significant role in the implementation of any test. In this case, a Turkish-speaking instructor who has an idea of the test material, realizes what is being tested, and can handle unexpected situations regarding the test is certainly preferable to a substitute with no functional language abilities in Turkish and no training in any aspect of teaching and testing. Similarly, the quality and availability of tape recorders at a given institution and the conditions of the room in which the test was administered, along with other variables, could all affect the reliability of any standardized test. It is particularly, but not exclusively, on this note about implementation that the testing committee would like to solicit responses to the model tests and invite a larger participation in the debate on the assessment of Turkish as a foreign and/or second language.

NOTES

1 There are those who believe that oral-proficiency assessment, particularly in the form of an interview, works best if the interviewer is not the teacher.

2 For a sample of Krashen's prolific work, see Stephen Krashen, Second Language Acquisition and Second Language Learning (New York: Pergamon Press, 1981).

3 For a critique of Krashen and the monitor model, see Ronald M. Barasch and C. Vaughan James, eds., Beyond the Monitor Model: Comments on Current Theory and Practice in Second Language Acquisition (Boston, Massachusetts: Heinle & Heinle Publishers, 1994).

4 Theodore V. Higgs, ed., Teaching for Proficiency, the Organizing Principle (Lincolnwood: National Textbook Company, 1989).

5 Andrew D. Cohen, Assessing Language Ability in the Classroom (Boston, Massachusetts: Heinle & Heinle Publishers, 1994).

5 Lyle Bachman, Fundamental Considerations in Language Testing (New York: Oxford University Press, 1990), 81.

6 For the scoring guide, please see Güliz Kuruoglu, "Turkish Proficiency Tests: A National Model" in this issue of the Bulletin.

7 This would be one example in which certain justifiable compromises were made in the use of the material. To maintain high-quality audibility, recordings were produced at a later time in a studio environment, although every effort was made during this process to maintain the natural flow of the original unscripted conversation.