Language Testing (آزمون سازی زبان)


Chapter 1: Preliminaries of Language Testing
  WHY TESTING
  BENEFITS/IMPORTANCE OF TESTING
  MEASUREMENT, TEST, EVALUATION
  ASSESSMENT
  NORM-REFERENCED vs. CRITERION-REFERENCED TESTS
  TEACHER-MADE TEST vs. STANDARDIZED TESTS
  THE CONSEQUENCES OF STANDARDIZED TESTING
  WASHBACK
  TEST BIAS
  ETHICAL ISSUES: CRITICAL LANGUAGE TESTING
  AUTHENTICITY
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 2: Language Test Functions
  TWO MAJOR FUNCTIONS OF LANGUAGE TESTS
  CONTRASTING CATEGORIES OF LANGUAGE TESTS
  COMPUTER-ADAPTIVE TESTING
  A GENERAL FRAMEWORK
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 3: Forms of Language Test
  STRUCTURE OF AN ITEM
  CLASSIFICATION OF ITEM FORMS
  TYPES OF ITEMS
  ALTERNATIVE VS. TRADITIONAL ASSESSMENT
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 4: Basic Statistics in Language Testing
  STATISTICS
  TYPES OF DATA
  TABULATION OF DATA
  GRAPHIC REPRESENTATION OF DATA
  DESCRIPTIVE STATISTICS
  NORMAL DISTRIBUTION
  DERIVED SCORES
  CORRELATION
  CORRELATIONAL INDEXES
  CORRELATIONAL FORMULAS
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 5: Test Construction
  DETERMINING FUNCTION AND FORM OF THE TEST
  PLANNING
  PREPARING ITEMS
  REVIEWING
  PRETESTING
  VALIDATION
  EXTRA POINTS TO REMEMBER
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 6: Characteristics of a Good Test
  RELIABILITY: THE GENERAL CONCEPT
  RELIABILITY IN TESTING
  CLASSICAL TRUE SCORE THEORY (CTS)
  APPROACHES TO ESTIMATING RELIABILITY
  FACTORS INFLUENCING RELIABILITY
  STANDARD ERROR OF MEASUREMENT
  OTHER RELIABILITY THEORIES
  RELIABILITY OF CRITERION-REFERENCED TESTS
  VALIDITY
  FACTORS INFLUENCING VALIDITY
  THE RELATIONSHIP BETWEEN RELIABILITY AND VALIDITY
  PRACTICALITY
  EXTRA POINTS TO REMEMBER
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 7: History of Language Testing
  GRAMMAR-TRANSLATION APPROACH
  DISCRETE-POINT APPROACH
  INTEGRATIVE APPROACH
  FUNCTIONAL-COMMUNICATIVE APPROACH
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 8: Cloze and Dictation Type Tests
  CLOZE PROCEDURE
  VARIETIES OF CLOZE TEST
  CLOZE TASK
  SCORING A CLOZE TEST
  DICTATION
  VARIETIES OF DICTATION
  SCORING A DICTATION
  VALIDITY OF CLOZE
  DICTATION AND CLOZE
  RELIABILITY OF CLOZE AND DICTATION
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 9: Communicative-functional Testing
  SELECTION OF THE FUNCTION
  SOCIAL FACTORS
  THE PERFORMANCE CRITERIA
  DEVELOPING TEST STEM
  SCORING SYSTEM
  EXTRA POINTS TO REMEMBER
  State University Questions and Answers
  Azad University Questions and Answers

Chapter 10: A Sketch of Testing the Four Skills
  TESTING LISTENING COMPREHENSION
  TESTING ORAL PRODUCTION
  TESTING READING COMPREHENSION
  TESTING WRITING
  State University Questions and Answers
  Azad University Questions and Answers

MA98 Questions
MA98 Answers
REFERENCES


Chapter 1
Preliminaries of Language Testing

• Why Testing
• Benefits/Importance of Testing
• Measurement, Test, Evaluation
• Assessment
• Norm-Referenced vs. Criterion-Referenced Tests
• Teacher-Made Test vs. Standardized Tests
• The Consequences of Standardized Testing
• Washback (or Backwash)
• Test Bias
• Ethical Issues: Critical Language Testing
• Authenticity

Preliminaries of Language Testing

If you hear the word test in any classroom setting, your thoughts are not likely to be positive, pleasant, or affirming. The anticipation of a test is almost always accompanied by feelings of anxiety and self-doubt, along with a hope that you will come out of it alive. Tests seem as unavoidable as tomorrow's sunrise in virtually every kind of educational setting. Given all the inconvenience and trouble a test brings, why do we test? What are the benefits of testing?

1. WHY TESTING

Education is the most important enterprise in any society. In fact, a considerable amount of budget, time and energy is put into it every year by government. More than one-fourth of the nation's population attends school. Education is truly a giant and important undertaking and, therefore, it is crucial that its process and products be evaluated. In fact, evaluation is a major consideration in any educational setting:

• Students, teachers, administrators and parents all work toward achieving educational goals, and it is quite natural that they want to ascertain the degree to which those goals have been realized. In this sense, testing serves as a monitoring device for learning.
• Government and private sectors that pay teachers and employ the students afterwards are interested in having precise information about students' abilities.
• Most importantly, testing yields accurate information on which educational decisions are based (from university entrance exams to placing students in the right level). When a decision is made, whether it is great or small, it should be based on as much and as accurate information as possible. The more accurate the information upon which a decision is made, the better that decision is likely to be.

2. BENEFITS/IMPORTANCE OF TESTING

Tests can benefit students, teachers, and even administrators by confirming progress that has been made and showing how we can best redirect our future efforts. Tests can benefit students in the following ways:

• Testing can create a positive attitude toward class and motivate students to learn the subject matter. Tests of appropriate difficulty, announced well in advance and covering skills scheduled to be evaluated, can contribute to a positive tone and also create a sense of achievement by demonstrating the teacher's spirit of fair play and consistency with course objectives.
• Testing can help students prepare themselves and thus learn the materials in three ways. First, learners are helped when they study for exams and again when exams are returned and discussed. Next, where several tests are given, learning can be enhanced by students' growing awareness of the objectives and the areas of emphasis in the course. Finally, tests can foster learning by their diagnostic characteristics; they confirm what each person has mastered, and they point up those language items needing further attention.
• Since tests tend to direct students' learning efforts toward the objectives being measured, they can be used as tools for increasing the retention and transfer of classroom learning, if tests are aimed at measuring learning outcomes at the understanding, application, and interpretation levels rather than the knowledge level. By including measures of these more complex learning outcomes in our tests, we can direct attention to their importance.
• A major aim of all education is to assist individuals to understand themselves better so that they can make more intelligent decisions and can more effectively evaluate their own performance. Periodic testing gives them an insight into the things they can do well and the misconceptions that need correction. Such information provides students with a more objective basis for planning their study program, for selecting future educational experiences, and for developing self-evaluation skills.

Testing can also benefit teachers:

• Testing helps teachers to diagnose their efforts in teaching. It answers the question, "Have I been effective in my instruction?" and therefore enables teachers to increase their own effectiveness by making adjustments in their teaching so that certain groups of students or individuals in the class benefit more. As we record the test scores, we might well ask the following questions:
    Are my lessons at the right level?
    Am I aiming my instruction too low or too high?
    Am I teaching some skills effectively but others less effectively?
    What areas do we need more work on? Which points need reviewing?
• Testing can also help teachers gain insight into ways to improve the evaluation process itself:
    Were the test instructions clear?
    Was everyone able to finish in the allotted time?
    Did the test cause unnecessary anxiety or resentment?

3. MEASUREMENT, TEST, EVALUATION

Before we look at tests and test design in second language education, we need to understand three basic interrelated concepts: measurement, test, and evaluation. These terms are sometimes used interchangeably, but some educators make distinctions among them.

Measurement is the process of quantifying the characteristics of persons according to explicit procedures and rules.

• Quantification involves the process of assigning numbers, and this distinguishes measures from qualitative descriptions such as a verbal account or visual representation. Non-numerical categories or rankings such as letter grades (A, B, C, ...) may have the characteristics of measurement because their focus of attention is comparison of testees.
• Characteristics: We can assign numbers to both physical and mental characteristics of persons. Physical attributes such as height and weight can be observed directly. In testing, however, we are almost always interested in quantifying mental attributes and abilities, sometimes called traits or constructs, which can only be observed indirectly. These mental attributes include characteristics such as aptitude, intelligence, motivation, attitude, native language, fluency in speaking, and achievement in reading comprehension.
• Rules and procedures: Haphazard assignment of numbers to characteristics of individuals cannot be regarded as measurement. In order to be considered a measure, an observation of an attribute must be replicable by other observers, in other contexts and with other individuals. Practically, anyone can rate another person's speaking ability. But while one rater may focus on pronunciation accuracy, another may find vocabulary to be the most salient feature. Such ratings are not considered measurement because the different raters in this case did not follow the same criteria or procedures for arriving at their ratings. Measures are characterized by the explicit procedures and rules upon which they are based. There are many different types of measures in the social sciences, including observations, rankings, rating scales, and tests.

A test is a measurement instrument. Test often connotes the presentation of a set of questions to be answered, to obtain a measure (that is, a numerical value) of a characteristic (that is, a mental attribute or ability) of a person in a given domain (language, math, etc.). What distinguishes a test from other types of measurement is that it is designed to obtain a specific sample of behavior from which one can make inferences about certain characteristics of an individual. Let's review two examples to illustrate the difference between measurement and test. A qualified interviewer might be able to rate an individual's oral proficiency in a given language according to a rating scale, on the basis of several years' informal contact with that individual, and this could constitute a measure of that individual's oral proficiency. This measure could not be considered a test, however, because the rater did not use an elicitation procedure (e.g. a set of activities or a set of questions) to obtain a specific sample of behavior. Similarly, the rating of a collection of personal letters based on a rating scale is considered measurement, while asking a person to write an argumentative editorial for a news magazine (to elicit a specific sample of behavior) constitutes a test.

Evaluation has been defined in a variety of ways:
1) The process of delineating, obtaining, and providing useful information for judging decision alternatives.
2) The determination of the congruence between performance and objectives.
3) A process that allows one to make a judgment about the desirability or value of a measure.

Generally, the purpose of evaluation is to gather information systematically for the purpose of making decisions. This information need not be exclusively quantitative. Verbal descriptions, ranging from performance profiles to letters of reference, as well as overall impressions, can provide important information for evaluating individuals, as can measures, such as ratings and test scores.

The relationship among measurement, test, and evaluation is illustrated in the following figure. As can be seen, all tests are measurement, but not all tests are evaluation.

Figure: Relationship among measurement, test, evaluation

Note: It is important to point out that we never measure or evaluate people. We measure or evaluate characteristics or properties of people such as mental attributes and abilities.

4. ASSESSMENT

Assessment is appraising or estimating the level or magnitude of some attribute of a person. In educational practice, assessment is an ongoing process that encompasses a wide range of methodological techniques. Whenever a student responds to a question, offers a comment, or tries out a new word or structure, the teacher subconsciously makes an appraisal of the student's performance. Assessment can be classified on two continua: informal/formal and formative/summative.

4.1. Informal vs. Formal Assessment

Informal assessment can take a number of forms, starting with incidental, unplanned comments and responses, along with coaching and other impromptu feedback to the student. Examples include saying "Nice job!"; "Did you say can or can't?"; or putting a smiley face on some homework. Informal assessment does not stop there. A good deal of a teacher's informal assessment is embedded in classroom tasks designed to elicit performance without recording results and making fixed conclusions about a student's competence. Informal assessment is virtually always non-judgmental, in that you as a teacher are not making ultimate decisions about the student's performance. Examples at this end of the continuum are marginal comments on papers, responding to a draft of an essay, offering advice about how to better pronounce a word, or suggesting a strategy for compensating for a reading difficulty.

On the other hand, formal assessments are exercises or procedures specifically designed to tap into a storehouse of skills and knowledge. They are systematic, planned sampling techniques constructed to give teacher and student an appraisal of student achievement.

Note: Is formal assessment the same as a test? We can say that all tests are formal assessments, but not all formal assessment is testing. For example, a systematic set of observations of a student's frequency of oral participation in class is certainly a formal assessment, but it is hardly what anyone would call a test.

4.2. Formative vs. Summative Assessment

Formative assessment means evaluating students in the process of "forming" their competencies and skills with the goal of helping them to continue that growth process. The key to such formation is the delivery (by the teacher) and internalization (by the student) of appropriate feedback on performance, with an eye toward the future continuation (or formation) of learning. [...]

Note: For all practical purposes, virtually all kinds of informal assessment are (or should be) formative. They have as their primary focus the ongoing development of the learner's language. So when you give a student a comment or a suggestion, or call attention to an error, that feedback is offered in order to improve the learner's language ability.

Summative tests [...] and the results are used primarily for assigning course grades or for certifying student mastery of the instructional objectives. [...]

Note: Formative test is ongoing and implies the observation of the "process" of learning, while summative test is concerned with the "product" of learning.

5. NORM-REFERENCED TEST vs. CRITERION-REFERENCED TEST

[...]

In norm-referenced tests (NRT), test results may be interpreted with reference to the performance of a given group, or norm. The "norm group" is typically a large group of individuals who are similar to the individuals for whom the test is designed. In the development of NRTs the norm group is given the test, and then the characteristics, or norms, of this group's performance are used as reference points for interpreting the performance of other students who take the test. In other cases, NRT results are interpreted and reported with reference to the actual group taking the test, rather than to a norm group. Perhaps the most familiar example of this is what is sometimes called "grading on the curve", where, say, the top ten percent of the students receive an "A" on the test and the bottom ten percent fail, irrespective of the absolute magnitude of their scores.

Evaluation may, at times, be intended to determine whether testees have achieved certain objectives; it is not simply intended to differentiate testees. Criterion-referencing, as this approach is called, focuses on what the testees can do with what they know. An example would be the case in which students are evaluated in terms of their relative degree of mastery of course content, rather than with respect to their relative ranking in the class. Hence, a basic concern in developing a CRT is that it adequately represents the criterion ability level or content domain to be evaluated. Instead of comparing a person's performance to that of others, the performance is compared to a predetermined criterion or standard. [...]

Table: Norm-referenced vs. criterion-referenced tests

Type of Interpretation
  Norm-referenced: Relative (a student's performance is compared to those of all other students in percentile terms)
  Criterion-referenced: Absolute (a student's performance is compared only to the amount, or percentage, of material learned)

Type of Scores Reported
  Norm-referenced: [...] percentile rank, z-score, T-score [...]
  Criterion-referenced: [...]

Type of Measurement
  Norm-referenced: Test measures general language abilities or proficiencies
  Criterion-referenced: Test measures specific objectives-based language points

Purpose of Testing
  Norm-referenced: Spread students out along a continuum of general abilities or proficiencies
  Criterion-referenced: Assess the amount of material known or learned by each student

Distribution of Scores
  Norm-referenced: Normal distribution of scores around the mean
  Criterion-referenced: Varies; often non-normal. Students who know the material should score 100%

Test Structure
  Norm-referenced: A few relatively long subtests with a variety of item contents
  Criterion-referenced: A series of short, well-defined subtests with similar item contents

Knowledge of Questions
  Norm-referenced: Students have little or no idea of what content to expect in test items
  Criterion-referenced: Students know exactly what content to expect in test items

Missed Items
  Norm-referenced: When a great number of testees miss an item, it is eliminated from the test
  Criterion-referenced: When an item is missed by a great number of testees, the instructional materials are revised or additional work is given

Example of Test Interpretation
  Norm-referenced: You performed better on this test than approximately 75% of the students in the group against which you are being compared.
  Criterion-referenced: You have answered 60% of the items for this unit correctly, so you may move to the next unit.

Note: NRT helps administrators and teachers make program-level decisions, such as admission, proficiency and placement decisions, and the other family helps teachers make classroom-level decisions (that is, assessing what the students have learned through diagnostic or achievement testing).

6. TEACHER-MADE TEST vs. STANDARDIZED TEST

In any consideration of educational testing, a distinction must be drawn between teacher-made and standardized instruments. A teacher-made test is a small-scale classroom test which is generally prepared, administered and scored by one teacher. In this situation, test objectives are based directly on course objectives, and test content is derived from specific course content. Such tests have the following advantages:

• They measure students' progress based on the classroom activities.
• They provide an opportunity for the teacher to diagnose students' weaknesses concerning a given subject matter.
• They help the teacher make plans for remedial instruction, if needed.
• They motivate students.

Table: Teacher-made vs. standardized tests

Type of Interpretation
  Teacher-made: Criterion-referencing
  Standardized: Norm-referencing

Directions for Administration and Scoring
  Teacher-made: Usually no uniform directions specified
  Standardized: Specific, culture-free directions for every testee to understand; standardized administration and scoring procedures

Sampling of Content
  Teacher-made: Both content and sampling are determined by the classroom teacher
  Standardized: Content determined by curriculum and subject-matter experts; involves extensive investigations of existing syllabi, textbooks, and programs; sampling of content done systematically

Construction
  Teacher-made: May be hurried and haphazard; often no test blueprints, item tryouts, item analysis or revision; quality of test may be quite poor
  Standardized: Uses meticulous construction procedures that include constructing objectives and test blueprints, employing item tryouts, item analysis, and item revisions

Norms
  Teacher-made: Only local classroom norms are available, i.e. they are determined by the school or a department
  Standardized: In addition to local norms, standardized tests typically make available national and school district norms

Purpose and Use
  Teacher-made: Best suited for measuring particular objectives set by the teacher and for intra-class comparisons
  Standardized: Best suited for measuring broad curriculum objectives and for inter-class, school and national comparisons

Quality of Items
  Teacher-made: Unknown; usually lower than standardized tests due to limited time and skill of the teacher
  Standardized: High; written by specialists, pretested and selected on the basis of effectiveness

Reliability
  Teacher-made: Unknown; usually high if carefully constructed
  Standardized: High

On the other hand, standardized tests are commercially prepared by skilled test-makers and measurement experts. They provide methods of obtaining samples of behavior under uniform procedures. A uniform procedure means that the same fixed set of questions is administered with the same set of directions, time restrictions, and scoring procedures. Scoring is usually based on an objective procedure. Such tests have a wide range of coverage; that is, they generally cover more material. They are used to assess either one year's learning or more than one year's learning. Most elementary and secondary schools in the US have standardized achievement tests to measure children's mastery of the standards or competencies that have been prescribed for specified grade levels. College entrance exams such as the Scholastic Aptitude Test (SAT®) are part of the educational experience of many high school seniors seeking further education in the US. Examples of standardized language proficiency tests are TOEFL and IELTS.

7. THE CONSEQUENCES OF STANDARDIZED TESTING

The widespread global acceptance of standardized tests as valid procedures for assessing individuals in many walks of life brings with it a set of consequences that fall under the category of consequential validity. Consequential validity encompasses all the consequences of a test, including such considerations as its accuracy in measuring intended criteria, its impact on the preparation of test-takers, its effect on the learner, and the (intended and unintended) social consequences of a test's interpretation and use. One of the aspects of consequential validity which has drawn special attention is the effect of test preparation courses and manuals on performance. McNamara cautions against test results that may reflect socioeconomic conditions such as opportunities for coaching, that are "differentially available to the students being assessed (for example, because only some families can afford coaching, or because children with more highly educated parents get help from their parents)."

8. WASHBACK (or Backwash)

A facet of consequential validity is washback. Consider the following scenario: you are working in an institution that gets more funding if the number of students reaching a certain standard on the standardized test at the end of the year increases. As a result, at the end of the year, your director will be keeping tabs on how many of your students make the standard for funding. Do you think that would affect your teaching? How much would your teaching change? Would you be more likely to teach material that is related to the test? Material that you know will actually be found on the test? This cluster of issues is about washback.

Washback (also called measurement-driven instruction, curriculum alignment, or backwash) generally refers to the effects that tests have on instruction, pedagogy, learning, and education in terms of how students prepare for the test. "Cram courses" and "teaching to the test" are examples of such washback. Another form of washback, which occurs more in classroom assessment, is the information that washes back to students in the form of useful diagnoses of strengths and weaknesses. Students' incorrect responses can become windows of insight into further work. Their correct responses need to be praised, especially when they represent accomplishments in a student's interlanguage. Teachers can suggest strategies for success as part of their coaching role. Washback enhances a number of basic principles of language acquisition: intrinsic motivation, autonomy, self-confidence, language ego, interlanguage, and strategic investment, among others. Washback also includes the effects of an assessment on teaching and learning prior to the assessment itself, that is, on preparation for the assessment.

Washback can vary along two dimensions: in terms of degree (from strong to weak) and in terms of kind (positive or negative). The degree and kind of washback depend on: the degree to which the test runs counter to current teaching practices, what teachers and textbook writers think are appropriate test preparation methods, how much teachers and textbook writers are willing and able to innovate, and the status of the test (and the level of stakes involved). The issue of stakes is divided into low-stakes versus high-stakes situations. Low-stakes situations typically involve classroom testing, which is used for learning purposes or research. For students, high-stakes situations usually involve more important decisions, such as admission, promotion, placement, or graduation decisions that are directly dependent on test scores. The washback effect is obviously much stronger in high-stakes situations than in low-stakes situations. In terms of kind, we have the following definitions:

• Negative (or harmful) washback is said to occur when test items are based on an outdated view of language which bears little relationship to the teaching curriculum, i.e. when the test content and testing techniques are at variance with the objectives of the course. An instance of this would be where students are following an English course which is meant to train them in the language skills necessary for university study in an English-speaking country, but where the language test which they have to take in order to be admitted to a university does not test those skills directly. If the skill of writing, for example, is tested only by multiple-choice items, then there is great pressure to practice such items rather than practice the skill of writing itself.
• Positive (or beneficial) washback is said to result when a testing procedure encourages "good" teaching practice. For example, the consequence of many reading comprehension tests is a possible development of the reading skills. As another example, an oral interview final examination may encourage teachers to practice conversational language use with their students.

A number of suggestions have been made over the years for ways to promote positive washback. The following list is adapted from Brown (2005, p. 254).

Test design
1. Sample widely and unpredictably.
2. Design tests to be criterion-referenced.
3. Design the test to measure what the programs intend to teach.
4. Base the test on sound theoretical principles.
5. Base achievement tests on objectives.
6. Use direct testing.
7. Foster learner autonomy and self-assessment.

Test content
1. Test the abilities whose development you want to encourage.
2. Use more open-ended items.
3. Make examinations reflect the full curriculum, not merely a limited aspect of it.
4. Assess higher-order cognitive skills to ensure they are taught.
5. Use a variety of examination formats, including written, oral, and practical.
6. Do not limit skills to be tested to academic areas.
7. Use authentic tasks and texts.

Logistics
1. Ensure that test-takers, teachers, administrators, and curriculum designers understand the purpose of the test.
2. Make sure language-learning goals are clear.
3. Where necessary, provide assistance to teachers to help them understand the tests.
4. Provide feedback to teachers and others so meaningful change can be effected.
5. Provide detailed and timely feedback to schools on levels of pupils' performance and areas of difficulty in public examinations.
6. Make sure teachers and administrators are involved in different phases of the testing process because they are the people who will have to make changes.
7. Provide detailed score reporting.

Interpretation/Analysis
1. Make sure exam results are believable, credible, and fair to test takers and score users.
2. Consider factors other than teaching effort in evaluating published examination results and national rankings.
3. Conduct predictive validity studies of public examinations.
4. Improve the professional competence of examination authorities, especially in test design.
5. Ensure that each examination board has a research capacity.
6. Have testing authorities work closely with curriculum organizations and with educational administrators.
7. Develop regional professional networks to initiate exchange programs and to share common interests and concerns.

9. TEST BIAS

It is no secret that standardized tests involve a number of types of test bias. Some of the sources of bias are background knowledge, native language, cultural background, race, gender, age, cognitive characteristics, and learning styles. A test or item can be considered biased if one particular section of the candidate population is advantaged or disadvantaged by some feature of the test or item which is not relevant to what is being measured. An item that is biased against one group of people is testing something in addition to what it was originally designed to test, and such an item cannot provide clear and easily interpretable information. For instance, consider an IQ item where the answer hinges on understanding the differences between the terms rain, snow, sleet, and hail. Such an item might naturally be biased against students who grew up in a tropical area because many of them have never seen anything resembling snow, sleet, or hail. An obvious example of bias is shown by the item below, which appeared in the State Examination of English Language for Elementary Level:

  Mia: What should I do with this martabak?
  Mom: Just put them on a
  (a) drawer, (b) plate, (c) stove, (d) mug

Examinees who come from Java or are somehow familiar with Indian culture will find the item above easy to answer. Yet, for others coming from different areas, the word martabak may be entirely new, and therefore they do not know whether a martabak refers to a kind of stationery, food, a cooking device, or a kind of beverage. Despite their excellent English proficiency, there is no way they can arrive at the right answer. The item, in other words, is culturally biased against these examinees.

Let us examine another example, adapted from a listening comprehension item taken from the TOEFL Test Preparation Kit Workbook:

  Man: I'm taking up a collection for the jazz band. Would you like to give?
  Woman: Just a minute while I get my wallet.
  (narrator) What will the woman probably do next?
  a. Put some money in her wallet
  b. Buy a band-concert ticket
  c. Make a donation
  d. Lend the man some money

The right answer is c. However, it is very unlikely that examinees from non-Western cultures are familiar with the habit of collecting money for a band in US culture. Therefore, being largely unfamiliar with the meaning of "taking up a collection" and noticing the word "give" in the man's question, they may be misled into thinking that the answer is d. Alternatively, if they have no idea whatsoever that a band in the US may need to collect money, they may choose b.

Note: Fairness can be defined as the degree to which a test treats every student the same, or the degree to which it is impartial. Teachers would generally like to ensure that their personal feelings do not interfere with fair assessment of the students or bias the assignment of scores. The aim in maximizing objectivity is to give each student an equal chance to do well. Equitable treatment in terms of testing conditions, access to practice materials, performance feedback, retest opportunities, and other features of test administration, including providing reasonable accommodation for test takers with disabilities when appropriate, are important aspects of fairness under this perspective.

[...] threat of forced change. In our classrooms, where the dynamics of power and domination permeate the fabric of classroom life, we are alerted to a possible covert political agenda beneath our overt technical agenda.

One of the byproducts of a rapidly growing testing industry is the danger of an abuse of power. As Shohamy claims, "Tests represent a social technology deeply embedded in education, government and business; as such they provide the mechanism for enforcing power and control. Tests are most powerful as they are often the single indicators for determining the future of individuals". Proponents of a critical approach to language testing claim that large-scale standardized testing is not an unbiased process, but rather is the "agent of cultural, social, political, educational, and ideological agendas that shape the lives of individual participants, teachers, and learners". The issues of critical language testing are numerous:

• Psychometric traditions are challenged by interpretive, individualized procedures for predicting success and evaluating ability.
• Test designers have a responsibility to offer multiple modes of performance to account for varying styles and abilities among test-takers.
• Tests are deeply embedded in culture and ideology.
• Test-takers are political subjects in a political context.

One of the problems of critical language testing surrounds the widespread conviction that standardized tests designed by reputable test manufacturers are infallible in their predictive validity. Universities, for example, will deny admission to a student whose TOEFL score falls one point below the requisite score (usually around 500), even though that student, if offered other measures of language ability, might demonstrate abilities necessary for success in a university program. One standardized test is deemed to be sufficient; follow-up measures are considered to be too costly.

A further problem with our test-oriented culture lies in the agendas of those who design and those who utilize the tests. Tests are used in some countries to deny citizenship. Tests are by nature culture-biased and therefore may disenfranchise members of a non-mainstream value system. Test givers are always in a position of power over test-takers and therefore can impose
This tendency to social and political ideologies on test-takers through standards of acceptable and unacceptable seek objectivity has led to the proliferation of ‘objective’ tests which minimize the items. Tests promote the notion that answers to real-world problems have unambiguous right and possibilityofvarying treatmentfordifferent students. wrong answers with no shades ofgray. A corollary to the latter is that tests presumeto reflect an appropriate core of common knowledge and acceptable behavior; therefore the test-taker must 10. ETHICAL ISSUES: CRITICAL LANGUAGE TESTING buy into such a system ofbeliefs in order to makethe cut. Shohamy sees the ethics oftesting as an extension ofwhat educators call critical pedagogy, or Shohamy (1998) pointed outthat politicians had capitalized on languagetests fortackling thorny more precisely in this case, critical language testing. For a better understanding of critical political issues that they failed to address by other policy-making process. They could set the languagetesting, we need to know whatcritical pedagogy is. As language teachers we have to benchmark for passing a language test for immigration purposes without any justification, rememberthat weare all driven by convictions about what this world should look like, how its thereby allowing them the flexibility to create immigration quotas. For example, the government people should behave, how its governments should contro! thatbehavior, and: how its inhabitants ofAustralia drew on languagetests to manipulate the numberofimmigrants and to determine if should be partners in the stewardship of the planet. We embody in our teaching a vision of a refugees could be accepted or rejected. Similarly, Latvia used strict language tests to prevent better and more humanelife. However, critical pedagogy brings with it the reminder that.our Russians from obtaining citizenship in the wakeofits independence. 
learners mustbe free to be themselves, to think for themselves, to behave intellectually without coercion from a powerful elite, to cherish their beliefs and traditions and cultures without the
