Author's Copy

Pitch accent type matters for online processing of information status: Evidence from natural and synthetic speech∗

AOJU CHEN, ELS DEN OS AND JAN PETER DE RUITER

Abstract

Adopting an eyetracking paradigm, we investigated the role of H*L, L*HL, L*H, H*LH, and deaccentuation at the intonational phrase-final position in online processing of information status in British English in natural speech. The role of H*L, L*H and deaccentuation was also examined in diphone-synthetic speech. It was found that H*L and L*HL create a strong bias towards newness, whereas L*H, like deaccentuation, creates a strong bias towards givenness. In synthetic speech, the same effect was found for H*L, L*H and deaccentuation, but it was delayed. The delay may not be caused entirely by the difference in the segmental quality between synthetic and natural speech. The pitch accent H*LH, however, appears to bias participants’ interpretation to the target word, independent of its information status. This finding was explained in the light of the effect of durational information at the segmental level on word recognition.

∗. This research was supported by the COMIC project (IST-2001-32311) during the period from April 2004 to March 2005 at the Max Planck Institute for Psycholinguistics. We thank Delphine Dahan, Claudia Kuzla, Barbara Schmiedtová, and Michael White for helpful comments on the experimental design, Carlos Gussenhoven for recording the stimuli in natural speech, Rob Clark and Stefan Rossignol for their help with preparing the synthetic stimuli, Doug Davidson, Herbert Baumann, John Nagengast, Leah Roberts, Keren Shatzman, Johan Weustink, and Anna Zumach for their help with setting up the experiment, Pim Levelt and Antje Meyer for making it possible to conduct the experiments at the University of Birmingham, Linda Mortensen for her support during the testing and useful comments on an earlier version of the text, Yang Luo for his help with automating the data processing procedure, and Holger Mitterer for useful discussion on the statistical analyses.
The Linguistic Review 24 (2007), 317–344   0167–6318/07/024-0317
DOI 10.1515/TLR.2007.012   © Walter de Gruyter

1. Introduction

Information conveyed by a sentence or a sentence constituent changes its status typically from new to given as discourse proceeds. Speakers can use intonation to signal changes in information status by varying the intonation of the corresponding lexical entities. It is generally accepted that in Germanic languages the placement of pitch accent is crucial for the marking of information status (Gussenhoven 2006). That is, new information tends to be accented, whereas given information tends to be deaccented. Appropriate intonational encoding of information status can facilitate processing of information while inappropriate intonational encoding of information status has the opposite effect (Cutler, Dahan, and van Donselaar 1997 and references therein).

In contrast, the role of type of pitch accent in processing information status is far from clear. Previous studies of the interaction between accent placement and information status in English (e.g., Birch and Clifton 1995; Dahan, Tanenhaus and Chambers 2002) and Dutch (e.g., Nooteboom and Terken 1982) often assumed that different types of pitch accents function in the same way and left the motivation to include one pitch accent type into the investigation, but not the other, unexplained. There is some empirical evidence in recent studies suggesting that different types of pitch accents are used to convey different types of information status. Specifically, in a perception experiment in which listeners judged the appropriateness of H*, H+L* and deaccentuation when followed by L-% in German in a context where the information status was varied, Baumann and Hadelich (2003) found that both H* and H+L* were considered appropriate in marking new information (i.e., the referent, defined as a noun depicting an object, was introduced neither visually nor auditorily earlier), with H* being more favored. Further, deaccentuation was judged to be most suitable for the signaling of given information (i.e., the referent was earlier introduced auditorily).
Evidence in favor of H+L* as the “accessible” accent was provided in another perception experiment in which Baumann and Grice (2006) investigated the appropriateness of H+L*, H* and deaccentuation in the marking of inferentially accessible referents (i.e., referents that either constituted a part of an already mentioned whole or were predictable from the contextually given schema or frame) in German.

Against this backdrop, the present study set out to investigate the role of four nuclear (i.e., intonational phrase-final) pitch accent types, fall, rise-fall, rise, fall-rise, as well as deaccentuation in processing given vs. new information in Southern British English. Given information is defined as information conveyed by a referent that was mentioned previously in the discourse; new information is defined as information conveyed by a referent that was not previously mentioned or only indirectly touched upon (e.g., via semantic relatedness). The four types of pitch accents are known as H*L (fall), L*HL (rise-fall), L*H (rise) and H*LH (fall-rise) in the Transcription of Dutch Intonation notation (ToDI) (Gussenhoven 2005). ToDI is adopted to describe pitch contours instead of ToBI (Tones and Break Indices – see Beckman and Ayers 1994) in this study for two reasons. First, the intonational grammars of British English and Dutch are very similar and the phonological categories proposed for Dutch in ToDI also exist in British English (Grabe 2004). Second, ToDI does not have leading tones in pitch accents and employs only one phrase-type, i.e., intonational phrase, both of which are rooted in the British English tradition (Gussenhoven 2005). The ToDI notation is therefore believed to reflect more closely than ToBI the nuclear tones in British English discussed in earlier analyses of intonational meaning, which serve as the starting point of our investigation. At the nuclear position, the boundary tone is the same as the trailing tone of the pitch accent in the case of H*L, L*HL and L*H. As for H*LH, the high trailing tone is realized as the boundary tone.
The nuclear contours can thus be transcribed as H*L L%, L*HL L%, L*H H% and H*L H% in ToDI. Their counterparts in ToBI are H* L-L%, L*+H L-L%, L* H-H%, and H* L-H%, respectively (Gussenhoven 2005).¹ These four types of pitch accents were chosen for three reasons. First, H*L, L*H and H*LH are common nuclear pitch accent types in Southern Standard British English (Gussenhoven 1984, 2002; Grabe 2004). Second, they have been claimed to convey information status in theories of English intonational meaning. There is, however, no consensus on the exact functions of these pitch accents. Finally, the L*HL accent is claimed to function like an emphatic H*L (Brazil 1975; Gussenhoven 1984, 2002). Including L*HL allowed us to inspect the assumed gradient meaning difference between H*L and L*HL. The role of these pitch accents was examined in both natural speech (Experiment 1) and synthetic speech (Experiment 2). The use of synthetic speech was intended to find out whether effects of pitch accent type would be preserved when the segmental quality of the speech is limited.

In Section 2 we will first give a brief review of the postulated relations between pitch accent types (H*L, L*HL, L*H, and H*LH) and information status in theories of English intonational meaning, and then propose our hypotheses on the role of these pitch accents as well as deaccentuation in processing given vs. new information in British English. Experiment 1 will be reported in Section 3 and Experiment 2 in Section 4. A general discussion of findings from both experiments will be given in Section 5.

1. The four pitch accent types in ToDI are merged to three in ToBI (i.e., H*, L*+H, and L*). H*L and H*LH have the same starred tone in ToBI but differ in following phrasal tones.

2. Theoretical background and hypotheses

Four analyses of English intonational meaning will be reviewed in this section: Brazil (1975), Gussenhoven (1984, 2002), Pierrehumbert and Hirschberg (1990), and Steedman (2000). The British nuclear tone system was used to describe intonation in the first two analyses.
Where possible, we give the ToDI labels of the intonation contours in brackets. The ToBI notation (Beckman and Ayers 1994) was used in the other two analyses and is maintained in the review.

According to Brazil (1975), the speaker makes a moment-by-moment assessment of the understanding he shares with the hearer, and “by choosing one intonation pattern rather than another, the speaker can affect what an utterance does towards achieving convergence” (1975: 3). Brazil proposed three speaker-options: (1) Proclaiming: the speaker presents what he says as new information; (2) Referring: the speaker makes references to features which he takes to be already present in the interpreting worlds of the speaker and the hearer; (3) Neutral: the speaker avoids proclaiming or referring, i.e., withdrawing himself from the interactive situation. These three options are signaled by five nuclear tones. Proclaiming tones are fall (H*L) and rise-fall (L*HL). Referring tones include fall-rise (H*LH) and high rise. Rise-fall and high rise have the effect of intensifying the meaning they signal. The neutral tone is low rise (L*H).

Following Brazil (1975), Gussenhoven (1984, 2002) argued that in a conversation, the speaker and the hearer strive towards some common understanding about a particular segment of the world and the speaker may achieve this goal in three ways: (1) Addition: adding the Variable (i.e., the information that the speaker contributes to the conversation) to the background, comparable to Brazil’s proclaiming; (2) Selection: selecting a Variable from the background, comparable to Brazil’s referring; or (3) Testing: choosing not to commit himself as to whether the Variable belongs to the background. Addition is conveyed by fall (H*L L%), selection by fall-rise (H*L H%), and testing by low rise (L*H H%). Note that Gussenhoven used nuclear tone to refer to both the pitch accent and the boundary tone. These tones were considered the basic nuclear tones of English. All the other tones are modifications of them. The modification relevant to us here is delay, i.e., postponing the association of the tone with the segment.
This resulted in the delayed fall (L*HL L%), the delayed fall-rise (L*HL H%), and the delayed low rise (L* H%). Each delayed tone was claimed to signal the same meaning as the corresponding basic nuclear tone but with an extra meaning element, i.e., non-routineness.

In line with Brazil (1975) and Gussenhoven (1984, 2002), Pierrehumbert and Hirschberg (hereafter P&H) (1990) proposed that the choice of pitch contour largely conveys how the speaker evaluates his contribution to the discourse with respect to some mutual beliefs between the speaker and the hearer(s). Different from Brazil (1975) and Gussenhoven (1984, 2002), P&H’s analysis assumed strong compositionality in the meaning of the pitch contour, according to which each type of components (i.e., pitch accent, phrase accent, and boundary tone) of the pitch contour is interpreted with respect to its distinct phonological domain and contributes a distinct type of information to the overall interpretation of a contour. The type of components that conveys information about the status of individual discourse referents is pitch accent. Here we mention briefly the postulated functions of H*, L*, and L*+H, which are relevant to the pitch accents under investigation (see footnote 1). Pitch accents consisting of H* mark lexical items that should be treated as new in the discourse. Pitch accents consisting of L* mark lexical items that are not to be treated as new, but nevertheless are salient in the discourse. Within this group of pitch accents, L*+H signals a lack of speaker commitment to a scale that links the accented item to other items salient in the hearer’s mutual beliefs. Phrase accent and boundary tone convey the degree of relatedness between intermediate phrases and intonational phrases respectively, with the high tone emphasizing relatedness and the low tone independence.

Different from the three previous analyses, Steedman (2000) divided an utterance into theme and rheme. A theme is what the speaker and the hearer(s) have agreed to talk about, the part of the sentence that ties it to the previous discourse; a rheme is the speaker’s new contribution on the subject of the theme.
Both the theme and the rheme can be marked or unmarked. Marked information is either new (in the case of rheme) or contrastive (in the case of theme); unmarked information is neither. Marked words in rhemes generally receive H*, but can also receive L*, and possibly H*+L, and H+L*. Marked words in themes generally receive L+H*, and possibly L*+H in responses where contradiction is involved. Following P&H (1990), Steedman argued that the meanings of the pitch accents remain the same when followed by different phrasal tones.

As may have become clear, these theories make different claims on the functions of the nuclear pitch accents at issue in marking information status. According to the theories rooted in the British English tradition (i.e., Brazil 1975; Gussenhoven 1984, 2002), H*L and L*HL mark new information whereas H*LH marks given information. The two analyses differ in the meaning of L*H, which is “neutral” according to Brazil (1975) but “testing” according to Gussenhoven (1984, 2002). Opposite predictions can be derived for L*HL (L*+H L-L% in ToBI) and H*LH (H* L-H% in ToBI) from P&H (1990) operating on the ToBI system. As phrasal boundary tones are considered irrelevant to the conveyance of the information status of individual discourse referents, pitch contours with the same pitch accents but different phrasal tones have the same meaning (e.g., H* L-L% and H* L-H%) and pitch contours with different pitch accents but the same phrasal tones have different meanings (e.g., H* L-L% and L*+H L-L%). It follows that L*HL marks given information and H*LH marks new information. Steedman (2000) differed from P&H (1990) in claiming that L*H (L* H-H% in ToBI) conveys new information and that L*HL (L*+H L-L% in ToBI) conveys given information involving contradiction.

Because we are dealing with pitch accent types in British English, we derived the following working hypotheses on the role of nuclear H*L, L*HL, L*H, and H*LH in processing information status mainly from Brazil (1975) and Gussenhoven (1984, 2002):

a. H*L triggers the interpretation of newness. (All the theories reviewed)
b. L*HL triggers the interpretation of newness with more emphasis than H*L. (Brazil 1975; Gussenhoven 1984, 2002)
c. L*H triggers the interpretation of givenness, like deaccentuation. (P&H 1990)
d. H*LH triggers the interpretation of givenness, like deaccentuation. (Brazil 1975; Gussenhoven 1984, 2002)

3. Experiment 1 – natural speech

3.1. Method

The eye-tracking paradigm used in Dahan, Tanenhaus and Chambers (2002) was adopted to evaluate our hypotheses in natural speech. Dahan, Tanenhaus and Chambers examined the role of accent placement in reference resolution by monitoring eye fixations to lexical competitors (e.g., coat and comb) as participants followed pre-recorded instructions to move objects displayed on a computer screen using a computer mouse. Each display contained four objects and four geometric shapes, as illustrated in Figure 1. It was found that the effect of accent placement was reliably reflected in the proportion of fixations to the referent and its lexical competitor in a selected time window. The eye-tracking paradigm may thus offer a measure of the effect of pitch accent type on the processing of information status.

Figure 1. Example of a visual display. Geometric shapes were blue.

3.1.1. Experimental design. On each experimental trial, two of the objects had names that were phonemically related, i.e., sharing the same stressed syllable (e.g., candle vs. candy), or the same onset-peak cluster (e.g., comb vs. coat) if the words were monosyllabic. One served as the target (e.g., comb) and the other as the competitor (e.g., coat). Each trial consisted of two consecutive instructions (see Table 1). The second instruction always mentioned the target (e.g., now put the comb below the diamond). The first instruction mentioned either the target (e.g., Put the comb below the triangle), marking the target at the onset of the second instruction as “given” but the competitor as “new”, or the competitor (e.g., Put the coat below the triangle), marking the target at the onset of the second instruction as “new” but the competitor as
“given”. The target noun in the second instruction was temporarily ambiguous during the segments it has in common with the competitor noun. At that stage, both the target and competitor nouns were potential candidates for selection, and participants were expected to make use of intonation to identify the noun (Dahan, Tanenhaus and Chambers 2002). The intonation of the first instruction was kept the same throughout the experiment; the intonation of the second instruction was varied by having the target noun produced with H*L, L*HL, L*H, H*LH and deaccentuation with an intonational phrase boundary after the noun. Combining the two types of information status of the target/competitor during the second instruction and the five accent conditions gave us ten experimental conditions, as illustrated in Table 1.

3.1.2. Predictions. The patterns of fixations to the competitor picture and the target picture from the target word onset to the identification of the target word during the second instruction have been used as indicators of how intonation affects the interpretation of information status (Dahan, Tanenhaus and Chambers 2002). In line with this method, we arrived at the following predictions:

(1) When the target/competitor is new (i.e., not previously mentioned), accent conditions conveying newness (e.g., H*L, L*HL) will trigger more fixations to the target/competitor picture than accent conditions conveying givenness (e.g., L*H, H*LH, deaccentuation).
(2) When the target/competitor is given (i.e., previously mentioned), accent conditions conveying givenness (e.g., L*H, H*LH, deaccentuation) will trigger more fixations to the target/competitor than accent conditions conveying newness (e.g., H*L, L*HL).

Table 1. Illustration of the ten experimental conditions in Experiment 1

First instruction                              Second instruction                             Information status
Put the coat above the triangle (competitor)   Now put the comb below the diamond (target)    New target, given competitor
Put the comb above the triangle (target)       Now put the comb below the diamond (target)    Given target, new competitor

Accent conditions on the target noun in the second instruction: H*L L%, L*HL L%, L*H H%, H*L H%, deaccentuation.

3.1.3. Materials. Twenty pairs of phonemically similar nouns served as the materials for experimental trials, of which eighteen pairs were also used in Dahan, Tanenhaus and Chambers (2002). As pitch accents are realized differently in monosyllabic words than in disyllabic words, to minimize effects related to phonetic realization of pitch accent, we included twelve pairs of monosyllabic words and eight pairs of disyllabic words. One member of each pair was assigned the role of target, the other the role of competitor. In the case of the monosyllabic pairs, care was taken to have a similar distribution of voiced codas and voiceless codas in the targets and competitors. The mean lexical frequencies of the targets (33.6 per million) and competitors (27.5 per million) were similar, as reported in Francis and Kučera (1982). Each of the 20 target-competitor pairs was associated with two distractor nouns, resulting in four pictures on each display (see Figure 1). Two target-competitor pairs were assigned to each experimental condition by means of a Latin Square. This led to ten lists of experimental stimuli.

In addition to the 20 experimental trials, 48 filler trials were constructed to prevent participants from developing the expectation that pictures with phonemically similar names were likely to be moved in either instruction.

Combining the ten lists of experimental stimuli and the fillers gave us 10 stimulus lists. To minimize order effects, two stimulus orders were created for each stimulus list.

The 272 (20 experimental trials × 4 + 48 filler trials × 4) pictures were selected from Snodgrass and Vanderwart’s (1980) picture database and the picture database of the Max Planck Institute for Psycholinguistics (MPI). All were black and white line drawings.

The spoken instructions were recorded by a prosodically trained male speaker of Southern Standard British English at 48 kHz sampling rate in the soundproof studio at the MPI. The speaker read the instructions from a printed recording script (see (3) for an example).
The intonation for each instruction was transcribed in the ToDI notation. The speaker was an expert on ToDI and familiar with producing pitch contours on request. Figure 2 shows example f0 tracks for the target word comb produced in all five accent conditions.

(3) Put the comb above the square; now put the comb below the diamond.
    Tones, first instruction:  %L H*  H*L  H*L L%
    Tones, second instruction: %L H*  H*L H%  H*L  H*L L%

The rise in L*HL and the fall in H*LH may be realized largely on the segments present in both the target and the competitor. If L*HL were found to have the same effect as L*H, and H*LH were found to have the same effect as H*L, this might have been caused by the ambiguity in the realization of the pitch accents. To establish that L*H was acoustically distinguishable from L*HL in the rise, and H*L was acoustically distinguishable from H*LH in the fall, we measured maximal f0 and minimal f0 in the rise and the fall. As can be seen in Table 2 and Figure 2, L*H rose from a significantly lower pitch point than L*HL (t = 4.669, df = 12, p < .005), whereas H*LH fell from a significantly higher pitch point than H*L (t = 4.128, df = 10, p < .005).²

Table 2. f0 values of the shared part of the pitch accents

                        H*L    H*LH   L*H    L*HL
Mean maximal F0 (Hz)    166    179    156    158
Mean minimal F0 (Hz)     80     78     79     98

3.1.4. Procedures. Twenty-four undergraduates and two postgraduates from the School of Psychology at the University of Birmingham participated in the experiment. They all spoke Southern British English as their only native language. None of them reported to have hearing problems. They received either course credits or a small fee for their participation. The experiment took about 10 minutes.

Participants were tested individually. The experimenter described first briefly to the participants what they were supposed to do and then gave them the written instructions on the experimental task to read. An example of the visual display was also included in the written instructions. Participants were seated at a comfortable distance from the computer screen in a quiet room. The eye tracker was mounted and calibrated.
Eye movements were monitored with a portable SR EyeLink II eye-tracking system. Spoken instructions were presented to the participants through headphones. The structure of a trial was as follows: first, a central fixation point appeared on the screen for 500 ms. Then, a 5×5 grid with four pictures and four geometric shapes appeared on the screen, as the auditory presentation of the first instruction was initiated. The positions of the pictures were randomized across four fixed positions of the grid, while the geometric shapes appeared in fixed positions on every trial. As soon as a picture was moved after the first instruction ended, the second instruction was initiated. Once the participant moved a picture following the second instruction, the next trial began. The position of the mouse cursor on the computer screen was sampled and recorded, along with the eye-movement data. A cen-

2. The analyses were performed on maximal f0 obtained from 11 target words and minimal f0 obtained from 13 target words. For the other target words, no reliable measurements could be taken because the rise and the fall were only partially visible in f0 tracks.
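
The counterbalancing described in Sections 3.1.1 and 3.1.3 (two information statuses crossed with five accent conditions, and two target-competitor pairs per condition in each of ten lists) can be illustrated with a minimal sketch. The text only states that a Latin Square was used; the cyclic rotation below is an assumption about how such a square might be built, and all names (`CONDITIONS`, `latin_square_lists`, `picture_count`) are hypothetical rather than taken from the study's materials.

```python
from itertools import product

# Accent conditions on the target noun and information status of the target
# at the onset of the second instruction (labels follow the text).
ACCENTS = ["H*L", "L*HL", "L*H", "H*LH", "deaccentuation"]
STATUSES = ["new", "given"]

# 2 information statuses x 5 accent conditions = 10 experimental conditions.
CONDITIONS = list(product(STATUSES, ACCENTS))


def latin_square_lists(n_pairs=20, conditions=CONDITIONS):
    """Build one list per condition by cyclic rotation (an assumed scheme).

    Each list maps a pair index to a condition. Within a list every condition
    is used n_pairs / len(conditions) times; across lists every pair appears
    in every condition exactly once.
    """
    n_cond = len(conditions)
    return [
        {pair: conditions[(pair + shift) % n_cond] for pair in range(n_pairs)}
        for shift in range(n_cond)
    ]


def picture_count(n_experimental=20, n_filler=48, pictures_per_display=4):
    """Number of pictures needed when every trial shows four pictures."""
    return (n_experimental + n_filler) * pictures_per_display


if __name__ == "__main__":
    lists = latin_square_lists()
    print(len(CONDITIONS), "conditions,", len(lists), "lists")
    print("pictures needed:", picture_count())
```

With the default arguments the sketch yields ten lists in which each condition is assigned to two pairs, and the picture count matches the 272 pictures reported in Section 3.1.3. Other Latin Square constructions would satisfy the same constraints.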