Brain&Language110(2009)135–148 ContentslistsavailableatScienceDirect Brain & Language journal homepage: www.elsevier.com/locate/b&l Review The role of the auditory brainstem in processing linguistically-relevant pitch patterns Ananthanarayan Krishnan, Jackson T. Gandour* DepartmentofSpeechLanguageHearingSciences,PurdueUniversity,1353HeavilonHall,500OvalDrive,WestLafayette,IN47907-2038,USA a r t i c l e i n f o a b s t r a c t Articlehistory: Historically,thebrainstemhasbeenneglectedasapartofthebraininvolvedinlanguageprocessing.We Accepted15March2009 reviewrecentevidenceoflanguage-dependenteffectsinpitchprocessingbasedoncomparisonsofnative Availableonline14April2009 vs.nonnativespeakersofatonallanguagefromelectrophysiologicalrecordingsintheauditorybrain- stem.Wearguethatthereisenhancingoflinguistically-relevantpitchdimensionsorfeatureswellbefore Keywords: theauditorysignalreachesthecerebralcortex.Weproposethatlong-termexperiencewithatonelan- Auditory guagesharpensthetuningcharacteristicsofneuronsalongthepitchaxiswithenhancedsensitivityto Human linguistically-relevant,rapidlychangingsectionsofpitchcontours.Thoughnotspecifictoaspeechcon- Brainstem text,experience-dependentbrainstemmechanismsforpitchrepresentationareclearlysensitivetopar- Pitch ticular aspects of pitch contours that native speakers of a tone language have been exposed to. Such Language Frequencyfollowingresponse(FFR) experience-dependenteffectsonlower-levelsensoryprocessingarecompatiblewithmoreintegrated, Iteratedripplednoise(IRN) hierarchicallyorganizedpathwaystolanguageandthebrain. MandarinChinese (cid:2)2009ElsevierInc.Allrightsreserved. Experience-dependentplasticity Speechperception 1.Introduction pitch information from the ear to the cerebral cortex. In fact, we seethatpitchencodingintheauditorybrainstemisanexampleof Historically,thebrainstemhasnotbeenconsideredtobeapartof experience-dependentneuralplasticity. thebrainworthyofinterestwhenitcomestolanguageprocessing. Theconventionalwisdomisthat‘‘processingoperationsconducted 1.1.Linguisticfunctionsofpitch intherelaynucleiofthebrainstemandthalamusisgeneraltoall sounds,andspeech-specificoperationsprobablydonotbeginuntil Tonelanguagesexploitphonologicallycontrastive pitchatthe thesignalreachesthecerebralcortex”(Scott&Johnsrude,2003,p. wordorsyllablelevel(Gandour,1994;Yip,2003).Suchlanguages 100).Itisagreedthatspeech-specificoperationsarelikelycircum- arecommonintheFarEastandSoutheastAsia.MandarinChinese, scribedtothecortex.Nevertheless,weseethattheauditorysignal forexample,inadditiontoconsonantsandvowels,hasfourlexical issubjecttolanguage-dependenteffectsatsubcorticalstagesofpro- tones (Howie, 1976): yi1 ‘clothing’ high level [T1]; yi2 ‘aunt’ high cessing.Thisisnotcontradictory.Byfocusingonspecificproperties rising[T2];yi3‘chair’lowfalling-rising[T3];yi4‘easy’highfalling oftheauditorysignal,irrespectiveofaspeechornonspeechcontext, [T4].Suchlanguagesaretobedistinguishedfromthoseinwhich wearguethattheemergenceofacoustic–phoneticfeaturesrelevant pitchvariationsareusuallynotcontrastiveatthesyllableorword tospeechperceptionbeginsnolaterthan5–7msfromthetimethe level (e.g., English). In nontone languages, however, variations in auditorysignalenterstheear.Thesegeneral-purposeauditorypro- pitch may be used to signal stress and intonation patterns at cessesaretuneddifferentiallytosuchfeaturesdependingupontheir post-lexicallevelsofrepresentation.Thecrucialfeaturethatdiffer- linguisticrelevance.Sucheffectsoflanguageexperienceonlower- entiatesbetweenthese twotypesoflanguages iswhetheror not levelsensoryprocessingiscompatiblewithamoreintegratedap- pitch variations are contrastive in the lexicon. All languages use proachtolanguageandthebrain(Hickok&Poeppel,2004;Zatorre pitchvariationsforintonation,butfewerpossibilitiesareavailable &Gandour,2008).Thisreviewfocusesonhowlanguageexperience intonelanguagesbecauseofco-occurringdemandsforpitchvari- modulates processing of linguistically-relevant pitch in tone lan- ationatthelexicallevel. guages(e.g.,Thai)atthelevelofthebrainstem.Itisseenthatearly sensory processing involves more than a simple transmission of 1.2.Perceptualdimensionsofpitchintonelanguages * Correspondingauthor.Fax:+17654940771. Althoughtheremaybeconcomitantchangesinduration,inten- E-mail addresses: [email protected] (A. Krishnan), [email protected] (J.T. Gandour). sity,andphonation,themostimportantacousticcorrelateoftones 0093-934X/$-seefrontmatter(cid:2)2009ElsevierInc.Allrightsreserved. doi:10.1016/j.bandl.2009.03.005 136 A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 isvoicefundamental frequency(f ). Its primacyas acueto tonal by pattern-matching mechanisms (Cohen, Grossberg, & Wyse, 0 identification of citation forms has been confirmed in perception 1995).Acontrastingviewisthattheauditorysystemextractspitch testsusingbothnaturalandsyntheticspeechstimuli.InMandarin, fromthetimingofauditorynervefiberactivityirrespectiveoffre- f contoursprovidethedominantcuefortonerecognition(Howie, quency organization. These temporal models are based solely on 0 1976;Xu,1997).Bothf heightandmovementprovidesufficient thetiminginformationavailableintheinter-spikeintervalsrepre- 0 information for high intelligibility of tones in Thai (Abramson, sented in the simulated (Krumbholz, Patterson, Seither-Preisler, 1962),Mandarin(Howie,1976),andCantonese(Vance,1977).Ra- Lammertmann, & Lutkenhoner, 2003; Meddis & O’Mard, 1997; pid f movements are required for high intelligibility of contour Patterson, Allerhand, & Giguere, 1995) or actual (Cariani & Del- 0 tones(Abramson,1978).Theoverallshapeofthef contouriscru- gutte, 1996a, 1996b) auditory nerve activity. They derive a pitch 0 cially important, as exemplified in high identification of the four estimate by pooling timing information across auditory nerve fi- Mandarin tones in a frequency range reduced to 4Hz (Klatt, berswithoutregardtothefrequency-to-placemapping.Thus,neu- 1973).Therelativeimportanceoff heightandmovementincuing ralphase-lockingrelatedtovoicefundamentalfrequency(f )plays 0 0 tonal distinctions varies depending on the specific pair of tones adominantroleintheencodingoflowpitchassociatedwithcom- beingcontrasted(Lin&Repp,1989).InMandarin,theprimacyof plex sounds. Temporal encoding schemes provide a unified and f cues notwithstanding, concomitant acoustic cues may include parsimoniouswayofexplainingadiverserangeofpitchphenom- 0 theshapeoftheamplitudeenvelope,syllableduration,andphona- ena(Meddis&O’Mard,1997). tionquality(Fu,Zeng,Shannon,&Soli,1998;Garding,1986;Liu& The human frequency following response (FFR), reflects sus- Samuel,2004;Whalen&Xu,1992). tained phase-locked activity in a population of neural elements Identificationanddiscriminationtasksrevealclassicalpatterns within the rostral brainstem (see Krishnan, 2006, for review of of categorical perception (CP) in Mandarin Chinese listeners for FFR response characteristics and cochlear initiation site). The re- bothspeechandnonspeechstimulivaryingalongalinearf contin- sponse is characterized by a periodic waveform which follows 0 uumfrom(level[T1]torising[T2](Xu,Gandour,&Francis,2006). theindividualcyclesofthestimuluswaveform.Experimentalevi- CP for pitch direction depends on a listener’s experience with a dencepointstotheinferiorcolliculus(IC)asthesourceoftheFFR tonelanguage,asshownbythelackofsimilarCPeffectsinnontone generator, as reflected in a latency correspondence between FFR languagelisteners.Withrespecttopitchmovement,itappearsthat and the potential recorded directly from the IC (Smith, Marsh, & acontinuumrangingfromoneorleveltonetoanotherisnotper- Brown,1975);selectiveamplitudereductionoftheFFRfollowing ceivedcategorically(Abramson,1979;Francis,Ciocca,&Ng,2003), cryogeniccoolingoftheIC(Smithetal.,1975);absenceofFFRin whereasacontinuumrangingfromahighleveltonetoahighris- individuals with lesions confined to the IC (Sohmer & Pratt, ing contour tone is perceived categorically (Francis et al., 2003; 1977). The shorter latency of the FFR (around 6–9ms) correlates Wang,1976). well with activity from the IC region and is too early to reflect Indeed,ithasbeenshownthatsuchf featuresordimensions, activityfromcorticalgenerators(Galbraith,2008;Galbraithetal., 0 as opposed to categories, are critical to a fuller understanding of 2000).Furthermore,thenatureoftheauditorysystemmakesitun- toneperception.Asamultidimensionalperceptualattribute,pitch likelythatthelow-passfilteredphase-lockedactivityreflectedin reliesonseveralacousticdimensions(e.g.,height,slope,direction). theFFRisofcorticalorigin(Alkhounetal.,2008).However,there Psychophysicalevidenceforthemultidimensionalnatureofpitch iscompellingevidencetosuggestthatthisbrainstemcomponent perceptionintonelanguagescomesprimarilyfromcross-language is indeed subject to corticofugal modulation (Banai, Abrams, & multidimensional scaling (MDS) studies of dissimilarity ratings. Kraus, 2007; Suga, Gao, Zhang, Ma, & Olsen, 2000; Suga & Ma, BasedonasampleoftonelanguagesfromtheFarEast(Thai,Can- 2003). tonese,Mandarin,andTaiwanese)andWestAfrica(Yoruba)anda nontone language (English), three dimensions are reported to 1.4.Speechandlanguageprocessinginthebrainstem underlieacommonperceptualspace:averagef height,direction 0 off movement,magnitudeoff slope(Gandour,1978,1983;Gan- Therehasbeenincreasinginterestinroleoftheauditorybrain- 0 0 dour&Harshman,1978).Theirrelativeimportancevariesdepend- stem in speech processing within the past decade. In terms of ingonalisteners’familiaritywithspecifictypesofpitchpatterns speech intelligibility, FFRs show increased amplitude in response that occur in their native language. For example, the perceptual toforwardspeech,ascomparedtoreversedspeech,indicatingthat saliency of the contour dimension is greater for native speakers familiarphoneticandprosodicpropertiesofforwardspeechselec- oftonelanguagesthanforspeakersofEnglish,whileEnglishlisten- tivelyactivatebrainstemneurons(Galbraithetal.,2004).Usingthe ersgivegreaterweighttotheheightdimensionthandotonelan- /da/syllabletoelicitthebrainstemresponse,Krausandcolleagues guage speakers. Such differences in perceptual saliency suggest in a series of studies demonstrate how FFRs separately encode that long-term experience enhances listeners’ attention to pitch sourceand filtercharacteristics ofthespeechsignal(for reviews, dimensionsthatarephoneticallyrelevantinaparticularlanguage. seeJohnson,Nicol,&Kraus,2005;Kraus&Nicol,2005),andhow Recent MDS analyses of short-term training in the recognition of brainstemtimingpredictscerebralasymmetryforspeech(Abrams, lexicaltones(Francis,Ciocca,Ma,&Fenn,2008;Guion&Pederson, Nicol,Zecker,&Kraus,2006). 2007) show that training-related changes in listeners’ perceptual Intermsofsegmentalfeaturesofspeech,FFRspreservespectral space can be attributed to differential weighting of f height and peakscorrespondingtothefirsttwoformantsofbothsteady-state 0 direction dimensions comparable to those observed by Gandour vowels(Krishnan,1999,2002)andtime-variantconsonants(Krish- and colleagues. These perceptual data raise questions about the nan & Parkinson, 2000; Plyler & Ananthanarayan, 2001). Though nature of time-varying dimensions of lexical tones that apply to FFRsareknowntopreservepitch-relevantinformationaboutcom- pitchprocessingatthelevelofthebrainstem. plexsoundsthatproducetime-invariantpitch(Greenberg,Marsh, Brown,&Smith,1987),thequestionarisesastohowthebrainstem 1.3.Physiologicalmechanismsofpitch handlessuprasegmentalfeaturesofspeechthatischaracterizedby time-variantpitch. Thephysiologicalbasesofpitchperceptionarestillamatterof By virtue of time-variant pitch, tone languages are especially debate as reflected in the extant literature. One view is that the advantageous for isolating effects of encoding of voice pitch at auditory system extracts pitch from complex sounds by deriving theleveloftheauditorybrainstem.InMandarin,forexample,four aspectralprofilefromfrequency-specificauditoryinput,followed words may comprise a minimal quadruplet, minimally A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 137 distinguished by variations in pitch; but otherwise identical in scribe major findings from recent studies of experience- termsofconsonantandvowelsegments(seeSection1.1).Inaddi- dependent neural plasticity in the brainstem by A. Krishnan and tion to their linguistic significance, all four tones exhibit voice f hiscolleaguesatPurdueUniversityoverthepast5years.Aneval- 0 trajectories and harmonics that lie within the range of easily uation of this evidence will lead us to infer that although FFRs recordable FFRs (below 2Hz). The relatively longer duration of preserve pitch information of lexical tones in both native and citation forms (200–350ms) necessitates use of slower stimulus nonnative listeners, they are more robust in the former because repetition rates which in turn enables recording of robust FFRs of their long-term exposure to native pitch contours in natural withlittleornoneuraladaptation. speech. Early sensory encoding of pitch information below the cerebralcortexissensitive tolanguageexperience.Insteadof to- 1.5.Hierarchicalprocessingalongtheauditorypathwayandbeyond nal categories, pitch extraction in the brainstem is crucially dependent on specific dimensions or features of pitch contours. We have made considerable progress over the past decade in The interaction between auditory and cognitive processes is re- ourunderstandingofthecomplexseriesofprocessingstagesthat flected by the well-known differential sensitivity to rising and arerequiredtotranslatespeechsoundsintomeaningatthelevel falling pitch movements in the auditory system which is seen ofthecerebralcortex.Functionalimagingevidencepointstomul- to be enhanced in the brainstem depending upon the prosodic tiple, parallel, hierarchically organized processing pathways that needs of a particularlanguage. We propose that long-termexpe- are related tospeech processing in the cerebralcortex (Hickok & riencewithatonelanguagesharpensthetuningcharacteristicsof Poeppel, 2000, 2004, 2007; Obleser, Zimmermann, Van Meter, & neuronsalongthepitchaxiswithenhancedsensitivitytolinguis- Rauschecker, 2007; Poeppel, Idsardi, & van Wassenhove, 2008; tically-relevant, rapidly changing sections of pitch contours. Scott,2003;Scott&Wise,2004;Zatorre,Belin,&Penhune,2002; Though not specific to a speech context, experience-dependent Zatorre & Gandour, 2008). Speech processing in the cortex also brainstemmechanismsfor pitchrepresentation are clearly sensi- emerges from differential demands on distributed brain regions tivetoparticularaspectsofpitchcontoursthatnativespeakersof shared by both verbal and nonverbal auditory processing (Price, atonelanguagehavebeenexposedto.Allofthisevidencewillbe Thierry,&Griffiths,2005). broughttobearonmajortheoreticalissuesconcerningtheroleof Inthecaseofpitch,functionalimagingrevealshierarchicalpro- the brainstem in language processing. What is the nature of cessinginsubcorticalregionsalongtheauditorypathway.Encod- acoustic phonetic features at subcortical levels of processing? ing of temporal regularities of pitch begins as early as the What is the role of pitch mechanisms in the brainstem in the cochlear nucleus but is not completed until the auditory cortex encodingoflinguistically-relevantdimensionsofsounds?Looking (Griffiths,Uppenkamp,Johnsrude, Josephs,& Patterson,2001). Of at the auditorypathwayfrom the cochleato auditorycortex and relevance to FFRs, they found that the IC is more sensitive to theconnectionstothebrainstem,arelanguage-dependenteffects changesintemporalregularitythanthecochlearnucleus.Further tobeascribedtocorticofugalmechanismsorlocalreorganization evidenceofahierarchyofpitchprocessingisfoundinthecerebral in the brainstem itself in the hierarchical processing of the tem- cortex(Patterson,Uppenkamp,Johnsrude,&Griffiths,2002).When poral structure of sound? thepitchisvariedtoproduceamelody,activationmovesbeyond primaryauditorycortexwithrelativelymoreactivityintheright 2.Language-dependentplasticityintheencodingofpitchinthe hemisphere. humanbrainstem Electrophysiological recordings are crucial for investigating questions about the hierarchy of pitch processing not only corti- 2.1.Isitfeasibletousethescalp-recordedFFRtoevaluatetheneural callybutsubcorticallyas well(Griffiths,Warren,Scott,Nelken,& representationofpitchinthebrainstem? King, 2004). In addition, our focus on language and pitch in the brainstemstemsfromtheviewthatacompleteunderstandingof Auditory nerve single-unit population studies have demon- theprocessingoflinguistically-relevantdimensionsoftheauditory strated that phase-locking plays a dominant role in the neural signalcanonlybeachievedwithinaframeworkinvolvingaseries encoding of both the spectrum and voice pitch of speech sounds ofcomputationsthatapplytorepresentationsatdifferentstagesof (Cariani&Delgutte,1996a,1996b;Meddis&O’Mard,1997;Patt- processing(Hickok&Poeppel,2004,p.69).Theyarguethatearly erson et al., 1995). Phase-locked neural activity underlying the processing stages (e.g., brainstem) may perform transformations scalp-recordedhumanFFRhasalsobeenshowntoencodecertain ontheacousticdatathatarerelevanttolinguisticaswellasnon- spectral features of steady-state and time-variant speech sounds linguistic auditory perception. Scott (2003) similarly argues for as well as pitch of several complex sounds that produce time- hierarchicalprocessingatthecorticallevel,allowingforthepossi- invariant pitch percepts (Aiken & Picton, 2008; Alkhoun et al., bilityofdifferencesinthedegreeofprocessingofspeechandnon- 2008; Johnson et al., 2005; Krishnan, 2006). By extension, we speech stimuli. Indeed, we shall suggest that sensory hypothesized that the human FFR may preserve pitch-relevant representations of speech parameters at the brainstem are based information for speech sounds that elicit time-variant as well as ongradedscalesthataremodulatedbylanguageexperience. steady-state pitch percepts (Krishnan, Xu, Gandour, & Cariani, 2004). FFRs were recorded differentially between EEG type scalp 1.6.Scopeofthisreviewoflanguage-dependentprocessinginthe electrodesplacedonthehighforeheadatmidlineandattheipsi- brainstem lateral mastoid in response to the four lexical tones of Mandarin Chinese. Short-term autocorrelation functions and running auto- Ithasalsobeenshownthatthemalleabilityoftheauditorysys- correlograms(ACGs)werecomputedfromtheFFRstoindexvari- temcanbeexploitedtostudytheinteractionbetweensensoryand ations in FFR periodicities over the duration of the response cognitiveprocessesatthelevelofthebrainstem(seeKraus&Banai, (Krishnan,Swaminathan,&Gandour,inpress;Krishnan,Xu,Gan- 2007,forreview).Theyemphasizethat‘‘auditoryprocessingisnota dour,&Cariani,2005).ACGrepresentstherunningdistributionof rigid,encapsulatedprocess;rather,itinteractsintimatelywithother all-order phase-locked intervals present in the population re- neuralsystems...andisaffectedbyexperience[music,language], sponse and quantifies periodicity and pitch strength variations environmentalinfluences...,andactivetraining...”(p.105). overtime.These measuresrevealedthatthephase-lockedneural This review will focus on the malleability of language-depen- activity underlying the FFR faithfully follows the pitch changes dent effects on pitch processing in the brainstem. We will de- presented in each stimulus (Fig. 1). 138 A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 attentionormemoryconfounds,andtherebyrepresentsanauto- matic, preattentive level of processing. As reflected by the MMN, it has been shown that language experience similarly influences the early cortical automatic processing of linguistically-relevant pitch contours (Chandrasekaran, Gandour, & Krishnan, 2007; Chandrasekaran,Krishnan,&Gandour,2007;Kaan,Wayland,Bao, & Barkley, 2007), and moreover, that lexical tonesare lateralized totherighthemisphere,incontrasttoconsonantswhicharelater- alizedtothelefthemisphere(Luoetal.,2006).TheLuoetal.data arenotirreconcilablewithpreviousPETandfMRIstudies.Rather theysuggestthathemisphericlateralityeffectsarearesultofspe- cializedneuralcomputationsthatapplytorepresentationsatdif- ferent stages of auditory processing. The leftward asymmetry observed in a discrimination task likely reflects neural computa- tions that occur downstream from preattentive auditory processing. In animals, it is already well-established that experience- dependent neural plasticity is not limited to the cerebral cortex. Fig.1. FFRvoicepitchcontours(solidlines)superimposedonstimulusf0contours ResponsepropertiesandfrequencymapsintheICofbatsundergo (brokenlines)forthefourMandarinChineselexicaltones(yi1‘clothing’highlevel; changeafterauditoryconditioningorfocalelectricalstimulationof yi2‘aunt’highrising;yi3‘chair’lowfalling-rising;yi4‘easy’highfalling).Pitchwas theauditorycortex(see Suga,1990,1994,forreviews).Auditory extracted using a short-term autocorrelation algorithm (Boersma, 1993) on experienceofalteredinterauralcuesforlocalizationinyoungowls multipleframesofthesignalutilizingaHanningwindowofeffectivelengthequal leadtofrequency-dependentchangesininterauraltimedifference to0.04s.(Krishnanetal.,2004,withpermissionofElsevier.) tuningandfrequencytuningofICneurons(Gold&Knudsen,2000). Other recent data also support experience-dependent neural plasticityattheleveloftheICinhumans.ThelatencyofwaveV Thesefindingsfor time-variantf suggestthata robustneural inhearing-impairedlistenerswhouseamplificationisshorterthan 0 temporalrepresentationforpitchispreservedinthephase-locked in those who do not (Philibert, Collet, Vesson, & Veuillet, 2005). neural activity of an ensemble of neural elements in the rostral UsingtheFFR,weobservethatbrainstemresponsesimproveafter brainstem (Hall, 1979). They extend previous findings of FFR auditorytraininginchildrenwithlearningimpairments(Russo,Ni- encoding of pitch-relevant information in complex stimuli with col,Zecker,Hayes,&Kraus,2005);pitchtrackingaccuracyofMan- time-invariantf (Greenbergetal.,1987).Anotherinterestingfind- darin tones is more accurate in nonnative musicians than 0 ing on voice pitch is that FFR pitch strengths of yi2 and yi3 were nonmusicians (Wong, Skoe, Russo, Dees, & Kraus, 2007); experi- greaterthanyi1andyi4.Ithaspreviouslybeendemonstratedthat ence with sounds composed of acoustic elements relevant to the FFR amplitude for falling tonal sweeps is smaller than that speech leads to developmental changes in brainstem responses for corresponding rising tonal sweeps (Krishnan & Parkinson, (Johnson, Nicol, Zecker, & Kraus, 2008); and pitch track accuracy 2000).Similarselectivitytorisingtonalsweepshasbeenobserved improves in native English-speaking adults after undergoing for cochlear microphonics (Shore & Cullen, 1984), eighth nerve short-termtrainingonusingMandarintonesinwordidentification compoundactionpotentials(Shore&Nuttall,1985),andresponses (Song, Skoe, Wong, & Kraus, 2008). Also relevant is the conse- oftheventralcochlearnucleusunits(Shore,Clopton,&Au,1987). quenceofadisruptioninthenormalinteractionbetweenlocalpro- ItisplausiblethatdifferencesinFFRpitchstrengthbetweenChi- cesses and the corticofugal modulation of subcortical function nese tones with rising (T2, T3) vs. nonrising (T1, T4) f contours, whichcontributestoplasticity.Thedeficitsinbrainstemencoding 0 may reflect a differential temporal response pattern for rising in children with a variety of language-based learning problems and falling tones among the neural elements generating the FFR. couldverywellreflectsuchadisruptionintheabilityofthecorti- Thus,weconcludethatthescalp-recordedFFRprovidesanoninva- cofugalsystemtofinetunesubcorticalprocesses(Banai,Nicol,Zec- sive window to view neural processing of voice pitch in human ker,&Kraus,2005;Russoetal.,2008;Song,Banai,&Kraus,2008). speechsoundsattheleveloftheauditorybrainstem. The question then arises whether early, preattentive stages of pitchprocessinginthebrainstemmayalsobeinfluencedbylan- 2.2.Doeslanguageexperienceinfluencetheneuralrepresentationof guageexperience.Weconductedacross-languagestudytodeter- pitchinthebrainstem? mine whether native speakers’ long-term exposure and experience using pitch patterns in a tonal language has an influ- Inthecerebralcortex,theneuralsubstratesofpitchperception enceonFFRresponseproperties(Krishnanetal.,2005).FFRswere in the processing of lexicaltones are shapedby language experi- elicitedbythefourMandarintonespresentedtonativespeakersof ence(Gandour,2006a,2006b,2007;Zatorre&Gandour,2008,for MandarinandnontonelanguagespeakersofEnglish.Ifdrivenby reviews). Based on evidence from positron emission tomography acousticpropertiesregardlessoflanguageexperience,FFRswould (PET) and functional magnetic resonance imaging (fMRI) studies beexpectedtobehomogeneousacrosslisteners. ofpitchprocessinginMandarinandThai,itappearsthatpitchpro- Resultsshowedthatpitchstrengthwassignificantlygreaterfor cessingengagesthelefthemisphereonlywhenthepitchpatterns the Chinese group than for the English across all four Mandarin areoflinguisticrelevance(cf.Wong,2002;Wong,Parsons,Marti- tones. Grand-average f contours extracted from FFRs elicited by 0 nez, & Diehl, 2004). These experiments all employed discrimina- theTone2stimulus(yi2)indicatethatbothlanguagegroupswere tion tasks, and thus likely reflect temporally aggregated neural abletofollowtherisingf contouroftheoriginalspeechstimulus 0 events at relatively late attention-modulated stages of auditory (Fig. 2). However, pitch tracking is more variable for the English processing. The mismatch negativity (MMN), a cortical event-re- group compared to the Chinese (see enlarged inset). Indeed, FFR lated response associated with auditory discrimination, provides pitchtracking,asmeasuredbyrank-transformedcross-correlation awindowonearlystagesofpitchprocessinginthecortex.Itiselic- coefficients,wassignificantlygreaterintheChinesegroupascom- ited using a passive oddball paradigm, eliminating task-related paredtotheEnglishacrossallfourMandarintones. A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 139 Asaninitialattempttoaddressthisquestion(Xu,Krishnan,& Gandour, 2006), we conducted an experiment to determine whether linear f ramps (90–140Hz, rising; 140–90Hz, falling), 0 similartoMandarinT2andT4indirection,butdissimilarintrajec- tory(Fig.3),elicitbrainstemFFRsdifferentiallyasafunctionoflan- guageexperience(Chinese,English).Itisnotedthatlinearramps donotoccurinnaturalspeechbecauseofphysiologicalconstraints of the speech production apparatus. In speech perception, how- ever,linearitymaybeoverriddenatlatercorticalstagesofprocess- ingwhichrecruitattentionandmemorycomponents.Forexample, behavioral data from multidimensional scaling (Gandour, 1983) and categorical perception (Xu, Gandour, et al., 2006) of lexical tonesreveallanguage-dependenteffectselicitedbylinearf ramps. 0 Butlinearrampsarenotecologicallyrepresentativeofwhatspeak- ersoflanguagesoftheworld,tonalorotherwise,areexposedto.By examiningFFRselicitedbylinearapproximationsofMandarinT2 andT4,wewereabletoassessthetolerancelimitsforpriminglin- guistically-relevantfeaturesoftheauditorysignalusefulforpitch Fig. 2. Grand-average f0 contours of Mandarin Tone 2 derived from the FFR extractionatthelevelofthebrainstem. waveformsofallsubjectsacrossbothearsintheChineseandEnglishgroups.Thef0 ANOVAsonFFRpitchstrengthandpitchtrackingaccuracyre- contouroftheoriginalspeechstimulusisdisplayedinblack.Theenlargedinset vealednosignificantmaineffectsforeitherlanguagegroup(Chi- showsthatthef0contourderivedfromtheFFRwaveformsoftheChinesegroup nese, English) or pitch direction (rising, falling). Neither was (red) more closely approximates that of the original stimulus (yi2 ‘aunt’) when comparedtotheEnglishgroup(green).(Krishnanetal.,2005,withpermissionof thereanyinteractionbetweenlanguagegroupandpitchdirection. Elsevier.) Weinferthatnolanguage-dependenteffectsareobservedinre- sponsetolinearrisingorfallingf rampsbecausetheyarenotpart 0 ofnativeChineselisteners’experience.Eventhough thef ramps 0 Current temporal encoding schemes of pitch extraction based aredynamic,linearapproximationsofT2andT4,theyareconstant onthedominantintervalinthedistributionofinter-spikeintervals inaccelerationanddeceleration,respectively.ThefactthatManda- rely purely on the acoustic properties of the stimulus (Cariani & rinandEnglishFFRsarehomogeneousinresponsetolineartrajec- Delgutte, 1996a, 1996b). In those schemes, no significant differ- tories suggests that pitch representations in the brainstem are ences in the characteristics of pitch encoding are predicted as a acutely sensitive to dynamic, curvilinear changes in trajectory functionoflanguageexperience.Ourfindingsdemonstrateother- throughoutthedurationofapitchcontour.Intheauditorybrain- wise,i.e.,thattheencodingschemeisnotstaticnorisitdedicated stem,neuralmechanismsrespondtospecificdimensionsofpitch tofaithfullyextractonlythephysicalpropertiesofthestimulus.In- contours that native speakers have been exposed to. Language- stead,theypointtoatemporalencodingschemewhichissensitive dependent neuroplasticity occurs only when salient dimensions to language experience. This language-dependent plasticity en- ofpitch-relevanttospeechperceptionarepresentintheauditory hancesorprimestemporalintervalsthatcarrylinguistically-rele- signal. vant features of pitch contours. The relatively greater pitch Furthersupportfordimensionalspecificitycomesfromarecent strengthandsmootherpitchtrackinginnativeMandarinlisteners studyinwhichFFRswererecordedfromChineseandEnglishpar- may reflect the operation of this language-dependent encoding scheme. We conclude that experience-driven adaptive neural mecha- nismsareinvolvedsubcorticallythatsharpenresponseproperties ofneuronstunedforprocessingpitchcontoursthataresensitiveto theprosodicneedsofaparticularlanguage.Fromtheperspective of auditory neuroethology, this adjustment in processing pitch contoursofMandarintonesiscomparabletoneuralmechanisms thataredevelopedforprocessingbehaviorallyrelevantsoundsin other nonprimate and nonhuman primate animals (Suga, Ma, Gao,Sakai,&Chowdhury,2003).Auditoryprocessingisnotlimited to a simple representation of acoustic features of speech stimuli. Indeed,language-dependentoperationsmaybeginbeforethesig- nalreachesthecerebralcortex. 2.3.Whatarethelimitsofsensitivityofbrainstemneuronsto linguistically-relevantfeaturesofthespeechsignal? InKrishnanetal.(2005),stimuliexhibitedprototypical,curvi- linearf contoursthatweremodeledafterMandarintonesthatoc- 0 cur in natural speech. If brainstem reorganization is induced by languageexperience,thequestionarisesastowhatthetolerance limits are for linguistic sensitivity at this subcortical level. What specificf propertiesorfeaturesofthepitchstimuli,staticordy- 0 namic, are relevant? To what extentcan a stimulusdeviate from naturalspeechexemplarsbeforeexceedingtheupperorthelower F(yigi2.‘3a.uSnpt’e,crtirsoinggralminseaarndrafm0cpo;nytio4u‘resasoyf’M,faanlldinagrinlinCehairnerasemspy)n.t(hXeut,icKsrpisehencahns,teimtaull.i, limitoflinguisticsensitivityofbrainstemneurons? 2006,adaptedwithpermissionofLippincottWilliams&Wilkins.) 140 A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 ticipantsinresponsetoaniteratedripplednoise(IRN;seeSection 1996). The delay-and-add process introduces temporal regularity 2.4fordetaileddescription)homologofaprototypicalT2incon- into the fine structure of the noise without appreciable changes trast to three f variants (two linear, one curvilinear) that do not inthespectrumexceptfor‘‘ripples”atfrequencyintervalswhich 0 occurintheMandarintonalspace(Krishnan,Gandour,Bidelman, are the reciprocal of the delay (Patterson et al., 1996). Increases &Swaminathan,2009).Ofthetwolinearvariants,onerepresented in temporal regularity of steady-state IRN stimuli lead to better alinearascendingramp(cf.Xu,Krishnan,etal.,2006),theother,a temporally-locked neural activity in auditory structuresfrom the tri-linear approximation of T2, preserving the major points of cochlearnucleustocortex(Bendor&Wang,2005;Griffiths,Buchel, inflection besides onset and offset. The curvilinear variant was a Frackowiak, & Patterson, 1998; Griffiths et al., 2001; Patterson polynomialflipofT2.Nogroupdifferencesinpitchstrengthwere etal.,2002;Shofner,1999).TheIRNalgorithmwasrecentlygener- observedforanyofthesevariants.Theabsenceoflanguagegroup alizedtoallowmultipletimedependentdelaysoverarangeofiter- effectsinresponsetocurvilinear,inadditiontolinearvariantsof ationsteps(Denham,2005),makingitpossibletodetectpitchin T2,emphasizesthatlanguage-dependentneuroplasticityatthele- dynamicIRNby humans.WeappliedthisdynamicIRN algorithm velofthebrainstemextendsonlytothosepitchpatternsthatactu- togeneratetime-variant,curvilinearpitchcontoursrepresentative allyoccurintheMandarintonalspace. ofthosethatoccurinnaturalspeech,specifically,polynomialmod- elingofMandarinTone2(Swaminathanetal.,2008a,2008b). 2.4.CanweextractFFRsfromnonspeechhomologsofecologically Temporalandspectralpropertiesofthetime-variantcurvilinear representativeMandarintones? risingtonestimulus(leftpanels)andelectrophysiologicalresponse atthebrainstem(rightpanels)atahighiterationstepareshownin Earlystagesofprocessingontheinputsidemayperformcom- Fig.4.Waveformperiodicity(1strow)andresolutionofcurvilinear putations on the acoustic data that are relevant to linguistic spectralbands(2ndrow)aremarkedlyimprovedatthehighiter- dimensionsevenwhenembeddedinanonspeechcontext.Indeed, ation step, as comparedto a low iteration step (cf. Swaminathan perceptualstudiesoftoneperceptionhavedemonstratedthatthe etal.,2008a,2008b),inbothstimulusandFFRresponse.Theauto- effectsoflinguisticexperiencemayextendtononspeechprocess- correlogramsshowclearerbands(black)oftemporalregularityin ingundercertainstimulusandtask(discrimination,identification) thestimulusandphase-lockedactivityintheFFRresponseatthe conditions(Bent,Bradlow,&Wright,2006;Luo,Boemio,Gordon,& highiterationstep(3rdrow). Poeppel,2007).Inthebrainstem,FFRspermitustoindexpitchpro- SuchdynamicIRNstimulienableustoevaluatethesensitivity cessing in a nonspeech context minus any volitional memory or ofFFRresponsestospeech-likepitchcontoursinaparametrically attention demands. Auditory brainstem responses (ABRs) have controllablewaywithoutlexical-semanticconfounds.Ourviewis beenshowntoreflectlargelyseparateneuralprocessesinresponse that although cross-language differences in FFR responses may tononspeech(broadspectrumclickofshortduration)andspeech emerge from language experience, the effects of such experience (temporallycomplexsyllableoflongerduration)stimuli(Song,Ba- are not necessarily specific to speech perception. FFR responses nai,Russo,&Kraus,2006).Whereasaclickvs./da/arenotequiva- to IRN stimuli modeled after homologous pitch features in the lent in acoustic complexity, the task before us is to create speech context are expected to show that neural mechanisms at nonspeechstimulithatareequivalenttospeechintermsofpitch the brainstem are targeting particular features of speech rather attributes. thanspeechperse. OurpreviousFFRfindingsbasedonspeechstimulileadustothe questionofwhetherlanguage-specificityintheencodingofpitch atthelevelofthebrainstemappliestolinguistically-relevantpitch changes that are devoid of concurrent segmental information, as when pitch patterns associated with lexical tones are presented inanonspeechcontext,thusblockingaccesstothementallexicon. To answer this question,we must eliminateany potential lexical bias for native listeners. That is, we need to be able to generate auditorystimuliinanonspeechcontextthatpreservethepercep- tionofpitch,butdonothavewaveformperiodicityorhighlymod- ulatedstimulusenvelopesthatarecharacteristicofspeechstimuli. If we are to simulate human speech perception, it is imperative that we be able to generate homologous pitch contours that are ecologicallyrepresentativeofwhatoccursinnaturalspeech.From theperspectiveofauditoryneuroethology(Suga,1994;Sugaetal., 2003), this would enable us to investigate neural mechanisms underlying the processing of pitch contours that are of linguistic relevancecomparabletothoseunderlyingtheprocessingofbehav- iorally relevant sounds in other nonprimate and nonhuman pri- mateanimals. Suchauditorystimuliexistintheformofiteratedripplednoise (IRN), which preserves the temporal regularity of the stimulus withouthavingtorepeatthewaveforminaperiodicmanner.An IRN stimulus is generated using a broadband noise which is de- layed and added to it repeatedly, and therefore does not have a prominentmodulatedenvelope(Patterson,Handel,Yost,&Datta, 1996;Yost,1996).Theperceivedpitchcorrespondstotherecipro- calofthedelay,andthepitchsalienceincreaseswiththenumber Fig.4. Waveform,spectrogram,andautocorrelogram,inorderfromtoptobottom, of iterations of the delay-and-add process. In its simplest form, ofthecurvilinearIRNhomologofMandarinTone2(leftpanels)andelectrophys- steady-state stimuli are constructed by adding broadband noise iological response at the brainstem (right panels) for a high iteration step. iterativelytoitselfwithaconstantdelay(Yost,Patterson,&Sheft, (Swaminathanetal.,2008a,2008b,adaptedwithpermissionofIEEE.) A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 141 2.5.DoFFRstoIRNhomologsofMandarintonesvaryasafunctionof whole,f contoursderivedfromtheFFRwaveformsoftheChinese 0 languageexperience? groupmorecloselyapproximatethoseoftheoriginalIRNstimuli. Cross-languagecomparisonsshowedthatcross-correlationcoeffi- Wehypothesizedthatpitchrepresentationinthebrainstemin cients were significantly larger in the Chinese than the English responsetoIRNhomologsofthefourMandarintones,asreflected group for T2, T3, and T4, but not T1. FFR spectral variations are byFFRs,wouldbemorerobustinnativespeakersofMandarinChi- stronglyrepresenteduptothe5thharmonicintheChinesegroup, nese as compared to monolingual English speakers who had no butonlyuptothe3rdor4thharmonicintheEnglishgroup.The prior knowledge of Mandarin or any other tone language (Krish- betterrepresentation ofmultiplepitch-relevantharmonicsin the nan,Swaminathan,etal.,inpress).Pitchstrength(wholecontour, Chinese group points to more accurate pitch tracking as well as 250ms;40mssegments)andpitchtrackingaccuracy(wholecon- strongerpitchrepresentations. tour) were extracted from the FFRs using autocorrelation algo- FFR pitch strength, as measured by the average magnitude of rithms. Narrow band spectrograms were used to extract spectral thenormalizedautocorrelationpeakperlanguagegroup,isshown information. forsixtonalsectionswithineachofthefourIRNhomologsofMan- Fig. 5 (left panels) shows that overall pitch tracking, as mea- darintones(Fig.5,rightpanels).PitchstrengthvaluesfortheChi- sured by the time lag associated with the autocorrelation maxi- neseandEnglishgroups,respectively,appearaboveandbelowthe mumperlanguagegroup,islessvariablefortheChinese(dashed f contour.InT2andT3,pitchstrengthoftheChinesegroupwas 0 line) comparedto theEnglish group(dotted line). That is,on the greater than the English group in three out of six tonal sections Fig.5. PitchtrackingaccuracyofIRNhomologsofMandarintones(left)andpitchstrengthoftonalsections(right)derivedfromthegrandaveragedFFRwaveformsof ChineseandEnglishsubjects.ThefourMandarintonalcategoriesarerepresentedasT1,T2,T3,andT4.LeftpanelsshowthattheFFR-derivedf0contoursoftheChinesegroup (dashedline)morecloselyapproximatethoseoftheoriginalIRNstimuli(solidline)whencomparedtotheEnglishgroup(dottedline).Rightpanelsshowthatthepitch strengthoftheChinesegroup(valueabovethesolidline)isgreaterthanthatoftheEnglishgroup(valuebelowthesolidline).Verticaldottedlinesdemarcatesix40-ms sectionswithineachf0contour:5–45,45–85,85–125,125–165,165–205,and205–245.SectionsthatyieldedsignificantlylargerpitchstrengthfortheChinesegrouprelative toEnglishareunshaded;thosethatdidnotareshadedingray.(Krishnan,Swaminathan,etal.,inpress,withpermissionofMITPress.) 142 A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 (Fig.5,rightpanels,unshaded).InT1andT4,pitchstrengthofthe comes support the view that at early stages of brain processing, Chinese group was greater than the English group in all tonal neural mechanisms underlying speech perception are shaped by sections. particulardimensionsofpitchpatternsregardlessofthestimulus A discriminant analysis was used to determine the extent to contextinwhichtheyareembedded. which individual subjects can be classified into their respective languagegroupsbasedonaweightedlinearcombinationoftheir 2.6.Dospeechstimuliinducemorerobustneuralphase-locking,as pitchstrengthofthree40-mstemporalintervalsthatweremaxi- reflectedintheFFR,thannonspeechstimuli(IRN)withdegraded mally differentiated in terms of slope (flat, rising, falling). About periodicity? 83%ofsubjectswerecorrectlyclassifiedintotheirrespectivelan- guagegroups.ThegroupcentroidoftheChinesegroupwaslarger At an early preattentive ‘subcortical’ stage of processing, FFRs than that of the English. Univariate tests of pitch strength con- elicitedinresponsetoMandarintonesrevealsmootherpitchtrack- firmedthatmore dynamicchangesin pitch(rising, falling)had a inginnativevs.nonnativelisteners,nomatterthecontext,speech greaterinfluenceontheFFRresponsesoftheChinesegrouprela- or nonspeech (Krishnan, Swaminathan, et al., in press; Krishnan tive to the English, whereas less dynamic changes in pitch (flat) etal.,2005).Bymeasuring‘pitchstrength’,peakofautocorrelation didnotyieldalanguagegroupeffect. function,weareabletofocusonindividualsectionsofpitchcon- Pitchstrengthoftherisingf trajectorywasthemostimportant tours. In our next cross-language study (Swaminathan et al., 0 in discriminating listeners by language affiliation. Both psychoa- 2008a,2008b),FFRswererecordedfromChineseandEnglishpar- coustic(Collins&Cullen,1978;Klatt,1973;Nabelek,1978;Scho- ticipantsinresponsetofourMandarintonalcontourspresentedin uten,1985)andphysiologicstudies(Krishnan&Parkinson,2000; aspeechandnonspeechcontext.Thismadeitpossibletocarryout Shore&Nuttall,1985;Shoreetal.,1987)indicatebettersensitivity adirectcomparisonbetweenspeechandnonspeech(IRN)stimuli forrisingvs.fallingtones.Multidimensionalscalinganalysesshow todeterminewhetherthebrainstemmechanisms responsiblefor thattheperceptualdimensionrelatedtodirectionofpitchchange extractingpitchinformationaresusceptibletostimulusdegrada- isspatiallydistributedprimarilyintermsofrisingvs.nonrisingf tion. Another objective was to determine whether experience- 0 movements (Gandour, 1983; Gandour & Harshman, 1978). In a dependent neural mechanisms for pitch representation in the previous FFR study (Krishnan et al., 2004), pitch strength of T2 brainstem are sensitive to specific time-varying features of pitch and T3, averaged across the stimulus, is greater than that of T1 contoursthatnativespeakersofatonelanguagearefamiliarwith andT4.BothT2andT3containsteeperrisingpitchsections.This regardlessofcontext. response asymmetry in FFRs presumably reflects greater neural Asexpected,giventherelativelylessrobusttemporalperiodic- synchrony(Shore&Nuttall,1985)andmorecoherenttemporalre- ity in the stimulus waveform, pitch strength was observed to be sponsepatternstorisingthantofallingtones(Shoreetal.,1987). greater for speech than nonspeech stimuli for both English and PerhapsthegreaterweightingtowardsrisingpitchfortheChinese Chinese listeners. Notwithstanding, dynamic IRN stimuli do pre- groupinthisstudyreflectsanexperience-dependentenhancement serve fine-grained measures of pitch representation at the level ofbothneuralsynchronyandtemporalresponsepatterncoherence ofthebrainstem. amongtheneuralelementsgeneratingtheFFR. Regardlessofcontext,pitchstrengthoftheChinesegroupwas Whichtime-varyingfeaturesordimensionsofthelexicaltones greaterthanthatoftheEnglish(Fig.6).However,groupdifferences arearguablymorerelevanttopitchprocessingatthelevelofthe inpitchstrengthwerenotuniformthroughoutthedurationofFFR brainstem?Asreflectedbythepositivecorrelationbetweenpitch responses to either speech or their IRN homologs. In some tonal strengthandaccelerationacrosstones,weproposethatthedegree sections that have rapid changes (e.g., T4, S4; speech), the two ofaccelerationordecelerationisacriticalvariablethatinfluences language groups did not differ in pitch strength. In others that pitchextractionintherostralbrainstem.Asreflectedbythenum- arerelativelysmooth(e.g.,T1,S3;speech),theydid.Nonetheless, ber of sections of maximal acceleration/deceleration within each weinferthatneuralmechanismsinthebrainstemarenotrespond- tone itself, language experience is observed to have an influence ing to lexical tones per se, but rather to specific time-varying onpitchstrengthprimarilyinthosetonalsectionsexhibitinghigh- acoustic properties of the input stimuli. Moreover, the degree of erdegreesofaccelerationordeceleration.Thesefindingsarecon- acceleration(e.g.,T3,S5)anddeceleration(e.g.,T4,S5)ofthepitch sistent with speech production data showing that f patterns in trajectories seems to be a critical factor. We hypothesize that 0 Mandarinhaveagreateramountofdynamicmovementasafunc- cross-language differences in the sustained phase-locked activity tionoftimeandnumberofsyllablesthanthoseinEnglish(Eady, ofthebrainstemreflectanenhancementofselectivitytopitch-rel- 1982). This experience-dependent effect, however, occurs in the evantperiodicitiesthatcorrespondwithrapidlychangingdynamic brainstemonlywhenstimulireflectcurvilineardynamiccontours segmentsofthepitchcontour. representative of Mandarin tones as opposed to linear dynamic approximations(cf.Xu,Krishnan,etal.,2006). 3.Theoreticalframeworkforbrainstem,pitch,andlanguage Overallwehavedemonstratedthatexperience-dependentneu- ralmechanismsforpitchrepresentationatthebrainstemlevelare 3.1.Neuralmechanismsunderlyingexperience-dependentpitch sensitive to specific dimensions of pitch contours that native enhancementinthebrainstem speakers of a tone language are frequently exposed to in natural speech.Theysuggestthatbrainstemneuronsaredifferentiallysen- Long-termexperiencewithlanguageandmusic(Krishnanetal., sitivetochangesinpitchwithoutregardtothecontextinwhich 2005; Musacchia, Sams, Skoe, & Kraus, 2007; Wong et al., 2007) they are presented. We infer that the role of the brainstem is to andshort-termauditorytraining(Hayes,Warrier,Nicol,Zecker,& facilitate cortical level processing of pitch-relevant information Kraus,2003;Russoetal.,2005;Song,Banai,etal.,2008;Song,Skoe, by optimally capturing those dimensions of the auditory signal et al., 2008) can influence neural encoding in the brainstem. We thatareoflinguisticrelevance. herebyproposeanempirically-driventheoreticalframeworktoac- Byfocusingontonalsectionsinsteadofthewholetone,weare countforourdatashowingexperience-dependentpitchrepresen- abletodeterminewhetherlanguage-dependenteffectsarebetter tationinthebrainstem.Thisframeworkhasthreecomponents:(i) conceptualizedasapplyingthroughoutthetonalcontour,oralter- afunctionalconnectivitymodelthatconsiderstheIC,MGB(medial natively, as applying to sections that exhibit certain acoustic geniculate body) and the AC (auditory cortex) as an ipsilateral dimensionsirrespectiveoftonalcategory.Suchexperimentalout- dominantfunctionalunitwithdominantinputsfromthecontralat- A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 143 Fig.6. PitchstrengthoftonalsectionsderivedfromtheFFRwaveformsinresponsetospeech(left)andnonspeech(right)stimuliforthetwolanguagegroups.Thefour MandarintonalcategoriesarerepresentedasT1,T2,T3,andT4.Consistentacrossspeechandnonspeechstimuli,pitchstrengthoftheChinesegroup(valueabovethesolid line)isgreaterthanthatoftheEnglishgroup(valuebelowthesolidline).SectionsthatyieldsignificantlylargerpitchstrengthfortheChinesegrouprelativetoEnglish (unshaded)arethosesectionswhichgenerallyexhibitlargeraccelerationordecelerationvalues.SeealsocaptionofFig.5.(Swaminathanetal.,2008a,2008b,withpermission ofLippincottWilliams&Wilkins.) eral ear; (ii) a dominant ipsilateral corticocollicular component Schonwiesner,Krumbholz,Rubsamen,Fink,&vonCramon,2007). that presumably facilitates experience-dependent reorganization Asfarasweknow,thereareasofyetnodataonexperience-depen- of pitch mechanisms in the brainstem during critical periods of denteffectsoneardominanceatpreattentivesubcorticalstagesof speech and language development to optimally encode linguisti- tonalprocessing.Ifelectrophysiologicalactivityatthebrainstemis cally-relevantpitch;and(iii)atemporallybasedpitchmechanism drivenbyacousticsonly,thenullhypothesisisthateffectsshould in the IC. This framework can also be expanded to incorporate be homogeneous across listeners regardless of language experi- independent local mechanisms subsequent to reorganization by ence.AnalternativehypothesisisthatearasymmetryintheICis corticofugalinfluence. alsoinfluencedbylanguageexperience,andthattheireffectswill begreaterthanthosepredictedbystructuralasymmetryalone. 3.2.Theroleofcorticofugalpathwaysinexperience-dependentpitch LetusillustrateacaseoffunctionalasymmetryinICresponses encoding elicitedbyREstimulation,shownbythickerredlinescomparedto the LE (Fig. 7). For linguistically-relevant sounds, this functional Thehumanauditorypathway(Fig.7)iscontralateraldominant asymmetryisinitiallyreinforcedbycorticofugalinput(CF )topro- L suchthatthelateralizedresponseintheleftauditorycortex(AC) motereorganizationintheICforlanguageacquisition.LEstimula- L directlyreflectsthatintheMGB andIC andisdominatedbystim- tion may still yield stronger responses for native relative to L L ulationofthecontralateralear(Langers,vanDijk,&Backes,2005; nonnative listeners because of the weaker contralateral (CF and C 144 A.Krishnan,J.T.Gandour/Brain&Language110(2009)135–148 claimed to induce large scale frequency-specific plasticity in the loop. Top–downguidedplasticitymayalsobemediatedbymemory- basedtheories.Ithasbeenproposedthatsensoryspecificmemory guidesplasticity,insteadofhigher-levelmemoryfunctionsmedi- ated in prefrontal and parietal regions, to enhance encoding at thesensorylevelandincreasetheprobabilityofcreatingaccurate sensory memory traces (Pasternak & Greenlee, 2005). Long-term storedepisodesofnativepitchpatterns(cf.pitchtemplates)may also guide plasticityby engaging thecorticofugal pathways (Gol- dinger, 1998). Thus, the corticofugal system is triggered during learningtoenhancesensoryencodingofspecificdimensionsthat arebehaviorallyrelevant. Whilethecorticofugalsystemmaybecruciallyinvolvedinthe experience-driven reorganization of subcortical neural mecha- nisms,itisnotnecessarilydedicatedtomaintenanceoflong-term, permanent on-line subcortical processing. We already know that thecorticofugalsystemcanleadtosubcorticalegocentricselection of behaviorally relevant stimulus parameters in nonprimate and nonhuman primate animals (Suga & Ma, 2003; Suga et al., 2000, 2003).Inthecaseofhumans,thecorticofugalsystemlikelyshapes the reorganization of the brainstem pitch encoding mechanisms Fig. 7. Anatomic model of the auditory pathway illustrating asymmetry in IC responses favoring RE stimulation, shown by the relatively thicker red lines forenhancedpitchextractioninearlierstagesoflanguagedevelop- compared to LE stimulation (blue lines). For linguistically-relevant sounds, this ment when plasticity presumably would be most vigorous (Keu- functionalasymmetryisinitiallyreinforcedbytheCFLinputtopromotereorga- roghlian & Knudsen, 2007; Kral & Eggermont, 2007). Once this nizationintheIC forperceptuallearning.LE,leftear; RE,rightear; IC,inferior reorganizationiscompleteoverthecriticaldevelopmentalperiod, colliculus(brainstem);subscriptsLandR,leftandright,respectively;MGB,medial localmechanismsinthebrainstemaresufficienttoextractlinguis- geniculate body (thalamus); AC, (primary) auditory cortex; CF, corticofugal; subscript C, contralateral. At the level of the IC and AC, the solid black double tically-relevantpitchinformationinarobustmannerwithoutfur- arrowindicatesthatpitchinformationmaybetransmittedcontralaterallyineither ther corticofugal interaction. Importantly, however, we would direction. continuetoexpectcorticofugalinfluencetoplayanimportantrole underavarietyof challenging auditorycircumstances(e.g., noise degradation, learning novel stimuli, retraining of hearing-im- IC )inputstotheIC .Linguistically-relevantsoundsmayalsotrig- paired,andadaptationtounfamiliar‘accents’). C R ger the relatively smaller CF influence. In contrast, nonnative Severallinesofevidencecanbeadducedinfavoroflocalmech- R listenersmayexhibitafunctionalasymmetryfavoringLEstimula- anisms in the brainstem as mediating the experience-dependent tion,shownbytherelativelythickerbluelinescomparedtotheRE. plasticityobservedinourcross-languageFFRdata.First,thecorti- This latter asymmetryeffect is less than that observedfor native cofugalegocentricselectionisshort-term,andtakestime(latency) listenersbecausethereisnoexperience-dependentreorganization tobeactivated,whereastheFFRresponselatencyisonlyabout6– in the IC affected by the CF to enhance pitch representation. 8ms.EvenconsideringthesustainedportionofourFFRresponse R R Thus, we see how corticofugal mechanisms can facilitate experi- (260mstimewindowforourstimuli),theslowertimeconstants ence-dependent reorganization for pitch processing in the ofadaptiveplasticity(Dean,Robinson,Harper,&McAlpine,2008) brainstem. and the medial olivary cochlear bundle (MOCB) reflex (Backus & Other theoretical schemes have been proposed that invoke Guinan, 2006) are much too sluggish to effectively influence the corticofugalinfluencetomodulatesubcorticalsensoryprocessing. dynamic pitch pattern over its entire duration. Moreover, the The reverse hierarchy theory (RHT) provides a representational MOCBreflexwouldrequireustoinvokebottom–upprocesses,in- hierarchy to describe the interaction between sensory input and steadoftop–down,thatcaninfluencetheneuralplasticityseenat top–down processes to guide plasticity in primary sensory areas the IC level. Secondly, plasticity in the IC of mice persists after (Ahissar & Hochstein, 2004; Nahum, Nelken, & Ahissar, 2008; deactivation of the corticofugal system (Yan, Zhang, & Ehret, Shamma, 2008). Its basic claim is that neural circuitry mediating 2005).Infactithasbeendemonstratedinanimalmodelsthatonce acertainperceptcanbemodifiedstartingatthehighestrepresen- tuning is well-established with corticofugal modulation, local tational level and progressing to lower-levels in search of more mechanismscanmaintaintheplasticitywithoutpermanentcorti- refined high resolution information to optimize perception. RHT cofugalinfluence(Gao&Suga,1998;Ji,Gao,&Suga,2001).Thirdly, has been invoked as a plausible explanation for top–down influ- andamorecompellingargumentinfavorofalocalmechanismis encesonsubcorticalsensoryprocessing(Banaietal.,2007). the absence of a cross-language FFR effect at the brainstem level Another proposed circuitry for mediating learning-induced elicitedbylinear,orevennonnativecurvilinear,f contours(Krish- 0 plasticity is the cortico–colliculo–thalamo–cortico-collicular nan,Gandour,etal.,2009;Krishnan,Swaminathan,etal.,inpress; (CTCC)loop(Xiong,Zhang,&Yan,inpress).Thisframeworkcon- Krishnanetal.,2005;Xu,Krishnan,etal.,2006).Indeed,multidi- tains bottom–up (colliculo-thalamic, thalamo-cortical) and top– mensionalscalingdatashowthatdimensionsunderlyingtheper- down (corticofugal) projections, forming a tonotopic CTCC loop ception of linear f contours, similar to those in Xu, Krishnan 0 thatisassumedtobetheonlyneuralsubstratecarryingaccurate etal.(2006),areweighteddifferentiallyasafunctionoflanguage auditoryinformation.TheCTCCmodelissimilartoourfunctional experience(Gandour,1983;Gandour&Harshman,1978).Cortical connectivitymodel,bothinstructureandproposedfunction.Addi- modulationofthebrainstemresponsewouldhaveledustoexpect, tionally,the CTCC loop incorporates severalneuromodulatory in- contrarytofact,differentialFFRresponsestothelinearf contours 0 puts forming a core neural circuit mediating sound-specific between Chinese and English listeners. These observations to- plasticityassociatedwithperceptuallearning.Auditorystimuliin- getheraremoreconsistentwiththeviewthattheoperationofthis duce activation of the loop; and neuromodulatory inputs are language-dependentencodingscheme,presumablyinducedbyna-
Description: