Individual Differences in Children’s Perception of Foreign-Accented Speech: The Role of Temperament

Undergraduate Research Thesis

Presented in Partial Fulfillment of the Requirements for graduation with honors research distinction in Speech and Hearing Science in the undergraduate colleges of The Ohio State University

by
Sarah Mabie

The Ohio State University
May 2015

Project Advisor: Professor Rachael Frush Holt, Ph.D., Department of Speech and Hearing Science

Abstract

Individual differences in children’s speech perception are large and have been partially attributed to differences in executive function (e.g., Lalonde & Holt, 2014), but much unexplained variability remains. Temperament is a potential influence that has been studied in related fields, such as stuttering (Eggers et al., 2010), but has been ignored as a contributing factor in children’s speech perception. We investigated the influence of temperament on individual differences in children’s perception of foreign-accented speech. Eighty-four 5- to 7-year-old monolingual English-speaking children were presented with 60 English sentences produced by either a native English talker or a native Mandarin talker (Wildcat Corpus; Van Engen et al., 2010) embedded in multi-talker babble at +8 dB SNR. For 30 sentences, the final (target) word was highly predictable from sentence context and for the other 30, it was not; the same final words appeared in both predictability conditions. Temperament was assessed with the very short form of the Children’s Behavior Questionnaire (Putnam & Rothbart, 2006), which was completed by each child’s parent/caregiver. Semantic context was of similar benefit to children in both the native and foreign-accent conditions. Children who scored high on Surgency (scales of positive emotion, reflecting a tendency to enjoy high-intensity activities) had poorer word recognition in both predictability conditions.
These results preliminarily suggest that temperament contributes to individual differences in children’s speech perception in that children who desire a high level of activity tend to perform more poorly on difficult speech perception tasks.

Keywords: foreign-accented speech, child temperament, speech perception

I. Introduction

On a daily basis, listeners must compensate for variability in speech signals. This variability has been attributed to idiolect differences, positional effects, speaking rate differences, and coarticulatory effects (Bent & Holt, 2013). Compared to native speech, foreign-accented speech differs in both segmental and suprasegmental domains (Bent & Holt, 2013). Segmental variability is displayed in phoneme additions, distortions, substitutions, and omissions (Bent & Holt, 2013), and can be based upon the phonological constraints of a speaker’s native language (Adank, Evans, Stuart-Smith, & Scott, 2009). For example, Japanese learners of English often have difficulty discriminating between /l/ and /r/ because Japanese does not contain this phonemic distinction. Suprasegmental variability includes differences in word stress, intonation patterns, and rhythm (Adank et al., 2009; Bent & Holt, 2013). When listening to a nonnative speaker, listeners must compensate for both segmental and suprasegmental variability (Adank et al., 2009). The result is that adults’ recognition of foreign-accented speech is poorer than their recognition of native speech, especially in background noise (Adank et al., 2009; Munro & Derwing, 1995; Rogers, Dalby, & Nishi, 2004). For example, Munro and Derwing (1995) reported that native English listeners make more speech recognition errors and show longer response times when listening to nonnative speakers of English.

Background Noise Effects

The presence of background noise is particularly detrimental to the perception of foreign-accented speech (Adank et al., 2009). Rogers et al. (2004) found that adverse listening conditions disproportionately impaired the perception of foreign-accented speech when they compared the intelligibility of native English sentences and Mandarin-accented English sentences. Sentences were presented in quiet and at three signal-to-noise ratios (SNRs): +10 dB, 0 dB, and -5 dB. Intelligibility was measured as the proportion of correctly identified target words. Whereas the addition of noise was detrimental in both the native and nonnative conditions, noise was substantially more detrimental to the perception of Mandarin-accented sentences than to the perception of native English sentences. Increasingly poor SNRs also degraded intelligibility more in the nonnative condition than in the native condition.

Adank et al. (2009) found a similar influence of background noise on native dialectal variations. When sentences in the Glaswegian English (GE) dialect of Scotland were presented in moderately adverse listening conditions of +3 dB and 0 dB SNR, speakers of the Standard English (SE) dialect of England were slower to give correct responses than they were for SE dialect sentences (SE speakers self-reported little to no experience with GE speakers prior to the study). This delay in response reflects additional processing costs placed on the listener when compensating for dialectal differences.

Developmental Effects

Age (or experience) also benefits the perception of foreign-accented speech. Bent (2014) found that adults perform better than children [4;0 to 7;7 (years;months)] on native and foreign-accented word identification tasks. In this experiment, listeners in both age groups heard either a native speaker of American English or a native speaker of Korean.
Within each condition, half of the words were lexically “easy” and half were lexically “hard.” Lexical difficulty was defined by word frequency and neighborhood density according to the Neighborhood Activation Model of spoken word recognition (Luce & Pisoni, 1998). Sentences were embedded in a speech-shaped noise masker at a +5 dB SNR. Participants were asked to repeat the word they heard. Results showed that adults performed better than children, stimuli produced by the native speaker were correctly identified more often than stimuli produced by the nonnative speaker, and the lexically easy words were identified correctly more often than the lexically hard words. An interaction between lexical difficulty and age was also significant: adults showed a larger advantage for lexically easy over lexically hard words than child listeners did. In addition, Bent (2014) found that children’s perception of foreign-accented speech improved with age. This suggests that foreign-accented speech perception improves during this developmental time period.

Context Effects

Another area of interest is whether listeners benefit from semantic context when listening to a nonnative speaker, specifically in comparison to their perception of a native speaker. It is easier to identify words in a sentence if they are semantically predictable from the preceding context (Duffy & Giolas, 1974; Kalikow, Stevens, & Elliot, 1977). For example, identification of the word ‘cake’ is easier in a sentence like “He blew out the candles on the birthday cake” than in “He talked about the cake.” The semantic context effect is reduced, however, at poor SNRs (Kalikow et al., 1977). Clopper (2012) reported that, for adult listeners, high-predictability target words at the end of English sentences presented in noise were more intelligible than low-predictability target words. This shows that a poor SNR did not completely eliminate listeners’ reliance on semantic cues.
However, when the talker dialect was less familiar, the listeners relied less on semantic information (Clopper, 2012). According to the cue-weighting model of speech perception (Mattys, White, & Melhorn, 2005), different cues are used in easy and hard listening situations. For instance, Mattys et al. (2005) found that in easy listening conditions, listeners rely on lexical and semantic cues more than segmental cues, whereas in difficult listening conditions, listeners rely more on segmental cues than lexical and semantic ones.

In addition, Bradlow and Alexander (2008) found that the benefit of semantic context differs between native and nonnative English-speaking listeners. English sentences produced in either plain or clear speech, with the final target word either predictable or not predictable from context, were presented in noise to native and nonnative adult English listeners. Results revealed that, whereas native listeners benefitted from the acoustic and semantic enhancements separately and in combination, nonnative listeners’ word recognition only improved when both enhancements were available. The results of this study suggest that whereas native and nonnative listeners apply similar strategies for speech-in-noise perception, nonnative listeners require more favorable signal clarity in order for contextual information to be of benefit.

Recent work has investigated children’s use of semantic cues in degraded listening conditions for native speech. Fallon, Trehub, and Schneider (2002) compared 5-year-olds’, 9-year-olds’, and adults’ accuracy in identifying final (target) words in high- and low-context sentences at various levels of background noise. Low-noise and high-noise SNRs were set for each age group, with the SNRs made 3 dB harder for each successively older group. Listeners were assigned to one of the two noise conditions. Overall, Fallon et al. (2002) reported that all listeners, regardless of age, identified the target words more accurately in the high-context sentences than in the low-context sentences, and more accurately in the lower noise conditions than in the higher noise conditions. Whereas 5-year-olds performed more poorly than the older participants, they still benefitted from context in the presence of background noise. This suggests that noise does not impede children’s use of contextual cues. A goal of the current study is to extend these results by examining context effects for foreign-accented English sentences in 5- through 7-year-old normal-hearing children.

Individual Differences: Temperament

Children encounter speakers with nonnative accents frequently, so it is important to understand what influences their ability to perceive this phonologically and phonetically varying speech signal. One possible contributing factor that has not been explored is temperament. Temperament is defined as “constitutionally based individual differences in reactivity and self-regulation, in the domains of affect, activity, and attention” (Rothbart & Bates, 1998, p. 100). ‘Constitutionally based’ refers to temperament being inherent to an individual from birth; that is, it has a biological basis (Eggers et al., 2010). ‘Reactivity’ refers to an individual’s responsiveness to changes in the environment, and can be measured by the threshold, intensity, and latency of a response and by its rise and recovery time (Rothbart & Bates, 1998; Zetner & Bates, 2008). This includes an individual’s response to fear and negative emotionality. ‘Self-regulation’ is an individual’s ability to control and modulate reactivity (Rothbart & Bates, 1998). Temperament is seen as the core of personality (Rothbart & Bates, 1998), and can be modified by heredity, maturation, and experience (Eggers et al., 2010; Rothbart, Ahadi, Hershey, & Fisher, 2001; Strelau, 1983).
Traits are not continuously expressed, but rather are elicited by appropriate conditions (Rothbart & Bates, 1998). Rothbart and Bates (1998) proposed that temperament traits show consistency over time, but even stable traits can change over time in the way they are expressed. For example, a 6-year-old spends less time crying than a 6-month-old, but spends more time worrying (Rothbart & Bates, 1998).

Temperament contributes to the development of social-emotional and personality profiles (Rothbart & Bates, 1998). Some children may be more responsive to reward, while others are more responsive to punishment, implicating temperament in social learning. Coping strategies are also developed under the influence of temperament (Rothbart & Bates, 1998).

According to the Neural Model of Temperament developed by Rothbart and Bates (1998), temperament comprises three broad factors: Surgency, Negative Affect, and Effortful Control. Surgency is part of the reactivity domain and contains scales of positive emotionality such as approach, high intensity pleasure, and activity level, as well as shyness, which loads negatively (Eggers et al., 2010). Surgency reflects a child’s tendency to approach new situations in a positive emotional state (Eggers et al., 2010). Negative Affect is the second factor in the reactivity domain and consists of scales of negative emotion including fear, discomfort, anger/frustration, and sadness, as well as soothability, which loads negatively (Eggers et al., 2010). The self-regulation domain consists of the factor Effortful Control, which comprises scales of attentional focusing, attentional shifting, and inhibitory control (Eggers et al., 2010). Effortful Control is an individual’s ability to regulate her/his attention and to inhibit dominant responses in favor of subdominant responses (Eggers et al., 2010; Zetner & Bates, 2008).
Children who have higher loadings on the approach scales of Surgency will be more open to meeting strangers than children with higher loadings on the fear scale of Negative Affect, who instead develop strategies to avoid strangers (Rothbart & Bates, 1998).

Temperament has been studied in related fields as a potential influence on speech and language development and differences. Eggers et al. (2010) found significant differences between typically developing children and children who stutter in the composite factors of Negative Affect and Effortful Control when using the Dutch version of the Children’s Behavior Questionnaire (Van den Bergh & Ackx, 2003). Children between the ages of 3 and 8 years who stuttered had lower scores on scales of inhibitory control and attentional shifting, and had higher scores on scales of anger, frustration, approach, and motor activation compared to age- and gender-matched peers who were typically developing. Salley and Dixon (2007) found that scoring low on scales of executive control and high on negative affect was correlated with language development in 51 infants aged 21 months, using the Early Childhood Behavior Questionnaire (Putnam, Gartstein, & Rothbart, 2006) and the MacArthur-Bates Communicative Development Inventory, Words and Sentences version (Fenson, Dale, Reznick, Thal, & Pethick, 1994), which is a vocabulary measure. Temperament has also been studied as a possible influence on the development of psychopathological disorders. Bijittebier and Royers (2009) presented evidence that all three temperament domains (Surgency, Negative Affect, and Effortful Control) play a role in the onset, development, and maintenance of anxiety disorders.
Purpose and Hypotheses

The primary purpose of this study of children’s word-in-sentence recognition of foreign-accented speech in background noise was to investigate the role that child temperament plays in individual differences in the perception of foreign-accented speech. We hypothesized that children with better native and nonnative speech recognition abilities would score lower on the Negative Affect domain (specifically on the anger/frustration scales) compared to children with poorer speech recognition abilities. In addition, we hypothesized that children with better native and nonnative speech recognition abilities would score higher on the Effortful Control domain compared to children with poorer speech recognition abilities (specifically, higher on scales of low intensity pleasure, inhibitory control, attentional focusing, attentional shifting, and excitatory control). Because children had to focus their attention not only on the task at hand, but also on a single talker in the midst of background noise, we predicted that children with higher loadings on scales of attentional focusing, inhibitory control, and attentional shifting would perform better than children with lower loadings on these scales. The experiment took place at an interactive science center; therefore, we predicted that children who had a greater capacity to sit and complete the experiment (excitatory control) and those who found enjoyment in this low intensity, novel task (low intensity pleasure) would perform better than children with the opposite characteristics. Lastly, we hypothesized that children who achieved higher scores on the native and foreign-accent conditions would demonstrate lower scores on the Extraversion/Surgency domain (specifically lower on the scales of high intensity pleasure and shyness).
We hypothesized that children who were more outgoing (lower loadings on the shyness scale) would be more willing to interact with the researcher (a stranger) and actively participate in the experimental task. We also hypothesized that children who found less enjoyment in high-intensity activities would perform better on this physically low-intensity task.

A secondary purpose was to evaluate whether children benefited from semantic context in their perception of nonnative speech and, if so, to determine whether the size of the benefit differed between native and nonnative speech. We hypothesized that children would benefit from semantic context and that the benefit would be greater for native than for nonnative speech.

II. Method

A. Participants

This study used data from 84 monolingual 5- through 7-year-old children (42 males and 42 females) with normal parent-reported speech, language, and hearing, who were recruited from the general population at the Center of Science and Industry in Columbus, Ohio. Nine additional participants were excluded from final data analysis because of significant exposure to a foreign language (n = 3), speech, language, or hearing disorders (n = 1), or inability to complete the experimental task (n = 5). The participants were stratified into three age groups: 5-year-olds (n = 28, mean age = 5;4, SD = 0;3), 6-year-olds (n = 28, mean age = 6;6, SD = 0;4), and 7-year-olds (n = 28, mean age = 7;5, SD = 0;3). An equal number of female and male participants were included in each age group. Prior to participation in the experiment, all parents/legal guardians of participants provided informed consent and children provided verbal assent. Participants were not paid for their participation. This study was approved by the Ohio State University institutional review board.

B. Stimuli