ebook img

1998 Auer, Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: An initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals.pdf (PDFy mirror) PDF

0.94 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview 1998 Auer, Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: An initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals.pdf (PDFy mirror)

Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: An initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals Edward T. Auer, Jr. and Lynne E. Bernstein SpokenLanguageProcessesLaboratory,HouseEarInstitute,2100WestThirdStreet,LosAngeles, California90057 David C. Coulter CoulterAssociates,Fairfax,Virginia (cid:126)Received 6 March 1997; revised 27 April 1998; accepted 30 June 1998(cid:33) Fourexperimentswereperformedtoevaluateanewwearablevibrotactilespeechperceptionaidthat extractsfundamentalfrequency(F0) anddisplaystheextractedF0 asasingle-channeltemporalor aneight-channelspatio-temporalstimulus.Specifically,weinvestigatedtheperceptionofintonation (cid:126)i.e., question versus statement(cid:33) and emphatic stress (cid:126)i.e., stress on the first, second, or third word(cid:33) under Visual-Alone (cid:126)VA(cid:33), Visual-Tactile (cid:126)VT(cid:33), and Tactile-Alone (cid:126)TA(cid:33) conditions and compared performanceusingthetemporalandspatio-temporalvibrotactiledisplay.Subjectswereadultswith normal hearing in experiments I–III and adults with severe to profound hearing impairments in experiment IV. Both versions of the vibrotactile speech perception aid successfully conveyed intonation. Vibrotactile stress information was successfully conveyed, but vibrotactile stress informationdidnotenhanceperformanceinVTconditionsbeyondperformanceinVAconditions. In experiment III, which involved only intonation identification, a reliable advantage for the spatio-temporal display was obtained. Differences between subject groups were obtained for intonation identification, with more accurate VT performance by those with normal hearing. Possibleeffectsoflong-termhearingstatusarediscussed. © 1998AcousticalSocietyofAmerica. (cid:64)S0001-4966(cid:126)98(cid:33)03010-0(cid:35) PACS numbers: 43.71.Ma (cid:64)WS(cid:35) INTRODUCTION Boothroyd,1995a,b(cid:33).VoiceF0 hasbeenstudiedforseveral reasons:(cid:126)1(cid:33) F0 characteristics contribute to speech percep- The substitution of the sense of touch for the sense of tion at several different linguistic levels; (cid:126)2(cid:33) voice F0 is hearing has been explored intermittently across almost the generated at the glottis, which is invisible to the entiretwentiethcentury(cid:126)Reedetal.,1993(cid:33),withtheprimary speechreader; and (cid:126)3(cid:33) the presentation of simple acoustic goal to develop portable devices to enhance speech commu- stimuli composed of pulses generated as a function of F0 nication by hearing-impaired individuals. Fundamental is- greatly enhances speechreading. Normal-hearing adults have sues continue to be: (cid:126)1(cid:33) what speech information to encode been shown to improve 40–50 percentage points over for presentation to the skin (cid:126)e.g., spectral patterns or ex- speechreading alone when they speechread with an acoustic tracted acoustic-phonetic cues(cid:33); and (cid:126)2(cid:33) how to present the F0 supplement (cid:126)Boothroyd etal., 1988; Breeuwer and information (cid:126)e.g., as mechanical or electrical stimulation; as Plomp, 1984, 1985, 1986; Grant, 1987; Rosen etal., 1981(cid:33). rate displays, or spatial displays(cid:33). To date, several different A brief inventory of the linguistic levels at which F0 tactile speech perception aids have been developed and hasbeenshowntobeafactorreadilyillustratesthefirstpoint tested, encoding a variety of sources of speech information above. The onset time of vocal fold vibration relative to su- in a variety of presentation schemes designed to enhance pralaryngeal articulator release (cid:126)i.e., voice onset time(cid:33) is a speechreading (cid:126)for reviews see Kishon-Rabin etal., 1996; primary contributor to perception of prevocalic consonantal Reed etal., 1993; Summers, 1992(cid:33). Reported enhancements voicingdistinctions(cid:126)LiskerandAbramson,1967(cid:33).F0 direc- to speechreading have generally been in the range of 4–20 tion and height at the onset of prevocalic consonants also percentage points for words correct in open-set sentence varysystematicallywiththeconsonantalvoicingdistinctions identification tasks. (cid:126)Haggardetal.,1970;HouseandFairbanks,1953(cid:33).F0 isan Onetypeofspeechperceptionaid,whichhasmotivated acoustical correlate of lexical stress (cid:126)e.g., CONvert versus afairamountofresearch,deliversextractedvoicefundamen- conVERT(cid:33) and sentential stress (cid:126)Lehiste, 1970; Bolinger, tal frequency (F0) (cid:126)Bernstein etal., 1989, 1998; Boothroyd 1958; Fry, 1955, 1958(cid:33) also referred to as accent (cid:126)Sluijter and Hnath, 1986; Eberhardt etal., 1990; Grant, 1987; Hanin and van Heuven, 1996(cid:33). Lexical stress patterns may be in- etal., 1988; Hnath-Chisolm and Kishon-Rabin, 1988; volvedintheperceptionofwordboundaries(cid:126)CutlerandBut- Kishon-Rabinetal.,1996;PlantandRisberg,1983;Rothen- terfield, 1992(cid:33). F0 also systematically signals syntactic berg and Molitor, 1979; Summers, 1992; Waldstein and structure (cid:126)Bolinger, 1978; Lehiste, 1970; Pike, 1945(cid:33) (cid:126)e.g., 2477 J.Acoust.Soc.Am.104(4),October1998 0001-4966/98/104(4)/2477/13/$15.00 ©1998AcousticalSocietyofAmerica 2477 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms the contrast between a sentence spoken as a statement or a sis of Fig. 5 in the Rothenberg and Molitor (cid:126)1979(cid:33) study question(cid:33) and can signal phrase boundaries (cid:126)Streeter, 1978(cid:33). suggests that stress judgments were 51% correct (n(cid:53)8 The present study reports on an initial evaluation of a normal-hearing individuals, chance at 33%(cid:33), and intonation newdesignforawearablevibrotactilespeechperceptionaid judgmentswereapproximately90%correct(n(cid:53)8,chanceat that employs a digital processor for the automatic extraction 50%(cid:33). of F0. The new aid was evaluated using closed-set identifi- Using a similar paradigm to Rothenberg and Molitor cation of emphatic sentential stress (cid:126)on the first, second, or (cid:126)1979(cid:33), Bernstein etal. (cid:126)1989(cid:33) tested several alternate third word of the stimulus(cid:33) and intonation (cid:126)question versus F0-to-vibrotactiletransformations.However,Bernsteinetal. statement(cid:33). The main questions were: (cid:126)1(cid:33) whether one of the employed a much larger and more varied stimulus sentence two different vibrotactile displays of F0 (cid:126)one a single- set leading to higher uncertainty. In addition, because use of channel temporal display and one an eight-channel spatio- an F0 speech perception aid would necessarily involve temporal display(cid:33) was more successful for conveying F0; speechreading under practical (cid:126)not laboratory(cid:33) conditions, and(cid:126)2(cid:33)whetherauditoryspeechperceptionexperiencewasa they compared stress and intonation identification under factor in the successful perception of F0 information. tactile-alone(cid:126)TA(cid:33),visual-tactile(cid:126)VT(cid:33),andvisual-alone(cid:126)VA(cid:33) conditions. When F0 was shifted into the range below ap- proximately 100 Hz, stress and intonation patterns were per- I. DISPLAYS ceived most accurately. Differences in display effectiveness A main issue in designing an F0 vibrotactile speech wereonlyobservedunderTAconditions.UndertheTAcon- perception aid is how to tailor the linguistically relevant dition,themosteffectivetransformationproducedsomewhat characteristics of the F0 signal to vibrotactile perceptual more accurate identification of stress (cid:126)approximately 60% characteristics. F0 varies over the range of approximately correct for the best subject and a group mean of 52%, n 70–500Hzacrosschildren,women,andmen,withindividu- (cid:53)5 normal-hearing individuals(cid:33) and somewhat less accurate als’ voices exhibiting ranges of 1–1.5 octaves (cid:126)Hess, 1983(cid:33). identification of intonation (cid:126)approximately 80% correct for Based on their research on F0, Hnath-Chisolm and Boo- the best subject and a group mean of 70%, n(cid:53)5) than ob- throyd (cid:126)1992(cid:33) suggested that 1/8-octave or better resolution tained by Rothenberg and Molitor (cid:126)1979(cid:33). is needed to effectively encode F0 tactually, but semitone (cid:126)1/12-octave(cid:33) resolution has also been suggested in the lit- erature (cid:126)Hermes and van Gestel, 1991(cid:33). B. Multichannel spatio-temporal displays Vibrotactile psychophysical experiments by Rothenberg An alternate method for solving the problem of match- etal. (cid:126)1977(cid:33) showed that the vibrotactile frequency differ- ing the acoustic F0 characteristics to the skin’s perceptual encelimenresultsinatleasttendiscriminablestepsbetween capabilities is to engage spatial perception, following the approximately 10 and 90 Hz with some additional steps, if suggestion of Kirman (cid:126)1973(cid:33) who argued that temporal pat- the range is extended to 300 Hz. Beyond 300 Hz, frequency tern perception is not a forte of the skin. Spatial resolution discrimination falls off rapidly. Therefore, a direct one-to- for touch is extremely good. Phillips and Johnson (cid:126)1985(cid:33) onetransformationfromacousticpitchperiodstovibrotactile reportedspatialresolutionofthefingerpadtobe0.7mmfor rate is unlikely to be optimal. For example, the entire range amovingstimulus.BlindreadersofBraillecanachieverates of a child’s voice (cid:126)approximately 157–444 Hz; Hess, 1983(cid:33) of 80–200 words per minute (cid:126)Foulke, 1991(cid:33), indicating po- wouldbeabovetherangeatwhichfrequencyisbestresolved tential for high information transfer rates. Spatial F0 dis- by the skin. The F0 range of adult males was even found to plays might deliver greater information than temporal dis- be too large and poorly resolved with a direct acoustic-to- plays and also result in more effective processing under vibrotactile transformation (cid:126)Bernstein etal., 1989(cid:33). cross-modal conditions. However, vibrotactile perception is also susceptible to distortions due to interactions between A. Single-channel temporal displays timeandspace(cid:126)Geldard,1985;GeldardandSherrick,1972(cid:33). Rothenberg and Molitor (cid:126)1979(cid:33) proposed to solve the Thesedistortionscouldenhanceordegradetheperceptionof problem of transforming acoustic F0 to vibrotactile F0 by speech information (cid:126)Kirman, 1973(cid:33). using frequency range compression and frequency shifting. Boothroyd and colleagues implemented a spatio- Detected voice F0 was encoded as vibrotactile pulses deliv- temporal vibrotactile F0 display scheme (cid:126)Boothroyd and ered by a single vibrator. The most favorable results were Hnath, 1986; Boothroyd etal., 1988; Hnath-Chisolm and obtained when frequency was shifted so as to center around Medwetsky, 1988; Hanin etal., 1988; Waldstein and Boo- 50Hz,withthescalefactorat1:1or1:0.75(cid:126)foramaletalker throyd, 1995a(cid:33) (cid:126)for an aid employing a purely spatial elec- whose untransformed 80-Hz range was centered at 125 Hz(cid:33). trotactile display scheme, see Grant etal., 1986(cid:33). F0 was Their experimental task was forced-choice identification of extractedusingananalogcircuitandwasencodedviavibra- thestressedwordandintonationtypeofasentence,Ronwill tion rate and vibrator location in a single-dimensional vibra- win,presentedasaseriesofvibrotactilepulses.Thesentence torarray(cid:126)Yeungetal.,1988(cid:33).Underthisscheme,frequency was spoken with emphasis on one of the three words and resolution is theoretically limited only by vibrotactile spatial with the intonation pattern of either a question (cid:126)rising(cid:33) or a resolutionandthenumberofvibrotactilestimulators(cid:126)Hnath- statement (cid:126)falling(cid:33). The results confirmed the prediction that ChisolmandKishon-Rabin,1988(cid:33).Thespatialcomponentof perception is most accurate when vibrotactile stimuli make theBoothroyddisplayusedachangeof0.16octavesininput use of the frequencies below approximately 100 Hz. Analy- frequency to cause a change in selection of the vibrator for 2478 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2478 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms activation.1 The temporal component of the display was the vision, and English as their native language. Six were ran- output of a pulse on the spatially appropriate stimulator for domly assigned to the single-channel temporal display and every other detected pitch pulse. six to the eight-channel spatio-temporal display. Subjects Hnath-Chisolm and Kishon-Rabin (cid:126)1988(cid:33) compared this were paid for their time in the experiment. spatio-temporal display with a single-channel temporal-only display of F0, which also employed the every-other-pulse scheme. Their experimental tasks included identification of B. Stimuli stress and intonation in separate conditions, and results were interpreted as showing some advantage for the spatio- temporal display (VT(cid:53)80% correct and TA(cid:53)78% correct, 1.Sentencestimuli n(cid:53)6 normal-hearing individuals(cid:33).2 However, Hnath- Foursentenceswereusedtogeneratestimuliforthisand Chisolm and Kishon-Rabin’s temporal display of intonation subsequent experiments. (cid:126)VT(cid:53)70% correct and TA(cid:53)64% correct, n(cid:53)6) failed to (cid:126)1(cid:33) We owe you a yoyo. replicate TA intonation identification accuracy obtained by Rothenberg and Molitor (cid:126)1979(cid:33), and Bernstein etal. (cid:126)1989(cid:33) (cid:126)2(cid:33) We will weigh you. (cid:126)3(cid:33) Pat cooked Pete’s breakfast. withseveraldifferenttemporaldisplays,callingintoquestion (cid:126)4(cid:33) Chuck caught two cats. the conclusion that there is a general advantage for a spatio- temporal display. Phonetic content was controlled to examine the influ- ence of continuously voiced versus interrupted vibrotactile stimulus patterns: the first two sentences (cid:126)referred to as II. CURRENT STUDY ‘‘continuous’’(cid:33) have voiced continuants (cid:126)resulting in almost ThepresentstudyusedthesametaskemployedinBern- uninterrupted voicing(cid:33), and the second two (cid:126)referred to as stein etal. (cid:126)1989(cid:33) to investigate whether an eight-channel ‘‘discontinuous’’(cid:33) have voiceless stop and fricative conso- spatio-temporalorsingle-channeltemporaldisplaywasmore nants (cid:126)except for /b/ in ‘‘breakfast’’(cid:33) (cid:126)resulting in brief peri- effective at conveying intonation and sentential stress infor- ods of silence(cid:33). It was predicted that vibrotactile stress iden- mation.Thetemporaldisplayconsistedofthepresentationof tification would be easier for sentences with voiceless every other detected pitch period, and the spatio-temporal consonants, because the silences would facilitate identifying display consisted of presentation of the same pulses but dis- syllablelocationandduration(cid:126)Bernsteinetal.,1989(cid:33).Atthe tributed across an eight-channel single-dimensional array. same time, inasmuch as those gaps interrupt intonation con- Experiment I investigated whether vibrotactile F0 enhanced tours,theymightreducevibrotactileintonationidentification stressandintonationidentificationbeyondVAidentification, accuracy (cid:126)Bernstein etal., 1989(cid:33). Twenty-four stimulus sen- and whether there were differential effects of display. Ex- tences were generated from orthogonal combination of into- periment II assessed the perception of stress and intonation nation pattern (cid:126)statement or question(cid:33) and position of the information and display in the TA condition. Experiment III word receiving emphatic stress (cid:126)first, second, or third(cid:33). For further investigated display effects for intonation identifica- example, the base sentence ‘‘We owe you a yoyo.’’ gener- tion under VT and TA conditions using a within-subjects ated the following six sentences in which the capitalized design to compare displays. In addition to group compari- word received emphatic stress: sons,individualperformancedifferenceswereexaminedasa WE owe you a yoyo. function of display. WE owe you a yoyo? The goal of sensory substitution is to develop vibrotac- we OWE you a yoyo. tile speech perception aids for hearing-impaired individuals. we OWE you a yoyo? However, the relationship between the specialized experi- we owe YOU a yoyo. ence of hearing-impaired individuals and vibrotactile speech we owe YOU a yoyo? perception has received little attention in the literature (cid:126)e.g., Bernstein etal., in press; Rothenberg and Molitor, 1979(cid:33). Twotokensofeachsentenceinitssixdifferentinstantiations werespokenbyamaleandafemaletalker(cid:126)96tokenstotal(cid:33). Therefore,experimentIVexaminedeffectsofauditoryexpe- Stimuli were recorded on videotape and then stored on rience on vibrotactile F0 perception by hearing-impaired video laserdisc (cid:126)Bernstein and Eberhardt, 1986(cid:33). The 96 adults. stimuli had been previously evaluated for production accu- racy via auditory testing (cid:126)98% correct stress and intonation III. EXPERIMENT I. VISUAL-ALONE VERSUS VISUAL- identification(cid:33) (cid:126)Bernstein etal., 1989(cid:33). A linear-phase elec- TACTILE JUDGMENTS OF STRESS AND troglottograph (cid:126)EGG(cid:33) (cid:126)Synchrovoice(cid:33) signal, which was INTONATION passed through a high-pass filter (cid:126)Glottal Enterprise(cid:33) was re- In experiment I, under VA and VT conditions, subjects corded on one audio channel of the video recording. The identified both the position of an emphatically stressed word dominantfrequencyintheEGGsignalisthevoiceF0 (cid:126)Four- in a sentence and the sentence intonation pattern. cin,1981(cid:33).TheEGGsignalswereusedasinputtothevibro- tactile aid following gating to eliminate spurious vibrotactile A. Subjects outputduetoshiftsinthenoisefloorasthevideodiscplayer The 12 subjects (cid:126)all female(cid:33), were age 20–29 years old, moved from pause to play. Video stimuli were presented on with normal hearing, 20/30 or better normal or corrected a color monitor. 2479 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2479 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms FIG.1. FrequencyinHzisdisplayedasafunctionoftimefortheinputEGGsignal(cid:126)indicatedbythe(cid:51)’s(cid:33)andtheoutputtactilesignal(cid:126)indicatedbythefilled circles(cid:33).Thestatement‘‘WeWILLweighyou.’’wasspokenbythefemaletalker.Tactile-alonestress(chance(cid:53)33%) judgmentaccuracyforthistokenwas 30%correctwiththetemporaldisplayand45%correctwiththespatio-temporaldisplay.Tactile-aloneintonation(chance(cid:53)50%)judgmentaccuracywas85% correctwiththetemporaldisplayand100%correctwiththespatio-temporaldisplay. 2.Vibrotactilestimulusgeneration the model, a pitch period was reported by the processor. Over time, in the absence of input, the model returned to its Both vibrotactile displays in these experiments em- initial values. ployed the same real-time F0 extraction algorithm. The al- The performance of the current F0 extraction algorithm gorithm was based on a time domain model of the decay of has not yet been subjected to direct comparison with other the acoustic speech waveform following glottal excitation. possible real-time extraction F0 extraction algorithms (cid:126)for a The algorithm was implemented on a Texas Instruments comparison of three alternate algorithms, see Bosman and TMS320 digital signal processing chip. Using a set of con- Smoorenburg, 1997(cid:33). However, we have examined several ditions prespecified in software, the algorithm compared the randomlyselectedexamplesofthepitchtrackingbythecur- declineinamplitudeoverasequenceofdigitalsampleswith rentalgorithm(cid:126)seeFigs.1–4(cid:33).FrequencyinHzisdisplayed the decay rate of the model. Criteria known to the model as a function of time for the handpicked pitch of the input were adaptively updated. Initial conditions were based on EGG signal and the output vibrotactile signal. Figures 1 and likely first formant values, which determine the largest up- 3 show better frequency tracking for continuously voiced wardexcursioninthespeechwaveform.Whentheamplitude sentences, as would be expected given the difficulty of ex- of the sampled waveform exceeded criteria for decay, or the tracting pitch from speech with many stop consonants. The durationoftheputativepitchperiodwaswithinthelimitsof figures also show that errors were made in tracking F0. FIG.2. FrequencyinHzisdisplayedasafunctionoftimefortheinputEGGsignal(cid:126)indicatedbythe(cid:51)’s(cid:33)andtheoutputtactilesignal(cid:126)indicatedbythefilled circles(cid:33).Thequestion‘‘ChuckCAUGHTtwocats?’’wasspokenbythefemaletalker.Tactile-alonestress(chance(cid:53)33%) judgmentaccuracyforthistoken was85%correctwiththetemporaldisplayand60%correctwiththespatio-temporaldisplay.Tactile-aloneintonation(chance(cid:53)50%)judgmentaccuracywas 100%correctwiththetemporaldisplayand90%correctwiththespatio-temporaldisplay. 2480 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2480 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms FIG.3. FrequencyinHzisdisplayedasafunctionoftimefortheinputEGGsignal(cid:126)indicatedbythe(cid:51)’s(cid:33)andtheoutputtactilesignal(cid:126)indicatedbythefilled circles(cid:33).Thequestion‘‘WewillWEIGHyou?’’wasspokenbythemaletalker.Tactile-alonestress(chance(cid:53)33%)judgmentaccuracyforthistokenwas40% correctwiththetemporaldisplayand70%correctwiththespatio-temporaldisplay.Tactile-aloneintonation(chance(cid:53)50%) judgmentaccuracywas100% correctwiththetemporaldisplayand100%correctwiththespatio-temporaldisplay. However,examinationofthehumanperformancedatataken Theschemeusedtoassignoutputchannelsfortheeight- from experiment II (cid:126)see captions of Figs. 1–4(cid:33) suggests that channel aid is shown in Fig. 5. Output channel number is with tactile information alone, stress and intonation judg- plotted as a function of input rate. The lines with the open ments can be accurate even when tracking errors occurred. squares represent the channel selection scheme for the male For example, despite substantial errors in the tracking of the talker. Channel selection was programmed on the basis of sentence final frequency contour for the token displayed in prior analysis of the frequency ranges of the two talkers Fig. 2, intonation judgments were highly accurate. (cid:126)Bernstein etal., 1989(cid:33) and was designed to center the eight Theprocessorinthevibrotactileaidwasprogrammedto channels over the range of frequencies containing the center send a signal to a custom electronic circuit that initiated a 90% of detected pitch periods for each talker. Frequency square pulse for every second detected pitch pulse. This ranges were allocated to channels equally except for the scheme was required by the spatio-temporal display in order highest and lowest channels, which were expanded to ac- to determine the appropriate output channel. Discarding the commodate extreme pitch excursions. Spatial resolution was initialpulseintroducedavariabledelayintheoutput.Analy- approximately0.083octavespervibrotactilestimulatorloca- sis of 8 of the 96 tokens suggested that onset delays ranged tion for the male talker and 0.125 octaves for the female between0and50mswithanaverageonsetdelay(cid:126)(cid:59)30ms(cid:33). talker. Vibrotactile stimulation was delivered via small sole- However, this delay was within the 40-ms limit that noids with contact points of 2 mm in diameter arrayed with McGrath and Summerfield (cid:126)1985(cid:33) suggested as sufficient to 4-mm separation between them. The single-channel display not dramatically affect audio-visual speech understanding. presentedeverypulseonasinglesolenoid.Amplitudeofthe FIG.4. FrequencyinHzisdisplayedasafunctionoftimefortheinputEGGsignal(cid:126)indicatedbythe(cid:51)’s(cid:33)andtheoutputtactilesignal(cid:126)indicatedbythefilled circles(cid:33).Thequestion‘‘ChuckCAUGHTtwocats?’’wasspokenbythemaletalker.Tactile-alonestress(chance(cid:53)33%)judgmentaccuracyforthistokenwas 55%correctwiththetemporaldisplayand30%correctwiththespatio-temporaldisplay.Tactile-aloneintonation(chance(cid:53)50%)judgmentaccuracywas75% correctwiththetemporaldisplayand90%correctwiththespatio-temporaldisplay. 2481 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2481 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms TABLEI. ExperimentI:Individualsubjectdatainpercentcorrectforstress judgments under visual-alone and visual-tactile conditions (Chance (cid:53)33%). Display(cid:53)1 indicates the subject used the temporal display; Display(cid:53)8 indicates that the subject used the spatio-temporal display. Cont.(cid:53)ContinuousSentences;Discont.(cid:53)discontinuoussentences. Visual-alone Visual-tactile Subject (cid:126)Display(cid:33) Cont. Discont. Combined Cont. Discont. Combined 1(cid:126)1(cid:33) 89 78 83 85 79 82 2(cid:126)1(cid:33) 82 74 78 74 81 78 3(cid:126)1(cid:33) 93 84 88 92 84 88 4(cid:126)1(cid:33) 85 75 80 83 85 84 5(cid:126)1(cid:33) 90 80 85 93 93 93 6(cid:126)1(cid:33) 97 93 95 88 87 88 7(cid:126)8(cid:33) 82 73 78 75 81 78 8(cid:126)8(cid:33) 77 71 74 78 69 73 FIG.5. Channelselectionforspatio-temporaldisplayofvoiceF0.Thelines 9(cid:126)8(cid:33) 84 80 82 84 80 82 delimitedbytheopensquaresindicatetheinputrangesforthemaledisplay. 10(cid:126)8(cid:33) 93 84 89 86 80 83 Thelinesdelimitedbythefilledsquaresindicatetherangesforthefemale 11(cid:126)8(cid:33) 86 79 83 78 80 79 display. 12(cid:126)8(cid:33) 82 83 82 91 88 90 output signal was not modulated by the amplitude input sig- Mean 87 79 83 84 82 83 nal. All stimuli were presented to the hypothenar of the left hand.3 sual trials with 48 sentences, each sentence followed by the C. Design and procedure subject’s response and feedback. Practice prior to the VT conditionconsistedofaudiovisualpresentationofthe48sen- Stimulus presentation and response collection were tences, the same sentences presented with video and EGG computer controlled. The subject’s task was to make a six- signals auditorily, and then presented with VT stimulation. alternative forced-choice identification of the emphatically stressedword(cid:126)first,second,orthirdword(cid:33)andtheintonation Feedback was given following each trial. pattern (cid:126)question or statement(cid:33) of each stimulus sentence. Responses were collected using a six-button response box D. Analyses with ‘‘question’’ responses on the right side of the box and Separate 2(cid:51)2(cid:51)2(cid:51)2(cid:51)2 (cid:126)condition(cid:51)talker(cid:51)sentence ‘‘statement’’responsesontheleft.Buttonsoneachsidewere type(cid:51)order(cid:51)display(cid:33) repeated measures analyses of vari- numberedtocorrespondtothepositioninthesentenceofthe ance were performed on the mean percentage correct re- stressed word. sponses separately for stress and intonation.4 Condition, LEDs on the response box signaled the beginning of a talker, and sentence type were within-subjects variables, and trial. The first frame of a video stimulus was then presented order and display were between-subjects variables. The con- and paused briefly, with the talker in a relaxed position. The dition factor corresponded to whether the trials were per- video stimulus was then played in real-time. The final frame formed VA or VT. The sentence-type factor compared the of the stimulus was then paused with the talker again in a two sentences with voiced consonants to the two sentences relaxed position. Following each response, feedback was with unvoiced consonants. Order corresponded to whether providedbylightinganLEDoverthebuttonassociatedwith the subjects were tested in the VA condition first or second. the correct response. Display corresponded to the temporal or spatio-temporal vi- Subjects were tested in both VA and VT conditions, brotactile stimuli. Only the last five blocks of data for each withtheorderofconditionscounterbalanced.Sentencesfora talker,foreachcondition,wereanalyzed.Thoseblockswere single talker were presented in blocks of 48 (cid:126)2 tokens(cid:51)4 considered representative of the subjects’ most experienced basesentences(cid:51)2typesofintonation(cid:51)3positionsofstress(cid:33). performance within a given condition. Subjects received 20 blocks in the visual condition and 30 blocks in the VT condition. The talker was alternated every E. Results and discussion fiveblocks,andtheorderoftalkerpresentationwascounter- balanced across subjects. During VT trials, subjects wore VA stress and intonation identifications were reliably earplugs and received shaped noise through headphones to above chance (chance(cid:53)33.3% for stress and 50% for into- maskpossiblesoundscausedbythestimulator.Thestimulus nation(cid:33)accordingtothebinomialtest,butnotatceiling.(cid:126)See blocks lasted approximately 60 mins, and subjects were Tables I and II for individual subject data.(cid:33) testedinsevensessions.Thesubjectsreceivedatmost3hof Stress identification did not differ as a function of con- vibrotactile experience. dition (cid:64)F(1,8)(cid:53)0.00, p(cid:46)0.96(cid:35) (cid:126)see Table I(cid:33). The overall Subjectswereinformedofboththestructureandcontent mean percent correct stress was 79% for the female talker ofthestimulussentencesandpracticewasadministeredprior and 87% for the male talker (chance(cid:53)33%) (cid:64)F(1,8) to data collection. Practice preceding the VA condition con- (cid:53)24.94, p(cid:44)0.01(cid:35). This result was consistent with previous sisted of audiovisual trials with 48 sentences (cid:126)two talkers resultssuggestingthatthefemaletalkerwasmoredifficultto (cid:51)four base sentences(cid:51)two intonation(cid:51)three stress(cid:33) and vi- speechread than the male (cid:126)Demorest and Bernstein, 1987; 2482 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2482 TABLE II. Experiment I: Individual subject data in percent correct for IV. EXPERIMENT II. TACTILE-ALONE JUDGMENTS intonation judgments under visual-alone and visual-tactile conditions OF STRESS AND INTONATION (Chance(cid:53)50%). Display(cid:53)1 indicates the subject used the temporal dis- play;Display(cid:53)8indicatesthatthesubjectusedthespatio-temporaldisplay. Experiment II was conducted to assess how accurately Cont.(cid:53)ContinuousSentences;Discont.(cid:53)discontinuoussentences. stress and intonation could be perceived by vibrotaction alone and to determine whether a difference between dis- Visual-alone Visual-tactile Subject plays would emerge in the absence of visual stimulation. (cid:126)Display(cid:33) Cont. Discont. Combined Cont. Discont. Combined A. Stimuli and subjects 1(cid:126)1(cid:33) 63 53 58 87 74 80 2(cid:126)1(cid:33) 63 55 59 70 62 66 Thevibrotactilestimuliandthestimulusdeliverysystem 3(cid:126)1(cid:33) 67 65 66 86 89 88 were the same as in experiment I. Seven subjects from ex- 4(cid:126)1(cid:33) 76 66 71 93 88 91 perimentIweretestedinexperimentII.Aneighthsubject,a 5(cid:126)1(cid:33) 58 63 66 91 81 86 native-English speaking female, was recruited. This subject 6(cid:126)1(cid:33) 79 68 74 81 75 78 7(cid:126)8(cid:33) 65 53 59 92 88 90 performed the experiment I protocol before proceeding to 8(cid:126)8(cid:33) 61 60 61 85 70 78 experiment II. Subjects ranged from 20 to 27 years of age. 9(cid:126)8(cid:33) 68 62 65 92 86 89 Theeightsubjectsweredividedintotwogroupsaccordingto 10(cid:126)8(cid:33) 68 65 67 84 78 81 their previously assigned display and performed experiment 11(cid:126)8(cid:33) 63 68 65 83 86 85 II with that display. 12(cid:126)8(cid:33) 62 54 58 73 66 70 Mean 67 61 64 85 79 82 B. Design and procedure A block of 48 VT trials was presented to refamiliarize Eberhardtetal.,1990(cid:33).Stresswasmoreaccuratelyidentified the subjects with the stimuli. They then judged stress and in continuous sentences (cid:126)85%(cid:33) than in discontinuous sen- intonation for 25 blocks of TA trials for each talker. Five tences (cid:126)81%(cid:33) (cid:64)F(1,8)(cid:53)20.96, p(cid:44)0.01(cid:35). This result was blocks per talker were presented in each session. The talker counter to the prediction that stress would be more percep- was alternated every five blocks, and the order of talker pre- tible when sentences had unvoiced consonants to mark syl- sentation was counterbalanced across subjects. After day 1, lable boundaries. However, because the visual stimulus ap- subjects were informed of a monetary bonus that could be peared to provide most of the information about stress, it earned based on day 5 accuracy for stress identification. To seemedthatthemostlylabialplaceofarticulationforconso- earn the bonus, the subjects were required to maintain into- nants in the continuous sentences provided the syllable nation accuracy while improving stress accuracy. boundary information. This result provides support for the hypothesis that visible changes in the rhythm of alternating C. Analyses mouth opening during vowels and closure or approximation during consonants provides cues to the location of emphati- Each subject’s last five blocks of stress and intonation cally stressed words in sentences (cid:126)Bernstein etal., 1989(cid:33). identification for each talker were analyzed. Separate 2(cid:51)2 Both talker and sentence-type interacted with condition. Ex- (cid:51)2 (talker(cid:51)display(cid:51)sentence-type) repeated measures amination of the means for both of these interactions sug- analyses of variance were performed on the mean percent- gested that the sentence-type (cid:126)see Table I(cid:33) and talker differ- ages of correct responses for intonation and stress. Talker ences (VA: male(cid:53)88% correct, female(cid:53)78% correct; and sentence-type were within-subjects factors, and display VT: male(cid:53)86% correct, female(cid:53)80% correct(cid:33) were re- was a between-subjects factor. duced in magnitude in the visual-tactile condition relative to the visual-alone condition. These interactions are not readily D. Results and discussion interpretable in terms of the influence of the vibrotactile speech perception aid. Stress information was available from the vibrotactile Identification accuracy for intonation differed across stimuli at levels reliably above chance according to the bi- conditions (cid:126)82% VT and 64% VA(cid:33) (cid:64)F(1,8)(cid:53)58.39, nomial test (cid:126)56% for the temporal display and 48% for the p(cid:44)0.01] (cid:126)see Table II(cid:33). However, the spatio-temporal and spatio-temporal display(cid:33) (chance(cid:53)33.3%) (cid:126)see Table III(cid:33). temporal display were not different (cid:64)F(1,8)(cid:53)0.20, p TA stress identification accuracy did not differ significantly (cid:53)0.66], and display did not interact with con- as a function of displays (cid:64)F(1,6)(cid:53)3.11, p(cid:53)0.13(cid:35). Stress dition (cid:126)VA: temporal(cid:53)66%, spatio-temporal(cid:53)62%; was more accurately identified in discontinuous sentences VT: temporal(cid:53)82%, spatio-temporal(cid:53)82% correct(cid:33).Asig- (cid:126)59%(cid:33) than in continuous sentences (cid:126)45%(cid:33) (cid:64)F(1,6) nificant main effect of sentence-type and a sentence-type by (cid:53)82.14, p(cid:44)0.01(cid:35), as predicted. In addition, sentence-type talker interaction were also obtained. These effects did not interacted with talker (cid:64)F(1,6)(cid:53)26.24, p(cid:44)0.01(cid:35), such that interact with condition and therefore were not interpretable performance was better with the female talker only for the in relation to the tactile information. This pattern of results continuouslyvoicedsentences (cid:126)ContinuousMale(cid:53)40% cor- provides evidence of a benefit from the tactile information rect, Continuous Female(cid:53)50% correct, and Discontinuous for making intonation judgments. However, as in the VT Male(cid:53)58% correct, Discontinuous Female(cid:53)61% correct(cid:33). condition of study I reported in Bernstein etal. (cid:126)1989(cid:33), no Analysis of the results showed that the bonus did not influ- evidence was observed for any display differences. ence performance levels. 2483 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2483 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms TABLE III. Experiment II: Individual subject data in percent correct for spatio-temporal displays in the current study was not statis- stress judgments under the tactile-alone condition (Chance(cid:53)33%). tically significant (cid:64)F(1,6)(cid:53)2.58, p(cid:53)0.16(cid:35). However, the Display(cid:53)1 indicatesthesubjectusedthetemporaldisplay;Display(cid:53)8 in- absenceofstatisticalsignificancemaybearesultofthesub- dicatesthatthesubjectusedthespatio-temporaldisplay. *indicatessubject didnotparticipateinexperimentI. ject sample size. Intonation identification accuracy with the temporal dis- Tactile-alone play was worse than that reported by Rothenberg and Moli- Subject ExperimentI (cid:126)Display(cid:33) subjectnumber Continuous Discontinuous Combined tor (cid:126)1979(cid:33) (cid:126)90% correct(cid:33) and the same as Bernstein etal. (cid:126)1989(cid:33) (cid:126)70% correct(cid:33). However, average performance was 1(cid:126)1(cid:33) 3 54 69 61 better than reported by Hnath-Chisolm and Kishon-Rabin 2(cid:126)1(cid:33) * 47 60 53 3(cid:126)1(cid:33) 5 40 58 49 (cid:126)1988(cid:33) (cid:126)64%(cid:33) for their temporal display. Intonation identifi- 4(cid:126)1(cid:33) 6 50 71 61 cation accuracy with the spatio-temporal display (cid:126)83% cor- rect(cid:33) was greater than spatio-temporal performance reported Mean 48 64 56 by Hnath-Chisolm and Kishon-Rabin (cid:126)1988(cid:33) (cid:126)78% correct(cid:33). 5(cid:126)8(cid:33) 7 40 54 47 Examinationofthesubjectdatasuggestedthattheaccu- 6(cid:126)8(cid:33) 8 38 42 40 racy of intonation judgments by several individuals matched 7(cid:126)8(cid:33) 11 48 65 56 or exceeded the best subject’s performance (cid:126)81% correct(cid:33) in 8(cid:126)8(cid:33) 12 43 56 50 Bernstein etal.’s experiment II in which pitch was hand- Mean 42 54 48 picked. Thus even with errors introduced by real-time F0 extraction, levels of accuracy were achieved with both dis- playsthatwerecomparabletoresultsobtainedwhenthetac- Accuracywascomparabletothatforstressidentification tile signal was generated off-line for the same sentence to- obtained earlier by Rothenberg and Molitor (cid:126)1979(cid:33), and by kens (cid:126)Bernstein etal., 1989(cid:33). Bernstein etal. (cid:126)1989(cid:33), who used the same stimuli and a Individualdifferenceswereobservedintheabilitytouse different set of vibrotactile displays based on hand-picked the tactile information (cid:126)see Table IV(cid:33). For example, exami- pitch periods. Scores were also roughly comparable to those nation of the intonation judgment performance shows that obtained by Hnath-Chisolm and Kishon-Rabin (cid:126)1988(cid:33) (cid:126)56% subject 1 performed exceptionally well with the temporal- correct temporal and 51% correct spatio-temporal(cid:33). display, and subject 8 performed somewhat less accurately Although no statistically significant differences between withthespatio-temporaldisplay.Unfortunately,thebetween displays were observed for stress judgments, the temporal subjects design employed limits our ability to interpret the display tended to convey stress more accurately. This trend relationship of the individual differences and display type. was also present in the TA performance reported by Hnath- Experiment III examined the relationship of individual dif- Chisolm and Kishon-Rabin (cid:126)1988(cid:33). However, given the evi- ferences in performance accuracy and display type. dence,obtainedinexperimentI,thatemphaticstresstendsto be highly visible, this trend was not explored further in sub- sequent experiments. Intonation information was available from the vibrotac- tile stimuli at levels reliably above chance according to the V. EXPERIMENT III. TEMPORAL VERSUS binomialtest(cid:126)73%forthetemporaldisplayand83%forthe SPATIO-TEMPORAL DISPLAYS FOR INTONATION spatio-temporaldisplay(cid:33)(chance(cid:53)50%) (cid:126)seeTableIV(cid:33).The JUDGMENT difference between performance with the temporal and Given the theoretically motivated prediction of an ad- vantage with the spatio-temporal display and the possibility TABLE IV. Experiment II: Individual subject data in percent correct for intonation judgments under the tactile-alone condition (Chance(cid:53)50%). of individual differences in display effectiveness, an addi- Display(cid:53)1 indicatesthesubjectusedthetemporaldisplay;Display(cid:53)8 in- tionalexperimentwasconductedtoattemptmoresensitively dicatesthesubjectusedthespatio-temporaldisplay.*indicatessubjectdid to observe differential effects of display. Experiment II pro- notparticipateinexperimentI. videdanindication,althoughnotstatisticallysignificant,that the two displays were differentially effective for both stress Tactile-alone Subject ExperimentI and intonation judgments. However, the results of experi- (cid:126)Display(cid:33) subjectnumber Continuous Discontinuous Combined ment I provided evidence that only intonation judgments 1(cid:126)1(cid:33) 3 85 92 89 were influenced by the addition of vibrotactile information. 2(cid:126)1(cid:33) * 61 68 65 Experiment III examined the advantage for the spatio- 3(cid:126)1(cid:33) 5 68 59 63 temporal display for intonation judgments. 4(cid:126)1(cid:33) 6 69 79 74 Arguably, the complexity of the task and the stimulus Mean 71 74 73 setinexperimentsIandIIcouldresultinperformanceerrors that reduced sensitivity. Therefore in experiment III, one 5(cid:126)8(cid:33) 7 93 84 88 6(cid:126)8(cid:33) 8 92 76 84 talker was presented, the subjects judged intonation only, 7(cid:126)8(cid:33) 11 85 85 85 and testing was under the two conditions employing vibro- 8(cid:126)8(cid:33) 12 76 74 75 tactile stimuli, VT and TA. To enhance the power of the design, subjects received both displays in counterbalanced Mean 86 80 83 order and repeated measures analysis was performed. 2484 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2484 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms A. Subjects and stimuli TABLE V. Experiment III: Individual subject data in percent correct for intonation judgments under the visual-tactile and tactile-alone condition The 21 subjects were age 18–45 years old, with normal (Chance(cid:53)50%). hearing, 20/30 or better normal or corrected vision, and En- glish as their native language. Complete data were obtained Temporal Spatio-temporal for16ofthe21,5whoweredividedintogroupsofeightand Subject Tactile-alone Visual-tactile Tactile-alone Visual-tactile receivedbothdisplaysincounterbalancedorder.Subjectsre- 1 66 65 95 94 ceived approximately 3 h of experience with each display 2 55 68 89 90 and had not participated in experiments I or II. 3 85 91 98 95 In experiments I and II, two talkers were used to in- 4 93 88 83 81 crease the generalizability of the findings (cid:126)Demorest etal., 5 88 90 85 92 1996(cid:33). However, the use of multiple talkers is known to in- 6 65 70 93 98 crease task difficulty (cid:126)Mullennix etal., 1989(cid:33). In the current 7 45 53 89 78 8 87 94 93 98 experiment, we chose to use a single-talker to reduce task 9 68 75 92 94 difficulty.Thesentencestimuliwerethe48tokensspokenby 10 83 86 95 95 the female talker. 11 69 78 95 95 12 45 57 72 63 13 44 65 90 97 B. Design, procedure, and analyses 14 84 87 95 98 Stimuli were presented under TA and VT conditions. 15 95 97 97 96 16 83 90 88 86 Subjectsmadetwo-alternativeforced-choiceintonationiden- tifications by pressing one of two buttons (cid:126)one labeled Mean 72 78 90 91 ‘‘question,’’ and one ‘‘statement’’(cid:33). Practice trials were pre- sented on the first day of testing with each of the two dis- plays.Foreachdisplay,15blocksoftrialswerepresented,5 been optimized for this female talker. The spatio-temporal on each of 3 days. Only the data from the last five blocks of displaywasoptimizedspecificallyforthepresentationofthe trialsforeachdisplaywereanalyzed.Foreachsubject,order female talker (cid:126)see Fig. 5(cid:33). The temporal display was de- of the 48 sentences was independently randomized within signed to deliver both male and female talkers and was eachblockandcondition,andthe2conditions(cid:126)VTandTA(cid:33) knowntobelessthanoptimalforthisfemaletalker.Specifi- were randomly ordered across the 5 blocks presented on a cally, the female talker’s mean F0 was 200 Hz, with her given day with the constraint that no more than 3 blocks mean for the every-other-pulse scheme being 100 Hz. The were presented in a given condition. best frequency resolution for the skin is below 100 Hz, thus Arepeatedmeasuresanalysisofvariancewasperformed the temporal display of the female talker’s F0 is not opti- on percent correct intonation identification. The within- mally suited to the temporal resolution capabilities of the subjects factors were display (cid:126)temporal versus spatio- skin. Second, the advantage could be due to the use of the temporal(cid:33), condition (cid:126)TA versus VT(cid:33), sentence-type (cid:126)con- spatial dimension. tinuous versus discontinuous(cid:33), and block. The between- Although the current data alone do not allow us to de- subjects factor was display order (cid:126)temporal first versus cidebetweenthesetwopossibleinterpretations,examination spatio-temporal first(cid:33). of previous performance with tactile transformation of these stimuli can provide some insight. In their Fig. 2, Bernstein C. Results and discussion etal.(cid:126)1989(cid:33)presentdataontemporaldisplaysoptimizedfor Across VT and TA conditions, intonation identi- the presentation of the same stimuli spoken by the female fication was more accurate with the spatio-temporal (cid:126)91% talker as were used in the current experiment. Average TA correct(cid:33) than with the temporal display (cid:126)75% correct(cid:33) performance levels with the best of these temporal displays (cid:64)F(1,14)(cid:53)18.53, p(cid:44)0.01(cid:35) (cid:126)see Table V(cid:33). Also, VT identifi- was comparable (cid:126)74% correct(cid:33) to the current TA perfor- cation differed significantly from TA identification mance with a temporal display (cid:126)72% correct(cid:33). Thus the in- (cid:64)F(1,14)(cid:53)9.29, p(cid:44)0.01(cid:35). However, a statistically signifi- crease in performance with the spatio-temporal display was cant display(cid:51)condition interaction was also obtained likely due to the use of the spatial dimension and not the (cid:64)F(1,14)(cid:53)10.25, p(cid:44)0.01(cid:35): Spatio-temporal display resulted optimization of the display for a specific talker. in the same performance levels across conditions (cid:126)91% cor- Examinationoftheindividualsubjectdata(cid:126)seeTableV(cid:33) rect VT and TA(cid:33), whereas performance varied across condi- suggeststhatperformancelevelvariedmoreforthetemporal tionswiththetemporaldisplay(cid:126)78%correctVTversus72% display than for the spatio-temporal display. Within-subject correctTA(cid:33).Theseresultsdemonstratedanadvantageforthe performance levels (cid:126)in percent correct(cid:33) were not signifi- spatio-temporal display under the TA and VT conditions. cantly correlated across displays for either condition TheadvantageobservedunderVTconditionsisimportantin (TA: r(cid:53)0.377 and VT: r(cid:53)0.398). The pattern of results that it suggests a difference may exist under practical (cid:126)not suggests an advantage for spatio-temporal display in the sta- just laboratory(cid:33) conditions. bility of performance across individuals. That is, most indi- Theincreasedaccuracywiththespatio-temporaldisplay viduals could successfully use the spatio-temporal display, observed in this experiment has two possible sources. First, whereas only some of those individuals could successfully the two displays differed in the extent to which they had use the temporal display. 2485 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2485 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms TABLEVI. Display-type,ageofonsetinyears,three-frequency(cid:126)0.5kHz,1kHz,2kHz(cid:33)pure-toneaverages (cid:126)dB HL(cid:33) and speechreading ability of sentences in percent words correct for subjects in experiment IV. NMH(cid:53)nomeasurablehearing. Subject Ageofonsetof Left Right Speechreading (cid:126)onset(cid:33) Display hearingloss ear ear ofsentences 1(cid:126)Post(cid:33) Temporal 4 NMH NMH 67 2(cid:126)Post(cid:33) Temporal 4 NMH NMH 37 3(cid:126)Post(cid:33) Spatio-Temporal 3 110 105 61 4(cid:126)Post(cid:33) Spatio-Temporal 15 103 95 28 5(cid:126)Pre(cid:33) Temporal Birth 83 92 67 6(cid:126)Pre(cid:33) Temporal Birth 87 100 72 7(cid:126)Pre(cid:33) Spatio-Temporal Birth 115 110 62 8(cid:126)Pre(cid:33) Spatio-Temporal Birth 100 105 71 VI. EXPERIMENT IV. VISUAL-ALONE VERSUS examples of the different types of intonation patterns and VISUAL-TACTILE JUDGMENTS OF STRESS stresspatternsusinglivevoice(cid:126)andnomanualsigns(cid:33),which AND INTONATION BY HEARING-IMPAIRED was processed and transformed into vibrotactile stimulation INDIVIDUALS in real-time. It was verified that the subjects had a basic The goal of sensory substitution is to develop vibrotac- understandingofthetask.Subjectswerethenpresentedwith tile speech perception aids for hearing-impaired individuals. practice consisting of 48 VT trials and 48 VA trials. Each Having observed several hearing-impaired subjects who trial was followed by feedback. Then subjects identified were less successful identifying vibrotactile stress and into- stress and intonation in 4 blocks of 48 sentences in the VA nationthanhearingsubjects,RothenbergandMolitor(cid:126)1979(cid:33) conditionfollowedby30blocksofVTtrials,followedby20 suggested that vibrotactile pitch perception could be based blocksofVAtrials.Halfoftheblockswereproducedbythe on experience with auditory pitch. Alternatively, the ability femaletalkerandhalfbythemaletalker,withorderoftalker to judge stress and intonation could be based on experience counterbalancedacrosssubjects.Thetotalnumberoftrialsin with spoken language. That is, individuals with reduced ex- eachconditionwasequivalenttothatinexperimentI.Atthe perience perceiving spoken language, might have more dif- end of testing, each subject was presented with a set of 100 ficulty with identification of stress and intonation regardless prerecorded sentences spoken by a different male talker to of the form of stimulation. Experiment IV, which was a assess speechreading ability. Sentences were presented one modified version of experiment I, employed pre- and post- at a time under computer control and after each sentence linguallyhearing-impairedadultsubjects.Inadditiontotest- subjects typed what they thought the talker said. ing stress and intonation judgments, a test was also given to C. Analyses estimate the speechreading ability of our subjects with early ages-of-onset of hearing impairment. Separate 2(cid:51)2(cid:51)2(cid:51)2(cid:51)2 (cid:126)condition(cid:51)talker(cid:51)sentence- type(cid:51)pre-post(cid:51)display(cid:33)repeatedmeasuresanalysesofvari- A. Subjects ance were performed on the mean percentages of correct The subjects were eight hearing-impaired adults, age responsesforintonationandstress,asinexperimentI,except 19–30 years, four with pre- and four with post-lingual hear- that pre–post corresponded to whether the subjects had pre- ing impairments. They were all from the Gallaudet Univer- orpost-lingualhearingimpairments.Again,onlythelastfive sity community. Table VI shows the three-frequency pure- blocks of data within the VA and VT conditions were ana- tone averages (cid:126)dB HL(cid:33) for each of the subjects, their age at lyzed. onset of hearing impairment, the display they received and theiraccuracyforspeechreadingwordsinsentences.Allsub- D. Results and discussion jects reported English as their native language. They were VA stress identification was reliably above chance for paid for their participation. bothtalkers(cid:126)80%correctforthefemaletalkerand84%cor- rect for the male talker(cid:33) according to the binomial test B. Stimuli, design, and procedure (chance(cid:53)33.3%) (cid:126)see Table VII(cid:33). VA intonation identifica- The stimuli and task in experiment IV were the same as tionwasreliablyabovechanceforthemaletalker(cid:126)66%cor- in experiment I. The sequence of conditions was held con- rect(cid:33)butnotforthefemaletalker(cid:126)56%correct(cid:33),accordingto stant across subjects. VT stimuli were used to help explain the binomial test (chance(cid:53)50%) (cid:126)see Table VIII(cid:33). The ac- and give initial practice. It was hypothesized that the vibro- curacy of speechreading sentences (cid:126)see Table VI(cid:33) is consis- tactile stimuli would aid in conveying the concept of stress tentwithourpreviousstudiesassessingspeechreadingability andintonation,intheeventthatthesubjectswereunfamiliar (cid:126)Bernsteinetal.,1998;Auer,1997(cid:33).Specifically,individuals with these linguistic characteristics. with early onset hearing impairments (cid:126)subjects Subjects were given written instructions explaining the 1, 3, 5, 6, 7,8(cid:33) frequently far outperform individuals with task followed by a spoken explanation with manual sign ac- late-onset hearing impairments (cid:126)e.g., subject 4(cid:33) or normal companiment. The experimenter presented the subjects with hearing. 2486 J.Acoust.Soc.Am.,Vol.104,No.4,October1998 E.T.Auer,Jr.andL.E.Bernstein:Temporalvibrotactiledisplays 2486 Downloaded 11 Jul 2013 to 128.95.155.147. Redistribution subject to ASA license or copyright; see http://asadl.org/terms

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.