Handling Emotions in Human-Computer Dialogues

Johannes Pittermann · Angela Pittermann · Wolfgang Minker

Johannes Pittermann, Angela Pittermann
Universität Ulm, Inst. Informationstechnik
Albert-Einstein-Allee 43, 89081 Ulm, Germany

Wolfgang Minker
Universität Ulm, Fak. Ingenieurwissenschaften und Elektrotechnik
Albert-Einstein-Allee 43, 89081 Ulm, Germany

ISBN 978-90-481-3128-0    e-ISBN 978-90-481-3129-7
DOI 10.1007/978-90-481-3129-7
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009931247

© Springer Science+Business Media B.V. 2010
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.

Cover design: Boekhorst Design b.v.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

"The finest emotion of which we are capable is the mystic emotion"
(Albert Einstein, 1879–1955)

During the past years the "mystery" of emotions has increasingly attracted interest in research on human–computer interaction. In this work we investigate the problem of how to incorporate the user's emotional state into a spoken language dialogue system. The book describes the recognition and classification of emotions and proposes models for integrating emotions into adaptive dialogue management.

In computer and telecommunication technologies, the way in which people communicate with each other is changing significantly, from strictly structured and formatted information transfer to flexible and more natural communication. Spoken language is the most natural way of communication between humans, and it also provides an easy and quick way to interact with a computer application. Such systems range from information kiosks, where travelers can book flights or buy train tickets, to handheld devices which show tourists around cities while interactively giving information about points of interest. Generally, spoken language dialogue not only means simplicity, comfort and saving of time, but moreover contributes to safety in critical environments such as cars, where hands-free operation is indispensable in order to keep driver distraction to a minimum. Within the context of ubiquitous computing in intelligent environments, dialogue systems facilitate everyday work, e.g., at home, where lights or household appliances can be controlled by voice commands, and provide the possibility, especially in assisted living, to quickly summon help in emergency cases.
In parallel to the progress made in technical development, customers' demands concerning these products have increased. While car owners in the 1920s might have been completely satisfied once they arrived at a destination without any major complications, people in the 1970s would already have tended to become annoyed once their engine refused to start on the first turn of the ignition key. And nowadays a navigation system showing the wrong way might cause even more anger. For ubiquitous technology like cars this means, on the one hand, that the driver is literally at the mercy of sophisticated technology; on the other hand, this does not hinder him or her from building some kind of personal relation to the car, ranging from decorations like car fresheners or fuzzy dice to expensive tuning. Such a relation also includes the expression of emotions towards the car – just imagine drivers spurring on their cars when climbing a steep hill and being glad to have reached the top, or drivers shouting at their non-functioning navigation system, hitting or kicking their cars... A similar behavior can be observed among computer users. Having successfully written a book using word-processing software might arouse happiness; however, a sudden hard disc crash destroying all documents will probably drive the author up the wall.

Normally, neither the car nor the computer is capable of replying to the user's affect. So why not enable devices to react accordingly? Think of a car that refuses to start and a driver shouting angrily, "Stupid car, I paid more than $40,000 and now it's only causing trouble!". Here a reply from the car like "I am sorry that the engine does not run properly. This is due to a defective spark plug which needs to be replaced." would certainly defuse the tense situation and, moreover, provide useful information on how to solve the problem. This again contributes to safety in the car, as the driver can be calmed down, e.g., in the case of a delay due to a traffic jam where the driver would otherwise try to make up the loss of time by speeding. Here the car's computer could try to rearrange the planned meeting and inform the user: "Due to our delay I have rescheduled your meeting one hour later. So there is no need to hurry."

To implement a more flexible system, the typical architecture of a spoken language dialogue system needs to be equipped with additional functionality. This includes the recognition of emotions and the detection of situation-based parameters, as well as user-state and situation managers which calculate models based on these parameters and influence the course of the dialogue accordingly.

Constituting a hot topic in current research, there exist several approaches to classifying the user's emotions. These methods include the measurement of physiological values using biosensors, the interpretation of gestures and facial expressions using cameras, natural language processing spotting emotive keywords and fillers in recognized utterances, and the classification of prosodic features extracted from the speech signal. Concentrating on a monomodal system without video input, and trying to reduce inconveniences to the user, this work focuses on the recognition of emotions from the speech signal using Hidden Markov Models (HMMs). Based on a database of emotional speech, a set of prosodic features has been selected, and HMMs have been trained and tested for six emotions and ten speakers. To cover variations in model parameters, multiple recognizers have been implemented.
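To make this concrete, the following is a minimal sketch of likelihood-based emotion classification with one HMM per emotion category, assuming the Python hmmlearn library and precomputed per-utterance prosodic feature sequences (e.g., frame-wise pitch and energy values); the emotion labels, feature layout and model parameters are illustrative and not necessarily those used in this book.

```python
# Minimal sketch: one HMM per emotion, classification by maximum log-likelihood.
# Assumes hmmlearn and precomputed prosodic feature sequences; all labels and
# parameters below are illustrative placeholders.
import numpy as np
from hmmlearn import hmm

EMOTIONS = ["anger", "boredom", "disgust", "fear", "happiness", "neutral"]

def train_emotion_models(train_data, n_states=5):
    """train_data maps an emotion label to a list of 2-D arrays
    (frames x prosodic features). One HMM is trained per emotion."""
    models = {}
    for emotion in EMOTIONS:
        sequences = train_data[emotion]
        X = np.concatenate(sequences)               # stack all frames
        lengths = [len(seq) for seq in sequences]   # per-utterance lengths
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        models[emotion] = model
    return models

def classify(models, features):
    """Return the emotion whose HMM assigns the highest log-likelihood
    to the prosodic feature sequence of a single utterance."""
    return max(models, key=lambda e: models[e].score(features))
```

In such a setup, an unseen utterance is scored against every emotion model and the best-scoring label is returned; varying the model parameters yields the multiple recognizers mentioned above.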
According to the output of the emotion recognizer(s), the course of the dialogue is influenced. With the help of a user-state model and a situation model, the dialogue strategy is adapted and an appropriate stylistic realization of its prompts is chosen. That is, if the user is in a neutral mood and speaks clearly, no confirmations are necessary and the dialogue can be kept relatively short. However, if the user is angry and speaks correspondingly unclearly, the system has to try to calm the user down, but it also has to ask for confirmation more often, which in turn may make the user angry again... In principle, there exist two methods to model the influence of these so-called control parameters, such as emotions: a rule-based approach, where every eventuality in the user's behavior is covered by a rule which contains a suitable reply, or a stochastic approach, which models the probability of a certain reply in dependence on the user's previous utterances and the corresponding control parameters.
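As a rough illustration of the difference between the two options, here is a hedged sketch in Python; the emotion labels, system actions and probability values are invented placeholders, not the models developed later in the book.

```python
# Rule-based vs. stochastic selection of the next system action.
# All emotion labels, actions and probabilities below are illustrative only.
import random

def rule_based_strategy(emotion, asr_confidence):
    """Every anticipated user state is covered by an explicit rule."""
    if emotion == "anger":
        return "apologize_and_confirm"       # calm the user, confirm explicitly
    if asr_confidence < 0.6:
        return "ask_for_confirmation"        # unclear input, verify first
    return "proceed_without_confirmation"    # neutral user, clear input

# P(system action | user emotion) -- placeholder probability tables.
STOCHASTIC_POLICY = {
    "neutral": {"proceed_without_confirmation": 0.8, "ask_for_confirmation": 0.2},
    "anger":   {"apologize_and_confirm": 0.7, "ask_for_confirmation": 0.3},
}

def stochastic_strategy(emotion):
    """Draw the next system action from a distribution conditioned on the
    estimated user state instead of following a fixed rule."""
    actions = list(STOCHASTIC_POLICY[emotion])
    weights = list(STOCHASTIC_POLICY[emotion].values())
    return random.choices(actions, weights=weights, k=1)[0]
```

A rule table is transparent but has to enumerate every case, whereas the probability tables of the stochastic variant can be estimated from dialogue data; the semi-stochastic models of Chapter 3 follow the latter direction.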
So how is this book organized? An introduction to the research topic is followed by an overview of emotions – theories of emotion and emotions in speech. In the third chapter, dialogue strategy concepts with regard to integrating emotions into spoken dialogue are described. Signal processing and speech-based emotion recognition are discussed in Chapter 4, and improvements to our proposed emotion recognizers as well as the implementation of our adaptive dialogue manager are described in Chapter 5. Chapter 6 presents evaluation results for the emotion recognition component and for the end-to-end system with respect to existing spoken language dialogue system evaluation paradigms. The book concludes with a final discussion and an outlook on future research directions.

Ulm, May 2009
Johannes & Angela Pittermann
Wolfgang Minker

Contents

1 Introduction
  1.1 Spoken Language Dialogue Systems
  1.2 Enhancing a Spoken Language Dialogue System
  1.3 Challenges in Dialogue Management Development
  1.4 Issues in User Modeling
  1.5 Evaluation of Dialogue Systems
  1.6 Summary of Contributions
2 Human Emotions
  2.1 Definition of Emotion
  2.2 Theories of Emotion and Categorization
  2.3 Emotional Labeling
  2.4 Emotional Speech Databases/Corpora
  2.5 Discussion
3 Adaptive Human–Computer Dialogue
  3.1 Background and Related Research
  3.2 User-State and Situation Management
  3.3 Dialogue Strategies and Control Parameters
  3.4 Integrating Speech Recognizer Confidence Measures into Adaptive Dialogue Management
  3.5 Integrating Emotions into Adaptive Dialogue Management
  3.6 A Semi-Stochastic Dialogue Model
  3.7 A Semi-Stochastic Emotional Model
  3.8 A Semi-Stochastic Combined Emotional Dialogue Model
  3.9 Extending the Semi-Stochastic Combined Emotional Dialogue Model
  3.10 Discussion
4 Hybrid Approach to Speech–Emotion Recognition
  4.1 Signal Processing
  4.2 Classifiers for Emotion Recognition
  4.3 Existing Approaches to Emotion Recognition
  4.4 HMM-Based Speech Recognition
  4.5 HMM-Based Emotion Recognition
  4.6 Combined Speech and Emotion Recognition
  4.7 Emotion Recognition by Linguistic Analysis
  4.8 Discussion
5 Implementation
  5.1 Emotion Recognizer Optimizations
  5.2 Using Multiple (Speech–)Emotion Recognizers
  5.3 Implementation of Our Dialogue Manager
  5.4 Discussion
6 Evaluation
  6.1 Description of Dialogue System Evaluation Paradigms
  6.2 Speech Data Used for the Emotion Recognizer Evaluation
  6.3 Performance of Our Emotion Recognizer
  6.4 Evaluation of Our Dialogue Manager
  6.5 Discussion
7 Conclusion and Future Directions
A Emotional Speech Databases
B Used Abbreviations
References
Index

Chapter 1
Introduction

"How may I help you?" (cf. Gorin et al. 1997) – Imagine you are calling your travel agency's telephone hotline and you don't even notice that you are talking to a computer. Would you be surprised if your virtual dialogue partner recognized you by means of your voice and asked you how you liked your previous trip?

The ongoing trend of computers becoming more powerful, smaller, cheaper and more user-friendly has the effect that these devices increasingly gain in importance in everyday life and become "invisible". Within this so-called ubiquitous computing, there exists a large variety of applications and data structures, ranging from information retrieval systems to control tasks and emergency call functionality. In order to handle these applications, a manageable user interface is required, which can be realized with the aid of a spoken language dialogue system (SLDS).

In this chapter, we give a brief overview of the functionality of SLDSs and their implementation in current dialogue applications. Some of the ideas presented here already apply successfully in state-of-the-art dialogue applications; other ideas are still part of ongoing research. Thus, certain challenges still exist in the development of speech applications (see also Minker et al. 2006b). In this book, we address the user-friendliness and the naturalness of an SLDS. This includes the adaptation of the dialogue to the user's emotional state and, to accomplish that, the recognition of emotions from the speech signal.
Therefore, we describe the architecture of an SLDS and refer to approaches by which regular dialogue systems may be improved, and to how these improvements can be realized. Sections 1.2–1.4 especially address challenges in the development of adaptive dialogue management. In Chapters 3–5, we describe our strategies for integrating emotions into adaptive dialogue management and our approach to speech-based emotion recognition, together with its derivatives such as combined speech–emotion recognition and optimization approaches. An evaluation of our methods as well as a summary and a discussion of future perspectives are given in Chapters 6 and 7.
