Introducing Spoken Dialogue Systems into
Intelligent Environments
Tobias Heinroth • Wolfgang Minker
Tobias Heinroth                              Wolfgang Minker
Institute of Communications Engineering      Institute of Communications Engineering
University of Ulm                            University of Ulm
Albert-Einstein-Allee 43                     Albert-Einstein-Allee 43
Ulm, Germany                                 Ulm, Germany

ISBN 978-1-4614-5382-6            ISBN 978-1-4614-5383-3 (eBook)
DOI 10.1007/978-1-4614-5383-3
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2012951331
© Springer Science+Business Media New York 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
For Tina, Sophia, and Jonas
Preface
One of the main reasons for the complexity of developing spoken dialogue systems (SDSs) is the multi-domain and thus multi-topic nature of real-life processes. If the application domain is not clearly defined, collecting a corpus or establishing valid rules to control the dialogue flow of the SDS becomes a complex task. Within the framework of the EU-funded project ATRACO we have developed a model-based spoken dialogue manager called OwlSpeak.1 It provides a spoken interface to an existing Intelligent Environment (IE) in real-life situations. The most important feature of the dialogue manager is its ability to pause, resume, and switch between multiple interactive tasks, hence enabling multitasking.
Our novel model-based approach allows for persistently storing the states and the structures of various spoken dialogues. Based on the multitasking capability we have defined topic switching strategies. These allow the user to navigate between different dialogue topics during an ongoing user–system conversation. Furthermore, we have integrated repair strategies in order to keep the dialogue coherent. We have defined mechanisms of adaptive understanding to enhance the recognition performance. Our framework also supports speaker-aware dialogues and voice-based dialogue control. This enables adaptive system behaviour. Finally, we have elaborated a formal definition of dialogue descriptions that facilitates dialogue management within the dynamic IE domain. The implemented prototype is compliant with the VoiceXML dialogue description and the OWL ontology definition standards. The latter is used for knowledge representation.
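The multitasking capability described above can be pictured as a set of persistently stored dialogue states between which the manager moves. The following Python fragment is a minimal sketch of that idea only; the class and method names, as well as the example topics, are invented for illustration and do not mirror OwlSpeak's actual implementation or API.

    # Hypothetical sketch of multitasking dialogue management: every dialogue
    # keeps its own persistent state so that it can be paused and resumed.
    class DialogueTask:
        def __init__(self, topic):
            self.topic = topic
            self.history = []        # (system prompt, user utterance) pairs
            self.next_prompt = None  # where to continue after a pause

    class MultitaskingDialogueManager:
        def __init__(self):
            self.tasks = {}          # topic -> persisted DialogueTask
            self.active = None

        def switch_to(self, topic):
            # Pause the currently active task by persisting it, then activate
            # (or create) the task for the requested topic.
            if self.active is not None:
                self.tasks[self.active.topic] = self.active
            self.active = self.tasks.setdefault(topic, DialogueTask(topic))

        def resume(self, topic):
            # Return to a previously paused topic with its state intact.
            self.switch_to(topic)
            return self.active.next_prompt

    # Example use with invented topics: pause the lighting dialogue, handle a
    # door-bell sub-dialogue, then resume the lighting dialogue.
    manager = MultitaskingDialogueManager()
    manager.switch_to("lighting")
    manager.switch_to("door bell")
    manager.resume("lighting")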
During an initial system evaluation we have investigated the effects of multitasking on users within the spoken dialogue context. The results are twofold: users engaged in multitasking dialogues are more inclined to interact with the SDS. In turn, users who sequentially received one task after another are able to remember more facts than those who used the multitasking approach. These results led to an evaluation series focussing on different assistive dialogue strategies. They may be applied to switch the dialogue focus to a different topic (and afterwards back to the original one). The underlying idea is to guide the user by alerting him to possible dialogue interruptions. After a sub-dialogue has been processed, the original dialogue is re-introduced by reminding the user of the main topic. The analysis results indicated that the sophisticated explanation strategy performs best. Notably, the applied dialogue strategy also had a measurable and significant influence on the overall dialogue quality.

1 http://sourceforge.net/projects/owlspeak/
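As an illustration of such an assistive strategy, the fragment below (building on the hypothetical manager from the previous sketch) announces an upcoming interruption before switching topics and reminds the user of the original topic on return. The prompt wording is invented here and is not the phrasing used in the evaluated system.

    # Illustrative "explanation" style topic switch: alert the user before the
    # interruption, then re-introduce the original topic afterwards.
    def interrupt_with_explanation(manager, new_topic):
        previous = manager.active.topic
        alert = ("Excuse me, I have to interrupt: something concerning "
                 + new_topic + " needs your attention.")
        manager.switch_to(new_topic)
        return alert, previous

    def return_with_reminder(manager, previous_topic):
        manager.switch_to(previous_topic)
        return "Let us get back to " + previous_topic + ", where we left off."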
A social evaluation conducted within an existing IE revealed qualitative results and a positive learning process the subjects went through during the three successive evaluation sessions. However, the prototype generated dialogues that were too rigid and not sufficiently intuitive. Considering the high motivation of the subjects and the eagerness with which they controlled the IE via speech, we have investigated ways to enhance the understanding capabilities of OwlSpeak. The main goal was to render the interface more intuitive. Hence, we have evaluated different mechanisms to solve this issue whilst keeping the complexity of the domain models low. We have discovered that, especially for command-and-control dialogues, semantic strategies that enhance the understanding capabilities are the most promising approaches. A further evaluation covered the issue of how to cope with errors occurring during spoken human–machine interaction. To this end, we have compared three strategies ranging from a simple re-prompt to a more complex self-repair strategy. The main outcome of this evaluation is the strong dependency between the choice of an appropriate repair strategy and the user characteristics. The subjective rating of experts differed significantly from that of novices. This underpins the importance of user-related information for dialogue management.
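One minimal way to picture this dependency is to let the dialogue manager pick a repair strategy from a user profile, as in the hypothetical fragment below. Only two of the three compared strategies are sketched, the expert/novice mapping and the prompt texts are invented purely for illustration, and none of it reproduces the evaluated system's behaviour.

    # Hypothetical selection of a repair strategy based on user characteristics.
    # The expert/novice mapping below is illustrative only, not an experimental
    # result reported in this work.
    def choose_repair_strategy(user_profile):
        if user_profile.get("expertise") == "expert":
            return "re-prompt"      # terse repetition of the last prompt
        return "self-repair"        # more explicit, guided recovery

    def repair_prompt(strategy, last_prompt):
        if strategy == "re-prompt":
            return "Sorry, I did not get that. " + last_prompt
        return ("I may have misunderstood you. " + last_prompt
                + " You can also ask for help to hear the available options.")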
The theoretical foundations of a working ontology-based spoken dialogue description framework, the prototype implementation of the Adaptive Spoken Dialogue Manager (ASDM), and the evaluation activities that have been conducted as part of this work contribute to the ongoing research on spoken dialogue management by establishing the framework of model-based Adaptive Spoken Dialogue Management.
The research leading to our results has received funding from the European Community’s 7th Framework Programme (FP7/2007–2013) under grant agreement no. 216837 and from the Transregional Collaborative Research Centre SFB/TRR 62 “Companion-Technology for Cognitive Technical Systems” funded by the German Research Foundation (DFG).
Ulm, Germany                                   Tobias Heinroth
                                               Wolfgang Minker
Contents
1 Introduction .................................................................. 1
  1.1 Problem Setting ......................................................... 4
  1.2 Proposed Solution: Adaptive Spoken Dialogue Management ......... 6
  1.3 Document Structure ..................................................... 8
2 Background ................................................................... 11
  2.1 Spoken Dialogue Systems ............................................... 11
  2.2 Intelligent Environments: Adaptive and TRusted Ambient eCOlogies .. 14
  2.3 Interaction Within Intelligent Environments .......................... 18
  2.4 General Approaches to Spoken Dialogue Management ................ 22
    2.4.1 State-Machines and Grammars .................................. 22
    2.4.2 Stochastic Approaches .......................................... 24
    2.4.3 Plan- and Information State-based Systems .................... 25
  2.5 Enhanced Spoken Dialogue Management Methodologies .............. 27
    2.5.1 Understanding Methods ......................................... 27
    2.5.2 Spoken Dialogue Management Strategies ....................... 29
  2.6 Conclusion .............................................................. 31
3 Novel Approach to Spoken Dialogue Management
  in Intelligent Environments ................................................. 33
  3.1 User-centred Adaptation ................................................ 35
  3.2 SDS-centred Adaptation ................................................ 38
  3.3 Environment-centred Adaptation ....................................... 41
  3.4 Summarising Spoken Dialogue Adaptation ............................. 45
  3.5 Application Scenario ................................................... 47
    3.5.1 Episodes ......................................................... 49
    3.5.2 Conversational Acts ............................................. 50
  3.6 Characterisation of the Functionalities ................................ 52
    3.6.1 Multitasking Spoken Dialogues ................................. 52
    3.6.2 Dialogue Strategies ............................................. 54
    3.6.3 Adaptive Understanding ........................................ 58
    3.6.4 Speaker Aware Dialogues ....................................... 60
    3.6.5 Voice-based Dialogue Control .................................. 61
  3.7 Conclusion .............................................................. 62
4 The OwlSpeak Adaptive Spoken Dialogue Manager ......................... 65
  4.1 Architectural Overview ................................................. 65
  4.2 The Presenter ........................................................... 70
    4.2.1 Presenter Interface ............................................. 71
    4.2.2 Basic Functionality of the Presenter ........................... 75
    4.2.3 Enhanced Spoken Dialogue Management
          for Intelligent Environments ................................... 76
  4.3 The Domain Model ...................................................... 85
    4.3.1 Static Knowledge ............................................... 86
    4.3.2 Dynamic Knowledge ............................................. 89
    4.3.3 An Elaborated Example ......................................... 95
  4.4 The View ................................................................ 103
    4.4.1 Basic Turns ..................................................... 103
    4.4.2 Clarification Turns ............................................. 106
    4.4.3 Alternative Views ............................................... 108
  4.5 Conclusion .............................................................. 109
5 Experiments and Evaluation ................................................. 113
  5.1 Initial Evaluation ....................................................... 114
    5.1.1 Experimental Set-up ............................................ 115
    5.1.2 Evaluation Results ............................................. 119
    5.1.3 Conclusion ...................................................... 125
  5.2 Scalability Experiments ................................................ 125
    5.2.1 Experimental Set-up ............................................ 126
    5.2.2 Evaluation Results ............................................. 127
    5.2.3 Conclusion ...................................................... 128
  5.3 Topic Switching Strategies ............................................. 129
    5.3.1 Experimental Set-up ............................................ 129
    5.3.2 Evaluation Results ............................................. 133
    5.3.3 Conclusion ...................................................... 139
  5.4 Repair Strategies ....................................................... 140
    5.4.1 Experimental Set-up ............................................ 141
    5.4.2 Evaluation Results ............................................. 143
    5.4.3 Conclusion ...................................................... 147
  5.5 Social Evaluation ....................................................... 148
    5.5.1 Study Outline ................................................... 148
    5.5.2 Qualitative Results ............................................. 150
    5.5.3 Quantitative Results ........................................... 151
    5.5.4 Conclusion ...................................................... 153
  5.6 Advanced Understanding Methods ..................................... 154
    5.6.1 Experimental Set-up ............................................ 154