ebook img

DTIC ADA460942: The Pragmatics of Taking a Spoken Language System Out of the Laboratory PDF

4 Pages·0.09 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview DTIC ADA460942: The Pragmatics of Taking a Spoken Language System Out of the Laboratory

The Pragmatics of Taking a Spoken Language System Out of the Laboratory JodyJ.Daniels and HelenWrightHastie LockheedMartinAdvancedTechnologyLaboratories 1FederalStreet A&E3-W Camden,NJ08102 jdaniels,hhastie @atl.lmco.com (cid:0) (cid:1) Abstract LockheedMartin’sAdvancedTechnologyLab- oratorieshasbeendesigning,developing,test- ing, and evaluating spoken language under- standingsystemsinseveraluniqueoperational environmentsoverthepastfiveyears.Through theseexperienceswehaveencounterednumer- ouschallengesinmakingeachsystembecome an integral part of a user’s operations. In this paper, we discuss these challenges and report howweovercamethemwithrespecttoanum- berofdomains. 1 Introduction Lockheed Martin’s Advanced Technology Laboratories (LMATL) has been designing, developing, testing, and Figure1:LCSMarineonthemove evaluatingspokenlanguageunderstandingsystems(SLS) inseveraluniqueoperationalenvironmentsoverthepast fiveyears.Thismodelofhumaninteractionisreferredto 2 Architecture asListen,Communicate,Show(LCS).InanLCSsystem, thecomputerlistensforinformationrequests,communi- The LCS Spoken Language systems use the Galaxy ar- cateswiththeuserandnetworkedinformationresources chitecture(Seneffetal.,1999). ThisGalaxyarchitecture to compute user-centered solutions, and shows tailored consistsofacentralhubandservers. Eachoftheservers visualizations to individual users. Through developing performs a specific function, such as converting audio thesesystems,wehaveencounterednumerouschallenges speech into a text translation of that speech or combin- inmakingeachsystembecomeanintegralpartofauser’s ing the user’s past statements with what was said most operations.Forexample,Figure1showsthedeployment recently.Theindividualserversexchangeinformationby ofadialoguesystemforplacingMarinesupplyrequests, sendingmessagesthroughthehub. Thesemessagescon- whichisbeingusedinatacticalvehicle,aHMMWV. taininformationtobesenttootherserversaswellasin- Some of the challenges of creating such spoken lan- formationusedtodeterminewhatserverorserversshould guagesystemsincludegivingappropriateresponses.This becontactednext. involvesmanagingthetensionbetweenutterancebrevity VariousGalaxyServersworktogethertodevelopase- and giving enough context in a response to build the manticunderstandingoftheuser’s statementsandques- user’strust. Similarly,thelengthofuserutterancesmust tions. Thesoundspokenintothemicrophone,telephone, be succinct enough to convey the correct information or radio is collected by an Audio Server and sent on to withoutaddingtothesignatureofthesoldier.Thesystem the recognizer. The recognizer translates this wave file must be robust when handling out of vocabulary terms intotext,whichissenttoanaturallanguageparser. The andconcepts. Itmust alsobe ableto adapttonoisyen- parser converts the text into a semantic frame, a repre- vironmentswhose parameters changefrequentlyand be sentationofthestatement’smeaning. Thismeaningrep- able use input devices and poweraccess uniqueto each resentation is passed on to another server, the Dialogue situation. Manager. This server monitors the current context of a Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. 1. REPORT DATE 3. DATES COVERED 2003 2. REPORT TYPE 00-00-2003 to 00-00-2003 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER The Pragmatics of Taking a Spoken Language System Out of the 5b. GRANT NUMBER Laboratory 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 8. PERFORMING ORGANIZATION Lockheed Martin Advanced Technology Laboratories,1 Federal Street, REPORT NUMBER A&E 3W,Camden,NJ,08102 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 10. SPONSOR/MONITOR’S ACRONYM(S) 11. SPONSOR/MONITOR’S REPORT NUMBER(S) 12. DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release; distribution unlimited 13. SUPPLEMENTARY NOTES The original document contains color images. 14. ABSTRACT 15. SUBJECT TERMS 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF 18. NUMBER 19a. NAME OF ABSTRACT OF PAGES RESPONSIBLE PERSON a. REPORT b. ABSTRACT c. THIS PAGE 3 unclassified unclassified unclassified Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 conversationand, based on this context, can promptthe with the uniqueenvironmentalfactors. The formerpre- userforanynecessaryclarificationandpresentintelligent sentedtheconflictinggoalsofbrevityversusconfirming responses to the user. Since the Dialogue Manager is userinputs. Marineswanttorestricttheirtimeonthera- aware of what information has been discussed thus far, dionetasmuchaspossible. Atthesametimetheywant itisabletodeterminewhatinformationisstillneeded.A toensurethatwhattheyrequestediswhattheywerego- semanticframeis createdbytheDialogueManagerand ingtoreceive.Muchtimewentintodefiningandrefining this is sent through the Language Generation Server to system responses that met both needs as best possible. generatea textresponse. The textresponseis thenspo- ThisinvolvedseveralsessionswithanumerousMarines kentotheuserthroughaspeechsynthesisserver. evaluating possible dialogue responses. We also spent To solve the problem of retrieving or placing data muchtimeensuringthatLCSMarinecouldhandleboth from/in remote and local sources, we gave the sys- properandmalformedradioprotocols.Broadcoverageof tems below the use of mobile software agents. If user- potentialexpressions,especiallythosewhenunderstress, requested information is not immediately available, an suchasrecognitionoftheliberaluseofcursewords,led agentcanmonitorthedatasourcesuntilitis possibleto togreateruserabilitytosuccessfullyinteractthroughthe respond. Users mayrequestanotificationoralertwhen system. a particularactivityoccurs,whichmayhappenatan in- The second set of challenges, unique environmental determinatetimeinthefuture.Becauseofthepotentially factors, included access while on the move, battlefield significant time lag, it is important to manage dialogue noise, and coping with adverse conditions such as sand activitysothattheuserisonlyinterruptedwhentheneed storms.AccessingLCSMarinewhileonthemovemeant for information is more important than the current task usingaSINCGARS radioastheinputdevice. Attempts that the user is currently undertaking. This active man- to use the system by directly collecting speech from a agementofinterruptionsaidstaskmanagementandlight- SINCGARSradioweredroppedduetothetechnological enscognitiveload(Danielsetal.,2002). challengespresentedbythedistortionintroducedbythe radioonthesignal. Instead,weinstalledthemajorityof 3 Domains the system on laptops and put these into the HMMWV. We sent mobile agents over the SINCGARS data link 3.1 LCSMarine backto thedatasources. This meantsecuringhardware One of the first LCS systems to be tested out in the inaHMMWVandpoweringitoffofthevehicle’sbattery field was our MarineLogistics spokendialoguesystem. asillustratedinFigure1. (Onlyonelaptopwasdamaged This application sought to connect the Marine in the during testing.) The mobile agents were able to easily field tothe SmallUnit Logistics(SUL)database, which traversea retransmissionlinkandreachthe remotedata maintainscurrentinformationaboutsupplyrequisitions. source. Warfighters wanted to be able to place requests as well Dealingwithhugelyvaryingbackgroundnoisesources as check on the status of existing requests without the was less of a problem than originally predicted. Fortu- need of additional hardware or communicating with a nately, most of the time that one of these loud events third party. It was also highly desirable to use existing wouldoccur,userswouldsimplystoptalking.Theirhear- communicationsprocedures,so that the trainingtime to ingwasimpairedandsotheywouldwaitforthenoiseto use the system was minimized. The system needed to abateandthencontinuethedialogue. Ontheotherhand, bespeaker-independentandmixedinitiativeenablingthe wedidencounterseveraluserswho,becauseoftheLom- warfighterstodevelopasenseoftrustinthetechnology. bard effect, insisted upon always yelling at the system. Marinesusingthesystemwereabletoperformseveral While we did not measure decibel levels, there were a tasks. Theycouldcreatenewrequestsforsupplies,with few times when the system was not able to understand thesystempromptingthemforinformationneededtofill theuserbecauseofbackgroundnoise. in a request form. They could also modify and delete 3.2 ShipboardInformation previouslyplacedrequestsandcouldcheckonthestatus ofrequestsinoneoftwoways. Theycoulddirectlyask AnLCSsystemhasalsobeendevelopedtomonitorship- aboutthecurrentstatus, ortheycoulddelegateanagent board system information aboard the Sea Shadow (IX to monitor the status of a particular request. It was an 529), a Naval test platform for stealth, automation, and easy addition to the system to add a constraint that the control technologies. From anywhere on the ship, per- agentreturnafteraspecifiedtimeperiodifnoactivityoc- sonnelusetheon-boardintercomtocontactthissystem, cursontherequest,whichisalsovaluableinformationfor SUSIE(ShipboardUbiquitousSpeechInterfaceEnviron- theMarine. Thesedelegatedagentstravelacrossa low- ment),toaskaboutthestatusofequipmentthatislocated bandwidthSINCGARSradionetworkfromtheMarineto throughout the ship. Crew members do not have to be thedatabaseandaccessthatdatabasetoplace,alter,and anywhere near the equipment being monitored in order monitorsupplyrequisitions. toreceivedata. Figure2illustratesasailorusingSUSIE The challenges in deploying this system to the field throughtheship’sintercom. were twofold - building trust in the system so that it Personnel can ask about pressures, temperatures, and would becomepart of normaloperationsand in dealing voltages of various pieces of equipmentor can delegate 3.3 BattlefieldCasualtyReportingSystem We are currently developing a new LCS system known as the Battlefield Casualty Reporting System or BCRS. The goal of this system is to assist military personnel in reporting battlefield casualties directly into a main database.Thisinvolvesintelligentdialoguetoreduceam- biguity, resolveconstraints, and refinesearches on indi- vidualnamesandthecircumstancessurroundingtheca- sualty. Priorknowledgeofeveryindividual’snamewill not be possible. The deploymentof this system will be againpresentmanychallengessuchasnoiseeffectsona battlefield, effects of stress on the voice, and the ability tointegrateintoexistingmilitaryhardware. 4 Future Work The areas of researchneededto addressneeds formore dynamicandrobust systems includebetter, morerobust orpartialparsingmechanisms.Inaddition,systemsmust Figure 2: Sailor interacting with SUSIE through the beabletocopewithmulti-sentenceinputs,includingthe ship’sintercom system’s ability to insert back channel activity. Ease of domainexpansionisimportantassystemsevolve. Vary- ingenvironmentalfactorsmeanthatthesystems require monitoringthosemeasurements(sensorreadings)to the additional noise adaptation or mitigation techniques, in system. A user can request notification of an abnormal addition,theabilitytoswitchmodesofcommunicationif readingbyasensor. ThiscausestheLCSsystemtodele- oneisnotappropriateatagiventime. gateapersistentagenttomonitorthesensorandtoreport the requested data. Should an abnormal reading occur, 5 Conclusions theuseriscontactedbythesystem,againusingtheinter- We have discussed the pragmatics involved with taking com. an SLS system out of the laboratoryor away fromtele- This domain presented several challenges and oppor- phonyandplacingitinavolatileenvironment.Thesesys- tunities. Throughnumerous discussions with users and temshavetoberobustandbeabletocopewithvarying presentation of possible dialogues, we learned that the inputstylesandmodesaswellasbeabletomodifytheir users would benefit from a system ability to remember, outputto the appropriatesituation. In addition,the sys- between sessions, the most recent activity of each user. temsmustbeanintegralpartofthetechnologythatisin This would permit a user to simply log in and request: currentuseandbeabletowithstandadverseconditions. ”What aboutnow?”. SUSIE would determinewhat had Satisfyingalloftheseconstraintsinvolvesactivepartici- been this user’s most recent monitoring activity, would pationin thedevelopmentprocesswiththeend-usersas thenseekoutthecurrentstatus,andthenreportit. While wellascreativesolutionsandtechnologicaladvances. this seems quite simple, there is significant behind-the- scenesworktostorecontextandmaketheinteractionap- 6 Acknowledgments pearseamless. This work was supported by DARPA contract N66001- Itwasnecessarytousetheorganicintercomsystemin- 01-D-6011. stalledintheSea Shadowforcommunicationwithcrew members. Collecting speech data through the intercom systemtopasstoSUSIErequiredlinkingtwoDSPs(and References adjustingthem)tothe hardwareforthe SLS.Oncecon- nected in, the next significant challenge was that of the JodyDaniels,SusanRegli,andJerryFranke. 2002. Sup- varyingnoiselevels. Backgroundnoisevariedfromone port for intelligent interruption and augmented con- roomtothenextandevenwithinasinglespace.Wewere textrecovery. InProceedingsof7thIEEEConference not able to use a push-to-talk or hold-to-talk paradigm onHumanFactorsandPowerPlants,Scottsdate,AZ, becauseoftheinconveniencetothecrewmembers;they September. leavetheintercomopenforthedurationoftheconversa- tion. Fortunately, the recognizer(built on MIT’s SUM- StephanieSeneff,RayLau,andJoePolifroni. 1999. Or- MIT) is able to handle a great deal of a noise and still ganization,communication,andcontrolinthegalaxy- hypothesizeaccurately. Toimprovethesystemaccuracy, ii conversational system. In Proceedings for Eu- wewillincorporateautomaticretrainingoftherecognizer rospeech’98,Budapest,Hungary. onnoiseeachtimethatanewsessionbegins.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.