Human-Centric Interfaces for Ambient Intelligence

Academic Press is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
84 Theobald’s Road, London WC1X 8RR, UK

Copyright © 2010, Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: permissions@elsevier.com. You may also complete your request online via the Elsevier homepage (http://www.elsevier.com), by selecting “Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.”

Library of Congress Cataloging-in-Publication Data
Application submitted

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 978-0-12-374708-2

For information on all Academic Press publications visit our Web site at www.elsevierdirect.com

Printed in the United States of America
09 10 11 12  6 5 4 3 2 1

Foreword

In the early days of cinema, its inventors produced simple film clips. Because little was known about people’s perceptions, behaviors, expectations, and reactions, the field did not go far. The maturation of film as a medium occurred only when engineers and scientists began to work hand in hand with designers and artists to achieve a balance between science, engineering, and art. Like early film, today’s many new technologies do not exist in a vacuum. Rather, they powerfully affect people at work, at home, and on the street, both individually and socially, impacting the way we interact with each other, the way we design and construct our buildings and cities, and the way we conduct daily life.
However, many of these technologies are not well designed for the masses, and many are implemented without fully taking into account the way people perceive and interact with information, and how their use may influence social behavior. This renders them much less effective than they could be.

Because of the pervasive nature and encompassing character of technologies that interface with people in different daily life environments, those who design, develop, and implement them have an important—and unprecedented—responsibility to incorporate user concerns and behavior norms in their design, development, and implementation efforts. Technologists are good in their respective areas, such as writing code and designing systems, but they generally do not have the necessary understanding of or experience in how people perceive information, interact socially, and use and interact with technology.

Human-centered computing (HCC) has emerged from the convergence of multiple disciplines and research areas that concern understanding human behavior, human communication, and the design of computational devices and interfaces. These areas include computer science, sociology, psychology, cognitive science, engineering, the arts, and graphic and industrial design.

Human-Centric Interfaces for Ambient Intelligence addresses these broad areas within the framework of Ambient Intelligence (AmI). Its fundamental message is twofold:

- Serving the user should be the central aim of an AmI application.
- A system should not demand specific training or technical knowledge on the part of the user if the intention is to achieve natural and efficient interaction.

Presented here is a snapshot of the state of the art in human-centric interface (HCI) design for ambient intelligence that lays out the fundamental concepts and introduces recent advances in practical application.
The editors and contributors are well-known experts in signal and speech processing, computer vision, multimodal analysis, interface design, human–computer interaction, and related fields. The book is an excellent resource for researchers, students, and practitioners in human-centered computing in general and in ambient intelligence and multimodal human-centric interfaces in particular.

Nicu Sebe
University of Trento, Italy and University of Amsterdam, The Netherlands
May 2009

Preface

Ambient intelligence (AmI) is a fast-growing multidisciplinary field that is ushering in new opportunities for many areas of research to have a significant impact on society. Its foundation is the enrichment of the environment, through sensing and processing technologies, to understand, analyze, anticipate, and adapt to events and to users and their activities, preferences, intentions, and behaviors. Basically, AmI gathers real-time information from the environment and combines it with historical data accumulated over time, or a knowledge base, to provide user services.

Interfacing with the user is a major aspect of any AmI application. While interfaces may employ different technologies, the underlying notion of user centricity dominates their design. In AmI, the traditional paradigm of human–computer interfaces, in which users must adapt to computers by learning how to use them, is replaced by a new one, in which computers adapt to users and learn how to interact with them in the most natural way. The monopolar emphasis on the user in the new terminology (human-centric) reflects the shift from the bipolar phrase (human–computer) used in the traditional terminology.

Considering that serving the user is the central aim of the AmI application and that AmI systems should not demand special training and technical knowledge on the user’s part, user interface design is of paramount importance.
This book offers a description of the state of the art in human-centric interface design for AmI applications, focusing not only on fundamental concepts but also on recent advances in practical applications. The different parts of the book provide a perspective on the research potential in the field through studies on visual, audio, and multimodal interfaces and applications in smart environments.

AMBIENT INTELLIGENCE

AmI has been presented not just as another step toward embedding technological advances in society but also as a new computing paradigm that will revolutionize the way we conceive the relationship between computing systems and users. At present, in order to benefit from traditional computing devices, users must have some degree of knowledge and experience. This restricts the groups of people that can benefit from computing power and in some cases has resulted in the creation of the so-called “digital divide.”

Integration of multiple sensors in a distributed human-centric interface embodies new research and development opportunities in algorithm design based on collaborative sensing and processing, data fusion, event interpretation, context extraction, and behavior modeling. The development of proper algorithmic interfaces between quantitative information units (in charge of sensing and processing) and high-level qualitative information units (in charge of context and behavior models) is another area of growing interdisciplinary interest within the field of ambient intelligence. Figure 1 is an example of the multiple layers of processing and reasoning leading to the accumulation of knowledge from user observations and the deduction of a user behavior model based on activities and events observed.

HUMAN-CENTRIC DESIGN

Human-centric design has been interpreted in a few different ways depending on the application or technology context.
For example, in vision-based reasoning, where employment of cameras may have implications for user privacy, “smart cameras” can abstract information by local processing and then delete the images captured. Another interpretation of human-centric design is systems that are easy or intuitive to use without the need for training. This has paramount implications for adoption of the technology by the masses and for reaching segments of the community on the other side of the “digital divide.” A final example of human centricity relates to smart environments that sense and interpret user-based events and attributes without requiring wearable sensors.

FIGURE 1: Multilayer data processing and reasoning for modeling user behavior and preferences in an AmI application (layers: sensors and network; processing; user interface with pose/activity detection and 3D modeling; observation accumulation and validation; high-level reasoning over context, events, behavior, and adaptation; behavior models and communication modes; user preferences and knowledge).

Yet, in the context of applications that are designed to assist users with physical disabilities, human-centric design may take on a totally different meaning, one in which, for example, a system may provide assistance to its user, employing the same interfaces developed for security or biometric systems such as iris detection, eye tracking, or fingerprint identification (biometric systems are often regarded as intrusive as they are mostly used in network-centric surveillance applications). In such applications, both the system and the user will go through a training phase to learn each other’s behavior and response mechanisms over time.

An important consideration in introducing new technologies into daily life is the social implications created, which can potentially promote, inhibit, or in different ways reshape the way a technology is adopted and utilized by various user segments.
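The “smart camera” interpretation described above can be illustrated with a minimal sketch: a node that analyzes each frame locally and forwards only abstracted events, never raw images. All class, event, and detector names here are hypothetical, chosen for illustration rather than taken from the book.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Abstracted observation that may leave the camera node."""
    kind: str         # e.g., "person_entered", "fall_detected"
    confidence: float

class SmartCamera:
    """Processes frames locally; raw images are discarded after analysis."""

    def __init__(self, detector):
        self.detector = detector  # any callable: frame -> list[Event]

    def observe(self, frame) -> list:
        events = self.detector(frame)
        del frame                 # the raw image is never stored or transmitted
        return events

# Usage: a stub detector stands in for real vision processing.
def stub_detector(frame):
    return [Event("person_entered", 0.9)] if frame.get("motion") else []

camera = SmartCamera(stub_detector)
print(camera.observe({"motion": True}))   # only abstracted events leave the node
```

The design choice mirrors the privacy argument in the text: because abstraction happens at the source, downstream modules can only ever see events, not imagery.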
The pace of technology development often limits a developer’s timely access to usage models obtained from observations in the field. However, such information may offer vital design paradigm clues at the early stages of technology development. For example, privacy concerns of users of a camera-based assisted-living application can guide the developer in designing a system based on smart cameras employing local processing in such a way that no image data is sent out. This decision will in turn have implications for the types of vision-processing algorithms needed, as well as the communication bandwidth and latency factors to be considered.

Thus, while the notion of human centricity finds many interpretations, its true meaning goes beyond any of the individual aspects that have been discussed in the literature. It truly refers to a new paradigm in the development and use of technology to serve the user in whatever form and flavor best offer the intended experience of the application of interest. In the new paradigm, privacy management, ease of use, unobtrusive design, and customization can each be part of the definition of a human-centric interface.

These factors are not regarded as rigid requirements when the essence of human centricity is considered, meaning that different levels of each factor and relative priorities among them must be derived from the context and objective of the application. For example, the same vision-sensing mechanism that may be perceived as intrusive when used in surveillance applications can be employed in a human-centric paradigm to improve the quality of life of patients and the elderly by timely detection of abnormal events or accidents.

Figure 2 illustrates the new application design space based on novel concepts and methods in user-centric smart environments, ambient intelligence, and social networks.
Many applications can be enabled in this space, through proper interfacing of a sensing system, a processing and reasoning engine, a networking infrastructure, and an interactive response system, when a user-centric framework is adopted for technology development.

FIGURE 2: User-centric design space enabling novel application domains (smart environments: sense, perceive, interpret, project, react, anticipate; ambient intelligence: adaptive, unobtrusive, context-based user services tailored to learned user preferences, linked by user adaptivity and behavior models; social networks: multi-modal, pervasive connectivity for sharing an experience with others in a flexible, layered visual representation).

VISION AND VISUAL INTERFACES

Along with other sensing modalities, cameras embedded in a smart environment offer much potential for a variety of novel human-centric interfaces through the provisioning of rich information. Vision-based sensing fits well with pervasive sensing and computing environments, enabling novel interactive user-based applications that do not require wearable sensors. Access to interpretations of human pose and gesture obtained from visual data over time enables higher-level reasoning modules to deduce the user’s actions, context, and behavior, and to decide on suitable actions or responses to a given situation.

Local processing of acquired video at the source camera facilitates operation of scalable vision networks by avoiding transfer of raw images. Additional motivation for distributed processing stems from an effort to preserve the user’s privacy by processing the data near the source. Vision networks offer access to quantitative knowledge about events of interest such as the user’s location and other attributes. Such quantitative knowledge can either complement or provide specific qualitative distinctions for AmI-based functions.
In turn, qualitative representations can provide clues on which features would be of interest to derive from the visual data, allowing the vision network to adjust its processing operation according to the interpretation state. In this way the interaction between the vision-processing module and the reasoning module in principle enables both sides to function more effectively. For example, in a human gesture analysis application, the observed elements of gesture extracted by the vision module can assist the AmI-based high-level reasoning module in its interpretative tasks, while the deductions made by the reasoning system can provide feedback to the vision system from the available contextual or behavior model knowledge to direct its processing toward the more interesting features and attributes. Figure 3 illustrates these interactions.

FIGURE 3: Example interactions between vision-processing and high-level reasoning modules (a multi-camera network with vision processing feeds event interpretation, context, and behavior models in the high-level reasoning module; feedback flows back for individual cameras, such as features, parameters, and tasks, and for the network, such as task dispatch, resource allocation, and scheduling).

Various technologies have been explored for receiving explicit or implicit user input. Many applications interact with the user through the presentation of information on displays. Supporting remote collaboration among users is a growing area of development. In collaborative applications the participants may use gestures or a touch screen to communicate data or expressions (see Chapter 1, Face-to-Face Collaborative Interfaces).
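The bidirectional interaction just described, with quantitative features flowing up to the reasoning module and contextual feedback flowing back down to direct the cameras, can be sketched in a few lines. The class and method names, and the toy rule inside them, are illustrative assumptions rather than an architecture specified by the book.

```python
class VisionModule:
    """Extracts features from frames; its focus can be redirected by feedback."""

    def __init__(self):
        self.features_of_interest = {"motion"}   # default processing task

    def process(self, frame: dict) -> dict:
        # Report only the features the reasoning side currently cares about.
        return {f: frame.get(f) for f in self.features_of_interest}

class ReasoningModule:
    """Interprets features and returns feedback directing future processing."""

    def interpret(self, observation: dict) -> set:
        # Toy rule: once motion is seen, ask the cameras for gesture features too.
        if observation.get("motion"):
            return {"motion", "hand_pose"}
        return {"motion"}

# One cycle of the loop: observe, interpret, adapt.
vision, reasoning = VisionModule(), ReasoningModule()
obs = vision.process({"motion": True, "hand_pose": "wave"})      # coarse pass
vision.features_of_interest = reasoning.interpret(obs)           # feedback path
obs2 = vision.process({"motion": True, "hand_pose": "wave"})     # refined pass
```

After the feedback step, the second pass extracts the gesture feature that the first pass ignored, which is the essence of the interpretation-state-driven adjustment described above.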
The availability of inexpensive sensing and processing hardware and powerful software tools has created unprecedented opportunities in applications for interactive art, offering new visual interfaces, driving experimentation based on the combination of art and technology, and enabling a new domain for expressiveness in both audience-interactive and performance art (see Chapter 2, Computer Vision Interfaces for Interactive Art).

Many interactive applications obtain valuable information from the user’s facial analytics. Gaze, with its various behavioral elements, plays an important role in social interaction. A person’s area of interest, duration of attention, facial expressions accompanying a gaze, and even mental state can be inferred by studying gaze through a vision-based interface. Such inference can often provide a situation-aware response or facilitate acquisition of the user’s behavior model within the prevailing application context (see Chapter 3, Ubiquitous Gaze: Using Gaze at the Interface).

Vision-based processing normally leads to the measurement of a set of quantitative parameters from the environment under observation. As applications in ambient intelligence start to demand a higher level of cognition, they call for the abstraction of acquired information, which is often obtained under uncertainty, to a semantic level, and even possibly its description using a natural language (see Chapter 4, Exploiting Natural Language Generation in Scene Interpretation).

Description of visual interpretation results in human-centric interfaces can also be symbolically represented through a language whose elements reflect human actions to a fine granularity. Such a language can offer a universal platform for describing human activities in a variety of applications (see Chapter 5, The Language of Action: A New Tool for Human-Centric Interfaces).
SPEECH PROCESSING AND DIALOGUE MANAGEMENT

Speech processing involves a number of technologies to enable speech-based interaction between computers and humans. These include automatic speech recognition, speaker recognition, spoken language understanding, and speech synthesis. Dialogue management aims to provide a speech-based interaction that is as natural, comfortable, and friendly as possible, especially taking into account the state-of-the-art limitations of automatic speech recognition. Interfaces supporting speech processing technologies are appealing in human-centric applications, as they enable, for example, turning lights on or off by talking directly into a microphone or ambiently to speech sensors embedded in the environment.

Automatic speech recognition (ASR) is the basis of a speech-based interface. However, in spite of advances made in recent years, the performance of ASR systems degrades drastically when there is a mismatch between system training and testing conditions. Hence, it is necessary to employ techniques to increase the robustness of these systems so that they can be usable in a diversity of acoustic environments, considering different speakers, task domains, and speaking styles (see Chapter 6, Robust Speech Recognition Under Noisy Ambient Conditions).

Speaker recognition is the process of the identification of the current user by the system through speech signals. This is important in human-centric AmI interfaces in order to adapt the interface to the preferences and/or needs of the current user and to optimize its performance (see Chapter 7, Speaker Recognition in Smart Environments).

The goal of spoken language understanding is to infer a speaker’s intentions in order to build intelligent interfaces. This is a challenging topic not only because of the inherent difficulties of natural language processing but also because of the possible existence of recognition errors in the sentences to be analyzed (see Chapter 8, Machine Learning Approaches to Spoken Language Understanding).
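As a toy illustration of why intent inference can tolerate some recognition errors (this is a deliberately naive keyword-overlap sketch, not a method from the book; the intents and keywords are made up), a classifier that scores overlap with a small keyword set simply ignores words it does not need, including misrecognized ones:

```python
# Hypothetical intents and their trigger keywords for a smart-home domain.
INTENTS = {
    "lights_on":  {"turn", "on", "light", "lights"},
    "lights_off": {"turn", "off", "light", "lights"},
}

def infer_intent(transcript: str):
    """Pick the intent whose keywords overlap most with the ASR transcript."""
    words = set(transcript.lower().split())
    best, best_score = None, 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)   # count shared keywords
        if score > best_score:
            best, best_score = intent, score
    return best                         # None when nothing matches

# Even with a misrecognized word ("lite"), enough keywords survive.
print(infer_intent("please turn on the lite lights"))  # lights_on
```

Real spoken language understanding systems use statistical models rather than keyword lists, but the robustness principle is the same: the decision depends on aggregate evidence, not on every word being recognized correctly.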
Dialogue management techniques are fundamental in speech-based interfaces given the current limitations of state-of-the-art ASR systems. These techniques enable the interface to decide whether it must ask the user to confirm recognized words, clarify the intended message, or provide additional information. For example, the user may say “Turn on the light” in a room where there are several lamps, requiring the interface to ask for clarification (see Chapter 9, The Role of Spoken Dialogue in User–Environment Interaction).

The goal of speech synthesis is to enable the speech-based interface to “talk” to the user. However, even though significant advances have been made in recent years, current speech synthesis systems are far from offering the same flexibility that humans have. They can produce speech that is arguably pleasant to human ears, but they are limited in a number of aspects, such as their affective processing capabilities and their adaptation of synthesized output to different environments and user needs (see Chapter 10, Speech Synthesis Systems in Ambient Intelligence Environments).
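The “several lamps” example can be sketched as a minimal clarification policy: act when the reference is unambiguous, ask otherwise. The device inventory and function name are hypothetical, invented only to make the example concrete.

```python
# Hypothetical room inventory: spoken target -> matching devices.
DEVICES = {"light": ["desk lamp", "ceiling lamp", "floor lamp"]}

def dialogue_step(command_target: str) -> str:
    """Act, ask for clarification, or report failure, based on match count."""
    matches = DEVICES.get(command_target, [])
    if not matches:
        return f"Sorry, I found no {command_target} to control."
    if len(matches) > 1:
        # Ambiguous reference: ask the user to disambiguate.
        return "Which one: " + ", ".join(matches) + "?"
    return f"Turning on the {matches[0]}."

print(dialogue_step("light"))
# Which one: desk lamp, ceiling lamp, floor lamp?
```

A real dialogue manager would also weigh ASR confidence when deciding between confirming and clarifying, but the branch structure above captures the core decision the text describes.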

Description:
To create truly effective human-centric ambient intelligence systems, both engineering and computing methods are needed. This is the first book to bridge data processing and intelligent reasoning methods for the creation of human-centered ambient intelligence systems.
