ebook img

Framework for Electroencephalography-based Evaluation of User Experience PDF

4.3 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Framework for Electroencephalography-based Evaluation of User Experience

Framework for Electroencephalography-based Evaluation of User Experience Je´re´myFrey MaximeDaniel JulienCastet Univ. Bordeaux,France ImmersionSAS,France ImmersionSAS,France [email protected] [email protected] [email protected] MartinHachet FabienLotte Inria,France Inria,France [email protected] [email protected] 6 1 0 2 n a J 2 1 ] C H Figure1: Wedemonstratehowelectroencephalographycanbeusedtoevaluatehuman-computerinteraction. Forexample,akeyboard(left)canbe . s comparedwithatouchinterface(middle)usingacontinuousmeasureofmentalworkload(right,hereparticipant4). c [ ABSTRACT INTRODUCTION 1 Measuringbrainactivitywithelectroencephalography(EEG) In practice, a tool is only as good as one’s ability to assess v 8 is mature enough to assess mental states. Combined with it. Forinstance,evaluationsinHuman-ComputerInteraction 6 existing methods, such tool can be used to strengthen the (HCI)usuallyrelyoninquiries–e.g.,questionnairesorthink 7 understanding of user experience. We contribute a set of aloudprotocols–oronusers’behaviorduringtheinteraction– 2 methodstoestimatecontinuouslytheuser’smentalworkload, e.g.,reactiontimeorerrorrate. However,whilebothtypesof 0 attentionandrecognitionofinteractionerrorsduringdifferent methodshavebeenusedsuccessfullyfordecades,theysuffer . interactiontasks. Wevalidatethesemeasuresonacontrolled fromsomelimitations. Inquiriesarepronetobecontaminated 1 0 virtualenvironmentandshowhowtheycanbeusedtocompare by ambiguities [23] or may be affected by social pressure 6 differentinteractiontechniquesordevices,bycomparinghere [26]. Itisalsoverydifficulttogainreal-timeinsightswithout 1 a keyboard and a touch-based interface. Thanks to such a disruptingtheinteraction. Indeed, thinkaloudprotocoldis- : framework,EEGbecomesapromisingmethodtoimprovethe tractsusersandquestionnairescanbegivenonlyatspecific v overallusabilityofcomplexcomputersystems. timepoints,usuallyattheendofasession–whichleadsto i X abiasduetoparticipants’memorylimitations[17]. Onthe r otherhand,metricsinferredfrombehavioralmeasurescanbe a AuthorKeywords computedinreal-time,butaremostlyquantitative. Theydo EEG;HCIEvaluation;Workload;Attention;Interaction notprovidemuchinformationaboutusers’mentalstates. For errors;Neuroergonomy example,ahighreactiontimecanbecausedeitherbyalow concentrationlevelorbyadifficulttask[1,14]. ACMClassificationKeywords Recently,ishasbeensuggestedthatportablebrainimaging H.5.2UserInterfaces: Evaluation/methodology techniques–suchaselectroencephalography(EEG)andfunc- tionalnearinfraredspectroscopy(fNIRS)–havethepotential toaddresstheselimitations[8,27,11].fNIRShasbeenstudied toassessusers’workload,forexampletoevaluateuserinter- faces[15]ordifferentdatavisualizations[25]. However,most PublicationrightslicensedtoACM.ACMacknowledgesthatthiscontributionwas oftheseworksareevaluatingpassivetasksorverybasicin- authoredorco-authoredbyanemployee,contractororaffiliateofanationalgovern- ment.Assuch,theGovernmentretainsanonexclusive,royalty-freerighttopublishor teractionsthatarenotecological,especiallyasuserinterfaces reproducethisarticle,ortoallowotherstodoso,forGovernmentpurposesonly andinteractionsaregettingmorecomplex. CHI’16,May07-12,2016,SanJose,CA,USA Copyrightisheldbytheowner/author(s).PublicationrightslicensedtoACM. ACM978-1-4503-3362-7/16/05...$15.00 DOI:http://dx.doi.org/10.1145/2858036.2858525 In this paper, not only do we detail a framework for HCI Errorrecognitionrelatestothedetectionbyusersofanout- designerstogainfurtherinsightsaboutuserexperienceusing come different from what is expected [22]. We focused on brain signals – here from EEG – but we also validate this interaction errors [9], which arise when a system reacts in frameworkonanactualandrealistictask. Weshowthatthey an unexpected way, for example if a touch gesture is badly canbeusedtocomparedifferentHCI,inanenvironmentthat recognized. Interactionerrorsenabletoassesshowintuitive providesmanysimulationsandwhereparticipantsareengaged a UI is, and they are hardly measurable by another physio- inrichinteractions. logicalsignalthanEEG.Thecombinedmeasureofworkload, attentionanderrorrecognitionconstitutesapowerfulcomple- Inthefollowingsections,wedescribethevirtualenvironment mentaryevaluationtoolforpeoplewhodesignnewinteraction that we developed, specifically aimed at validating the use techniques. ofEEGasanevaluationmethodforHCI.Wevalidatedthe workload induced by our environment in a first study, with Eventhoughcommercialsolutions,suchastheB-Alertsys- NASA-TLXquestionnaires[14]. Then,wedetailhowEEG tem1, are already pointing this direction, they are validated canbeusedtoassess3differentconstructs: workload,atten- in the literature with lab tasks only [1]. Our work, on the tionanderrorrecognition. Finally,duringthemainstudywe otherhand,isclosertothefield. Moreimportantly,asopposed employEEGrecordingstomeasurecontinuouslysuchwork- toproprietarysoftware,ourmethodologyistransparentand load,altogetherwiththeattentionlevelofparticipantstoward ourmultidimensionalindexofuserexperiencecanbeeasily externalstimuliandthenumberofinteractionerrorstheyper- replicated. ceived,whiletheyinteractwithourvirtualenvironment. We tooktheopportunityoftheseEEG-basedmeasurestocompare VIRTUAL3DMAZE akeyboardinputtoatouchscreeninput. The3Dvirtualenvironmentthatwebuiltusesgamification[7] toincreaseusers’engagementandensurebetterphysiological Tosummarize,ourmaincontributionsare: recordings[10]. Suchavirtualenvironmentalsoenablesusto 1. TovalidatetheuseofEEGasacontinuousHCIevaluation assessworkload,attentionanderrorrecognitionduringeco- methodthankstocontrolledandrealisticinteractiontasks logicalandrealisticinteractiontasks. Indeed,suchconstructs wedesigned. aretraditionallyevaluatedduringcontrolledlabexperiments 2. Todemonstratehowsuchtoolcanassesswhichofthetested basedonprotocolsfrompsychologythatarevastlydifferent interactiontechniqueisbettersuitedforaparticularenvi- fromanactualinteractiontask,see,e.g.,[12]. ronment. 3. To propose a framework that can be easily replicated to Overalldescription improveexistinginterfaceswithlittleornomodifications. Thevirtualenvironmenttakestheformamazewhereplayers havetolearnandreproduceapathbytriggeringdirectionsat regularintervals(seeFigure2). Acharacterdisplayedwitha thirdpersonperspectivemovesbyitselfatapredefinedspeed Relatedwork insideorthogonaltunnels. Soonafterthecharacterentersa Since a few years, brain imaging has been used in HCI to new tunnel, symbols appear on-screen. Those symbols are deepentheunderstandingofusers,thanksnotablytothespread basic2Dshapes,suchassquares,circles,triangles,diamonds ofaffordableandlightweightdevices. Forinstance,EEGand or stars, and their positions (bottom, top, left or right) indi- fNIRSareparticularlywellsuitedformobilebrainimaging catewhichdirectionsare“opened”. Playersmustselectone [20,6]. Thesetechniquescanbeemployedtostudythespatial ofthesesymbolsbeforethecharacterreachestheendofan focusofattention[32]ortoidentifysystemerrors[33,29]. intersection,eitherbypressingakeyortouchingthescreen. Ifusersrespondtooearly,i.e.,beforesymbolsappeared,too EEG and fNIRS are opportunities to assess the overall us- late,oriftheyselectadirectionthatdoesnotexist,theyloose ability of a system and improve the ergonomics of HCI. In points and the character “dies” by smashing against a wall, a recent work we showed preliminary results regarding the respawningsoonafteratthebeginningofthecurrenttunnel. evaluationofmentalworkloadduringa3Dmanipulationtask [35]. Afterbeingprocessed,EEGsignalshighlightedwhich Themainelementofthegameplayconsistsinselectingthe partsoftheinteractioninducedahighermentalworkload. In directionsinthecorrectorder. Indeed,onelevelcomprised thepresentpaperwegomuchbeyond,exploringacontinuous twophases. Duringthe“learning”phaseaparticularsequence indexofworkloadaswellastwoothersconstructs,namely ofsymbolsishighlighted;ateachsymbols’appearanceone attentionanderrorrecognition. Wealsorigorouslyvalidate ofthemisbouncingtoindicatethecorrectdirection. Another suchindexes,andstudytheminlightofbehavioralmeasures cuetakestheformofa“breadcrumbtrail”,abeamoflightthat (performancesandreactiontimes)andinquiries(NASATLX precedesthecharacterandpointstotherightdirection(see questionnaire). Figure2b). Selectinganavailablebutincorrectdirectiondoes notresultinthecharacter’s“death”butleadstoalossofpoints. Attentionreferstotheabilitytofocuscognitiveresourcesona Avisualfeedbackisgiventouserswhentheyselectadirection: particularstimulus[17]. InHCI,measuringtheattentionlevel thecorrespondingsymbolturnsgreenifthechoiceiscorrect couldhelptoestimatehowmuchinformationusersperceive. andredotherwise. Whentheendofthemazeisreached,the In the present work the measure of attention relates to inat- characterloopsovertheentirepathsothatplayershaveanother tentionalblindness;i.e.itconcernsparticipants’capacityto processstimuliirrelevanttothetask[4]. 1http://www.advancedbrainmonitoring.com/ (a) (b) (c) Figure2: Thevirtualenvironment,whereplayerscontrolacharacterthatmovesbyitselfinsidea3Dmaze. A:Symbolsappearineachtunnelto indicatethepossibledirectionsforthenextturn;playershavetoselectaparticularsequenceofsymbols/directions.B:Duringthe“learning”phase,the correctdirectionishighlightedbyabreadcrumbtrailandtheassociatedsymbolbounces(herethediscontop).C:Controlsdependonthepositionof thecharacter.Ifthecharacterisontherightside,playershavetopressrightinordertogoup. opportunitytolearnthesequence. Whenthetrainingphase symbolssequencetheyhavetolearn. Moresymbolstobe ends, the“recall”phasefollows. Thesymbolsareidentical heldintheworkingmemoryincreasesworkload[12,31]. butthecuesarenomoredisplayed;playershavetoremember • Numberofdirections:ateachintersection,upto4directions bythemselvestherightpath. Symbolspositionineachtunnel are“opened”inthemaze;thecomplexityofthesymbols andsymbolssequencearerandomlydrawnwhenanewlevel sequencegrowsasthisnumberincreases. starts. • Gamespeed: thepaceofthegamecanbeadjustedtoin- creasetemporalpressure. Whenthespeedincreasessym- Beside learning a sequence, the principal challenge comes bols appear sooner and users must respond quicker, thus fromhowthedirectionsareselected. Thethird-personview increasing overall stress [14, 19]. In the easiest level the fulfillsapurpose: theinputdevicethatusersarecontrolling characterspends6sinatunnelandplayersmustrespond –i.e.keyboardortouchscreen–ismappedtothecharacter within3saftersymbolsappearance;inthehardestlevela position. Sincethecharacterisafuturisticsurferthatdefies tunnellasts2sandplayershave1stochooseasymbol. thelawofgravity, itslidesbyitselffromthebottomofthe tunneltooneofthewallsortotheceilingfromtimetotime. • Spatialorientation: inordertokeepselectingthecorrect In this latter situation, when the character is upside down, directions, users have to perform a mental rotation if the commandsareinvertedcomparedtowhatplayersareusedto, charactertheycontroljumpsfromthefloortothewallsor eventhoughsymbolsremaininthesamepositions. Thisgame totheceiling. Furthermore,theyneedtoupdatetheirframe mechanismstressesspatialcognitionabilities;usershaveto ofreferenceasoftenasthecharactershiftsfromoneside constantlyremainawareoftwodifferentframesofreference. toanother. Dependingonthespatialabilityofusers,this Forexample,if“up”and“left”directionsareopeninagiven mechanismcancauseanimportantcognitiveload[28]. tunnelandifthecharacter’sposition–controlledbytheappli- We used those mechanisms and dimensions to create 4 dif- cation–isontherightwall,asillustratedinFigure2c,users ferentdifficultylevelsforthegame: “EASY”,“MEDIUM”, havetopressrighttogoup. Thisdiscrepancybetweeninput “HARD” and “ULTRA” (see Table 1). These levels affect and output is a reminder of the problematic often observed mostly(symbolic)memoryloadandtimepressure. Indeed, with3Duserinterfaces,wheremostusersmanipulateadevice the3Dmazeismoreaboutrememberingasequenceofsym- with2degreesoffreedom(DOF),suchasamouse,tointeract bolsordirectionsratherthanspatialnavigationperse.Because witha6DOFenvironment. randomizationcouldcreateloopsinthemazetopographyand Thecombinationofthegamedesignandgamemechanisms sincetherewerenolandmarks,itisunlikelythatparticipants hereindescribedoffersawidevarietyofelementsthatweput wereabletoadoptanallocentricstrategy. inusesoastoinvestigateusers’mentalstates.Inparticular,we WhiletheEASYlevelisdesignedtobecompletedwithvery detailbelowhowwetunedthegameelementstomanipulate littleeffort,theULTRAlevel,ontheotherhand,isdesigned theuser’sworkloadandattentionincontrolledwaysaswell tosustainaveryhighlevelofworkload,uptothepointthatit as to trigger interaction errors. Knowing which constructs isbarelypossibletocompleteitwithnoerror. Whileduring value(e.g.highorlowworkload)toexpect,wecanvalidate EASYlevelsthereisnoneedtoperformmentalrotationsand whether our EEG-based estimates during interaction match playershavetomemorizeonly2symbolsthatareconstrained theseexpectations,andthuswhethertheyarereliable. toeitherleftandrightdirections,inULTRAlevelstheframe ofreferencechangesbetweeneachselectionandthesequence Manipulatingworkload reaches5symbolsthatcouldappearinall4directions,and Ourvirtualenvironmentpossessesseveralcharacteristicsthat playershavetoreactthriceasfast. Nomatterthelevel,players couldbeusedtoinducedifferentlevelsofmentalworkload. had3“loops”tolearnthemazeandanothersetof3loopsto Wecannotablyadjust4parameters: reproducethepath. • Mazedepth: thenumberoftunnelsplayershavetocross beforereachingtheendofthemaze,hencethelengthofthe pointingisco-locatedwithsoftwareevents,sinceuserscan Difficulty Depth Directions Resp. time Orientation directly indicate where they want to interact. However, in EASY 2 2 3s 0% our case, we decided to mimic exactly the behavior of the MEDIUM 4 3 2.5s 30% keyboardinterface. Thatistosaythatwiththetouchscreen HARD 5 4 2s 60% aswellplayershavetoorientatethemselvesdependingonthe ULTRA 5 4 1s 100% positionofthecharacter. Hence,ifthecharacterispositioned Table1:Fourdifficultylevelsarecreatedbyleveragingongamemech- ontheleft,playershavetotouchtherightfringeofthescreen anisms.Depth:numberofdirections/symbolsplayershavetolearn.Di- inordertogoup.Thisismostlycounter-intuitivesinceplayers rections: numberofpossibledirectionsateachintersection. Response havetoinhibittheurgetopointtotheactualdirectionthey time:howmuchtimeplayershavetorespondaftersymbolsappearance. wanttogo;thereisacognitivedissonance. Orientation:percentagechancethatthecontrolledcharacterchangesits orientation. Sinceinourexperimentaldesigntheuseofthedirect(touch- based) interaction is counter-intuitive, we hypothesize that itwillleadtoanoverallhighernumberofinteractionerrors Assessingattention comparedtotheindirectinterface(keyboard). We relied on stimuli not congruent to the main task in or- dertoprobeforinattentionalblindness,usingthe“oddball” PILOT STUDY: VALIDATION OF THE INDUCED WORK- paradigm. TheoddballparadigmisoftenemployedwithEEG astheappearanceofrare(i.e. “odd”)stimuliamongastream LOADLEVEL offrequentstimuli(i.e.distractors)triggersaparticularevent- Wedesignedourvirtualenvironmentasatest-benchaimedat related potential (ERP) within EEG signals [5]. ERP are inducingseveralmentalstateswithinusers,notably,different “peaks”and“valleys”inEEGrecordings,andtheamplitudeof workloadlevels. Thus,wehadtoformallyvalidatethemental someofthemdecreasesasusersarelessattentivetostimuli. workloadthateachgamelevelseekstoinduce. Assuch,we conducted a pilot study – separate from the main study to Our protocol uses audio stimuli. It is based on [3], which alleviate the protocol of the latter –, with no physiological studiedtheimmersionofvideogameplayers. Inourvirtual recordings but using the NASA-TLX questionnaire [14], a environment,whileusers’characterswerenavigatinginthe wellestablishedquestionnairethataccountsforworkload. maze, sounds were played at regular intervals, serving as a background “soundtrack” that was consistent with the user Protocol experience. 20%ofthesesoundshadahighpitch(oddevent) 15participantstookpartinthisstudy(4females),meanage andtheremaining80%hadalowpitch(distractingevents)– 24.53(SD:3.00). Weusedawithin-subjectdesign;allpartici- thisproportionisonparwiththeliterature[3,9]. pantsansweredforall4difficultylevels. Thegamingsession Ourhypothesisisthattheattentionlevelofparticipantstoward used the keyboard and started with 2 “training levels”, that sounds – as measured with the oddball paradigm – should introducedparticipantstothegamemechanisms. Inthefirst decreaseastheworkloadincrease,sincemostoftheircognitive traininglevel,playerslearnedtheobjectiveofthegame. In resourceswillbeallocatedtothemaintaskduringthemost thesecondtraininglevel,theydiscoveredhowthecharacter demandinglevels. couldchangeitsorientationbyitself. Afterthistrainingphase, participantscontinuedwiththemainphaseoftheexperiment. Assessingerrorrecognition Duringthemainphaseoftheexperiment,participantsplayed EEGcouldbeusedtomeasureinteractionerrors,i.e.errors once each of the four levels (EASY, MEDIUM, HARD or originatingfromanincorrectresponseoftheuserinterface, ULTRA), in a random order. Immediately after the end of thatdiffersfromwhatuserswereexpecting[9]. Interaction alevel,participantsweregivenaNASA-TLXquestionnaire errorsareofparticularinterestforHCIevaluationsincethey to inquire about their mental workload. The questionnaire couldaccountforhowintuitiveaninterfaceis[11]. Inorderto took the form of a 9-points Likert scale. As in the original testthefeasibilityofsuchmeasure,wedecidedtoimplement questionnaire[14],itcomprised6items,thatassessedmental twodifferentinteractiontechniques. Bothofthemusediscrete demand, physical demand, temporal demand, performance, events–i.e.symbols’selection–sothatwecouldmoreeasily effortandfrustration. Theexperimentlastedapproximately synchronizeEEGrecordingswithin-gameeventslateron. 25minutesandfinishedonceparticipantsplayedall4levels Thefirsttechniqueusesindirectinteractionsbythemeanofa andfilledthecorrespondingNASA-TLXquestionnaire. keyboard(Figure1,left). Induetime,left,right,upordown arrow keysare used to sendthe character inthe tunnel that Results is situated to its left, right, top or bottom. Indeed, we have Foreachparticipantandeachlevelofdifficulty,weaveraged seenpreviouslythatinourvirtualenvironmentplayershaveto the6itemsoftheNASA-TLXquestionnaireandnormalized orientatethemselvesdependingofthepositionofthecharacter. thescalesfrom[1;9]to[0;1]–exceptforthe“performance” Ifthecharacterismovingonthesides,playershavetoperform item,thatwasnormalizedfrom[1;9]to[1;0]becauseitsscale amentalrotationof90°,ifitisontheceilingthentheangleis isinreverseordercomparedtotheotheritems(“1”for“good” 180°,i.e.commandsareinverted. and“9”for“poor”). Thesecondtechniqueusesdirectinteractionbythemeanofa The resulting averaged scores are: EASY: 0.11 (SD: 0.09); touchscreen(Figure1,middle). Usually,withtouchscreen, MEDIUM:0.32(SD:0.17);HARD:0.43(SD:0.13);ULTRA: 1.00 ofsuchconstructsduringdifferentandcomplexinteraction tasks. 0.75 Concerningattention,wedidnotdevelopadedicatedtaskper e TLX scor0.50 sgeraftoerdittsocoaulirbvrairtitouna.lSeninvcireotnhmeaeundt,iowpersoibmesplwyeureseadlraeasdpyecinifitec- A− levelofourgame. S NA l 0.25 Calibrationofworkload WeusedtheprotocolknownastheN-backtasktoinduce2 0.00 differentworkloadlevelsandcalibrateourworkloadestimator. EASY MEDIUMDifficultyHARD ULTRA TheN-backtaskisawell-knowntasktoinduceworkloadby Figure3:NASA-TLXscoresobtainedduringthepilotstudy.Eachdiffi- playingonmemoryload[24]. Itshowedpromisingresultsin cultyleveldifferssignificantlyfromtheothers(p<0.01). [35]whereitcouldbeusedtotransfercalibrationresultstoa 3Dcontext. 0.65(SD:0.13)–seeFigure3. Arepeatedmeasuresanaly- IntheN-backtask,userswatchasequenceoflettersonscreen, sisofvariance(ANOVA)showedasignificanteffectofthe thelettersbeingdisplayedonebyone. Foreachlettertheuser difficultyfactorovertheNASA-TLXscoresandapost-hoc hadtoindicatewhetherthedisplayedletterwasidenticalor pairwiseStudent’st-testwithfalsediscoveryrate(FDR)cor- differenttotheletterdisplayedNlettersbefore,usingaleftor rectionshowedthateachlevelsdifferedsignificantlyfromthe rightmouseclickrespectively. Hence,usershavetoremember others(p<0.01). nitemsatalltimes. Discussion Weimplementedaversionsimilarto[12],removingvowelsto Inthispilotstudy, wedemonstratedthroughquestionnaires preventchunkingstrategiesbasedonphonemes. Weusedthe thateachdifficultylevelpresentedinTable1inducesadiffer- sametimeconstraintasin[35],i.e.lettersappearedfor0.5s, entworkloadlevel.Hence,wecanuseourvirtualenvironment withaninter-stimulusintervalof1.5s. Eachuseralternated asabaselinetoassessthereliabilityofanalogousEEGmea- between“easy”blockswiththe0-backtask(theuserhadto suresandputintoperspectivethisnewevaluationmethod. identify whether the current letter was a randomly chosen targetletter,e.g. ‘X’)and“difficult”blockswiththe2-back EEGINPRACTICE task(theuserhadtoidentifywhetherthecurrentletterwasthe EEGmeasuresthebrainactivityundertheformofelectrical sameletterastheonedisplayed2lettersbefore),seeFigure4. currents[21]. ToidentifymentalstatesfromEEG,3typesof informationcanbeused: • Frequency domain: oscillations that occur when large groupsofneuronsfirealtogetheratasimilarfrequency • Temporal information: ERPs possess temporal features; positiveandnegative“peaks”withvaryingamplitudesand delays. • Spatial domain: position of the electrodes that record a specificbrainactivity. However,thereisanimportantvariabilitybetweenpeople’s Figure4: Workloadcalibrationtask. Top: difficulttask(2-backtask), EEGsignals,andmanyexternalfactorsthatcouldinfluence thetargetletteristheonethatappearedtwostepsearlier,usershaveto selecttrials4and5. Bottom: easytask(0-backtask),thetargetletter EEGrecordings(amplifier’sspecifications,electrodesexact “S”israndomlychosen,usershavetoselecttrials2and5. location,andsoon). Assuch,itisdifficulttoidentifyauniver- salsetoffeaturestoestimateagivenmentalstate,fordifferent Eachblockcontained60letterspresentations. 4letterswere sessions and participants. This is why machine learning is drawnatthebeginningofablocksothatthenumberoftar- typicallyusedinEEGstudies[2]. Withthisapproach,acal- getlettersaccountedfor25%ofthetrials. Eachparticipant ibration phase occurs so that the system could learn which completed6blocks,3blocksforeachworkloadlevel(0-back featuresareassociatedtoaspecificindividual,duringatask vs2-back). Therefore,360calibrationtrials(i.e.onetrialbe- thatisknowntoinducethestudiedconstruct.Oncethecalibra- ingoneletterpresentation)werecollectedforeachuser,with tioniscompleted,themachinecouldthenusethisknowledge 180 trials for each workload level (“low” vs “high”). This togaininsightsaboutanunknowncontext,forexampleanew calibrationphasetakesapproximately12minutes. interactiontechniquethatonewouldwanttoevaluate. Tocalibrateworkloadandattention,wechosetousestandard Calibrationoferrorrecognition calibrationtasks,validatedbytheliterature,sothatourfind- Wereplicatedthestandardprotocoldescribedin[9]tocalibrate ings could be easily reproduced. Moreover, as shown later the system regarding error recognition. The task simulates in this paper, using a single of these tasks to calibrate each ascenarioinwhichuserscontrolthemovementsofarobot. constructestimatorwasenoughtoobtainreliableestimations The robot appears on screen and has to reach a target. At eachturnuserscommandtherobottogorightorleftinorder wasalowpitchedbeatof70ms–wedidnotusepuretonesto to reach the target as fast as possible (with the least steps). improveusersexperience. Thepaceofthegamewasadjusted However,therobotmayunderstandbadlythegivencommand. sothatasound(targetordistractor)wasplayedeverysecond. Thisissimulatedbysometrialsduringwhichthecommand Sincetheprobesforattentionreliestotheoddballparadigm, is(onpurpose)erroneouslyinterpreted;henceaninteraction wechosea20%likelihoodofappearanceforthetargetevent. errorhappens. TheERPthatcanbeseeninEEGfollowing Thecalibrationlastedabout7minutes,after350soundswere aninteractionerrorisknownasan“errorrelatedpotential”, played. Note that participants were instructed to count the ErrP[9]. Thecalibrationtaskisasimplifiedversionofthis “odd”eventsonlyduringthecalibrationphase,andnotduring scenario: therobotispicturedbyabluerectangleonscreen thecompletionofthe3Dmaze. thatuserscontrolwiththearrowkeys,thetargetisrepresented byablueoutline. TherobotisconstrainedtotheXaxisand MAINSTUDY:EEGASANEVALUATIONMETHOD alongthisaxisthereareonly7differentpositionsbothforthe Themainstudyconsistedintheevaluationofthegameenviron- robotandthetarget(seeFigure5). mentwithtwodifferenttypesofinterfacesusingEEGrecord- ings. Assuchwecreateda4(difficulty: EASY,MEDIUM, 1 Press 2 Cursor 3 Cursor HARD,ULTRA)×2(interaction: KEYBOARDvsTOUCH) right selected moves within-subjectexperimentalplan. Ourhypothesesare: 1. TheworkloadindexmeasuredbyEEGishigherinTOUCH and increases with the difficulty, reflecting NASA-TLX scoresobtainedduringthepilotstudy. 1s 1s 2. The attentional resources that participants assign to the 4 Press 5 Cursor 6 Cursor soundsdecreaseasthedifficultyincreases. right selected moves 3. TheTOUCHconditioninducesahighernumberofinterac- Interaction error tionerrorscomparedtotheKEYBOARDcondition. Thegamingphasewassplitintotwosequences,oneforeach interactiontechnique. Toavoidatootediousexperiment,par- ticipantsalternatedbetweengamesessionsandthe3calibra- 1s 1s Figure5:Errorrecognitioncalibrationtask. Userscontrolabluefilled tiontasks(workload,attentionanderrorrecognition). Since rectangle.Theyhavetomoveittoanoutlinedtargetbypressingtheleft the analysis were performed offline, there was no need to orrightarrowkey. 20%ofthetime,therectanglegoesintheopposite clusterallthecalibrationsatthebeginningoftheexperiment. direction,thuscausinganinteractionerror. Theorderofthegamingsessionsandcalibrationphaseswas We choose a ratio for the occurrence of interaction errors counter-balancedbetweenparticipantsfollowingalatinsquare thatisconsistentwiththeliterature. 80%ofthemovements (see Figure 6). After the experiment, the signals gathered matched the actual key pressed and for the other 20% the fromthecalibrationtaskswereprocessedinordertoevaluate “robot”movedintheoppositedirection. Itwasnecessarynot boththevirtualenvironment(difficultylevels)andthechosen tobalancebotheventsastoofrequenterrorsmaynotbeper- interactiontechniques. ceivedasunexpectedanymore,andthusmaynotleadtoan ErrP.Atimerwassettopreventtheappearanceofartifacts, suchasmusclemovements,withinEEGrecordings(seethe Discussionsectionforfurtherconsiderationforartifacts). The rectanglemoves1safterakeywaspressed,andaftermove- ment completion users have to wait another 1s before they couldpressakeyagain. Therectangleturnedyellowtotell userstheycouldnotcontrolitduringthatsecond. Atrialiscompletedoncetherobotreachesthetarget. Atrial fails if after 10 attempts the robot is not yet on target. At thebeginningofeachtrial,thescreenisreinitializedwitha randomnewpositionfortherobotandthetarget. Thelasttrial occurredafter350interactionswereperformed. Onaverage thiscalibrationphaselasted15minutes. Figure6: Theorderofthe3calibrationtasksand2interactiontech- niqueswascounter-balancedbetweenthe12participantstoimproveen- Calibrationofattention gagement. Thecalibrationofattentionoccurredwithinasimplifiedver- sionofthevirtualenvironment. Usersdidnothavetocontrol thecharacterduringthisspeciallevel,itwasmovingbyitself Apparatus through the maze. They were asked to watch the character EEGsignalswereacquiredat512Hzwith2g.tecg.USBamp andcountintheirheadhowmanytimestheyheardthe“odd” amplifiers. We used 32 electrodes placed at the AF3, AFz, sound,i.e.,ahighpitchedbelllasting200ms. Thedistractor AF4,F7,F3,Fz,F4,F8,FC3,FCz,FC4,C5,C3,C1,Cz,C2, C4,C6,CP3,CPz,CP4,P7,P3,Pz,P4,P8,PO7,POz,PO8, foreachbandasetofCommonSpatialPatterns(CSP)spa- O1,OzandO2sites. tial filters. That way, we reduced the 32 original channels downto6 “virtual”channels thatmaximize thedifferences 12participantstookpartinthisstudy(3females),meanage betweenthetwoworkloadlevels[30]. Sincethecalibration 26.25(SD:3.70). Allofthemreportedadailyuseoftactile (N-backtask)andusecontexts(virtualenvironment)differs interfaces. Theexperimentoccurredinaquietenvironment, substantially, we used a regularized version of these filters isolatedfromtheoutside. Thereweretwoexperimentersin calledstationarysubspaceCSP(SSCSP)[35]. SSCSPfilters theroomandtheprocedurecomprisedthefollowingsteps: aremorerobusttochangesbetweencontextssincetheytake 1. Participantsenteredtheroom,readandsignedaninformed into account the distributions of the EEG signals recorded consentformandfilledademographicquestionnaire. duringboththecalibrationandtheusecontexts(inanunsu- 2. WhileoneoftheexperimenterinstalledanEEGcaponto pervisedway,i.e.withoutconsideringtheexpectedworkload participants’heads,theotherexperimenterintroducedpar- levels)toestimatespatialfilterswhoseresultingsignalsare ticipantstothevirtualenvironment. Theyplayed2training stableacrosscontexts(see[35]fordetails). Finally,foreach levelsandthe4mainlevelsinanincreasingorderofdif- frequencybandandspatialfilter, weusedtheaverageband ficulty. They could redo some levels if they did not feel powerofthefilteredEEGsignalsasfeature. Thisresultedin confidentenough. 30EEGfeatures(5bands×6spatialfiltersperband). 3. Oneofthe3calibrationtasksoccurred(workload,attention orerrorrecognition). Processingattentionanderrorrecognition 4. Participantsplayedthegameusingoneofthe2interaction Sincebothattentionanderrorrecognitioncanbemeasuredin techniques(KEYBOARDorTOUCH).Thefourlevelsof ERPs,theysharethesamesignalprocessing.Weselectedtime difficulty (EASY, MEDIUM, HARD, ULTRA) appeared windowsof1s,startingattheeventofinterest(i.e.soundsfor twiceduringthesession,inarandomorder. ForTOUCH, attention, rectangle’s movements for error recognition). In a dedicated training session occurred beforehand so that ordertoutilizetemporalinformation,featureextractionrelied participantcouldgetusedtothisinteractiontechnique. on regularized Eigen Fisher spatial filters (REFSF) method 5. Anothercalibrationtaskoccurred,differentfromstep3. [16]. Thanks to this spatial filter, specifically designed for 6. Participantstestedthesecondinteractiontechnique. Asin ERPsclassification,the32EEGchannelswerereducedtoa step 4, TOUCH was preceded by a training session, that setof5channels. Wethendecimatedthesignalbyafactor32. lasteduntilparticipantsfeltconfidentenoughtoproceedto The“decimate”functionofMatlab, thatappliesalow-pass themaintask. filterbeforedecimationtopreventaliasing, wasused. Asa 7. Participantsperformedthelastremainingcalibrationtask. result,therewas80featuresbyepoch(5channels×512Hz× 1s/32). Agamesession(steps4and6)tookapproximately20minutes andthewholeexperimentlasted2hours. Classification WeusedashrinkageLDA(lineardiscriminantanalysis)asa EEGAnalyses classifiersinceitismoreefficientthantheregularLDAwitha The calibration tasks were used to train a classifier specific highnumberoffeatures[18]. toeachofthestudiedconstruct. Classifierswerecalibrated separatelyforeachparticipantwhichensuredmaximalEEG Foreachconstructtherewastwosteps: firstweusedthedata classificationperformances. WeusedEEGLAB13.4.4b2and collectedduringthecalibrationtaskstoestimatetheperfor- MatlabR2014atoprocessEEGsignalsoffline. EEGfeatures manceoftheclassifiers. Second,westudiedtheoutputofthe associatedtoworkloadrelatetothefrequencydomainwhile differentclassifierstoevaluatethevirtualenvironment. thefeaturesassociatedtoattentionanderrorrecognitionrelate Toassesstheclassifiers’performanceonthecalibrationdata, totemporalinformation,asdetailedbelow. weused4-foldcross-validation(CV).Moreprecisely,wesplit Processingworkload the collected data into 4 parts of equal size, selecting trials From the signals collected during the N-back tasks, we ex- randomly,used3partstocalibratetheclassifiersandtestedthe tractedEEGfeaturesfromeach2stimewindowfollowinga resultingclassifiersontheunseendatafromtheremainingpart. letterpresentation. Weusedeachofthesetimewindowsas Thisprocessoccurred3moretimessothatintheend,eachpart anexampletocalibrateourclassifier,whoseobjectivewasto wasusedonceastestdata. Finally,weaveragedtheobtained learnwhetherthesefeaturesrepresentedalowworkloadlevel classificationperformances. Theperformancewasmeasured (inducedbythe0-backtask)orahighworkloadlevel(induced usingtheareaunderthereceiver-operatingcharacteristiccurve by the 2-back task). Once calibrated, this classifier can be (AUROCC).TheAUROCCisametricthatisrobustagainst usedtoestimateworkloadlevelsonnewdata,herewhileour unbalancedclasses,asitisthecasewithattentionanderror userswereinteractingwiththevirtualenvironment. recognition(20%oftargets,80%ofdistractors). Ascoreof “1”meansaperfectclassification,ascoreof“0.5”ischance. Asin[35],wefilteredEEGsignalsinthedelta(1-3Hz),theta (4-6Hz),alpha(7-13Hz),beta(14-25Hz)andgamma(26- Once the classifiers were trained thanks to the calibrations 40 Hz) bands. To reduce features dimensionality, we used tasks,wecouldusethemontheEEGsignalsacquiredwhile participantswereinteractingwiththevirtualenvironment,to 2http://sccn.ucsd.edu/eeglab/ estimatethedifferentconstructsvalues. Forworkload,weused2slongslidingtimewindowsthatwere overlapping by 1s, to extract signals and feed the classifier. KEYBOARD TOUCH From the outputs that was produced by the LDA classifier 0.2 for each participant (i.e., the distance to the separating hy- perplane),wefirstremovedoutliersbyiterativelyremoving x e 0.1 oneoutlieratatimeusingaGrubb’stestwithp=0.05,until d n nomoreoutlierwasdetected[13]. Wethennormalizedthe d i outlier-free scores between -1 and +1. As such, for all par- oa 0.0 kl ticipantsaworkloadindexcloseto+1representsthehighest or mentalworkloadtheyhadtoendurewhiletheywereplaying. W−0.1 Itshouldcomeclosetothe2-backconditionofthecalibration phase. Ontheopposite,aworkloadindexcloseto-1denotes −0.2 thelowestworkload,similartothe0-backcondition. EASYMEDIUMHARD ULTRA EASYMEDIUMHARD ULTRA Difficulty Theprocesswassimilarforattention,butweonlyextracted (a) epochsthatcorrespondedtothetargetstimulionset,i.e.when the high pitch sound was played. Note that contrary to [3], l thatstudiedtheamplitudesofERPsanddidnotusethedata gatheredduringthecalibrationphase, herewekeptthema- 0.06 l o0.4 cspehaeirnntiecaislpeaaanrcntosinnnfigodtaeipcnpecdreooiandcdhde.xeAvoesfnstthusecwhLh,DitlhAeetcrhleaessyuslwitfiienerrgeaspbclooauryetisnwgch.aenthbeer on index0.04 l l gnition rati0.3 nti co Asfortheclassifierdedicatedtoerrorrecognition,theprocess- Atte or re0.2 ingdiffers. Indeed, wecouldnotassumewhichinteraction 0.02 l Err yieldedornotaninteractionerror,i.e.ifandwhenparticipants 0.1 perceived a discrepancy between what they intended to do andwhatoccurred. Consequently,wesimplycountedoveran EASY MEDIUM HARD ULTRA KEYBOARDTOUCH Difficulty Interaction entiregamesessionthenumberoftimestheclassifierlabelled (b) (c) aninteractionasbeingerroneousintheeyeoftheparticipants. Figure7: EEGmeasures. a: Theworkloadindexsignificantlydiffers acrossdifficultiesandbetweeninteractiontechniques. b: Theattention indexsignificantlydiffersacrossdifficulties. c: Thenumberofinterac- Results tionerrorsdiffersbyatendencybetweenKEYBOARDandTOUCH. Unlessotherwisenoted,wetestedforsignificanceusingre- peated measures ANOVA. For significant main effects, we usedpost-hocpairwiseStudent’st-testwithFDRcorrection. (Figure7b). Thepost-hocanalysisshowedthattheULTRA levelsignificantlydiffersfromtheothers(p<0.05). Workload Onaverage,theclassifierAUROCCscoreduringthetraining Errorrecognition taskwas0.92(SD:0.06)–seeTable2. Overthetestsetthere Onaverage,theclassifierAUROCCscoreduringthetraining wereonaverage2171datapointspersubjectacrossallcondi- taskwas0.82(SD:0.10)–seeTable2. Overthetestsetthere tion(timewindows). Thestatisticalanalysisoftheclassifier wereonaverage388datapointspersubjectacrossallcondi- outputduringthegamesessionshowedasignificanteffectof tions(interactions). Duetothenatureofthedata(numbers thedifficultyfactor(p<0.01);theworkloadindexincreasing ofinteractionerrorsacrossentiregamesessions),weuseda along the difficulty of the levels (Figure 7a). The post-hoc one-tailedWilcoxonSignedRankTesttostressourhypothesis. analysisshowedthatalldifficultylevelssignificantlydiffers Thenumberofinteractionerrorsdiffersbyatendency(p= onefromtheotherwithp<0.01;exceptfortheMEDIUM 0.08)betweentheKEYBOARDandtheTOUCHconditions. level,whichdiffersfromEASYwithp<0.05andwithHARD 19%oftheinteractions(SD:9%)werelabelledasinteraction onlybyamargin(p=0.11). Therewasasignificanteffectof errorsbytheclassifierforKEYBOARDvs22%(SD:9%)for theinteractionfactoraswell(p<0.01),theworkloadbeing TOUCH(Figure7c). higheronaverageduringtheTOUCHcondition. Therewas nointeractionbetweendifficultyandinteractionfactors. Behavioralmeasures BesidesEEGmetrics,wehadtheopportunitytostudypartic- Attention ipants’reactiontimeandperformancesoastogetaclearer Onaverage,theclassifierAUROCCscoreduringthetraining pictureoftheiruserexperience. taskwas0.86(SD:0.05)–seeTable2. Overthetestsetthere were on average 497 data points per subject across all con- Reactiontime ditions(oddevents). Thestatisticalanalysisoftheclassifier Therewasasignificanteffectofboththedifficultyandinter- output during the game session showed a significant effect actionfactors,aswellasaninteractioneffectbetweenthem of the difficulty factor (p < 0.01) but not of the interaction (p < 0.01). Post-hoc tests showed that all difficulty levels factor. Theattentionindexdecreasesasthedifficultyincreases differfromoneanother(p<0.01),exceptforMEDIUMand Construct P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 Average Workload 0.85 0.93 0.98 0.95 0.97 0.97 0.79 0.87 0.87 0.98 0.95 0.94 0.92 Attention 0.83 0.82 0.96 0.81 0.85 0.90 0.82 0.82 0.86 0.92 0.88 0.83 0.86 Errorrecognition 0.88 0.57 0.90 0.90 0.86 0.90 0.78 0.80 0.88 0.78 0.85 0.74 0.82 Table2:Classificationaccuracyduringthecalibrationtasksforthe3measuredconstructs(AUROCCscores). Thesearepromisingresultsforthosewhoseektoassesshow KEYBOARD TOUCH KEYBOARD TOUCH intuitiveaUIiswithexocentricmeasures[11]. me1.1 e1.0 e ti1.0 anc0.8 In this study, we chose to use the particularity of the touch s0.9 m espon00..78 Perfor0.6 sactroeuenchtoscmreaeknefthoertiatsskpomsosribeidliitfyficoufldt.irIencdteemda,nwiphuilleatwioenu,swede R 0.4 keptthecharacterasaframeofreference,resultingininput E M H U E M H U E M H U E M H U Difficulty Difficulty commandsthatwere(patently)notco-localizedwithoutput (a) (b) directions. Besidesresultsdenotingthedifferencesbetween Figure8:Behavioralmeasures:reactiontimeinseconds(left)andper- theconditions,participantsalsospontaneouslyreportedhow formance (proportion of correctly selected directions – right) signifi- nonintuitivethisconditionwas. Wewantedtoinvestigateour cantlydiffersbetweendifficultylevelsandinteractions. E:EASY,M: MEDIUM,H:HARD,U:ULTRA. evaluation methodon a salientdifference atfirst. Then our frameworkcouldwellbeemployedtogofurther;forexample seekingphysiologicaldifferencesbetweendirectandindirect HARD,whichdonotdiffersignificantly(p=0.91). Themean manipulationinterfacesinmoretraditionaltasks. reactiontimeswererespectivelyforEASY,MEDIUM,HARD It is interesting to note how those EEG measures could be andULTRA:0.78s(SD:0.14),0.97s(0.18),0.98s(0.15)and combinedwithexistingmethodstobroadentheoverallcom- 0.69s(0.06). Themeanreactiontimewas0.78(0.12)forKEY- prehension of the user experience. For instance, while we BOARDand0.93(0.13)forTOUCH.SeeFigure8a.Notethat did show significant differences across difficulty levels and usershadlesstimetorespondduringhigherdifficultylevels. betweeninteractiontechniqueswithbehavioralmeasures(re- actiontimeandperformanceindex),EEGmeasurescouldhelp Performance tounderstandtheunderlyingmechanisms. Becausewehave Theperformancewascomputedastheratiobetweenthenum- amoredirectaccesstobrainactivity,wecanmakeassump- berofcorrectselectionsandthetotalnumberofinteractions. tions about the cause of observed behaviors. For example Therewasasignificanteffectofboththedifficultyandinter- participants’worseperformancewithTOUCHthanwithKEY- actionfactors,aswellasaninteractioneffectbetweenthem BOARDcouldbeduetothefactthattheyanticipatelessthe (p < 0.01). Post-hoc tests showed that all difficulty levels outcomesoftheiractions(moreinteractionerrors);thehigher differfromoneanother(p<0.01). Themeanperformance reactiontimemaynotonlybecausedbytheinteractiontech- wasrespectivelyforEASY,MEDIUM,HARDandULTRA: niqueperse,butbyahigherworkload. Andwhileparticipants 98%(SD:3),89%(12),83%(17)and55%(21). TheMean managetocopewiththefastpaceoftheULTRAlevel(the performancewas85%(13)forKEYBOARDand77%(13) smallestreactiontimes),theincreaseinperceptualloadlower forTOUCH.SeeFigure8b. theirawarenesstotask-irrelevantstimuli. Additionally,wecanobservethattheperformancesobtainedat Discussion theEASY,MEDIUMandHARDlevelsareverysimilarwith Mostofthemainhypothesesareverified. Theworkloadin- thekeyboardandthetouchscreen(seeFigure8b). However, dex as computed with EEG showed significant differences EEG analyses revealed that the workload was significantly thatmatchtheintendeddesignofthedifficultylevels. Itwas higher in the TOUCH condition, meaning that users had to alsoshownthatinthehighestdifficultytheattentionlevelof allocate significantly more cognitive resources to reach the participantstowardexternalstimuliwassignificantlylower– same performance. This further highlights that EEG-based i.e.inattentionalblindnessincreased. Concerningtheinterac- measuresdobringadditionalinformationthatcancomplement tiontechniques,thenumberofinteractionerrorsasmeasured traditionalevaluationssuchabehavioralmeasures. by EEG was higher with the TOUCH condition, but this is atendencyandnotasignificanteffect. Theworkloadindex, Measuringusers’cognitiveprocessessuchasworkloadand on the other hand, was significantly higher in the TOUCH attentionmayproveparticularlyusefultoassess3Duserin- conditioncomparedtotheKEYBOARDcondition. terfaces(3DUI),sincetheyareknowntobemorecognitively demanding. They require users to perform 3D mental rota- Thanks to the ground truth obtained during the pilot study tiontaskstosuccessfullymanipulateobjectsortoorientate withtheNASA-TLXquestionnaire,theseresultsvalidatethe themselvesinthe3Denvironment. Moreover,theusualneed useofaworkloadindexmeasuredbyEEGforHCIevaluation foramappingbetweentheuserinputs(withlimiteddegrees- andsetthepathfortwootherconstructs: attentionanderror of-freedom–e.g.,only2foramouse)andthecorresponding recognition.Besidetheevaluationofthecontent(i.e.difficulty actionson3Dobjects(withtypically6degrees-of-freedom), levels)wewereabletocomparetwointeractiontechniques. makes3DUIusuallydifficulttoassessanddesign. Werepro- Thereliabilityofmentalstatesmeasuresisstronglycorrelated ducedpartofthisproblematicwithourgameenvironmentand to the quality of EEG signals. Interestingly enough, partic- obtainedcoherentresultsfromEEGmeasures. ipants’ mindset during the recordings is one of the factors influencingEEGsignals. Theirawarenessandinvolvement towardthetasksimprovesystemaccuracy.Theformofthecal- KEYBOARD TOUCH ibrationtaskscouldbeenhancedtoengagemoreusers,forex- amplethroughgamification[10]–andourvirtualenvironment x 0.25 e provedtobesuitabletodoso. Whereasourparticipantswere nd Difficulty volunteersenrolledamongstudents,intheendtheoutcomeof ad i 0.00 EMAESDYIUM anevaluationmethodbasedonEEGshouldbestrengthened o HARD Workl−0.25 ULTRA breyliraebclryuitthiengdidffeedriecnattecdontessttreurcst,sucsoinugldabseseelseticmtioatnedcrfitreormiathhoewir EEGsignalsduringcalibrationtasks. Finally,oneshouldacknowledgethatwhenitcomestorecord- 0 250 500 750 1000 0 300 600 900 Time (s) ingsassensitiveasEEG,artifactssuchastheonesinducedby Figure9: Workloadindexovertimeforparticipant3–60ssmoothing muscularactivityareofmajorconcern. Thewayweprevented window.Left:KEYBOARDcondition,right:TOUCHcondition.Back- theappearanceofsuchbiasinthepresentstudyisthreefold. groundcolorrepresentsthecorrespondingdifficultylevel. 1)Thehardwareweused–activeelectrodeswithAg/AgCl coating–isrobusttocablemovements,seee.g.,[34]. 2)The Above all, an evaluation method based on EEG enables a classifiersweretrainedonfeaturesnotrelatedtomotionarti- continuous monitoring of users. The intended use case of factsormotorcortexactivation. 3)Thepositionofthescreen ourframeworkistoenrolldedicatedtestersthatwouldwear duringthe“touch”conditionminimizedparticipants’motion, theEEGequipmentandperformwellduringthecalibration andgesturesoccurredmostlybeforethetimewindowusedfor tasks. Asamatteroffact,thebestperformerduringworkload detectinginteractionerrors. Theseprecautionsareimportant calibration(participant3inTable2)showspatternsthatclearly forthetechnologytobecorrectlyapprehended. meet the expectations concerning both difficulty levels and Tofurthercontrolforanybiasinourprotocol,weranabatch interactions,aspicturedinFigure9. of simulations where the labels of the calibration tasks had beenrandomlyshuffled,similarlytotheverificationprocess describedin[35]. Shouldartifactsbiasourclassifiers,differ- LIMITATIONSANDFUTURECHALLENGES ences would have appeared between the KEYBOARD and AlthoughusingEEGmeasuresasanevaluationmethodfor TOUCHconditionsevenwithsuchrandomtraining. Among HCIwasprovenconclusiveregardingworkload–weobtained the20simulationsthatranforeachofthe3constructs(work- acontinuousindexonparwithagroundtruthbasedontra- load, attention, error recognition), none yielded significant ditionalquestionnaires–thetwootherconstructswestudied differences. couldbenefitfromfurtherimprovements. Despitethedirectinteraction(TOUCH)beingmoredisorient- CONCLUSION ingforusersthantheindirectone(KEYBOARD),therecog- Inthispaper,wedemonstratedhowbrainsignals–recorded nitionofinteractionerrorsdifferedonlybyatendency. This bymeansofelectroencephalography–couldbeputintoprac- couldbeexplainedbythefactthatthecalibrationtaskwastoo tice in order to obtain a continuous evaluation of different dissimilartothevirtualenvironment.Notably,whiletherewas interactiontechniques,forassessingtheirergonomicprosand fewandslowpacedeventsduringthecalibration,userswere cons. Inparticular,wevalidatedanEEG-basedworkloadesti- confrontedtomanystimuliwhiletheywereplaying, hence matorthatdoesnotnecessitatetomodifytheexistingsoftware. overlapping ERPs must have appeared within EEG, which Furthermore,weshowedhowusers’attentionlevelcouldbe mayhavedisturbedtheclassifier. Acalibrationtaskcloserto evaluatedusingbackgroundstimuli,suchassounds. Finally, real-lifescenariosthantheonedescribedin[9]shouldbeenvi- weinvestigatedhowtherecognitionofinteractionerrorscould sioned. Suchtaskshouldremaingenericinordertofacilitate helptodeterminethebestuserinterface. thedisseminationofEEGasanevaluationmethodforHCI. Beingabletoestimatethesethreeconstructs–workload,atten- Signalprocessingcouldalsofacilitatethetransferoftheclassi- tionanderrorrecognition–continuouslyduringrealisticand ficationbetweenastandardtaskandtheevaluatedHCI.Indeed, complexinteractiontasksopenednewpossibilities. Notably, ifourresultsdemonstratethattheEEGclassificationofwork- itenabledustoobtainadditionalandmoreobjectivemetrics loadcouldbetransferredfromtheN-backtaskstoadissimilar ofuserexperience,basedontheusers’cognitiveprocesses. It virtual environment and user interface, we benefited from alsoprovideduswithadditionalinsightsthattraditionalmea- spatialfiltersthatspecificallytakeintoaccountthevariance sures(e.g.,behavioralmeasures)couldnotreveal. Tosumup, betweencalibrationcontextsandusecontexts–thestationary thissuggeststhatcombinedwithexistingevaluationmethods, subspace CSP [35]. Since ERPs may also slightly differ in EEG-basedevaluationtoolssuchastheonesproposedhere amplitudesanddelaysbetweencalibrationandusecontexts, canhelptounderstandbettertheoveralluserexperience. We inthefuture,itwouldbeworthdesigningsimilarapproaches hopethattheincreasingavailabilityofEEGdeviceswillfoster tooptimizetemporalorspatialfiltersforERPsaswell. suchapproachesandbenefittheHCIfield.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.