Model-Based Motion Filtering for Improving Arm Gesture Recognition Performance

Greg S. Schmidt (Virtual Reality Laboratory, Naval Research Laboratory, Washington, D.C.)
Donald H. House (Visualization Laboratory, Texas A&M University)

[International Gesture Workshop, GW 2003, Genova, Italy, April 15-17, 2003. Approved for public release; distribution unlimited.]

Abstract

We describe a model-based motion filtering process that, when applied to human arm motion data, leads to improved arm gesture recognition. By arm gestures, we mean movements of the arm (and positional placement of the hand) that may or may not have any meaningful intent. Arm movements or gestures can be viewed as responses to muscle actuations that are guided by responses of the nervous system. Our method makes strides towards capturing this underlying knowledge of human performance by integrating a model for the arm based on dynamics and containing a control system. We hypothesize that embedding this human performance knowledge into the processing of arm movements will lead to better recognition performance. We present details of the design of our filter and our analysis of the filter from both expert-user and multiple-user pilot studies. Our results show that the filter has a positive impact on recognition performance for arm gestures.

1. Introduction

Gesture recognition techniques have been studied extensively in recent years because of their potential for application in user interfaces. It has long been a goal to apply the "natural" communication means that humans employ with each other to the interfaces of computers. People commonly use arm and hand gestures, ranging from simple actions of "pointing" to more complex gestures that express their feelings and allow communication with each other. The ability to recognize arm gestures by computer creates many possibilities for improving application interfaces, especially those requiring difficult data manipulations (e.g., 3D transformations). Pointing operations would certainly be an effective means to infer directional information, such as where to move an object in the computer environment. To date, no method has been found for arm gesture recognition that is both very accurate and extendable to different sets of gestures. Typical approaches (e.g., HMMs, neural networks) have focused on applying analytical methods for breaking down motion sequences and recognizing patterns.

The human model-based approach takes into consideration that while a person is making gestures, the resulting motions and poses are played out by a known, rather than an unknown, process. The gestures can be viewed as responses of a skeletal frame to muscle actuations that are made in response to control signals originating in the nervous system. The structure of the skeleton, joints, and musculature is well known and well studied. The neural control systems that actuate the muscles are becoming better understood. With a solid model of human dynamics and control, much of the analytical heuristic guesswork might be eliminated. The arm is a good subject for testing model-based approaches because it is an articulated structure with well-understood musculature and fairly large inertias that must have a significant effect on gesture performance.

In our work, we have designed a filter for enhancing the signal leading to the gesture recognizer. Our motion adaptation filter integrates both physical and control models of human gestural actions into the process. The motion adaptation works by using two filters: one is augmented with a "learned" parametric gesture sequence and control system, while the other has no augmentation. Our method for incorporating process knowledge, the model and its dynamics, is the extended Kalman filter, though any process estimation filter that can handle non-linearities could be used. The squared difference between the outputs of both filters is summed and normalized, giving a score that can be used by the recognition system.

Our working hypothesis is that the motion adaptation filter will improve the unknown signal's quality enough to improve or simplify the recognition process. We tested the hypothesis by integrating the filter with a simple template gesture recognition system, although our filter can be integrated with any standard type of gesture recognition system. We tested the system with an expert user performing multiple sets of gestures, and devised a multiple-user pilot study to determine the impact that our filter has on arm-movement recognition performance.

2. Related Work

Here we first give an overview of the most common recognition methods and highlight their benefits and shortcomings. Then we briefly describe relevant work that utilizes human model-based approaches, and related studies involving hand and arm gestures. More complete details can be found in surveys by Watson [1], Aggarwal and Cai [2], and Pavlovic et al. [3].

2.1. Overview of Recognition Methodologies

The common methodologies that have been used for motion and gesture recognition are: (1) template matching [1], (2) neural networks (also known as the feature-based approach) [1], (3) statistical methods [4, 5, 6], and (4) multimodal probabilistic combination [7, 8]. By far the most popular recognition methods are neural networks (e.g., [9, 10, 11, 12, 13]) and the statistical method of hidden Markov models (HMMs) (e.g., [14, 15, 16, 17]). Each of these methods has a set of drawbacks which either affect its performance or limit its utilization by users. One of the major drawbacks is that they depend on user-specific training and parameter tuning.

The template approach compares the unclassified input sequence with a set of predefined template patterns. The algorithm requires preliminary work for generating the set of gesture patterns, and has poor recognition performance, typically due to the difficulty of aligning the input with the template patterns [1].

The neural network approach works by pre-determining a set of common discriminating features, estimating covariances during a training process, and using a discriminator (e.g., the classic linear discriminator [18]) to classify gestures. The drawback of this method is that features are manually selected and time-consuming training is involved [1].
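As a concrete illustration of the template approach (not the authors' implementation), a minimal matcher can resample the unclassified sequence and each stored template to a common length and score them by normalized summed squared difference; all names here are hypothetical:

```python
import numpy as np

def resample(seq, n):
    """Linearly resample a (T, d) trajectory to n samples."""
    seq = np.asarray(seq, dtype=float)
    t_old = np.linspace(0.0, 1.0, len(seq))
    t_new = np.linspace(0.0, 1.0, n)
    return np.column_stack([np.interp(t_new, t_old, seq[:, j])
                            for j in range(seq.shape[1])])

def template_score(unknown, template, n=64):
    """Normalized sum of squared differences; lower means a closer match."""
    a, b = resample(unknown, n), resample(template, n)
    return float(np.sum((a - b) ** 2) / n)

def classify(unknown, templates):
    """Return the name of the best-matching template gesture."""
    return min(templates, key=lambda k: template_score(unknown, templates[k]))
```

Note that this naive resampling does no temporal alignment, which is exactly the weakness the survey attributes to template matching.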
The HMM method is a variant of a finite state machine characterized by a set of states, a set of observation symbols for each state, and probability distributions for state transitions, observation symbols and initial states [4]. The state transitions, which are hidden to the observer, generate an observation symbol from each state. The basic premise of the HMM is to infer a state sequence that produces a sequence of observations. Learning the state sequence can help to understand the structure of the underlying model that generates the observation sequence. The major drawbacks of HMMs are: (1) they require a set of training gestures to generate the state transition network and tune parameters; (2) they assume that successive observations are independent, which is typically not the case with human motion and speech [19].

In a multimodal recognition process, two or more human senses are captured and/or two or more capturing technologies are combined to recognize gestures. The multiple inputs are processed by a classifier, which rates the set of possible output patterns with a value based upon the likelihood of a match. The sets of probabilities for each input are then combined in a manner that allows selection of the most likely pattern. Many groups have explored combining speech and gesture (e.g., Cohen and Oviatt [7, 8], Codella et al. [20], Vo and Waibel [21]).

2.2. Methods Utilizing Human Model-Based Approaches

Human model-based approaches integrate a model of human motion, typically approximated as a dynamic process and control system, into the process of filtering motion capture data of human movements. Model-based approaches using dynamics seem to have first appeared in Pentland and Horowitz [22] and others (e.g., [23, 24]). The method has been shown to improve tracking by helping reduce the search space required to determine the position and orientation of a tracked object for the next motion state. This includes estimating position and orientation when interference occurs between the tracking equipment and the tracked object [25].

Model-based approaches have been utilized by Zordan and Hodgins [26], Metaxas [25] and others for generating motion appropriate for animated characters. Badler [27] experimented with model-based approaches to simulate human-like virtual actors in a system he developed called Jack.

Wren and Pentland [28] applied dynamics to a 3D skeletal model of the body for a tracking application. They applied 2D measurements from image features and combined them with the extended Kalman filter to drive the 3D model. Their resulting tracking system was able to tolerate temporary image occlusions and the presence of multiple people in the tracked area. In more recent work [29] they explored the notion that people utilize muscles to actively shape purposeful motion. They were able to extract models of purposeful actions from a system they built. This reinforces the notion that there is important information from the underlying structure of human motion that can be incorporated in the recognition process.

Rohr [30] studied human movements from image sequences by incorporating models from medical motion studies. He specifically analyzed people walking, by applying measurements of the body joints and vertical displacement of the torso from a "walking" study to build a model of motion. The model was integrated with image input data using the Kalman filter to estimate 3D positions and postures of the subjects walking in the images. Others have performed similar work, including Hogg [31].

We explored the use of a model-based approach for arm motion recognition performance in earlier work [32]. We were not able to find enough evidence at that time that the approach improved recognition performance. We felt this was due to the lack of sophistication of the model and control system, which was based on a simple particle model representing the position of the wrist and its associated dynamics.

2.3. Arm and Hand Gesture Studies

The hand has been studied extensively for computer-human interaction [33, 34, 35, 36, 37, 38, 39, 40]. However, fewer studies have been performed on gestures involving the arm in addition to the hand. Sturman [39] developed a system for recognizing gestures for orienting construction cranes. Morita et al. [41] show how to interpret gestures from a musical conductor by tracking the tip of the wand. Baudel and Beaudouin-Lafon [42] designed an application that uses hand and arm gestures for controlling a computer presentation. Kahn et al. [43] studied pointing operations, and Campbell et al. [14] performed a study involving T'ai Chi gestures.

3. Background

Here we give the background for methods that we utilized and integrated in the design of our filter.

3.1. Extended Kalman Filter

The extended Kalman filter (EKF) [44] estimates both the time sequence of states of an input data stream and a statistical model of that data stream. The EKF differs from the standard Kalman filter [45] in that it can be used to estimate a process that is non-linear and/or handle a measurement relationship to the process that is non-linear. The EKF can be augmented by a dynamic model of the system being tracked, and knowledge of the reliability of this model. Simply described, the filter is a set of time update equations that estimate the next state vector, current error covariance and the Kalman gain. The Kalman gain affects the weighting of measurement data versus the control model in determining the next state vector estimate. If the dynamic model is left out or is unreliable, the Kalman gain is high and the filter simply smoothes the input data.

The EKF's prediction equations may be written

    \hat{x}_i^- = f(\hat{x}_{i-1}, u_i, 0)
    P_i^- = A_i P_{i-1} A_i^T + W_i Q_{i-1} W_i^T        (1)

where f estimates the a priori state vector \hat{x}_i^-, x_i is the current state vector, u_i is the process model vector at the current time step, P_i and P_i^- are the current and a priori estimated error covariances, Q_i is the process model error covariance, A and W are the Jacobians of f with respect to x and w, respectively, and w is a vector of random variables. The filter's update equations may be written

    K_i = P_i^- H_i^T (H_i P_i^- H_i^T + V_i R_i V_i^T)^{-1}
    \hat{x}_i = \hat{x}_i^- + K_i (z_i - h(\hat{x}_i^-, 0))
    P_i = (I - K_i H_i) P_i^-        (2)

where K_i is the current Kalman gain, v is a vector of random variables, h relates the state vector to the measurement vector z_i, R_i is the measurement error covariance, and H and V are the Jacobians of h with respect to x and v.
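A minimal sketch of one EKF cycle in the form of Equations 1 and 2 (illustrative only; the paper's filter operates on an 8-dimensional arm state with Jacobians recomputed each step):

```python
import numpy as np

def ekf_step(x, P, z, f, h, A, W, H, V, Q, R):
    """One EKF cycle: predict with process model f, correct with measurement z.
    A, W and H, V are the Jacobians of f and h (Equations 1 and 2)."""
    # Prediction (Equation 1)
    x_pred = f(x)
    P_pred = A @ P @ A.T + W @ Q @ W.T
    # Update (Equation 2)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + V @ R @ V.T)
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

For a linear constant-velocity model, this reduces to the standard Kalman filter; the non-linear case substitutes the linearized Jacobians at each step.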
3.2. Lagrangian Formulation for Dynamics

The Lagrangian formulation for dynamics is particularly appropriate for articulated systems. The Lagrangian

    L(q, \dot{q}) = T(q, \dot{q}) - V(q)        (3)

is the difference between the kinetic energy T and potential energy V of the system as a function of state q. The state is a set of generalized joint coordinates, and its rate \dot{q} is a set of related velocities. The Lagrangian formulation for the dynamics of a system is

    \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F_i,  i = 1, \ldots, n        (4)

where F is the set of externally applied forces and torques [46].

Solutions to Equation 4 can be found in closed form, which are more efficient and readily parameterizable than the open form derivations generated by the Featherstone algorithm [47], which is a very efficient rendition of the Newton-Euler approach to dynamics [48]. On the other hand, the open form derivations do have the advantage that they can be easily extended to handle large sets of joint-space configurations.

4. Motion Adaptation Filter

The design of our model-based motion adaptation filter is shown in Figure 1. It contains two extended Kalman filters: one augmented with a model of the human arm, a dynamics update and a control system; the other unaugmented, containing only the arm model and dynamics update components. The input, or unknown, motion sequence is passed through each filter, the outputs are compared, and a score is computed, which is used as the output of the motion adaptation filter.

[Figure 1: Motion Adaptation Filter]

The unaugmented filter, in effect, smoothes the input motion sequence, since it has no control system. The augmented filter attempts to influence the raw input motion sequence to follow a learned motion sequence. We illustrate this notion in Figure 2 by showing five different motion sequences (arc, line, wave, circle and angle) as influenced by an arc motion sequence. Each sequence starts on the right side and proceeds towards the left.

[Figure 2: Five Gestures Influenced by an Arc Motion Sequence — arc, line, wave, circle and angle inputs, each influenced by an arc]

The unaugmented and augmented filters both contain units for motion state estimation and dynamics update. The state estimation unit blends the input motion sequence with the current state vector and passes the data to the dynamics update process. There, forward dynamics are performed on the state vector, producing angular accelerations. These are numerically integrated, generating the next state vector. The next state vector is fed back into the system at the Kalman blend and sent to be compared with the output from the augmented filter. The Kalman gain is updated from the current error covariance, which is subsequently updated by data from the dynamics update process.

The augmented filter also contains a control system, which is composed of a driving torque controller and a blending function. Torques used by the controller are derived from the parametric learned motion sequence and model, and applied to the forward dynamics of the system. After numerical integration, an intermediate state vector is passed to the blending function, where it is mixed with the aligned and parameterized learned motion sequence, producing the next state vector. The motivation behind the augmented filter is that if the input motion sequence matches closely to the learned motion sequence (e.g., in Figure 2, the arc-in-arc module), then the resulting trajectory should be very similar to the input.
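The comparison that produces the output score (the squared difference between the two filters' outputs, summed and normalized, as described in the introduction) can be sketched as follows; the function name is illustrative, not the authors' code:

```python
import numpy as np

def adaptation_score(unaugmented_out, augmented_out):
    """Sum the squared differences between the trajectories produced by the
    unaugmented and augmented filters, normalized by sequence length.
    A small score suggests the input resembles the learned gesture."""
    a = np.asarray(unaugmented_out, dtype=float)
    b = np.asarray(augmented_out, dtype=float)
    return float(np.sum((a - b) ** 2) / len(a))
```

One such score is produced per learned gesture, so a bank of augmented filters yields a score vector that a downstream recognizer can compare.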
(In Figure 2, the darkest grey line indicates the "influencing" arc sequence, the lightest grey is the input sequence, and the mid-grey is the output sequence; the images show the degree of influence that the arc has on each input sequence, which is controlled by the Kalman gain in the EKF.) Thus the trajectories output by the unaugmented and augmented filters will be nearly identical, and the output score will be a very small number. However, if the input motion sequence is dissimilar to the learned sequence (e.g., in Figure 2, the line-in-arc module), the trajectories will differ greatly and likewise the score will be large.

In the next few sections we describe the details of each of the components of the motion adaptation filter.

4.1. Arm Model

A dynamic articulated model of a human arm is integrated into the filter. The arm model consists of a 3-DOF shoulder joint, a 1-DOF elbow joint, and cylinder linkages between the shoulder and elbow and between the elbow and wrist. The model is shown in Figure 3. We ignore the wrist twist in the lower arm. We also capture the three degrees of freedom for the torso, which is used to produce a relative coordinate system for the arm. The three degrees of freedom from the torso are eliminated after the coordinate transformation takes place between the torso and shoulder.

[Figure 3: Articulated Arm Model — world coordinate system, 3-DOF shoulder, 1-DOF elbow and wrist, with upper- and lower-arm cylinder links of lengths l_U and l_L, masses m_U and m_L, and COM radii r_U and r_L]

The positions of the wrist and elbow can be determined by using the kinematic equations of motion for the arm model. The equations are parameterized using joint angles for each degree of freedom of the joints in the model. They are

    p_E = l_U ( s_\theta c_\phi,  s_\theta s_\phi,  c_\theta )^T
    p_W = p_E + R_x(\theta) R_y(\phi) \, l_L ( s_\psi c_\chi,  s_\psi s_\chi,  c_\psi )^T        (5)

where p_E and p_W are the positions of the elbow and wrist, respectively, l_U and l_L are the corresponding lengths of the upper and lower arm, R_x(\theta) and R_y(\phi) are rotation matrices about the respective axes x and y, and s and c denote sines and cosines of the rotation angles \theta, \phi, \psi and \chi.

4.2. Motion State Estimation

Motion state estimation is used to predict the state vector at the next time step from the current state of measured input, the dynamic model, and statistical models of the measurement and control systems. The statistics for the measurement process and control system are in the form of error covariance matrices, and are pre-determined using training and measurements from the user workspace. They are used by the EKF, along with data from the dynamics update process, to determine the current Kalman gain.

The Kalman gain is critical for state estimation in the system and requires knowledge from the dynamics and measurement processes. This data includes the four (8x8) Jacobian matrices A, W, H and V from Equations 1 and 2, which relate the process and measurement systems' state vectors to the current state vector. The elements of these matrices are predetermined symbolically and updated numerically as the filter operates, with H = V = I, where I is the 8x8 identity matrix. The matrices A and W are updated by taking the partial derivatives, with respect to the current state vector, of their respective complete forward dynamics equations. The augmented and unaugmented filters have different formulations. The formulation for the augmented filter is

    f(q, \dot{q}, w_q, w_{\dot{q}}) = M_w^{-1} \left( \tfrac{1}{2} \dot{q}^T \frac{\partial M_w}{\partial q} \dot{q} - \dot{M}_w \dot{q} + \tau(q_g, \dot{q}_g) \right)        (6)

and for the unaugmented filter is

    f(q, \dot{q}, w_q, w_{\dot{q}}) = M_w^{-1} \left( \tfrac{1}{2} \dot{q}^T \frac{\partial M_w}{\partial q} \dot{q} - \dot{M}_w \dot{q} \right)        (7)

where w_q and w_{\dot{q}} are vectors of random variables representing "white" noise with zero mean and constant variance, associated with the process model's state vector and velocities, respectively. M and \dot{M} are the inertia matrices defined in Section 4.3, composed of members from the state vector q and angular velocities \dot{q}. M_w and \dot{M}_w are matrices similar to M and \dot{M}, but wherever an element of q or \dot{q} appears, the appropriate random variable from w_q or w_{\dot{q}} is added to that member. For example, if \theta appears in an element of matrix M, then it is replaced with \theta + w_{q,1} in M_w, where w_{q,1} is the first element in the vector w_q, since \theta is the first element in q.

4.3. Dynamics Update

The dynamics update process provides parameter updates for motion state estimation and the control system. It takes the current state of the system and the arm model (and a set of torques for the augmented filter), and performs forward dynamics to produce the parameter update functions (described in Section 4.2) and the angular accelerations \ddot{q}. We used Euler numerical integration [49] to update the next state vector. We could have used a more sophisticated integration method, but we found that Euler integration was satisfactory.

Here we show the derivation of the forward dynamics equation for the 4-DOF articulated arm model, which generates the angular accelerations and is used to derive the complete forward dynamics equations (Equations 6 and 7).

In order to derive the dynamics equations, the masses, lengths and moments of inertia of the arm segments are needed. Each arm segment is represented by a thin cylinder rotating about its endpoint. The center of mass for each cylinder is estimated using data from a study on anthropometric parameters for the human body [50]. The data gives estimates of the segmental center of mass (COM) locations, expressed as percentages of the segment lengths. These are measured from the proximal end of the segments.
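The Euler update of the state vector from the angular accelerations produced by forward dynamics can be sketched as follows (an illustrative explicit-Euler step, not the authors' code):

```python
import numpy as np

def euler_step(q, qdot, qddot, dt):
    """Advance joint angles q and velocities qdot by one explicit Euler
    step, given angular accelerations qddot from forward dynamics."""
    qdot_next = qdot + dt * qddot
    q_next = q + dt * qdot
    return q_next, qdot_next
```

A higher-order integrator (e.g., Runge-Kutta) would reduce truncation error per step, but, as noted above, the paper found simple Euler integration satisfactory at tracking rates.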
The moment of inertia for 4 each segment is computed by combining the inertia tensor of withanintermediatestatevectorfromthedynamicsupdatepro- the representative cylinder body and inertial component asso- cess. The degree of its influence is controlled by a fixed pre- ciated with the shift of its COM to the endpoint. The inertial determinedblendingfactor. Thelearnedmotionsequencealso componentsassociatedwiththeshiftoftheCOMare remains fixed throughout the iteration of the filter. We see the driving torque controller as analagous to an open-loop predic- tive control and the blending function as analagous to propri- (cid:0) (cid:5) (cid:8) (cid:11) (cid:10)(cid:1)(cid:0)(cid:28)(cid:5)(cid:18)(cid:7)(cid:10)(cid:9)(cid:15)(cid:11)(cid:14)(cid:13) (cid:12) (cid:10)(cid:1)(cid:0)(cid:28)(cid:5)(cid:8)(cid:7)(cid:10)(cid:9)(cid:12)(cid:7)(cid:10)(cid:13) (cid:12) (cid:10)(cid:1)(cid:0)(cid:28)(cid:5)(cid:18)(cid:11)(cid:14)(cid:9) (cid:19) (cid:23) (cid:12) oceptive and sensory feedback. Our control system has simi- (cid:0) (cid:8) (cid:7)(cid:20)(cid:19) (cid:11) (cid:21) (cid:19) (cid:7)(cid:20)(cid:23) (cid:11)(cid:25)(cid:24) (cid:19) (cid:11) (cid:10)(cid:1)(cid:0) (cid:7)(cid:27)(cid:26)(cid:28)(cid:11)(cid:14)(cid:29) (cid:12) (cid:10)(cid:1)(cid:0) (cid:7)(cid:27)(cid:26) (cid:7)(cid:10)(cid:29) (cid:12) (cid:10)(cid:1)(cid:0) (cid:11)(cid:31)(cid:26) (cid:19) (cid:23) (cid:12) laritiestothemodelreferenceadaptivecontrol(MRAC)system (cid:31) (cid:31) (cid:31) (cid:31) (8) presentedin[51,52],whichincorporatesareferencemodelofa where(cid:0) (cid:5) and(cid:0) arethepositionsinCartesianworldspaceof motionsequence,invertsitsdynamicsandappliestheresulting the estimated CO(cid:31) Ms of the upper and lower arm, respectively, torquesinacontrolledmannertotheinputdata. and (cid:0) (cid:5) and (cid:0) are the correspondingradial distancesfrom the The torques for the driving torque controller are computed shoulder and(cid:31)elbow, respectively. 
Time derivatives are taken usingtheinversedynamicstorqueformulation toget theangularvelocitiesatthe estimatedCOMsofthearm segments.Theseare (cid:0)(cid:19) (cid:3) (cid:8) (cid:2) (cid:3) (cid:17)(cid:19) (cid:12)(cid:26)’ (cid:8)(cid:4)(cid:3)(cid:6)(cid:5) (cid:12) (cid:16)(cid:1)(cid:7) (9) (cid:28) (cid:11) M (cid:12) M(cid:19)(cid:19) (cid:8) & (cid:11) M N (cid:12) M(cid:19)N (cid:19) (cid:25) A(cid:6) M(cid:19)(cid:23) (cid:30)(cid:30)# C M(cid:19) (cid:10) C (cid:19) M(cid:19)+ (14) where the Jacobian matrices (cid:2) (cid:5) (cid:8) (cid:30)(cid:9)(cid:8)(cid:11)#(cid:10) and (cid:2) (cid:8) (cid:30)(cid:12)(cid:8)(cid:11)#(cid:13) , and where& isthevectorofappliedtorquesfromthecontroller,and M(cid:19) (cid:8) (cid:11) (cid:24)(cid:19) (cid:12) (cid:21)(cid:19) (cid:12) #(cid:19) (cid:12) $(cid:19)(cid:19) (cid:23) . Theinertialcompon(cid:30)entsare (cid:31) (cid:30) jointanglesM N andangularvelocitiesM(cid:19)N arefromtheinfluenc- inggesturesequence. Thejointconfigurationsaretransformed (cid:14) (cid:5) (cid:8) / (cid:5) (cid:2) (cid:5)(cid:14)(cid:23) (cid:2) (cid:5) (cid:25) (cid:14)(cid:16)(cid:15)(cid:18)(cid:17)(cid:6)(cid:19) " (cid:5) (cid:12) (10) so thattheycorrelate with the learnedmodel’sjoint configura- (cid:14) (cid:8) / (cid:2) (cid:23) (cid:2) (cid:25) (cid:14)(cid:16)(cid:15)(cid:20)(cid:17)(cid:6)(cid:19) " (cid:12) tions. (cid:31) (cid:31) (cid:31) (cid:31) (cid:31) Since there is no feedback in the driving torque controller, wlohweerrea(cid:14)rm(cid:5) ,arnedsp(cid:14)e(cid:31)ctiavreelyth,e/ i(cid:5)nearntidal/ comarpeotnheenetsstiomfathteedumpapsesreasnodf the torques can be precomputed. 
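The Equation 10-style inertia construction (mass-weighted outer product of the COM Jacobian plus the cylinder's own inertia) can be sketched as follows; the thin-rod diagonal value and all names are illustrative assumptions, not the paper's exact tensors:

```python
import numpy as np

def segment_inertia(mass, length, com_jacobian):
    """Generalized inertia for one arm segment: the mass-weighted product
    J^T J of the COM Jacobian plus a diagonal cylinder-body inertia
    (illustrated with the thin-rod endpoint value m*L^2/3)."""
    n = com_jacobian.shape[1]
    i_cyl = np.eye(n) * (mass * length ** 2 / 3.0)
    return mass * com_jacobian.T @ com_jacobian + i_cyl
```

The result is symmetric by construction, which is what the kinetic-energy form of the next section requires.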
When (cid:28) (cid:11) M (cid:12) M(cid:19)(cid:19) is applied to ctheesarermpresseegnmtienngtst,haentdhi(cid:14)(cid:21)n(cid:15)(cid:20)c(cid:17)(cid:6)y(cid:19) l"in(cid:5) daenrdb(cid:31) o(cid:14)(cid:21)d(cid:15)(cid:20)y(cid:17)(cid:6)(cid:19)i"n(cid:31)ertairaesdaibaoguotneaalcmhaptrai-- ttrhaejedcytonraymaicnsaliotginofluusetnoctehsethineflmueonticoinngofsethqeuemncoed.elHtoowfoelvloerw, iat is not necessarily strongly influencing the raw motion data to rameterized axes (cid:24) , (cid:21) , # and $ . The elements in (cid:14)(cid:21)(cid:15)(cid:20)(cid:17)(cid:6)(cid:19) " (cid:5) and movetowardsthelearnedmotionsequence. Thestrengthofthe arefilledbyconvertingthecylinder’sEuclideancoordi- n(cid:14)(cid:21)a(cid:15)(cid:20)t(cid:17)(cid:6)e(cid:19)s" t(cid:31)osphericalcoordinates. influenceiscontrolledbyascalingparameter(cid:29)(cid:31)(cid:30) thatisapplied The angular velocities and inertias are used to compute the totheKalmanfilter’sprocessmodelerrorcovariancematrix(cid:29) . Thisaffectshowmuchthesystem“trusts”therawmotiondata kineticenergy versus the dynamic model. As changes it directly impacts (cid:29)(cid:21)(cid:30) (cid:21) (cid:22) (cid:8) A(cid:6) (cid:17)(cid:19) (cid:23) C (cid:17)(cid:19) (cid:12) (11) rhoorwinththeerespyostretemd.cAonstarorlelseurlet,rrthoerrKelaaltmesantofitlhteerm’segaasiunremmaetrnixte(cid:1)r- whereC (cid:8) (cid:14) (cid:5) (cid:25) (cid:14) . Thepotentialenergyisgivenas (Equation2),stabilizesdifferently,thereforechanginghowthe (cid:31) Kalmanfilterweightsinputmotionversuscontrollerinfluence. The blending function supplements the driving torque con- (cid:21) (cid:25) (cid:8) (cid:10) / (cid:5) @ (cid:0)(cid:28)(cid:5)(cid:18)(cid:11) (cid:28) (12) trollerbyprovidingmoreguidancetothestateestimation. 
The F (cid:10) / @ (cid:3)(cid:6)(cid:5)(cid:16)(cid:11) (cid:28) (cid:10)(cid:22)(cid:0) (cid:7)(cid:24)(cid:23)I(cid:11)(cid:26)(cid:25) (cid:7) (cid:28) (cid:25) (cid:0) (cid:11) (cid:28) (cid:11)(cid:27)(cid:23) K (cid:12) driving torque controller provides the dynamics drive for the (cid:31) (cid:31) (cid:31) model, butitdoesnotalways providesufficientguidance. The where isthegravitationalconstant. Thetwoenergytermsare @ influencing motion sequence’s torques may be nonlinear with usedfortheLagrangian,(cid:16) ,ofEquations3and4.Thedynamics respecttothejointconfigurations,butthetrackingsystemper- equationsarecomputedandsolvedforangularacceleration forms blendingofjointconfigurationslinearly. Therefore, due tolinearblending,smallchangesinthejointconfigurationscan MO (cid:8) C (cid:1) (cid:6)GF A(cid:6) M(cid:19)(cid:23) (cid:30)# FC KM(cid:19) (cid:10) C (cid:19) M(cid:19) (cid:25) & K (cid:12) (13) produce large changes in the dynamics. This directly affects (cid:30) where isthesetofappliedtorques. howthedrivingtorquecontrollerperforms. Theblendingfunc- & tionisintendedtocounteractthiseffect. 4.4. Control System The blending function incorporates the current state of the system with the raw motion data from a learned motion se- Our control system, in effect, acts analogously to the motor quence. Therawmotiondataincludesthejointanglesandan- nervous system in the human body by providing guidance for gular velocities. This data is linear with respect to the motion how the arm model is applied to update the motion state vec- state configurations of the system. The blending function that tor. Thecontrolsystemspecificallyinfluenceshowthelearned weuseis motion sequence acts on the current motion state vector. It is composed of a driving torque controller and a blending func- tion. The driving torque controller uses data from the learned (cid:0) (cid:3)(cid:5)(cid:4)(cid:7)(cid:6) (cid:8) (cid:15) (cid:11) (cid:0) (cid:3) (cid:25)! 
#" (cid:0) (cid:19)(cid:3) (cid:19) (cid:25) (cid:11)0) + (cid:10) (cid:15) (cid:19) (cid:0) N(cid:3) (cid:12) (15) motion sequenceand armmodel andperformsinversedynam- ics, which generates torques for the dynamics update process. where (cid:0) (cid:3) (cid:8) FM (cid:12) M(cid:19)K(cid:23) , (cid:0) (cid:19)(cid:3) (cid:8) FM(cid:19) (cid:12) MO K(cid:23) , (cid:0) N(cid:3) (cid:8) FM N (cid:12) M(cid:19)N(cid:4)K(cid:23) , #" is the The blending function combines the learned motion sequence currenttimestep,and istheblendingfactor. (cid:15) 5 Filter for Gesture 1 Unknown Scores M Aortmion T Mr aUocnktiiiotnng S Meqouteinocne FGieltsetur rfeo r2 RecoUgnniittion R eGceosgtnuirzeed Parametric Learned Line Arc Wave Circle Angle Motion Sequences Filter for Gesture N Filter Parameters Figure5: Wrist-TrajectoryShapesoftheGestureDatasetsused Figure4: TestRecognitionSystemArchitecture fortheExpertUserExperiments 5. Analysis of Filter S S S S S S In order to test the effectiveness of our new filter, we imple- Arc and Wave Arc and Angle Arc and Circle Wave and Circle Wave and AngleCircle and Angle mented it, selected a difficult–to–discriminate gesture dataset, Figure6: OverlappingFeaturesEmbeddedinGesturePairs andranuserstudies. 5.1. DesignofTestSystem theoverlappingfeaturesembeddedinmanyofthepairsofges- tures (see Figure 6 for an illustration). Two distinct gestures We designed a system to test the motion adaptation filter by that have overlappingmotion segments, especiallyif theystart adaptinga simpletemplate-style gesturerecognizer. We chose with the same motion sub-sequence, are more difficult to dis- thetemplaterecognitionsystembecauseitiseasytoimplement tinguish than dissimilar nonoverlapping gestures. A properly and is very easy to understand. However, our filter can work tuned EKF bases its initial output more on the input data than withmoststandardrecognitionarchitectures(e.g.,onebasedon the dynamic model. But, when it convergesto a stable blend- neuralnetworks). 
Thetemplatearchitectureworksbycompar- ingstate,thedynamicsofthesystem takeover. Iftwogestures ingtheunknowninputsequencewitheachgesturepattern. For have similar starting trajectories and abruptly change after the our case, the unknowninput is passed through a motion adap- dynamicsbecomemoredominant, thesystemwillinitiallyfail tation filter associated with each gesture (see Figure 4 for an to discriminate between the two gestures because the derived overview). dynamics of the system are similar. Eventually the mixture of Human motion data is brought into the system by a motion the two dissimilar segments of the gestures will influence and trackingunitandsegmentedbysearchingforlongpausesinthe changethesystembehavior. motion sequences. The choice of tracking system is arbitrary, For our experiments, we considered the direction in which aslongitcangenerateacontinuoussequenceofmotionstates. the motiontrajectorywasperformedbythe userasameansto For this architecture, the output is distributed in parallel to N distinguish the gesture sequences. Thus the five basic shapes copies of the filter. Each of the filters is custom-tuned for a showninFigure5producetendifferentgestures(twoforeach specificgesture. Theoutputofthefiltersisa setofscoresthat shape). We usedcombinationsofthe fivebasic shapes togen- areprocessedbytherecognitionunit.Thescoresarethesquared erategesturedatasetsandtesttheperformance,generalizability differencesoftheinternalunaugmentedandaugmentedfilters. andextensabilityofthefilterinthefirstfourexpert–userexper- In our system, we used a magnetic tracking system simply iments. to ensure a degree of assurance that a reliable stream of input data is sent to our processing filter. There are obviously more 5.3. 
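Before turning to the filter parameters, the two computations just described, the blending step of Equation 15 and the squared-difference scoring used by the recognition unit, can be sketched as follows. This is a minimal illustration with our own function names and toy vectors; the actual filter applies these operations to the full [θ, θ̇] motion state vectors inside the EKF loop.

```python
def blend(x, x_dot, x_learned, dt, b):
    """Eq. 15: x_{t+1} = b*(x_t + dt*x_dot_t) + (1 - b)*x_t^N, elementwise.

    x, x_dot  : current motion state and its time derivative
    x_learned : corresponding state from the learned motion sequence
    b         : blending factor in [0, 1]
    """
    return [b * (xi + dt * vi) + (1.0 - b) * ni
            for xi, vi, ni in zip(x, x_dot, x_learned)]

def score(augmented, unaugmented):
    """Summed squared difference between the augmented and unaugmented
    filter output sequences, normalized by sequence length."""
    total = sum((a - u) ** 2
                for xa, xu in zip(augmented, unaugmented)
                for a, u in zip(xa, xu))
    return total / len(augmented)
```

With b = 1 the blend ignores the learned sequence entirely; with b = 0 it snaps to it. A low score indicates that the learned sequence pulled the augmented filter only slightly away from the raw data, i.e., a likely match.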
5.3. Filter Parameters

Our filter requires a set of parameters that must be predetermined and tuned for individual gestures. The EKF requires error covariance data for the measurement and control processes. The dynamics update requires measurements from the user's arm. The control system requires a blending constant and the learned motion sequence.

5.3.1. Parameter Determination

To compute the measurement error covariance we affixed three motion tracking receivers in the user workspace in a stationary configuration analogous to that of the right arm. We recorded 1000 samples continuously and estimated the error. The error covariance matrix is computed using the angles and angular velocities. The angular velocities are estimated by treating the set of samples as a continuous stream of data and taking time differences with respect to the sampling period. The measurement error needs to be computed once for a given set of hardware and workspace.

The control process error is computed by using the pre-recorded gesture sequences. A parametric learned motion sequence for each gesture type is selected by determining the closest fitting trajectory to a normal trajectory that is computed from the sample set of gestures. The error matrix is estimated using the mean squared error between the parametric learned motion sequence and the rest of the sequences. The control error needs to be computed for every gesture sequence.

5.3.2. Subject Measurements

Some of the parameters needed for the filters are taken from measurements of the users. The filters require the lengths, radii and masses of the upper and lower arm. These parameters are obtained by combinations of two methods: direct measurements and estimation from anthropometric parameters of the human body. The lengths are determined by either directly measuring the distance between the shoulder and elbow, and elbow and wrist, or estimating them from the height and sex of the user. Estimations of anthropometric parameters of the human body are made according to the procedure outlined in Hall [50]. The radii are obtained by measuring the circumferences of the arm segments at the midpoint. The masses are obtained by weighing the subject and estimating the arm segment masses based on a study with mass measurements taken from cadavers. The masses for the arm segments are determined as percentages of the whole body mass for males and females.

5.3.3. Parameter Tuning

In order to use the EKF, specific parameters have to be tuned to obtain desirable guidance in the recognition units. One of the parameters that needs tuning is a multiplicative factor k_ctrl used to scale the augmented filter's control error covariance Q. There is one such scaling factor for each control error covariance matrix. The scaling factor is used to adjust the level of "trust" in the filter by changing the control error with respect to the measurement error. The larger k_ctrl is, the more the filter output depends on the input. The smaller k_ctrl is, the more the filter output depends on the controller and dynamic model. As a result the Kalman gain matrix, essential for the Kalman blend, changes.

For the unaugmented filter, we used a multiplicative scaling factor to adjust how much smoothing is performed on the input gesture sequence. This parameter, again, is applied to the control error covariance, and acts in the same manner as k_ctrl does for the augmented filter. It is adjusted by lowering its value until the input data trajectory becomes smooth while still following roughly the same trajectory. If it is decreased too much, the output trajectory will follow the unguided dynamic model too closely.

Another parameter to be tuned is the blending factor b. This is applied in the blending function, which blends the intermediate state vector x_{t+1} with the parametric learned motion sequence. This factor is important because it weights how the raw data is blended with the parametric learned motion sequence at the motion state level. The Kalman blend does not directly incorporate knowledge of the parametric learned motion sequence. We used one blending factor for all the gesture types.

To show how we made our choice of parameters for the augmented filter, we devised a simple experiment that showed the effects of adjustments to the parameters k_ctrl and b. For demonstration purposes we chose an arc gesture as our raw data and a line as the learned motion sequence. The arc and line sequences refer to the complete motion of the arm as it traces out a spatial arc and line, respectively, at the wrist. In Figure 7 we illustrate the effect on the wrist position, over the full motion sequence, of varying the parameters k_ctrl and b from 0 to 1. In the figure, the light line is the input arc sequence, the black line is the line sequence and the dark grey line is the adapted motion.

Figure 7: Effects of Varying Augmented Filter Parameters (wrist trajectories for b = 1.00, 0.95, 0.85, 0.45, 0.00 and k_ctrl = 1.000, 0.050, 0.010, 0.001, 0.000).

The figure shows that the changes are not necessarily linear with respect to either parameter. In addition, the two parameters produce different effects in the adapted trajectory. The controller scaling parameter appears to influence the raw motion data to morph into a rough form of the influencing sequence, but it does not align very well with it. This is evident in the images in the left-most column. The blending factor appears to blend the two trajectories fairly well by itself, but not completely. An appropriate combination of the two parameters produces the best blend. For the different gestures we used varying values for k_ctrl, but we chose a fixed value for b. We made our selection based on the graphs in the figure: b was chosen from the graphs that produced a mixed trajectory with a bit more influence from the learned motion sequence.

An important consideration when selecting the parameters is the degree of alignment of the input gesture with respect to the learned gesture. In the experiments, we ask the users to extend their right arm in a perpendicular direction to the front side of their body. The gestures they are asked to perform are then centered around that hand position as best as possible. Rough alignment and scaling is applied to the parametric learned gesture in addition to the parameterizing that is necessary to perform a matching comparison. This is the registration phase, which can be seen on the right side of the filter diagram in Figure 1. If the parametric learned gesture does not align very well with the gesture it is supposed to accept, it creates a high score for the comparison. This is due to our method for evaluation, which compares the augmented and raw input trajectories. If the alignment is extremely bad, we could not adjust the k_ctrl parameter enough to "trust" the model as much.
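The role of the trust parameter can be illustrated with a one-dimensional analogue of the Kalman update. This is our own simplification for illustration: the scalar values p, q and r stand in for the covariance matrices, and k_ctrl names the multiplicative scaling factor applied to the control error covariance.

```python
def kalman_update(x_model, x_meas, p, q, r, k_ctrl):
    """Scalar Kalman update with the control/process error q scaled
    by k_ctrl, mimicking how the scaling factor shifts 'trust'.

    x_model : prediction from the controller-driven dynamic model
    x_meas  : raw measurement (input motion data)
    p, q, r : state, control/process and measurement error variances
    """
    p_prior = p + k_ctrl * q        # larger k_ctrl -> larger prior uncertainty
    gain = p_prior / (p_prior + r)  # -> larger gain -> trust the measurement
    x_post = x_model + gain * (x_meas - x_model)
    p_post = (1.0 - gain) * p_prior
    return x_post, p_post
```

With a large k_ctrl the estimate follows the raw input; as k_ctrl shrinks, the estimate stays near the model's prediction, which matches the tuning behavior described above.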
In most cases such misalignment is not a problem, but for a dataset that is difficult to recognize, such as the basic five gestures in Figure 5, some gestures will be improperly classified.

5.3.4. Sensitivity Analysis

If we were to run a full user study on human subjects of widely varying mass and height, it would be important to understand how much of an impact parameter changes have on the dynamics of the system. If it can be shown that the system is relatively insensitive to changes in the parameters, then it may be considered more generalizable and potentially more powerful. We analyzed the sensitivity of a few of the body parameters (summarized in Schmidt [53]), but did not determine enough meaningful information to draw conclusions about the generalizability of our filter.

To get a better idea of how our method works, refer back to Figure 2. The arc in the arc module shows the best match between the augmented and the unaugmented (effectively the learned motion sequence) trajectories. The rest of the cases show that the learned arc sequence has a large influence on the data running through the augmented filter, which is evident in the output augmented trajectories. This effect pulls the augmented and raw data curves apart. The sequences in Figure 8 illustrate a small set of state transitions from the three arm models used in generating the trajectories for the line and the arc in the arc module. The figures show frames from a 3D simulation of the corresponding schematic 4-DOF arm models. The arm states are very similar for the arc in the arc module, but very different for the line in the arc module.

Figure 8: Arm Model Motion in Time (frames at t = 0.06 through 0.98; a) line in arc module, b) arc in arc module).

5.4. Expert User Experiments

We set out to verify the effectiveness of the filter integrated into a gesture recognizer by devising a set of experiments to be performed by an expert user. These were designed to test the performance of the recognizer with and without our filter. We also wanted to ascertain how generalizable and extensible our filter is with respect to different and larger gesture datasets, respectively. To accomplish these goals, we ran five experiments. Before beginning, we pre-recorded a database of gestures from the user, computed the parameters and learned models, and performed manual parameter tuning.

5.4.1. Accuracy Performance

The purpose of the first experiment is to determine the performance rating of the recognizer with and without our filter integrated. We used the five gestures from Table 1, and recorded 100 samples for each gesture. The gestures were first aligned with the learned motion sequences, then the learned motion sequences were parameterized to match the size of the input sequence. We supplied both the filtered (our method) and unfiltered recognizers with the 500 gestures. The results are given in Table 1.

Table 1: Results of Experiment #1

                       Arc     Line     Wave     Circle   Angle   Totals
Unfiltered Approach    99/100  99/100   100/100  100/100  99/100
                       99%     99%      100%     100%     99%     99.4%
Our Filtered Approach  98/100  100/100  100/100  100/100  99/100
                       98%     100%     100%     100%     99%     99.4%

They show that both methods have an accuracy rating of 99.4%.

5.4.2. Generalizability

To test the generalizability of our approach, we ran a second experiment. In this experiment we used the reverse-order wrist trajectories for the gestures used in the first experiment (a completely unique dataset). We recorded 100 samples for each of the five gestures and purposely added noise to the samples to test the robustness of our filter. Then we passed them into the gesture recognizer twice, with and without our filter in the system. The resulting performance ratings are given in Table 2. In this case, the accuracy of the recognizer integrated with our filter proved to be far superior to the recognizer without it. The performance rating for our filtered approach is 98.8%, while the unfiltered is 79.8%.
Table 2: Results of Experiment #2

                       Arc     Line     Wave     Circle   Angle   Totals
Unfiltered Approach    60/100  100/100  78/100   62/100   99/100
                       60%     100%     78%      62%      99%     79.8%
Our Filtered Approach  98/100  100/100  100/100  100/100  96/100
                       98%     100%     100%     100%     96%     98.8%

The fact that both methods produced acceptable results in the first experiment turned out to be only coincidental for the unfiltered approach, which was later shown to be very inconsistent. We analyzed the first dataset further and noticed that the gestures were fairly spatially regular with respect to each other. For example, there was not an extensive amount of variation due to alignment, skewing and scaling among the like gestures in that set.

5.4.3. Extensibility

For the third experiment, we examined the extensibility of our approach. To do this, we increased the number of distinct gestures that the recognizer had to distinguish. We decided to use the two sets of gestures from the first two experiments and combine them into one database. Although diagrams make the two gesture sets appear similar, the motions that the human subject has to perform with the arm are totally different. When we performed the same experimental procedure as before, we obtained the results in Table 3.

Table 3: Results of Experiment #3

                       Arc     Line     Wave     Circle   Angle   Arc     Line     Wave     Circle   Angle   Totals
Unfiltered Approach    99/100  99/100   100/100  100/100  99/100  60/100  100/100  78/100   62/100   99/100
                       99%     99%      100%     100%     99%     60%     100%     78%      62%      99%     89.6%
Our Filtered Approach  98/100  100/100  100/100  100/100  99/100  98/100  100/100  100/100  100/100  96/100
                       98%     100%     100%     100%     99%     98%     100%     100%     100%     96%     99.1%

(The second five columns are the reverse-order gestures from the second experiment.)

Here our method has an accuracy rating of 99.1%, while the unfiltered approach has a rating of 89.6%. This gives us a good indication that our method is extensible to larger gesture datasets.

5.4.4. More Generalizability Experiments

At this point we decided to revisit the first experiment with the hope of making it more difficult to distinguish the gestures than before. The goal of the new experiment was to show more generalizability with our method. In order to do this, we replaced the line and the wave with a triangle and another form of the arc. The new arc gesture is generated using a bend at the elbow instead of the straight-arm motions used for the original arc. By our definition of arm gestures (i.e., movements of the arm that may or may not have any meaningful intent) and our analysis of only the "end-effector" position of the arm at the wrist, we do not make any distinction between the new and old arc gestures, since both have identical wrist trajectories. The triangle gesture resembles the angle gesture in the first time steps, but deviates from it near the end. Our assumption was that this choice of gestures would be harder to discriminate. 75 trials were run for each gesture. The approximate gesture shapes and results of this experiment are given in Table 4.

Table 4: Results of Experiment #4

                       Arc    Triangle  Bent-arm Arc  Circle  Angle  Totals
Unfiltered Approach    75/75  68/75     65/75         75/75   74/75
                       100%   90.7%     86.7%         100%    98.7%  95.2%
Our Filtered Approach  74/75  74/75     72/75         74/75   74/75
                       98.7%  98.7%     96.0%         98.7%   98.7%  98.1%

The results show that the new gesture set was a bit harder to recognize for both methods. But our filtered approach showed an accuracy rating of 98.1%, compared with the unfiltered approach's rating of 95.2%. The results were again encouraging with regard to our method's consistency and accuracy, and also show that it generalizes to different gestures quite well.

For our fifth experiment we ran 50 trials with five new gestures, each significantly different from the others. In addition, we decided to choose somewhat natural gestures. The goal of the experiment was to determine if our method works well with gestures that are very easy to distinguish because they are quite distinct and more natural. Our choice of gestures included the "zorro" sign, Catholic cross, salute, a stop and a waving gesture. Diagrams of the motions of the wrist and results of the experiment are shown in Table 5.

Table 5: Results of Experiment #5

                       Catholic Cross  Zorro   Waving  Stop    Salute  Totals
Unfiltered Approach    50/50           46/50   50/50   50/50   50/50
                       100%            92%     100%    100%    100%    98.4%
Our Filtered Approach  50/50           50/50   50/50   50/50   50/50
                       100%            100%    100%    100%    100%    100%

The results show that our method was 100% accurate on this gesture set, while the unfiltered approach achieved an accuracy rating of 98.4%.

5.4.5. Discussion

In the experiments, we evaluated the accuracy performance, generalizability and extensibility of our filter when integrated in a recognition system. We made steps to ensure that it was difficult to distinguish among gestures by carefully selecting gesture datasets with overlapping motion traits. When compared with the recognizer with no filter attached, our method showed improved recognition performance. Our results from the five experiments show that our method is consistently accurate, with rates ranging from 98.1% to 99.4%, and extends to multiple gesture datasets. This compares very favorably with the unfiltered method, whose accuracy ranged from 79.8% to 99.4%.

6. Pilot Study

We performed a pilot study involving six different subjects, in order to evaluate our model-based approach in a more general sense (i.e., across different subjects). We also ran a followup study to test how we can reduce the amount of parameter tuning that is required for each of the filters. The details of each experiment are given in the next sections.
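As an arithmetic check, the experiment totals quoted in Section 5.4 follow directly from the per-gesture counts in Tables 1 through 5. A short sketch of the tally:

```python
def pct(correct, trials):
    """Overall accuracy (%) from per-gesture correct counts, with a
    fixed number of trials per gesture."""
    return round(100.0 * sum(correct) / (trials * len(correct)), 1)

assert pct([99, 99, 100, 100, 99], 100) == 99.4   # Table 1, unfiltered
assert pct([60, 100, 78, 62, 99], 100) == 79.8    # Table 2, unfiltered
assert pct([74, 74, 72, 74, 74], 75) == 98.1      # Table 4, filtered
assert pct([46, 50, 50, 50, 50], 50) == 98.4      # Table 5, unfiltered
```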
