3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers

Mohsen Mansouryar    Julian Steil    Yusuke Sugano    Andreas Bulling
Perceptual User Interfaces Group
Max Planck Institute for Informatics, Saarbrücken, Germany
{mohsen,jsteil,sugano,bulling}@mpi-inf.mpg.de

(The main part of this technical report was presented at ETRA 2016, March 14-17, 2016, Charleston, SC, USA.)

Figure 1: Illustration of the (a) 2D-to-2D, (b) 3D-to-3D, and (c) 2D-to-3D mapping approaches. 3D gaze estimation in wearable settings is the task of inferring 3D gaze vectors in the scene camera coordinate system.

Abstract

3D gaze information is important for scene-centric attention analysis, but accurate estimation and analysis of 3D gaze in real-world environments remains challenging. We present a novel 3D gaze estimation method for monocular head-mounted eye trackers. In contrast to previous work, our method does not aim to infer 3D eyeball poses, but directly maps 2D pupil positions to 3D gaze directions in scene camera coordinate space. We first provide a detailed discussion of the 3D gaze estimation task and summarize different methods, including our own. We then evaluate the performance of different 3D gaze estimation approaches using both simulated and real data. Through experimental validation, we demonstrate the effectiveness of our method in reducing parallax error, and we identify research challenges for the design of 3D calibration procedures.

Keywords: Head-mounted eye tracking; 3D gaze estimation; Parallax error

1 Introduction

Research on head-mounted eye tracking has traditionally focused on estimating gaze in screen coordinate space, e.g. of a public display. Estimating gaze in scene or world coordinates instead enables gaze analysis on 3D objects and scenes and has the potential for new applications, such as real-world attention analysis [Bulling 2016]. This approach requires two key components: 3D scene reconstruction and 3D gaze estimation.

In prior work, 3D gaze estimation was approximately addressed as a projection from estimated 2D gaze positions in the scene camera image to the corresponding 3D scene [Munn and Pelz 2008; Takemura et al. 2014; Pfeiffer and Renner 2014]. However, without proper 3D gaze estimation, gaze mapping suffers from parallax error caused by the offset between the scene camera origin and the eyeball position [Mardanbegi and Hansen 2012; Duchowski et al. 2014]. To fully utilize the 3D scene information, it is essential to estimate 3D gaze vectors in the scene coordinate system.

While 3D gaze estimation has been widely studied in remote gaze estimation, there have been very few studies in head-mounted eye tracking. This is mainly because 3D gaze estimation typically requires model-based approaches with special hardware, such as multiple IR light sources and/or stereo cameras [Beymer and Flickner 2003; Nagamatsu et al. 2010]. Hence, it remains unclear whether 3D gaze estimation can be done properly with only a lightweight head-mounted eye tracker. Świrski and Dodgson proposed a method to recover 3D eyeball poses from a monocular eye camera [Świrski and Dodgson 2013]. While it can be applied to lightweight mobile eye trackers, their method has only been evaluated on synthetic eye images, and its real-world performance, including the eye-to-scene camera mapping accuracy, has never been quantified.

We present a novel 3D gaze estimation method for monocular head-mounted eye trackers. Contrary to existing approaches, we formulate 3D gaze estimation as a direct mapping task from 2D pupil positions in the eye camera image to 3D gaze directions in the scene camera. For calibration, we therefore collect 2D pupil positions as well as 3D target points, and minimize the distance between the 3D targets and the estimated gaze rays.

The contributions of this work are threefold. First, we summarize and analyze different 3D gaze estimation approaches for a head-mounted setup. We discuss potential error sources and technical difficulties in these approaches, and provide clear guidelines for designing lightweight 3D gaze estimation systems. Second, following from this discussion, we propose a novel 3D gaze estimation method.
Our method directly maps 2D pupil positions in the eye camera to 3D gaze directions, and does not require 3D observations from the eye camera. Third, we provide a detailed comparison of our method with state-of-the-art methods in terms of 3D gaze estimation accuracy. The open-source simulation environment and the dataset are available at http://mpii.de/3DGazeSim.

2 3D Gaze Estimation

3D gaze estimation is the task of inferring 3D gaze vectors to the target objects in the environment. Gaze vectors in scene camera coordinates can then be intersected with the reconstructed 3D scene. We discuss three mapping approaches in this paper: 2D-to-2D, 3D-to-3D, and our novel 2D-to-3D mapping approach. In this section, we briefly summarize the three approaches. For more details, please refer to the appendix.

2D-to-2D Mapping

Standard 2D gaze estimation methods take 2D pupil positions p in the eye camera images as input. The task is to find the mapping function from p to 2D gaze positions s in the scene camera images (Figure 1(a)). Given a set of N calibration data items (p_i, s_i)_{i=1}^N, the mapping function is typically formulated as a polynomial regression. 2D pupil positions are first converted into their polynomial representations q(p), and the regression weight is obtained via linear regression methods. Following Kassner et al. [Kassner et al. 2014], we did not include cubic terms and used an anisotropic representation q = (1, u, v, uv, u^2, v^2, u^2 v^2), where p = (u, v).

In order to obtain 3D gaze vectors, most prior work assumes that the 3D gaze vectors originate from the origin of the scene camera coordinate system. In this case, estimated 2D gaze positions f can simply be back-projected to 3D vectors g in the scene camera coordinate system. This is equivalent to assuming that the eyeball center position e is exactly the same as the origin o of the scene camera coordinate system. However, in practice there is always an offset between the scene camera origin and the eyeball position, and this offset causes the parallax error.
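As a concrete sketch of this calibration step, the following Python/NumPy fragment builds the anisotropic polynomial features, fits the regression weights by least squares, and back-projects 2D estimates to rays through the scene camera origin. This is our own minimal illustration (function names and array shapes are assumptions), not the PUPIL implementation:

```python
import numpy as np

def poly_features(p):
    """Anisotropic polynomial representation without cubic terms:
    q = (1, u, v, uv, u^2, v^2, u^2 v^2) for pupil positions p = (u, v)."""
    u, v = p[:, 0], p[:, 1]
    return np.stack([np.ones_like(u), u, v, u * v,
                     u**2, v**2, u**2 * v**2], axis=1)

def fit_2d_to_2d(pupil_2d, gaze_2d):
    """Least-squares weights w minimizing sum_i |s_i - q_i w|^2.
    pupil_2d: (N, 2) pupil positions; gaze_2d: (N, 2) scene-image targets s_i."""
    w, *_ = np.linalg.lstsq(poly_features(pupil_2d), gaze_2d, rcond=None)
    return w  # (7, 2); estimated gaze positions are f = q w

def back_project(f, K):
    """Back-project 2D gaze positions f to unit 3D rays g from the scene camera
    origin using pinhole intrinsics K (lens distortion ignored in this sketch)."""
    rays = np.column_stack([f, np.ones(len(f))]) @ np.linalg.inv(K).T
    return rays / np.linalg.norm(rays, axis=1, keepdims=True)
```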
3D-to-3D Mapping

If we can estimate a 3D pupil pose (the unit normal vector of the pupil disc) from the eye camera, as done by Świrski and Dodgson [Świrski and Dodgson 2013], we can instead take a direct 3D-to-3D mapping approach (Figure 1(b)). Instead of the 2D calibration targets s, we assume 3D calibration targets t in this case.

With the calibration data (n_i, t_i)_{i=1}^N, the task is to find the rotation R and translation T between the scene and eye camera coordinate systems. This can be done by minimizing the distances between the 3D gaze targets t_i and the 3D gaze rays rotated and translated to the scene camera coordinate system. In the implementation, we further parameterize the rotation R by a 3D angle vector with the constraint that rotation angles are between -π and π, and we initialize R assuming that the eye camera and the scene camera face opposite directions.
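To make this optimization concrete, here is a minimal sketch in Python (NumPy/SciPy) of the 3D-to-3D calibration under the formulation given in the appendix (Eqs. (2)-(3)). It is our own illustration rather than the authors' implementation, and it omits the ±π bound on the rotation angles for simplicity:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def fit_3d_to_3d(normals, targets):
    """Estimate the rotation R (as a 3D rotation vector) and eyeball center e
    by minimizing E(R, e) = sum_i |R n_i x (t_i - e)|^2  (appendix Eq. (3)).
    normals: (N, 3) unit pupil pose vectors n_i in eye camera coordinates,
    targets: (N, 3) calibration targets t_i in scene camera coordinates."""
    def residuals(params):
        R = Rotation.from_rotvec(params[:3]).as_matrix()
        e = params[3:]
        rays = normals @ R.T                        # rotated gaze rays R n_i
        return np.cross(rays, targets - e).ravel()  # point-to-ray distances

    # Initialization: e_0 = (0, 0, 0) and R_0 = (0, pi, 0), i.e. eye and scene
    # cameras facing opposite directions, as described above.
    x0 = np.array([0.0, np.pi, 0.0, 0.0, 0.0, 0.0])
    res = least_squares(residuals, x0, method='lm')  # Levenberg-Marquardt
    return Rotation.from_rotvec(res.x[:3]).as_matrix(), res.x[3:]
```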
2D-to-3D Mapping

Estimating the 3D pupil pose is not a trivial task in real-world settings. Another potential approach is to directly map 2D pupil positions p to 3D gaze directions g (Figure 1(c)).

In this case, we need to map the polynomial feature q to unit gaze vectors g originating from an eyeball center e. g can be parameterized in a polar coordinate system, and we assume a linear mapping from the polynomial feature q to the angle vector. The regression weight is obtained by minimizing the distances between the 3D calibration targets t_i and the mapped 3D gaze rays, as in the 3D-to-3D approach. In the implementation, we used the same polynomial representation as in the 2D-to-2D method to provide a fair comparison with the baseline.

3 Data Collection

In order to evaluate the potential and limitations of the introduced mapping approaches, we conducted two studies. The first used data obtained from a simulation environment, whereas the second exploited real-world data collected from 14 participants.

Simulation Data

We first analyzed the different mapping approaches in a simulation environment. Our simulation environment is based on a basic model of the human eye consisting of a pair of spheres [Lefohn et al. 2003], together with models of the scene and eye cameras. The eye model and a screenshot of the simulation environment are illustrated in Figure 2. We used average human anatomical parameters: R = 11.5mm, r = 7.8mm, d = 4.7mm, and q = 5.8mm. The pupil is taken to be the center of the circle formed by the intersection of the two spheres. For both the eye and scene cameras, we used the pinhole camera model. Intrinsic parameters were set to values similar to those of the actual eye tracking headset we used in the real-world environment.

Figure 2: 3D eye model and simulation environment with 3D target points given as blue dots. The green and red rays correspond to ground-truth and estimated gaze vectors, respectively.

One of the key questions about 3D gaze estimation is whether calibration at a single depth is sufficient. Intuitively, obtaining calibration data at different depths from the scene camera can improve the 3D mapping performance. We set the calibration and test plane depths d_c and d_t to 1m, 1.25m, 1.5m, 1.75m, and 2m. At each depth, points are selected from two grids: a 5 by 5 grid which gives us 25 calibration points (blue), and an inner 4 by 4 grid of 16 test points (red) displayed on the white plane of Figure 2. Both grids are symmetric with respect to the scene camera's principal axis. From the eye model used, we are able to estimate the corresponding gaze ray.

Real-World Data

We also present an evaluation of the gaze estimation approaches on a real-world dataset to show the validity of the 3D gaze estimation approaches.

Procedure

The recording system consisted of a Lenovo G580 laptop and a Phex Wall 55" display (121.5cm × 68.7cm) with a resolution of 1920 × 1080. Gaze data was collected using a PUPIL head-mounted eye tracker connected to the laptop via USB [Kassner et al. 2014] (see Figure 3a). The eye tracker has two cameras: an eye camera with a resolution of 640 × 360 pixels recording a video of the right eye from close proximity, and an egocentric (scene) camera with a resolution of 1280 × 720 pixels. Both cameras recorded video at 30 Hz. Pupil positions in the eye camera were detected using the PUPIL eye tracker's implementation.

Figure 3: The recording setup consisted of a Lenovo G580 laptop, a Phex Wall 55" display and a PUPIL head-mounted eye tracker. (a) Video-based head-mounted eye tracker. (b) Display and distance markers.

We implemented remote recording software which presents the calibration and test recordings on the display to the participants. As shown in Figure 3b, the target markers were designed so that their 3D positions can be obtained using the ArUco library [Garrido-Jurado et al. 2014]. Intrinsic parameters of the scene and eye cameras were calibrated before recording and were used for computing the 3D fixation target positions t and 3D pupil poses n.

We recruited 14 participants aged between 22 and 29 years. The majority of them had little or no previous experience with eye tracking. Every participant performed two recordings, a calibration and a test recording, at each of five different distances from the display. Recording distances were marked by red stripes on the ground (see Figure 3b). They were aligned parallel to the display, with an initial distance of 1 meter and the following recording distances spaced 25cm apart (1.0, 1.25, 1.5, 1.75, 2.0m). For every participant we recorded 10 videos.

As in the simulation environment, the participants were instructed to look at 25 fixation target points from the grid pattern in Figure 3b. After this step, the participants performed the same procedure again while looking at 16 fixation targets placed at positions different from the initial calibration, to collect the test data for our evaluation. This procedure was then repeated for the other four distances. The only restriction we imposed was that participants should not move their head during the recording.

Error Measurement

Since the ground-truth eyeball position e is not available in the real-world study, we evaluate the estimation accuracy using an angular error observed from the scene camera. For the case where 2D gaze positions are estimated (2D-to-2D mapping), we back-projected the estimated 2D gaze position f into the scene and directly measured the angle θ between this line and the line from the origin of the scene camera o to the measured fixation target t. For the cases where 3D gaze vectors are estimated, we first determined the estimated 3D fixation target position t' assuming the same depth as the ground-truth target t. Then the angle between the lines from the origin o was measured.
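A minimal sketch of this error measure follows (Python/NumPy; our own reading of the description above — in particular, interpreting "the same depth" as the target's z-plane in scene camera coordinates is our assumption):

```python
import numpy as np

def angular_error_deg(p, t):
    """Angle in degrees between the rays from the scene camera origin o = 0
    to an estimated 3D point p and to the ground-truth fixation target t."""
    a = p / np.linalg.norm(p)
    b = t / np.linalg.norm(t)
    return np.degrees(np.arccos(np.clip(np.dot(a, b), -1.0, 1.0)))

def estimated_fixation(e, g, t):
    """For the 3D mapping cases: intersect the estimated gaze ray e + lam * g
    with the plane z = t_z to obtain t' at the ground-truth target's depth
    (assumed interpretation of 'same depth')."""
    lam = (t[2] - e[2]) / g[2]
    return e + lam * g
```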
4 Results

We compared the different mapping approaches in Figure 4 using an increasing number of calibration depths in both the simulation and real-world environments. Each plot corresponds to the mean estimation error over all test planes and all combinations of calibration planes. Angular error is evaluated from the ground-truth eyeball position. It can be seen that in all cases the estimation performance improves with more calibration planes. Even the 2D-to-2D mapping approach performs slightly better overall with multiple calibration depths in both environments. The 2D-to-3D mapping approach performed better than the 2D-to-2D mapping in all cases in the simulation environment. For the 3D-to-3D mapping approach, a parallax error near zero can be achieved.

Figure 4: Angular error performance over different numbers of calibration depths for the 2D-to-2D, 2D-to-3D and 3D-to-3D mapping approaches. Each point corresponds to the mean over all angular error values for each number of calibration depths. The error bars provide the corresponding standard deviations.

Similarly to the simulation case, we first compare the 2D-to-3D mapping with the 2D-to-2D mapping in terms of the influence of different calibration depths, shown in Figure 4. Since it turned out that the 3D-to-3D mapping on real-world data has a larger angular error (over 10°) than the 2D-to-3D mapping, we omit its results from the following analysis.

Contrary to the simulation results, with a lower number of calibration depths the 2D-to-2D approach performs better than the 2D-to-3D approach on real-world data. However, with an increasing number of calibration depths, the 2D-to-3D approach outperforms 2D-to-2D in terms of angular error in visual degrees. With five calibration depths, the 2D-to-3D case achieves an overall mean of less than 1.3 visual degrees over all test depths and all participants. A more detailed analysis and discussion with corresponding performance plots are available in the appendix.

5 Discussion

We discussed three different approaches for 3D gaze estimation using head-mounted eye trackers. Although it was shown that the 3D-to-3D mapping is not a trivial task, the 2D-to-3D mapping approach was shown to perform better than the standard 2D-to-2D mapping approach on simulation data. One of the key observations from the simulation study is that the 2D-to-3D mapping approach requires at least two calibration depths. Given more than two calibration depths, the 2D-to-3D mapping can significantly reduce the parallax error.

On the real data, we observed a decreasing error for the 2D-to-3D mapping with an increasing number of calibration depths, and it could outperform the 2D-to-2D mapping. However, the performance of the 2D-to-3D mapping was worse than in the simulation environment. The reasons for the different performance of the mapping approaches in the simulation and real-world environments are manifold and reveal their limitations. Our simulation environment considers an ideal setting and does not include the noise that occurs in the real world. This noise is mainly produced by potential errors in pupil and marker detection, as well as head movements of the participants.

In future work it will be important to investigate how the 3D-to-3D mapping approach can work in practice. The fundamental difference from the 2D-to-3D mapping is that the mapping function has to explicitly handle the rotation between the eye and scene camera coordinate systems. In addition to the fundamental estimation inaccuracy of the 3D pupil pose estimation technique, which does not model real-world factors such as corneal refraction, we did not consider the difference between the optical and visual axes. A more appropriate mapping function could be a potential solution for the 3D-to-3D mapping, and another option could be to use more general regression techniques, considering the 2D-to-3D results.

Throughout the experimental validation, this research also illustrated the fundamental difficulty of the 3D gaze estimation task. It has been shown that the design of the calibration procedure is also quite important, and it is essential to address the issue from the standpoint of both calibration design and mapping formulation. Since the importance of different calibration depths has been shown, the design of automatic calibration procedures, e.g., how to obtain calibration data at different depths using only digital displays, is another important HCI research issue.

Finally, it is also important to combine the 3D gaze estimation approach with 3D scene reconstruction methods and evaluate the overall performance of 3D gaze mapping. In this sense, it is also necessary to evaluate performance with respect to scene reconstruction error.
6 Conclusion

In this work, we provided an extensive discussion of different approaches for 3D gaze estimation using head-mounted eye trackers. In addition to the standard 2D-to-2D mapping approach, we discussed two potential 3D mapping approaches using either 3D or 2D observations from the eye camera. We conducted a detailed analysis of 3D gaze estimation approaches using both simulation and real data.

Experimental results showed the advantage of the proposed 2D-to-3D estimation method, but its complexity and technical challenges were also revealed. Together with the dataset and simulation environment, this study provides a solid basis for future research on 3D gaze estimation with lightweight head-mounted devices.

Acknowledgements

We would like to thank all participants for their help with the data collection. This work was funded, in part, by the Cluster of Excellence on Multimodal Computing and Interaction (MMCI) at Saarland University, the Alexander von Humboldt Foundation, and a JST CREST research grant.

References

BEYMER, D., AND FLICKNER, M. 2003. Eye gaze tracking using an active stereo head. In Proc. CVPR.

BULLING, A. 2016. Pervasive attentive user interfaces. IEEE Computer 49, 1, 94–98.

DUCHOWSKI, A. T., HOUSE, D. H., GESTRING, J., CONGDON, R., ŚWIRSKI, L., DODGSON, N. A., KREJTZ, K., AND KREJTZ, I. 2014. Comparing estimated gaze depth in virtual and physical environments. In Proc. ETRA, 103–110.

GARRIDO-JURADO, S., MUÑOZ-SALINAS, R., MADRID-CUEVAS, F. J., AND MARÍN-JIMÉNEZ, M. J. 2014. Automatic generation and detection of highly reliable fiducial markers under occlusion. Pattern Recognition 47, 6, 2280–2292.

KASSNER, M., PATERA, W., AND BULLING, A. 2014. Pupil: an open source platform for pervasive eye tracking and mobile gaze-based interaction. In Adj. Proc. UbiComp, 1151–1160.

LEFOHN, A., BUDGE, B., SHIRLEY, P., CARUSO, R., AND REINHARD, E. 2003. An ocularist's approach to human iris synthesis. IEEE Computer Graphics and Applications 23, 6, 70–75.

MARDANBEGI, D., AND HANSEN, D. W. 2012. Parallax error in the monocular head-mounted eye trackers. In Proc. UbiComp, ACM, 689–694.

MUNN, S. M., AND PELZ, J. B. 2008. 3D point-of-regard, position and head orientation from a portable monocular video-based eye tracker. In Proc. ETRA, ACM, 181–188.

NAGAMATSU, T., SUGANO, R., IWAMOTO, Y., KAMAHARA, J., AND TANAKA, N. 2010. User-calibration-free gaze tracking with estimation of the horizontal angles between the visual and the optical axes of both eyes. In Proc. ETRA, ACM, 251–254.

PFEIFFER, T., AND RENNER, P. 2014. EyeSee3D: A low-cost approach for analyzing mobile 3D eye tracking data using computer vision and augmented reality technology. In Proc. ETRA, ACM, 369–376.

ŚWIRSKI, L., AND DODGSON, N. A. 2013. A fully-automatic, temporal approach to single camera, glint-free 3D eye model fitting [abstract]. In Proc. PETMEI.

TAKEMURA, K., TAKAHASHI, K., TAKAMATSU, J., AND OGASAWARA, T. 2014. Estimating 3-D point-of-regard in a real environment using a head-mounted eye-tracking system. IEEE Trans. on Human-Machine Systems 44, 4, 531–536.
Appendix

3D Gaze Estimation Approaches

We first introduce detailed formulations of the three approaches briefly presented above.

2D-to-2D Mapping Approach

As briefly described above, standard 2D gaze estimation methods assume 2D pupil positions p in the eye camera images as input, and the task is to find the polynomial mapping function from p to 2D gaze positions s in the scene camera images. 2D pupil positions are first converted into their polynomial representations q(p), and a coefficient vector w which minimizes the cost function

    E_{2Dto2D}(w) = \sum_{i=1}^{N} |s_i - q_i w|^2    (1)

is obtained via a linear regression method. Any pupil position p can then be mapped to a 2D gaze position as f = q w.

3D-to-3D Mapping Approach

In this case, the input to the mapping function is the 3D pupil pose unit vector n. Given the calibration data (n_i, t_i)_{i=1}^N with 3D calibration targets t, the task is to find the rotation R and translation T between the scene and eye camera coordinate systems.

If we denote the origin of the pupil pose vectors as e_cam, the 3D gaze rays after rotation and translation are defined as a line e_cam + T + λ R n, where λ parameterizes the gaze line. (Note that λ is the parameter required to determine the 3D gaze point by intersecting the gaze ray with the scene, and does not have to be obtained during the calibration stage.) Given the calibration data, R and T are obtained by minimizing the distances d_i between the 3D gaze targets t_i and the 3D gaze rays. In vector form, the squared distance d_i^2 can be written as

    d_i^2 = |R n_i \times (t_i - (e_cam + T))|^2 / |R n_i|^2
          = |R n_i \times (t_i - (e_cam + T))|^2,    (2)

where the second equality holds because R n_i is a unit vector. Since e_cam + T denotes the eyeball center position e in the scene camera coordinate system, the cost function can be defined as

    E_{3Dto3D}(R, e) = \sum_{i=1}^{N} |R n_i \times (t_i - e)|^2.    (3)

Minimization of Eq. (3) can be done using nonlinear optimization methods such as the Levenberg-Marquardt algorithm. At the initialization step of the nonlinear optimization, we assume e_0 = (0, 0, 0) and R_0 = (0, π, 0), considering the opposite directions of the scene and eye cameras in the world coordinate system.

2D-to-3D Mapping Approach

Another potential approach is to directly map 2D pupil positions p to 3D gaze directions g. In this case, we map the polynomial feature q to unit gaze vectors g originating from the eyeball center e in the scene camera coordinate system. g can be parameterized in a polar coordinate system as

    g = (\sin\theta, \cos\theta \sin\phi, \cos\theta \cos\phi)^T,    (4)

and we assume a linear mapping from the polynomial feature q to the angle vector as

    \alpha = (\theta, \phi) = q w.    (5)

Given the 3D calibration data (p_i, t_i)_{i=1}^N, w can be obtained by minimizing the distances d_i between the 3D gaze targets t_i and the gaze rays. Therefore, similarly to the 3D-to-3D mapping case, the target cost function to be minimized is

    E_{2Dto3D}(w, e) = \sum_{i=1}^{N} |g(q_i w) \times (t_i - e)|^2.    (6)

In order to initialize the parameters for the nonlinear optimization, we first set e_0 = (0, 0, 0). Then, using the polar coordinates of the gaze targets t_i = (\theta_i, \phi_i), the initial w_0 can be obtained by solving the linear regression problem

    E(w) = \sum_{i=1}^{N} |(\theta_i, \phi_i) - q_i w|^2.    (7)
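For concreteness, here is a minimal Python (NumPy/SciPy) sketch of the 2D-to-3D calibration just formulated: it parameterizes gaze directions by Eq. (4), jointly refines the weights w and eyeball center e against Eq. (6), and initializes via Eq. (7). This is our own illustration under those equations, not the authors' code:

```python
import numpy as np
from scipy.optimize import least_squares

def fit_2d_to_3d(Q, targets):
    """Estimate the weights w (7x2) of Eq. (5) and the eyeball center e by
    minimizing E(w, e) = sum_i |g(q_i w) x (t_i - e)|^2  (Eq. (6)).
    Q: (N, 7) polynomial features q_i; targets: (N, 3) 3D targets t_i."""
    def gaze_dirs(alpha):
        # Eq. (4): g = (sin th, cos th sin ph, cos th cos ph), a unit vector.
        th, ph = alpha[:, 0], alpha[:, 1]
        return np.stack([np.sin(th), np.cos(th) * np.sin(ph),
                         np.cos(th) * np.cos(ph)], axis=1)

    def residuals(params):
        w, e = params[:14].reshape(7, 2), params[14:]
        return np.cross(gaze_dirs(Q @ w), targets - e).ravel()

    # Initialization: e_0 = 0; w_0 from the linear regression of Eq. (7)
    # against the polar angles of the normalized gaze targets.
    t = targets / np.linalg.norm(targets, axis=1, keepdims=True)
    alpha = np.column_stack([np.arcsin(t[:, 0]), np.arctan2(t[:, 1], t[:, 2])])
    w0, *_ = np.linalg.lstsq(Q, alpha, rcond=None)
    res = least_squares(residuals, np.concatenate([w0.ravel(), np.zeros(3)]))
    return res.x[:14].reshape(7, 2), res.x[14:]
```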
Extended Analysis

In this section, we provide an extended analysis of the different performance, taking single and multiple calibration depth combinations into account.

Simulation Study

Figure 5 shows the error for all three mapping approaches on the simulation data with fixed calibration depths, in a similar manner as in Mardanbegi and Hansen's work [Mardanbegi and Hansen 2012]. Figures 5a and 5b correspond to performance using one and three calibration depths, respectively. Each plot shows the mean angular error distribution over test depths, and each color corresponds to a certain calibration depth. The error bars describe the corresponding standard deviations. Dashed lines correspond to the 2D-to-3D mapping, dotted lines to the 3D-to-3D mapping, and solid lines to the 2D-to-2D mapping.

With one calibration depth (Figure 5a), the performance of the 2D-to-3D mapping is always better than the 2D-to-2D case. However, we can observe that the parallax error is still present in the 2D-to-3D case, which indicates the fundamental limitations of the approximated mapping approach. With three calibration depths (Figure 5b), the 2D-to-3D mapping approach performs significantly better than in Figure 5a and the parallax error reaches a near-zero level. However, there is a tendency for the error to become larger as the test depth comes closer to the camera, which indicates the limitations of the proposed mapping function. The performance of the 2D-to-2D mapping also improves, but we can see that an increased number of calibration depths cannot be a fundamental solution to the parallax error issue. For the 3D-to-3D mapping, the angular error is close to zero even for only one calibration depth. Taking more calibration depths into account does not lead to further improvement.

Figure 5: Comparison of parallax error. The vertical axis shows the mean angular error values over all test depths (D1-D5). Dashed lines correspond to the 2D-to-3D mapping, dotted lines to the 3D-to-3D mapping, and solid lines to the 2D-to-2D mapping. Each color represents one of the different calibration depth settings. (a) One calibration depth setting for the 2D-to-2D, 2D-to-3D and 3D-to-3D mappings. (b) Three calibration depth setting for the 2D-to-2D, 2D-to-3D and 3D-to-3D mappings.

Real-World Study

Similarly, we show a detailed comparison of the 2D-to-2D and 2D-to-3D mapping approaches using the real-world data. Figure 6a displays the mean angular error for both approaches taking only one calibration depth, over all 14 participants, in the same manner as in Figure 5a. For both mapping approaches, each calibration depth setting performed best for the corresponding test depth, and the error increased with increasing test distance from the calibration depth. However, for the 2D-to-2D approach the angular error values over all distances are smaller than for the 2D-to-3D case, except where the calibration depth and test depth are the same.

Figure 6: Comparison of one and three calibration depths considering the mean angular error values over all participants and for every test depth (D1-D5). The error bars show the corresponding standard deviation for every test depth. (a) One calibration depth setting for the 2D-to-2D and 2D-to-3D mappings. (b) Three calibration depth setting for the 2D-to-2D and 2D-to-3D mappings.

This behavior changes for an increasing number of calibration depths, as can be seen in Figure 6b, where we used three different calibration depths as in Figure 5b. The 2D-to-3D mapping approach performs better than the 2D-to-2D mapping for nearly all combinations, except for the test depth D1, exploiting the additional 3D information collected during calibration to improve the gaze direction estimation.

Figure 7 shows the mean angular errors with respect to the offset between the calibration and test depths for the one calibration depth setting. Negative distance values on the horizontal axis indicate cases where the test depth is closer than the calibration depth, and vice versa for positive distance values. As can be seen, the 2D-to-3D mapping approach tends to produce higher error as the test depth's distance from the calibration depth increases.

Figure 7: The effect of distance from the calibration depth for the 2D-to-2D and 2D-to-3D mappings, taking one calibration depth into account. Every point describes the mean angular error with respect to the offset between the calibration and test depth. The error bars provide the corresponding standard deviation.
