Spatiotemporal Gabor filters: a new method for dynamic texture recognition

Wesley Nunes Gonçalves, Bruno Brandoli Machado, Odemir Martinez Bruno
Physics Institute of São Carlos (IFSC), University of São Paulo (USP)
Av. Trabalhador São-carlense, 400, Cx. Postal 369, São Carlos, SP, Brasil
[email protected], [email protected], [email protected]

arXiv:1201.3612v1 [cs.CV] 17 Jan 2012

Abstract

This paper presents a new method for dynamic texture recognition based on spatiotemporal Gabor filters. Dynamic textures have emerged as a new field of investigation that extends the concept of self-similarity of texture images to the spatiotemporal domain. To model a dynamic texture, we convolve the sequence of images with a bank of spatiotemporal Gabor filters. For each response, a feature vector is built by calculating the energy statistic. As far as the authors know, this paper is the first to report an effective method for dynamic texture recognition using spatiotemporal Gabor filters. We evaluate the proposed method on two challenging databases, and the experimental results indicate that it is a robust approach for dynamic texture recognition.

1. Introduction

The vision of animals provides a large amount of information that improves the perception of the world. This information is processed in different dimensions, including color, shape, illumination, and motion. While most of these features provide information about the static world, motion provides essential information for interaction with the external environment. In recent decades, the perception and interpretation of motion have attracted significant interest in the computer vision community [14, 16, 1, 9], motivated by their importance to both the scientific and industrial communities. Despite significant advances, motion characterization is still an open problem.

For modeling image sequences, three classes of motion patterns have been suggested [13]: dynamic textures, activities, and events. The main difference between them relies on the temporal and spatial regularity of the motion field. In this work, we aim at modeling dynamic textures, also called temporal textures. They are basically textures in motion, an extension of image texture to the spatiotemporal domain. Examples of dynamic textures include real-world scenes of fire, flags blowing, sea waves, moving escalators, boiling water, grass, and steam.

Existing methods for dynamic texture can be classified into four categories according to how they model the sequence of images. Due to the efficient estimation of features based on motion (e.g. optical flow), the motion based methods (i) are the most popular ones. These methods model dynamic textures based on a sequence of motion patterns [9, 13, 8]. For modeling dynamic texture at different scales in space and time, the spatiotemporal filtering based methods (ii) use spatiotemporal filters such as the wavelet transform [7, 17, 5]. Model based methods (iii) are generally based on linear dynamical systems, which provide a model that can be used in applications of segmentation, synthesis, and classification [6, 3, 15]. Based on properties of moving contour surfaces, spatiotemporal geometric property based methods (iv) extract motion and appearance features from the tangent plane distribution [10]. The reader may consult [4] for a review of dynamic texture methods.

In this paper, we propose a new approach for dynamic texture modeling based on spatiotemporal Gabor filters [12]. As far as the authors know, the present paper is the first one to model dynamic texture using spatiotemporal Gabor filters. These filters are basically built using two parameters: the speed v and the direction θ. To model a dynamic texture, we convolve the sequence of images with a bank of spatiotemporal Gabor filters built with different values of speed and direction. For each response, a feature vector is built by calculating the energy statistic.

We evaluate the proposed method by classifying dynamic textures from two challenging databases: dyntex [11] and the traffic database [2]. Experimental results on both databases indicate that the proposed method is an effective approach for dynamic texture recognition. For the dyntex database, filters with low speeds (e.g. 0.1 pixels/frame) achieved better results than high speeds. In fact, the dynamic textures in this database present low motion patterns. On the other hand, for the traffic database, high speeds (e.g. 1.5 pixels/frame) achieved the best correct classification rate. In this database, vehicles are moving at a speed that matches the filter's speed.

This paper is organized as follows. Section 2 briefly describes spatiotemporal Gabor filters. In Section 3, we present the proposed method for dynamic texture recognition based on spatiotemporal Gabor filters. An analysis of the proposed method with respect to the speed and direction parameters is presented in Section 4. Experimental results are given in Section 5, which is followed by the conclusion of this work in Section 6.

2. Spatiotemporal Gabor Filters

Gabor filters are based on an important finding made by Hubel and Wiesel in the beginning of the 1960s. They found that the neurons of the primary visual cortex respond to lines or edges of a certain orientation in different positions of the visual field. Following this discovery, computational models were proposed for modeling the function of these neurons, and Gabor functions proved to be suited for this purpose in many works.

Initially, research aimed at studying the spatial properties of the receptive field. However, some posterior studies revealed that cortical cells change in time and some of them are inseparable functions of space and time. Therefore, these cells are essentially spatiotemporal filters that combine information over space and time, which makes them a great model for dynamic texture analysis.

In this work, the spatiotemporal receptive field is modeled by a family of 3D Gabor filters [12] described in Equation 1:

g_{(v,\theta,\varphi)}(x,y,t) = \frac{\gamma}{2\pi\sigma^2} \exp\left(-\frac{(\bar{x}+v_c t)^2 + \gamma^2\bar{y}^2}{2\sigma^2}\right) \cdot \cos\left(\frac{2\pi}{\lambda}(\bar{x}+vt) + \varphi\right) \cdot \frac{1}{\sqrt{2\pi}\,\tau} \exp\left(-\frac{(t-\mu_t)^2}{2\tau^2}\right)    (1)

\bar{x} = x\cos\theta + y\sin\theta
\bar{y} = -x\sin\theta + y\cos\theta

We now discuss the parameters of the spatiotemporal Gabor filter. Some parameters were empirically found based on studies of the response in the visual receptive field [12]. The size of the receptive field is determined by the standard deviation σ of the Gaussian factor. The parameter γ is the rate that specifies the ellipticity of the Gaussian envelope in the spatial domain. This parameter is set to γ = 0.5 to match the elongated receptive field along the y axis. The speed v is the phase speed of the cosine factor, which determines the speed of motion. The speed at which the center of the spatial Gaussian moves along the x axis is specified by the parameter v_c. When v_c = 0, the center of the Gaussian envelope is stationary. On the other hand, a moving envelope is obtained when v_c = v. Figure 1 presents a moving envelope with v_c = v = 1.

Figure 1. Example of a spatiotemporal Gabor filter for v_c = 1.

The parameter λ is the wavelength of the cosine factor. It is obtained through the relation λ = λ_0 \sqrt{1+v^2}, where the constant λ_0 = 2. The angle θ ∈ [0, 2π] determines the direction of motion and the spatial orientation of the filter. The phase offset φ ∈ [−π, π] determines the symmetry in the spatial domain. The Gaussian distribution with mean µ_t and standard deviation τ is used to model the change in intensities. The mean µ_t = 1.75 and standard deviation τ = 2.75 are both fixed based on the mean duration of most receptive fields.

3. Dynamic Texture Modeling based on Spatiotemporal Gabor Filters

In this section, we describe the proposed method for dynamic texture modeling based on spatiotemporal Gabor filters. Briefly, the sequence of images is convolved with a bank of spatiotemporal Gabor filters, and a feature vector is constructed with the energy of the responses as components.

The response r_{(v,θ,φ)}(x,y,t) of a spatiotemporal Gabor filter g_{(v,θ,φ)} to a sequence of images I(x,y,t) is computed by convolution:

r_{(v,\theta,\varphi)}(x,y,t) = I(x,y,t) * g_{(v,\theta,\varphi)}    (2)

Spatiotemporal Gabor filters are phase sensitive because their response to a moving pattern depends on the exact position within the sequence of images. To overcome this drawback, a phase-insensitive response can be obtained from a quadrature pair of filters:

R_{(v,\theta)}(x,y,t) = \sqrt{r_{(v,\theta,\varphi)}^2(x,y,t) + r_{(v,\theta,\varphi-\pi/2)}^2(x,y,t)}    (3)

To characterize the Gabor space resulting from the convolution, the energy of the response is computed according to Equation 4:

E_{(v,\theta)} = \sum_{x,y,t} R_{(v,\theta)}^2(x,y,t)    (4)

A central issue in applying spatiotemporal Gabor filters is the determination of the filter parameters that cover the spatiotemporal frequency space and capture as much dynamic texture information as possible. Each spatiotemporal Gabor filter is determined by two main parameters: the direction θ and the speed of motion v. In order to cover a wide range of dynamic textures, we design a bank of spatiotemporal Gabor filters using a set of values for the speed V = [v_1, v_2, ..., v_n] and the direction Θ = [θ_1, θ_2, ..., θ_m]. The feature vector that characterizes the dynamic texture is composed of the energy of the response for each combination of speeds V and directions Θ (Equation 5):

\varphi = [E_{(v_1,\theta_1)}, \ldots, E_{(v_n,\theta_m)}]    (5)

The proposed method is summarized in Figure 2.
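As an illustration of Equations 1 to 5, the filter construction and energy feature extraction can be sketched in NumPy/SciPy. This is not the authors' implementation: the filter support (15×15 pixels, 7 frames) and σ = 2.0 are illustrative assumptions, while γ = 0.5, λ_0 = 2, µ_t = 1.75 and τ = 2.75 follow the values given in Section 2.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor3d(v, theta, phi, vc=None, size=15, frames=7,
            sigma=2.0, gamma=0.5, lambda0=2.0, mu_t=1.75, tau=2.75):
    # Sample Equation 1 on a (frames, size, size) grid; axes are (t, y, x).
    # size/frames/sigma are assumed values, not taken from the paper.
    if vc is None:
        vc = v                                    # moving envelope (v_c = v)
    lam = lambda0 * np.sqrt(1.0 + v ** 2)         # λ = λ0·sqrt(1 + v²)
    half = size // 2
    t, y, x = np.meshgrid(np.arange(frames),
                          np.arange(-half, half + 1),
                          np.arange(-half, half + 1), indexing='ij')
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = (gamma / (2 * np.pi * sigma ** 2)) * \
        np.exp(-((xr + vc * t) ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi / lam * (xr + v * t) + phi)
    temporal = np.exp(-(t - mu_t) ** 2 / (2 * tau ** 2)) / (np.sqrt(2 * np.pi) * tau)
    return envelope * carrier * temporal

def energy_features(seq, speeds, thetas):
    # Feature vector of Equation 5: one Gabor energy per (v, θ) pair.
    # seq is a (frames, height, width) array.
    feats = []
    for v in speeds:
        for th in thetas:
            r0 = convolve(seq, gabor3d(v, th, 0.0), mode='nearest')
            r1 = convolve(seq, gabor3d(v, th, -np.pi / 2), mode='nearest')
            # quadrature pair: sum of squares is R² of Eq. 3; the sum over
            # x, y, t gives the energy of Eq. 4
            feats.append(np.sum(r0 ** 2 + r1 ** 2))
    return np.array(feats)
```

With speeds V = [v_1, ..., v_n] and directions Θ = [θ_1, ..., θ_m], the returned vector has n·m components, matching Equation 5.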
First, we design a bank of spatiotemporal Gabor filters composed of filters with different directions and speeds. Then, the sequence of images is convolved with the bank of filters. For each convolved sequence of images, we calculate the energy to compose a feature vector.

Figure 2. The proposed method considers the following steps: (i) design a bank of spatiotemporal Gabor filters using different values of speed v and direction θ; (ii) convolve the sequence of images with the bank of filters; (iii) calculate the energy of each response to compose the feature vector.

4. Response Analysis of Spatiotemporal Gabor Filters

Here, we analyze the speed and direction properties of the spatiotemporal Gabor filters on synthetic sequences of images. In Figure 3, we present the response of spatiotemporal Gabor filters to bars moving at the same speed but in different directions θ. The filters and the moving bars have preference for the same speed v = 1. The response has the highest magnitude when the direction of the filter θ matches the direction of the moving bar. For instance, when θ = 0, a vertical bar moving rightwards evokes a higher response than bars with other directions of movement.

Figure 3. Response of spatiotemporal Gabor filters to bars moving in different directions θ. The first row corresponds to the filters. The second, third and fourth rows correspond to the response of a bar moving in direction θ = 0, π/4, π/2, respectively.

The speed property is evaluated in Figure 4. We analyze the response of spatiotemporal Gabor filters to edges drifting rightward at different speeds. The filters and the synthetic sequence of images have the same preferred direction θ = 0. The highest response is achieved by the filters whose speed matches the speed of the moving edge.

Figure 4. Response of spatiotemporal Gabor filters to edges moving at different speeds v. The first, second, and third rows correspond to the response of an edge moving at speed v = 1, 2, 4, respectively.

In Figure 5(a), we plot the response of filters to a moving bar with direction θ = 0 at speed v = 1. The response reaches its maximum value for a filter with direction θ = 0, which matches the direction of the moving bar. As we can see, the filter with a moving envelope (v_c = v) achieved a higher response than a filter with a stationary envelope (v_c = 0). The plot for the speed parameter is shown in Figure 5(b). We convolved filters with an edge drifting rightward in direction θ = 0 at a speed of v = 2. The maximum response is achieved by the filter whose speed matches the stimulus' speed. Again, we can conclude that filters with a moving envelope are more selective for both direction and speed than filters with a stationary envelope.

Figure 5. (a) Response of filters of different directions θ to a moving bar with θ = 0. (b) Response of filters of different speeds v to an edge moving at speed v = 2.
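The direction-selectivity experiment above can be reproduced in miniature. The stimulus below (a 32×32, 12-frame vertical bar drifting rightward at 1 pixel/frame) and the filter support are hypothetical choices for illustration; the filter itself follows Equation 1 with a moving envelope (v_c = v) and the parameter values of Section 2.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor3d(v, theta, phi, size=15, frames=7, sigma=2.0, gamma=0.5):
    # Equation 1 with a moving envelope (v_c = v); axes are (t, y, x).
    # size, frames and sigma are assumed values for this small demo.
    lam = 2.0 * np.sqrt(1.0 + v ** 2)             # λ = λ0·sqrt(1 + v²), λ0 = 2
    half = size // 2
    t, y, x = np.meshgrid(np.arange(frames), np.arange(-half, half + 1),
                          np.arange(-half, half + 1), indexing='ij')
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-((xr + v * t) ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    g *= np.cos(2 * np.pi / lam * (xr + v * t) + phi)
    g *= np.exp(-(t - 1.75) ** 2 / (2 * 2.75 ** 2))   # temporal Gaussian
    return g

def gabor_energy(seq, v, theta):
    # Phase-insensitive energy (Eqs. 3-4) from a quadrature pair of filters.
    r0 = convolve(seq, gabor3d(v, theta, 0.0), mode='wrap')
    r1 = convolve(seq, gabor3d(v, theta, -np.pi / 2), mode='wrap')
    return np.sum(r0 ** 2 + r1 ** 2)

# Synthetic stimulus: a vertical bar drifting rightward at 1 pixel/frame.
frames, h, w = 12, 32, 32
seq = np.zeros((frames, h, w))
for f in range(frames):
    seq[f, :, (5 + f) % w] = 1.0

e_match = gabor_energy(seq, 1.0, 0.0)          # direction matches the stimulus
e_ortho = gabor_energy(seq, 1.0, np.pi / 2)    # orthogonal direction
# the matched filter yields a much larger energy than the orthogonal one
```

This mirrors the qualitative behavior reported for Figure 5(a): the energy peaks when the filter's direction matches the direction of the moving pattern.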
5. Experimental Results

In this section, we present the experimental results using two databases: (i) the dyntex database and (ii) the traffic video database. The dyntex database consists of 50 dynamic texture classes, each containing 10 samples, collected from the Dyntex database [11]. The videos are at least 250 frames long with a dimension of 400×300 pixels. Figure 6 shows examples of dynamic textures from the first database. The second database, collected from the traffic database [2], consists of 254 videos divided into three classes: light, medium, and heavy traffic. Videos had 42 to 52 frames with a resolution of 320×240 pixels. The variety of traffic patterns and weather conditions is shown in Figure 7. All the experiments used a k-nearest neighbor classifier with k = 1 in a 10-fold cross validation scheme.

Figure 6. Examples of dynamic textures from the dyntex database [11]. The database is composed of 50 dynamic texture classes, each containing 10 samples.

Figure 7. Examples of dynamic textures from the traffic database [2]. The database consists of 254 videos split into three classes: light, medium and heavy traffic.

Now, we discuss the influence of the direction and speed parameters on dynamic texture recognition. Table 1 shows the average and standard deviation of the correct classification rate for the traffic database. Columns present the direction parameter evaluation using 4 directions, a combination of filters with θ = [0, π/2, π, 3π/2], and 8 directions, a combination of filters with θ = [0, π/4, π/2, 3π/4, π, 5π/4, 3π/2, 7π/4]. Rows present the speed evaluation using combinations of speeds with a step of 0.25. As we can see, the bank composed of filters with 8 directions outperformed the bank with 4 directions for all combinations of speeds. However, very little improvement can be appreciated as the number of directions is increased: the gain in correct classification rate was on average less than 1% when the direction combination rises from 4 to 8. This is because the cars on the highway always move in the same direction, which can be modeled with 4 directions. With respect to the speed parameter, the best results were achieved for high speeds, such as 1.5 pixels/frame and 2.0 pixels/frame. A correct classification rate of 91.50% was achieved by a bank of filters composed of 8 directions and speeds [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]. The high speeds match the speed of the cars in the sequence of images, and so the traffic can be modeled using these parameters.

Speed combination   4 directions    8 directions
[0.50-0.75]         90.06 (5.53)    90.18 (5.62)
[0.50-1.00]         90.17 (5.70)    90.57 (5.15)
[0.50-1.25]         89.07 (5.91)    89.90 (5.06)
[0.50-1.50]         89.68 (5.99)    89.74 (5.80)
[0.50-1.75]         89.69 (6.18)    90.56 (5.58)
[0.50-2.00]         89.45 (5.77)    91.50 (5.20)
[0.50-2.25]         90.24 (5.51)    90.87 (5.07)
[0.50-2.50]         89.33 (5.60)    90.75 (5.26)
[0.50-2.75]         89.76 (5.44)    90.35 (5.44)
[0.50-3.00]         89.60 (5.16)    90.63 (5.21)

Table 1. Correct classification rate and standard deviation for different combinations of speed and direction on the traffic database.

In Table 2, we present the experimental results obtained on the dyntex database. The same combinations of directions from the previous experiment were used to evaluate the proposed method. However, as the dynamic textures in this database present low speeds, the speed combinations started at v = 0.1 pixels/frame with a step of 0.1. As in the previous results, the 8-direction bank of filters achieved higher correct classification rates than the 4-direction bank. In this case, a correct classification rate of 98.60% was obtained, which clearly shows the effectiveness of the proposed method for dynamic texture recognition.

Speed combination   4 directions    8 directions
[0.1-0.2]           92.50 (3.40)    94.92 (3.09)
[0.1-0.5]           96.00 (2.40)    96.82 (2.41)
[0.1-1.0]           96.56 (2.11)    98.02 (1.65)
[0.1-1.5]           96.92 (2.32)    98.60 (1.60)
[0.1-2.0]           97.24 (2.18)    97.84 (1.92)
[0.1-2.5]           96.92 (2.49)    97.34 (2.36)
[0.1-3.0]           96.37 (2.98)    96.94 (2.73)

Table 2. Correct classification rate and standard deviation for different combinations of speed and direction on the dyntex database.

6. Conclusion

In this paper, we proposed a new method for dynamic texture recognition based on spatiotemporal Gabor filters. First, it convolves a sequence of images with a bank of filters and then extracts an energy statistic from each response. Basically, the spatiotemporal Gabor filters are built using the speed v and direction θ, and the bank is composed of filters with different values of speed and direction.

Promising results considering different combinations of v and θ were achieved on two important databases: the dyntex and traffic databases. On the traffic database, our method achieved a correct classification rate of 91.50% using a combination of high speeds. On the other hand, a correct classification rate of 98.60% was obtained on the dyntex database using a combination of low speeds.

Acknowledgments. WNG was supported by FAPESP grant 2010/08614-0. BBM was supported by CNPq. OMB was supported by CNPq grants 306628/2007-4 and 484474/2007-3.

References

[1] C. Cedras and M. Shah. Motion-based recognition: A survey. Image and Vision Computing, 13(2):129-155, March 1995.
[2] A. B. Chan and N. Vasconcelos. Classification and retrieval of traffic video using auto-regressive stochastic processes. In IEEE Intelligent Vehicles Symposium, pages 771-776, June 2005.
[3] A. B. Chan and N. Vasconcelos. Classifying video with kernel dynamic textures. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1-6, 2007.
[4] D. Chetverikov and R. Péteri. A brief survey of dynamic texture description and recognition. In Proceedings of the 4th International Conference on Computer Recognition Systems, volume 30 of Advances in Soft Computing, pages 17-26. Springer, 2005.
[5] P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie. Behavior recognition via sparse spatio-temporal features. In ICCCN '05: Proceedings of the 14th International Conference on Computer Communications and Networks, pages 65-72, Washington, DC, USA, 2005. IEEE Computer Society.
[6] G. Doretto, A. Chiuso, Y. N. Wu, and S. Soatto. Dynamic textures. International Journal of Computer Vision, 51(2):91-109, February 2003.
[7] S. Dubois, R. Péteri, and M. Ménard. A comparison of wavelet based spatio-temporal decomposition methods for dynamic texture recognition. In IbPRIA '09: Proceedings of the 4th Iberian Conference on Pattern Recognition and Image Analysis, pages 314-321, Berlin, Heidelberg, 2009. Springer-Verlag.
[8] R. Fablet and P. Bouthemy. Motion recognition using nonparametric image motion models estimated from temporal and multiscale cooccurrence statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12):1619-1624, 2003.
[9] S. Fazekas and D. Chetverikov. Analysis and performance evaluation of optical flow features for dynamic texture recognition. Signal Processing: Image Communication, 22(7-8):680-691, August 2007.
[10] M. Fujii, T. Horikoshi, K. Otsuka, and S. Suzuki. Feature extraction of temporal texture based on spatiotemporal motion trajectory. In ICPR, pages Vol II: 1047-1051, 1998.
[11] R. Péteri, S. Fazekas, and M. Huiskes. DynTex: A comprehensive database of dynamic textures. Pattern Recognition Letters, 31(12):1627-1632, September 2010.
[12] N. Petkov and E. Subramanian. Motion detection, noise reduction, texture suppression, and contour enhancement by spatiotemporal Gabor filters with surround inhibition. Biological Cybernetics, 97:423-439, March 2008.
[13] R. Polana and R. C. Nelson. Temporal texture and activity recognition. In Motion-Based Recognition, Chapter 5, 1997.
[14] H. Qian, Y. Mao, W. Xiang, and Z. Wang. Recognition of human activities using SVM multi-class classifier. Pattern Recognition Letters, In Press, Corrected Proof, 2009.
[15] M. Szummer and R. W. Picard. Temporal texture modeling. In ICIP, pages III: 823-826, 1996.
[16] S. Wu and Y. F. Li. Motion trajectory reproduction from generalized signature description. Pattern Recognition, 43(1):204-221, 2010.
[17] H. Zhong, J. Shi, and M. Visontai. Detecting unusual activity in video. In IEEE Conference on Computer Vision and Pattern Recognition, pages 819-826, 2004.
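The evaluation protocol of Section 5 (a 1-nearest-neighbor classifier under 10-fold cross validation) can be sketched without any machine learning library. The synthetic two-class data below is a hypothetical stand-in for the Gabor-energy feature vectors of Equation 5; the fold-assignment scheme is one reasonable choice, not necessarily the authors'.

```python
import numpy as np

def knn1_cv(X, y, folds=10, seed=0):
    # 1-NN classification accuracy averaged over k folds.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    accs = []
    for f in range(folds):
        test = idx[f::folds]                      # every folds-th shuffled index
        train = np.setdiff1d(idx, test)
        # Euclidean distance from each test vector to every training vector
        d = np.linalg.norm(X[test, None, :] - X[None, train, :], axis=2)
        pred = y[train[np.argmin(d, axis=1)]]     # label of the nearest neighbor
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))

# Hypothetical stand-in features: two well-separated classes of energy vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (30, 8)),
               rng.normal(4.0, 0.5, (30, 8))])
y = np.repeat([0, 1], 30)
acc = knn1_cv(X, y)  # mean accuracy over 10 folds, in [0, 1]
```

On real data, X would hold one row per video, each row being the energy feature vector produced by the bank of spatiotemporal Gabor filters.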
