In Journal of Mathematical Imaging and Vision, doi:10.1007/s10851-016-0691-3, Jan 2017.
Temporal scale selection in time-causal scale space
Tony Lindeberg

Received: date / Accepted: date

arXiv:1701.05088v1 [cs.CV] 9 Jan 2017

Abstract When designing and developing scale selection mechanisms for generating hypotheses about characteristic scales in signals, it is essential that the selected scale levels reflect the extent of the underlying structures in the signal.

This paper presents a theory and in-depth theoretical analysis about the scale selection properties of methods for automatically selecting local temporal scales in time-dependent signals based on local extrema over temporal scales of scale-normalized temporal derivative responses. Specifically, this paper develops a novel theoretical framework for performing such temporal scale selection over a time-causal and time-recursive temporal domain as is necessary when processing continuous video or audio streams in real time or when modelling biological perception.

For a recently developed time-causal and time-recursive scale-space concept defined by convolution with a scale-invariant limit kernel, we show that it is possible to transfer a large number of the desirable scale selection properties that hold for the Gaussian scale-space concept over a non-causal temporal domain to this temporal scale-space concept over a truly time-causal domain. Specifically, we show that for this temporal scale-space concept, it is possible to achieve true temporal scale invariance although the temporal scale levels have to be discrete, which is a novel theoretical construction.

The analysis starts from a detailed comparison of different temporal scale-space concepts and their relative advantages and disadvantages, leading the focus to a class of recently extended time-causal and time-recursive temporal scale-space concepts based on first-order integrators or equivalently truncated exponential kernels coupled in cascade. Specifically, by the discrete nature of the temporal scale levels in this class of time-causal scale-space concepts, we study two special cases of distributing the intermediate temporal scale levels, by using either a uniform distribution in terms of the variance of the composed temporal scale-space kernel or a logarithmic distribution.

In the case of a uniform distribution of the temporal scale levels, we show that scale selection based on local extrema of scale-normalized derivatives over temporal scales makes it possible to estimate the temporal duration of sparse local features defined in terms of temporal extrema of first- or second-order temporal derivative responses. For dense features modelled as a sine wave, the lack of temporal scale invariance does, however, constitute a major limitation for handling dense temporal structures of different temporal duration in a uniform manner.

In the case of a logarithmic distribution of the temporal scale levels, specifically taken to the limit of a time-causal limit kernel with an infinitely dense distribution of the temporal scale levels towards zero temporal scale, we show that it is possible to achieve true temporal scale invariance to handle dense features modelled as a sine wave in a uniform manner over different temporal durations of the temporal structures as well as to achieve more general temporal scale invariance for any signal over any temporal scaling transformation with a scaling factor that is an integer power of the distribution parameter of the time-causal limit kernel.

It is shown how these temporal scale selection properties developed for a pure temporal domain carry over to feature detectors defined over time-causal spatio-temporal and spectro-temporal domains.

Keywords Scale space · Scale · Scale selection · Temporal · Spatio-temporal · Scale invariance · Differential invariant · Feature detection · Video analysis · Computer vision

The support from the Swedish Research Council (Contract No. 2014-4083) and Stiftelsen Olle Engkvist Byggmästare (Contract No. 2015/465) is gratefully acknowledged.

Tony Lindeberg, Computational Brain Science Lab, Department of Computational Science and Technology, School of Computer Science and Communication, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden. E-mail: tony@kth.se

1 Introduction

When processing sensory data by automatic methods in areas of signal processing such as computer vision or audio processing or in computational modelling of biological perception, the notion of receptive field constitutes an essential concept (Hubel and Wiesel [33,34]; Aertsen and Johannesma [2]; DeAngelis et al. [16,15]; Miller et al. [94]).

For sensory data as obtained from vision or hearing, or their counterparts in artificial perception, the measurement from a single light sensor in a video camera or on the retina, or the instantaneous sound pressure registered by a microphone is hardly meaningful at all, since any such measurement is strongly dependent on external factors such as the illumination of a visual scene regarding vision or the distance between the sound source and the microphone regarding hearing. Instead, the essential information is carried by the relative relations between local measurements at different points and temporal moments regarding vision or local measurements over different frequencies and temporal moments regarding hearing. Following this paradigm, sensory measurements should be performed over local neighbourhoods over space-time regarding vision and over local neighbourhoods in the time-frequency domain regarding hearing, leading to the notions of spatio-temporal and spectro-temporal receptive fields.

Specifically, spatio-temporal receptive fields constitute a main class of primitives for expressing methods for video analysis (Zelnik-Manor and Irani [124]; Laptev and Lindeberg [51,49]; Jhuang et al. [39]; Kläser et al. [43]; Niebles et al. [97]; Wang et al. [118]; Poppe et al. [101]; Shao and Mattivi [112]; Weinland et al. [120]; Wang et al. [119]), whereas spectro-temporal receptive fields constitute a main class of primitives for expressing methods for machine hearing (Patterson et al. [100,99]; Kleinschmidt [44]; Ezzat et al. [19]; Meyer and Kollmeier [91]; Schlute et al. [107]; Heckmann et al. [32]; Wu et al. [123]; Alias et al. [4]).

A general problem when applying the notion of receptive fields in practice, however, is that the types of responses that are obtained in a specific situation can be strongly dependent on the scale levels at which they are computed. A spatio-temporal receptive field is determined by at least a spatial scale parameter and a temporal scale parameter, whereas a spectro-temporal receptive field is determined by at least a spectral and a temporal scale parameter. Beyond ensuring that local sensory measurements at different spatial, temporal and spectral scales are treated in a consistent manner, which by itself provides strong constraints on the shapes of the receptive fields (Lindeberg [72,78]; Lindeberg and Friberg [82,83]), it is necessary for computer vision or machine hearing algorithms to decide what responses within the families of receptive fields over different spatial, temporal and spectral scales they should base their analysis on.

Over the spatial domain, theoretically well-founded methods have been developed for choosing spatial scale levels among receptive field responses over multiple spatial scales (Lindeberg [66,65,68,74,75]) leading to e.g. robust methods for image-based matching and recognition (Lowe [89]; Mikolajczyk and Schmid [92]; Tuytelaars and van Gool [116]; Bay et al. [5]; Tuytelaars and Mikolajczyk [117]; van de Sande et al. [105]; Larsen et al. [52]) that are able to handle large variations of the size of the objects in the image domain and with numerous applications regarding object recognition, object categorization, multi-view geometry, construction of 3-D models from visual input, human-computer interaction, biometrics and robotics.

Much less research has, however, been performed regarding the topic of choosing locally appropriate scales in temporal data. While some methods for temporal scale selection have been developed (Lindeberg [64]; Laptev and Lindeberg [50]; Willems et al. [121]), these methods suffer from either theoretical or practical limitations.

A main subject of this paper is to present a theory for how to compare filter responses in terms of temporal derivatives that have been computed at different temporal scales, specifically with a detailed theoretical analysis of the possibilities of having temporal scale estimates as obtained from a temporal scale selection mechanism reflect the temporal duration of the underlying temporal structures that gave rise to the feature responses. Another main subject of this paper is to present a theoretical framework for temporal scale selection that leads to temporal scale invariance and enables the computation of scale covariant temporal scale estimates. While these topics can for a non-causal temporal domain be addressed by the non-causal Gaussian scale-space concept (Iijima [35]; Witkin [122]; Koenderink [45]; Koenderink and van Doorn [47]; Lindeberg [61,62,70]; Florack [22]; ter Haar Romeny [26]), the development of such a theory has been missing regarding a time-causal temporal domain.

1.1 Temporal scale selection

When processing time-dependent signals in video or audio or more generally any temporal signal, special attention has to be paid to the facts that:

– the physical phenomena that generate the temporal signals may occur at different speeds, faster or slower, and
– the temporal signals may contain qualitatively different types of temporal structures at different temporal scales.

In certain controlled situations where the physical system that generates the temporal signals that are to be processed is sufficiently well known and if the variability of the temporal scales over time in the domain is sufficiently constrained, suitable temporal scales for processing the signals may in
some situations be chosen manually and then be verified experimentally. If the sources that generate the temporal signals are sufficiently complex and/or if the temporal structures in the signals vary substantially in temporal duration by the underlying physical processes occurring significantly faster or slower, it is on the other hand natural to (i) include a mechanism for processing the temporal data at multiple temporal scales and (ii) try to detect in a bottom-up manner at what temporal scales the interesting temporal phenomena are likely to occur.

The subject of this article is to develop a theory for temporal scale selection in a time-causal temporal scale space as an extension of a previously developed theory for spatial scale selection in a spatial scale space (Lindeberg [66,65,68,74,75]), to generate bottom-up hypotheses about characteristic temporal scales in time-dependent signals, intended to serve as estimates of the temporal duration of local temporal structures in time-dependent signals. Special focus will be on developing mechanisms analogous to scale selection in non-causal Gaussian scale-space, based on local extrema over scales of scale-normalized derivatives, while expressed within the framework of a time-causal and time-recursive temporal scale space in which the future cannot be accessed and the signal processing operations are thereby only allowed to make use of information from the present moment and a compact buffer of what has occurred in the past.

When designing and developing such scale selection mechanisms, it is essential that the computed scale estimates reflect the temporal duration of the corresponding temporal structures that gave rise to the feature responses. To understand the pre-requisites for developing such temporal scale selection methods, we will in this paper perform an in-depth theoretical analysis of the scale selection properties that such temporal scale selection mechanisms give rise to for different temporal scale-space concepts and for different ways of defining scale-normalized temporal derivatives.

Specifically, after an examination of the theoretical properties of different types of temporal scale-space concepts, we will focus on a class of recently extended time-causal temporal scale-space concepts obtained by convolution with truncated exponential kernels coupled in cascade (Lindeberg [57,77,78]; Lindeberg and Fagerström [81]). For two natural ways of distributing the discrete temporal scale levels in such a representation, in terms of either a uniform distribution over the scale parameter $\tau$ corresponding to the variance of the composed scale-space kernel or a logarithmic distribution, we will study the scale selection properties that result from detecting local temporal scale levels from local extrema over scale of scale-normalized temporal derivatives. The motivation for studying a logarithmic distribution of the temporal scale levels is that it corresponds to a uniform distribution in units of effective scale $\tau_{\mathrm{eff}} = A + B \log \tau$ for some constants $A$ and $B$, which has been shown to constitute the natural metric for measuring the scale levels in a spatial scale space (Koenderink [45]; Lindeberg [59]).

As we shall see from the detailed theoretical analysis that will follow, this will imply certain differences in scale selection properties of a temporally asymmetric time-causal scale space compared to scale selection in a spatially mirror symmetric Gaussian scale space. These differences in theoretical properties are in turn essential to take into explicit account when formulating algorithms for temporal scale selection in e.g. video analysis or audio analysis applications.

For the temporal scale-space concept based on a uniform distribution of the temporal scale levels in units of the variance of the composed scale-space kernel, it will be shown that temporal scale selection from local extrema over temporal scales will make it possible to estimate the temporal duration of local temporal structures modelled as local temporal peaks and local temporal ramps. For a dense temporal structure modelled as a temporal sine wave, the lack of true scale invariance for this concept will, however, imply that the temporal scale estimates will not be directly proportional to the wavelength of the temporal sine wave. Instead, the scale estimates are affected by a bias, which is not a desirable property.

For the temporal scale-space concept based on a logarithmic distribution of the temporal scale levels, and taken to the limit of the scale-invariant time-causal limit kernel (Lindeberg [78]) corresponding to an infinite number of temporal scale levels that cluster infinitely close near the temporal scale level zero, it will on the other hand be shown that the temporal scale estimates of a dense temporal sine wave will be truly proportional to the wavelength of the signal. By a general proof, it will be shown that this scale invariant property of temporal scale estimates can also be extended to any sufficiently regular signal, which constitutes a general foundation for expressing scale invariant temporal scale selection mechanisms for time-dependent video and audio and more generally also other classes of time-dependent measurement signals.

As a complement to this proposed overall framework for temporal scale selection, we will also present a set of general theoretical results regarding time-causal scale-space representations: (i) showing that previous application of the assumption of a semi-group property for time-causal scale-space concepts leads to undesirable temporal dynamics, which however can be remedied by replacing the assumption of a semi-group structure by a weaker assumption of a cascade property in turn based on a transitivity property, (ii) formulations of scale-normalized temporal derivatives for Koenderink’s time-causal scale-time model [46] and (iii) ways of translating the temporal scale estimates from local extrema over temporal scales in the temporal scale-space representation based on the scale-invariant time-causal limit kernel into quantitative measures of the temporal duration of the
corresponding underlying temporal structures and in turn based on a scale-time approximation of the limit kernel.

In these ways, this paper is intended to provide a theoretical foundation for expressing theoretically well-founded temporal scale selection methods for selecting local temporal scales over time-causal temporal domains, such as video and audio with specific focus on real-time image or sound streams. Applications of this scale selection methodology for detecting both sparse and dense spatio-temporal features in video are presented in a companion paper [79].

1.2 Structure of this article

As a conceptual background to the theoretical developments that will be performed, we will start in Section 2 with an overview of different approaches to handling temporal data within the scale-space framework including a comparison of relative advantages and disadvantages of different types of temporal scale-space concepts.

As a theoretical baseline for the later developments of methods for temporal scale selection in a time-causal scale space, we shall then in Section 3 give an overall description of basic temporal scale selection properties that will hold if the non-causal Gaussian scale-space concept with its corresponding selection methodology for a spatial image domain is applied to a one-dimensional non-causal temporal domain, e.g. for the purpose of handling the temporal domain when analysing pre-recorded video or audio in an offline setting.

In Sections 4–5 we will then continue with a theoretical analysis of the consequences of performing temporal scale selection in the time-causal scale space obtained by convolution with truncated exponential kernels coupled in cascade (Lindeberg [57,77,78]; Lindeberg and Fagerström [81]). By selecting local temporal scales from the scales at which scale-normalized temporal derivatives assume local extrema over temporal scales, we will analyze the resulting temporal scale selection properties for two ways of defining scale-normalized temporal derivatives, by either variance-based normalization as determined by a scale normalization parameter $\gamma$ or $L_p$-normalization for different values of the scale normalization power $p$.

With the temporal scale levels required to be discrete because of the very nature of this temporal scale-space concept, we will specifically study two ways of distributing the temporal scale levels over scale, using either a uniform distribution relative to the temporal scale parameter $\tau$ corresponding to the variance of the composed temporal scale-space kernel in Section 4 or a logarithmic distribution of the temporal scale levels in Section 5.

Because of the analytically simpler form for the time-causal scale-space kernels corresponding to a uniform distribution of the temporal scale levels, some theoretical scale-space properties will turn out to be easier to study in closed form for this temporal scale-space concept. We will specifically show that for a temporal peak modelled as the impulse response to a set of truncated exponential kernels coupled in cascade, the selected temporal scale level will serve as a good approximation of the temporal duration of the peak or be proportional to this measure depending on the value of the scale normalization parameter $\gamma$ used for scale-normalized temporal derivatives based on variance-based normalization or the scale normalization power $p$ for scale-normalized temporal derivatives based on $L_p$-normalization. For a temporal onset ramp, the selected temporal scale level will on the other hand be either a good approximation of the time constant of the onset ramp or proportional to this measure of the temporal duration of the ramp. For a temporal sine wave, the selected temporal scale level will, however, not be directly proportional to the wavelength of the signal, but instead affected by a systematic bias. Furthermore, the corresponding scale-normalized magnitude measures will not be independent of the wavelength of the sine wave but instead show systematic wavelength dependent deviations. A main reason for this is that this temporal scale-space concept does not guarantee temporal scale invariance if the temporal scale levels are distributed uniformly in terms of the temporal scale parameter $\tau$ corresponding to the temporal variance of the temporal scale-space kernel.

With a logarithmic distribution of the temporal scale levels, we will on the other hand show that for the temporal scale-space concept defined by convolution with the time-causal limit kernel (Lindeberg [78]) corresponding to an infinitely dense distribution of the temporal scale levels towards zero temporal scale, the temporal scale estimates will be perfectly proportional to the wavelength of a sine wave. It will also be shown that this temporal scale-space concept leads to perfect scale invariance in the sense that (i) local extrema over temporal scales are preserved under temporal scaling factors corresponding to integer powers of the distribution parameter $c$ of the time-causal limit kernel underlying this temporal scale-space concept and are transformed in a scale-covariant way for any temporal input signal and (ii) if the scale normalization parameter $\gamma = 1$ or equivalently if the scale normalization power $p = 1$, the magnitude values at the local extrema over scale will be equal under corresponding temporal scaling transformations. For this temporal scale-space concept we can therefore fulfil basic requirements to achieve temporal scale invariance also over a time-causal and time-recursive temporal domain.

To simplify the theoretical analysis we will in some cases temporarily extend the definitions of temporal scale-space representations over discrete temporal scale levels to a continuous scale variable, to make it possible to compute local
extrema over temporal scales from differentiation with respect to the temporal scale parameter. Section 6 discusses the influence that this approximation has on the overall theoretical analysis.

Section 7 then illustrates how the proposed theory for temporal scale selection can be used for computing local scale estimates from 1-D signals with substantial variabilities in the characteristic temporal duration of the underlying structures in the temporal signal.

In Section 8, we analyse how the derived scale selection properties carry over to a set of spatio-temporal feature detectors defined over both multiple spatial scales and multiple temporal scales in a time-causal spatio-temporal scale-space representation for video analysis. Section 9 then outlines how corresponding selection of local temporal and logspectral scales can be expressed for audio analysis operations over a time-causal spectro-temporal domain. Finally, Section 10 concludes with a summary and discussion.

To simplify the presentation, we have put some derivations and theoretical analysis in the appendix. Appendix A presents a general theoretical argument of why a requirement about a semi-group property over temporal scales will lead to undesirable temporal dynamics for a time-causal scale space and argues that the essential structure of non-creation of new image structures from any finer to any coarser temporal scale can nevertheless be achieved with the less restrictive assumption about a cascade smoothing property over temporal scales, which then allows for better temporal dynamics in terms of e.g. shorter temporal delays.

In relation to Koenderink’s scale-time model [46], Appendix B shows how corresponding notions of scale-normalized temporal derivatives based on either variance-based normalization or $L_p$-normalization can be defined also for this time-causal temporal scale-space concept.

Appendix C shows how the temporal duration of the time-causal limit kernel proposed in (Lindeberg [78]) can be estimated by a scale-time approximation of the limit kernel via Koenderink’s scale-time model, leading to estimates of how a selected temporal scale level $\hat{\tau}$ from local extrema over temporal scale can be translated into estimates of the temporal duration of temporal structures in the temporal scale-space representation obtained by convolution with the time-causal limit kernel. Specifically, explicit expressions are given for such temporal duration estimates based on first- and second-order temporal derivatives.

2 Theoretical background and related work

2.1 Temporal scale-space concepts

For processing temporal signals at multiple temporal scales, different types of temporal scale-space concepts have been developed in the computer vision literature (see Figure 1):

For off-line processing of pre-recorded signals, a non-causal Gaussian temporal scale-space concept may in many situations be sufficient. A Gaussian temporal scale-space concept is constructed over the 1-D temporal domain in a similar manner as a Gaussian spatial scale-space concept is constructed over a D-dimensional spatial domain (Iijima [35]; Witkin [122]; Koenderink [45]; Koenderink and van Doorn [47]; Lindeberg [61,62,70]; Florack [22]; ter Haar Romeny [26]), with or without the difference that a model for temporal delays may or may not be additionally included (Lindeberg [70]).

When processing temporal signals in real time, or when modelling sensory processes in biological perception computationally, it is on the other hand necessary to base the temporal analysis on time-causal operations.

The first time-causal temporal scale-space concept was developed by Koenderink [46], who proposed to apply Gaussian smoothing on a logarithmically transformed time axis with the present moment mapped to the unreachable infinity. This temporal scale-space concept does, however, not have any known time-recursive formulation. Formally, it requires an infinite memory of the past and has therefore not been extensively applied in computational applications.

Lindeberg [57,77,78] and Lindeberg and Fagerström [81] proposed a time-causal temporal scale-space concept based on truncated exponential kernels or equivalently first-order integrators coupled in cascade, based on theoretical results by Schoenberg [108] (see also Schoenberg [109] and Karlin [42]) implying that such kernels are the only variation-diminishing kernels over a 1-D temporal domain that guarantee non-creation of new local extrema or equivalently zero-crossings with increasing temporal scale. This temporal scale-space concept is additionally time-recursive and can be implemented in terms of computationally highly efficient first-order integrators or recursive filters over time. This theory has been recently extended into a scale-invariant time-causal limit kernel (Lindeberg [78]), which allows for scale invariance over the temporal scaling transformations that correspond to exact mappings between the temporal scale levels in the temporal scale-space representation based on a discrete set of logarithmically distributed temporal scale levels.

Based on semi-groups that guarantee either self-similarity over temporal scales or non-enhancement of local extrema with increasing temporal scales, Fagerström [20] and Lindeberg [70] have derived time-causal semi-groups that allow for a continuous temporal scale parameter and studied theoretical properties of these kernels.

Concerning temporal processing over discrete time, Fleet and Langley [21] performed temporal filtering for optic flow computations based on recursive filters over time. Lindeberg [57,77,78] and Lindeberg and Fagerström [81] showed that first-order recursive filters coupled in cascade constitute a natural time-causal scale-space concept over discrete time,

[Figure 1: six rows of three panels; each row shows a kernel followed by its first- and second-order temporal derivatives. Panel titles by row: $g(t;\ \tau)$, $g_t$, $g_{tt}$; $h(t;\ \mu, K = 10)$, $h_t$, $h_{tt}$; $h(t;\ K = 10, c = \sqrt{2})$, $h_t$, $h_{tt}$; $h(t;\ K = 10, c = 2)$, $h_t$, $h_{tt}$; $h_{Koe}(t;\ c = \sqrt{2})$, $h_{Koe,t}$, $h_{Koe,tt}$; $h_{Koe}(t;\ c = 2)$, $h_{Koe,t}$, $h_{Koe,tt}$.]

Fig. 1 Temporal scale-space kernels with composed temporal variance $\tau = 1$ for the main types of temporal scale-space concepts considered in this paper and with their first- and second-order temporal derivatives: (top row) the non-causal Gaussian kernel $g(t;\ \tau)$, (second row) the composition $h(t;\ \mu, K = 10)$ of $K = 10$ truncated exponential kernels with equal time constants, (third row) the composition $h(t;\ K = 10, c = \sqrt{2})$ of $K = 10$ truncated exponential kernels with logarithmic distribution of the temporal scale levels for $c = \sqrt{2}$, (fourth row) corresponding kernels $h(t;\ K = 10, c = 2)$ for $c = 2$, (fifth row) Koenderink’s scale-time kernels $h_{Koe}(t;\ c = \sqrt{2})$ corresponding to Gaussian convolution over a logarithmically transformed temporal axis with the parameters determined to match the time-causal limit kernel corresponding to truncated exponential kernels with an infinite number of logarithmically distributed temporal scale levels according to (186) for $c = \sqrt{2}$, (bottom row) corresponding scale-time kernels $h_{Koe}(t;\ c = 2)$ for $c = 2$. (Horizontal axis: time $t$)

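
As a rough illustration of the cascade kernels in the second to fourth rows of Fig. 1, the following is a minimal discrete-time sketch, not the implementation used for the figure. It assumes time measured in whole samples, uses first-order recursive filters whose geometric impulse response has mean $\mu$ and variance $\mu^2 + \mu$, and assumes the parameterization $\tau_k = c^{2(k-K)} \tau_{max}$ for the logarithmic distribution. All function names are hypothetical and chosen for illustration only.

```python
import math

def scale_levels_uniform(tau_max, K):
    """Cumulative variances tau_k = k * tau_max / K for k = 1..K."""
    return [k * tau_max / K for k in range(1, K + 1)]

def scale_levels_logarithmic(tau_max, K, c):
    """Cumulative variances tau_k = c**(2*(k-K)) * tau_max.

    This parameterization is an assumption made for this sketch.
    """
    return [c ** (2 * (k - K)) * tau_max for k in range(1, K + 1)]

def discrete_time_constants(taus):
    """Time constants mu_k of the discrete first-order recursive filters.

    The filter y[n] = y[n-1] + (x[n] - y[n-1]) / (1 + mu) has a geometric
    impulse response with mean mu and variance mu**2 + mu, and variances
    add under cascade coupling, so stage k must satisfy
    mu_k**2 + mu_k = tau_k - tau_{k-1}.
    """
    mus, prev = [], 0.0
    for tau in taus:
        dtau = tau - prev
        mus.append((math.sqrt(1.0 + 4.0 * dtau) - 1.0) / 2.0)
        prev = tau
    return mus

def cascade_filter(signal, mus):
    """Pass the signal through the first-order recursive filters in cascade.

    Each stage only uses the current input and its own previous output,
    so the computation is time-causal and time-recursive.
    """
    out = list(signal)
    for mu in mus:
        y, a = 0.0, 1.0 / (1.0 + mu)
        for n, x in enumerate(out):
            y += a * (x - y)
            out[n] = y
    return out

def kernel_moments(h):
    """Return (mass, mean, variance) of a sampled kernel."""
    mass = sum(h)
    mean = sum(n * v for n, v in enumerate(h)) / mass
    var = sum((n - mean) ** 2 * v for n, v in enumerate(h)) / mass
    return mass, mean, var

if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 199
    cases = [
        ("uniform", scale_levels_uniform(1.0, 10)),
        ("log, c=sqrt(2)", scale_levels_logarithmic(1.0, 10, math.sqrt(2.0))),
        ("log, c=2", scale_levels_logarithmic(1.0, 10, 2.0)),
    ]
    for name, taus in cases:
        mus = discrete_time_constants(taus)
        mass, mean, var = kernel_moments(cascade_filter(impulse, mus))
        print(f"{name:15s} mass={mass:.4f} delay={mean:.4f} variance={var:.4f}")
```

Running the script prints the mass, temporal delay (mean) and variance of the three composed kernels. All three have the same composed variance, while the sum of the time constants, which is the temporal delay of the composed kernel, is smaller for the logarithmic distributions than for the uniform one, because the map from variance increment to time constant is concave.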
based on the requirement that the temporal filtering over a 1-D temporal signal must not increase the number of local extrema or equivalently the number of zero-crossings in the signal. In the specific case when all the time constants in this model are equal and tend to zero while simultaneously increasing the number of temporal smoothing steps in such a way that the composed temporal variance is held constant, these kernels can be shown to approach the temporal Poisson kernel [81]. If on the other hand the time constants of the first-order integrators are chosen so that the temporal scale levels become logarithmically distributed, these temporal smoothing kernels approach a discrete approximation of the time-causal limit kernel [78].

Applications of using these linear temporal scale-space concepts for modelling the temporal smoothing step in visual and auditory receptive fields have been presented by Lindeberg [63,69,70,72,73,77,78], ter Haar Romeny et al. [27], Lindeberg and Friberg [82,83] and Mahmoudi [90]. Non-linear spatio-temporal scale-space concepts have been proposed by Guichard [25]. Applications of the non-causal Gaussian temporal scale-space concept for computing spatio-temporal features have been presented by Laptev and Lindeberg [50,51,49], Kläser et al. [43], Willems et al. [121], Wang et al. [118], Shao and Mattivi [112] and others, see specifically Poppe [101] for a survey of early approaches to vision-based human action recognition, Jhuang et al. [39] and Niebles et al. [97] for conceptually related non-causal Gabor approaches, Adelson and Bergen [1] and Derpanis and Wildes [17] for closely related spatio-temporal orientation models and Han et al. [29] for a related mid-level temporal representation termed the video primal sketch.

Applications of the temporal scale-space model based on truncated exponential kernels with equal time constants coupled in cascade and corresponding to Laguerre functions (Laguerre polynomials multiplied by a truncated exponential kernel) for computing spatio-temporal features have been presented by Rivero-Moreno and Bres [103], Shabani et al. [111] and Berg et al. [6], as well as for handling time scales in video surveillance (Jacob and Pless [37]) and for performing edge preserving smoothing in video streams (Paris [98]). This model is also closely related to Tikhonov regularization as used for image restoration by e.g. Surya et al. [115]. A general framework for performing spatio-temporal feature detection based on the temporal scale-space model based on truncated exponential kernels coupled in cascade, with specifically both the theoretical and practical advantages of using a logarithmic distribution of the intermediate temporal scale levels in terms of temporal scale invariance and better temporal dynamics (shorter temporal delays), has been presented in Lindeberg [78].

2.2 Relative advantages of different temporal scale spaces

When developing a temporal scale selection mechanism over a time-causal temporal domain, a first problem concerns what time-causal scale-space concept to base the multi-scale temporal analysis upon. The above reviewed temporal scale-space concepts have different relative advantages from a theoretical and computational viewpoint. In this section, we will perform an in-depth examination of the different temporal scale-space concepts that have been developed in the literature, which will lead us to a class of time-causal scale-space concepts that we argue is particularly suitable with respect to the set of desirable properties we aim at.

The non-causal Gaussian temporal scale space is in many cases the conceptually easiest temporal scale-space concept to handle and to study analytically (Lindeberg [70]). The corresponding temporal kernels are scale invariant, have compact closed-form expressions over both the temporal and frequency domains and obey a semi-group property over temporal scales. When applied to pre-recorded signals, temporal delays can if desirable be disregarded, which eliminates any need for temporal delay compensation. This scale-space concept is, however, not time-causal and not time-recursive, which implies fundamental limitations with regard to real-time applications and realistic modelling of biological perception.

Koenderink’s scale-time kernels [46] are truly time-causal, allow for a continuous temporal scale parameter, have good temporal dynamics and have a compact explicit expression over the temporal domain. These kernels are, however, not time-recursive, which implies that they in principle require an infinite memory of the past (or at least extended temporal buffers corresponding to the temporal extent to which the infinite support temporal kernels are truncated at the tail). Thereby, the application of Koenderink’s scale-time model to video analysis implies that substantial temporal buffers are needed when implementing this non-recursive temporal scale-space in practice. Similar problems with substantial need for extended temporal buffers arise when applying the non-causal Gaussian temporal scale-space concept to offline analysis of extended video sequences. The algebraic expressions for the temporal kernels in the scale-time model are furthermore not always straightforward to handle, and there is no known simple expression for the Fourier transform of these kernels nor any known simple explicit cascade smoothing property over temporal scales with respect to the regular (untransformed) temporal domain. Thereby, certain algebraic calculations with the scale-time kernels may become quite complicated.

The temporal scale-space kernels obtained by coupling truncated exponential kernels or equivalently first-order integrators in cascade are both truly time-causal and truly time-recursive (Lindeberg [57,77,78]; Lindeberg and Fagerström
[81]). The temporal scale levels are on the other hand required to be discrete. If the goal is to construct a real-time signal processing system that analyses continuous streams of signal data in real time, one can however argue that a restriction of the theory to a discrete set of temporal scale levels is less of a constraint, since the signal processing system anyway has to be based on a finite number of sensors and hardware/wetware for sampling and processing the continuous stream of signal data.

In the special case when all the time constants are equal, the corresponding temporal kernels in the temporal scale-space model based on truncated exponential kernels coupled in cascade have compact explicit expressions that are easy to handle both in the temporal domain and in the frequency domain, which simplifies theoretical analysis. These kernels obey a semi-group property over temporal scales, but are not scale invariant and lead to slower temporal dynamics when a larger number of primitive temporal filters are coupled in cascade (Lindeberg [77,78]).

In the special case when the temporal scale levels in this scale-space model are logarithmically distributed, these kernels have a manageable explicit expression over the Fourier domain that enables some closed-form theoretical calculations. Deriving an explicit expression over the temporal domain is, however, harder, since the explicit expression then corresponds to a linear combination of truncated exponential filters for all the time constants, with the coefficients determined from a partial fraction expansion of the Fourier transform, which may lead to rather complex closed-form expressions. Thereby certain analytical calculations may become harder to handle. As shown in [78] and Appendix C, some such calculations can on the other hand be well approximated via a scale-time approximation of the time-causal temporal scale-space kernels. When using a logarithmic distribution of the temporal scales, the composed temporal kernels do however have very good temporal dynamics, and much better temporal dynamics than corresponding kernels obtained by using truncated exponential kernels with equal time constants coupled in cascade. Moreover, these kernels lead to a computationally very efficient numerical implementation. Specifically, these kernels allow for the formulation of a time-causal limit kernel that obeys scale invariance under temporal scaling transformations, which cannot be achieved if using a uniform distribution of the temporal scale levels (Lindeberg [77,78]).

The temporal scale-space representations obtained from the self-similar time-causal semi-groups have a continuous scale parameter and obey temporal scale invariance (Fagerström [20]; Lindeberg [70]). These kernels do, however, have less desirable temporal dynamics (see Appendix A for a general theoretical argument about undesirable consequences of imposing a temporal semi-group property on temporal kernels with temporal delays) and/or lead to pseudodifferential equations that are harder to handle both theoretically and in terms of computational implementation. For these reasons, we shall not consider those time-causal semi-groups further in this treatment.

2.3 Previous work on methods for scale selection

A general framework for performing scale selection for local differential operations was proposed in Lindeberg [60,61] based on the detection of local extrema over scale of scale-normalized derivative expressions and then refined in Lindeberg [66,65]; see Lindeberg [68,75] for tutorial overviews.

This scale selection approach has been applied to a large number of feature detection tasks over spatial image domains including detection of scale-invariant interest points (Lindeberg [66,74]; Mikolajczyk and Schmid [92]; Tuytelaars and Mikolajczyk [117]), performing feature tracking (Bretzner and Lindeberg [10]), computing shape from texture and disparity gradients (Lindeberg and Gårding [84]; Gårding and Lindeberg [24]), detecting 2-D and 3-D ridges (Lindeberg [65]; Sato et al. [106]; Frangi et al. [23]; Krissian et al. [48]), computing receptive field responses for object recognition (Chomat et al. [13]; Hall et al. [28]), performing hand tracking and hand gesture recognition (Bretzner et al. [9]) and computing time-to-collision (Negre et al. [96]).

Specifically, very successful applications have been achieved in the area of image-based matching and recognition (Lowe [89]; Bay et al. [5]; Lindeberg [71,76]). The combination of local scale selection from local extrema of scale-normalized derivatives over scales (Lindeberg [61,66]) with affine shape adaptation (Lindeberg and Gårding [85]) has made it possible to perform multi-view image matching over large variations in viewing distances and viewing directions (Mikolajczyk and Schmid [92]; Tuytelaars and van Gool [116]; Lazebnik et al. [53]; Mikolajczyk et al. [93]; Rothganger et al. [104]). The combination of interest point detection from scale-space extrema of scale-normalized differential invariants (Lindeberg [61,66]) with local image descriptors (Lowe [89]; Bay et al. [5]) has made it possible to design robust methods for performing object recognition of natural objects in natural environments, with numerous applications to object recognition (Lowe [89]; Bay et al. [5]), object category classification (Bosch et al. [8]; Mutch and Lowe [36]), multi-view geometry (Hartley and Zisserman [30]), panorama stitching (Brown and Lowe [12]), automated construction of 3-D object and scene models from visual input (Brown and Lowe [11]; Agarwal et al. [3]), synthesis of novel views from previous views of the same object (Liu [86]), visual search in image databases (Lew et al. [54]; Datta et al. [14]), human computer interaction based on visual input (Porta [102]; Jaimes and Sebe [38]), biometrics (Bicego et al. [7]; Li [55]) and robotics (Se et al. [110]; Siciliano and Khatib [113]).
Alternative approaches for performing scale selection over spatial image domains have also been proposed in terms of (i) detecting peaks of weighted entropy measures (Kadir and Brady [40]) or Lyapunov functionals (Sporring et al. [114]) over scales, (ii) minimising normalized error measures over scale (Lindeberg [67]), (iii) determining minimum reliable scales for edge detection based on a noise suppression model (Elder and Zucker [18]), (iv) determining at what scale levels to stop in non-linear diffusion-based image restoration methods based on similarity measurements relative to the original image data (Mrázek and Navara [95]), (v) comparing reliability measures from statistical classifiers for texture analysis at multiple scales (Kang et al. [41]), (vi) computing image segmentations from the scales at which a supervised classifier delivers class labels with the highest reliability measure (Loog et al. [88]; Li et al. [56]), (vii) selecting scales for edge detection by estimating the saliency of elongated edge segments (Liu et al. [87]) or (viii) considering subspaces generated by local image descriptors computed over multiple scales (Hassner et al. [31]).

More generally, spatial scale selection can be seen as a specific instance of computing invariant receptive field responses under natural image transformations, to (i) handle objects in the world of different physical size and to account for scaling transformations caused by the perspective mapping, and with extensions to (ii) affine image deformations to account for variations in the viewing direction and (iii) Galilean transformations to account for relative motions between objects in the world and the observer as well as to (iv) illumination variations (Lindeberg [73]).

Early theoretical work on temporal scale selection in a time-causal scale space was presented in Lindeberg [64] with primary focus on the temporal Poisson scale-space, which possesses a temporal semi-group structure over a discrete time-causal temporal domain while leading to long temporal delays (see Appendix A for a general theoretical argument). Temporal scale selection in non-causal Gaussian spatio-temporal scale space has been used by Laptev and Lindeberg [50] and Willems et al. [121] for computing spatio-temporal interest points, however, with certain theoretical limitations that are explained in a companion paper [79] (see footnote 1). The purpose of this article is to present a much further developed and more general theory for temporal scale selection in time-causal scale spaces over continuous temporal domains and to analyse the theoretical scale selection properties for different types of model signals.

Footnote 1: The spatio-temporal scale selection method in (Laptev and Lindeberg [50]) is based on a spatio-temporal Laplacian operator that is not scale covariant under independent relative scaling transformations of the spatial vs. the temporal domains [79], which implies that the spatial and temporal scale estimates will not be robust under independent variabilities of the spatial and temporal scales in video data. The spatio-temporal scale selection method applied to the determinant of the spatio-temporal Hessian in (Willems et al. [121]) does not make use of the full flexibility of the notion of γ-normalized derivative operators [79] and has not previously been developed over a time-causal spatio-temporal domain.

3 Scale selection properties for the non-causal Gaussian temporal scale space concept

In this section, we will present an overview of theoretical properties that will hold if the Gaussian temporal scale-space concept is applied to a non-causal temporal domain, if additionally the scale selection mechanism that has been developed for a non-causal spatial domain is directly transferred to a non-causal temporal domain. The set of temporal scale-space properties that we will arrive at will then be used as a theoretical base-line for developing temporal scale-space properties over a time-causal temporal domain.

3.1 Non-causal Gaussian temporal scale-space

Over a one-dimensional temporal domain, axiomatic derivations of a temporal scale-space representation based on the assumptions of (i) linearity, (ii) temporal shift invariance, (iii) semi-group property over temporal scale, (iv) sufficient regularity properties over time and temporal scale and (v) non-enhancement of local extrema imply that the temporal scale-space representation

$$ L(\cdot;\; \tau, \delta) = g(\cdot;\; \tau, \delta) * f(\cdot) \tag{1} $$

should be generated by convolution with possibly time-delayed temporal kernels of the form (Lindeberg [70])

$$ g(t;\; \tau, \delta) = \frac{1}{\sqrt{2 \pi \tau}} \, e^{-\frac{(t - \delta)^2}{2 \tau}} \tag{2} $$

where $\tau$ is a temporal scale parameter corresponding to the variance of the Gaussian kernel and $\delta$ is a temporal delay. Differentiating the kernel with respect to time gives

$$ g_t(t;\; \tau, \delta) = -\frac{(t - \delta)}{\tau} \, g(t;\; \tau, \delta) \tag{3} $$

$$ g_{tt}(t;\; \tau, \delta) = \frac{((t - \delta)^2 - \tau)}{\tau^2} \, g(t;\; \tau, \delta) \tag{4} $$

see the top row in Figure 1 for graphs. When analyzing pre-recorded temporal signals, it can be preferable to set the temporal delay to zero, leading to temporal scale-space kernels having a similar form as spatial Gaussian kernels:

$$ g(t;\; \tau) = \frac{1}{\sqrt{2 \pi \tau}} \, e^{-\frac{t^2}{2 \tau}}. \tag{5} $$
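As an illustration of the kernel expressions (2)-(4), the following minimal sketch evaluates the time-delayed Gaussian kernel and its first- and second-order temporal derivatives on a sampled time axis, and compares the closed-form derivatives with finite differences. The function names, the sampling grid and the values of τ and δ are assumptions chosen for illustration only; they are not taken from the paper.

```python
import numpy as np

def gauss(t, tau, delta=0.0):
    """Time-delayed Gaussian kernel, eq. (2); tau is the variance."""
    return np.exp(-(t - delta) ** 2 / (2.0 * tau)) / np.sqrt(2.0 * np.pi * tau)

def gauss_t(t, tau, delta=0.0):
    """First-order temporal derivative of the kernel, eq. (3)."""
    return -(t - delta) / tau * gauss(t, tau, delta)

def gauss_tt(t, tau, delta=0.0):
    """Second-order temporal derivative of the kernel, eq. (4)."""
    return ((t - delta) ** 2 - tau) / tau ** 2 * gauss(t, tau, delta)

# Illustrative parameter values (assumptions, not from the paper).
tau, delta = 4.0, 1.5
t = np.linspace(-20.0, 20.0, 4001)
dt = t[1] - t[0]

g = gauss(t, tau, delta)
mass = np.sum(g) * dt                      # the kernel should integrate to one

# Finite-difference derivatives as an independent check of (3) and (4).
num_t = np.gradient(g, dt)
num_tt = np.gradient(num_t, dt)
err_t = np.max(np.abs(num_t - gauss_t(t, tau, delta)))
err_tt = np.max(np.abs(num_tt - gauss_tt(t, tau, delta)))

print(mass, err_t, err_tt)
```

Setting `delta=0.0` gives the zero-delay kernel of eq. (5). The finite-difference comparison is only a sanity check on the reconstructed formulas and says nothing about how such kernels should be discretized in practice.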
3.2 Temporal scale selection from scale-normalized derivatives

As a conceptual background to the treatments that we shall later develop regarding temporal scale selection in time-causal temporal scale spaces, we will in this section describe the theoretical structure that arises by transferring the theory for scale selection in a Gaussian scale space over a spatial domain to the non-causal Gaussian temporal scale space:

Given the temporal scale-space representation $L(t;\; \tau)$ of a temporal signal $f(t)$ obtained by convolution with the Gaussian kernel $g(t;\; \tau)$ according to (1), temporal scale selection can be performed by detecting local extrema over temporal scales of differential expressions expressed in terms of scale-normalized temporal derivatives at any scale $\tau$ according to (Lindeberg [66,65,68,75])

$$ \partial_{\zeta^n} = \tau^{n \gamma / 2} \, \partial_{t^n}, \tag{6} $$

where $\zeta = t / \tau^{\gamma/2}$ is the scale-normalized temporal variable, $n$ is the order of temporal differentiation and $\gamma$ is a free parameter. It can be shown [66, Section 9.1] that this notion of γ-normalized derivatives corresponds to normalizing the $n$th order Gaussian derivatives $g_{\zeta^n}(t;\; \tau)$ over a one-dimensional domain to constant $L_p$-norms over scale $\tau$

$$ \| g_{\zeta^n}(\cdot;\; \tau) \|_p = \left( \int_{t \in \mathbb{R}} | g_{\zeta^n}(t;\; \tau) |^p \, dt \right)^{1/p} = G_{n,\gamma} \tag{7} $$

with

$$ p = \frac{1}{1 + n (1 - \gamma)} \tag{8} $$

where the perfectly scale invariant case $\gamma = 1$ corresponds to $L_1$-normalization for all orders $n$ of temporal differentiation.

Temporal scale invariance. A general and very useful scale invariant property that results from this construction of the notion of scale-normalized temporal derivatives can be stated as follows: Consider two signals $f$ and $f'$ that are related by a temporal scaling transformation

$$ f'(t') = f(t) \quad \mbox{with} \quad t' = S t, \tag{9} $$

and assume that there is a local extremum over scales at $(t_0;\; \tau_0)$ in a differential expression $\mathcal{D}_{\gamma-norm} L$ defined as a homogeneous polynomial of Gaussian derivatives computed from the scale-space representation $L$ of the original signal $f$. Then, there will be a corresponding local extremum over scales at $(t_0';\; \tau_0') = (S t_0;\; S^2 \tau_0)$ in the corresponding differential expression $\mathcal{D}_{\gamma-norm} L'$ computed from the scale-space representation $L'$ of the rescaled signal $f'$ [66, Section 4.1].

This scaling result holds for all homogeneous polynomial differential expressions and implies that local extrema over scales of γ-normalized derivatives are preserved under scaling transformations. Specifically, this scale invariant property implies that if a local temporal scale level in dimension of time $\sigma = \sqrt{\tau}$ is selected to be proportional to the temporal scale estimate $\hat{\sigma} = \sqrt{\hat{\tau}}$ such that $\sigma = C \hat{\sigma}$, then if the temporal signal $f$ is transformed by a temporal scale factor $S$, the temporal scale estimate and therefore also the selected temporal scale level will be transformed by a similar temporal factor $\hat{\sigma}' = S \hat{\sigma}$, implying that the selected temporal scale levels will automatically adapt to variations in the characteristic temporal scale of the signal. Thereby, such local extrema over temporal scale provide a theoretically well-founded way to automatically adapt the scale levels to local scale variations.

Specifically, scale-normalized scale-space derivatives of order $n$ at corresponding temporal moments will be related according to

$$ L'_{\zeta'^n}(t';\; \tau') = S^{n (\gamma - 1)} \, L_{\zeta^n}(t;\; \tau) \tag{10} $$

which means that $\gamma = 1$ implies perfect scale-invariance in the sense that the γ-normalized derivatives at corresponding points will be equal. If $\gamma \neq 1$, the difference in magnitude can on the other hand be easily compensated for using the scale values of the corresponding scale-adaptive image features (see below).

3.3 Temporal peak

For a temporal peak modelled as a Gaussian function with variance $\tau_0$

$$ g(t;\; \tau_0) = \frac{1}{\sqrt{2 \pi \tau_0}} \, e^{-\frac{t^2}{2 \tau_0}}, \tag{11} $$

it can be shown that scale selection from local extrema over scale of second-order scale-normalized temporal derivatives

$$ L_{\zeta\zeta} = \tau^{\gamma} L_{tt} \tag{12} $$

implies that the scale estimate at the position $t = 0$ of the peak will be given by (Lindeberg [65, Equation (56)] [74, Equation (212)])

$$ \hat{\tau} = \frac{2 \gamma}{3 - 2 \gamma} \, \tau_0. \tag{13} $$

If we require the scale estimate to reflect the temporal duration of the peak such that

$$ \hat{\tau} = q^2 \tau_0, \tag{14} $$

then this implies

$$ \gamma = \frac{3 q^2}{2 (q^2 + 1)} \tag{15} $$
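The relations (13)-(15) for the temporal peak can be checked numerically with the following minimal sketch. It relies on the semi-group property of the Gaussian kernel (variances add under convolution), so that the scale-space representation of the peak (11) is $L(t;\; \tau) = g(t;\; \tau_0 + \tau)$, and then searches for the maximum over scales of the magnitude of $\tau^{\gamma} L_{tt}$ at $t = 0$. The function names, the logarithmic scale grid and the test values are assumptions made for illustration, not part of the paper.

```python
import numpy as np

def gamma_from_q(q):
    """Eq. (15): gamma such that the scale estimate equals q^2 * tau0."""
    return 3.0 * q ** 2 / (2.0 * (q ** 2 + 1.0))

def predicted_scale(tau0, gamma):
    """Eq. (13): closed-form scale estimate for a Gaussian peak."""
    return 2.0 * gamma / (3.0 - 2.0 * gamma) * tau0

def normalized_Ltt_at_peak(tau, tau0, gamma):
    """tau^gamma * L_tt at t = 0, with L(t; tau) = g(t; tau0 + tau), eq. (12)."""
    s = tau0 + tau
    L_tt = -1.0 / (s * np.sqrt(2.0 * np.pi * s))   # g_tt(0; s) = -g(0; s) / s
    return tau ** gamma * L_tt

def selected_scale(tau0, gamma, taus):
    """Scale at which the magnitude of the normalized response is maximal."""
    response = np.abs(normalized_Ltt_at_peak(taus, tau0, gamma))
    return taus[np.argmax(response)]

# Dense logarithmic grid of candidate temporal scales (illustrative choice).
taus = np.geomspace(1e-2, 1e3, 200001)

tau0 = 4.0
gamma = gamma_from_q(1.0)        # q = 1 requests tau_hat = tau0
tau_hat = selected_scale(tau0, gamma, taus)
print(gamma, tau_hat, predicted_scale(tau0, gamma))
```

Repeating the search with `tau0` multiplied by a factor $S^2$ should move the selected scale by the same factor, which is the scale covariance property stated in Section 3.2; the grid search recovers the prediction only up to the grid resolution.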