BOOSTING DICTIONARY LEARNING WITH ERROR CODES

Yigit Oktar†, Mehmet Turkan‡
† Department of Computer Engineering
‡ Department of Electrical and Electronics Engineering
Izmir University of Economics, Izmir, Turkey

ABSTRACT

In conventional sparse-representations-based dictionary learning algorithms, initial dictionaries are generally assumed to be proper representatives of the system at hand. However, this may not be the case, especially in systems restricted to random initializations. Therefore, a supposedly optimal state update based on such an improper model might lead to undesired effects that are conveyed to successive iterations. In this paper, we propose a dictionary learning method which includes a general feedback process that codes the intermediate error left over from a less intensive initial learning attempt, and then adjusts the sparse codes accordingly. Experimental observations show that such an additional step vastly improves rates of convergence in high-dimensional cases, and also results in better converged states in the case of random initializations. Improvements also scale up with more lenient sparsity constraints.

Index Terms— Dictionary learning, sparse coding, error coding, boosting.

1. INTRODUCTION

One of the well-known properties of the Fourier transform and the wavelet transform is to exploit certain structures of the input signal and to represent these structures in a compact (or sparse) manner. Sparse representations have thus become an active research topic while providing good performance in a large diversity of signal and image processing applications [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].

Sparse representations consist in representing most or all of the information contained in a signal with a (linear) combination of a small number of atoms carefully chosen from an over-complete (redundant) basis. This basis is generally referred to as a dictionary. Such a dictionary is a collection of atoms whose number is much larger than the dimension of the signal space. Any signal then admits an infinite number of sparse representations, and the sparsest such representation happens to have interesting properties for a number of signal and image processing tasks.

The most crucial question in sparse representations is the choice of the dictionary D. One can use a variety of pre-defined sets of waveforms, such as wavelets [11], curvelets [12], contourlets [13], shearlets [14], and bandelets [15]. However, both the sparsity and the quality of the representation depend on how well the used dictionary is adapted to the data at hand. The problem of dictionary learning, or even simply finding adaptive ways to construct or to select relevant dictionaries for sparse representations (going far beyond using a few bases like DCT, DFT, or wavelets), has therefore become a key issue for further progress in this area.

Various dictionary learning algorithms have been proposed in the literature, e.g., [16, 17, 18, 19, 20, 21, 22, 23]. Most of these methods focus on ℓ0 or ℓ1 norm sparsity measures, potentially leading to simple formulations and hence to efficient techniques in practice. The Method of Optimal Directions (MOD) [16] is one of those successful methods of non-parametric dictionary learning. MOD builds upon the K-means process, alternating between a sparse coding step and a least-squares optimization based dictionary update step. Although MOD is effective in low-dimensional cases, the pseudo-inverse computation makes it intractable in high-dimensional problems. K-SVD [17] is a similar algorithm, alternating between sparse coding and dictionary update. The difference in the dictionary update step is that, instead of updating D as a whole, K-SVD updates single dictionary atoms and the corresponding sparse codes, using a rank-1 approximation that minimizes the approximation error once the corresponding atom has been isolated from the dictionary. These iterative updates therefore affect the following ones, presumably accelerating convergence.

It is important to note that both MOD and K-SVD assume sparse coding under an ℓ0 norm constraint. Coding methods for such a constraint generally include the matching pursuit (MP) [24] and orthogonal MP (OMP) [25] algorithms. However, there is no theoretical restriction against using an ℓ1 norm constraint instead. In essence, both MOD and K-SVD define specific dictionary update procedures resulting in non-structurally learned dictionaries, rather than defining the type of sparse coding constraint. On the other hand, certain dictionary learning algorithms are specifically based on coding with an ℓ1 norm constraint. In this context, coefficient learning is a more appropriate term than sparse coding when referring to the calculation of sparse representation coefficients. Basis pursuit (BP) [26], LARS [27], and FOCUSS [28] can be listed as examples. Gradient descent or Lagrange dual methods can then be chained as a dictionary update step [29]. These alternative algorithms have proved to be more efficient, especially in terms of convergence rates.
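Since OMP [25] is the ℓ0-constrained coding routine used throughout the remainder of this paper, a minimal NumPy sketch of a batch OMP is given below for reference. It is an illustrative implementation written for this text, not the authors' code or an optimized library routine; the function name omp and its interface are our own choices, and unit-norm dictionary columns are assumed.

    import numpy as np

    def omp(D, Y, k):
        """Greedy orthogonal matching pursuit (illustrative sketch).

        D : (n, K) dictionary, ideally with unit-norm columns
        Y : (n, M) signals, one per column
        k : maximum number of nonzero coefficients per signal
        Returns X : (K, M) coefficient matrix with ||x_i||_0 <= k.
        """
        n, K = D.shape
        M = Y.shape[1]
        X = np.zeros((K, M))
        for i in range(M):
            y = Y[:, i]
            residual = y.copy()
            support = []
            coeffs = np.zeros(0)
            for _ in range(k):
                # pick the atom most correlated with the current residual
                j = int(np.argmax(np.abs(D.T @ residual)))
                if j not in support:
                    support.append(j)
                # re-fit the coefficients on the selected support (orthogonal projection)
                coeffs, _, _, _ = np.linalg.lstsq(D[:, support], y, rcond=None)
                residual = y - D[:, support] @ coeffs
                if np.linalg.norm(residual) < 1e-12:
                    break
            X[support, i] = coeffs
        return X

In the algorithms discussed below, a call written as OMP(D, Y, m) corresponds to omp(D, Y, m) in this sketch.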
Accomplishments of such ℓ1 norm based algorithms over conventional ℓ0 ones are apparent. However, as noted before, the ℓ1 norm is a relaxation of the original problem, so it does not guarantee an optimal solution in general. Such an approach is especially problematic in low-dimensional cases. On the other hand, because of its non-convexity, the ℓ0 norm approach leads to low convergence rates and is feasible only in low-dimensional cases.

A usual deficiency of the sparse coding step is that the algorithms listed above assume proper dictionaries at each iteration. This is indeed very problematic, especially at the initial stages of the learning process. In many situations, the initial dictionary will not be a good representative of the optimal one. Therefore, "optimal" coding done with such a dictionary, as targeted by both ℓ0 and ℓ1 norm coding schemes, will most likely result in sparse codes which are themselves not good representatives of the optimal state. As a result, the next dictionary will adopt this undesired property to a certain extent and convey it to successive learning steps. In this paper, we propose a generic modification to the sparse coding (or coefficient learning) step, with an error feedback process that codes an intermediate error and adjusts the sparse codes accordingly in a less intensive learning attempt, hence leading to faster convergence when compared to the conventional approaches.

This paper is organized as follows. Section 2 describes the details of the proposed dictionary learning algorithm. Section 3 demonstrates the experimental setup and the obtained results. Finally, Section 4 concludes this paper.

2. DICTIONARY LEARNING WITH ERROR CODES

2.1. Dictionary Learning

The problem of dictionary learning for sparse representations can be formulated as the non-convex optimization problem

    \arg\min_{D,\{x_i\}} \sum_{i=1}^{M} \| y_i - D x_i \|_2^2 \quad \text{subject to} \quad \| x_i \|_0 \le k.    (1)

Here the set {y_i}, i = 1, ..., M, represents the training samples, D is the dictionary to be learned, and x_i is the sparse representation of the sample y_i, ∀i. The parameter k defines the sparsity constraint allowed for sparse coding during the learning process. Since Eqn. (1) poses a non-convex optimization problem through the ℓ0 norm constraint, solving it is NP-hard in general [30]. However, it has been shown that for many high-dimensional cases the ℓ1 norm constraint is enough to ensure the sparsest solution [31]. Note that the ℓ1 norm constraint also turns the problem into a convex optimization one, which can then be solved via regular convex optimization tools. However, this should still be regarded as an approximation to the original problem.

In essence, traditional algorithms split Eqn. (1) into two simpler approximate subproblems, namely the sparse coding and dictionary update steps, and alternate between them to find a solution as follows,

    \arg\min_{\{x_i\}} \sum_{i=1}^{M} \| y_i - D x_i \|_2^2 \quad \text{subject to} \quad \| x_i \|_0 \le k,    (2)

    \arg\min_{D} \sum_{i=1}^{M} \| y_i - D x_i^* \|_2^2.    (3)

Eqn. (2) corresponds to the sparse coding step where the dictionary D is assumed to be fixed. This subproblem can easily be solved with any pursuit algorithm. Eqn. (3) defines the dictionary update step that is performed with {x_i^*}, i.e., the sparse codes acquired from sparse coding. One direct way to solve this subproblem is least-squares optimization as in MOD, such that D^* = Y X^+, where Y = {y_i}_1^M and X = {x_i^*}_1^M represent the training samples matrix and the sparse codes matrix, respectively, and X^+ is the Moore-Penrose pseudo-inverse of X. A full iteration is completed by alternating between sparse coding and dictionary update, and this procedure is repeated until convergence.
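As a concrete reference for the alternation in Eqns. (2) and (3), the sketch below implements a plain MOD-style learning loop using the omp routine given earlier and the pseudo-inverse update D^* = Y X^+. The function name, the fixed iteration count in place of a convergence test, and the absence of atom re-normalization are our own simplifications; they are not prescribed by the paper.

    import numpy as np

    def mod_learn(D0, Y, k, n_iter=50):
        """Plain MOD alternation (illustrative sketch).

        D0 : (n, K) initial dictionary
        Y  : (n, M) training samples, one per column
        k  : sparsity constraint used in the coding step
        """
        D = D0.copy()
        for _ in range(n_iter):
            X = omp(D, Y, k)             # sparse coding step, Eqn. (2)
            D = Y @ np.linalg.pinv(X)    # least-squares dictionary update, Eqn. (3): D* = Y X^+
        return D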
2.2. Introducing Error Feedback

We propose a formulation that incorporates an intermediate error into the learning process. In the first stage, a regular sparse coding and dictionary update procedure is performed, but with a sparsity level m < k, by solving Eqn. (4),

    \arg\min_{D,\{a_i\}} \sum_{i=1}^{M} \| y_i - D a_i \|_2^2 \quad \text{subject to} \quad \| a_i \|_0 \le m.    (4)

Let us now denote {a_i^*} and D^* as the resulting sparse codes and the dictionary, respectively. The second stage involves sparse coding the approximation error e_i = y_i - D^* a_i^*, ∀i, as

    \arg\min_{\{b_i\}} \sum_{i=1}^{M} \| e_i - D^* b_i \|_2^2 \quad \text{s.t.} \quad \| b_i \|_0 \le k - m.    (5)

After acquiring {b_i^*} in Eqn. (5), the current-state sparse codes can further be updated as a_i^* + b_i^*. This step basically corresponds to a feedback logic, where the first approximation is tested and its deviation is then sparse coded to be incorporated into the actual codes. Note here that the original sparsity constraint still holds since ||a_i^* + b_i^*||_0 ≤ k. In the last stage, a final dictionary update is performed as in Eqn. (6) and an iteration is completed,

    \arg\min_{D} \sum_{i=1}^{M} \| y_i - D (a_i^* + b_i^*) \|_2^2.    (6)

Algorithm 1 EcMOD algorithm pseudocode.
1: function ECMOD(D, Y, m, k)
2:   while not converged do
3:     A ← OMP(D, Y, m)
4:     D ← Y A^+
5:     E ← Y − D A
6:     B ← OMP(D, E, k − m)
7:     D ← Y (A + B)^+
   return D

Algorithm 2 EcMOD+ algorithm pseudocode.
1: function ECMOD+(D, Y, m, k)
2:   while not converged do
3:     D ← EcMOD(D, Y, m, k)
4:     X ← OMP(D, Y, k)
5:     D ← Y X^+
   return D
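For readers who prefer working code to pseudocode, the sketch below renders Algorithms 1 and 2 in NumPy, reusing the omp routine given earlier. It is a minimal illustrative translation, not the authors' implementation; in particular, we interpret the EcMOD call inside EcMOD+ as a single EcMOD iteration, and we replace the "while not converged" tests with fixed iteration counts.

    import numpy as np

    def ecmod_iteration(D, Y, m, k):
        """One EcMOD iteration (body of the loop in Algorithm 1)."""
        A = omp(D, Y, m)                   # initial coding with reduced sparsity m < k, Eqn. (4)
        D = Y @ np.linalg.pinv(A)          # first dictionary update, D <- Y A^+
        E = Y - D @ A                      # intermediate error left by the first attempt
        B = omp(D, E, k - m)               # code the error with the remaining budget k - m, Eqn. (5)
        D = Y @ np.linalg.pinv(B + A)      # final update with combined codes, D <- Y (A + B)^+, Eqn. (6)
        return D

    def ecmod(D, Y, m, k, n_iter=50):
        """EcMOD (Algorithm 1) run for a fixed number of iterations."""
        for _ in range(n_iter):
            D = ecmod_iteration(D, Y, m, k)
        return D

    def ecmod_plus(D, Y, m, k, n_iter=50):
        """EcMOD+ (Algorithm 2): an EcMOD step followed by a plain MOD update."""
        for _ in range(n_iter):
            D = ecmod_iteration(D, Y, m, k)   # error-coded step
            X = omp(D, Y, k)                  # regular coding at full sparsity k
            D = Y @ np.linalg.pinv(X)         # least-squares MOD update
        return D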
3. EXPERIMENTAL RESULTS

Two variants of the proposed scheme have been tested experimentally, namely EcMOD in Algorithm 1 and EcMOD+ in Algorithm 2. EcMOD implements the methodology defined in Section 2, while EcMOD+ additionally performs a regular least-squares dictionary update (MOD) at the end of each iteration. OMP is used for sparse coding.

Two experimental setups have been used, corresponding to low- and high-dimensional cases, respectively. In the first setup, 8×8 distinct patches were extracted from the Barbara image of size 512×512, resulting in 4096 image patches. The dictionary size was accordingly chosen as 64×256. The sparsity constraint k and the additional sparsity parameter m were chosen as 8 and 4, respectively, so that m equals k/2. Results corresponding to this setup are presented in Figure 1, Figure 2 and Table 1.
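As a usage illustration, the first setup can be approximated with the sketches above as follows. The patch extraction loop, the uniform random initialization, and the placeholder image array are our own assumptions added for completeness; the paper specifies only the dimensions, the patch count, and the values of k and m.

    import numpy as np

    rng = np.random.default_rng(0)
    # Placeholder stand-in for the 512x512 grayscale Barbara image (loading is omitted here).
    img = rng.uniform(0.0, 255.0, size=(512, 512))

    # 4096 distinct (non-overlapping) 8x8 patches -> training matrix Y of size 64x4096.
    patches = [img[r:r + 8, c:c + 8].reshape(64)
               for r in range(0, 512, 8) for c in range(0, 512, 8)]
    Y = np.stack(patches, axis=1)

    # Random 64x256 initial dictionary with unit-norm columns.
    D0 = rng.standard_normal((64, 256))
    D0 /= np.linalg.norm(D0, axis=0, keepdims=True)

    # k = 8 and m = k/2 = 4, as in the first setup.
    D = ecmod_plus(D0, Y, m=4, k=8, n_iter=30)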
In error coded schemes, as a consequence of not directly coding with sparsity k, the final codes a_i^* + b_i^* may not necessarily be optimal for k. However, as there are two coding steps with lesser sparsity constraints followed by a summation, the codes have a higher chance of being optimal for most of the sparsity levels less than k. As a result, the converged dictionary is a better representative of such sparsity levels, as observable in the results in Table 1. Error coded schemes consistently perform better in sparser cases.

Not targeting a sparsity level k directly leads to the possibility that the converged dictionary is suboptimal for that given k. However, this drawback can be worked around by chaining a conventional step that targets the exact sparsity level k. Referred to as the EcMOD+ algorithm, experiments with this further modified method show that such an architecture attains optimality for sparsity k and also performs better for the cases where the sparsity level is less than k. This phenomenon is apparent in Figure 1, where the learning and testing k are both chosen as 8. The performance of EcMOD is not consistent, as it performs much like MOD in the second random initialization case, whereas EcMOD+ consistently performs well.

In a more extensive manner, Figure 2 compares the performance of MOD and EcMOD+ in the case of ten different randomly initialized dictionaries. DCT convergence is supplied as a baseline. This figure demonstrates the superiority of this error coding scheme over conventional coding in the case of random initializations. "Optimal" coding with an improper random dictionary within the initial stages hampers the final convergence state, as observable in the case of MOD. Although it does not target optimal codes, the error coded scheme EcMOD+ converges to the DCT result in all ten random initialization cases. This is possible because, in each step from the beginning, the dictionary passes through a less intensive validation, at the expense of acquiring optimal codes. There is no total superiority in the DCT initialization case, as seen in Table 1. However, superiority can be achieved with more complex error coding schemes.

Finally, regarding the overall sparsity levels within the error codes and their evolution during the learning process, the proposed method presents interesting trends. In conventional sparse coding (targeting the sparsity k), as the approximation threshold is kept very strict in general, sparse codes end up using all k supports. Therefore, in methods such as MOD and K-SVD, codes consistently have k supports even starting from the initial iteration. In the proposed method, during error coding, re-selection of previously selected supports is frequent. This is especially observable during the initial iterations. Near-maximum support counts are gradually reached as the system converges, but the exact maximum is not necessarily reached.

Fig. 1. Results of dictionary learning performed on the Barbara image. Dictionary size 64×256, k = 8 for both learning and testing. Each column represents a different initial dictionary. First and middle columns are randomly initialized. Last column is of DCT initialization. Figures depict PSNR (dB) performance values versus iteration number.

Fig. 2. Results for ten cases of uniformly random initial dictionaries. Lower and upper lines of each method correspond to minimum and maximum PSNR (dB) values attained among all ten cases. Middle lines represent average values.

Table 1. Average approximation PSNR (dB) performance values of learnt dictionaries as in Figure 1 and Figure 2. k represents the sparsity used for testing.

                     k=2      k=5      k=10     k=20
  Rand. 1   MOD      26.33    32.08    36.82    40.43
            EcMOD    26.91    33.90    37.87    41.72
            EcMOD+   26.89    33.90    38.10    41.86
  Rand. 2   MOD      26.61    32.20    36.89    40.68
            EcMOD    27.92    33.36    36.88    40.25
            EcMOD+   26.73    33.87    38.08    41.83
  DCT Init. MOD      26.22    33.02    38.40    42.82
            EcMOD    26.95    34.05    38.06    41.98
            EcMOD+   26.94    33.96    38.19    42.07

In the second set of experiments, the EcMOD+ scheme has been tested with all possible 255025 image patches extracted by a sliding window of size 8×8 with full coverage. Combinations of sparsity levels of 4, 8, 16 and 32 against small and large dictionaries were tested. DCT was used as the initial dictionary in all cases. Note that, in DCT initialization cases, there is only an advantage of faster convergence rates but not of better converged states, at least for this error coding scheme. Superior converged states with more complex schemes have been achieved, but they are omitted here because of the space limitation.

Experimental results for the second setup are summarized in Figure 3 and Figure 4. In Figure 3, performance ratios were calculated relative to the MOD algorithm in terms of mean-squared error, to estimate a performance gain factor for each sparsity level at approximately the fifth equivalent iteration, for an approximate convergence rate analysis. Gains for the large dictionary in the case of learning with lenient sparsity levels are more striking, but stricter sparsity constraints cause substandard performance. Overall, the performance in this case is promising as it signals scaling with input size. The smaller dictionary safely performs above standard. Figure 4 depicts the convergence plot for dictionary size 64×256 and k = 8 for both learning and testing. A significant gain in convergence rate is observable with the error coded scheme in this high-dimensional setup. Note that mean-squared error is given as the measure since sliding window patches were used. Finally, Figure 5 compares atoms that lie within similar frequency domains, learnt with MOD and EcMOD+. Note here the well-defined structure of EcMOD+.

Fig. 3. Performance gain factor for different sparsity levels, in the case of a relatively small and a large dictionary.

Fig. 4. The convergence performance for dictionary size 64×256 and k = 8 for both learning and testing.

Fig. 5. A frequency subdomain of learnt dictionaries formed with the same processing time. MOD on the left, EcMOD+ on the right.
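For reference, the two figures of merit used above can be computed as in the short sketch below. The 255-peak PSNR convention for 8-bit image patches and the direction of the ratio in the gain factor are our assumptions, since the paper does not spell them out.

    import numpy as np

    def mse(Y, D, X):
        """Mean-squared reconstruction error of the coded training set."""
        return float(np.mean((Y - D @ X) ** 2))

    def psnr(Y, D, X, peak=255.0):
        """Approximation PSNR in dB, assuming 8-bit intensities (peak = 255)."""
        return 10.0 * np.log10(peak ** 2 / mse(Y, D, X))

    def gain_factor(mse_mod, mse_error_coded):
        """Gain of the error coded scheme relative to MOD, as a ratio of MSEs."""
        return mse_mod / mse_error_coded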
4. CONCLUSION

MOD, by itself, is a greedy algorithm that targets optimality one task at a time. Tasks are treated as isolated from each other, even within the same iteration. This makes MOD a rather short-sighted method which fails at tasks that require a broader perspective of the system.

In this paper, we presented a method in which the sparse coding and dictionary update steps are intertwined through intermediate error codes. Note that there could be other ways to accomplish this. Another option would be to add the sparse codes of two successive iterations and perform a dictionary update based on this accumulated code, without introducing error codes at all. As an analogy, MOD can be considered a single-step numerical method, whereas the example just given would be a multi-step one. Our method can be considered a multi-step approach that utilizes a half-step.

To summarize, our framework is generic enough to be included in many forms of learning-based approaches. In essence, our scheme includes an initial attempt of learning with less computational and spatial requirement than originally allocated. This corresponds to a single iteration of MOD performed with m < k. In this way, a feedback can be acquired that reflects the congruence of the model and the data at hand, so that the current state can properly be adjusted before the final model update, which consumes the remaining resources.

This approach will be most beneficial for systems that are restricted to random initializations (as apparent in Figures 1 and 2 and Table 1). A random initial model is most likely to be an improper representative of a specific system. Therefore, an update based on this model, no matter how intensive it is, will result in an undesired state. In fact, an optimal update based on this improper model could be more impairing than a suboptimal one in this regard.

As a concluding remark, readers should bear in mind that this work is based on a pragmatic perspective. Although satisfactory improvement has been observed, a more rigorous theoretical approach can lead to certain variations built on top of this framework that will be far more fruitful.

References

[1] M. Elad and M. Aharon. "Image denoising via sparse and redundant representations over learned dictionaries". In: IEEE Trans. Image Process. 15.12 (2006), pp. 3736–3745.
[2] M. Protter and M. Elad. "Image sequence denoising via sparse and redundant representations". In: IEEE Trans. Image Process. 18.1 (2009), pp. 27–35.
[3] G. Peyre. "Sparse modeling of textures". In: J. Math. Imag. Vis. 34.1 (2009), pp. 17–31.
[4] J. Mairal, M. Elad, and G. Sapiro. "Sparse representation for color image restoration". In: IEEE Trans. Image Process. 17.1 (2008), pp. 53–69.
[5] J. Mairal, G. Sapiro, and M. Elad. "Learning multiscale sparse representations for image and video restoration". In: SIAM Multiscale Model. Simul. 7.1 (2008), pp. 214–241.
[6] O. Bryt and M. Elad. "Compression of facial images using the K-SVD algorithm". In: J. Visual Commun. Image Represent. 19.4 (2008), pp. 270–283.
[7] L. Peotta, L. Granai, and P. Vandergheynst. "Image compression using an edge adapted redundant dictionary and wavelets". In: Signal Process. 86.3 (2006), pp. 444–456.
[8] M. J. Fadili, J. L. Starck, and F. Murtagh. "Inpainting and zooming using sparse representations". In: Computer J. 52.1 (2007), pp. 64–79.
[9] J. Mairal et al. "Discriminative learned dictionaries for local image analysis". In: Proc. IEEE Comp. Soc. Conf. Comp. Vis. Pattern Recog. 2008, pp. 1–8.
[10] H. Y. Liao and G. Sapiro. "Sparse representations for limited data tomography". In: Proc. IEEE Int. Symp. Biomed. Imag.: Nano Macro (ISBI). 2008, pp. 1375–1378.
[11] S. Mallat. A Wavelet Tour of Signal Processing. Academic Press, 2008.
[12] E. J. Candes and D. L. Donoho. Curvelets - A Surprisingly Effective Nonadaptive Representation for Objects with Edges. Vanderbilt University Press (Curve and Surface Fitting: St-Malo, pp. 105–120), 1999.
[13] M. N. Do and M. Vetterli. "The contourlet transform: An efficient directional multiresolution image representation". In: IEEE Trans. Image Process. 14.12 (2005), pp. 2091–2106.
[14] D. Labate et al. "Sparse multidimensional representation using shearlets". In: Proc. SPIE: Wavelets XI. 2005, pp. 254–262.
[15] E. Le Pennec and S. Mallat. "Sparse geometric image representations with bandelets". In: IEEE Trans. Image Process. 14.4 (2005), pp. 423–438.
[16] K. Engan, S. O. Aase, and J. H. Husoy. "Method of optimal directions for frame design". In: Proc. IEEE Int. Conf. Acous. Speech Signal Process. 1999, pp. 2443–2446.
[17] M. Aharon, M. Elad, and A. Bruckstein. "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation". In: IEEE Trans. Signal Process. 54.11 (2006), pp. 4311–4322.
[18] S. Lesage et al. "Learning unions of orthonormal bases with thresholded singular value decomposition". In: Proc. IEEE Int. Conf. Acous. Speech Signal Process. 2005, pp. 293–296.
[19] O. G. Sezer, O. Harmanci, and O. G. Guleryuz. "Sparse orthonormal transforms for image compression". In: Proc. IEEE Int. Conf. Image Process. 2008, pp. 149–152.
[20] J. Mairal et al. "Online learning for matrix factorization and sparse coding". In: J. Machine Learning Research 11.1 (2010), pp. 19–60.
[21] K. Skretting and K. Engan. "Recursive least squares dictionary learning algorithm". In: IEEE Trans. Signal Process. 58.4 (2010), pp. 2121–2130.
[22] R. Rubinstein, M. Zibulevsky, and M. Elad. "Double sparsity: Learning sparse dictionaries for sparse signal approximation". In: IEEE Trans. Signal Process. 58.3 (2010), pp. 1553–1564.
[23] J. Zepeda, C. Guillemot, and E. Kijak. "Image compression using sparse representations and the iteration-tuned and aligned dictionary". In: IEEE J. Selected Topics Signal Process. 5.5 (2011), pp. 1061–1073.
[24] S. G. Mallat and Z. Zhang. "Matching pursuits with time-frequency dictionaries". In: IEEE Trans. Signal Process. 41.12 (1993), pp. 3397–3415.
[25] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad. "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition". In: Proc. Asilomar Conf. Signals, Systems, Computers. 1993, pp. 40–44.
[26] S. Chen and D. Donoho. "Basis pursuit". In: Proc. Asilomar Conf. Signals, Systems, Computers. 1994, pp. 41–44.
[27] B. Efron et al. "Least angle regression". In: The Annals of Statistics 32.2 (2004), pp. 407–499.
[28] I. F. Gorodnitsky, J. S. George, and B. D. Rao. "Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm". In: Electroencephalography, Clinical Neurophysiology 95.4 (1995), pp. 231–251.
[29] H. Lee et al. "Efficient sparse coding algorithms". In: Adv. Neural Information Process. Sys. 2006, pp. 801–808.
[30] A. M. Tillmann. "On the computational intractability of exact and approximate dictionary learning". In: IEEE Signal Process. Letters 22.1 (2015), pp. 45–49.
[31] D. L. Donoho. "For most large underdetermined systems of linear equations the minimal L1-norm solution is also the sparsest solution". In: Comm. Pure Applied Math. 59.6 (2006), pp. 797–829.