EURASIP Journal on Advances in Signal Processing

Microphone Array Speech Processing

Guest Editors: Sven Nordholm, Thushara Abhayapala, Simon Doclo, Sharon Gannot, Patrick Naylor, and Ivan Tashev

Copyright © 2010 Hindawi Publishing Corporation. All rights reserved.

This is a special issue published in volume 2010 of “EURASIP Journal on Advances in Signal Processing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Editor-in-Chief
Phillip Regalia, Institut National des Télécommunications, France

Associate Editors
Adel M. Alimi, Tunisia
Kenneth Barner, USA
Yasar Becerikli, Turkey
Kostas Berberidis, Greece
Enrico Capobianco, Italy
A. Enis Cetin, Turkey
Jonathon Chambers, UK
Mei-Juan Chen, Taiwan
Liang-Gee Chen, Taiwan
Satya Dharanipragada, USA
Kutluyil Dogancay, Australia
Florent Dupont, France
Frank Ehlers, Italy
Sharon Gannot, Israel
Samanwoy Ghosh-Dastidar, USA
Norbert Goertz, Austria
M. Greco, Italy
Irene Y. H. Gu, Sweden
Fredrik Gustafsson, Sweden
Ulrich Heute, Germany
Sangjin Hong, USA
Jiri Jan, Czech Republic
Magnus Jansson, Sweden
Sudharman K. Jayaweera, USA
Soren Holdt Jensen, Denmark
Mark Kahrs, USA
Moon Gi Kang, South Korea
Walter Kellermann, Germany
Lisimachos P. Kondi, Greece
Alex Chichung Kot, Singapore
Ercan E. Kuruoglu, Italy
Tan Lee, China
Geert Leus, The Netherlands
T.-H. Li, USA
Husheng Li, USA
Mark Liao, Taiwan
Y.-P. Lin, Taiwan
Shoji Makino, Japan
Stephen Marshall, UK
C. Mecklenbräuker, Austria
Gloria Menegaz, Italy
Ricardo Merched, Brazil
Marc Moonen, Belgium
Christophoros Nikou, Greece
Sven Nordholm, Australia
Patrick Oonincx, The Netherlands
Douglas O’Shaughnessy, Canada
Björn Ottersten, Sweden
Jacques Palicot, France
Ana Perez-Neira, Spain
Wilfried R. Philips, Belgium
Aggelos Pikrakis, Greece
Ioannis Psaromiligkos, Canada
Athanasios Rontogiannis, Greece
Gregor Rozinaj, Slovakia
Markus Rupp, Austria
William Sandham, UK
B. Sankur, Turkey
Erchin Serpedin, USA
Ling Shao, UK
Dirk Slock, France
Yap-Peng Tan, Singapore
João Manuel R. S. Tavares, Portugal
George S. Tombras, Greece
Dimitrios Tzovaras, Greece
Bernhard Wess, Austria
Jar-Ferr Yang, Taiwan
Azzedine Zerguine, Saudi Arabia
Abdelhak M. Zoubir, Germany

Contents

Microphone Array Speech Processing, Sven Nordholm, Thushara Abhayapala, Simon Doclo, Sharon Gannot (EURASIP Member), Patrick Naylor, and Ivan Tashev
Volume 2010, Article ID 694216, 3 pages

Selective Frequency Invariant Uniform Circular Broadband Beamformer, Xin Zhang, Wee Ser, Zhang Zhang, and Anoop Kumar Krishna
Volume 2010, Article ID 678306, 11 pages

First-Order Adaptive Azimuthal Null-Steering for the Suppression of Two Directional Interferers, René M. M. Derkx
Volume 2010, Article ID 230864, 16 pages

Musical-Noise Analysis in Methods of Integrating Microphone Array and Spectral Subtraction Based on Higher-Order Statistics, Yu Takahashi, Hiroshi Saruwatari, Kiyohiro Shikano, and Kazunobu Kondo
Volume 2010, Article ID 431347, 25 pages

Microphone Diversity Combining for In-Car Applications, Jürgen Freudenberger, Sebastian Stenzel, and Benjamin Venditti
Volume 2010, Article ID 509541, 13 pages

DOA Estimation with Local-Peak-Weighted CSP, Osamu Ichikawa, Takashi Fukuda, and Masafumi Nishimura
Volume 2010, Article ID 358729, 9 pages

Shooter Localization in Wireless Microphone Networks, David Lindgren, Olof Wilsson, Fredrik Gustafsson, and Hans Habberstad
Volume 2010, Article ID 690732, 11 pages

Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2010, Article ID 694216, 3 pages
doi:10.1155/2010/694216

Editorial
Microphone Array Speech Processing

Sven Nordholm (EURASIP Member),1 Thushara Abhayapala (EURASIP Member),2 Simon Doclo (EURASIP Member),3 Sharon Gannot (EURASIP Member),4 Patrick Naylor (EURASIP Member),5 and Ivan Tashev6

1 Department of Electrical and Computer Engineering, Curtin University of Technology, Perth, WA 6845, Australia
2 College of Engineering & Computer Science, The Australian National University, Canberra, ACT 0200, Australia
3 Institute of Physics, Signal Processing Group, University of Oldenburg, 26111 Oldenburg, Germany
4 School of Engineering, Bar-Ilan University, 52900 Tel Aviv, Israel
5 Department of Electrical and Electronic Engineering, Imperial College, London SW7 2AZ, UK
6 Microsoft Research, USA

Correspondence should be addressed to Sven Nordholm, [email protected]

Received 21 July 2010; Accepted 21 July 2010

Copyright © 2010 Sven Nordholm et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Significant knowledge about microphone arrays has been gained from years of intense research and product development. Numerous applications have been suggested, ranging from large arrays (on the order of >100 elements) for use in auditoriums to small arrays with only 2 or 3 elements for hearing aids and mobile telephones. Apart from that, microphone array technology has been widely applied in speech recognition, surveillance, and warfare. Traditional techniques used for microphone arrays include fixed spatial filters, such as frequency invariant beamformers, and optimal and adaptive beamformers. These array techniques assume either model knowledge or calibration signal knowledge, as well as localization information, for their design. Thus they usually combine some form of localisation and tracking with the beamforming. Today, contemporary techniques using blind signal separation (BSS) and time-frequency masking have attracted significant attention. Those techniques are less reliant on the array model and localization, and more on the statistical properties of speech signals such as sparseness, non-Gaussianity, and non-stationarity. The main advantage that multiple microphones add from a theoretical perspective is spatial diversity, which is an effective tool to combat interference, reverberation, and noise. The underpinning physical feature used is a difference in coherence in the target field (speech signal) versus the noise field. Viewing the processing in this way, one can also understand the difficulty in enhancing highly reverberant speech, given that we can only observe the received microphone signals.

This special issue contains contributions to traditional areas of research such as frequency invariant beamforming [1], hands-free operation of microphone arrays in cars [2], and source localisation [3]. The contributions show new ways to study these traditional problems and give new insights into them. Small arrays have always attracted many applications and much interest for mobile terminals, hearing aids, and close-up microphones [4]. The novel way of representing small arrays leads to a capability to suppress multiple interferers. Abnormalities in noise and speech stemming from processing are largely unavoidable, and nonlinear processing often results in a significant change of character, particularly of the noise. It is thus important to provide new insights into those phenomena, particularly the so-called musical noise [5]. Finally, new and unusual uses of microphone arrays are always interesting to see. Distributed microphone arrays in a sensor network [6] provide a novel approach to finding snipers. This type of processing has good opportunities to grow in interest for new and improved applications.

The contributions found in this special issue can be categorized into three main aspects of microphone array processing: (i) microphone array design based on eigenmode decomposition [1, 4]; (ii) multichannel processing methods [2, 5]; and (iii) source localisation [3, 6].

The paper by Zhang et al., “Selective frequency invariant uniform circular broadband beamformer” [1], describes a design method for Frequency-Invariant (FI) beamforming. FI beamforming is a well-known array signal processing technique used in many applications such as speech acquisition, acoustic imaging, and communications. However, many existing FI beamformers are designed to have a frequency invariant gain over all angles. This might not be necessary, and if the gain constraint is confined to a specific angle, then the FI performance over that selected region (in frequency and angle) can be expected to improve. Inspired by this idea, the proposed algorithm attempts to optimize the frequency invariant beampattern solely for the mainlobe and to relax the FI requirement on the sidelobes. This sacrifice in performance in the undesired region is traded off for better performance in the desired region as well as a reduced number of microphones. The objective function is designed to minimize the overall spatial response of the beamformer with a constraint on the gain being smaller than a predefined threshold value across a specific frequency range and at a specific angle. This problem is formulated as a convex optimization problem and the solution is obtained by using the Second-Order Cone Programming (SOCP) technique. An analysis of the computational complexity of the proposed algorithm is presented as well as its performance. The performance is evaluated via computer simulation for different numbers of sensors and different threshold values. Simulation results show that the proposed algorithm is able to achieve a smaller mean square error of the spatial response gain for the specific FI region compared to existing algorithms.

The paper by Derkx, “First-order adaptive azimuthal null-steering for the suppression of two directional interferers” [4], shows that an azimuth-steerable first-order superdirectional microphone response can be constructed by a linear combination of three eigenbeams: a monopole and two orthogonal dipoles. Although a (rotation symmetric) first-order response can only exhibit a single null, the paper studies a slice through this beampattern lying in the azimuthal plane. In this way, a maximum of two nulls in the azimuthal plane can be defined. These nulls are symmetric with respect to the main-lobe axis. By placing these two nulls on at most two directional sources to be rejected and compensating for the drop in level for the desired direction, these directional sources can be effectively rejected without attenuating the desired source. An adaptive null-steering scheme for adjusting the beampattern, which enables automatic source suppression, is presented. Closed-form expressions for this optimal null-steering are derived, enabling the computation of the azimuthal angles of the interferers. It is shown that the proposed technique has a good directivity index when the angular difference between the desired source and each directional interferer is at least 90 degrees.

In the paper by Takahashi et al., “Musical noise analysis in methods of integrating microphone array and spectral subtraction based on higher-order statistics” [5], an objective analysis of musical noise is conducted. The musical noise is generated by two methods of integrating microphone array signal processing and spectral subtraction. To obtain better noise reduction, methods of integrating microphone array signal processing and nonlinear signal processing have been researched. However, nonlinear signal processing often generates musical noise. Since such musical noise causes discomfort to users, it is desirable that musical noise is mitigated. Moreover, it has been recently reported that higher-order statistics are strongly related to the amount of musical noise generated. This implies that it is possible to optimize the integration method from the viewpoint of not only noise reduction performance but also the amount of musical noise generated. Thus, the simplest methods of integration, that is, the delay-and-sum beamformer and spectral subtraction, are analysed and the features of musical noise generated by each method are clarified. As a result, it is clarified that a specific structure of integration is preferable from the viewpoint of the amount of generated musical noise. The validity of the analysis is shown via a computer simulation and a subjective evaluation.

The paper by Freudenberger et al., “Microphone diversity combining for in-car applications” [2], proposes a frequency domain diversity approach for two or more microphone signals, for example, for in-car applications. The microphones should be positioned separately to ensure diverse signal conditions and incoherent recording of noise. This enables a better compromise for the microphone position with respect to different speaker sizes and noise sources. This work proposes a two-stage approach: in the first stage, the microphone signals are weighted with respect to their signal-to-noise ratio and then summed, similar to maximum-ratio combining. The combined signal is then used as a reference for a frequency domain least-mean-squares (LMS) filter for each input signal. The output SNR is significantly improved compared to coherence-based noise reduction systems, even if one microphone is heavily corrupted by noise.

The paper by Ichikawa et al., “DOA estimation with local-peak-weighted CSP” [3], proposes a novel weighting algorithm for Cross-power Spectrum Phase (CSP) analysis to improve the accuracy of direction of arrival (DOA) estimation for beamforming in a noisy environment. A human speaker is used as the sound source, and broadband automobile noise as the noise source. The harmonic structures in the human speech spectrum can be used for weighting the CSP analysis, because harmonic bins must contain more speech power than the others and thus give us more reliable information. However, most conventional methods leveraging harmonic structures require pitch estimation with voiced-unvoiced classification, which is not sufficiently accurate in noisy environments. The suggested approach employs the observed power spectrum, which is directly converted into weights for the CSP analysis by retaining only the local peaks considered to be coming from a harmonic structure. The presented results show that the proposed approach significantly reduces the errors in localization, and it also shows further improvement when used with other weighting algorithms.

The paper by Lindgren et al., “Shooter localization in wireless microphone networks” [6], is an interesting combination of microphone array technology with distributed communications. By detecting the muzzle blast as well as the ballistic shock wave, the microphone array algorithm is able to locate the shooter in the case when the sensors are synchronized. However, in the distributed sensor case, synchronization is either not achievable or very expensive to achieve, and therefore the accuracy of localization comes into question. Field trials are described to support the algorithmic development.

Sven Nordholm
Thushara Abhayapala
Simon Doclo
Sharon Gannot
Patrick Naylor
Ivan Tashev

References

[1] X. Zhang, W. Ser, Z. Zhang, and A. K. Krishna, “Selective frequency invariant uniform circular broadband beamformer,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 678306, 11 pages, 2010.
[2] J. Freudenberger, S. Stenzel, and B. Venditti, “Microphone diversity combining for in-car applications,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 509541, 13 pages, 2010.
[3] O. Ichikawa, T. Fukuda, and M. Nishimura, “DOA estimation with local-peak-weighted CSP,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 358729, 9 pages, 2010.
[4] R. M. M. Derkx, “First-order adaptive azimuthal null-steering for the suppression of two directional interferers,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 230864, 16 pages, 2010.
[5] Yu. Takahashi, H. Saruwatari, K. Shikano, and K. Kondo, “Musical-noise analysis in methods of integrating microphone array and spectral subtraction based on higher-order statistics,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 431347, 25 pages, 2010.
[6] D. Lindgren, O. Wilsson, F. Gustafsson, and H. Habberstad, “Shooter localization in wireless sensor networks,” in Proceedings of the 12th International Conference on Information Fusion (FUSION ’09), pp. 404–411, July 2009.

Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2010, Article ID 678306, 11 pages
doi:10.1155/2010/678306

Research Article
Selective Frequency Invariant Uniform Circular Broadband Beamformer

Xin Zhang,1 Wee Ser,1 Zhang Zhang,1 and Anoop Kumar Krishna2

1 Center for Signal Processing, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798
2 EADS Innovation Works, EADS Singapore Pte Ltd., No. 41, Science Park Road, 01-30, Singapore 117610

Correspondence should be addressed to Xin Zhang, zhang [email protected]

Received 16 April 2009; Revised 24 August 2009; Accepted 3 December 2009

Academic Editor: Thushara Abhayapala

Copyright © 2010 Xin Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Frequency-Invariant (FI) beamforming is a well-known array signal processing technique used in many applications. In this paper, an algorithm is proposed that attempts to optimize the frequency invariant beampattern solely for the mainlobe and to relax the FI requirement on the sidelobes. This sacrifice in performance in the undesired region is traded off for better performance in the desired region as well as a reduced number of microphones. The objective function is designed to minimize the overall spatial response of the beamformer with a constraint on the gain being smaller than a predefined threshold value across a specific frequency range and at a specific angle. This problem is formulated as a convex optimization problem and the solution is obtained by using the Second-Order Cone Programming (SOCP) technique. An analysis of the computational complexity of the proposed algorithm is presented as well as its performance. The performance is evaluated via computer simulation for different numbers of sensors and different threshold values. Simulation results show that the proposed algorithm is able to achieve a smaller mean square error of the spatial response gain for the specific FI region compared to existing algorithms.

1. Introduction

Broadband beamforming techniques using an array of microphones have been applied widely in hearing aids, teleconferencing, and voice-activated human-computer interface applications. Several broadband beamformer designs have been reported in the literature [1–3]. One design approach is to decompose the broadband signal into several narrowband signals and apply narrowband beamforming techniques to each narrowband signal [4]. This approach requires several narrowband processing operations to be conducted simultaneously and is computationally expensive. Another design approach is to use adaptive broadband beamformers. Such techniques use a bank of linear transversal filters to generate the desired beampattern. The filter coefficients can be derived adaptively from the received signals. One classic design example is the Frost Beamformer [5]. However, in order to have a similar beampattern over the entire frequency range, a large number of sensors and filter taps is needed. This again leads to high computational complexity. The third approach to designing broadband beamformers is to use the Frequency-Invariant (FI) beampattern synthesis technique. As the name implies, such beamformers are designed to have a constant spatial gain response over the desired frequency bands.

Over recent years, FI beamforming techniques have developed at a fast pace. It is difficult to make a distinct classification. However, in order to grasp the literature on FI beamforming at a glance, we classify them loosely into the following three types.

One type of FI beamformer includes those that focus on design based on array geometry. These include, for example, the 3D sensor array design reported in [6], the rectangular sensor array design reported in [7], and the design using subarrays in [8]. In [9], the FI beampattern is achieved by exploiting the relationship among the frequency responses of the various filters implemented at the output of each sensor.

The second type of FI beamformer is designed on the basis of a least-squares approach. For this type of FI beamformer, the weights of the beamformer are optimized such that the error between the actual beampattern and