Single Channel Phase-Aware Signal Processing in Speech Communication: Theory and Practice PDF

254 Pages·2016·5.996 MB·English

Preview Single Channel Phase-Aware Signal Processing in Speech Communication: Theory and Practice

Single Channel Phase-Aware Signal Processing in Speech Communication: Theory and Practice

Pejman Mowlaee
Josef Kulmer
Johannes Stahl
Florian Mayer
Graz University of Technology, Austria

This edition first published 2017
© 2017 by John Wiley & Sons, Ltd

Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.

Library of Congress Cataloging-in-Publication Data
Names: Mowlaee, Pejman, 1983- author. | Kulmer, Josef, author. | Stahl, Johannes, 1989- author. | Mayer, Florian, 1986- author.
Title: Single channel phase-aware signal processing in speech communication: theory and practice / [compiled and written by] Pejman Mowlaee, Josef Kulmer, Johannes Stahl, Florian Mayer.
Description: Chichester, UK; Hoboken, NJ: John Wiley & Sons, Inc., 2016. | Includes bibliographical references and index.
Identifiers: LCCN 2016024931 (print) | LCCN 2016033469 (ebook) | ISBN 9781119238812 (cloth) | ISBN 9781119238829 (pdf) | ISBN 9781119238836 (epub)
Subjects: LCSH: Speech processing systems. | Signal processing. | Oral communication. | Phase modulation.
Classification: LCC TK7882.S65 S575 2016 (print) | LCC TK7882.S65 (ebook) | DDC 006.4/54–dc23
LC record available at https://lccn.loc.gov/2016024931

A catalogue record for this book is available from the British Library.

Cover Image: Gettyimages/lestyan4
Set in 10/12pt, Warnock Pro by SPi Global, Chennai, India.

Contents

About the Authors
Preface
List of Symbols

Part I History, Theory and Concepts

1 Introduction: Phase Processing, History
Pejman Mowlaee
  1.1 Chapter Organization
  1.2 Conventional Speech Communication
  1.3 Historical Overview of the Importance or Unimportance of Phase
  1.4 Importance of Phase in Speech Processing
    1.4.1 Speech Enhancement
      1.4.1.1 Unimportance of Phase in Speech Enhancement
      1.4.1.2 Effects of Phase Modification in Speech Signals
      1.4.1.3 Phase Spectrum Compensation
      1.4.1.4 Phase Importance for Improved Signal Reconstruction
    1.4.2 Speech Watermarking
    1.4.3 Speech Coding
    1.4.4 Artificial Bandwidth Extension
    1.4.5 Speech Synthesis
    1.4.6 Speech/Speaker Recognition
  1.5 Structure of the Book
  1.6 Experiments
    1.6.1 Experiment 1.1: Phase Unimportance in Speech Enhancement
    1.6.2 Experiment 1.2: Effects of Phase Modification
    1.6.3 Experiment 1.3: Mismatched Window
    1.6.4 Experiment 1.4: Phase Spectrum Compensation
  1.7 Summary
  References

2 Fundamentals of Phase-Based Signal Processing
Pejman Mowlaee
  2.1 Chapter Organization
  2.2 STFT Phase: Background and Some Remarks
    2.2.1 Short-Time Fourier Transform
    2.2.2 Fourier Analysis of Speech: STFT Amplitude and Phase
  2.3 Phase Unwrapping
    2.3.1 Problem Definition
    2.3.2 Remarks on Phase Unwrapping
    2.3.3 Phase Unwrapping Solutions
      2.3.3.1 Detecting Discontinuities
      2.3.3.2 Numerical Integration (NI)
      2.3.3.3 Isolating Sharp Zeros
      2.3.3.4 Iterative Phase Unwrapping
      2.3.3.5 Polynomial Factorization (PF)
      2.3.3.6 Time Series Approach
      2.3.3.7 Composite Method
      2.3.3.8 Schur–Cohn and Nyquist Frequency
  2.4 Useful Phase-Based Representations
    2.4.1 Group Delay Representations
    2.4.2 Instantaneous Frequency
    2.4.3 Baseband Phase Difference
    2.4.4 Harmonic Phase Decomposition
      2.4.4.1 Background on the Harmonic Model
      2.4.4.2 Phase Decomposition using the Harmonic Model
    2.4.5 Phasegram: Unwrapped Harmonic Phase
      2.4.5.1 Definitions and Background
      2.4.5.2 Circular Mean and Variance
    2.4.6 Relative Phase Shift
    2.4.7 Phase Distortion
  2.5 Experiments
    2.5.1 Experiment 2.1: One-Dimensional Phase Unwrapping
      2.5.1.1 Clean Signal Scenario
      2.5.1.2 Noisy Signal Scenario
    2.5.2 Experiment 2.2: Comparative Study of Phase Unwrapping Methods
    2.5.3 Experiment 2.3: Comparative Study on Group Delay Spectra
    2.5.4 Experiment 2.4: Circular Statistics of the Harmonic Phase
    2.5.5 Experiment 2.5: Circular Statistics of the Spectral Phase
    2.5.6 Experiment 2.6: Comparative Study of Phase Representations
  2.6 Summary
  References

3 Phase Estimation Fundamentals
Josef Kulmer and Pejman Mowlaee
  3.1 Chapter Organization
  3.2 Phase Estimation Fundamentals
    3.2.1 Background and Fundamentals
    3.2.2 Key Examples: Phase Estimation Problem
      3.2.2.1 Example 1: Discrete-Time Sinusoid
      3.2.2.2 Example 2: Discrete-Time Sinusoid in Noise
    3.2.3 Phase Estimation
      3.2.3.1 Maximum Likelihood Estimation
      3.2.3.2 Maximum a Posteriori Estimation
  3.3 Existing Solutions
    3.3.1 Iterative Signal Reconstruction
      3.3.1.1 Background
      3.3.1.2 Griffin–Lim Algorithm (GLA)
      3.3.1.3 Extensions of the GLA
    3.3.2 Phase Reconstruction Across Time
    3.3.3 Phase Reconstruction Across Frequency
    3.3.4 Phase Randomization
    3.3.5 Geometry-Based Phase Estimation
    3.3.6 Least Squares (LS)
    3.3.7 Spectro-Temporal Smoothing of Unwrapped Phase
      3.3.7.1 Signal Segmentation
      3.3.7.2 Linear Phase Removal
      3.3.7.3 Apply Smoothing Filter
      3.3.7.4 Reconstruction of the Enhanced-Phase Signal
  3.4 Experiments
    3.4.1 Experiment 3.1: Monte Carlo Simulation Comparing ML and MAP
    3.4.2 Experiment 3.2: Monte Carlo Simulation on Window Impact
    3.4.3 Experiment 3.3: Phase Recovery Using the Griffin–Lim Algorithm
    3.4.4 Experiment 3.4: Phase Estimation for Speech Enhancement: A Comparative Study
  3.5 Summary
  References

Part II Applications

4 Phase Processing for Single-Channel Speech Enhancement
Johannes Stahl and Pejman Mowlaee
  4.1 Introduction and Chapter Organization
  4.2 Speech Enhancement in the STFT Domain: General Concepts
    4.2.1 A priori SNR Estimation
      4.2.1.1 Decision-Directed a priori SNR Estimation
      4.2.1.2 Cepstro-Temporal Smoothing
    4.2.2 Noise PSD Estimation
      4.2.2.1 Minimum Statistics
  4.3 Conventional Speech Enhancement
    4.3.1 Statistical Model
    4.3.2 Short-Time Spectral Amplitude Estimation
  4.4 Phase-Sensitive Speech Enhancement
    4.4.1 Phase Estimation for Signal Reconstruction
    4.4.2 Spectral Amplitude Estimation Given the STFT Phase
    4.4.3 Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement
    4.4.4 Incorporating Voiced/Unvoiced Uncertainty
    4.4.5 Uncertainty in Prior Phase Information
    4.4.6 Stochastic–Deterministic MMSE-STFT Speech Enhancement
      4.4.6.1 Obtaining the Speech Parameters
  4.5 Experiments
    4.5.1 Experiment 4.1: Proof of Concept
    4.5.2 Experiment 4.2: Consistency
    4.5.3 Experiment 4.3: Sensitivity Analysis
  4.6 Summary
  References

5 Phase Processing for Single-Channel Source Separation
Pejman Mowlaee and Florian Mayer
  5.1 Chapter Organization
  5.2 Why Single-Channel Source Separation?
    5.2.1 Background
    5.2.2 Problem Formulation
  5.3 Conventional Single-Channel Source Separation
    5.3.1 Source-Driven SCSS
      5.3.1.1 Ideal Binary Mask
      5.3.1.2 Ideal Ratio Mask
    5.3.2 Model-Based SCSS
      5.3.2.1 Deep Learning
      5.3.2.2 Non-Negative Matrix Factorization
  5.4 Phase Processing for Single-Channel Source Separation
    5.4.1 Complex Matrix Factorization Methods
      5.4.1.1 Complex Matrix Factorization
      5.4.1.2 Complex Matrix Factorization with Intra-Source Additivity
    5.4.2 Phase Importance for Signal Reconstruction
      5.4.2.1 Multiple Input Spectrogram Inversion
      5.4.2.2 Partial Phase Reconstruction
      5.4.2.3 Informed Source Separation Using Iterative Reconstruction (ISSIR)
      5.4.2.4 Sinusoidal-Based PPR
      5.4.2.5 Spectrogram Consistency
      5.4.2.6 Geometry-Based Phase Estimation
      5.4.2.7 Phase Decomposition and Temporal Smoothing
      5.4.2.8 Phase Reconstruction of Spectrograms with Linear Unwrapping
    5.4.3 Phase-Aware Time–Frequency Masks
      5.4.3.1 Phase-Insensitive Masks
      5.4.3.2 Phase-Sensitive Mask
      5.4.3.3 Complex Ratio Mask
      5.4.3.4 Complex Mask
    5.4.4 Phase Importance in Signal Interaction Models
  5.5 Experiments
    5.5.1 Experiment 5.1: Phase Estimation for Proof-of-Concept Signal Reconstruction
