Speech Enhancement Techniques for Digital Hearing Aids

Komal R. Borisagar · Rohit M. Thanki · Bhavin S. Sedani

Komal R. Borisagar
E.C. Department, Atmiya Institute of Technology and Science
Rajkot, Gujarat, India

Rohit M. Thanki
C.U. Shah University
Wadhwan City, Gujarat, India

Bhavin S. Sedani
E.C. Department, L.D. Engineering College
Ahmedabad, India

ISBN 978-3-319-96820-9
ISBN 978-3-319-96821-6 (eBook)
https://doi.org/10.1007/978-3-319-96821-6
Library of Congress Control Number: 2018952353

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

The interference of background noise is the greatest problem reported by hearing aid wearers.
Background noise is a combination of different frequencies; these unwanted frequencies reduce the precision of speech. A higher level of background noise degrades intelligibility. The speech signal is a quasi-periodic signal. If the speech signal is masked by noise, its intelligibility is reduced significantly, and it becomes difficult to understand what is being said. Moreover, the speech signal carries many redundant samples, which mask a portion of the noise. A noisy signal cannot easily be detected by a person with a hearing disorder. Extracting intelligible speech from a noisy signal is a frustrating process for those with hearing disorders, whereas people with normal hearing can do this easily.

A person with a hearing impairment may find it cumbersome to identify specific frequencies in the presence of noise that contains basic frequencies. In that case, the speech signal is not audible; thus, it is not possible to interpret speech, and this may cause problems in everyday life. Any product associated with hearing aids needs to be designed so that the effect of the noise is reduced before any further modification. It is vital to observe the sound quality with the background noise.

Deafness is a conical disability due to either a sensorineural defect in which cells are dead because of age, or a major disease. Deafness can also be caused by problems with bone and air conduction. One method of treating hearing disorders is a cochlear implant, but this requires surgery. As an alternative, most people wear hearing aids instead. Enhancement of noisy speech is possible in the case of hearing aids. Generally, different noises affect various frequencies of the speech signal. Fixed filters can help a great deal when it comes to removing an unwanted noise frequency. However, there can be many variations of noise frequency, and over time these variations may degrade the performance of a fixed filter.
It can be seen that, along with the unwanted signal, the speech component is also affected. Under that constraint, filtering techniques that act only on the incoming noise signal are required. In addition, the filtering process should adapt to the noise characteristics.

Even after the removal of noise from speech using advanced adaptive filtering methods, most partially deaf patients are not able to recognize all frequencies equally. The frequency response of the patient's ear gives proper information about losses at a particular frequency. Basically, speech with the noise removed should be enhanced as per the audiogram structure of the individual. Clean speech is the name given to the enhancement in the individual frequency band in the wavelet domain. The audiogram requires individual frequency components that are sometimes less sensitive; thus only bands which are amplified by volume can process speech using the multi-resolution approach of a discrete wavelet transform. This book presents novel approaches, such as adaptive filtering and frequency band enhancement in the wavelet domain, that assist speech enhancement as well as noise reduction in speech signals.

The book explains how noise can be extracted from the silent part of the speech signal. The various types of adaptive filter, such as the least mean square (LMS), normalized LMS (NLMS), and recursive least squares (RLS) filters, are described for noise reduction from the noisy speech signal along with voice activity detection (VAD). Different types of practically occurring noise are generated to prepare the noisy speech signals, and the speech can be cleaned effectively. Prediction of the noise signal is the main task in the present work. The methods mentioned are compared based on various parameters such as the amount of noise reduction, the estimation of filter weights, the convergence rate of the filters, mean square error (MSE), and peak signal-to-noise ratio (PSNR). The presented methods provide significant results in noise reduction, which can be observed in the time domain.
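As a rough illustration of the adaptive noise reduction and the MSE/PSNR comparison mentioned in this preface, the following is a minimal LMS noise-canceller sketch. It is not the authors' implementation: the function names, the tap count, the step size, the synthetic sine "speech", the FIR noise path, and the PSNR definition (peak of the clean signal squared over the MSE) are all assumptions chosen for illustration, and it presumes that a separate reference noise input is available.

```python
import math
import random


def lms_noise_canceller(noisy, reference, num_taps=8, step_size=0.01):
    """Subtract an adaptively filtered reference noise from the noisy signal."""
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent reference samples, newest first
    enhanced = []
    for d, x in zip(noisy, reference):
        window = [x] + window[:-1]
        noise_estimate = sum(w * u for w, u in zip(weights, window))
        error = d - noise_estimate  # the error doubles as the enhanced output
        # LMS weight update: w <- w + mu * e(n) * u(n)
        weights = [w + step_size * error * u for w, u in zip(weights, window)]
        enhanced.append(error)
    return enhanced


def mean_square_error(signal, target):
    return sum((s - t) ** 2 for s, t in zip(signal, target)) / len(target)


def psnr_db(signal, target):
    # Assumed definition: peak of the clean target squared, divided by the MSE.
    peak = max(abs(t) for t in target)
    return 10.0 * math.log10(peak ** 2 / mean_square_error(signal, target))


random.seed(0)
n = 4000
# Synthetic stand-in for speech: a slow sine wave (assumption, not real speech).
clean = [math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
# White reference noise, as if picked up by a second microphone.
reference = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical FIR path from the noise source to the primary microphone.
path = [0.6, 0.3, -0.2]
noise = [
    sum(path[k] * reference[i - k] for k in range(len(path)) if i - k >= 0)
    for i in range(n)
]
noisy = [s + v for s, v in zip(clean, noise)]

enhanced = lms_noise_canceller(noisy, reference)

# Compare on the second half only, after the weights have had time to adapt.
half = n // 2
mse_before = mean_square_error(noisy[half:], clean[half:])
mse_after = mean_square_error(enhanced[half:], clean[half:])
psnr_before = psnr_db(noisy[half:], clean[half:])
psnr_after = psnr_db(enhanced[half:], clean[half:])
print(f"MSE  before: {mse_before:.4f}  after: {mse_after:.4f}")
print(f"PSNR before: {psnr_before:.2f} dB  after: {psnr_after:.2f} dB")
```

The step size trades convergence speed against residual error: a larger value adapts faster but leaves more weight jitter in the steady state, which is one reason normalized variants such as NLMS are often preferred when the input power varies.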
The reconstructed speech can also be observed with the noise removed and proper intelligibility.

This book is based on the Ph.D. research work, and extensions of it, of Dr. Komal Borisagar, submitted to the Department of Electronics and Communication Engineering, Shri Jagdish Prasad Jhabarmal Tibrewal University (JJTU), Jhunjhunu, Rajasthan in 2012. The authors are indebted to numerous colleagues for valuable suggestions during the entire period of the manuscript's preparation. We would also like to thank the publishers at Springer, in particular Mary E. James, senior publishing editor/CS Springer, for their helpful guidance and encouragement during the creation of this book.

Rajkot, Gujarat, India          Komal R. Borisagar
Wadhwan City, Gujarat, India    Rohit M. Thanki
Ahmedabad, India                Bhavin S. Sedani

Contents

1 Introduction
  1.1 Sound
  1.2 Ear Structure and Its Workings
    1.2.1 External Ear
    1.2.2 Middle Ear
    1.2.3 Inner Ear
    1.2.4 Cochlea
  1.3 Hearing Impaired
  1.4 Audiogram
  1.5 Digital Hearing Aids
  1.6 Issues in Digital Hearing Aids
  1.7 Motivation for This Book
    1.7.1 Important Areas of Speech Signal Covered in This Book
  1.8 Book Organization
  References
2 Generation of Speech Signal and Its Characteristics
  2.1 Speech Signal
    2.1.1 Articulatory Phonetics and Speech Generation
    2.1.2 Anatomy and Physiology of Speech Generation
    2.1.3 Vocal Tract
    2.1.4 Larynx and Vocal Folds or Cords
  2.2 Major Features of Speech Articulation
  2.3 Properties and Characteristics of Speech Signal
    2.3.1 Time and Frequency Domain Characteristics of Speech
    2.3.2 Waveforms
    2.3.3 Fundamental Frequency
    2.3.4 Overall Power
    2.3.5 Overall Frequency Spectrum
    2.3.6 Short-Time Energy
    2.3.7 Spectrogram
    2.3.8 Short-Time Average Zero Crossing Rate
  References
3 Introduction of Adaptive Filters and Noises for Speech
  3.1 Adaptive Filter
  3.2 LMS Adaptive Filter
    3.2.1 Least Mean Square Adaptation Algorithm
    3.2.2 Statistical LMS Theory
    3.2.3 Direct Averaging Method
    3.2.4 Small Step Size Statistical Theory
    3.2.5 Natural Modes of the LMS Filter
    3.2.6 Learning Curves for Adaptive Algorithms
    3.2.7 Comparison of the LMS Algorithm with the Steepest Descent Algorithm
  3.3 Normalized Least Mean Square (NLMS) Adaptive Filter
    3.3.1 Structure and Operation of NLMS
    3.3.2 Stability of the Normalized LMS Filter
    3.3.3 Special Environment of Real Valued Data
  3.4 Recursive Least Squares (RLS) Adaptive Filter
    3.4.1 Regularization
    3.4.2 Reformulation of the Normal Equations
    3.4.3 Recursive Computations of Φ(n) and z(n)
    3.4.4 The Matrix Inversion Lemma
    3.4.5 Selection of the Regularization Parameter
    3.4.6 Convergence Analysis of RLS Algorithm
    3.4.7 Convergence of the RLS Algorithm in the Mean Value
    3.4.8 Mean Square Deviation of the RLS Algorithm
    3.4.9 Ensemble Average Learning Curve of the RLS Algorithm
  3.5 Noise
    3.5.1 Sources of Noise
  References
4 Fourier Transform, Short-Time Fourier Transform, and Wavelet Transform
  4.1 Fourier Transform (FT)
  4.2 Short-Time FT
  4.3 Wavelet Transform (WT)
  4.4 Comparison of the Wavelet Transform (WT) with FT and STFT
  4.5 Multiresolution Algorithm
  References
5 Speech Signal Enhancement Using Adaptive Filters
  5.1 Introduction
  5.2 Steps for Speech Enhancement Process
  5.3 Implementation Flow of VAD Algorithm
  5.4 Speech Enhancement Process Based on LMS Algorithm
    5.4.1 Results for White Noise Signal
    5.4.2 Results for Babble Noise Signal
    5.4.3 Results for Traffic Jam Noise Signal
  5.5 Speech Enhancement Process Based on the NLMS Algorithm
    5.5.1 Results for White Noise Signal
    5.5.2 Results for Babble Noise Signal
    5.5.3 Results for Traffic Jam Noise Signal
  5.6 Speech Enhancement Process Based on the RLS Algorithm
    5.6.1 Results for White Noise Signal
    5.6.2 Results for Babble Noise Signal
    5.6.3 Results for Traffic Jam Noise Signal
  5.7 Comparative Analysis of Simulation Results
  References
6 Speech Signal Enhancement Based on Wavelet Transform
  6.1 Procedure for Speech Signal Enhancement Using Wavelet Transform
  6.2 Implementation and Results of Speech Signal Enhancement Using Wavelet Transform
    6.2.1 First Band Enhancement
    6.2.2 Second Band Enhancement
    6.2.3 Third Band Enhancement
    6.2.4 Fourth Band Enhancement
    6.2.5 Fifth Band Enhancement
    6.2.6 Sixth Band Enhancement
    6.2.7 Seventh Band Enhancement
    6.2.8 Eighth Band Enhancement
    6.2.9 Ninth Band Enhancement
  References
7 Summary of This Book and Future Research Directions
  7.1 Important Points Covered in the Book
  7.2 Future Research Direction
Index

Chapter 1
Introduction

1.1 Sound

A sound wave is described as being similar to ripples on the surface of the sea. A "wave" in the atmosphere contains much less variation in pressure than normal atmospheric variations. An acoustic or sound wave as a source generates the compression and rarefaction of particles. Usually, sound is generated by a vibrating entity such as a violin string, a loudspeaker diaphragm, the motor in a machine, or the vocal cords in a human body. Also, sound cannot travel in a vacuum.
The eardrum vibrates in direct response to pressure variations in a wave when they reach the ear, and the pressure fluctuations are heard as a sound.

1.2 Ear Structure and Its Workings

Human ears are found in pairs, situated on the left and right sides of the head, and also can be considered as sensory organs comprising the auditory system, which detects sound, and the vestibular system, which is responsible for maintaining body balance and equilibrium [1]. The basic structure of the ear is shown in Fig. 1.1 [1]. The ear can be divided into three parts anatomically, known as the external ear, middle ear, and inner ear. The sound collection and amplifying mechanisms of the ear are such that

• It can work as a transducer, which converts sound vibration into action potentials.
• The action potential can be delivered by the nerves.

© Springer Nature Switzerland AG 2019
K. R. Borisagar et al., Speech Enhancement Techniques for Digital Hearing Aids, https://doi.org/10.1007/978-3-319-96821-6_1
