Nonlinear Analyses and Algorithms for Speech Processing PDF

225 Pages·2016·2.95 MB·English

Preview Nonlinear Analyses and Algorithms for Speech Processing

Thomas Drugman, Thierry Dutoit (Eds.)

Advances in Nonlinear Speech Processing
6th International Conference, NOLISP 2013
Mons, Belgium, June 19-21, 2013
Proceedings

Lecture Notes in Artificial Intelligence 7911
Subseries of Lecture Notes in Computer Science

LNAI Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Yuzuru Tanaka, Hokkaido University, Sapporo, Japan
Wolfgang Wahlster, DFKI and Saarland University, Saarbrücken, Germany

LNAI Founding Series Editor
Joerg Siekmann, DFKI and Saarland University, Saarbrücken, Germany

Volume Editors
Thomas Drugman, Thierry Dutoit
University of Mons, TCTS Lab
31, Boulevard Dolez, 7000 Mons, Belgium
E-mail: {thomas.drugman,thierry.dutoit}@umons.ac.be

ISSN 0302-9743, e-ISSN 1611-3349
ISBN 978-3-642-38846-0, e-ISBN 978-3-642-38847-7
DOI 10.1007/978-3-642-38847-7
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2013939661
CR Subject Classification (1998): I.2.7, I.5, H.5, I.6, G.1
LNCS Sublibrary: SL 7 – Artificial Intelligence

© Springer-Verlag Berlin Heidelberg 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

NOLISP, an ISCA tutorial and workshop on non-linear speech processing, is a biannual event whose aim is to present and discuss new ideas, techniques, and results related to alternative approaches in speech processing that may depart from the mainstream. In order to work at the front-end of the subject area, the following domains of interest have been defined for the NOLISP workshops:

1. Non-linear approximation and estimation
2. Non-linear oscillators and predictors
3. Higher-order statistics
4. Independent component analysis
5. Nearest neighbors
6. Neural networks
7. Decision trees
8. Non-parametric models
9. Dynamics for non-linear systems
10. Fractal methods
11. Chaos modeling
12. Non-linear differential equations

The initiative of organization of NOLISP 2013 at the University of Mons (UMONS) came from the speech processing research group at TCTS Lab. The fact that this was the sixth edition of NOLISP gives evidence that the workshop has already become an established international event.
The Organizing Committee would like to sincerely thank our sponsors Acapela Group and Nuance, as well as FNRS, University of Mons, and ISCA for their financial support.

April 2013
Thomas Drugman
Thierry Dutoit

Table of Contents

Speech and Audio Analysis

Evaluation of Automatic Glottal Source Analysis
  John Kane and Christer Gobl
NMF-Based Spectral Analysis for Acoustic Event Classification Tasks
  Jimmy Ludeña-Choez and Ascensión Gallardo-Antolín
Efficient GCI Detection for Efficient Sparse Linear Prediction
  Vahid Khanagha and Khalid Daoudi
Gender Detection in Running Speech from Glottal and Vocal Tract Correlates
  Cristina Muñoz-Mulas, Rafael Martínez-Olalla, Pedro Gómez-Vilda, Agustín Álvarez-Marquina, and Luis Miguel Mazaira-Fernández
An Efficient Method for Fundamental Frequency Determination of Noisy Speech
  Mohamed Anouar Ben Messaoud, Aïcha Bouzid, and Noureddine Ellouze
Glottal Source Model Selection for Stationary Singing-Voice by Low-Band Envelope Matching
  Fernando Villavicencio
Contribution to the Multipitch Estimation by Multi-scale Product Analysis
  Jihen Zeremdini, Mohamed Anouar Ben Messaoud, Aïcha Bouzid, and Noureddine Ellouze
Speech Signals Parameterization Based on Auditory Filter Modeling
  Youssef Zouhir and Kaïs Ouni
Towards a Better Representation of the Envelope Modulation of Aspiration Noise
  João P. Cabral and Julie Carson-Berndsen

Speech Synthesis

Towards Physically Interpretable Parametric Voice Conversion Functions
  Daniel Erro, Agustín Alonso, Luis Serrano, Eva Navas, and Inma Hernáez
Reduced Search Space Frame Alignment Based on Kullback-Leibler Divergence for Voice Conversion
  Abdoreza Sabzi Shahrebabaki, Jamal Amini, Hamid Sheikhzadeh, Mostafa Ghorbandoost, and Neda Faraji
Average Voice Modeling Based on Unbiased Decision Trees
  Fahimeh Bahmaninezhad, Soheil Khorram, and Hossein Sameti
Non-linear Pitch Modification in Voice Conversion Using Artificial Neural Networks
  Bajibabu Bollepalli, Jonas Beskow, and Joakim Gustafson

Speech-Based Biomedical Applications

Analysis and Quantification of Acoustic Artefacts in Tracheoesophageal Speech
  Thomas Drugman, Myriam Rijckaert, George Lawson, and Marc Remacle
Analysis of Speech from People with Parkinson's Disease through Nonlinear Dynamics
  Juan Rafael Orozco-Arroyave, Julián David Arias-Londoño, Jesús Francisco Vargas-Bonilla, and Elmar Nöth
Synthesis by Rule of Disordered Voices
  Jean Schoentgen and Jorge C. Lucero
Towards a Low-Complex Breathing Monitoring System Based on Acoustic Signals
  Pere Martí-Puig, Jordi Solé-Casals, Gerard Masferrer, and Esteve Gallego-Jutglà
Automatic Detection of Laryngeal Pathologies in Running Speech Based on the HMM Transformation of the Nonlinear Dynamics
  Carlos M. Travieso, Jesús B. Alonso, Juan Rafael Orozco-Arroyave, Jordi Solé-Casals, and Esteve Gallego-Jutglà
Feature Extraction Approach Based on Fractal Dimension for Spontaneous Speech Modelling Oriented to Alzheimer Disease Diagnosis
  Karmele López-de-Ipiña, Harkaitz Egiraun, Jordi Sole-Casals, Miriam Ecay, Aitzol Ezeiza, Nora Barroso, Pablo Martinez-Lage, and Unai Martinez-de-Lizardui

Automatic Speech Recognition

Robust Hierarchical and Sparse Representation of Natural Sounds in High-Dimensional Space
  Simon Brodeur and Jean Rouat
On the Importance of Pre-emphasis and Window Shape in Phase-Based Speech Recognition
  Erfan Loweimi, Seyed Mohammad Ahadi, Thomas Drugman, and Samira Loveymi
Smoothed Nonlinear Energy Operator-Based Amplitude Modulation Features for Robust Speech Recognition
  Md. Jahangir Alam, Patrick Kenny, and Douglas O'Shaughnessy
Fuzzy Phonetic Decoding Method in a Phoneme Recognition Problem
  Lyudmila V. Savchenko and Andrey V. Savchenko
Improved EMD Usable Speech Detection for Co-channel Speaker Identification
  Wajdi Ghezaiel, Amel Ben Slimane, and Ezzedine Ben Braiek

Speech Enhancement

Speech Enhancement: A Multivariate Empirical Mode Decomposition Approach
  Jordi Solé-Casals, Esteve Gallego-Jutglà, Pere Martí-Puig, Carlos M. Travieso, and Jesús B. Alonso
Speech Denoising Based on Empirical Mode Decomposition and Improved Thresholding
  Issaoui Hadhami and Aïcha Bouzid
A Fast Semi-blind Reverberation Time Estimation Using Non-linear Least Squares Method
  Neda Faraji, Seyed Mohammad Ahadi, and Hamid Sheikhzadeh

Author Index
Evaluation of Automatic Glottal Source Analysis

John Kane and Christer Gobl
Phonetics and Speech Laboratory, School of Linguistic, Speech and Communication Sciences, Trinity College Dublin, Ireland

Abstract. This paper documents a comprehensive evaluation carried out on automatic glottal inverse filtering and glottal source parameterisation methods. The experiments consist of analysis of a wide variety of synthetic vowels and assessment of the ability of derived parameters to differentiate breathy to tense voice. One striking finding is that glottal model-based parameters compared favourably to parameters measured directly from the glottal source signal, in terms of separation of breathy to tense voice. Also, certain combinations of inverse filtering and parameterisation methods were more robust than others.

1 Introduction

The production of voiced speech can be considered as: the sound source created by the vibration of the vocal folds (glottal source) inputted through the resonance structure of the vocal tract and radiated at the lips. Most acoustic descriptions typically used in speech processing involve characterisation of mainly the vocal tract contribution to the speech signal. However, there is increasing evidence that development of independent feature sets for both the vocal tract and the glottal source components can yield a more comprehensive description of the speech signal. Recent developments in speech synthesis [1], voice quality modification [2], voice pathology detection [3] and analysis of emotion in speech [4] have served to highlight the potential of features related to the glottal source.

However, approaches for analysing the estimated glottal source are at times believed to lack robustness in certain cases. For instance, higher pitch voices are known to be problematic for inverse filtering [5] and particularly when combined with a low first formant frequency. There can be strong source-filter interaction effects [6] which seriously affect the linear model of speech exploited in inverse filtering. Furthermore, precise glottal source analysis is often said to require the use of high-quality equipment to record in anechoic or studio settings [5]. Despite these claims, some studies have found that glottal source parameters derived from speech recorded in less than ideal recording conditions contribute positively to certain analyses [7].

It follows that the purpose of this paper is to investigate the performance of both inverse filtering and parameterisation steps typically used in glottal source analysis. The evaluation of glottal source analysis methods is known to be problematic as it is not possible to obtain 'true' reference values. To deal with this,

T. Drugman and T. Dutoit (Eds.): NOLISP 2013, LNAI 7911, pp. 1–8, 2013. © Springer-Verlag Berlin Heidelberg 2013

Publisher: Springer, 2013, 225 pp. 6th International Conference, NOLISP 2013, Mons, Belgium, June 19-21, 2013, Proceedings. NOLISP, an ISCA tutorial and workshop on non-linear speech processing, is a biannual event whose aim is to present and discuss new ideas, techniques, and results re