Branko Kovačević · Milan Milosavljević
Mladen Veinović · Milan Marković
Robust Digital
Processing of
Speech Signals
Branko Kovačević
University of Belgrade
Belgrade
Serbia

Milan Milosavljević
University of Belgrade
Belgrade
Serbia

Mladen Veinović
Department of Informatics and Computing
Singidunum University
Belgrade
Serbia

Milan Marković
Department for Informatics
Banca Intesa
Belgrade
Serbia

ISBN 978-3-319-53611-8 ISBN 978-3-319-53613-2 (eBook)
DOI 10.1007/978-3-319-53613-2
Jointly published with Academic Mind
ISBN: 978-86-7466-677-7 Academic Mind
Library of Congress Control Number: 2017932436
© Academic Mind and Springer International Publishing AG 2017
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
This book, Robust Digital Processing of Speech Signals, is the result of years
of cooperation between the Institute for Applied Mathematics and Electronics and
the Department for Automatics of the School of Electrical Engineering, University
of Belgrade, dedicated to research into speech signal phenomena. One of the main
conclusions of these extensive investigations is that the accuracy of the
speech generation model always plays the key role, regardless of whether the
applied procedure for parameter identification and estimation is used for
coding, analysis-synthesis transmission, recognition, or some other
goal. It is logical that limitations imposed at this lowest level of speech processing
can hardly be corrected at higher levels of the complex systems mentioned for
digital processing of speech signals. One possible direction toward a
more complex speech model is its robustification with respect to the presumed types of
excitation signals, which is equivalent to introducing a class of nonlinear
models and the corresponding criterion functions for parameter estimation.
Compared to the general class of nonlinear models, such as various types of neural
networks, this class of models offers controlled complexity, the
possibility of working in “online” mode, and a low information volume for the
needs of efficient speech encoding and transmission.
The material presented in this book relies predominantly on the authors’ own
results, previously verified through publication in eminent international scientific
journals. To give a comprehensive insight into the subject of robust
modeling of speech signals, this monograph has been extended with additional texts
dedicated to general considerations of speech modeling, linear predictive analysis,
and robust parameter estimation. It is our belief that the book's readability has
thus been improved, and that as such it may serve both as a specialized textbook and as a
monograph.
The text of this book is divided into seven chapters. The first six chapters are
dedicated to theoretical considerations, synthesis of robust algorithms, and their
experimental evaluation, while the seventh chapter unifies the developed robust
methods in various practical problems of digital speech processing. The first chapter
is dedicated to the general subject of speech modeling as a complex phenomenon
with inherent nonlinearity and non-stationarity. The second chapter comprises a
short review of basic procedures of linear speech prediction, from the autocorrelation
and covariant methods to different versions of predictive lattice structures. The
intention of the third chapter is to acquaint the reader with the basic
postulates of the general theory of robust parameter estimation, and especially with
the concept of minimax robust estimation. The fourth and fifth chapters, as the
central part of this book, give an overview of the developed robust methods for
the estimation of speech signal model parameters in non-recursive as well as
recursive form. The sixth chapter presents the results of one of the alternative
approaches to the introduction of a new class of nonlinear algorithms for parameter
estimation of speech signal models, based on statistical pattern recognition. The seventh
chapter is dedicated to the most important applications of the developed robust
procedures, such as the segmentation of speech signals, extraction of formant
trajectories, and speech signal coding.
The overall level of the text is suited to readers with an adequate
background in probability theory and statistics, as well as in identification and
estimation of signal model parameters. Graduates of engineering faculties will be
able to follow the text without significant difficulty, while additional effort will
be required from final-year undergraduates, as is customary for this kind
of text. The methodological approach of this book makes it especially convenient
for graduate courses in the fields it covers, such as modeling and estimation of
model parameters of stochastic signals and systems, estimation of time-variable
parameters of non-stationary models, digital signal processing, and modeling, analysis,
and processing of speech signals. It provides a single place where one can find a
number of practical problems and gain insight into the whole procedure of analysis
and synthesis of required properties, together with a comprehensive practical
evaluation, which is the basis of research and development in engineering.
Because of that, this book is also useful for research institutions whose work is
connected with the presented subject.
The authors wish to express their gratitude to the reviewers Prof. Dr. Milan Savić
and Prof. Dr. Jovan Golić for their useful suggestions and advice, as well as to all
of those who contributed to the publishing of this monograph.
Let us mention at the end that the contributions to this book of all four authors
are comparable and that we adopted an ordering of authors according to their
academic ranks.
Belgrade, Serbia, 2014
Branko Kovačević
Milan Milosavljević
Mladen Veinović
Milan Marković
Contents
1 Speech Signal Modeling
1.1 Nature of Speech Signal
1.2 Linear Model of Speech Signal
2 Overview of Standard Methods
2.1 Autocorrelation Method
2.2 Covariant Method
2.3 Forward and Backward Prediction
2.4 Lattice Filter
2.5 Method of Minimization of Forward Prediction Error
2.6 Method of Minimization of Backward Prediction Error
2.7 Method of Geometric Mean
2.8 Method of Minimum
2.9 General Method
2.10 Method of Harmonic Mean
2.11 Lattice-Covariant LP Method
2.12 Basic Properties of Partial Correlation Coefficient
2.13 Equivalence of Discrete Model and Linear Prediction Model
2.14 Speech Synthesis Based on Linear Prediction Model
3 Fundamentals of Robust Parameter Estimation
3.1 Principles of Robust Parameter Estimation
3.2 Robust Estimation of Signal Amplitude
3.3 Fundamentals of Minimax Robust Estimation of Signal Amplitude
3.4 Recursive Minimax Robust Algorithms for Signal Amplitude Estimation
3.5 Statistical Models of Perturbations and Examples of Minimax Robust Estimator
3.6 Practical Aspects of Implementation of Robust Estimators
3.7 Robust Estimation of Parameters of Autoregressive Dynamic Signal Models
3.8 Non-recursive Minimax Robust Estimation Algorithms
3.9 Recursive Minimax Robust Estimation Algorithm
3.10 Fundamentals of Robust Identification of Speech Signal Model
Appendix 1: Analysis of Asymptotic Properties of Non-recursive Minimax Robust Estimation of Signal Amplitude
Appendix 2: Analysis of Asymptotic Properties of Recursive Minimax Robust Estimation of Signal Amplitude
4 Robust Non-recursive AR Analysis of Speech Signal
4.1 Robust Estimations of Parameters of Linear Regression Model
4.2 Non-recursive Robust Estimation Procedure: RBLP Method
4.2.1 Newton Algorithm
4.2.2 Dutter Algorithm
4.2.3 Weighted Least Squares Algorithm
4.3 Comparison of Robust and Non-robust Estimation Algorithms
4.3.1 Analysis of the Estimation Error Variance
4.3.2 Analysis of Estimation Shift
4.4 Characteristics of M-Robust Estimation Procedure
4.4.1 Model Validity
4.4.2 Stability
4.4.3 Computational Complexity
4.5 Experimental Analysis
4.5.1 Test Signals Obtained by Filtering Train of Dirac Pulses
4.5.2 Test Signals Obtained by Filtering of Glottal Excitation
4.5.3 Natural Speech Signal
4.6 Discussion and Conclusion
5 Robust Recursive AR Analysis of Speech Signal
5.1 Linear Regression Model for Recursive Parameter Estimation
5.2 Application of M-Estimation Robust Procedure: RRLS Method
5.3 Robust Recursive Least-Squares Algorithm
5.4 Adaptive Robust Recursive Estimation Algorithm
5.5 Determination of Variable Forgetting Factor
5.5.1 Approach Based on Discrimination Function
5.5.2 Approach Based on Generalized Prediction Error
5.6 Experimental Analysis on Test Sinusoids
5.6.1 Testing with Fixed Forgetting Factor
5.6.2 Testing with Variable Forgetting Factor
5.6.3 Testing with Contaminated Additive Gaussian Noise
5.7 Experimental Analysis of Speech Signals
5.7.1 Test Signals Obtained by Filtering a Train of Dirac Pulses
5.7.2 Test Signals Obtained by Filtering Glottal Excitation
5.7.3 Natural Speech Signal
5.8 Discussion and Conclusion
6 Robust Estimation Based on Pattern Recognition
6.1 Unsupervised Learning
6.1.1 General Clustering Algorithms
6.1.2 Frame-Based Methods
6.1.3 Quadratic Classifier with Sliding Training Set
6.2 Recursive Procedure Based on Pattern Recognition
6.3 Application of Bhattacharyya Distance
6.3.1 Bhattacharyya Distance
6.4 Experimental Analysis
6.4.1 Direct Evaluation
6.4.2 Indirect Evaluation
6.5 Conclusion
7 Applications of Robust Estimators in Speech Signal Processing
7.1 Segmentation of Speech Signal
7.1.1 Basics of Modified Generalized Maximum Likelihood Algorithm
7.1.2 Robust Discriminant Function
7.1.3 Tests with Real Speech Signal
7.1.4 Appendix 4: Robust MGLR Algorithm (RMGLR)
7.2 Separation of Formant Trajectories
7.2.1 Experimental Analysis
7.3 CELP Coder of Speech Signal
7.3.1 LSP Parameters
7.3.2 Distance Measure
7.3.3 Linear Prediction Methods with Sample Selection
7.3.4 Experimental Analysis
References
Index
Abbreviations
AEF Asymptotic efficiency
AR Autoregressive model
ARX Autoregressive with exogenous input model
BHATT Bhattacharyya distance
CELP Code-excited linear prediction
CEUC c-mean classification algorithm
CG Closed glottis
CIQC Iterative quadratic classification algorithm
CLP Covariant-based linear prediction
CPDF Conditional PDF
CR Cramer–Rao bound
D Discrimination function
EPR Extended prediction error
FF Forgetting factor
FFF Fixed forgetting factor
FFT Fast Fourier transform
k-NN k-nearest neighbors procedure
LP Linear prediction
LPAS Linear prediction with analysis-by-synthesis
LS (LSQ) Least squares method
LSP Line spectral pairs
M Approximate maximum likelihood estimator
MAR Mean absolute value criterion
MGLR Modified general likelihood ratio algorithm
ML Maximum likelihood estimation method
OG Open glottis
PDF Probability density function
Q Quantized values of line spectral pairs
QCSTS Quadratic classifier with sliding training set
RBLP Robust batch processing linear prediction