ASSESSMENT AND PREDICTION OF SPEECH QUALITY IN TELECOMMUNICATIONS

by
Sebastian Möller
Institut für Kommunikationsakustik (IKA)

SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.

A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN 978-1-4419-4989-9
ISBN 978-1-4757-3117-0 (eBook)
DOI 10.1007/978-1-4757-3117-0

Printed on acid-free paper

All Rights Reserved
© 2000 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 2000
Softcover reprint of the hardcover 1st edition 2000

No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

THE EYE OF MAN HATH NOT HEARD, THE EAR OF MAN HATH NOT SEEN, MAN'S HAND IS NOT ABLE TO TASTE, HIS TONGUE TO CONCEIVE, NOR HIS HEART TO REPORT, WHAT MY DREAM WAS.
A Midsummer Night's Dream, IV.1

Contents

Definitions and Abbreviations
Acknowledgements
Summary

1. INTRODUCTION

2. CONSIDERATIONS ON QUALITY
   1. Speech Quality
   2. Quality in the Context of Telecommunications

3. FACTORS INFLUENCING THE QUALITY OF SERVICE
   1. Perceptive Factors
      1.1 Loudness
      1.2 Articulation
      1.3 Effects of Bandwidth Restriction and Frequency Distortion
      1.4 Perception of Sidetone
      1.5 Perception of Echo
      1.6 Perception of Circuit Noise
      1.7 Perception of Ambient Noise
      1.8 Effects of Transmission Delay
   2. Configuration of a Telephone Connection
   3. Simulation of a Telephone Connection for Quality Assessment Purposes
   4. Classification of Transmission and Service Parameters
   5. Summary

4. QUALITY ASSESSMENT IN TELECOMMUNICATIONS
   1. Choice of Test Subjects
   2. Articulation and Intelligibility Tests
   3. Listening-Only and Conversation Opinion Tests
      3.1 Listening-Only Tests Using Absolute Category Rating
      3.2 Listening-Only Tests Using Paired Comparison Techniques
      3.3 Multidimensional Analysis
      3.4 Talking and Listening Tests
      3.5 Conversation Tests
   4. Performance Tests
   5. User Surveys
   6. Usability Evaluation
   7. Assessment of Cost-Related Factors
   8. Scaling
      8.1 Ratio Scaling
      8.2 Absolute Category Scaling
      8.3 Category-Ratio Scaling
   9. Development of Conversation Test Scenarios
      9.1 Requirements for Conversation Test Scenarios
      9.2 Experiences with Two Different Types of Scenario
      9.3 Scenarios for Special Applications
   10. Classification of Assessment Methods
   11. Summary

5. MODELS FOR PREDICTING SPEECH COMMUNICATION QUALITY AND SERVICE-RELATED MODELS
   1. Mouth-to-Ear Models Using Parameters in the Frequency Domain
   2. Mouth-to-Ear Models Using Scalar Parameters
      2.1 The Additivity Property of the E-Model
      2.2 Description of the E-Model
   3. Instrumental Models for Single Transmission Aspects
   4. Call Set-Up and Call Completion Models
   5. Customer Behavior Models
   6. Application and Classification of Prediction Models
   7. Summary

6. RELATIONS BETWEEN FACTORS GOVERNING THE QUALITY OF SERVICE
   1. 'Relative' Quality versus 'Absolute' Quality
   2. References and Normalization
   3. Assessment in Listening-Only and Conversation Tests
   4. Expectation
   5. Influence of the Cost Factor
   6. Multidimensional Assessment of Voice Transmission Quality
   7. Existence of an Integral "Psychological Quality Scale"
   8. Summary

7. QUALITY OF PREDICTION MODELS
   1. Prediction for Single Perceptive Types of Impairment
   2. Prediction for Combinations of Different Types of Impairment
   3. Impairment Factor Principle for Low-Bitrate Codecs
   4. Prediction of Frequency Characteristics
   5. Measurement of the Input Parameters
   6. Accuracy of Quality Predictions, Limitations
   7. Summary

8. FINAL DISCUSSION AND CONCLUSIONS

Appendices
   A- Glossary
   B- Perceptive Characteristics Resulting from New Technologies
      1. Impact of New Technologies and Equipment
      2. Classification for Modeling Purposes
   C- Discussion of Articulation and Intelligibility Test Methodologies
   D- Graeco-Latin-Square Test Design
   E- SCT Scenarios
      1. Examples of SCT Scenarios
      2. Explanations of the SCT Dialog Structure Given to the Test Subjects
   F- Closing Questionnaire Given after Laboratory Tests
   G- E-Model Algorithm
   H- Test Conditions and Results
      1. Test Conditions of the Isopreference Test
      2. Relation between E-Model Predictions and Test Results
      3. Comparison of SUBMOD Model Predictions and Test Results

Bibliography
Index

Definitions and Abbreviations

Definitions

[…]            expectation factor
[…]            factor for calculating LOI, depending on the loudness of received speech
Aw             warping amplitude
a, b           constants of the 'power law'
B′             frequency weighting function related to loudness
B′a            frequency weighting function related to articulation
B′LE           frequency weighting function related to listening-effort
b0             description of an auditory event
β0             pure tone threshold of hearing in quiet [dB rel. 20 µPa·Hz^(-1/2)]
β′s            spectrum density of speech at MRP [dB rel.
20 µPa·Hz^(-1/2)]
D              frequency-weighted version of DELSM
DLOI           factor for calculating LOI, depending on the level of circuit noise
d              magnitude of a stimulus
dM             magnitude of a stimulus at the midopinion value of the scale
DELSM          frequency-dependent difference in sensitivity between the directed and diffuse sound [dB]
%Diff          percentage of users experiencing difficulty in talking or listening over a connection
f, fk          frequency [Hz]
Fs             sampling frequency [Hz]
Fw             warping frequency [Hz]
(Δf)c          bandwidth of critical band [Hz]
G              signal-to-equivalent-continuous-circuit-noise ratio [dB]
g              exponent of the logistic psychometric function
GL             frequency weighting function for the calculation of loudness ratings
%GoB           percentage of users rating a connection good or better
h0             auditory event
I, Itot        impairment factor
Id             impairment factor for delayed impairments related to the speech signal
Ie             equipment impairment factor
Iq             impairment factor for quantizing distortion
Is             impairment factor for impairments occurring simultaneously with the speech signal
IQoS           GSM quality of service index
K              factor for calculating Yc from YLE
k              exponent of the relation between subjective rating and apparent magnitude
Kc             allowance to the threshold of hearing of complex tones in quiet [dB]
[…]            frequency weighting function for alternatively calculating loudness ratings
[…]            point of maximum excitation on the basilar membrane measured from the helicotrema due to a tone at fk [mm]
lmax           total length of the basilar membrane [mm]
Le             frequency-dependent loss of the talker echo path [dB]
Lst            frequency-dependent loss of the sidetone path [dB]
LME, LRME, LUME  air-to-air transmission loss from mouth reference point to ear reference point [dB]
LRME, LUME     weighted average mouth-to-ear loss [dB]
Λ, ΛR, ΛU      impression of loudness
LOI            listening opinion index
LSTR           listener sidetone rating [dB]
Lo             constant factor for calculating loudness ratings
(Δl)c          critical length [mm]
M              shift in hearing threshold attributable to the presence of noise [dB]
m              exponent of the growth function Q(Z)
mse            mean squared error
n(k)           sampled noise signal
N              total number of frequency bands for calculating loudness ratings
Nc             circuit noise level [dBm0p]
Nfor           noise floor level [dBmp]
Np             perceptual magnitude
No             total equivalent circuit noise level [dBm0p]
OLR            overall loudness rating between MRP and ERP [dB]
P, Pr, Ps      mean normalized opinion for an impairment
P(Z)           growth function of Z related to listening-effort
Pa(Z)          growth function of Z related to articulation
Pr             A-weighted sound pressure level of room noise at receive side [dB(A)]
Ps             A-weighted sound pressure level of room noise at send side [dB(A)]
%PoW           percentage of users rating a connection poor or worse
Q              signal-to-quantizing-noise ratio [dB]
Q(Z)           growth function of Z related to loudness
qdu            quantizing distortion unit
r              Pearson correlation coefficient
R              transmission rating
Rcc            transmission rating taking into account call-completion impairments
Rcs            transmission rating taking into account call-setup impairments
[…]            transmission rating taking into account loss, noise and talker echo
R′w            weighted sound reduction index [dB]
Ro             basic signal-to-noise transmission rating factor
RLR            receive loudness rating between the 0 dBr point in the network and the ERP [dB]
RLRset         receive loudness rating of the telephone handset [dB]
s0             sound event
SJE            electro-acoustic receiving sensitivity from junction to ERP [dB]
[…]            acousto-electric sending sensitivity from MRP to junction [dB]
SmeST          air-to-air sensitivity of the electric sidetone path for directed (speech) sound, as defined in ITU-T Rec. P.64 (1997) [dB]
SRNST          air-to-air sensitivity of the electric sidetone path for diffuse (room noise) sound, as defined in ITU-T Rec. P.64 (1997) [dB]
SLR            send loudness rating between the MRP and the 0 dBr point in the network [dB]
SLRset         send loudness rating of the telephone handset [dB]
STMR           sidetone masking rating [dB]
t              level of sensation on a continuous finite rating scale
T              mean one-way talker echo path delay [ms]
Ta             overall delay between MRP of the talker and ERP of the listener [ms]
Tr             round-trip delay for listener echo [ms]
TELR           talker echo loudness rating [dB]
%TME           percentage of users terminating a call early
Vc, VL         active speech level in conversation or listening-only situations [dBV]
Wi             frequency weighting function for calculating loudness ratings
WEPL           weighted echo path loss for listener echo [dB]
x(k)           sampled input signal