Techniques in Speech Acoustics PDF

327 Pages·1999·12.33 MB·English
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Techniques in Speech Acoustics

Techniques in Speech Acoustics

Text, Speech and Language Technology
VOLUME 8

Series Editors
Nancy Ide, Vassar College, New York
Jean Veronis, Universite de Provence and CNRS, France

Editorial Board
Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands
Kenneth W. Church, AT & T Bell Labs, New Jersey, USA
Judith Klavans, Columbia University, New York, USA
David T. Barnard, University of Regina, Canada
Dan Tufis, Romanian Academy of Sciences, Romania
Joaquim Llisterri, Universitat Autonoma de Barcelona, Spain
Stig Johansson, University of Oslo, Norway
Joseph Mariani, LIMSI-CNRS, France

The titles published in this series are listed at the end of this volume.

Techniques in Speech Acoustics
by
Jonathan Harrington
Speech Hearing and Language Research Centre, Macquarie University, Sydney, Australia
and
Steve Cassidy
Speech Hearing and Language Research Centre, Macquarie University, Sydney, Australia

SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.

A C.I.P. Catalogue record for this book is available from the Library of Congress.

Additional material to this book can be downloaded from http://extras.springer.com.

ISBN 978-0-7923-5822-0
ISBN 978-94-011-4657-9 (eBook)
DOI 10.1007/978-94-011-4657-9

Published by Kluwer Academic Publishers, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

Sold and distributed in North, Central and South America by Kluwer Academic Publishers, 101 Philip Drive, Norwell, MA 02061, U.S.A.

In all other countries, sold and distributed by Kluwer Academic Publishers, P.O. Box 322, 3300 AH Dordrecht, The Netherlands.
Printed on acid-free paper

All Rights Reserved
© 1999 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1999

No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

For Alexander and Catriona Owen and Nichola

CONTENTS

Preface
Vowel and Consonant Transcriptions
Contents of the CD-ROM

1 The Scope of Speech Acoustics
1.1 What is speech acoustics?
1.2 The variable nature of speech
1.3 Experimental designs in speech acoustics
1.4 Some key areas of research in speech acoustics

2 The Physics of Speech
2.1 Speech waveforms
2.2 Frequency analysis

3 The Acoustic Theory of Speech Production
3.1 The source-filter decomposition of speech
3.2 The acoustic source in speech production
3.3 The acoustic filter in speech production
3.4 Vocal tract losses
3.5 Radiated sound pressure
3.6 The composite model

4 Segmental and Prosodic Cues
4.1 Vowels
4.2 Oral stops
4.3 Nasal consonants
4.4 Fricatives
4.5 Approximants
4.6 Prosody and juncture

5 Time-Domain Analysis
5.1 Sampling and quantisation
5.2 Definition of a digital signal
5.3 Simple operations on signals
5.4 Windowing signals
5.5 Some common time-domain parameters
5.6 Convolution and time-domain filtering

6 Frequency-Domain Analysis
6.1 Digital sinusoids
6.2 The discrete Fourier transform
6.3 Spectra derived from the DFT
6.4 Some points of procedure in applying a DFT
6.5 Spectral parameterisations
6.6 Frequency-domain filtering

7 Digital Formant Synthesis
7.1 Core structure of a formant synthesiser
7.2 Digital considerations
7.3 Periodic excitation
7.4 Formant filter
7.5 Combining the source with the filter
7.6 Parallel structure

8 Linear Prediction of Speech
8.1 LPC and its relationship to digital speech
8.2 Techniques for calculating the LPC coefficients
8.3 Analysis of the error signal
8.4 LPC-smoothed spectra and formants
8.5 Area functions and reflection coefficients
8.6 Speech synthesis from LPC-parameters

9 Classification of Speech Data
9.1 Speech spaces and distance measures
9.2 Distributions of speech sounds
9.3 Discriminant functions and classification
9.4 Classification experiments
9.5 Classifying signals in time
9.6 Data reduction

References

PREFACE

This book is the development of a series of lectures to undergraduate and postgraduate students at Macquarie University on basic principles in acoustic phonetics and speech signal processing. The first part of the book (Chapters 1 to 4) is intended to provide students with the ability to interpret acoustic records of speech signals in their various forms. These chapters include a review of elementary wave motion and frequency analysis as applied to speech (Chapter 2), a summary of the relationship between speech production and its acoustic consequences (Chapter 3), and finally a review of the principal cues to speech sounds and prosodic units (Chapter 4). The material from these first four chapters (and the related exercises on the accompanying CD-ROM) has formed the basis of a one-semester undergraduate course in acoustic phonetics to students primarily of linguistics, but also of other disciplines including computer science and psychology.

The second part of the book provides an introduction to speech signal processing, which is intended for similar groups of students.
It is therefore different from more detailed introductory texts in this area (e.g., Rabiner & Schafer, 1978; O'Shaughnessy, 1987) which assume both a background in engineering/signal processing and a more sophisticated mathematical knowledge (e.g., Parsons, 1987). Part of the motivation for writing this section of the book is to make many of the techniques and algorithms that are discussed in the engineering literature on speech analysis (for example, in the IEEE Acoustics Speech and Signal Processing journal) more accessible to both students and researchers of phonetics and speech science whose training is not usually in a scientific discipline. We have therefore tried to keep equations to a minimum and to assume, as far as possible, no more than a very basic understanding of algebra and trigonometry. In this part of the book, we cover fundamental aspects of time and frequency domain processing of speech signals (Chapters 5 and 6). Chapters 7 and 8 are concerned with digital techniques for combining (in digital formant synthesis) and separating (in linear predictive coding) the contributions of the source and filter to the acoustic speech signal. The final chapter and related exercises deal with techniques for the probabilistic classification of acoustic speech data that forms the basis of more advanced work in automatic speech recognition.

We have always felt that students of speech benefit from being able to carry out their own experiments to test some of the theories that they learn about in books such as these. However, preparing experiments that are also appropriate to testing a particular aspect of speech acoustics can be very time-consuming to construct and may not be feasible either because of the difficulties of collecting (and digitising) large quantities of speech data or because of the problem of student access to laboratory machines that are usually dedicated to research projects.
In the last three to four years, we have adapted a research tool that has been developed in our laboratory (the mu+ system; Harrington, Cassidy, Fletcher, & McVeigh, 1993) for building and analysing speech corpora to the needs of teaching on our undergraduate and postgraduate courses. In extending this system, which is included together with a number of speech corpora on a CD-ROM in this book, we have sought to develop an interface that would obscure minimally the aim of the exercises, which is to solve problems in speech acoustics rather than in computer programming. The revised software (known as the EMU speech analysis system) runs on both UNIX and Windows 95 platforms. Since the EMU system can be used independently of the exercises in this book for segmenting and labelling utterances, as well as analysing them in the time and frequency domain (including spectrographic analysis), the exercises can be easily modified to include other sets of speech corpora beyond those provided on this CD-ROM. The EMU system also includes a set of extensions to the Xlisp-Stat statistical system to facilitate the analysis of speech data; the exercises on the CD-ROM use Xlisp-Stat extensively to allow students to analyse real data from the various speech corpora included on the disc. The Xlisp-Stat system (but not currently Emu) also runs on Macintosh systems.

We usually teach courses on speech acoustics after students have gained some grounding in phonetics and phonology. Accordingly, we assume some background knowledge of phonetic principles of sound classification and elementary phonological theory, which are given excellent coverage in many other books (see, for example, recent general phonetics texts at the end of Chapter 1) and which are not dealt with in detail in this book.

Vowel and Consonant Transcriptions

The columns for the vowels apply to many talkers of American, Australian and Southern British English (adapted from Ladefoged, 1993).
Vowels

American   Australian   Southern British   Key
English    English      English            word
i          i            i                  heed
ɪ          ɪ            ɪ                  hid
ɪɹ         ɪə           ɪə                 here
u          u            u                  who'd
ʊ          ʊ            ʊ                  hood
eɪ         eɪ           eɪ                 hay
ɛ          ɛ            ɛ                  head
ɛɹ         ɛə           ɛə                 there
ə          ə            ə                  the
ɝ          ɜ            ɜ                  heard
oʊ         əʊ           əʊ                 hoed
ɑ          ɒ            ɒ                  hod
ɔ          ɔ            ɔ                  hawed
ɔɪ         ɔɪ           ɔɪ                 boy
ʌ          ʌ            ʌ                  bud
æ          æ            æ                  had
aɪ         aɪ           aɪ                 hide
ɑɹ         a            a                  hard
aʊ         aʊ           aʊ                 how

Consonants

Symbol   Key word    Symbol   Key word    Symbol   Key word
p        pie         θ        think       dʒ       judge
t        tie         ð        these       m        my
k        cat         s        so          n        no
b        bat         z        zoo         ŋ        sing
d        do          ʃ        shoe        w        we
g        go          ʒ        measure     ɹ        run
f        fan         h        he          j        you
v        van         tʃ       chew        l        leaf

Description:
Techniques in Speech Acoustics provides an introduction to the acoustic analysis and characteristics of speech sounds. The first part of the book covers aspects of the source-filter decomposition of speech, spectrographic analysis, the acoustic theory of speech production and acoustic phonetic cues.