
The Effect of Selective Narrow-Band Filtering on the Perception of Certain English Vowels

212 pages · 1964 · English

JANUA LINGUARUM
STUDIA MEMORIAE NICOLAI VAN WIJK DEDICATA
edenda curat
CORNELIS H. VAN SCHOONEVELD
STANFORD UNIVERSITY
SERIES PRACTICA XIII

THE EFFECT OF SELECTIVE NARROW-BAND FILTERING ON THE PERCEPTION OF CERTAIN ENGLISH VOWELS
by
WILLIAM E. CASTLE

1964
MOUTON & CO
LONDON • THE HAGUE • PARIS

© Copyright Mouton & Co., Publishers, The Hague, The Netherlands. No part of this book may be translated or reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publishers. Printed in the Netherlands by Mouton & Co., Printers, The Hague.

ACKNOWLEDGEMENTS

The author wishes to extend his sincerest appreciation to Professor Dorothy A. Huntington of Stanford University for her counsel and guidance with respect to his study. More importantly, he is grateful for the interest which she has instilled in him in the speech sciences by way of her own heart-felt interest. Appreciation is also due Professors Richard F. Dixon and Alphonse G. Juilland, also from Stanford University, for their helpful criticism and suggestions. In addition, the author would like to acknowledge the love offered him by his wife, Diane, throughout the writing struggle.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
I. INTRODUCTION AND STATEMENT OF HYPOTHESIS
II. BACKGROUND OF THE PROBLEM
    The Importance of Steady State Formants for Perceptual Distinctions among Vowels
    The Importance of Factors other than Steady State Formants for Perceptual Distinctions among Vowels
    Summary
III. PROCEDURES
    Procedure I for Preparation and Presentation of Stimuli
    Procedure II for Preparation and Presentation of Stimuli
    Additional Procedures
IV. CHARACTERISTICS OF INSTRUMENTATION AND SPEAKERS
    Comparison of the Filter Systems
    Formant Frequencies for Speakers
V. PRESENTATION AND DISCUSSION OF RESULTS
    Identification of Unfiltered Stimuli
    Identification of Filtered Stimuli
    Special Points of Discussion
VI. SUMMARY
    Procedures
    Results and Conclusions
APPENDIXES
BIBLIOGRAPHY

I. INTRODUCTION AND STATEMENT OF HYPOTHESIS

Researchers have been interested for some time in determining what it is in the acoustic nature of spoken vowels which allows them to be categorized phonemically regardless of speaker, speech rate, phonetic environment, position in the word, and presence or absence of distortion. As Curtis (12)¹ pointed out in 1954, the problem is not simply that of describing the acoustic characteristics of spoken vowel sounds. It is just as important to determine which of these characteristics are significant cues to vowel perception and which are incidental.

It is a common hypothesis that the significant acoustic correlates of the perception of a given vowel phoneme can be specified with a description of those regions in the frequency spectrum in which energy is most predominant. Such regions of energy predominance are assumed to be directly related to the resonances which occur in the vocal tract of the speaker as he produces samples of the vowel phoneme. These energy regions and their center frequencies are often referred to as "formants" and "formant frequencies", respectively. In the present paper this terminology will be employed.

It has often been assumed that a complete description of the necessary and sufficient acoustic cues for the perception of a vowel, whether isolated or in context, is obtained by determining the formant frequencies that occur "over a single fundamental period, or over a short interval including at most a few cycles of the fundamental frequency" (87, p. 1). It has often further been assumed that such measurements need only be made for the so-called "steady state" portion of the vowel. For sustained, isolated production of a given vowel, the energy vs.
frequency distribution is considered to be essentially steady state throughout the production, since the formant positions remain nearly the same during all fundamental periods except those at the onset and the termination. For context productions of the same vowel by the same speaker, those portions "where the rates of formant change are slowest" (6, p. 161) often show energy vs. frequency distributions which approximate that of the isolated production. Such portions of the context productions of the vowels are, therefore, sometimes defined as the steady state portions; for the purposes of this investigation, this definition has been adopted.

While it has often been assumed that some form of invariance inheres in the steady state formant structure of all vowel samples from the same phoneme category, there is not general agreement about how many or which of the formants define the assumed invariance. Most of the available evidence suggests that for the perception of any vowel the first two formants are the most important acoustic cues. Some evidence suggests, however, that for back vowels only a single formant may be important. There is evidence, too, which suggests that three or more formants are important to the perception of front vowels at least and perhaps to the perception of all vowels.

¹ Numbers between parentheses refer to the Bibliography, pp. 207-209.

Along with the assumption that there may be inherent invariants in the steady state formant structure of a vowel, it is sometimes suggested that these invariants are the absolute values of the formant frequencies present. There are a number of facts which do not support this suggestion, however. For instance, vowels produced by members of a relatively homogeneous group of speakers (e.g., adult males) are sometimes perceived as different vowels even though the same set of steady state formant frequencies was used by the speakers involved (20, 30, 75).
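
The working definition adopted above, that the steady state portion of a context production is the stretch "where the rates of formant change are slowest", can be illustrated with a short computation. The Python sketch below is not the procedure used in this study. It assumes a hypothetical formant track sampled at equal time intervals, measures movement as the summed absolute frame-to-frame change in the first two formants, and picks the fixed-length window in which that movement is smallest. The function name, the window length, and every frequency value are invented for illustration.

```python
def steadiest_window(f1_track, f2_track, window):
    """Return (start, end) frame indices of the window with the least formant movement.

    `end` is exclusive, so the selected frames are f1_track[start:end].
    """
    if len(f1_track) != len(f2_track):
        raise ValueError("F1 and F2 tracks must have the same length")
    if window < 2 or window > len(f1_track):
        raise ValueError("window must be between 2 and the track length")

    # Movement between consecutive frames: |change in F1| + |change in F2|, in Hz.
    movement = [
        abs(f1_track[i + 1] - f1_track[i]) + abs(f2_track[i + 1] - f2_track[i])
        for i in range(len(f1_track) - 1)
    ]

    best_start, best_cost = 0, None
    for start in range(len(f1_track) - window + 1):
        # A window of `window` frames contains `window - 1` frame-to-frame steps.
        cost = sum(movement[start:start + window - 1])
        if best_cost is None or cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_start + window


# Hypothetical formant tracks in Hz (invented values, not data from the study).
f1 = [300, 380, 450, 500, 505, 503, 500, 460, 400]
f2 = [1200, 1400, 1600, 1700, 1705, 1700, 1702, 1600, 1450]

start, end = steadiest_window(f1, f2, window=4)
print(start, end, f1[start:end], f2[start:end])
```

With these invented tracks the selected window covers the four middle frames, where both formants are nearly flat. A fixed window length is a simplifying assumption; a threshold on the rate of change would be an equally plausible reading of the definition.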
Also, productions of the same vowel phoneme by adult males, adult females, and children are usually perceived as the same phoneme though the respective average formant frequencies for the three groups of speakers are quite different (8, 24, 31, 46, 69, 72). Moreover, several productions of the same vowel phoneme by a single speaker are perceived correctly though the steady state formant frequencies for the various productions are sometimes very different (20, 30, 75).

It is also sometimes suggested that the invariance assumed to inhere in the steady state formant structure of a given vowel phoneme is based upon a ratio which exists between the first and second formant frequencies for all productions of that phoneme, regardless of speaker (1, 61). This suggestion is not supported, however, by the fact that the same ratio may occur for several vowel productions which are perceived as different vowels (20).

Chiba and Kajiyama (8) have described the assumed invariance for the steady state formant structure of a vowel in terms of ranges of variation. According to their description, whenever the formant frequencies present in a vowel are "situated within certain frequency regions fixed for a given vowel (i.e., characteristic frequency regions)" (8, p. 194), it is that vowel which will be perceived, regardless of speaker.

Certain supporters of the distinctive feature theory of phoneme perception (25, 31, 43, 44) believe that it is the relationship which the steady state formant frequencies bear to one another that is of primary importance to vowel perception. They do not support a simple ratio hypothesis, however. Instead, they propose that the assumed invariants consist of a set of binary decisions which the listener makes with regard to the steady state energy vs. frequency distribution for each production of a given vowel phoneme. For English vowels, for instance, the listener is presumed to decide whether the first formant is "maximally low" or not (31, p. 126), whether the first formant is "maximally high" or not (31, p. 127), whether there is a "lowering of all formants" or not (31, p. 127), and whether the second formant is closer to the first formant than to the third formant or vice-versa (43, p. 30). Theoretically, whenever a
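
Of the binary decisions listed above, the last one (whether the second formant lies closer to the first formant than to the third) can be computed directly from three formant frequencies. The following Python sketch is only an illustration under stated assumptions: the function name and the example frequencies are hypothetical, and distance is measured in Hz on a linear scale, which may not be the scale the cited sources intend. The "maximally low" and "maximally high" decisions are left out because the text here gives no thresholds for them.

```python
def second_formant_nearer_first(f1, f2, f3):
    """Binary decision: is F2 closer to F1 than to F3?

    Frequencies are in Hz. Linear distance in Hz is an assumption made for
    this sketch. A tie (F2 exactly midway) is reported as False.
    """
    if not (0 < f1 < f2 < f3):
        raise ValueError("expected 0 < F1 < F2 < F3")
    return (f2 - f1) < (f3 - f2)


# Hypothetical formant sets in Hz (invented values, not data from the study).
low_f2 = (600, 1000, 2500)    # F2 sits 400 Hz above F1 and 1500 Hz below F3
high_f2 = (300, 2200, 2900)   # F2 sits 1900 Hz above F1 and 700 Hz below F3

print(second_formant_nearer_first(*low_f2))
print(second_formant_nearer_first(*high_f2))
```

The point of the sketch is that such a decision depends only on the relative positions of the formants, so two productions with quite different absolute frequencies can yield the same answer.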
