
Prosody: Theory and Experiment: Studies Presented to Gösta Bruce PDF

358 Pages·2000·10.065 MB·Text, Speech and Language Technology 14

Preview Prosody: Theory and Experiment: Studies Presented to Gösta Bruce

Prosody: Theory and Experiment

Text, Speech and Language Technology
VOLUME 14

Series Editors
Nancy Ide, Vassar College, New York
Jean Véronis, Université de Provence and CNRS, France

Editorial Board
Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands
Kenneth W. Church, AT&T Bell Labs, New Jersey, USA
Judith Klavans, Columbia University, New York, USA
David T. Barnard, University of Regina, Canada
Dan Tufis, Romanian Academy of Sciences, Romania
Joaquim Llisterri, Universitat Autònoma de Barcelona, Spain
Stig Johansson, University of Oslo, Norway
Joseph Mariani, LIMSI-CNRS, France

The titles published in this series are listed at the end of this volume.

Prosody: Theory and Experiment
Studies Presented to Gösta Bruce
Edited by Merle Horne, University of Lund, Sweden
SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.

Library of Congress Cataloging-in-Publication Data
Prosody, theory and experiment: studies presented to Gösta Bruce / edited by Merle Horne. p. cm. -- (Text, speech, and information technology ; v. 14) Includes index.
ISBN 978-90-481-5562-0
ISBN 978-94-015-9413-4 (eBook)
DOI 10.1007/978-94-015-9413-4
1. Prosodic analysis (Linguistics) I. Bruce, Gösta, 1947- II. Horne, Merle. III. Series.
P224 .P77 2000 414'.6--dc21 00-060904

Printed on acid-free paper. All Rights Reserved. © 2000 by Springer Science+Business Media Dordrecht. Originally published by Kluwer Academic Publishers in 2000. Softcover reprint of the hardcover 1st edition 2000. No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

CONTENTS

INTRODUCTION, Merle Horne
1. TONAL ELEMENTS AND THEIR ALIGNMENT, Janet Pierrehumbert
2. BRUCE, PIERREHUMBERT, AND THE ELEMENTS OF INTONATIONAL PHONOLOGY, D. Robert Ladd
3. LEVELS OF REPRESENTATION AND LEVELS OF ANALYSIS FOR THE DESCRIPTION OF INTONATION SYSTEMS, Daniel Hirst, Albert Di Cristo and Robert Espesser
4. THE PERCEPTION OF PROSODIC PROMINENCE, Jacques Terken and Dik Hermes
5. THE LEXICAL TONE CONTRAST OF ROERMOND DUTCH IN OPTIMALITY THEORY, Carlos Gussenhoven
6. MODELING THE ARTICULATORY DYNAMICS OF TWO LEVELS OF STRESS CONTRAST, Mary E. Beckman and K. Bretonnel Cohen
7. PHRASE-LEVEL PHONOLOGY IN SPEECH PRODUCTION PLANNING: EVIDENCE FOR THE ROLE OF PROSODIC STRUCTURE, Stefanie Shattuck-Hufnagel
8. THE INTERACTION OF CONSTRAINTS ON PROSODIC PHRASING, Elisabeth Selkirk
9. PROSODIC BOUNDARY DETECTION, Mari Ostendorf
10. TIMING IN SPEECH: A MULTI-LEVEL PROCESS, Nick Campbell
11. A CORPUS-BASED APPROACH TO THE STUDY OF SPEAKING STYLE, Julia Hirschberg
INDEX
INTRODUCTION

The study of prosody is perhaps the area of speech research which has undergone the most noticeable development during the past ten to fifteen years. As an indication of this, one can note, for example, that at the latest International Conference on Spoken Language Processing in Philadelphia (October 1996), there were more sessions devoted to prosody than to any other area. Not only that, but within other sessions, in particular those dealing with dialogue, several of the presentations dealt specifically with prosodic aspects of dialogue research. Even at the latest Eurospeech meeting in Rhodes (September 1997), prosody and speech recognition (where several contributions dealt with how prosodic cues can be exploited to improve recognition processes) were the most frequent session topics, despite the fact that there was a separate ESCA satellite workshop on intonation, held in conjunction with the main Eurospeech meeting, which included over 80 contributions.

This focus on prosodic research is partly due to the fact that developments in speech technology have made it possible to examine the acoustic parameters associated with prosodic phenomena (in particular fundamental frequency and duration) to an extent which has not been possible in other domains of speech research. It is also due to the fact that significant theoretical advances in linguistics and phonetics have been made during this time which have made it possible to obtain a better understanding of how prosodic parameters function in expressing different kinds of meaning in the languages of the world.

One of the researchers who has made a significant contribution in shaping the methodology and goals of prosody research is Gösta Bruce. His studies of Swedish intonation in sentence perspective have involved both theoretical modelling and experimental phonetic studies testing the theoretical claims.
Perhaps the most influential of his contributions was the compositional analysis of Swedish intonational contours into word accents, sentence accent (associated with focus) and terminal juncture (boundary tones) which realize different combinations of two phonological level tones, H and L. This insight into higher-level patterning of prosodic patterns led other researchers to apply the same methods to other languages.

[M. Horne (ed.), Prosody: Theory and Experiment, 1-10. © 2000 Kluwer Academic Publishers.]

Janet Pierrehumbert shows in her contribution to this volume, "Tonal Elements and their Alignment", how her model of English intonation was influenced substantially by Bruce's model of Swedish accent and intonation. In it, English intonation is also modelled using two underlying tones. The different pitch accents, phrase accents and boundary tones, defined in terms of H and L tones, constitute basic elements in a grammar for generating intonation contours. Another contribution made by Bruce, as Pierrehumbert shows in her chapter, consisted in showing how different tonal properties can be assigned to different levels of prosodic structure, i.e. how the different tonal components could be assigned to word accents, phrasal (focal) accents and boundary tones. A further hallmark of Bruce's work, as Pierrehumbert points out, was that he contributed to our understanding of how tonal representations are associated with the segmental string; for example, the difference between the two word accents in Swedish is explained in terms of a timing difference in the association of the starred tonal component of the word accent representations HL* (Accent I) and H*L (Accent II) with the stressed vowel. All these basic ideas were taken up by Pierrehumbert in her analysis of English, a language which differs prosodically in many respects from Swedish.
For example, unlike Swedish, English pitch accents are not properties of words but rather post-lexical reflexes of pragmatic aspects of an utterance. English is also characterized by a larger inventory of pitch accents. In her contribution, Pierrehumbert discusses the main features of her model as well as modifications (e.g. reduction in the number of pitch accents, the treatment of pitch range choice and downstep) which it has experienced since its debut in 1980. These have resulted mainly from experimental work in collaboration with Mark Liberman and Mary Beckman. As Pierrehumbert shows, the empirical research has also raised questions related to the issues of tonal alignment as well as locality and lookahead in tonal implementation. A better understanding of the pragmatic meanings associated with the pitch accents has also been obtained in recent years due to work with Julia Hirschberg, and practical studies of intonational variation in large databases have been made possible by the development of the ToBI (Tones and Break Indices) transcription system based on Pierrehumbert's model.

In his chapter, "Bruce, Pierrehumbert, and the Elements of Intonational Phonology", D.R. Ladd takes up a number of theoretical problems surrounding the nature of tones and tonal association that arise from a comparison of Bruce's ideas and related autosegmental treatments. For example, whereas Bruce's H and L tones correspond to concrete "local maxima and minima in the contour" - turning points - Pierrehumbert's tones do not necessarily correspond to turning points, nor do turning points always reflect the phonetic realization of a tone. That is to say, the tones in her analysis of English are in some cases more abstract than Bruce's and resemble more the underlying tones in tone languages. Ladd raises the question as to how abstract analyses of tone should be in languages like English, which is assumed to have only post-lexical or intonational uses of tones.
As opposed to languages with lexical tone, where one can observe alternations in the shape of morphemes and words in different contexts, languages with only post-lexical (intonational) uses of tone often do not allow stringent control over the identification of tones. Ladd also takes up questions regarding the status of 'starred' tones. These have been used by Pierrehumbert in representing English bitonal accents and also retrospectively by Bruce in the representation of Swedish word accents. As Ladd shows, it is not clear, however, what the defining characteristic of the star is. Furthermore, he provides evidence that shows that it is not always straightforward how the alignment between a starred tone and the segmental string should take place. Ladd further takes up the distinction between association and alignment, where association is seen as an abstract phonological "belonging together" that makes no precise predictions about temporal coordination; alignment, on the other hand, "is specified independently of the identity of the tonal string". Association is "digital", whereas alignment is "analogue".

In their contribution "Levels of Representation and Levels of Analysis for the Description of Intonational Systems", Daniel Hirst, Albert Di Cristo and Robert Espesser discuss a number of issues related to the phonetics/phonology interface as well as the relationship of the phonological component with the higher-level syntactic and semantic components. In particular, they address those issues that are relevant for deciding what kinds of information should be incorporated in the phonetic ('surface phonological') transcription of prosodic parameters in cross-linguistic studies. They distinguish between 'functional representations', which provide information required for the syntactic and semantic interpretation of an utterance's prosody, and 'formal representations', which provide information related to the pronunciation of an utterance.
Since there is not a one-to-one correspondence between these two types of information in languages of the world, the possibility of making cross-linguistic studies relies on the ability to separate these two forms of representation in a universal theory of language structure. Language-specific parameters then should specify the mapping between form and functional categories. One specific problem discussed is how to derive an optimal symbolic phonetic representation for an F0 curve on the basis of what we know about related physiological and perceptual factors. The design and evaluation of the MOMEL algorithm, developed at the University of Aix-en-Provence for automatic modelling and synthesis of F0 curves in a number of languages, is discussed in detail. The phonetic transcription system (INTSINT), designed to allow collection and classification of intonational data in languages at preliminary levels of analysis where pitch accent inventories are not known, is also taken up and discussed. Also discussed are possible ways in which the system could be modified to integrate into the transcription discourse information on variation in overall range and register within intonational units. Finally, a sketch of the way in which cross-linguistic differences between intonation systems can be formalized in terms of phonological representations relating prosodic constituents and tones is presented using data from English and French.

"Perception of Prosodic Prominence", in particular perceptual judgements associated with variation in accentual patternings, is the topic of Jacques Terken and Dik Hermes' chapter. They discuss different typologies of prominence that have been proposed since the time of Trager and Smith and conclude that there is phonetic evidence for at least four types of prominence categories: reduced, full, stressed and accented.
Further, they take up the issue of frequency scales used in the study of speech intonation and address the question as to which scale best expresses equivalences in the pitch ranges of men and women and which scale best expresses excursion sizes of pitch movements related to the perception of prominence (as opposed to 'liveliness'). They conclude that experimental evidence points to the optimality of the ERB-rate scale for representing excursion equivalences associated with accentual prominences and discuss the possible perceptual processes which may account for this. Variations in the strength of the prominence category 'accent' are known to be related both to the local phonetic characteristics of the pitch movement (e.g. size, slope, timing) and to more global characteristics such as declination. Terken and Hermes discuss models for accounting for the variation in accent strength of both single and multiple accents in an utterance. Two models, a High-Level (HL) model and a Pitch Level Difference (PLD) model, are used to explain subjects' judgements of strength relations between different pitch movements. One observed stable effect of the perception experiments involving single accents is that falls induce greater strength than the rise and the rise-fall for the same excursion sizes. In the modelling of the relative strength relations between two or more accents within the same utterance, the situation is complicated due to effects from both declination and speaker-specific pitch range, as well as the mutual effect of the prominence of accents on each other's perceived prominence. A survey is made of experimental results obtained by a number of researchers on this intricate problem.
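The question of which frequency scale makes male and female pitch excursions comparable can be illustrated with a small calculation. The sketch below is not taken from the chapter: it assumes the widely used Glasberg and Moore approximation of the ERB-rate scale, and the pitch values for the two hypothetical speakers are invented for illustration. It compares the same musical interval in hertz, semitones and ERB-rate units.

```python
import math


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Frequency in semitones relative to ref_hz (a logarithmic scale)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def hz_to_erb_rate(f_hz):
    """ERB-rate value for f_hz.

    Assumption: the Glasberg and Moore approximation is used here;
    the chapter itself may rely on a different variant of the formula.
    """
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


def excursion(f_low_hz, f_high_hz, scale=lambda f: f):
    """Size of a pitch movement from f_low_hz to f_high_hz on a given scale."""
    return scale(f_high_hz) - scale(f_low_hz)


# Hypothetical pitch movements (values invented for illustration):
# both speakers rise by the same musical interval, a ratio of 1.5.
speakers = {"male": (100.0, 150.0), "female": (200.0, 300.0)}

for name, (low, high) in speakers.items():
    print(
        name,
        "Hz:", round(excursion(low, high), 2),
        "semitones:", round(excursion(low, high, hz_to_semitones), 2),
        "ERB-rate:", round(excursion(low, high, hz_to_erb_rate), 2),
    )
```

On the semitone scale the two movements come out equal, on the hertz scale the female movement is twice as large, and the ERB-rate scale falls between the two. Which of these scales matches listeners' prominence judgements is the empirical question the chapter addresses.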
The need for a reliable way of transcribing prominence relations between accents both within and across different pitch registers is also discussed, and the authors conclude by pointing to the need for more work in the development of theories of prominence and in relating them to issues in language processing.

Carlos Gussenhoven discusses an interesting case of interaction between tone and intonation at the ends of phrases in his chapter entitled "The Lexical Tone Contrast of Roermond Dutch in Optimality Theory". He shows how constraint-based Optimality Theory allows one to better understand a number of generalizations related to the phonological behaviour of the two tonal word accents and the intonation (focus and boundary) tones in the Roermond dialect. These include: the neutralization of the lexical tone contrast in nonfocussed nonfinal position, the spreading of the final boundary tones into the phrase, the assimilation of the lexical H-tone to L after a focal L-tone in the same syllable and, most interestingly, infixation of the phrasal boundary tones before the lexical tone. In particular, it is shown that an explanation for the final generalization concerning the 'tonal reversal' phenomenon in Roermond cannot be naturally provided by a derivational rule-based description since it leads to "an insoluble ordering paradox". By using the concept of alignment constraints within Optimality Theory, however, the hierarchical ranking of these constraints can be used to determine which tone will be given preference at the end of a phrase. Gussenhoven also takes up the distinction between the concepts 'association' and 'alignment' and uses it to account for the two different patterns of leftward tone spreading in Roermond: whereas "alignment locates tones with reference to the edges of (phonological and morphological) constituents, association creates a structural connection with a tone-bearing unit".
All tones are aligned according to this view, but only some are associated. Association does not occur when there is no free tone-bearing unit (TBU) available, which is seen to be the case in Roermond when the more freely-timed case of tone spreading is compared with the more accurately timed case where a TBU is available.

Timing of tonal events relative to critical points in the segmental representation of a text is now understood to be an important parameter in relating meaning contrasts with sound structure. There is, however, another aspect of timing that is relevant to the issue of tune-text association, and that is the timing of articulatory gestures in the production of the segments to which intonational events such as pitch accents and boundary tones are anchored. This is the topic of Mary Beckman and Bretonnel Cohen's chapter "Modelling the Articulatory Dynamics of Two Levels of Stress Contrast". In this contribution, differences in segmental timing control associated with "word stress" and "sentence stress" are studied in an experiment involving articulatory kinematics, and the results are discussed in light of three different models of articulatory timing: a 'truncation' model, a 'rescaling' model and a 'hybrid' model. Jaw movement data involving durations, amplitudes and peak velocities for jaw opening and closing movements for accented, unaccented heavy and reduced CVC ([pap]/[pəp]) sequences uttered at three different speech tempi are seen to indicate the same relational effects of stress at both levels of prosodic structure. Movement durations were longest for the vowels in accented position and shortest for the reduced vowel, with accent producing a much smaller effect than syllable weight. Movement amplitudes and peak velocities show the same pattern as duration: accented syllables have the largest and fastest movements, while the reduced vowel showed the smallest and slowest movements.
Here also, accent had a considerably smaller effect than syllable weight. However, the relatively more prominent sequences in each stress contrast pair had smaller velocities than expected for their larger amplitudes. These observations are explained by a hybrid model which changes both between- and within-gesture parameters by combining the between-gesture shortening mechanism of the truncation model as well as the gestural settling time and displacement changes of the rescaling model. It is suggested that the lengthening observed in accented sequences as opposed to unaccented sequences is a considerably more subtle version of the same effect
