ebook img

Discourse-Driven Pitch Accent Prediction PDF

100 Pages·2009·1.71 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Discourse-Driven Pitch Accent Prediction

Discourse-Driven Pitch Accent Prediction Jennifer Moore Department of Computational Linguistics Universität des Saarlandes Thesis submitted for the degree of Master of Science 2009 Advisors Magdalena Wolska Bistra Andreeva 2 Acknowledgements First and foremost, I would like to thank Magdalena Wolksa, without whose encouragement, guidance, and support, from the first nebulous ideas, to the dried ink on the very last page, this thesis would not have been possible. Everyconversationwithherwasajumpstarttomyflaggingspirits, inwhich ideas would spiral one from another, while her feedback always kept me on the straight and narrow. She also has my gratitude for introducing me to Bistra Andreeva, whose expertise in phonological systems in both English and German was invaluable for such multi-disciplinary investigations. She directed me to many different areas of research in the field, allowing me to narrow in on a clear set of problems to address, and really strengthen the key points of my research. Secondly, there are no words to express my indebtedness to John Hörnqvist, who was always there for me through the best and worst times, who read everydraft, wholistenedpatientlytoeveryideaandeveryproblem. Iwould notbewhereIamtodaywithouthisloveandsupportthesepastmanyyears. Finally, I would like to thank the many friends, faculty, and researchers that have contributed to this thesis in various ways: Dr. Kim Silverman, who firstinspiredmewiththeideaoveracupofcoffeeattheApplecampus;Prof. Dr. Manfred Pinkal, who was my mentor throughout the program; Michael Roth, who provided the German translation of this abstract and gave me feedback on some of my early drafts; and Roland Albrecht, who proof-read the more mathematical sections of this thesis, among many other things, for which I am eternally grateful. Abstract Pitch accent is a component of prosody that is often used to convey in- formation beyond the intrinsic linguistic meaning of a spoken utterance, such as highlighting words that correspond to important information, or in signaling a contrast with information that was previously conveyed. This information-bearing aspect of pitch accent is therefore important for effec- tive communication in spoken applications. Recent work has looked into statistical modeling techniques for automatic pitch accent prediction as a component of speech technologies like Text-to- Speech (TTS). Many of these systems, however, have largely overlooked the dimension of discourse context in driving pitch accent placement; others simply introduced more complex models of discourse-level phenomena into the accent prediction component. We investigate a model for discourse-driven statistical pitch accent predic- tion that makes use of a dynamically-updated semantic space as a means of introducing context-sensitive features into the prediction model. This approach has the advantage of being trainable on a large corpus of unan- notated data, making it less prone to corpus domain bias (i.e. distributions estimated from a given corpus that reflect the genre of that corpus only) inherent in purely probabilistic variables for accent prediction. Moreover, this approach does not require additional modeling of complex discourse processes, but relies solely on shallow analysis of the input text. Abstract Betonungsakzente bilden eine Prosodie-Komponente, die in gesprochener Sprache häufig verwendet wird, um Informationen zu übermitteln, die über intrinsische linguistische Bedeutung hinausgehen. Beispiele hierfür sind das Betonen von Worten, die eine besondere Wichtigheit haben, oder das Sig- nalisieren eines Kontrasts zu vorherigen Informationen. Dieser information- stragende Aspekt von Betonung ist daher wichtig für das effektive Kommu- nizieren in sprachlischen Anwendungen. Neuste Forschungen haben versucht, statistische Modelle zu entwickeln, um automatische Betonungsakzente in sprachtechnologische Anwendungen wie Text-to-Speech-Systeme zu integrieren. Viele dieser Systeme berücksichti- gen allerdings gar keinen Diskurskontext, um Betonungsakzente zu setzen; andere Systeme führen komplexe Diskurs-Modelle ein, um Phänomene auf der Ebene in einer Betonungsvorhersage nachzubilden. IndieserArbeituntersuchenwireinModelzurstatistischen,Diskurs-getriebenen Betonungsvorhersage,indemeindynamischaktualisiertersemantischerRaum als kontext-sensitives Attribut verwendet wird. Diese Vorgehensweise hat den Vorteil, dass das Modell mit einem großen Korpus unannotierter Daten trainiertwerdenkannunddadurchwenigeranfälligfürDomänen-spezifische Tendenzenwird(d.h. dieauseinemKorpusberechnetenVerteilungenspiegeln stets nur das Genre des Korpus wider). Darüber hinaus ist dieser Ansatz nichtaufeinzusätzlichesModellierenvonkomplexenDiskursprozessenangewiesen, sondern bedient sich lediglich einer flachen Analyse des Eingabetexts. Contents 1 Introduction 2 1.1 Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.5 Overview of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2 Theoretical Foundations 9 2.1 Accent and Prominence in Spoken Language . . . . . . . . . . . . . . . . 10 2.2 Information and Accent . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.3 Computational Models of Accent Variation. . . . . . . . . . . . . . . . . 34 3 Model Design 40 3.1 The Stochastic Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 3.2 Feature Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 3.2.1 Semantic Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.2.2 Local and Global Discourse . . . . . . . . . . . . . . . . . . . . . 52 3.2.3 Discourse Structure. . . . . . . . . . . . . . . . . . . . . . . . . . 55 4 Data Acquisition 57 4.1 Corpora for CRF Training . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.1.1 The MULI Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.1.2 The IViE Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4.2 Corpora for LSA Training . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.2.1 The Wikipedia XML Corpus . . . . . . . . . . . . . . . . . . . . 68 iv CONTENTS 5 Experimental Setup 69 5.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 5.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 6 Conclusions 80 References 92 v List of Figures 2.1 Spectral analysis and fundamental frequency of the word May . . . . . . 11 2.2 Prince’s taxonomy of given/new information . . . . . . . . . . . . . . . . 31 3.1 A linear-chain CRF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2 Word and document factors as coordinates in a two-dimensional space . 51 4.1 The complete MULI transcription of ‘Exporte in den Libanon sichert Bonn derzeit nur kurzfristig ab’ . . . . . . . . . . . . . . . . . . . . . . . 60 4.2 The complete IViE transcription of ‘We arrived in a limo’ . . . . . . . . 66 vi The English teacher W. Jones, on attending a concert in Japan, heard his Japanese host announce, “Tonight, we shall be singing a programme of French SONGS and German SONGS; I am sorry that we shall not be singing any English SONGS.” To which address, the baffled Jones mused, “His grammar was faultless, his pronunciation unambiguous, but since I knew when I bought my ticket that the business of choirs is to sing songs, I wondered, if only for a moment, why he had stressed the word ‘songs’: what was this meant to tell us?” – Studies in Culture Chapter 1 Introduction Pitchaccent,acomponentofprosodyinspokennaturallanguage,canbeusedtoconvey information beyond the intrinsic linguistic meaning of a spoken utterance. This form of accentuation in speech is often employed as a means of alerting the listener to new or important information within an utterance, or in signaling a contrast to information which was previously conveyed. In such cases, accent is clearly situated within the context of the immediate discourse, and in the absence of discourse, can even evoke a specific context. For example, the accent in the utterance ‘JOHN went to the store’ presumesadiscoursecontextinwhichthequestion‘WHOwenttothestore?’ wasposed. In this way, accents can be simultaneously information-bearing and discourse-bound. Itisthisinformation-bearingnatureofaccentthatmakesitakeyelementofeffective spoken communication, in itself an important goal of human-computer information systems. In many applications of speech technology, proper accenting strategies are not only integral to the correct interpretation of a message, but misplaced accents can cause miscomprehension and confusion on the part of the listener. The classic example is that of an in-car spoken navigation system. In a situation in which the driver comes upontworoadsofthesamename, butsuffixeddifferently, aqualifyingaccenton‘Ocean AVENUE’ would signal to the driver the importance of the street kind, whereas an accent in ‘OCEAN avenue’ might cause the driver to derail to the nearest road of that name. Unfortunately, while much theoretical work has been done to investigate the dimen- sionofdiscourseinpitchaccentplacement, accentmodelingforpracticalapplicationsof speech technology have largely overlooked it. In particular, incorporating informational 2

Description:
pitch accent prediction as a component of speech technologies like Text-to- Although these systems have become the de-facto standard for prosodic annotation, .. In this case, 'books' is a referent to 'Slaughterhouse-Five', not through
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.