Table of Contents

Progress in Speech Synthesis

Jan P.H. van Santen, Richard W. Sproat, Joseph P. Olive, Julia Hirschberg
Editors

With 158 Illustrations

Springer

Jan P.H. van Santen, Bell Laboratories, Room 2D-452, 600 Mountain Avenue, Murray Hill, NJ 07974-0636 USA
Richard W. Sproat, Bell Laboratories, Room 2D-451, 600 Mountain Avenue, Murray Hill, NJ 07974-0636 USA
Joseph P. Olive, Bell Laboratories, Room 2D-447, 600 Mountain Avenue, Murray Hill, NJ 07974-0636 USA
Julia Hirschberg, AT&T Research, Room 2C-409, 600 Mountain Avenue, Murray Hill, NJ 07974-0636 USA

Library of Congress Cataloging-in-Publication Data
Progress in speech synthesis / Jan P.H. van Santen ... [et al.], editors.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-4612-7328-8
ISBN 978-1-4612-1894-4 (eBook)
DOI 10.1007/978-1-4612-1894-4
1. Speech synthesis. I. Santen, Jan P.H. van.
TK7882.S65S6785 1996
006.5'4-dc20 96-10596

Printed on acid-free paper.

© 1997 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 1997
Softcover reprint of the hardcover 1st edition 1997

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher, Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Natalie Johnson; manufacturing supervised by Jacqui Ashri.
Camera-ready copy prepared using LaTeX.
Printed and bound by Edwards Brothers, Inc., Ann Arbor, MI.
Additional material to this book can be downloaded from http://extras.springer.com.

9 8 7 6 5 4 3 2

ISBN 978-1-4612-7328-8

Preface

Text-to-speech synthesis involves the computation of a speech signal from input text. Accomplishing this requires a system that consists of an astonishing range of components, from abstract linguistic analysis of discourse structure to speech coding. Several implications flow from this fact. First, text-to-speech synthesis is inherently multidisciplinary, as is reflected by the authors of this book, whose backgrounds include engineering, multiple areas of linguistics, computer science, mathematical psychology, and acoustics. Second, progress in these research areas is uneven because the problems faced vary so widely in difficulty. For example, producing flawless pitch accent assignment for complex, multi-paragraph textual input is extremely difficult, whereas producing decent segmental durations given that all else has been computed correctly is not very difficult. Third, the only way to summarize research in all areas relevant for TTS is in the form of a multi-authored book; no single person, or even small group of persons, has a sufficiently broad scope.

The most important goal of this book is, of course, to provide an overview of these research areas by having invited key players in each area to contribute to this book. This will give the reader a complete picture of what the challenges and solutions are and in what directions researchers are moving.

But an important second goal is to allow the reader to judge what all this work adds up to: the scientific sophistication may be impressive, but does it really produce good synthetic speech? The book attempts to answer this question in two ways. First, we have asked authors to include results from subjective evaluations in their chapter whenever possible. There is also a special section on perception and evaluation in the book. Second, the book contains a CD-ROM disk with samples of several of the synthesizers discussed.
Both the successes and the failures are of interest, the latter in particular because it is unlikely that samples are included demonstrating major flaws in a system. We thank Christian Benoit for suggesting this idea.

A brief note on the history of this volume: In 1990, the First ESCA Workshop on Text-to-Speech Synthesis was organized in Autrans, France. The organizers of this workshop, Gerard Bailly and Christian Benoit, felt that there was a need for a book containing longer versions of papers from the workshop proceedings, resulting in Talking Machines. In 1994, the editors of the current volume organized the Second ESCA/IEEE/AAAI Workshop on Text-to-Speech Synthesis and likewise decided that a book was necessary that would present the work reported in the proceedings in a more complete, updated, and polished form. To ensure the highest possible quality of the chapters for the current volume, we asked members of the scientific committee of the workshop to select workshop papers for possible inclusion. The editors added their selections, too. We then invited those authors whose work received an unambiguous endorsement from this process to contribute to the book.

Finally, we want to thank the many people who have contributed: the scientific committee members who helped with the selection process; fourteen anonymous reviewers; Bernd Moebius for work on stubborn figures; Alice Greenwood for editorial assistance; Mike Tanenblatt and Juergen Schroeter for processing the speech and video files; David Yarowsky for providing the index; Thomas von Foerster and Kenneth Dreyhaupt at Springer-Verlag for expediting the process; Cathy Hopkins for administrative assistance; and Bell Laboratories for its encouragement of this work.

Jan P. H. van Santen
Richard W. Sproat
Joseph P. Olive
Julia Hirschberg
Murray Hill, New Jersey
October 1995

Contents

Preface
Contributors

I Signal Processing and Source Modeling

1 Section Introduction. Recent Approaches to Modeling the Glottal Source for TTS
  Dan Kahn, Marian J. Macchi
  1.1 Modeling the Glottal Source: Introduction
  1.2 Alternatives to Monopulse Excitation
  1.3 A Guide to the Chapters
  1.4 Summary

2 Synthesizing Allophonic Glottalization
  Janet B. Pierrehumbert, Stefan Frisch
  2.1 Introduction
  2.2 Experimental Data
  2.3 Synthesis Experiments
  2.4 Contribution of Individual Source Parameters
  2.5 Discussion
  2.6 Summary

3 Text-to-Speech Synthesis with Dynamic Control of Source Parameters
  Luis C. Oliveira
  3.1 Introduction
  3.2 Source Model
  3.3 Analysis Procedure
  3.4 Analysis Results
  3.5 Conclusions

4 Modification of the Aperiodic Component of Speech Signals for Synthesis
  Gael Richard, Christophe R. d'Alessandro
  4.1 Introduction
  4.2 Speech Signal Decomposition
  4.3 Aperiodic Component Analysis and Synthesis
  4.4 Evaluation
  4.5 Speech Modifications
  4.6 Discussion and Conclusion

5 On the Use of a Sinusoidal Model for Speech Synthesis in Text-to-Speech
  Miguel Angel Rodriguez Crespo, Pilar Sanz Velasco, Luis Monzon Serrano, Jose Gregorio Escalada Sardina
  5.1 Introduction
  5.2 Overview of the Sinusoidal Model
  5.3 Sinusoidal Analysis
  5.4 Sinusoidal Synthesis
  5.5 Simplification of the General Model
  5.6 Parameters of the Simplified Sinusoidal Model
  5.7 Fundamental Frequency and Duration Modifications
  5.8 Analysis and Resynthesis Experiments
  5.9 Conclusions

II Linguistic Analysis

6 Section Introduction. The Analysis of Text in Text-to-Speech Synthesis
  Richard W. Sproat

7 Language-Independent Data-Oriented Grapheme-to-Phoneme Conversion
  Walter M. P. Daelemans, Antal P. J. van den Bosch
  7.1 Introduction
  7.2 Design of the System
  7.3 Related Approaches
  7.4 Evaluation
  7.5 Conclusion

8 All-Prosodic Speech Synthesis
  Arthur Dirksen, John S. Coleman
  8.1 Introduction
  8.2 Architecture
  8.3 Polysyllabic Words
  8.4 Connected Speech
  8.5 Summary

9 A Model of Timing for Nonsegmental Phonological Structure
  John Local, Richard Ogden
  9.1 Introduction
  9.2 Syllable Linkage and Its Phonetic Interpretation in YorkTalk
  9.3 The Description and Modeling of Rhythm
  9.4 Comparison of the Output of YorkTalk with Natural Speech and Synthesis
  9.5 Conclusion

10 A Complete Linguistic Analysis for an Italian Text-to-Speech System
  Giuliano Ferri, Piero Pierucci, Donatella Sanzone
  10.1 Introduction
  10.2 The Morphologic Analysis
  10.3 The Phonetic Transcription
  10.4 The Morpho-Syntactic Analysis
  10.5 Performance Assessment
  10.6 Conclusion

11 Discourse Structural Constraints on Accent in Narrative
  Christine H. Nakatani
  11.1 Introduction
  11.2 The Narrative Study
  11.3 A Discourse-Based Interpretation of Accent Function
  11.4 Discourse Functions of Accent
  11.5 Discussion
  11.6 Conclusion

12 Homograph Disambiguation in Text-to-Speech Synthesis
  David Yarowsky
  12.1 Problem Description
  12.2 Previous Approaches
  12.3 Algorithm
  12.4 Decision Lists for Ambiguity Classes
  12.5 Evaluation
  12.6 Discussion and Conclusions

III Articulatory Synthesis and Visual Speech

13 Section Introduction. Talking Heads in Speech Synthesis
  Dominic W. Massaro, Michael M. Cohen

14 Section Introduction. Articulatory Synthesis and Visual Speech
  Juergen Schroeter
  14.1 Bridging the Gap Between Speech Science and Speech Applications

15 Speech Models and Speech Synthesis
  Mary E. Beckman
  15.1 Theme and Some Examples
  15.2 A Decade and a Half of Intonation Synthesis
  15.3 Models of Time
  15.4 Valediction

16 A Framework for Synthesis of Segments Based on Pseudoarticulatory Parameters
  Corine A. Bickley, Kenneth N. Stevens, David R. Williams
  16.1 Background and Introduction
  16.2 Control Parameters and Mapping Relations
  16.3 Examples of Synthesis from HL Parameters
  16.4 Toward Rules for Synthesis

17 Biomechanical and Physiologically Based Speech Modeling
  Reiner F. Wilhelms-Tricarico, Joseph S. Perkell
  17.1 Introduction
  17.2 Articulatory Synthesizers
  17.3 A Finite Element Tongue Model
  17.4 The Controller
  17.5 Conclusions

18 Analysis-Synthesis and Intelligibility of a Talking Face
  Bertrand Le Goff, Thierry Guiard-Marigny, Christian Benoit
  18.1 Introduction
  18.2 The Parametric Models
  18.3 Video Analysis
  18.4 Real-Time Analysis-Synthesis
  18.5 Intelligibility of the Models
  18.6 Conclusion

19 3D Models of the Lips and Jaw for Visual Speech Synthesis
  Thierry Guiard-Marigny, Ali Adjoudani, Christian Benoit
  19.1 Introduction
  19.2 The 2D Model of the Lips

Progress in Speech Synthesis PDF

590 Pages·1997·13.488 MB·English

by Dan Kahn, Marian J. Macchi (auth.), Jan P. H. van Santen, Joseph P. Olive, Richard W. Sproat, Julia Hirschberg (eds.)

Detailed Information

Author:	Dan Kahn, Marian J. Macchi (auth.), Jan P. H. van Santen, Joseph P. Olive, Richard W. Sproat, Julia Hirschberg (eds.)
Publication Year:	1997
Pages:	590
Language:	English
File Size:	13.488 MB
Format:	PDF
Price:	FREE
