Table of Contents

COGNITIVE SCIENCE AND KNOWLEDGE MANAGEMENT SERIES

Comparable Corpora and Computer-assisted Translation
Estelle Maryline Delpech

Computer-assisted translation (CAT) has always used translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This can be problematic when the translator does not have such a corpus, for instance, when the text belongs to an emerging field. To solve this issue, CAT research has looked into the leveraging of comparable corpora, i.e. a set of texts, in two or more languages, which deal with the same topic but are not translations of one another.

This work had two primary objectives. The first is to assess the input of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second objective is to identify bilingual-lexicon-extraction methods which best match the translators’ needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations. The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research puts significant emphasis on applicability: methodological choices are guided by the needs of the final users. This book is organized in two parts: the first part presents the applicative and scientific context of the research, and the second part is given over to efforts to improve compositional translation.

The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).

Estelle Maryline Delpech holds a PhD in Computer Science from the University of Nantes in France, where she specialized in natural language processing and computer-aided translation. She is currently Chief Scientist at Nomao, a web and mobile app search engine company. Her research interests include multilingualism, computational linguistics, information extraction and data integration.

www.iste.co.uk

To Elia

Series Editor: Narendra Jussien

First published 2014 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

© ISTE Ltd 2014

The rights of Estelle Maryline Delpech to be identified as the author of this work have been asserted by her in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2014936484
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-689-1
Printed and bound in Great Britain by CPI Group (UK) Ltd., Croydon, Surrey CR0 4YY

Table of Contents

ACKNOWLEDGMENTS
INTRODUCTION

PART 1. APPLICATIVE AND SCIENTIFIC CONTEXT

CHAPTER 1. LEVERAGING COMPARABLE CORPORA FOR COMPUTER-ASSISTED TRANSLATION
1.1. Introduction
1.2. From the beginnings of machine translation to comparable corpora processing
  1.2.1. The dawn of machine translation
  1.2.2. The development of computer-assisted translation
  1.2.3. Drawbacks of parallel corpora and advantages of comparable corpora
  1.2.4. Difficulties of technical translation
  1.2.5. Industrial context
1.3. Term alignment from comparable corpora: a state-of-the-art
  1.3.1. Distributional approach principle
  1.3.2. Term alignment evaluation
  1.3.3. Improvement and variants of the distributional approach
  1.3.4. The influence of data and parameters on alignment quality
  1.3.5. Limits of the distributional approach
1.4. CAT software prototype for comparable corpora processing
  1.4.1. Implementation of a term alignment method
  1.4.2. Terminological records extraction
  1.4.3. Lexicon consultation interface
1.5. Summary

CHAPTER 2. USER-CENTERED EVALUATION OF LEXICONS EXTRACTED FROM COMPARABLE CORPORA
2.1. Introduction
2.2. Translation quality evaluation methodologies
  2.2.1. Machine translation evaluation
  2.2.2. Human translation evaluation
  2.2.3. Discussion
2.3. Design and experimentation of a user-centered evaluation
  2.3.1. Methodological aspects
  2.3.2. Experimentation protocol
  2.3.3. Results
2.4. Discussion

CHAPTER 3. AUTOMATIC GENERATION OF TERM TRANSLATIONS
3.1. Introduction
3.2. Compositional approaches
  3.2.1. Compositional translation principle
  3.2.2. Polylexical units compositional translation
  3.2.3. Monolexical units compositional translation
  3.2.4. Candidate translation filtering
3.3. Data-driven approaches
  3.3.1. Analogy-based translation
  3.3.2. Rewriting rules learning
  3.3.3. Dealing with morphological variation
3.4. Evaluation of term translator generation methods
3.5. Research perspectives

PART 2. CONTRIBUTIONS TO COMPOSITIONAL TRANSLATION

CHAPTER 4. MORPH-COMPOSITIONAL TRANSLATION: METHODOLOGICAL FRAMEWORK
4.1. Introduction
4.2. Morpho-compositional translation method
  4.2.1. Scientific positioning
  4.2.2. Definitions and terminology
  4.2.3. Underlying assumptions
  4.2.4. Advantages of the proposed approach for processing comparable corpora
4.3. Addressed issues and contributions
  4.3.1. Generating fertile translations
  4.3.2. Dealing with diverse morphological structures
  4.3.3. Candidate translations ranking
4.4. Evaluation methodology
  4.4.1. A priori reference
  4.4.2. A posteriori reference
4.5. Conclusion

CHAPTER 5. EXPERIMENTAL DATA
5.1. Introduction
5.2. Comparable corpora
5.3. Source terms
5.4. Reference data for translation generation evaluation
  5.4.1. A priori reference
  5.4.2. A posteriori reference
5.5. Translation ranking training and evaluation data
5.6. Linguistic resources
  5.6.1. General language bilingual dictionary
  5.6.2. Thesaurus
  5.6.3. Bound morphemes translation table
  5.6.4. Lexicon for word decomposition
  5.6.5. Morphological families
  5.6.6. Dictionary of cognates
5.7. Summary

CHAPTER 6. FORMALIZATION AND EVALUATION OF CANDIDATE TRANSLATION GENERATION
6.1. Introduction
6.2. Translation generation algorithm
  6.2.1. Decomposition
  6.2.2. Translation
  6.2.3. Recomposition
  6.2.4. Selection
6.3. Morphological splitting evaluation
6.4. Translation generation evaluation
  6.4.1. Reference data and evaluation measures
  6.4.2. Model genericity influence
  6.4.3. Linguistic resources influence
  6.4.4. Fallback strategy influence
  6.4.5. Fertile translations influence
  6.4.6. Popular science corpus influence
  6.4.7. Qualitative analysis
6.5. Discussion
  6.5.1. Findings
  6.5.2. Research perspectives

CHAPTER 7. FORMALIZATION AND EVALUATION OF CANDIDATE TRANSLATION RANKING
7.1. Introduction
7.2. Ranking criteria
  7.2.1. Context similarity
  7.2.2. Candidate translation frequency
  7.2.3. Parts-of-speech translation probability
  7.2.4. Components translation mode
7.3. Criteria combination
  7.3.1. Value standardization
  7.3.2. Linear combination
  7.3.3. Learning-to-rank model
7.4. Evaluation
  7.4.1. Reference data and evaluation measures
  7.4.2. Bases of comparison
  7.4.3. Results
7.5. Discussion
  7.5.1. Findings
  7.5.2. Research perspectives

CONCLUSION AND PERSPECTIVES

PART 3. APPENDICES

APPENDIX 1. MEASURES
APPENDIX 2. DATA
APPENDIX 3. COMPARABLE CORPORA LEXICONS CONSULTATION INTERFACE

LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS
LIST OF EXTRACTS
BIBLIOGRAPHY
INDEX

Comparable corpora and computer-assisted translation PDF

305 Pages·2014·2.182 MB·English

by Delpech, Estelle Maryline



Detailed Information

Author:	Delpech, Estelle Maryline
Publication Year:	2014
ISBN:	1848216890
Pages:	305
Language:	English
File Size:	2.182 MB
Format:	PDF
Price:	FREE



Frequently Asked Questions

Is it really free to download Comparable corpora and computer-assisted translation PDF?

Yes, on https://PDFdrive.to you can download Comparable corpora and computer-assisted translation by Delpech, Estelle Maryline completely free. We don't require any payment, subscription, or registration to access this PDF file, for 3 books every day.

How can I read Comparable corpora and computer-assisted translation on my mobile device?

After downloading Comparable corpora and computer-assisted translation PDF, you can open it with any PDF reader app on your phone or tablet. We recommend using Adobe Acrobat Reader, Apple Books, or Google Play Books for the best reading experience.

Is this the full version of Comparable corpora and computer-assisted translation?

Yes, this is the complete PDF version of Comparable corpora and computer-assisted translation by Delpech, Estelle Maryline. You will be able to read the entire content as in the printed version without missing any pages.

Is it legal to download Comparable corpora and computer-assisted translation PDF for free?

https://PDFdrive.to provides links to free educational resources available online. We do not store any files on our servers. Please be aware of copyright laws in your country before downloading.

The materials shared are intended for research, educational, and personal use in accordance with fair use principles.