ACL 2007

Proceedings of the Second Workshop on Statistical Machine Translation

June 23, 2007
Prague, Czech Republic

The Association for Computational Linguistics

Production and Manufacturing by
Omnipress
2600 Anderson Street
Madison, WI 53704
USA

© 2007 Association for Computational Linguistics

Order copies of this and other ACL proceedings from:

Association for Computational Linguistics (ACL)
209 N. Eighth Street
Stroudsburg, PA 18360
USA
Tel: +1-570-476-8006
Fax: +1-570-476-0860
[email protected]

Preface

The ACL 2007 Workshop on Statistical Machine Translation (WMT-07) took place on Saturday, June 23 in Prague, Czech Republic, immediately preceding the annual meeting of the Association for Computational Linguistics, which was hosted by Charles University.

This was the second time this workshop had been held, following the first workshop at the 2006 HLT-NAACL conference. Its ancestry, however, can be traced back further, to the ACL 2005 Workshop on Building and Using Parallel Texts (when we started our evaluation campaign on European languages), and even to the ACL 2001 Workshop on Data-Driven Machine Translation (which was the first ACL workshop mostly directed at statistical machine translation).

Over the last few years, interest in statistical machine translation has risen dramatically. We received an overwhelming number of full paper submissions for a one-day workshop, 38 in total. Given our limited capacity, we were only able to accept 12 full papers for oral presentation and 9 papers for poster presentation, an acceptance rate of 55%. In a second poster session, 16 additional shared task papers were presented. The workshop also featured an invited talk by Jean Senellart of SYSTRAN Language Translation Technology, Paris.

Prior to the workshop, in addition to soliciting relevant papers for review and possible presentation, we conducted a shared task that brought together machine translation systems for an evaluation on previously unseen data.
This year's task resembled the shared tasks of previous years in many ways. Its focus was again the translation of European languages, using a relatively large training corpus. This year, we included a variety of manual evaluations of the MT systems' outputs, and a variety of automated evaluation metrics. Also, as a special challenge this year, we posed the problem of domain adaptation.

The results of the shared task were announced at the workshop, and these proceedings also include an overview paper for the shared task that summarizes the results and provides information about the data used and any procedures that were followed in conducting or scoring the task. In addition, there are short papers from each participating team that describe their underlying system in some detail.

We would like to thank the members of the Program Committee for their timely reviews. We also would like to thank the participants of the shared task, the participants of the MT Marathon, which was organized by the University of Edinburgh in March this year, and all the other volunteers who helped with the manual evaluations. We also acknowledge financial support for the manual evaluation by the EuroMatrix project (funded by the European Commission under the Framework Programme 6).
Chris Callison-Burch, Philipp Koehn, Christof Monz, and Cameron Shaw Fordyce
Co-Organizers

Organizers

Chairs:
Chris Callison-Burch (Johns Hopkins University)
Philipp Koehn (University of Edinburgh)
Christof Monz (Queen Mary, University of London)
Cameron Shaw Fordyce (Center for the Evaluation of Language and Communication Technologies)

Invited Speaker:
Jean Senellart (SYSTRAN Language Translation Technology, Paris)

Program Committee:
Lars Ahrenberg (Linköping University)
Francisco Casacuberta (University of Valencia)
Colin Cherry (University of Alberta)
Stephen Clark (Oxford University)
Brooke Cowan (Massachusetts Institute of Technology)
Mona Diab (Columbia University)
Chris Dyer (University of Maryland)
Andreas Eisele (University Saarbrücken)
Marcello Federico (ITC-IRST)
George Foster (Canada National Research Council)
Alex Fraser (ISI/University of Southern California)
Ulrich Germann (University of Toronto)
Rebecca Hwa (University of Pittsburgh)
Kevin Knight (ISI/University of Southern California)
Philippe Langlais (University of Montreal)
Alon Lavie (Carnegie Mellon University)
Lori Levin (Carnegie Mellon University)
Daniel Marcu (ISI/University of Southern California)
Bob Moore (Microsoft Research)
Miles Osborne (University of Edinburgh)
Michel Simard (Canada National Research Council)
Eiichiro Sumita (NICT/ATR)
Jörg Tiedemann (University of Groningen)
Christoph Tillmann (IBM Research)
Dan Tufiş (Romanian Academy)
Taro Watanabe (NTT)
Dekai Wu (HKUST)
Richard Zens (RWTH Aachen)

Additional Reviewers:
Joshua Albrecht, Marine Carpuat, Hirofumi Yamamoto, and Keiji Yasuda.
Table of Contents

Using Dependency Order Templates to Improve Generality in Translation
    Arul Menezes and Chris Quirk

CCG Supertags in Factored Statistical Machine Translation
    Alexandra Birch, Miles Osborne and Philipp Koehn

Integration of an Arabic Transliteration Module into a Statistical Machine Translation System
    Mehdi M. Kashani, Eric Joanis, Roland Kuhn, George Foster and Fred Popowich

Exploring Different Representational Units in English-to-Turkish Statistical Machine Translation
    Kemal Oflazer and Ilknur Durgar El-Kahlout

Can We Translate Letters?
    David Vilar, Jan-Thorsten Peter and Hermann Ney

A Dependency Treelet String Correspondence Model for Statistical Machine Translation
    Deyi Xiong, Qun Liu and Shouxun Lin

Word Error Rates: Decomposition over POS classes and Applications for Error Analysis
    Maja Popovic and Hermann Ney

Speech-Input Multi-Target Machine Translation
    Alicia Pérez, M. Teresa González, M. Inés Torres and Francisco Casacuberta

Meta-Structure Transformation Model for Statistical Machine Translation
    Jiadong Sun, Tiejun Zhao and Huashen Liang

Training Non-Parametric Features for Statistical Machine Translation
    Patrick Nguyen, Milind Mahajan and Xiaodong He

Using Word-Dependent Transition Models in HMM-Based Word Alignment for Statistical Machine Translation
    Xiaodong He

Efficient Handling of N-gram Language Models for Statistical Machine Translation
    Marcello Federico and Mauro Cettolo

Human Evaluation of Machine Translation Through Binary System Comparisons
    David Vilar, Gregor Leusch, Hermann Ney and Rafael E. Banchs

Labelled Dependencies in Machine Translation Evaluation
    Karolina Owczarzak, Josef van Genabith and Andy Way

An Iteratively-Trained Segmentation-Free Phrase Translation Model for Statistical Machine Translation
    Robert Moore and Chris Quirk

Using Paraphrases for Parameter Tuning in Statistical Machine Translation
    Nitin Madnani, Necip Fazil Ayan, Philip Resnik and Bonnie Dorr

Mixture-Model Adaptation for SMT
    George Foster and Roland Kuhn

(Meta-) Evaluation of Machine Translation
    Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz and Josh Schroeder

Context-aware Discriminative Phrase Selection for Statistical Machine Translation
    Jesús Giménez and Lluís Màrquez

Ngram-Based Statistical Machine Translation Enhanced with Multiple Weighted Reordering Hypotheses
    Marta R. Costa-jussà, Josep M. Crego, Patrik Lambert, Maxim Khalilov, José A. R. Fonollosa, José B. Mariño and Rafael E. Banchs

Analysis of Statistical and Morphological Classes to Generate Weigthed Reordering Hypotheses on a Statistical Machine Translation System
    Marta R. Costa-jussà and José A. R. Fonollosa

Domain Adaptation in Statistical Machine Translation with Mixture Modelling
    Jorge Civera and Alfons Juan

Getting to Know Moses: Initial Experiments on German-English Factored Translation
    Maria Holmqvist, Sara Stymne and Lars Ahrenberg

NRC's PORTAGE System for WMT 2007
    Nicola Ueffing, Michel Simard, Samuel Larkin and Howard Johnson

Building a Statistical Machine Translation System for French Using the Europarl Corpus
    Holger Schwenk

Multi-Engine Machine Translation with an Open-Source SMT Decoder
    Yu Chen, Andreas Eisele, Christian Federmann, Eva Hasler, Michael Jellinghaus and Silke Theison

The ISL Phrase-Based MT System for the 2007 ACL Workshop on Statistical Machine Translation
    Matthias Paulik, Kay Rottmann, Jan Niehues, Silja Hildebrand and Stephan Vogel

Rule-Based Translation with Statistical Phrase-Based Post-Editing
    Michel Simard, Nicola Ueffing, Pierre Isabelle and Roland Kuhn

The "Noisier Channel": Translation from Morphologically Complex Languages
    Christopher J. Dyer

UCB System Description for the WMT 2007 Shared Task
    Preslav Nakov and Marti Hearst

The Syntax Augmented MT (SAMT) System at the Shared Task for the 2007 ACL Workshop on Statistical Machine Translation
    Andreas Zollmann, Ashish Venugopal, Matthias Paulik and Stephan Vogel

Statistical Post-Editing on SYSTRAN's Rule-Based Translation System
    Loïc Dugast, Jean Senellart and Philipp Koehn

Experiments in Domain Adaptation for Statistical Machine Translation
    Philipp Koehn and Josh Schroeder

METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments
    Alon Lavie and Abhaya Agarwal

English-to-Czech Factored Machine Translation
    Ondřej Bojar

Sentence Level Machine Translation Evaluation as a Ranking
    Yang Ye, Ming Zhou and Chin-Yew Lin

Localization of Difficult-to-Translate Phrases
    Behrang Mohit and Rebecca Hwa

Linguistic Features for Automatic Evaluation of Heterogenous MT Systems
    Jesús Giménez and Lluís Màrquez