ebook img

Computational Linguistics and Intelligent Text Processing: 14th International Conference, CICLing 2013, Samos, Greece, March 24-30, 2013, Proceedings, Part II PDF

598 Pages·2013·9.673 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Computational Linguistics and Intelligent Text Processing: 14th International Conference, CICLing 2013, Samos, Greece, March 24-30, 2013, Proceedings, Part II

Alexander Gelbukh (Ed.) Computational Linguistics 7 1 and Intelligent 8 7 S C Text Processing N L 14th International Conference, CICLing 2013 Samos, Greece, March 2013 Proceedings, Part II 123 Lecture Notes in Computer Science 7817 CommencedPublicationin1973 FoundingandFormerSeriesEditors: GerhardGoos,JurisHartmanis,andJanvanLeeuwen EditorialBoard DavidHutchison LancasterUniversity,UK TakeoKanade CarnegieMellonUniversity,Pittsburgh,PA,USA JosefKittler UniversityofSurrey,Guildford,UK JonM.Kleinberg CornellUniversity,Ithaca,NY,USA AlfredKobsa UniversityofCalifornia,Irvine,CA,USA FriedemannMattern ETHZurich,Switzerland JohnC.Mitchell StanfordUniversity,CA,USA MoniNaor WeizmannInstituteofScience,Rehovot,Israel OscarNierstrasz UniversityofBern,Switzerland C.PanduRangan IndianInstituteofTechnology,Madras,India BernhardSteffen TUDortmundUniversity,Germany MadhuSudan MicrosoftResearch,Cambridge,MA,USA DemetriTerzopoulos UniversityofCalifornia,LosAngeles,CA,USA DougTygar UniversityofCalifornia,Berkeley,CA,USA GerhardWeikum MaxPlanckInstituteforInformatics,Saarbruecken,Germany Alexander Gelbukh (Ed.) Computational Linguistics and Intelligent Text Processing 14th International Conference, CICLing 2013 Samos, Greece, March 24-30, 2013 Proceedings, Part II 1 3 VolumeEditor AlexanderGelbukh NationalPolytechnicInstitute CenterforComputingResearch Av.JuanDiosBátiz,Col.NuevaIndustrialVallejo 07738MexicoD.F.,Mexico ISSN0302-9743 e-ISSN1611-3349 ISBN978-3-642-37255-1 e-ISBN978-3-642-37256-8 DOI10.1007/978-3-642-37256-8 SpringerHeidelbergDordrechtLondonNewYork LibraryofCongressControlNumber:2013933372 CRSubjectClassification(1998):H.3,H.4,F.1,I.2,H.5,H.2.8,I.5 LNCSSublibrary:SL1–TheoreticalComputerScienceandGeneralIssues ©Springer-VerlagBerlinHeidelberg2013 Thisworkissubjecttocopyright.Allrightsarereserved,whetherthewholeorpartofthematerialis concerned,specificallytherightsoftranslation,reprinting,re-useofillustrations,recitation,broadcasting, reproductiononmicrofilmsorinanyotherway,andstorageindatabanks.Duplicationofthispublication orpartsthereofispermittedonlyundertheprovisionsoftheGermanCopyrightLawofSeptember9,1965, inistcurrentversion,andpermissionforusemustalwaysbeobtainedfromSpringer.Violationsareliable toprosecutionundertheGermanCopyrightLaw. Theuseofgeneraldescriptivenames,registerednames,trademarks,etc.inthispublicationdoesnotimply, evenintheabsenceofaspecificstatement,thatsuchnamesareexemptfromtherelevantprotectivelaws andregulationsandthereforefreeforgeneraluse. Typesetting:Camera-readybyauthor,dataconversionbyScientificPublishingServices,Chennai,India Printedonacid-freepaper SpringerispartofSpringerScience+BusinessMedia(www.springer.com) Preface CICLing 2013 was the 14th Annual Conference on Intelligent Text Processing and Computational Linguistics. The CICLing conferences provide a wide-scope forumfordiscussionofthe artandcraftofnaturallanguageprocessingresearch as well as the best practices in its applications. This set of two books contains four invited papers and a selection of regular papers accepted for presentation at the conference. Since 2001, the proceedings of the CICLing conferences have been published in Springer’s Lecture Notes in Computer Science series as volume numbers 2004,2276,2588,2945,3406,3878, 4394, 4919,5449, 6008, 6608, 6609, 7181,and 7182. The set has been structured into 12 sections: – General Techniques – Lexical Resources – Morphology and Tokenization – Syntax and Named Entity Recognition – Word Sense Disambiguation and Coreference Resolution – Semantics and Discourse – Sentiment, Polarity, Emotion, Subjectivity, and Opinion – Machine Translation and Multilingualism – Text Mining, Information Extraction, and Information Retrieval – Text Summarization – Stylometry and Text Simplification – Applications The 2013eventreceivedarecordhighnumberofsubmissionsinthe 14-yearhis- toryoftheCICLingseries.Atotalof354papersby788authorsfrom55countries weresubmittedforevaluationbytheInternationalProgramCommittee;seeFig- ure 1 and Tables 1 and 2. This two-volume set contains revised versions of 87 regularpapersselectedforpresentation;thustheacceptancerateforthissetwas 24.6%. The book features invited papers by – Sophia Ananiadou, University of Manchester, UK – Walter Daelemans, University of Antwerp, Belgium – Roberto Navigli, Sapienza University of Rome, Italy – Michael Thelwall, University of Wolverhampton, UK who presented excellent keynote lectures at the conference. Publication of full- text invited papers in the proceedings is a distinctive feature of the CICLing conferences. Furthermore, in addition to presentation of their invited papers, the keynote speakers organized separate vivid informal events; this is also a distinctive feature of this conference series. VI Preface Table 1. Numberof submissions and accepted papers by topic1 Accepted Submitted% accepted Topic 18 75 24 Text mining 18 64 28 Semantics, pragmatics, discourse 17 80 21 Information extraction 17 67 25 Lexical resources 14 44 32 Other 14 35 40 Emotions, sentiment analysis, opinion mining 13 40 33 Practical applications 11 52 21 Information retrieval 11 51 22 Machine translation and multilingualism 8 30 27 Syntaxand chunking 7 40 17 Underresourced languages 7 39 18 Clustering and categorization 6 23 26 Summarization 5 32 16 Morphology 5 24 21 Word sense disambiguation 5 19 26 Named entity recognition 4 20 20 Noisy text processing and cleaning 4 17 24 Social networksand microblogging 4 13 31 Natural language generation 3 11 27 Coreference resolution 3 9 33 Natural language interfaces 3 8 38 Question answering 2 23 9 Formalisms and knowledge representation 2 18 11 POS tagging 2 2 100 Computational humor 1 11 9 Speech processing 1 11 9 Computational terminology 1 8 12 Spelling and grammar checking 1 3 33 Textualentailment 1 Asindicatedbytheauthors.Apapermaybelongtoseveraltopics. With this event we continued with our policy of giving preference to papers with verifiable and reproducible results: in addition to the verbal description of their findings given in the paper, we encouraged the authors to provide a proof of their claims in electronic form. If the paper claimed experimental re- sults, we asked the authors to make available to the community all the input data necessary to verify and reproduce these results; if it claimed to introduce an algorithm, we encouragedthe authors to make the algorithm itself, in a pro- gramming language, available to the public. This additional electronic material willbe permanentlystoredonthe CICLing’sserver,www.CICLing.org,andwill be available to the readers of the corresponding paper for download under a license that permits its free use for researchpurposes. Inthelongrunweexpectthatcomputationallinguisticswillhaveverifiability and clarity standards similar to those of mathematics: in mathematics, each Preface VII Table 2. Numberof submitted and accepted papers by countryor region Country Authors Papers2 Country Authors Papers2 or region Subm. Subm. Accp. or region Subm. Subm. Accp. Algeria 4 4 – Malaysia 7 1.67 1 Argentina 3 1 – Malta 1 1 – Australia 3 1 – Mexico 14 6.25 3.25 Austria 1 1 – Moldova 3 1 – Belgium 3 1 1 Morocco 7 4 1 Brazil 13 6.83 2 Netherlands 8 4.50 1 Canada 11 4.53 1.2 New Zealand 5 1.67 – China 57 21.72 3.55 Norway 6 2.92 0.92 Colombia 2 1 1 Pakistan 5 2 – Croatia 5 2 2 Poland 8 3.75 0.75 Czech Rep. 10 5 2 Portugal 9 3 – Egypt 22 11.67 1 Qatar 2 0.67 – Finland 2 0.67 – Romania 14 9.67 2 France 64 25.9 5.65 Russia 15 4.75 1 Georgia 1 1 0.5 Singapore 5 2.25 0.25 Germany 32 13.92 6.08 Slovakia 2 1 – Greece 21 6.12 2.12 Spain 39 15.50 8.75 HongKong 9 2.53 0.2 Sweden 2 2 – Hungary 12 6 – Switzerland 8 3.83 1.33 India 98 49.2 5.6 Taiwan 1 1 – Iran 14 11.33 – Tunisia 24 11 2 Ireland 6 4.5 1.5 Turkey 11 6.25 3.25 Italy 22 11.37 4.5 Ukraine 2 1.25 0.50 Japan 48 20.5 5 UAE 1 0.33 – Kazakhstan 10 3.75 – UK 35 15.73 5.20 Korea, South 7 3 – USA 54 18.98 8.90 Latvia 6 2 1 Viet Nam 8 3.50 – Macao 6 2 – Total: 788 354 87 2 By the numberof authors: e.g., a paper bytwo authors from theUSA and one from UK is counted as 0.67 for the USAand 0.33 for UK. claim is accompanied by a complete and verifiable proof (usually much longer thantheclaimitself);eachtheorem’scompleteandpreciseproof—andnotjusta vaguedescriptionofits generalidea—ismadeavailableto thereader.Electronic mediaallowcomputationallinguists toprovidematerialanalogousto the proofs andformulasinmathematics infull length—whichcanamountto megabytesor gigabytesofdata—separatelyfroma 12-pagedescriptionpublished inthe book. More information can be found on www.CICLing.org/why verify.htm. Toencourageprovidingalgorithmsanddataalongwiththepublishedpapers, we selected a winner of our Verifiability, Reproducibility, and Working Descrip- tionAward.Themainfactorsinchoosingtheawardedsubmissionweretechnical correctness and completeness, readability of the code and documentation, sim- plicity of installation and use, and exact correspondence to the claims of the VIII Preface Fig.1. Submissions by country or region. The area of a circle represents the number of submitted papers. paper.Unnecessarysophisticationofthe user interfacewasdiscouraged;novelty and usefulness of the results were not evaluated—instead, they were evaluated forthepaperitselfandnotforthedata.Thisyear’swinningpaperwaspublished in a separate proceedings volume and is not included in this set. ThefollowingpapersreceivedtheBestPaperAwards,theBestStudentPaper Award, as well as the Verifiability, Reproducibility, and Working Description Award, correspondingly (the best student paper was selected among papers of whichthefirstauthorwasafull-timestudent,excludingthepapersthatreceived a Best Paper Award): 1st Place: Automatic Detection of Idiomatic Clauses, byAnna Feldmanand Jing Peng, USA; 2nd Place: Topic-Oriented Words as Features for Named Entity Recognition, by Ziqi Zhang, Trevor Cohn, and Fabio Ciravegna,UK; 3rd Place: Five Languages are Better than One: An Attempt to Bypass the Data Acquisition Bottleneck for WSD, by Els Lefever, Veronique Hoste, and Martine De Cock, Belgium; Student: Domain Adaptation in Statistical Machine Translation Using Comparable Corpora: Case Study for English-Latvian IT Local- isation, by Ma¯rcisPinnis,Inguna Skadin,a,andAndrejs Vasil,jevs, Latvia; Verifiability: Linguistically-Driven Selection of Correct Arcs for Dependency Parsing, by Felice Dell’Orletta, Giulia Venturi, and Simonetta Montemagni, Italy. The authors of the awarded papers (except for the Verifiability Award) were given extended time for their presentations. In addition, the Best Presentation Award and the Best Poster Award winners were selected by a ballot among the attendees of the conference. Besides its high scientific level, one of the success factors of CICLing confer- ences is their excellent cultural program.The attendees of the conference had a chancetovisituniquehistoricalplaces:theGreekislandofSamos,thebirthplace Preface IX of Pythagoras (Pythagorean theorem!), Aristarchus (who first realized that the Earth rotates around the Sun and not vice versa), and Epicurus (one of the founders of the scientific method); the Greek island of Patmos, where John the Apostle received his visions of the Apocalypse; and the huge and magnificent archeologicalsite ofEphesusinTurkey,wherestoodtheTemple ofArtemis,one of the Seven Wonders of the World (destroyed by Herostratus), and where the Virgin Mary is believed to have spent the last years of her life. Iwouldliketothankallthoseinvolvedintheorganizationofthisconference. In the first place these are the authors of the papers that constitute this book: it is the excellence of their researchworkthat gives value to the book andsense to the work of all other people. I thank all those who served on the Program Committee, Software Reviewing Committee, Award Selection Committee, as well as additional reviewers, for their hard and very professional work. Special thanksgotoTedPedersen,AdamKilgarriff,ViktorPekar,KenChurch,Horacio Rodriguez, Grigori Sidorov, and Thamar Solorio for their invaluable support in the reviewing process. Iwouldliketothanktheconferencestaff,volunteers,andthemembersofthe localorganizationcommitteeheadedbyDr.EfstathiosStamatatos.Inparticular, we are grateful to Dr. Ergina Kavallieratou for her great effort in planning the cultural program and Mrs. Manto Katsiani for her invaluable secretarial and logistics support. We are deeply grateful to the Department of Information and Communication Systems Engineering of the University of the Aegean for its generoussupportandsponsorship.SpecialthanksgototheUnionofVinicultural Cooperatives of Samos (EOSS), A. Giannoulis Ltd., and the Municipality of Samos for their kind sponsorship. We also acknowledge the support received fromtheprojectWIQ-EI(FP7-PEOPLE-2010-IRSES:WebInformationQuality Evaluation Initiative). The entire submission and reviewing process was supported for free by the EasyChairsystem(www.EasyChair.org).Lastbutnotleast,I deeplyappreciate the Springerstaff’spatience andhelpineditingthese volumesandgettingthem printedinrecordshorttime—itisalwaysagreatpleasuretoworkwithSpringer. February 2013 Alexander Gelbukh Organization CICLing 2013 is hosted by the University of the Aegean and is organized by the CICLing 2013 Organizing Committee in conjunction with the Natural Lan- guage and Text Processing Laboratory of the CIC (Centro de Investigacio´n en Computacio´n) of the IPN (Instituto Polit´ecnico Nacional), Mexico. Organizing Chair Efstathios Stamatatos Organizing Committee Efstathios Stamatatos (chair) Ergina Kavallieratou Manolis Maragoudakis Program Chair Alexander Gelbukh Program Committee Ajith Abraham Dan Cristea Marianna Apidianaki Walter Daelemans Bogdan Babych Anna Feldman Ricardo Baeza-Yates Alexander Gelbukh (chair) Kalika Bali Gregory Grefenstette Sivaji Bandyopadhyay Eva Hajicova Srinivas Bangalore Yasunari Harada Leslie Barrett Koiti Hasida Roberto Basili Iris Hendrickx Anja Belz Ales Horak Pushpak Bhattacharyya Veronique Hoste Igor Boguslavsky Nancy Ide Anto´nio Branco Diana Inkpen Nicoletta Calzolari Hitoshi Isahara Nick Campbell Sylvain Kahane Michael Carl Alma Kharrat Ken Church Adam Kilgarriff

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.