
Combinatorial Pattern Matching: 11th Annual Symposium, CPM 2000, Montreal, Canada, June 21–23, 2000, Proceedings

434 Pages·2000·6.545 MB·English

Lecture Notes in Computer Science 1848
Edited by G. Goos, J. Hartmanis and J. van Leeuwen

Springer
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo

Raffaele Giancarlo, David Sankoff (Eds.)
Combinatorial Pattern Matching
11th Annual Symposium, CPM 2000
Montreal, Canada, June 21-23, 2000
Proceedings

Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Raffaele Giancarlo
Università di Palermo
Dipartimento di Matematica ed Applicazioni
Via Archirafi 34, 90123 Palermo, Italy
E-mail: [email protected]

David Sankoff
Université de Montréal
Centre de recherches mathématiques
CP 6128 succursale Centre-Ville
Montréal, Québec, Canada H3C 3J7
E-mail: [email protected]

Cataloging-in-Publication data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Combinatorial pattern matching: 11th annual symposium; proceedings / CPM 2000, Montréal, Canada, June 21-23, 2000. Raffaele Giancarlo; David Sankoff (ed.). - Berlin; Heidelberg; New York; Barcelona; Hong Kong; London; Milan; Paris; Singapore; Tokyo: Springer, 2000
(Lecture notes in computer science; Vol. 1848)
ISBN 3-540-67633-3

CR Subject Classification (1998): F.2.2, I.5.4, I.5.0, I.7.3, H.3.3, E.4, G.2.1
ISSN 0302-9743
ISBN 3-540-67633-3 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag is a company in the BertelsmannSpringer publishing group.
© Springer-Verlag Berlin Heidelberg 2000
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Christian Grosche, Hamburg
Printed on acid-free paper
SPIN: 10722094 06/3142 543210

Foreword

The papers contained in this volume were presented at the 11th Annual Symposium on Combinatorial Pattern Matching, held June 21-23, 2000 at the Université de Montréal. They were selected from 44 abstracts submitted in response to the call for papers. In addition, there were invited lectures by Andrei Broder (AltaVista), Fernando Pereira (AT&T Research Labs), and Ian H. Witten (University of Waikato).

The symposium was preceded by a two-day summer school set up to attract and train young researchers. The lecturers at the school were Greg Butler, Clement Lam, and Gus Grahne: BLAST! How do you search sequence databases?; David Bryant: Phylogeny; Raffaele Giancarlo: Algorithmic aspects of speech recognition; Nadia El-Mabrouk: Genome rearrangement; Laxmi Parida: Flexible-pattern discovery; and Ian H. Witten: Adaptive text mining: inferring structure from sequences.

Combinatorial Pattern Matching (CPM) addresses issues of searching and matching strings and more complicated patterns such as trees, regular expressions, graphs, point sets, and arrays. The goal is to derive non-trivial combinatorial properties of such structures and to exploit these properties in order to achieve superior performance for the corresponding computational problems.

Over recent years a steady flow of high-quality research on this subject has changed a sparse set of isolated results into a fully-fledged area of algorithmics.
This area is continuing to grow even further due to the increasing demand for speed and efficiency that comes from important and rapidly expanding applications such as the World Wide Web, computational biology, and multimedia systems, involving requirements for information retrieval, data compression, and pattern recognition. The objective of the annual CPM gatherings is to provide an international forum for research in combinatorial pattern matching and related applications.

The first ten meetings were held in Paris (1990), London (1991), Tucson (1992), Padova (1993), Asilomar (1994), Helsinki (1995), Laguna Beach (1996), Aarhus (1997), Piscataway (1998), and Warwick (1999). After the first meeting, a selection of papers appeared as a special issue of Theoretical Computer Science in Volume 92. The proceedings of the third to tenth meetings appeared as volumes 644, 684, 807, 937, 1075, 1264, 1448, and 1645 of the Springer LNCS series.

The general organization and orientation of CPM conferences is coordinated by a steering committee composed of:

Alberto Apostolico, University of Padova & Purdue University
Maxime Crochemore, Université de Marne-la-Vallée
Zvi Galil, Columbia University
Udi Manber, Yahoo! Inc.

The program committee of CPM 2000 consisted of:

Amihood Amir, Bar Ilan University
Bonnie Berger, MIT
Byron Dom, IBM Almaden
Raffaele Giancarlo, Co-chair, University of Palermo
Dan Gusfield, University of California, Davis
Monika Henzinger, Google, Inc.
John Kececioglu, University of Georgia
Gad Landau, University of Haifa & Polytechnic University
Wojciech Rytter, University of Warsaw & University of Liverpool
Marie-France Sagot, Institut Pasteur
Cenk Sahinalp, Case Western Reserve University
David Sankoff, Co-chair, Université de Montréal
Jim Storer, Brandeis University
Esko Ukkonen, University of Helsinki

The local organizing committee, all from the Université de Montréal, consisted of:

Nadia El-Mabrouk
David Sankoff
Louis Pelletier
Sylvain Viart

The conference was supported by the Centre de recherches mathématiques of the Université de Montréal, in the context of a thematic year on Mathematical Methods in Biology and Medicine (2000-2001).

April 2000
Raffaele Giancarlo
David Sankoff

List of Reviewers

O. Arbel, D. Brown, D. Bryant, C. Constantinescu, C. Cormode, K. Diks, F. Ergun, R. Fagin, P. Ferragina, R. Grossi, R. Kumar, A. Malinowski, D. Modha, M. Nykanen, W. Plandowski, A. Piccolboni, R. Sprugnoli, M. Sciortino, D. Shapira, J. Sharp, L. Stockmeyer

Table of Contents

Invited Lectures

Identifying and Filtering Near-Duplicate Documents
    Andrei Z. Broder
Machine Learning for Efficient Natural-Language Processing
    Fernando Pereira
Browsing around a Digital Library: Today and Tomorrow
    Ian H. Witten

Summer School Lectures

Algorithmic Aspects of Speech Recognition: A Synopsis
    Adam L. Buchsbaum and Raffaele Giancarlo
Some Results on Flexible-Pattern Discovery
    Laxmi Parida

Contributed Papers

Explaining and Controlling Ambiguity in Dynamic Programming
    Robert Giegerich
A Dynamic Edit Distance Table
    Sung-Ryul Kim and Kunsoo Park
Parametric Multiple Sequence Alignment and Phylogeny Construction
    David Fernández-Baca, Timo Seppäläinen, and Giora Slutzki
TsukubaBB: A Branch and Bound Algorithm for Local Multiple Sequence Alignment
    Paul Horton
A Polynomial Time Approximation Scheme for the Closest Substring Problem
    Bin Ma
Approximation Algorithms for Hamming Clustering Problems
    Leszek Gąsieniec, Jesper Jansson, and Andrzej Lingas
Approximating the Maximum Isomorphic Agreement Subtree Is Hard
    Paola Bonizzoni, Gianluca Della Vedova, and Giancarlo Mauri
A Faster and Unifying Algorithm for Comparing Trees
    Ming-Yang Kao, Tak-Wah Lam, Wing-Kin Sung, and Hing-Fung Ting
Incomplete Directed Perfect Phylogeny
    Itsik Pe’er, Ron Shamir, and Roded Sharan
The Longest Common Subsequence Problem for Arc-Annotated Sequences
    Tao Jiang, Guo-Hui Lin, Bin Ma, and Kaizhong Zhang
Boyer-Moore String Matching over Ziv-Lempel Compressed Text
    Gonzalo Navarro and Jorma Tarhio
A Boyer-Moore Type Algorithm for Compressed Pattern Matching
    Yusuke Shibata, Tetsuya Matsumoto, Masayuki Takeda, Ayumi Shinohara, and Setsuo Arikawa
Approximate String Matching over Ziv-Lempel Compressed Text
    Juha Kärkkäinen, Gonzalo Navarro, and Esko Ukkonen
Improving Static Compression Schemes by Alphabet Extension
    Shmuel T. Klein
Genome Rearrangement by Reversals and Insertions/Deletions of Contiguous Segments
    Nadia El-Mabrouk
A Lower Bound for the Breakpoint Phylogeny Problem
    David Bryant
Structural Properties and Tractability Results for Linear Synteny
    David Liben-Nowell and Jon Kleinberg
Shift Error Detection in Standardized Exams
    Steven Skiena and Pavel Sumazin
An Upper Bound for Number of Contacts in the HP-Model on the Face-Centered-Cubic Lattice (FCC)
    Rolf Backofen
The Combinatorial Partitioning Method
    Matthew R. Nelson, Sharon L. Kardia, and Charles F. Sing
Compact Suffix Array
    Veli Mäkinen
Linear Bidirectional On-Line Construction of Affix Trees
    Moritz G. Maaß
Using Suffix Trees for Gapped Motif Discovery
    Emily Rocke
