
Foundations of Data Exchange PDF

346 Pages·2014·1.986 MB·English

Preview Foundations of Data Exchange

FOUNDATIONS OF DATA EXCHANGE

The problem of exchanging data between different databases with different schemas is an area of immense importance. Consequently data exchange has been one of the most active research topics in databases over the past decade. Foundational questions related to data exchange largely revolve around three key problems: how to build target solutions; how to answer queries over target solutions; and how to manipulate schema mappings themselves? The last question is also known under the name “metadata management”, since mappings represent metadata, rather than data in the database.

In this book the authors summarize the key developments of a decade of research. Part I introduces the problem of data exchange via examples, both relational and XML; Part II deals with exchanging relational data; Part III focuses on exchanging XML data; and Part IV covers metadata management.

Marcelo Arenas is an Associate Professor at the Department of Computer Science at the Pontificia Universidad Católica de Chile. He received his Ph.D. from the University of Toronto in 2005. His research interests are in different aspects of database theory, such as expressive power of query languages, database semantics, inconsistency handling, database design, XML databases, data exchange, metadata management and database aspects of the Semantic Web. He has received an IBM Ph.D. Fellowship (2004), seven best paper awards (PODS 2003, PODS 2005, ISWC 2006, ICDT 2010, ESWC 2011, PODS 2011 and WWW 2012) and an ACM-SIGMOD Dissertation Award Honorable Mention in 2006 for his Ph.D. dissertation “Design Principles for XML Data”. He has served on multiple program committees, and since 2009 he has been participating as an invited expert in the World Wide Web Consortium.

Pablo Barceló is an Assistant Professor in the Department of Computer Science at the University of Chile. He received his Ph.D. from the University of Toronto in 2006. His main research interest is in the area of Foundations of Data Management, in particular, query languages, data exchange, incomplete databases, and, recently, graph databases. He has served on program committees of some of the major conferences in database theory and the theoretical aspects of Computer Science (PODS, ICDT, CIKM, STACS, SIGMOD).

Leonid Libkin is Professor of Foundations of Data Management in the School of Informatics at the University of Edinburgh. He was previously a Professor at the University of Toronto and a member of research staff at Bell Laboratories in Murray Hill. He received his PhD from the University of Pennsylvania in 1994. His main research interests are in the areas of data management and applications of logic in computer science. He has written four books and over 150 technical papers. He was the recipient of a Marie Curie Chair Award from the EU in 2006, and won four best paper awards. He has chaired programme committees of major database conferences (ACM PODS, ICDT) and was the conference chair of the 2010 Federated Logic Conference. He has given many invited conference talks and has served on multiple program committees and editorial boards. He is an ACM fellow and a fellow of the Royal Society of Edinburgh.

Filip Murlak is assistant professor at the Faculty of Mathematics, Informatics, and Mechanics at the University of Warsaw, Poland. Previously he was research fellow at the University of Edinburgh. He received his PhD from the University of Warsaw in 2008. His main research areas are automata theory and semi-structured data. He was the recipient of the best paper award at ICALP 2006, the Witold Lipski Prize for young researchers in 2008, and the Homing Plus scholarship from the Foundation for Polish Science in 2010. He was co-chair of MFCS 2011 and served on program committees of several database and theoretical computer science conferences.
FOUNDATIONS OF DATA EXCHANGE

MARCELO ARENAS
Pontificia Universidad Católica de Chile

PABLO BARCELÓ
Universidad de Chile

LEONID LIBKIN
University of Edinburgh

FILIP MURLAK
Uniwersytet Warszawski, Poland

University Printing House, Cambridge CB2 8BS, United Kingdom
Published in the United States of America by Cambridge University Press, New York

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107016163

© Marcelo Arenas, Pablo Barceló, Leonid Libkin and Filip Murlak 2014

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2014
Printed in the United Kingdom by CPI Group Ltd, Croydon CR0 4YY

A catalogue record for this publication is available from the British Library
Library of Congress Cataloguing in Publication data
ISBN 978-1-107-01616-3 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

To Magdalena, Marcelito and Vanny, for their love and support. M.A.
To Salvador, Gabi and my parents. P.B.
To my parents. L.L.
To my grandparents, who chose to send their kids to university. F.M.
Contents

Preface

PART ONE GETTING STARTED

1 Data exchange by example
  1.1 A data exchange example
  1.2 Overview of the main tasks in data exchange
  1.3 Data exchange vs data integration
2 Theoretical background
  2.1 Relational database model
  2.2 Query languages
  2.3 Incomplete data
  2.4 Complexity classes
  2.5 Basics of automata theory
3 Data exchange: key definitions
  3.1 Schema mappings
  3.2 Solutions
  3.3 Query answering and rewriting
  3.4 Bibliographic comments

PART TWO RELATIONAL DATA EXCHANGE

4 The problem of relational data exchange
  4.1 Key definitions
  4.2 Key problems
5 Existence of solutions
  5.1 The problem and easy cases
  5.2 Undecidability for st-tgds and target constraints
  5.3 The chase
  5.4 Weak acyclicity of target constraints
  5.5 Complexity of the problem
6 Good solutions
  6.1 Universal solutions
  6.2 Existence of universal solutions
  6.3 Canonical universal solution and chase
  6.4 The core
7 Query answering and rewriting
  7.1 Answering relational calculus queries
  7.2 Answering conjunctive queries
  7.3 Conjunctive queries with inequalities
  7.4 Tractable query answering with negation
  7.5 Rewritability over special solutions
  7.6 Non-rewritability tool: locality
8 Alternative semantics
  8.1 Universal solutions semantics
  8.2 Closed-world semantics
  8.3 Closed-world semantics and target constraints
  8.4 Clopen-world semantics
9 Endnotes to Part Two
  9.1 Summary
  9.2 Bibliographic comments
  9.3 Exercises

PART THREE XML DATA EXCHANGE

10 The problem of XML data exchange
  10.1 XML documents and schemas
  10.2 Key problems of XML data exchange
11 Patterns and mappings
  11.1 Tree patterns: classification and complexity
  11.2 XML schema mappings and their complexity
12 Building solutions
  12.1 Building solutions revisited
  12.2 A simple exhaustive search algorithm
  12.3 Nested-relational DTDs
  12.4 The algorithm for regular schemas
  12.5 The general algorithm
  12.6 Combined complexity of solution building
