
Andrea Calì, Maria-Esther Vidal (Eds.) AMW 2015 Alberto Mendelzon International Workshop on ...

231 Pages·2015·4.71 MB·English

Andrea Calì, Maria-Esther Vidal (Eds.)

AMW 2015
Alberto Mendelzon International Workshop on Foundations of Data Management
Lima, Perú, May 6th-8th, 2015
Proceedings

Copyright © 2015 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors. Re-publication of material from this volume requires permission by the copyright owners.

Editors' addresses:
Birkbeck College, University of London, UK
[email protected]
Universidad Simón Bolívar, Department of Computer Science, Valle de Sartenejas, Caracas 1086, Venezuela
[email protected]

Preface

The Alberto Mendelzon Workshop (AMW) is a Latin American initiative started in 2006 to honor the memory of Alberto Mendelzon, who made a significant contribution to the field of data management. The AMW is a research venue for top-quality research on foundations of data management; while focused especially on Latin American students and scholars, the workshop is open to submissions from anywhere, and it has so far gathered some of the world's best researchers in the field.

This volume contains papers accepted at the 9th edition of the AMW, held in Lima, Perú, from the 6th to the 8th of May, 2015. The call for papers of this edition requested two types of submissions: regular and short papers, where the latter were intended to present ongoing research, results previously published, or data management applications. With around 40 submissions, we were able to put together an exciting program, which gave rise to discussions and new ideas. Despite the beauties of Lima, including its sophisticated cuisine and drinks, the attendees and the organizers worked hard and made the AMW 2015 a great success, both from the scientific and the organizational point of view. The school, which took place before the workshop, provided local students with tutorials whose quality was on par with those at the best conferences in the area.
The invited talks at the workshop were also certainly world-class and given by leading researchers: Guy van den Broeck (KU Leuven, Belgium), Wagner Meira Jr (Universidade Federal de Minas Gerais, Brazil), Dan Olteanu (University of Oxford, UK), and Victor Vianu (UC San Diego, USA).

For what we are proud to call a very exciting scientific event, we would like to warmly thank, in no particular order: the authors of the papers, the Program Committee and the external reviewers, the local organizers, the AMW School committee, the Steering Committee, the General Chair, and the local supporting institutions. Without the effort of all the above, the AMW 2015 could not have been successful.

We are also looking forward to the next editions of the Alberto Mendelzon Workshop, which has become an established and high-quality venue in the area of Data Management.

Lima, May 2015
Andrea Calì¹
Maria-Esther Vidal

¹ Andrea Calì acknowledges support from the EPSRC grant "Logic-based Integration and Querying of Unindexed Data" (EP/E010865/1).

General Chair
Pablo Barceló (Universidad de Chile, Chile)

PC Chairs
Andrea Calì (Birkbeck College, University of London, UK)
Maria-Esther Vidal (Universidad Simón Bolívar, Venezuela)

Program Committee
Cristina Dutra De Aguiar Ciferri (Universidade de Sao Paulo, Brazil)
Oscar Corcho (Universidad Politecnica de Madrid, Spain)
Isabel Cruz (University of Illinois at Chicago, USA)
Amelie Gheerbrant (Universite Paris VII Denis Diderot, France)
Parke Godfrey (York University, Canada)
Sergio Greco (Universita della Calabria, Italy)
Claudio Gutierrez (Universidad de Chile)
Mauricio A. Hernandez-Sherrington (IBM Research Almaden, USA)
Aidan Hogan (Universidad de Chile)
Elizabeth Leon (Universidad Nacional de Colombia)
Jorge Lobo (ICREA and Universidad Pompeu Fabra, Spain)
Maria Vanina Martinez (Universidad Nacional del Sur, Argentina)
Filip Murlak (University of Warsaw, Poland)
Mauricio Osorio (Universidad de las Americas Puebla, Mexico)
Reinhard Pichler (Vienna University of Technology, Austria)
Andreas Pieris (Vienna University of Technology, Austria)
Alessandro Provetti (University of Messina, Italy)
Jarek Szlichta (University of Ontario, Canada)
Regina Paola Ticona Herrera (Universite de Pau et des Pays de l'Adour, France)
Riccardo Torlone (Universita Roma Tre)
Peter Wood (Birkbeck, University of London, UK)

Steering Committee
Ricardo Baeza-Yates (Yahoo Research, Spain)
Pablo Barcelo (Universidad de Chile, Chile)
Leopoldo Bertossi (Carleton University, Canada)
Mariano Consens (University of Toronto, Canada)
Alberto H. F. Laender (Universidade Federal de Minas Gerais, Brazil)
Jorge Perez (Universidad de Chile, Chile)

Local Supporting Institutions
Universidad Mayor de San Marcos
• Facultad de Ingeniería de Sistemas e Informática
• Facultad de Ciencias Matemáticas
Universidad Ricardo Palma
• Facultad de Ingeniería Informática
Universidad Católica de San Pablo
• Escuela Profesional de Ciencia de la Computación
Sociedad Peruana de Computación
IEEE - Sección Perú
Colegio de Matemáticos de Perú

Contents

A Database Framework for Classifier Engineering
Benny Kimelfeld and Christopher Ré  1

Extending Datalog with Analytics in LogicBlox
Molham Aref, Benny Kimelfeld, Emir Pasalic and Nikolaos Vasiloglou  6

Towards Reconciling SPARQL and Certain Answers (Extended Abstract)
Shqiponja Ahmetaj, Wolfgang Fischl, Reinhard Pichler, Mantas Simkus and Sebastian Skritek  12

Efficient Evaluation of Well-designed Pattern Trees (Extended Abstract)
Pablo Barcelo, Reinhard Pichler and Sebastian Skritek  18

Approximation Algorithms for Schema-Mapping Discovery from Data Examples
Balder ten Cate, Phokion G. Kolaitis, Kun Qian and Wang-Chiew Tan  24

IMGpedia: A Proposal to Enrich DBpedia with Image Meta-Data
Benjamin Bustos and Aidan Hogan  35

From Classical to Consistent Query Answering under Existential Rules
Thomas Lukasiewicz, Maria Vanina Martinez, Andreas Pieris and Gerardo I. Simari  40

Finding Similar Products in E-commerce Sites Based on Attributes
Urique Hoffmann, Altigran da Silva, and Moises Carvalho  46

Entity Matching: A Case Study in the Medical Domain
Luiz F. M. Carvalho, Alberto H. F. Laender and Wagner Meira Jr.  57

Using Statistics for Computing Joins with MapReduce
Theresa Csar, Reinhard Pichler, Emanuel Sallinger and Vadim Savenkov  69

TriAL-QL: Distributed Processing of Navigational Queries
Martin Przyjaciel-Zablocki, Alexander Schatzle and Adrian Lange  75

On Axiomatization and Inference Complexity over a Hierarchy of Functional Dependencies
Jaroslaw Szlichta, Lukasz Golab and Divesh Srivastava  79

PPDL: Probabilistic Programming with Datalog
Balder ten Cate, Benny Kimelfeld and Dan Olteanu  91

Implementing Graph Query Languages over Compressed Data Structures: A Progress Report
Nicolás Lehmann and Jorge Pérez  96

Tractable Query Answering and Optimization for Extensions of Weakly-Sticky Datalog±
Mostafa Milani and Leopoldo Bertossi  101

Saturation, Definability, and Separation for XPath on Data Trees
Sergio Abriol, Mara Emilia Descotte, and Santiago Figueira  106

Random-Walk Closeness Centrality Satisfies Boldi-Vigna Axioms
Ricardo Mora and Claudio Gutierrez  110

Exploiting Semantics to Predict Potential Novel Links from Dense Subgraphs
Alejandro Flores, Maria-Esther Vidal and Guillermo Palma  121

On the CALM Principle for BSP Computation
Matteo Interlandi and Letizia Tanca  131

Chase Termination for Guarded Existential Rules
Marco Calautti, Georg Gottlob and Andreas Pieris  142

Data integration with many heterogeneous sources and dynamic target schemas (extended abstract)
Luigi Bellomarini, Paolo Atzeni and Luca Cabibbo  148

Rewriting-based Check of Chase Termination
Marco Calautti, Sergio Greco, Cristian Molinaro and Irina Trubitsyna  156

Navigational Queries Based on Frontier-Guarded Datalog: Preliminary Results
Meghyn Bienvenu, Magdalena Ortiz and Mantas Simkus  162

LDQL: A Language for Linked Data Queries
Olaf Hartig  172

Intuitionistic Data Exchange
Gosta Grahne, Ali Moallemi and Adrian Onet  184

A preliminary investigation into SPARQL query complexity and federation in Bio2RDF
Carlos Buil-Aranda, Martin Ugarte, Marcelo Arenas and Michel Dumontier  196

Keyword Search in the Deep Web
Andrea Cali, Davide Martinenghi and Riccardo Torlone  205

Implementing Data-Centric Dynamic Systems over a Relational DBMS
Diego Calvanese, Marco Montali, Fabio Patrizi and Andrey Rivkin  209

Disentangling the Notion of Dataset in SPARQL
Daniel Hernandez and Claudio Gutierrez  213

A Database Framework for Classifier Engineering

Benny Kimelfeld¹ and Christopher Ré²
¹ LogicBlox, Inc. and Technion, Israel
² Stanford University

1 Introduction

In the design of machine-learning solutions, a critical and often the most resource-intensive task is that of feature engineering [7,4], for which recipes and tooling have been developed [3,7]. In this vision paper we embark on the establishment of database foundations for feature engineering. We propose a formal framework for classification, in the context of a relational database, towards investigating the application of database and knowledge management to assist with the task of feature engineering. We demonstrate the usefulness of this framework by formally defining two key algorithmic challenges within it: (1) separability refers to determining the existence of feature queries that agree with the given training examples, and (2) identifiability is the task of testing for the property of independence among features (given as queries). Moreover, we give preliminary results on these challenges, in the context of conjunctive queries. We focus here on boolean features that are represented as ordinary database queries, and view this work as the basis of various future extensions such as numerical features and more general regression tasks.

2 Formal Framework

We first present our formal framework for classification with binary features within a relational database.

2.1 Classifiers and Learning

In this work, a classifier is a function of the form

$$\gamma : \{-1,1\}^n \to \{-1,1\}$$

where $n$ is a natural number that we call the arity of $\gamma$. A classifier class is a (possibly infinite) family $\Gamma$ of classifiers. We denote by $\Gamma_n$ the restriction of $\Gamma$ to the $n$-ary classifiers in $\Gamma$. An $n$-ary training collection is a multiset $T$ of pairs $\langle x, y \rangle$ where $x \in \{-1,1\}^n$ and $y \in \{-1,1\}$. We denote by $\mathcal{T}_n$ the set of all $n$-ary training collections. A cost function for a classifier class $\Gamma$ is a function of the form

$$c : \bigcup_n (\Gamma_n \times \mathcal{T}_n) \to \mathbb{R}_{\geq 0}$$

where $\mathbb{R}_{\geq 0}$ is the set of nonnegative numbers. In the context of a classifier class $\Gamma$ and a cost function $c$, learning a classifier is the task of finding a classifier $\gamma \in \Gamma_n$ that minimizes $c(\gamma, T)$, given a training collection $T \in \mathcal{T}_n$.

We illustrate the above definitions on the important class of linear classifiers. An $n$-ary linear classifier is parameterized by a vector $w \in \mathbb{R}^n$, is denoted by $\Lambda_w$, and is defined as follows for all $a \in \{-1,1\}^n$:

$$\Lambda_w(a) \stackrel{\text{def}}{=} \begin{cases} 1 & \text{if } a \cdot w \geq 0; \\ -1 & \text{otherwise,} \end{cases}$$

where "$\cdot$" denotes the operation of dot product. By $\mathrm{Lin}$ we denote the class of linear classifiers. An example of a cost function is the least square cost $\mathrm{lsq}$ that is given by

$$\mathrm{lsq}(\Lambda_w, T) \stackrel{\text{def}}{=} \sum_{\langle x, y \rangle \in T} (x \cdot w - y)^2$$

for the arguments $\Lambda_w \in \mathrm{Lin}_n$ and $T \in \mathcal{T}_n$.

2.2 Relational Formalism

Our relational terminology is as follows. A schema is a pair $(\mathcal{A}, \Sigma)$, where $\mathcal{A}$ is a signature that consists of relation symbols, and $\Sigma$ is a set of logical integrity constraints over $\mathcal{A}$. Each relation symbol $R$ has an associated arity. We assume an infinite set $\mathsf{Const}$ of constants. An instance $I$ over a schema $S = (\mathcal{A}, \Sigma)$ associates with every $k$-ary relation symbol $R \in \mathcal{A}$ a finite subset $R^I$ of $\mathsf{Const}^k$, such that all the constraints of $\Sigma$ are satisfied. The active domain of an instance $I$, denoted $\mathrm{adom}(I)$, is the set of all the constants in $\mathsf{Const}$ that are mentioned in $I$.

Let $S$ be a schema. A query (over $S$) is a function $Q$ that is associated with an arity $k$, and that maps every relation instance $I$ over $S$ into a finite subset $Q(I)$ of $\mathsf{Const}^k$. A query $Q'$ contains a query $Q$ (written $Q \subseteq Q'$) if $Q(I) \subseteq Q'(I)$ for all instances $I$ over $S$; if $Q \subseteq Q'$ and $Q' \subseteq Q$ then $Q$ and $Q'$ are said to be equivalent. A query $Q$ is additive if for every two instances $I_1$ and $I_2$, if $\mathrm{adom}(I_1)$ and $\mathrm{adom}(I_2)$ are disjoint, then

$$Q(I_1 \cup I_2) = Q(I_1) \cup Q(I_2).$$

A query class is a mapping that associates with every schema $S$ a class of queries over $S$. An example of a query class is that of the conjunctive queries. A conjunctive query (CQ) is represented by the logical formula $q(\mathbf{x})$ that has the form

$$\exists \mathbf{y}\, [\varphi_1(\mathbf{x}, \mathbf{y}, \mathbf{d}) \wedge \cdots \wedge \varphi_m(\mathbf{x}, \mathbf{y}, \mathbf{d})]$$

where $\mathbf{x}$ and $\mathbf{y}$ are disjoint sequences of variables, $\mathbf{d}$ is a sequence of constants, and each $\varphi_i$ is an atomic query over $S$ (i.e., a formula that consists of a single relation symbol and no logical operators). The result of applying the CQ $Q = q(\mathbf{x})$ to the instance $I$ consists of all the tuples $\mathbf{a}$ (of the same length as $\mathbf{x}$) such that $q(\mathbf{a})$ is true in $I$; we denote this result by $Q(I)$.
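As an illustration of the relational definitions in Section 2.2, the following minimal Python sketch evaluates a conjunctive query over a small instance and computes its active domain. The encoding is an assumption made only for this example and does not come from the paper: an instance is a dictionary from relation names to sets of tuples, variables are strings starting with "?", integrity constraints are ignored, and the names `adom`, `match`, `evaluate_cq`, and the `Parent` relation are hypothetical.

```python
from itertools import chain

# Hypothetical encoding (not from the paper): an instance is a dict from relation
# names to finite sets of constant tuples; strings starting with "?" are variables.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def adom(instance):
    """Active domain: all constants mentioned in some tuple of the instance."""
    return set(chain.from_iterable(chain.from_iterable(instance.values())))

def match(terms, fact, binding):
    """Extend binding so that the atom's terms equal fact, or return None."""
    if len(terms) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(terms, fact):
        if is_var(term):
            if extended.setdefault(term, value) != value:
                return None  # variable already bound to a different constant
        elif term != value:
            return None      # constant in the atom does not match the fact
    return extended

def evaluate_cq(head, atoms, instance):
    """Q(I): tuples a for the head variables such that q(a) is true in I.
    Assumes every head variable occurs in at least one atom."""
    def search(i, binding):
        if i == len(atoms):
            yield tuple(binding[v] for v in head)
            return
        relation, terms = atoms[i]
        for fact in instance.get(relation, ()):
            extended = match(terms, fact, binding)
            if extended is not None:
                yield from search(i + 1, extended)
    return set(search(0, {}))

# q(x) = ∃y [Parent(x, y) ∧ Parent(y, "cat")]
parent_instance = {"Parent": {("ann", "bob"), ("bob", "cat"), ("bob", "dan")}}
grandparent_of_cat = [("Parent", ("?x", "?y")), ("Parent", ("?y", "cat"))]
result = evaluate_cq(("?x",), grandparent_of_cat, parent_instance)
print(result)  # {('ann',)}
```

The backtracking search binds the existential variable "?y" without returning it, which mirrors the role of the quantifier in the formula; a real database engine would use joins and indexes instead.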
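The linear classifier and least square cost of Section 2.1 can be sketched in the same way. This is a minimal sketch under stated assumptions: a training collection is represented as a Python list of (x, y) pairs, and "learning" is shown only as picking the cheapest vector from a small, hand-written candidate list, which is an illustrative shortcut and not a method described in the paper. All function and variable names are hypothetical.

```python
def dot(a, w):
    """Dot product of two equal-length sequences."""
    return sum(ai * wi for ai, wi in zip(a, w))

def linear_classifier(w):
    """Return the classifier Λ_w: 1 if a·w >= 0, and -1 otherwise."""
    def classify(a):
        return 1 if dot(a, w) >= 0 else -1
    return classify

def lsq(w, training):
    """Least square cost of Λ_w on a training collection of (x, y) pairs."""
    return sum((dot(x, w) - y) ** 2 for x, y in training)

# A 2-ary training collection, represented as a list (a multiset of pairs).
training = [((1, 1), 1), ((-1, 1), -1), ((1, -1), 1)]

w = (1.0, -0.5)
classify = linear_classifier(w)
labels = [classify(x) for x, _ in training]
print(labels)            # [1, -1, 1]
print(lsq(w, training))  # 0.75

# Illustration only: choose the lowest-cost vector among a few fixed candidates.
candidates = [(1.0, -0.5), (0.0, 1.0), (1.0, 0.0)]
best = min(candidates, key=lambda v: lsq(v, training))
print(best)              # (1.0, 0.0)
```

Note that the cost depends on the real-valued score x·w and not only on the predicted label: the vector (1.0, -0.5) labels every training example correctly yet still has cost 0.75, while (1.0, 0.0) reaches cost 0.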
