Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX: Special Issue on Big Data and Open Data

137 Pages·2015·10.619 MB·English


Devis Bianchini · Valeria De Antonellis · Roberto De Virgilio
Guest Editors

Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX

Abdelkader Hameurlain · Josef Küng · Roland Wagner
Editors-in-Chief

Special Issue on Big Data and Open Data

Journal Subline, LNCS 8990

Lecture Notes in Computer Science 8990
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Zürich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany

More information about this series at http://www.springer.com/series/8637

Abdelkader Hameurlain · Josef Küng · Roland Wagner · Devis Bianchini · Valeria De Antonellis · Roberto De Virgilio (Eds.)
Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX
Special Issue on Big Data and Open Data

Editors-in-Chief
Abdelkader Hameurlain, IRIT, Paul Sabatier University, Toulouse, France
Roland Wagner, FAW, University of Linz, Linz, Austria
Josef Küng, FAW, University of Linz, Linz, Austria

Guest Editors
Devis Bianchini, University of Brescia, Brescia, Italy
Roberto De Virgilio, University of Rome III, Rome, Italy
Valeria De Antonellis, University of Brescia, Brescia, Italy

ISSN 0302-9743, ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-662-46561-5, ISBN 978-3-662-46562-2 (eBook)
DOI 10.1007/978-3-662-46562-2
Library of Congress Control Number: 2015932976
Springer Heidelberg New York Dordrecht London
© Springer-Verlag Berlin Heidelberg 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer-Verlag GmbH Berlin Heidelberg is part of Springer Science+Business Media (www.springer.com)

LNCS Transactions on Large-Scale Data- and Knowledge-Centered Systems (TLDKS)
Special Issue on Big Data and Linked Open Data

Linked Data and Big Data have been featured in recent years due to growing interest. Proper use of enabling technologies meant for these two kinds of data is a critical success factor in the evolution of the Web. The Linked Data perspective inspired research efforts for building, maintaining, and exploiting the Web as a global database, where resources are identified (by means of URIs), semantically described (by means of RDF), and connected through RDF links. This perspective goes beyond the potential of Web 2.0, enabling people and applications to discover new linked information in an unexpected way, according to an explorative perspective. Big Data emphasizes the fact that new techniques and infrastructures are required for the sustainable exploitation of a huge amount of data. The Linked Data paradigm is often seen as an approach to coping with Big Data, as it moves the attention from a Web of documents to a Web of rich data.

Nevertheless, the great availability of resources raises data management issues that must be faced in a dynamic, highly distributed, and heterogeneous environment such as the Web: (i) how to model large amounts of (linked) data, (ii) how to query data and reason on them in a feasible way, (iii) how to exploit Big and Linked Data applications in real-world scenarios. To solve these issues means to exploit the synergism between the conceptual foundations of data management and logical foundations of Big and Linked Data initiatives. At the same time, emerging Big Data technologies could be useful in addressing data management issues within the Linked Data context. Among them are modern distributed technologies based on the principles of the CAP theorem, such as NoSQL DBMSs and Map/Reduce data processing.
This Special Issue collects four high-quality papers that aim at investigating Linked Data and Big Data interleaving issues under a data management perspective: (a) two papers propose the application of clustering techniques for performing inference and search over (linked) data sources; (b) a paper leverages graph analysis techniques to enable application-level integration of institutional data; (c) a paper describes an approach for protecting users' profile data from disclosure, tampering, and improper use.

January 2015
Devis Bianchini
Valeria De Antonellis
Roberto De Virgilio

Editorial Board
Reza Akbarinia, Inria, France
Bernd Amann, LIP6 – UPMC, France
Dagmar Auer, FAW, Austria
Stéphane Bressan, National University of Singapore, Singapore
Francesco Buccafurri, Università Mediterranea di Reggio Calabria, Italy
Qiming Chen, HP Lab, USA
Tommaso Di Noia, Politecnico di Bari, Italy
Dirk Draheim, University of Innsbruck, Austria
Johann Eder, Alpen-Adria-Universität Klagenfurt, Austria
Stefan Fenz, Vienna University of Technology, Austria
Georg Gottlob, Oxford University, UK
Anastasios Gounaris, Aristotle University of Thessaloniki, Greece
Theo Härder, Technical University of Kaiserslautern, Germany
Andreas Herzig, IRIT, Paul Sabatier University, France
Hilda Kosorus, FAW, Austria
Dieter Kranzlmüller, Ludwig-Maximilians-Universität München, Germany
Philippe Lamarre, INSA Lyon, France
Lenka Lhotská, Technical University of Prague, Czech Republic
Vladimir Marik, Technical University of Prague, Czech Republic
Mukesh Mohania, IBM Research, India
Franck Morvan, IRIT, Paul Sabatier University, France
Kjetil Nørvåg, Norwegian University of Science and Technology, Norway
Gultekin Ozsoyoglu, Case Western Reserve University, USA
Themis Palpanas, Paris Descartes University, France
Torben Bach Pedersen, Aalborg University, Denmark
Günther Pernul, University of Regensburg, Germany
Klaus-Dieter Schewe, University of Linz, Austria
David Taniar, Monash University, Australia
A Min Tjoa, Vienna University of Technology, Austria
Chao Wang, Oak Ridge National Laboratory, USA

Reviewers
Ladjel Bellatreche, ENSMA, France
Soon Ae Chun, CSI/City University of New York, USA
Mirel Cosulschi, University of Craiova, Romania
Philippe Cudre-Mauroux, University of Fribourg, Switzerland
Peter Haase, Fluid Operations AG, Germany
Prateek Jain, IBM TJ Watson Research Center, USA
Jose-Norberto Mazón, University of Alicante, Spain
Marina Mongiello, Technical University of Bari, Italy
Simon Scerri, Digital Enterprise Research Institute, Galway, Ireland
Steffen Staab, University of Koblenz-Landau, Germany
Wolfram Woess, Johannes Kepler University Linz, Austria
Fouad Zablith, American University of Beirut, Lebanon

Contents
Structure Inference for Linked Data Sources Using Clustering
Klitos Christodoulou, Norman W. Paton, and Alvaro A.A. Fernandes
The Web Within: Leveraging Web Standards and Graph Analysis to Enable Application-Level Integration of Institutional Data
Luiz Gomes Jr. and André Santanchè
Dimensional Clustering of Linked Data: Techniques and Applications
Alfio Ferrara, Lorenzo Genta, Stefano Montanelli, and Silvana Castano
ProProtect3: An Approach for Protecting User Profile Data from Disclosure, Tampering, and Improper Use in the Context of WebID
Stefan Wild, Fabian Wiedemann, Sebastian Heil, Alexey Tschudnowsky, and Martin Gaedke
Author Index

Structure Inference for Linked Data Sources Using Clustering
Klitos Christodoulou, Norman W. Paton, and Alvaro A.A. Fernandes
School of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK
{christodoulou,norm,alvaro}@cs.man.ac.uk

Abstract.
Linked Data (LD) overlays the World Wide Web of documents with a Web of Data. This is becoming significant as shown in the growth of LD repositories available as part of the Linked Open Data (LOD) cloud. At the instance-level, LD sources use a combination of terms from various vocabularies, expressed as RDFS/OWL, to describe data and publish it to the Web. However, LD sources do not organise data to conform to a specific structure analogous to a relational schema; instead data can adhere to multiple vocabularies. Expressing SPARQL queries over LD sources – usually over a SPARQL endpoint that is presented to the user – requires knowledge of the predicates used so as to allow queries to express user requirements as graph patterns. Although LD provides low barriers to data publication using a single language (i.e., RDF), sources organise data with different structures and terminologies. This paper describes an approach to automatically derive structural summaries over instance-level data expressed as RDF triples. The technique builds on a hierarchical clustering algorithm that organises RDF instance-level data into groups that are then utilised to infer a structural summary over a LD source. The resulting structural summaries are expressed in the form of classes, properties, and relationships. Our experimental evaluation shows good results when applied to different types of LD sources.

Keywords: Schema · Linked Data · Clustering · Query formulation

1 Introduction

In recent years there has been a significant growth in the amount of publicly available structured data on the Web using a graph-based representation model and a set of simple principles, the so-called Linked Data Principles [3]. A motivation for the adoption of these principles is the fact that they are based upon established web infrastructures (like URIs and HTTP) and semantic web standards (like RDF and RDFS), thus providing low barriers to data publication.
The adoption of these principles is apparent in the number of Linked Data (LD) repositories that form the Linked Open Data (LOD) cloud¹. An interesting aspect of this kind of Web is that datasets are not only published in isolation

¹ http://lod-cloud.net.

© Springer-Verlag Berlin Heidelberg 2015
A. Hameurlain et al. (Eds.): TLDKS XIX, LNCS 8990, pp. 1–25, 2015.
DOI: 10.1007/978-3-662-46562-2_1
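
As a rough illustration of the idea summarised in the abstract above (grouping RDF instance-level data by hierarchical clustering and reading a structural summary off the groups), the following minimal sketch clusters subjects by the set of predicates they use. It is not the authors' algorithm: the toy triples, the Jaccard distance, the average-linkage rule, the 0.5 threshold, and the function names `predicate_sets`, `cluster`, and `summarise` are all assumptions chosen for illustration.

```python
from itertools import combinations

# Toy RDF triples (subject, predicate, object); every name here is made up.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:book1", "dc:title", "Linked Data"),
    ("ex:book1", "dc:creator", "ex:alice"),
    ("ex:book2", "dc:title", "Big Data"),
    ("ex:book2", "dc:creator", "ex:bob"),
]


def predicate_sets(triples):
    """Map each subject to the set of predicates it appears with."""
    sets = {}
    for subject, predicate, _ in triples:
        sets.setdefault(subject, set()).add(predicate)
    return sets


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two non-empty predicate sets."""
    return 1.0 - len(a & b) / len(a | b)


def cluster(triples, threshold=0.5):
    """Agglomerative, average-linkage clustering of subjects.

    Assumption: merging stops once the closest pair of clusters is
    further apart than `threshold`.
    """
    sets = predicate_sets(triples)
    clusters = [[s] for s in sorted(sets)]

    def linkage(c1, c2):
        dists = [jaccard_distance(sets[a], sets[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def summarise(triples, clusters):
    """For each cluster, list the union of predicates used by its members."""
    sets = predicate_sets(triples)
    return [sorted(set().union(*(sets[s] for s in c))) for c in clusters]


if __name__ == "__main__":
    groups = cluster(TRIPLES)
    for members, properties in zip(groups, summarise(TRIPLES, groups)):
        print(members, "->", properties)
```

On this toy data the two person-like subjects end up in one group and the two book-like subjects in another, and each group's predicate list plays the role of an inferred class with its properties. A summary of this kind is what a user would need in order to write SPARQL graph patterns without knowing the predicates in advance.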
