Data Management and Query Processing in Semantic Web Databases PDF

273 Pages·2011·11.762 MB·English

Preview Data Management and Query Processing in Semantic Web Databases

Data Management and Query Processing in Semantic Web Databases
Sven Groppe

Institute of Information Systems
University of Lübeck
Ratzeburger Allee 160 (Building 64 - 2nd level)
23562 Lübeck
Germany
groppe@ifis.uni-luebeck.de

ISBN 978-3-642-19356-9
e-ISBN 978-3-642-19357-6
DOI 10.1007/978-3-642-19357-6
Springer Heidelberg Dordrecht London New York
ACM Computing Classification (1998): H.2, H.3, I.2
Library of Congress Control Number: 2011926984

© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: deblik
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Contents

1 Introduction
    1.1 Main Target Group of the Book
    1.2 Prerequisites Needed to Understand the Book
    1.3 Content
    1.4 Logical Organization of the Book
    1.5 Structure of the Chapters and Book Webpage

2 Semantic Web
    2.1 Introduction
    2.2 Overview
    2.3 RDF Data
        2.3.1 N3 Notation
        2.3.2 RDF/XML
    2.4 Ontology Languages
    2.5 Open World Assumption
    2.6 No Unique Name Assumption
    2.7 SPARQL Query Language
        2.7.1 Language Constructs of SPARQL
        2.7.2 SPARQL Protocol for RDF
        2.7.3 SPARQL Query Results XML Format
        2.7.4 RDF Stores
    2.8 Rules
    2.9 Related Work
        2.9.1 RIF Processing
        2.9.2 Optimizations for Recursive Rules
    2.10 Summary and Conclusions

3 External Sorting and B+-Trees
    3.1 Motivation
    3.2 B+-Trees
        3.2.1 Properties of B+-Trees
        3.2.2 Self-balancing Property of B+-Trees
        3.2.3 Searching
        3.2.4 Prefix Search in Combination with Sideways Information Passing
        3.2.5 Inserting
        3.2.6 Deleting
        3.2.7 B+-Tree Construction from a Large Dataset
    3.3 Heap
    3.4 (External) Merge Sort
    3.5 Replacement Selection
    3.6 External Chunks Merge Sort
    3.7 Distribution Sort
    3.8 RDF Distribution Sort
    3.9 Experimental Analysis
        3.9.1 SP2B Dataset
        3.9.2 Yago Dataset
    3.10 Summary and Conclusions

4 Query Processing Overview
    4.1 The LUPOSDATE System
    4.2 Phases of Query Processing
    4.3 CoreSPARQL
        4.3.1 Defining CoreSPARQL
        4.3.2 Transforming SPARQL Queries into CoreSPARQL Queries
        4.3.3 CoreSPARQL Grammar
    4.4 Related Work
    4.5 Summary and Conclusions

5 Logical Optimization
    5.1 Logical Algebra
        5.1.1 Semantics of the Logical Algebra Operators
    5.2 Logical Optimization Rules
        5.2.1 Pushing FILTER Operators
        5.2.2 Splitting and Commutativity of FILTER Operators
        5.2.3 Constant and Variable Propagation
        5.2.4 Heuristic Query Optimization Using Equivalency Rules
        5.2.5 Cost-Based Optimization
        5.2.6 Histograms
    5.3 Further Related Work
    5.4 Summary and Conclusions

6 Physical Optimization
    6.1 Motivation
    6.2 Related Work
    6.3 Indexing
        6.3.1 Building In-Memory Indices
        6.3.2 Building Disk-Based Indices
    6.4 Pipelining Versus Materialization
        6.4.1 Pipeline-Breaker
        6.4.2 Sideways Information Passing
    6.5 Join Algorithms
        6.5.1 Nested-Loop Join
        6.5.2 Merge Join
        6.5.3 Index Join
        6.5.4 Hash Join
    6.6 Dynamically Restricting Triple Patterns
    6.7 Sorting Numbering Scheme
        6.7.1 Joins Without Presorting Numbers
        6.7.2 Joins with Presorting Numbers
        6.7.3 Optimization of Fast Sorting
        6.7.4 Sorting for Complex Joins
        6.7.5 Additional Benefits from SIP Strategies
    6.8 Optional
        6.8.1 MergeOptional
    6.9 Duplicate Elimination
        6.9.1 Duplicate Elimination Using Hashing
        6.9.2 Duplicate Elimination Using Sorting
        6.9.3 Duplicate Elimination Using Presorting Numbers
    6.10 Cost Model
    6.11 Performance Evaluation
        6.11.1 Performance Evaluation for In-memory Databases
        6.11.2 Performance Evaluation for Large-Scale Datasets
    6.12 Summary and Conclusions

7 Streams
    7.1 Introduction
    7.2 eBay
    7.3 Monitoring eBay Auctions
        7.3.1 Monitoring System
        7.3.2 Demonstration
        7.3.3 Streaming SPARQL Engine
    7.4 Special Operators for Stream Processing
        7.4.1 Types of Stream Operators
        7.4.2 Types of Window Operators
    7.5 Related Work
        7.5.1 Data Streams in General
        7.5.2 Semantic Web Data Streams
    7.6 Summary and Conclusions

8 Parallel Databases
    8.1 Motivation
    8.2 Types of Parallelisms
    8.3 Amdahl’s Law
    8.4 Parallel Monitors and Bounded Buffers
    8.5 Parallel Join Using a Distribution Thread
    8.6 Parallel Merge Join Using Partitioned Input
    8.7 Parallel Computation of Operands
    8.8 Performance Evaluation
    8.9 Performance Gains and Loss
    8.10 Summary and Conclusions

9 Inference
    9.1 Introduction
    9.2 RDF Schema Inference Rules
    9.3 Materialization of Inference and Consequences for Query Optimization
    9.4 Logical Optimization for Inference
    9.5 Performance Analysis
    9.6 Related Work
    9.7 Summary and Conclusions

10 Visual Query Languages
    10.1 Motivation
    10.2 Related Work
    10.3 RDF Visual Editor
    10.4 SPARQL Visual Editor
    10.5 Browser-Like Query Creation
    10.6 Generating Condensed Data View
    10.7 Refining Queries
    10.8 Query Formulation Demo
    10.9 Computation of Suggested Triple Patterns for Query Refinement
    10.10 Summary and Conclusions

11 Embedded Languages
    11.1 Motivation
    11.2 Related Work
    11.3 Embedding Semantic Web Languages Into JAVA
        11.3.1 The Type System
        11.3.2 Subtype Test
        11.3.3 Satisfiability Test of Embedded SPARQL and SPARUL Queries
        11.3.4 Determination of the Query Result Types
    11.4 Summary and Conclusions

12 Comparison of the XML and Semantic Web Worlds
    12.1 Introduction
    12.2 Concepts and Visions
    12.3 Data Models
    12.4 Schema and Ontology Languages
    12.5 Query Languages
    12.6 Embedding SPARQL into XQuery/XSLT
        12.6.1 Embedded SPARQL
        12.6.2 Translation Process
        12.6.3 Experimental Analysis
    12.7 Embedding XPath Into SPARQL
        12.7.1 Translation of XPath Subqueries Into SPARQL Queries
        12.7.2 Performance Analysis
    12.8 Related Work
    12.9 Summary and Conclusions

13 Summary, Conclusions, and Future Work
    13.1 Possibilities for Future Work

References

Index
