
Encyclopedia of Big Data Technologies PDF

1820 Pages · 2019 · 46.02 MB · English

Preview Encyclopedia of Big Data Technologies

Sherif Sakr • Albert Y. Zomaya
Editors

Encyclopedia of Big Data Technologies

With 429 Figures and 54 Tables

Editors
Sherif Sakr, Institute of Computer Science, University of Tartu, Tartu, Estonia
Albert Y. Zomaya, School of Information Technologies, Sydney University, Sydney, Australia

ISBN 978-3-319-77524-1
ISBN 978-3-319-77525-8 (eBook)
ISBN 978-3-319-77526-5 (print and electronic bundle)
https://doi.org/10.1007/978-3-319-77525-8
Library of Congress Control Number: 2018960889

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

In the field of computer science, data is considered as the main raw material which is produced by abstracting the world into categories, measures, and other representational forms (e.g., characters, numbers, relations, sounds, images, electronic waves) that constitute the building blocks from which information and knowledge are created. In practice, data generation and consumption has become a part of people’s daily life especially with the pervasive availability of Internet technology. We are progressively moving toward being a data-driven society where data has become one of the most valuable assets. Big data has commonly been characterized by the defining 3V’s properties which refer to huge Volume, consisting of terabytes or petabytes of data; high in Velocity, being created in or near real time; and diversity in Variety of form, being structured and unstructured in nature.

Recently, research communities, enterprises, and government sectors have all realized the enormous potential of big data analytics, and continuous advancements have been emerging in this domain. This Encyclopedia answers the need for solid and comprehensive research source in the domain of Big Data Technologies. The aim of this Encyclopedia is to provide a full picture of various related aspects, topics, and technologies of big data including big data enabling technologies, big data integration, big data storage and indexing, data compression, big data programming models, big SQL systems, big streaming systems, big semantic data processing, graph analytics, big spatial data management, big data analysis, business process analytics, big data processing on modern hardware systems, big data applications, big data security and privacy, and benchmarking of big data systems.
With contributions of many leaders in the field, this Encyclopedia provides the reader with comprehensive reading materials for a large range of audiences. It is a main aim of the Encyclopedia to influence the readers to think further and investigate the areas that are novel to them. The Encyclopedia has been designed to serve as a solid and comprehensive reference not only to expert researchers and software engineers in the field but equally to students and junior researchers as well. The first edition of the Encyclopedia contains more than 250 entries covering a wide range of topics. The Encyclopedia’s entries will be updated regularly to follow the continuous advancement in the domain and have up-to-date coverage available for our readers. It will be available both in print and online versions. We hope that our readers will find this Encyclopedia as a rich and valuable resource and a highly informative reference for Big Data Technologies.

Sherif Sakr
Institute of Computer Science, University of Tartu, Tartu, Estonia

Albert Y. Zomaya
School of Information Technologies, Sydney University, Sydney, Australia

Jan 2019

List of Topics

Big Data Integration
Section Editor: Maik Thiele
Data Cleaning
Data Fusion
Data Integration
Data Lake
Data Profiling
Data Wrangling
ETL
Holistic Schema Matching
Integration-Oriented Ontology
Large-Scale Entity Resolution
Large-Scale Schema Matching
Privacy-Preserving Record Linkage
Probabilistic Data Integration
Record Linkage
Schema Mapping
Truth Discovery
Uncertain Schema Matching

Big SQL
Section Editor: Yuanyuan Tian and Fatma Ozkan
Big Data Indexing
Caching for SQL-on-Hadoop
Cloud-Based SQL Solutions for Big Data
Columnar Storage Formats
Hive
Hybrid Systems Based on Traditional Database Extensions
Impala
Query Optimization Challenges for SQL-on-Hadoop
SnappyData
Spark SQL
Virtual Distributed File System: Alluxio
Wildfire: HTAP for Big Data

Big Spatial Data Management
Section Editor: Timos Sellis and Aamir Cheema
Applications of Big Spatial Data: Health
Architectures
Indexing
Linked Geospatial Data
Query Processing – kNN
Query Processing: Computational Geometry
Query Processing: Joins
Spatial Data Integration
Spatial Data Mining
Spatial Graph Big Data
Spatio-social Data
Spatio-textual Data
Spatiotemporal Data: Trajectories
Streaming Big Spatial Data
Using Big Spatial Data for Planning User Mobility
Visualization

Big Semantic Data Processing
Section Editor: Philippe Cudré-Mauroux and Olaf Hartig
Automated Reasoning
Big Semantic Data Processing in the Life Sciences Domain
Big Semantic Data Processing in the Materials Design Domain
Data Quality and Data Cleansing of Semantic Data
Distant Supervision from Knowledge Graphs
Federated RDF Query Processing
Framework-Based Scale-Out RDF Systems
Knowledge Graph Embeddings
Knowledge Graphs in the Libraries and Digital Humanities Domain
Native Distributed RDF Systems
Ontologies for Big Data
RDF Dataset Profiling
RDF Serialization and Archival
Reasoning at Scale
Security and Privacy Aspects of Semantic Data
Semantic Interlinking
Semantic Search
Semantic Stream Processing
Visualizing Semantic Data

Big Data Analysis
Section Editor: Domenico Talia and Paolo Trunfio
Apache Mahout
Apache SystemML
Big Data Analysis Techniques
Big Data Analysis and IoT
Big Data Analysis for Smart City Applications
Big Data Analysis for Social Good
Big Data Analysis in Bioinformatics
Cloud Computing for Big Data Analysis
Deep Learning on Big Data
Energy Efficiency in Big Data Analysis
Languages for Big Data analysis
Performance Evaluation of Big Data Analysis
Scalable Architectures for Big Data Analysis
Tools and Libraries for Big Data Analysis
Workflow Systems for Big Data Analysis

Big Data Programming Models
Section Editor: Sherif Sakr
BSP Programming Model
Clojure
Julia
Python
Scala
SciDB
R Language: A Powerful Tool for Taming Big Data

Big Data on Modern Hardware Systems
Section Editor: Bingsheng He and Behrooz Parhami
Big Data and Exascale Computing
Computer Architecture for Big Data
Data Longevity and Compatibility
Data Replication and Encoding
Emerging Hardware Technologies
Energy Implications of Big Data
GPU-Based Hardware Platforms
Hardware Reliability Requirements
Hardware-Assisted Compression
Parallel Processing with Big Data
Search and Query Accelerators
Storage Hierarchies for Big Data
Storage Technologies for Big Data
Structures for Large Data Sets
Tabular Computation

Big Data Applications
Section Editor: Kamran Munir and Antonio Pescape
Big Data and Recommendation
Big Data Application in Manufacturing Industry
Big Data Enables Labor Market Intelligence
Big Data Technologies for DNA Sequencing
Big Data Warehouses for Smart Industries
Big Data for Cybersecurity
Big Data for Health
Big Data in Automotive Industry
Big Data in Computer Network Monitoring
Big Data in Cultural Heritage
Big Data in Mobile Networks
Big Data in Network Anomaly Detection
Big Data in Smart Cities
Big Data in Social Networks
Flood Detection Using Social Media Big Data Streams

Enabling Big Data Technologies
Section Editor: Rodrigo Neves Calheiros and Marcos Dias de Assuncao
Apache Spark
Big Data Architectures
Big Data Deep Learning Tools
Big Data Visualization Tools
Big Data and Fog Computing
Big Data in the Cloud
Databases as a Service
Distributed File Systems
Graph Processing Frameworks
Hadoop
Mobile Big Data: Foundations, State of the Art, and Future Directions
Network-Level Support for Big Data Computing
NoSQL Database Systems
Orchestration Tools for Big Data
Visualization Techniques

Big Data Transaction Processing
Section Editor: Mohammad Sadoghi
Active Storage
Blockchain Transaction Processing
Conflict-Free Replicated Data Types CRDTs
Coordination Avoidance
Database Consistency Models
Geo-replication Models
Geo-scale Transaction Processing
Hardware-Assisted Transaction Processing
Hardware-Assisted Transaction Processing: NVM
Hybrid OLTP and OLAP
In-Memory Transactions
Transactions in Massively Multiplayer Online Games
Weaker Consistency Models/Eventual Consistency

Distributed Systems for Big Data
Section Editor: Asterios Katsifodimos and Pramod Bhatotia
Achieving Low Latency Transactions for Geo-replicated Storage with Blotter
Advancements in YARN Resource Manager
Approximate Computing for Stream Analytics
Cheap Data Analytics on Cold Storage
Distributed Incremental View Maintenance
HopsFS: Scaling Hierarchical File System Metadata Using NewSQL Databases
Incremental Approximate Computing
Incremental Sliding Window Analytics
Optimizing Geo-distributed Streaming Analytics
Parallel Join Algorithms in MapReduce
Privacy-Preserving Data Analytics
Robust Data Partitioning
Sliding-Window Aggregation Algorithms
Stream Window Aggregation Semantics and Optimization
StreamMine3G: Elastic and Fault Tolerant Large Scale Stream Processing
TARDiS: A Branch-and-Merge Approach to Weak Consistency

Big Data Security and Privacy
Section Editor: Junjun Chen and Deepak Puthal
Big Data Stream Security Classification for IoT Applications
Big Data and Privacy Issues for Connected Vehicles in Intelligent Transportation Systems
Co-resident Attack in Cloud Computing: An Overview
Data Provenance for Big Data Security and Accountability
Exploring Scope of Computational Intelligence in IoT Security Paradigm
Keyword Attacks and Privacy Preserving in Public-Key-Based Searchable Encryption
Network Big Data Security Issues
Privacy Cube
Privacy-Aware Identity Management
Scalable Big Data Privacy with MapReduce
Secure Big Data Computing in Cloud: An Overview
Security and Privacy in Big Data Environment

Business Process Analytics
Section Editor: Marlon Dumas and Matthias Weidlich
Artifact-Centric Process Mining
Automated Process Discovery
Business Process Analytics
Business Process Deviance Mining
Business Process Event Logs and Visualization
Business Process Model Matching
Business Process Performance Measurement
Business Process Querying
Conformance Checking
Data-Driven Process Simulation
Decision Discovery in Business Processes
Declarative Process Mining
Decomposed Process Discovery and Conformance Checking
Event Log Cleaning for Business Process Analytics
Hierarchical Process Discovery
Multidimensional Process Analytics
Predictive Business Process Monitoring
Process Model Repair
Queue Mining
Streaming Process Discovery and Conformance Checking
Trace Clustering

Big Data Benchmarking
Section Editor: Meikel Poess and Tilmann Rabl
Analytics Benchmarks
Auditing
Benchmark Harness
CRUD Benchmarks
Component Benchmark
End-to-End Benchmark
Energy Benchmarking
Graph Benchmarking
Metrics for Big Data Benchmarks
Microbenchmark
SparkBench
Stream Benchmarks
System Under Test
TPC
TPC-DS
TPC-H
TPCx-HS
Virtualized Big Data Benchmarks
YCSB

Graph data management and analytics
Section Editor: Hannes Voigt and George Fletcher
Feature Learning from Social Graphs
Graph Data Integration and Exchange
Graph Data Models
Graph Exploration and Search
Graph Generation and Benchmarks
Graph Invariants
Graph OLAP
Graph Partitioning: Formulations and Applications to Big Data
Graph Path Navigation
Graph Pattern Matching
Graph Query Languages
Graph Query Processing
Graph Representations and Storage
Graph Visualization
Graph Data Management Systems
Historical Graph Management
Indexing for Graph Query Evaluation
Influence Analytics in Graphs
Link Analytics in Graphs
Linked Data Management
Parallel Graph Processing
Visual Graph Querying

Data Compression
Section Editor: Paolo Ferragina
(Web/Social) Graph Compression
Compressed Indexes for Repetitive Textual Datasets
Computing the Cost of Compressed Data
