
Data-Intensive Computing: Architectures, Algorithms, and Applications PDF

300 Pages·2012·6.43 MB·English

Preview Data-Intensive Computing: Architectures, Algorithms, and Applications

Data-Intensive Computing

The world is awash with digital data from social networks, blogs, business, science, and engineering. Data-intensive computing facilitates understanding of complex problems that must process massive amounts of data. Through the development of new classes of software, algorithms, and hardware, data-intensive applications can provide timely and meaningful analytical results in response to exponentially growing data complexity and associated analysis requirements. This emerging area brings many challenges that are different from traditional high-performance computing.

This reference for computing professionals and researchers describes the dimensions of the field, the key challenges, the state of the art, and the characteristics of likely approaches that future data-intensive problems will require. Chapters cover general principles and methods for designing such systems and for managing and analyzing the big data sets of today that live in the cloud, and describe example applications in bioinformatics and cyber-security that illustrate these principles in practice.

Ian Gorton is a Laboratory Fellow in Computational Sciences and Math at Pacific Northwest National Laboratory (PNNL), where he manages the Data Intensive Scientific Computing group and was the Chief Architect for PNNL's Data Intensive Computing Initiative. Gorton is a Senior Member of the IEEE Computer Society and a Fellow of the Australian Computer Society.

Deborah K. Gracio joined Pacific Northwest National Laboratory in 1990 and is currently the Director of the Computational and Statistical Analytics Division and of the Data Intensive Computing Research Initiative. Since joining the laboratory, she has led the research, development, and management of multiple cross-disciplinary, multi-laboratory projects focused in the basic sciences and national security sectors.

Data-Intensive Computing: Architectures, Algorithms, and Applications

Edited by
IAN GORTON, Pacific Northwest National Laboratory
DEBORAH K. GRACIO, Pacific Northwest National Laboratory

Cambridge University Press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City

Cambridge University Press
32 Avenue of the Americas, New York, NY 10013-2473, USA
www.cambridge.org
Information on this title: www.cambridge.org/9780521191951

© Cambridge University Press 2013

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2013
Printed in the United States of America
A catalog record for this publication is available from the British Library.

Library of Congress Cataloging in Publication Data
Data-intensive computing : architectures, algorithms, and applications / [edited by] Ian Gorton, Deborah K. Gracio.
pages cm
Includes bibliographical references and index.
ISBN 978-0-521-19195-1
1. High performance computing. 2. Database management. 3. Computer storage devices. 4. Software architecture. 5. Data transmission systems. I. Gorton, Ian. II. Gracio, Deborah K., 1965–
QA76.88.D38 2012
004.5–dc23 2012015720
ISBN 978-0-521-19195-1 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.
Contents

List of Contributors

1 Data-Intensive Computing: A Challenge for the 21st Century
  Ian Gorton and Deborah K. Gracio
2 Anatomy of Data-Intensive Computing Applications
  Ian Gorton and Deborah K. Gracio
3 Hardware Architectures for Data-Intensive Computing Problems: A Case Study for String Matching
  Antonino Tumeo, Oreste Villa, and Daniel Chavarría-Miranda
4 Data Management Architectures
  Terence Critchlow, Ghaleb Abdulla, Jacek Becla, Kerstin Kleese-Van Dam, Sam Lang, and Deborah L. McGuinness
5 Large-Scale Data Management Techniques in Cloud Computing Platforms
  Sherif Sakr and Anna Liu
6 Dimension Reduction for Streaming Data
  Chandrika Kamath
7 Binary Classification with Support Vector Machines
  Patrick Nichols, Bobbie-Jo Webb-Robertson, and Christopher Oehmen
8 Beyond MapReduce: New Requirements for Scalable Data Processing
  Bill Howe and Magdalena Balazinska
9 Let the Data Do the Talking: Hypothesis Discovery from Large-Scale Data Sets in Real Time
  Christopher Oehmen, Scott Dowson, Wes Hatley, Justin Almquist, Bobbie-Jo Webb-Robertson, Jason McDermott, Ian Gorton, and Lee Ann McCue
10 Data-Intensive Visual Analysis for Cyber-Security
  William A. Pike, Daniel M. Best, Douglas V. Love, and Shawn J. Bohn

Index

List of Contributors

Ghaleb Abdulla, Lawrence Livermore National Laboratory
Justin Almquist, Pacific Northwest National Laboratory
Magdalena Balazinska, University of Washington
Jacek Becla, Stanford University
Daniel M. Best, Pacific Northwest National Laboratory
Shawn J. Bohn, Pacific Northwest National Laboratory
Daniel Chavarría-Miranda, Pacific Northwest National Laboratory
Terence Critchlow, Pacific Northwest National Laboratory
Scott Dowson, Pacific Northwest National Laboratory
Ian Gorton, Pacific Northwest National Laboratory
Deborah K. Gracio, Pacific Northwest National Laboratory
Wes Hatley, Future Point Systems
Bill Howe, University of Washington
Chandrika Kamath, Lawrence Livermore National Laboratory
Sam Lang, Pacific Northwest National Laboratory
Anna Liu, National ICT Australia (NICTA), University of New South Wales
Kerstin Kleese-Van Dam, Pacific Northwest National Laboratory
Douglas V. Love, Pacific Northwest National Laboratory
Lee Ann McCue, Pacific Northwest National Laboratory
Jason McDermott, Pacific Northwest National Laboratory
Deborah L. McGuinness, Rensselaer Polytechnic Institute
Patrick Nichols, Pacific Northwest National Laboratory
Christopher Oehmen, Pacific Northwest National Laboratory
William A. Pike, Pacific Northwest National Laboratory
Sherif Sakr, National ICT Australia (NICTA), University of New South Wales
Antonino Tumeo, Pacific Northwest National Laboratory
Oreste Villa, Pacific Northwest National Laboratory
Bobbie-Jo Webb-Robertson, Pacific Northwest National Laboratory

Description:
The world is awash with digital data from social networks, blogs, business, science, and engineering. Data-intensive computing facilitates understanding of complex problems that must process massive amounts of data. Through the development of new classes of software, algorithms, and hardware, data-i