Series ISSN: 1935-3235

Synthesis Lectures on Computer Architecture
Series Editor: Margaret Martonosi, Princeton University

General-Purpose Graphics Processor Architecture

Tor M. Aamodt, University of British Columbia
Wilson Wai Lun Fung, Samsung Electronics
Timothy G. Rogers, Purdue University

Originally developed to support video games, graphics processor units (GPUs) are now increasingly used for general-purpose (non-graphics) applications ranging from machine learning to mining of cryptographic currencies. GPUs can achieve improved performance and efficiency versus central processing units (CPUs) by dedicating a larger fraction of hardware resources to computation. In addition, their general-purpose programmability makes contemporary GPUs appealing to software developers in comparison to domain-specific accelerators. This book provides an introduction to those interested in studying the architecture of GPUs that support general-purpose computing. It collects together information currently only found among a wide range of disparate sources. The authors led development of the GPGPU-Sim simulator widely used in academic research on GPU architectures.

The first chapter of this book describes the basic hardware structure of GPUs and provides a brief overview of their history. Chapter 2 provides a summary of GPU programming models relevant to the rest of the book. Chapter 3 explores the architecture of GPU compute cores. Chapter 4 explores the architecture of the GPU memory system. After describing the architecture of existing systems, Chapters 3 and 4 provide an overview of related research. Chapter 5 summarizes cross-cutting research impacting both the compute core and memory system.

This book should provide a valuable resource for those wishing to understand the architecture of graphics processor units (GPUs) used for acceleration of general-purpose applications and to those who want to obtain an introduction to the rapidly growing body of research exploring how to improve the architecture of these GPUs.

About SYNTHESIS

This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and Computer Science. Synthesis books provide concise, original presentations of important research and development topics, published quickly, in digital and print formats.

Morgan & Claypool
store.morganclaypool.com

General-Purpose Graphics Processor Architectures

Synthesis Lectures on Computer Architecture

Editor
Margaret Martonosi, Princeton University

Founding Editor Emeritus
Mark D. Hill, University of Wisconsin, Madison

Synthesis Lectures on Computer Architecture publishes 50- to 100-page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals. The scope will largely follow the purview of premier computer architecture conferences, such as ISCA, HPCA, MICRO, and ASPLOS.
General-Purpose Graphics Processor Architectures
Tor M. Aamodt, Wilson Wai Lun Fung, and Timothy G. Rogers
2018

Compiling Algorithms for Heterogeneous Systems
Steven Bell, Jing Pu, James Hegarty, and Mark Horowitz
2018

Architectural and Operating System Support for Virtual Memory
Abhishek Bhattacharjee and Daniel Lustig
2017

Deep Learning for Computer Architects
Brandon Reagen, Robert Adolf, Paul Whatmough, Gu-Yeon Wei, and David Brooks
2017

On-Chip Networks, Second Edition
Natalie Enright Jerger, Tushar Krishna, and Li-Shiuan Peh
2017

Space-Time Computing with Temporal Neural Networks
James E. Smith
2017

Hardware and Software Support for Virtualization
Edouard Bugnion, Jason Nieh, and Dan Tsafrir
2017

Datacenter Design and Management: A Computer Architect’s Perspective
Benjamin C. Lee
2016

A Primer on Compression in the Memory Hierarchy
Somayeh Sardashti, Angelos Arelakis, Per Stenström, and David A. Wood
2015

Research Infrastructures for Hardware Accelerators
Yakun Sophia Shao and David Brooks
2015

Analyzing Analytics
Rajesh Bordawekar, Bob Blainey, and Ruchir Puri
2015

Customizable Computing
Yu-Ting Chen, Jason Cong, Michael Gill, Glenn Reinman, and Bingjun Xiao
2015

Die-stacking Architecture
Yuan Xie and Jishen Zhao
2015

Single-Instruction Multiple-Data Execution
Christopher J. Hughes
2015

Power-Efficient Computer Architectures: Recent Advances
Magnus Själander, Margaret Martonosi, and Stefanos Kaxiras
2014

FPGA-Accelerated Simulation of Computer Systems
Hari Angepat, Derek Chiou, Eric S. Chung, and James C. Hoe
2014

A Primer on Hardware Prefetching
Babak Falsafi and Thomas F. Wenisch
2014

On-Chip Photonic Interconnects: A Computer Architect’s Perspective
Christopher J. Nitta, Matthew K. Farrens, and Venkatesh Akella
2013

Optimization and Mathematical Modeling in Computer Architecture
Tony Nowatzki, Michael Ferris, Karthikeyan Sankaralingam, Cristian Estan, Nilay Vaish, and David Wood
2013

Security Basics for Computer Architects
Ruby B. Lee
2013

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition
Luiz André Barroso, Jimmy Clidaras, and Urs Hölzle
2013

Shared-Memory Synchronization
Michael L. Scott
2013
Resilient Architecture Design for Voltage Variation
Vijay Janapa Reddi and Meeta Sharma Gupta
2013

Multithreading Architecture
Mario Nemirovsky and Dean M. Tullsen
2013

Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)
Hyesoon Kim, Richard Vuduc, Sara Baghsorkhi, Jee Choi, and Wen-mei Hwu
2012

Automatic Parallelization: An Overview of Fundamental Compiler Techniques
Samuel P. Midkiff
2012

Phase Change Memory: From Devices to Systems
Moinuddin K. Qureshi, Sudhanva Gurumurthi, and Bipin Rajendran
2011

Multi-Core Cache Hierarchies
Rajeev Balasubramonian, Norman P. Jouppi, and Naveen Muralimanohar
2011

A Primer on Memory Consistency and Cache Coherence
Daniel J. Sorin, Mark D. Hill, and David A. Wood
2011

Dynamic Binary Modification: Tools, Techniques, and Applications
Kim Hazelwood
2011

Quantum Computing for Computer Architects, Second Edition
Tzvetan S. Metodi, Arvin I. Faruque, and Frederic T. Chong
2011

High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities
Dennis Abts and John Kim
2011

Processor Microarchitecture: An Implementation Perspective
Antonio González, Fernando Latorre, and Grigorios Magklis
2010

Transactional Memory, Second Edition
Tim Harris, James Larus, and Ravi Rajwar
2010

Computer Architecture Performance Evaluation Methods
Lieven Eeckhout
2010

Introduction to Reconfigurable Supercomputing
Marco Lanzagorta, Stephen Bique, and Robert Rosenberg
2009

On-Chip Networks
Natalie Enright Jerger and Li-Shiuan Peh
2009

The Memory System: You Can’t Avoid It, You Can’t Ignore It, You Can’t Fake It
Bruce Jacob
2009

Fault Tolerant Computer Architecture
Daniel J. Sorin
2009

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso and Urs Hölzle
2009

Computer Architecture Techniques for Power-Efficiency
Stefanos Kaxiras and Margaret Martonosi
2008

Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency
Kunle Olukotun, Lance Hammond, and James Laudon
2007

Transactional Memory
James R. Larus and Ravi Rajwar
2006

Quantum Computing for Computer Architects
Tzvetan S. Metodi and Frederic T. Chong
2006

Copyright © 2018 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopy, recording, or any other, except for brief quotations in printed reviews, without the prior permission of the publisher.

General-Purpose Graphics Processor Architectures
Tor M. Aamodt, Wilson Wai Lun Fung, and Timothy G. Rogers
www.morganclaypool.com

ISBN: 9781627059237 paperback
ISBN: 9781627056182 ebook
ISBN: 9781681733586 hardcover

DOI 10.2200/S00848ED1V01Y201804CAC044

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE
Lecture #44

Series Editor: Margaret Martonosi, Princeton University
Founding Editor Emeritus: Mark D. Hill, University of Wisconsin, Madison

Series ISSN
Print 1935-3235
Electronic 1935-3243