Table of Contents

© Copyright 2013 Hadi Esmaeilzadeh

Approximate Acceleration for a Post Multicore Era

Hadi Esmaeilzadeh

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington
2013

Reading Committee:
Doug Burger, Chair
Luis Ceze, Chair
Kathryn McKinley
Mark Oskin

Program Authorized to Offer Degree: Department of Computer Science and Engineering

University of Washington

Abstract

Approximate Acceleration for a Post Multicore Era

Hadi Esmaeilzadeh

Co-Chairs of the Supervisory Committee:
Professor Doug Burger, Microsoft Research
Associate Professor Luis Ceze, University of Washington

Starting in 2004, the microprocessor industry has shifted to multicore scaling (increasing the number of cores per die each technology generation) as its principal strategy for continuing performance growth. This work first studies the interplay between the rise of multicore processors and the rise of managed languages, such as Java, in the past decade. Then, this dissertation looks into the future, studies the trends in transistor scaling, and investigates whether multicore scaling will sustain the traditional performance improvements that have been the driving force for the entire computing industry over the past forty years. The results from our work challenge the conventional wisdom that multicore scaling is a viable path for exploiting increased transistor counts and sustaining historical performance trends. As the number of cores increases, power constraints may prevent powering of all cores at their full speed, requiring a fraction of the cores to be powered off at all times. According to our models, the fraction of these chips that is dark may be as much as 50% within three process generations. The low utility of this dark silicon may prevent both scaling to higher core counts and ultimately the economic viability of continued silicon scaling.

Our study highlights that radical departures from conventional approaches may be necessary to sustain the traditional rate of performance improvements in general-purpose computing. These techniques should provide significant performance and energy efficiency gains across a wide range of applications. This dissertation then proposes a new direction for general-purpose computing that leverages approximation to address the dark silicon challenge. While conventional techniques, such as dynamic voltage and frequency scaling, trade performance for energy, general-purpose approximate computing trades error for both performance and energy gains. We propose variable-precision architectures, a framework spanning the ISA (Instruction Set Architecture) to the transistor-level implementation that allows conventional von Neumann processors to trade accuracy for energy at the granularity of single instructions. Then, we propose an end-to-end solution, from the programming model to the microarchitecture, that leverages an approximate algorithmic transformation to automatically convert a hot code region from a von Neumann model to a neural model. This solution and its associated algorithmic transformation enable a new class of accelerators, called Neural Processing Units (NPUs), with implementation potential in both the digital and the analog domain. This work shows significant gains in both performance and energy when the abstraction of full accuracy is relaxed in general-purpose computing. The results from this dissertation show that general-purpose approximate computing can be a path forward when the gains from conventional approaches are diminishing.

Table of Contents

List of Figures
List of Tables

Chapter 1: Industry of New Possibilities
1.1 Moore’s Law Enables New Possibilities
1.2 Dennard Scaling Enables Moore’s Law
1.3 End of Dennard Scaling and the Multicore Era
1.4 General-Purpose Approximate Computing for a Post Multicore Era
1.5 Dissertation Organization and Contributions

Chapter 2: Looking Back: Multicores, Measured Power, and Modern Workloads
2.1 Introduction
2.2 Overview
2.3 Methodology
2.4 Perspective
2.5 Feature Analysis
2.6 Concluding Remarks

Chapter 3: Looking Forward: Dark Silicon and the End of Multicore Era
3.1 Introduction
3.2 Overview
3.3 Device Model (M-Device)
3.4 Core Model (M-Core)
3.5 Multicore Model (M-CMP)
3.6 Combining Models
3.7 Scaling and Future Multicores
3.8 Model Assumptions, Validation, and Limitations
3.9 Concluding Remarks

Chapter 4: Variable-Precision von Neumann Architectures
4.1 Introduction
4.2 An ISA for Disciplined Approximate Computation
4.3 Design Space
4.4 Truffle: A Dual-Voltage Microarchitecture for Disciplined Approximation
4.5 Experimental Results
4.6 Concluding Remarks

Chapter 5: From a von Neumann to a Hybrid von Neumann-Neural Model of Computing
5.1 Introduction
5.2 Overview
5.3 Programming Model
5.4 Compilation Workflow
5.5 Architecture Design for NPU Acceleration
5.6 Neural Processing Unit
5.7 Evaluation
5.8 Limitations and Future Directions
5.9 Concluding Remarks

Chapter 6: Related Work
6.1 Power-Performance Measurement
6.2 Modeling Multicores
6.3 Approximate Computing
6.4 Voltage Overscaling
6.5 Information Flow Tracking
6.6 General-Purpose Configurable Accelerators
6.7 Neural Networks

Chapter 7: A Path Forward

Bibliography

Description:

Starting in 2004, the microprocessor industry has shifted to multicore ... constraints may prevent powering of all cores at their full speed, requiring a fraction of the cores to be powered ... conventional techniques, such as dynamic voltage and frequency ...

Approximate Acceleration for a Post-Multicore Era - Department of PDF

189 Pages·2014·6.28 MB·English

About Approximate Acceleration for a Post-Multicore Era - Department of

Detailed Information

Author:	Hadi Esmaeilzadeh
Publication Year:	2014
Pages:	189
Language:	English
File Size:	6.28 MB
Format:	PDF
Price:	FREE
