ebook img

Dynamic Reconfigurable Architectures and Transparent Optimization Techniques: Automatic Acceleration of Software Execution PDF

187 Pages·2010·5.34 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Dynamic Reconfigurable Architectures and Transparent Optimization Techniques: Automatic Acceleration of Software Execution

Dynamic Reconfigurable Architectures and Transparent Optimization Techniques Antonio Carlos Schneider Beck Fl. (cid:2) Luigi Carro Dynamic Reconfigurable Architectures and Transparent Optimization Techniques Automatic Acceleration of Software Execution Prof.AntonioCarlosSchneiderBeckFl. Prof.LuigiCarro InstitutodeInformática InstitutodeInformática UniversidadeFederaldoRioGrande UniversidadeFederaldoRioGrande doSul(UFRGS) doSul(UFRGS) CaixaPostal15064 CaixaPostal15064 CampusdoVale,BlocoIV CampusdoVale,BlocoIV PortoAlegre PortoAlegre Brazil Brazil [email protected] [email protected] ISBN978-90-481-3912-5 e-ISBN978-90-481-3913-2 DOI10.1007/978-90-481-3913-2 SpringerDordrechtHeidelbergLondonNewYork LibraryofCongressControlNumber:2010921831 ©SpringerScience+BusinessMediaB.V.2010 Nopartofthisworkmaybereproduced,storedinaretrievalsystem,ortransmittedinanyformorby anymeans,electronic,mechanical,photocopying,microfilming,recordingorotherwise,withoutwritten permissionfromthePublisher,withtheexceptionofanymaterialsuppliedspecificallyforthepurpose ofbeingenteredandexecutedonacomputersystem,forexclusiveusebythepurchaserofthework. Printedonacid-freepaper SpringerispartofSpringerScience+BusinessMedia(www.springer.com) ToSabrina, for herunderstanding andsupport To AntônioandLéia, for thecontinuousencouragement To Ulisses,mayhisjourneybefullofjoy To Érika,for allour moments To Cesare, Estherand Beti,forbeingthere Preface As Moore’s law is losing steam, one already sees the phenomenon of clock fre- quencyreductioncausedbytheexcessivepowerdissipationingeneralpurposepro- cessors.Atthesametime,embeddedsystemsaregettingmoreheterogeneous,char- acterizedbyahighdiversityofcomputationalmodelscoexistinginasingledevice. Therefore,asinnovativetechnologiesthatwillcompletelyorpartiallyreplacesili- conarearising,newarchitecturalalternativesarenecessary. Althoughreconfigurablecomputinghasalreadyshowntobeapotentialsolution when it comes to accelerate specific code with a small power budget, significant speedupsareachievedjustin verydedicateddatafloworientedsoftware, failingto capture the reality of nowadays complex heterogeneous systems. Moreover, one importantcharacteristicofanynewarchitectureisthatitshouldbeabletoexecute legacycode,sincetherehasalreadybeenalargeamountofinvestmentintowriting softwarefordifferentapplications.Thewidespreadusageofreconfigurabledevices isstillwithheldbytheneedofspecialtoolsandcompilers,whichclearlypreclude reuseoflegacycodeanditsportability. Theauthorshavewrittenthisbookwiththeaforementionedlimitationsinmind. Therefore,thisbook,whichisdividedinsevenchapters,startspresentingthemain challengescomputerarchitecturesarefacingthesedays.Then,adetailedstudyon the usage of reconfigurable systems, their main principles, characteristics, poten- tial and classifications is done. A separate chapter is dedicated to present several case studies, with a critical analysis on their main advantages and drawbacks, and thebenchmarksusedfortheirevaluation.Thisanalysiswilldemonstratethatsuch architectures need to attack a diverse range of applications with very different be- haviors,besidessupportingcodecompatibility,thatis,theneedfornomodification inthesource orbinarycodes. This provesthatmore mustbedonetobringrecon- figurable computing to be used as main stream computing: dynamic optimization techniques.Therefore,binaryTranslationanddifferenttypesofreuse,withseveral examples, are evaluated. Finally, works that combine both reconfigurable systems and dynamic techniques are discussed, and a quantitative analysis of one of these examplesispresented.Thebookendswithsomedirectionsthatcouldinspirenew fieldsofresearch. vii viii Preface The main purpose of this book is to introduce reconfigurable systems and dy- namicoptimizationtechniquesto the readers, using several examples,so it can be asourceofreferencewheneverthereaderneeds.Theauthorshopeyouenjoyit,as theyhaveenjoyedmakingtheresearchthatresultedinthisbook. PortoAlegre AntonioCarlosSchneiderBeckFl. LuigiCarro Acknowledgements The authors would like to express their gratitude to the friends and colleagues at InstitutodeInformaticaofUniversidadeFederaldoRioGrandedoSul,andtogive aspecialthankstoallthepeopleintheEmbeddedSystemslaboratory,whoduring severalmomentscontributedforthisresearchformanyyears. The authors would also like to thank the Brazilian research support agencies, CAPESandCNPq. ix Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 MainMotivations . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 OvercomingSomeLimitsoftheParallelism . . . . . . . . 4 1.2.2 TakingAdvantageofCombinationalandReconfigurable Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.3 SoftwareCompatibilityandReuseofExistentBinaryCode 7 1.2.4 IncreasingYieldandReducingManufactureCosts . . . . . 8 1.3 ThisBook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2 ReconfigurableSystems . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2 BasicPrinciples . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 2.2.1 ReconfigurationSteps . . . . . . . . . . . . . . . . . . . . 15 2.3 UnderlyingExecutionMechanism . . . . . . . . . . . . . . . . . 17 2.4 AdvantagesofUsingReconfigurableLogic . . . . . . . . . . . . . 20 2.4.1 Application . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.2 AnInstructionMergingExample . . . . . . . . . . . . . . 22 2.5 ReconfigurableLogicClassification . . . . . . . . . . . . . . . . . 24 2.5.1 CodeAnalysisandTransformation . . . . . . . . . . . . . 24 2.5.2 RUCoupling . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.5.3 Granularity . . . . . . . . . . . . . . . . . . . . . . . . . . 27 2.5.4 InstructionTypes. . . . . . . . . . . . . . . . . . . . . . . 29 2.5.5 Reconfigurability . . . . . . . . . . . . . . . . . . . . . . 30 2.6 Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 2.6.1 HeterogeneousBehavioroftheApplications . . . . . . . . 31 2.6.2 PotentialforUsingFineGrainedReconfigurableArrays . . 34 2.6.3 CoarseGrainReconfigurableArchitectures . . . . . . . . . 38 2.6.4 ComparingBothGranularities. . . . . . . . . . . . . . . . 41 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 xi xii Contents 3 DeploymentofReconfigurableSystems . . . . . . . . . . . . . . . . . 45 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.2 ExamplesofReconfigurableArchitectures . . . . . . . . . . . . . 46 3.2.1 Chimaera . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.2.2 GARP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.2.3 REMARC . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.2.4 Rapid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.2.5 Piperench(1999). . . . . . . . . . . . . . . . . . . . . . . 57 3.2.6 Molen . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.2.7 Morphosys . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.2.8 ADRES . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.2.9 Concise . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.2.10 PACT-XPP . . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.2.11 RAW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.2.12 Onechip . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 3.2.13 Chess . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 3.2.14 PRISMI . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.2.15 PRISMII. . . . . . . . . . . . . . . . . . . . . . . . . . . 78 3.2.16 Nano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 3.3 RecentDataflowArchitectures . . . . . . . . . . . . . . . . . . . 81 3.4 SummaryandComparativeTables . . . . . . . . . . . . . . . . . 83 3.4.1 OtherReconfigurableArchitectures . . . . . . . . . . . . . 83 3.4.2 Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . 84 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 4 DynamicOptimizationTechniques . . . . . . . . . . . . . . . . . . . 95 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.2 BinaryTranslation . . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.2.1 MainMotivations . . . . . . . . . . . . . . . . . . . . . . 95 4.2.2 BasicConcepts. . . . . . . . . . . . . . . . . . . . . . . . 97 4.2.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.2.4 Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . 100 4.3 Reuse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.3.1 InstructionReuse . . . . . . . . . . . . . . . . . . . . . . 109 4.3.2 ValuePrediction . . . . . . . . . . . . . . . . . . . . . . . 110 4.3.3 BlockReuse . . . . . . . . . . . . . . . . . . . . . . . . . 111 4.3.4 TraceReuse . . . . . . . . . . . . . . . . . . . . . . . . . 112 4.3.5 DynamicTraceMemoizationandRST . . . . . . . . . . . 114 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 5 DynamicDetectionandReconfiguration . . . . . . . . . . . . . . . . 119 5.1 WarpProcessing . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 5.1.1 TheReconfigurableArray . . . . . . . . . . . . . . . . . . 120 5.1.2 HowTranslationWorks . . . . . . . . . . . . . . . . . . . 121 5.1.3 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Description:
As Moore’s law is losing steam, one already sees the phenomenon of clock frequency reduction caused by the excessive power dissipation in general purpose processors. At the same time, embedded systems are concentrating several heterogeneous applications in a single device, and hence new architectu
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.