Rugged Embedded Systems
Computing in Harsh Environments

Augusto Vega
Pradip Bose
Alper Buyuktosunoglu

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Morgan Kaufmann is an imprint of Elsevier
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

© 2017 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-802459-1

For information on all Morgan Kaufmann publications visit our website at https://www.elsevier.com/

Acquisition Editor: Todd Green
Editorial Project Manager: Charlotte Kent
Production Project Manager: Mohana Natarajan
Cover Designer: Mathew Limbert
Typeset by SPi Global, India

Dedication

To my wife Chiara and our little boy Niccolò: they teach me every day how to find happiness in the simplest of things. To the amazing group of people I have the privilege to work with at IBM.
Augusto Vega

To all contributing members of the IBM-led project on: “Efficient resilience in embedded computing,” sponsored by DARPA MTO under its PERFECT program. To my wife (Sharmila) and my mother (Reba) for their incredible love and support.
Pradip Bose

To my wife Agnieszka, to our baby boy John and to my parents. To the wonderful group of colleagues that I work with each day.
Alper Buyuktosunoglu

Contributors

J. Abella
  Barcelona Supercomputing Center, Barcelona, Spain
R.A. Ashraf
  University of Central Florida, Orlando, FL, United States
P. Bose
  IBM T. J. Watson Research Center, Yorktown Heights, NY, United States
A. Buyuktosunoglu
  IBM T. J. Watson Research Center, Yorktown Heights, NY, United States
V.G. Castellana
  Pacific Northwest National Laboratory, Richland, WA, United States
F.J. Cazorla
  IIIA-CSIC and Barcelona Supercomputing Center, Barcelona, Spain
M. Ceriani
  Polytechnic University of Milan, Milano, Italy
E. Cheng
  Stanford University, Stanford, CA, United States
R.F. DeMara
  University of Central Florida, Orlando, FL, United States
F. Ferrandi
  Polytechnic University of Milan, Milano, Italy
R. Gioiosa
  Pacific Northwest National Laboratory, Richland, WA, United States
N. Imran
  University of Central Florida, Orlando, FL, United States
M. Minutoli
  Pacific Northwest National Laboratory, Richland, WA, United States
S. Mitra
  Stanford University, Stanford, CA, United States
G. Palermo
  Polytechnic University of Milan, Milano, Italy
J. Rosenberg
  Draper Laboratory, Cambridge, MA, United States
A. Tumeo
  Pacific Northwest National Laboratory, Richland, WA, United States
A. Vega
  IBM T. J. Watson Research Center, Yorktown Heights, NY, United States
R. Zalman
  Infineon Technologies, Neubiberg, Germany
X. Zhang
  Washington University, St. Louis, MO, United States

Preface

The adoption of rugged chips that can operate reliably even under extreme conditions has experienced unprecedented growth. This growth is in tune with the revolutions related to mobile systems and the Internet of Things (IoT), the emergence of autonomous and semiautonomous transport systems (such as connected and driverless cars), and highly automated factories and the robotics boom. The numbers are astonishing: if we consider just a few domains (connected cars, wearable and IoT devices, tablets and smartphones), we will end up having around 16 billion embedded devices surrounding us by 2018, as Fig. 1 shows.
FIG. 1 Embedded devices growth through 2018. Number of devices in use globally (in thousands), 2004 to 2018 (2014 to 2018 estimated), by category: connected cars, wearables, connected TVs, Internet of Things, tablets, smartphones, and PCs. Source: Business Insider Intelligence.

A distinctive aspect of embedded systems (probably the most interesting one) is the fact that they allow us to take computing virtually anywhere, from a car’s braking system to an interplanetary rover exploring another planet’s surface to a computer attached to (or even implanted into!) our body. In other words, there exists a mobility aspect, inherent to this type of system, that gives rise to all sorts of design and operation challenges, high energy efficiency and reliable operation being the most critical ones. In order to meet target energy budgets, one can decide to (1) minimize error detection or error tolerance related overheads and/or (2) enable aggressive power and energy management features, like low- or near-threshold voltage operation. Unfortunately, both approaches have a direct impact on error rates. The hardening mechanisms (like hardened latches or error-correcting codes) may not be affordable since they add extra complexity. Soft error rates (SERs) are known to increase sharply as the supply voltage is scaled down. It may appear to be a rather challenging scenario. But looking back at the history of computers, we have overcome similar (or even larger) challenges. Indeed, we already hit severe power density-related issues in the late 80s using bipolar transistors and here we are, almost 30 years later, still creating increasingly powerful computers and machines.
The challenges discussed above motivated us some years ago to ignite serious discussion and brainstorming in the computer architecture community around the most critical aspects of new-generation harsh-environment-capable embedded processors. Among a variety of activities, we have successfully organized three editions of the workshop on Highly-Reliable Power-Efficient Embedded Designs (HARSH), which have attracted the attention of researchers from academia, industry, and government research labs in recent years. Some of the experts that contributed material to this book had previously participated in different editions of the HARSH workshop. This book is in part the result of such continued efforts to foster the discussion in this domain involving some of the most influential experts in the area of rugged embedded systems.

This book was also inspired by work that the guest editors have been pursuing under DARPA’s PERFECT (Power Efficiency Revolution for Embedded Computing Technologies) program. The idea was to capture a representative sample of the current state of the art in this field, so that the research challenges, goals, and solution strategies of the PERFECT program can be examined in the right perspective. In this regard, the book editors want to acknowledge DARPA’s sponsorship under contract no. HR0011-13-C-0022.

We also express our deep gratitude to all the contributors for their valuable time and exceptional work. Needless to say, this book would not have been possible without them. Finally, we also want to acknowledge the support received from the IBM T. J. Watson Research Center to make this book possible.

Augusto Vega
Pradip Bose
Alper Buyuktosunoglu
Summer 2016

CHAPTER 1
Introduction

A. Vega, P. Bose, A. Buyuktosunoglu
IBM T. J. Watson Research Center, Yorktown Heights, NY, United States

Electronic digital computers are very powerful tools. The so-called digital revolution has been fueled mostly by chips for which the number of transistors per unit area on integrated circuits kept doubling approximately every 18 months, following what Gordon Moore observed in 1975. The resulting exponential growth is so remarkable that it fundamentally changed the way we perceive and interact with our surrounding world. It is enough to look around to find traces of this revolution almost everywhere. But the dramatic growth exhibited by computers in the last three to four decades has also relied on a fact that goes frequently unnoticed: they operated in quite predictable environments and with plentiful resources. Twenty years ago, for example, a desktop personal computer sat on a table and worked without much concern about power or thermal dissipation; security threats also constituted rare episodes (computers were barely connected, if connected at all!); and the few mobile devices available did not have to worry much about battery life. At that time, we had to look to a few specific niches to find truly sophisticated systems, i.e., systems that had to operate in unfriendly environments or under a significant amount of “stress.” One of those niches was (and is) space exploration: for example, NASA’s Mars Pathfinder planetary rover was equipped with a RAD6000 processor, a radiation-hardened POWER1-based processor that was part of the rover’s on-board computer [1]. Released in 1996, the RAD6000 was not particularly impressive because of its computational capacity; it was actually a modest processor compared to some contemporary high-end (or even embedded system) microprocessors. Its cost, on the order of several hundred thousand dollars, is better understood as a function of the chip ruggedness to withstand total radiation doses of more than 1,000,000 rads and temperatures between −25°C and +105°C in the thin Martian atmosphere [2].

In the last decade, computers continued growing in terms of performance (still riding on Moore’s Law and the multicore era) and chip power consumption became a critical concern.
Since the early 2000s, processor design and manufacturing have been driven not just by performance but also by strict power budgets, a phenomenon usually referred to as the “power wall.” The rationale behind the power wall has its origins in 1974, when Robert Dennard et al. from the IBM T. J. Watson Research Center postulated the scaling rules of metal-oxide-semiconductor field-effect transistors (MOSFETs) [3]. One key assumption of Dennard’s scaling rules is that operating voltage (V) and current (I) should scale proportionally to the linear dimensions of the transistor in order to keep power consumption (V × I) proportional to the transistor area (A). But manufacturers were not able to lower operating voltages sufficiently over time and power density (V × I/A) kept growing until it reached unsustainable levels. As a result, frequency scaling stalled and industry shifted to multicore designs to cope with single-thread performance limitations.

Rugged Embedded Systems. http://dx.doi.org/10.1016/B978-0-12-802459-1.00001-4

The power wall has fundamentally changed the way modern processors are conceived. Processors became aware of power consumption with additional on-chip “intelligence” for power management; clock gating and dynamic voltage and frequency scaling (DVFS) are two popular dynamic power reduction techniques in use today. But at the same time, chips turned out to be more susceptible to errors (transient and permanent) as a consequence of thermal issues derived from high power densities as well as low-voltage operation. In other words, we have hit the reliability wall in addition to the power wall. The power and reliability walls are interlinked as shown in Fig. 1. The power wall forces us toward designs that have tighter design margins and “better than worst case” design principles. But that approach eventually degrades reliability (“mean time to failure”), which in turn requires redundancy and hardening techniques that increase power consumption and force us back against the power wall.
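The Dennard scaling argument in this section can be made concrete with a small numerical sketch. The Python snippet below is illustrative only: the function name `scale_transistor`, the normalized units, and the simplification that current still shrinks by 1/k when voltage stops scaling are assumptions introduced for this example and do not come from the original text. It compares ideal scaling with a case where the operating voltage cannot be lowered.

```python
def scale_transistor(v, i, area, k, voltage_scales=True):
    """Shrink a transistor's linear dimensions by a factor k (k > 1).

    Returns (power, power_density) in the same normalized units as the inputs.
    Hypothetical helper: current is assumed to shrink by 1/k in both cases.
    """
    new_v = v / k if voltage_scales else v   # ideal scaling lowers V by 1/k
    new_i = i / k                            # assumed: I always shrinks by 1/k
    new_area = area / (k * k)                # area shrinks with the square of k
    power = new_v * new_i                    # P = V x I
    return power, power / new_area           # power density = V x I / A


# Normalized baseline: V = I = A = 1, so power density starts at 1.
ideal_power, ideal_density = scale_transistor(1.0, 1.0, 1.0, k=2.0)
stuck_power, stuck_density = scale_transistor(
    1.0, 1.0, 1.0, k=2.0, voltage_scales=False
)

print(f"ideal scaling: power={ideal_power}, density={ideal_density}")
print(f"voltage stuck: power={stuck_power}, density={stuck_density}")
```

Under these assumptions, halving the linear dimensions leaves power density unchanged when voltage scales with the dimensions, and doubles it when voltage is held fixed, which is the trend behind the power wall.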
The power and reliability walls thus feed each other in a vicious “karmic” cycle!

FIG. 1 Relationship and mutual effect between the power and reliability walls.

This already worrying outlook has worsened in tune with the revolutions related to mobile systems and the Internet of Things (IoT), since the aforementioned constraints (e.g., power consumption and battery life) get stricter and the challenges associated with fault-tolerant and reliable operation become more critical. Embedded computing has become pervasive and, as a result, many of the day-to-day devices that we use and rely on are subject to similar constraints, in some cases with critical consequences when they are not met. Automobiles are becoming “smarter” and in