Table Of ContentHardware Acceleration of EDA Algorithms
·
Kanupriya Gulati Sunil P. Khatri
Hardware Acceleration
of EDA Algorithms
Custom ICs, FPGAs and GPUs
123
KanupriyaGulati SunilP.Khatri
109BranchwoodTrl DepartmentofElectrical&Computer
CoppellTX75019 Engineering
USA TexasA&MUniversity
kgulati@tamu.edu CollegeStationTX
77843-3128
214ZachryEngineeringCenter
USA
sunilkhatri@tamu.edu
ISBN978-1-4419-0943-5 e-ISBN978-1-4419-0944-2
DOI10.1007/978-1-4419-0944-2
SpringerNewYorkDordrechtHeidelbergLondon
LibraryofCongressControlNumber:2010920238
(cid:2)c SpringerScience+BusinessMedia,LLC2010
Allrightsreserved.Thisworkmaynotbetranslatedorcopiedinwholeorinpartwithoutthewritten
permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York,
NY10013,USA),exceptforbriefexcerptsinconnectionwithreviewsorscholarlyanalysis.Usein
connection with any form of information storage and retrieval, electronic adaptation, computer
software,orbysimilarordissimilarmethodologynowknownorhereafterdevelopedisforbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or not
theyaresubjecttoproprietaryrights.
Printedonacid-freepaper
SpringerispartofSpringerScience+BusinessMedia(www.springer.com)
Toourparentsandourteachers
Foreword
Single-threaded software applications have ceased to see significant gains in per-
formance on a general-purpose CPU, even with further scaling in very large scale
integration (VLSI) technology. This is a significant problem for electronic design
automation (EDA) applications, since the design complexity of VLSI integrated
circuits (ICs) is continuously growing. In this research monograph, we evaluate
custom ICs, field-programmable gate arrays (FPGAs), and graphics processors as
platforms for accelerating EDA algorithms, instead of the general-purpose single-
threadedCPU.Westudyapplicationswhichareusedinkeytime-consumingsteps
of the VLSI design flow. Further, these applications also have different degrees of
inherent parallelism in them. We study both control-dominated EDA applications
and control plus data parallel EDA applications. We accelerate these applications
onthesedifferenthardwareplatforms.Wealsopresentanautomatedapproachfor
acceleratingcertainuniprocessorapplicationsonagraphicsprocessor.
This monograph compares custom ICs, FPGAs, and graphics processing units
(GPUs)aspotentialplatformstoaccelerateEDAalgorithms.Italsoprovidesdetails
oftheprogrammingmodelusedforinterfacingwiththeGPUs.Asanexampleofa
control-dominatedEDAproblem,Booleansatisfiability(SAT)isacceleratedusing
thefollowinghardwareimplementations:(i)acustomIC-basedhardwareapproach
in which the traversal of the implication graph and conflict clause generation are
performedinhardware,inparallel,(ii)anFPGA-basedhardwareapproachtoaccel-
erate SAT in which the entire SAT search algorithm is implemented in the FPGA,
and (iii) a complete SAT approach which employs a new GPU-enhanced variable
orderingheuristic.
In this monograph, several EDA problems with varying degrees of control and
dataparallelismsareacceleratedusingageneral-purposegraphicsprocessor.Inpar-
ticular we accelerate Monte Carlo based statistical static timing analysis, device
model evaluation (for accelerating circuit simulation), fault simulation, and fault
table generation on a graphics processor, with speedups of up to 800×. Addition-
ally, an automated approach is presented that accelerates (on a graphics proces-
sor) uniprocessor code that is executed multiple times on independent data sets
in an application. The key idea here is to partition the software into kernels in an
automatedfashion,suchthatmultipleindependentinstancesofthesekernels,when
vii
viii Foreword
executed inparallelontheGPU,canmaximallybenefitfromtheGPU’shardware
resources.
We hope that this monograph can serve as a valuable reference to individuals
interested in exploring alternative hardware platforms and to those interested in
accelerating various EDA applications by harnessing the parallelism in these plat-
forms.
CollegeStation,TX KanupriyaGulati
CollegeStation,TX SunilP.Khatri
October2009
Preface
In recent times, serial software applications have no longer enjoyed significant
gainsinperformancewithprocessscaling,sincemicroprocessorperformancegains
havebeenhamperedduetoincreasesinpowerandmanufacturabilityissues,which
accompany scaling. With the continuous growth of IC design complexities, this
problem is particularly significant for EDA applications. In this research mono-
graph,weevaluatethefeasibilityofhardwareplatformssuchascustomICs,FPGAs,
andgraphicsprocessors,foracceleratingEDAalgorithms.Wechooseapplications
which contribute significantly to the total runtime of the VLSI design flow and
which have varied degrees of inherent parallelism in them. We study the acceler-
ation of such algorithms on these alternative platforms. We also present an auto-
matedapproachtoacceleratecertainspecifictypesofuniprocessorsubroutineson
theGPU.
This research monograph consists of four parts. The alternative hardware plat-
forms, along with the details of the programming model used for interfacing with
the graphics processing units, are discussed in the first part of this monograph.
The second part of this monograph studies the acceleration of an algorithm in
thecontrol-dominatedcategory,namelyBooleansatisfiability(SAT).Thethirdpart
studies the acceleration of some algorithms in the control plus data parallel cate-
gory,namelyMonteCarlobasedstatisticalstatictiminganalysis,circuitsimulation,
faultsimulationandfaulttablegeneration.Inthefourthpartofthemonograph,we
presenttheautomatedapproachtogenerateGPUcodetoacceleratecertainsoftware
subroutines.
BookOutline
This research monograph is organized into four parts. In Part I of this research
monograph, we discuss alternative hardware platforms. We also provide details of
theprogrammingmodelusedforinterfacingwiththegraphicsprocessor.InChap-
ter 2, we compare and contrast the hardware platforms that are considered in this
monograph.Inparticular,wediscusscustom-designedICs,reconfigurablearchitec-
tures such as FPGAs, and streaming processors such as graphics processing units
ix