Table Of ContentNon-Volatile In-Memory
Computing by Spintronics
Hao Yu
NanyangTechnologicalUniversity,Singapore
Leibin Ni
NanyangTechnologicalUniversity,Singapore
Yuhao Wang
Synopsis,California,USA
SYNTHESISLECTURESONEMERGINGENGINEERING
TECHNOLOGIES#8
M
&C Morgan &cLaypool publishers
Copyright©2017byMorgan&Claypool
Non-VolatileIn-MemoryComputingbySpintronics
HaoYu,LeibinNi,andYuhaoWang
www.morganclaypool.com
ISBN:9781627052948 paperback
ISBN:9781627056441 ebook
DOI10.2200/S00736ED1V01Y201609EET008
APublicationintheMorgan&ClaypoolPublishersseries
SYNTHESISLECTURESONEMERGINGENGINEERINGTECHNOLOGIES
Lecture#8
SeriesEditor:KrisIniewski,RedlenTechnologies,Inc.
SeriesISSN
Print2381-1412 Electronic2381-1439
ABSTRACT
Exa-scalecomputingneedstore-examinetheexistinghardwareplatformthatcansupportinten-
sivedata-orientedcomputing.Sincethemainbottleneckisfrommemory,weaimtodevelopan
energy-efficientin-memorycomputingplatforminthisbook.First,themodelsofspin-transfer
torque magnetic tunnel junction and racetrack memory are presented. Next, we show that the
spintronicscouldbeacandidateforfuturedata-orientedcomputingforstorage,logic,andinter-
connect. As a result, byutilizing spintronics,in-memory-based computinghas been applied for
dataencryptionandmachinelearning.eimplementationsofin-memoryAES,Simoncipher,
aswellasinterconnectareexplainedindetails.Inaddition,in-memory-basedmachinelearning
andfacerecognitionarealsoillustratedinthisbook.
KEYWORDS
Spintronics, in-memory computing, non-volatile memory, logic-memory integra-
tion,dataencryption,AES,machinelearning,dataanalytics,hardwareaccelerator
Contents
Preface ............................................................xi
1 Introduction ....................................................... 1
1.1 MemoryWall .....................................................1
1.2 TraditionalSemiconductorMemory ...................................3
1.2.1 Overview ...................................................3
1.2.2 Nano-scaleLimitations........................................9
1.3 Non-volatileSpintronicMemory ....................................13
1.3.1 BasicMagnetizationProcess...................................14
1.3.2 MagnetizationDamping......................................14
1.3.3 Spin-transferTorque.........................................15
1.3.4 MagnetizationDynamics .....................................18
1.3.5 DomainWallPropagation ....................................20
1.4 TraditionalMemoryArchitecture ....................................21
1.5 Non-volatileIn-memoryComputingArchitecture.......................25
1.6 References.......................................................28
2 Non-volatileSpintronicDeviceandCircuit............................. 31
2.1 SPICEFormulationwithNewNano-scaleNVMDevices ................31
2.1.1 TraditionalModifiedNodalAnalysis ............................32
2.1.2 NewMNAwithNon-volatileStateVariables .....................33
2.2 STT-MTJDeviceandModel .......................................35
2.2.1 STT-MTJ .................................................35
2.2.2 STT-RAM ................................................39
2.2.3 TopologicalInsulator ........................................40
2.3 DomainWallDeviceandModel.....................................50
2.3.1 MagnetizationReversal.......................................51
2.3.2 MTJResistance.............................................53
2.3.3 DomainWallPropagation ....................................53
2.3.4 CircularDomainWallNanowire ...............................54
2.4 SpintronicStorage ................................................57
2.4.1 SpintronicMemory..........................................57
2.4.2 SpintronicReadout ..........................................57
2.5 SpintronicLogic..................................................61
2.5.1 XOR .....................................................61
2.5.2 Adder.....................................................63
2.5.3 Multiplier .................................................64
2.5.4 LUT .....................................................64
2.6 SpintronicInterconnect ............................................65
2.6.1 Coding-basedInterconnect....................................65
2.6.2 DomainWall-basedEncoder/Decoder ..........................68
2.6.3 PerformanceEvaluation ......................................72
2.7 References.......................................................74
3 In-memoryDataEncryption ......................................... 81
3.1 In-memoryAdvancedEncryptionStandard ............................81
3.1.1 FundamentalofAES ........................................81
3.1.2 DomainWallNanowire-basedAESComputing...................83
3.1.3 PipelinedAESbyDomainWallNanowire .......................92
3.1.4 PerformanceEvaluation ......................................97
3.2 DomainWall-basedSIMONBlockCipher ...........................102
3.2.1 FundamentalofSIMONBlockCipher .........................103
3.2.2 HardwareStages ...........................................103
3.2.3 RoundCounter ............................................103
3.2.4 ControlSignals ............................................105
3.2.5 KeyExpansion ............................................105
3.2.6 Encryption................................................106
3.2.7 PerformanceEvaluation .....................................106
3.3 References......................................................108
4 In-memoryDataAnalytics.......................................... 111
4.1 In-memoryMachineLearning .....................................111
4.1.1 ExtremeLearningMachine ..................................111
4.1.2 MapReduce-basedMatrixMultiplication .......................112
4.1.3 DomainWall-basedHardwareMapping ........................113
4.1.4 PerformanceEvaluation .....................................116
4.2 In-memoryFaceRecognition ......................................121
4.2.1 Energy-efficientSTT-MRAMwithSpare-representedData........121
4.2.2 QoS-awareAdaptiveCurrentScaling ..........................129
4.2.3 STT-RAMbasedHardwareMapping..........................131
4.2.4 PerformanceEvaluation .....................................135
4.3 References......................................................140
Preface
eexistingmemorytechnologieshavecriticalchallengeswhenscalingatnanoscaleduetopro-
cessvariation,leakagecurrent,andI/Oaccesslimitations.Recently,therearetworesearchtrends
attemptingtoalleviatethememory-wallissuesforfuturebig-datastorageandprocessingsystem.
First,theemergingnon-volatilememorytechnologies,suchasspin-transfertorquemem-
ory,domainwallnanowire(orracetrack)memory,etc.,haveshownsignificantlyreducedstandby
powerandincreasedintegrationdensity,aswellasclose-toDRAM/SRAMaccessspeed.ere-
fore,theyareconsideredaspromisingnextgenerationmemoryforbig-dataapplications.
Second, due to high data-level parallelism in big-data applications, large number of ap-
plicationspecificacceleratorscanbedeployedfordataprocessing.However,theI/Obandwidth
limitation will still be the bottleneck in such a memory-logic integration approach. Instead, an
in-memorycomputingplatformwillbehighlydesiredwithlessdependenceonI/Os.
Inordertoachievelowpowerandhighthroughput(orenergyefficiency)inbig-datacom-
puting, one can build an in-memory non-volatile memory (NVM) hardware platform, where
boththememoryandcomputingresourcesarebasedonNVMdeviceswithinstant-switch-onas
wellastoultra-lowleakagecurrent.Suchanarchitecturecanachievesignificantpowerreduction
due to the non-volatility. Moreover, one can develop an NVM logic accelerator that can per-
formdomain-specificcomputationssuchasdataencryptionandalsodataanalyticsinalogic-in-
memoryfashion.Comparedtoconventionalmemory-logic-integrationarchitectures,thestorage
data do not need to be loaded into volatile main memory, processed by logic, and written back
afterwardwithsignificantI/Ocommunicationcongestion.
In this book, we discuss the following research studies on this regard. First, we introduce
anon-volatilein-memoryarchitecturethatbothdatastorageandlogiccomputingareinsidethe
memoryblock.edatastorageandlogiccomputingislocatedinpairstoperformadistributed
in-memorycomputing.WeillustratetheNVM-basedbasicmemoryandlogiccomponentsand
find significant power reduction. Second, we develop a SPICE-like simulator NVM SPICE,
whichimplementsphysicalmodelsfornon-volatiledevicesinasimilarwayastheBSIMmodel
for MOSFET. We further develop the data storage and logic computing on spintronic devices.
ese operations can be deployed for in-memory data encryption and analytics. Moreover, we
illustratehowdataencryptionincludingadvancedencryptionstandard(AES)andSimoncipher
canbeimplementedinthisarchitecture.Dataanalyticsapplicationssuchasmachinelearningfor
super-resolutionandfacerecognitionareperformedonthedevelopedNVMplatformaswell.
is book provides a state-of-the-art summary for the latest literature on emerging non-
volatile spintronic technologies and covers the entire design flow from device, circuit to system
perspectives, which is organized into five chapters. Chapter 1 introduces the basics of conven-
xii PREFACE
tional memory architecture with traditional semiconductor devices as well as the non-volatile
in-memory architecture. Chapter 2 details the device characterization for spintronics by non-
electrical states and the according storage and logic implementation by spintronics. Chapter 3
exploresthein-memorydataencryptionbasedonSTT-RAMaswellasdomainwallnanowire.
e implementations of in-memory AES, domain wall-based Simon cipher and low power in-
terconnect are explained in detail. Chapter 4 presents the system level architectures with data
analytics applicationsfor the emerging non-volatilememory.In-memory machine learning and
face recognitionarediscussed in this chapter. is bookassumes that readershave basic knowl-
edge of semiconductor device physics. It will be a good reference for senior undergraduate and
graduatestudentswhoareperformingresearchonnon-volatilememorytechnologies.
HaoYu,LeibinNi,andYuhaoWang
Singapore
October2016
C H A P T E R 1
Introduction
Abstract Computer memory is any physical device capable to store data temporarily or per-
manently. It covers from fastest, yet most expensive, static random-access-memory (SRAM) to
cheapest,butslowest,harddrivedisk,whileinbetweentherearemanyothermemorytechnolo-
giesthatmaketrade-offsamongcost,speed,andpowerconsumption.However,thelargevolume
ofmemory will experience significant leakagepower,especially at advancedCMOS technology
nodes, for holding data in volatile memory for fast accesses. e spin-transfer torque magnetic
random-accessmemory(STT-RAM),anovelnon-volatilememory-(NVM)basedonspintronic
devices,hasshownthegreatbenefitsonpower-wallissuecomparedtotraditionalvolatilemem-
ories. In addition, in traditional Von-Neumann architecture, the memory is separated from the
central processing unit. As a result, the I/O congestion between memory and processing unit
leadstothememory-wallissue,andultimatesolutionrequiresabreakthroughonmemorytech-
nology.enovelin-memoryarchitectureisthesolutionofmemory-wallissuesthatbothlogic
operationanddatastoragearelocatedinsidememory.ischapterreviewstheexistingsemicon-
ductor memory technologies and traditional memory architecture first, and then introduces the
spintronicmemorytechnologiesaswellasthein-memoryarchitecture.
1.1 MEMORYWALL
e internet data has reached exa-scale (1018 bytes), which has introduced emerging need to
re-examinetheexistinghardwareplatformthatcansupportintensivedata-orientedcomputing.
Atthesametime,theanalysisofsuchahugevolumeofdataneedsascalablehardwaresolution
for both big-data storage and processing, which is beyond the capability from pure software-
based data analytic solution. e main bottleneck is from the well-known memory bottleneck.
Assuch,oneneedstore-designanenergy-efficienthardwareplatformforfuturebig-datadriven
application.
A big-data-drivenapplicationrequires high-bandwidth with maintained low-powerden-
sity.Forexample,web-searching applicationinvolvescrawling,comparing,ranking,and paging
of billions of web-pages or images with extensive memory access. e microprocessor needs to
processthestoreddatawithintensivememoryaccess.However,thepresentdatastorageandpro-
cessinghardwarehavewell-knownbandwidth-wallduetolimitedaccessingbandwidthatI/Os;
but also power-wall due to large leakage power in advanced CMOS technology when holding
data.Assuch,adesignofscalableenergy-efficientbig-dataanalytichardwareishighlyrequired.
2 1. INTRODUCTION
e following research activities attempted to alleviate the memory bottleneck problem for the
futurebig-datastorageandprocessing.
• Emergingnon-volatilein-memorycomputingisapromisingtechnologyforbig-dataori-
ented computing. e application-specific accelerators can be developed within memory
suchthatthedatawillbepre-processedbeforetheyarereadoutwiththeminimumnumber
ofdatamigrationssuchthatthebandwidth-wallcanberelieved.Moreover,thenon-volatile
memory devices hold information without using charge, such that leakage current can be
significantlyreducedwiththepower-wallrelievedaswell.
• Sparse-representeddatabycompressivesensingisaneffectiveapproachtoreducedatasize
byprojectingdatafromhigh-dimensionalspacetolow-dimensionalsubspacewithessential
feature preserved. Data stored in memory after compressive sensing can be recovered or
directlyprocessed.isprocedurereducesdatacomplexityindatastorageaswellasindata
analyticswithafurthersignificantpowerreductionandbandwidthimprovement.
• Data analytics by a machine-learning accelerator can be developed for fast data analytics.
e latest machine learning algorithm on-chip can accelerate the learning process with
a potential to realize online training, which is important for applications in autonomous
vehiclesandunmannedaerialvehicles.
Inthisbook,weaimatfindingascalablehardwaresolutionbasedontheconceptofnon-
volatile in-memory data-analytics accelerator. In order to achieve low power and high band-
width (or energy efficient) in big-data-driven computing, we propose developing a non-volatile
memory-(NVM)basedhardwareplatform,whereboththedatastorageandprocessingarebased
on NVM devices with instant-switch-on as well as ultra-low leakage current. is can result in
significant power reduction due to the non-volatility. Moreover, we will develop NVM-based
logic accelerator that can perform domain-specific computation such as machine-learning with
a logic-in-memory architecture. In contrast for the conventional memory-logic-integration ar-
chitecture, the stored data must be loaded into volatile main memory, processed by logic, and
written back with significant I/O communication and power overhead. Last, we will develop a
machine-learningalgorithmbasedonsparse-representeddatatofurtherdealwithlarge-scaledata
analytics.Asaresult,weoutlinethefollowingdetailedstudiesinthisbook.
• First, we plan to develop an energy-efficient in-memory computing architecture by lever-
agingthenon-volatilememorydevicesthatcanfundamentallyresolvethememorybottle-
neckforbig-dataanalytics.Inthisbook,non-volatile-memorydevicesuchasspin-transfer
torque magnetic random-access memory (STT-RAM) or domain wall nanowire (DW-
NN) will be studied to realize both data storage and also in-memory logic. As such, it
cangreatlyreducetheleakagepowerduetoitsnon-volatilestatewithoutholdingelectrons
in nature. In addition, the logic-in-memory architecture will also be studied to perform
logiccomputationlocally,thusrelaxingthedemandonI/Obandwidthandalleviatingthe
bandwidthconstraint.