
Anton 2: A 2nd-Generation ASIC for Molecular Dynamics - Hot Chips

18 Pages·2014·2.78 MB·English


A Second-Generation ASIC for Molecular Dynamics

The Anton 2 Chip: A 2nd Generation ASIC for Molecular Dynamics Simulation
Hot Chips 2014

The Hardware Team: All 25 of Us

J. Adam Butts, Brannon Batson, Jack C. Chao*, Martin M. Deneroff*, Ron O. Dror*, Christopher H. Fenton, Anthony Forte, Joseph Gagliardo, Gennette Gill, Brian Greskamp, J.P. Grossman, C. Richard Ho*, Jeffrey S. Kuskin, Richard H. Larson*, Timothy Layman*, Li-Siang Lee*, Chester Li*, Shark Yeuk-Hai Mok*, Mark A. Moraes, Rolf Mueller*, Lawrence J. Nociolo, Jon L. Peticolas, Terry Quan, Daniel Ramot*, Naseer Siddique, Jochen Spengler, Ping Tak Peter Tang*, Michael Theobald, Horia Toma*, Brian Towles, Stanley C. Wang, and David E. Shaw

* Work conducted while at D. E. Shaw Research; author’s affiliation has subsequently changed.

Molecular Dynamics (MD) and Why It’s Hard

• Simulation of the motions of all atoms in a biochemical system
  – Calculate all forces on each atom of the system in each discrete time step
  – Each simulation time step is ~2 fs
• Efficient parallelization of force calculation is hard
  – Time steps are sequentially dependent
  – All atoms interact with all other atoms
• Massive computational task!
  – Many interesting biochemical systems have 10^5 – 10^6 atoms
  – Many biological processes take milliseconds (i.e., 10^12 time steps)
  – ~10^4 FLOP/atom/step, even with range-limited approximations

Anton: The Original Series*

* D. E. Shaw et al., “Millisecond-Scale Molecular Dynamics Simulations on Anton,” in Proc. Supercomputing (SC-09), Nov. 2009.
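The workload figures on the "Why It's Hard" slide can be combined into a rough total operation count. The sketch below is only a back-of-envelope calculation from the slide's order-of-magnitude values (~2 fs per step, ~10^4 FLOP/atom/step, 10^5 to 10^6 atoms); the function names are hypothetical and chosen for illustration.

```python
# Back-of-envelope MD workload estimate from the slide's order-of-magnitude
# figures. Function and constant names are illustrative, not from the source.

TIME_STEP_S = 2e-15         # ~2 fs per simulation time step (slide figure)
FLOP_PER_ATOM_STEP = 1e4    # ~10^4 FLOP/atom/step (slide figure)


def steps_for(simulated_seconds: float, time_step_s: float = TIME_STEP_S) -> float:
    """Number of sequential time steps needed to cover a span of simulated time."""
    return simulated_seconds / time_step_s


def total_flop(atoms: float, steps: float,
               flop_per_atom_step: float = FLOP_PER_ATOM_STEP) -> float:
    """Total floating-point operations for a run of `steps` steps on `atoms` atoms."""
    return atoms * steps * flop_per_atom_step


if __name__ == "__main__":
    # One millisecond of simulated time at ~2 fs per step.
    print(f"steps per simulated ms: {steps_for(1e-3):.1e}")
    # Using the slide's rounded 10^12 steps for the small and large system sizes.
    for atoms in (1e5, 1e6):
        print(f"{atoms:.0e} atoms, 1e12 steps: {total_flop(atoms, 1e12):.0e} FLOP")
```

At ~2 fs per step, one simulated millisecond is about 5 × 10^11 steps, the same order of magnitude as the slide's 10^12. Because the steps are sequentially dependent, that count bounds the length of the serial chain regardless of how many atoms are processed in parallel within each step.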
Anton 2: The Next Generation

• Increased speed
  – More cores, pipelines
  – Higher clock frequency
  – In-house physical design
• Enhanced capability
  – More flexible fixed-function pipelines
  – Higher capacity (atoms/ASIC)
• Simplified software
  – Single ISA for all CPU cores
  – Optimized compiler port (gcc)

                        Anton          Anton 2
CPU cores               13             66
CPU clock               485 MHz        1,650 MHz
Interaction pipelines   66             152
Pipeline clock          970 MHz        1,650 MHz
Peak throughput*        2.73 TFXOPS    12.7 TFXOPS
Main SRAM               128 KB         4,096 KB
Atoms/ASIC              460            8,000
Channel B/W             607 Gb/s       2.7 Tb/s

* TFXOPS = 10^12 32-bit fixed-point operations per second.

ASIC Architecture

[Figure: ASIC architecture diagram]

Flexible Subsystem Tile (Flex)

• 4 geometry cores (GCs)
  – Fixed-point SIMD processors
  – Compiler-friendly ISA optimized for MD
  – Large instruction cache
• Dispatch unit*
  – Low-latency hardware event synchronization
  – Enables exploiting fine-grained parallelism
• Globally shared SRAM
  – Stores particle data, additional code
  – Built-in atomic operations
• In-tile network with gateway to on-chip mesh

[Figure: Flex tile with a dispatch unit, 256 KB SRAM, a network interface, and four geometry cores, each with a 16 KB D$ and a 56 KB I$]

* J.P. Grossman et al., “Hardware support for Fine-Grained Event-Driven Computation in Anton 2,” in Proc. ASPLOS-18, Mar. 2013.
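The generation comparison in "The Next Generation" slide is easier to read as improvement factors. The following sketch simply divides the second column by the first, using the values as they appear in the slide; reading the first column as the original Anton is an assumption based on the slide titles, and the dictionary and function names are hypothetical.

```python
# Improvement factors between the two columns of the comparison table.
# Values are copied from the slide as listed; treating the first column as the
# original Anton and the second as Anton 2 is an assumption from the slide titles.

SPECS = {
    "CPU cores": (13, 66),
    "CPU clock (MHz)": (485, 1650),
    "Interaction pipelines": (66, 152),
    "Pipeline clock (MHz)": (970, 1650),
    "Peak throughput (TFXOPS)": (2.73, 12.7),
    "Main SRAM (KB)": (128, 4096),
    "Atoms/ASIC": (460, 8000),
    "Channel B/W (Gb/s)": (607, 2700),  # 2.7 Tb/s expressed in Gb/s
}


def improvement_ratios(specs: dict) -> dict:
    """Return second-column / first-column for every row of the table."""
    return {name: new / old for name, (old, new) in specs.items()}


if __name__ == "__main__":
    for name, ratio in improvement_ratios(SPECS).items():
        print(f"{name:26s} x{ratio:.2f}")
```

On these figures the largest jumps are in on-chip memory (32×) and atoms per ASIC (about 17×), which lines up with the slide's "higher capacity" bullet, while peak throughput grows by roughly 4.7×.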
High-Throughput Interaction Subsystem (HTIS)

[Figure: HTIS with a mini-flex tile (dispatch unit, geometry core, 64 KB SRAM, network interface), an ICB with 640 KB local memory, a PPIM array in two rows (PPIMs 0–18 and 19–37) with a control/commands path, and 128 output queues]

• Pairwise Point Interaction Modules (PPIMs)
  – Two specialized compute pipelines each
  – Over 250 billion interactions/s/ASIC
  – Significant flexibility
    • Configurable pair selection
    • Flexible interaction functions
    • Particle/pair parameter overrides
• Single-GC “mini-flex” tile for control
• Interaction Control Block (ICB)
  – Queues for receiving different incoming particle streams
  – Streaming memory supplies high bandwidth to PPIM array
• Output block merges results from two PPIM rows

Network Architecture*

• On-chip mesh network
  – 2.5 Tb/s bisection bandwidth
  – Also carries inter-chip through traffic
• Inter-chip 3D torus network
  – Mirrors spatial geometry of MD
  – Network sliced for load-balancing
  – 96 bi-directional lanes @ 14 Gb/s
    • 2.7 Tb/s total bandwidth / ASIC
    • 512-node bisection bandwidth of 57 Tb/s
  – Custom channel protocol
• Optimized for exploiting parallelism in MD
  – Arbitrary dimension-order routing
  – Table-based multicast
  – In-network reductions

[Figure: channel organization; 1 lane = 14 Gb/s/dir, 1 slice = 8 lanes/dir]

* B. Towles et al., “Unifying on-chip and inter-node switching within the Anton 2 network,” in Proc. ISCA-41, Jun. 2014.
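The headline numbers on the HTIS and network slides can be cross-checked with simple arithmetic. The sketch below is a consistency check under stated assumptions, not the authors' derivation: it assumes each interaction pipeline completes one interaction per cycle (152 pipelines at 1,650 MHz, from the comparison table), that the 96 lanes are split evenly across the six neighbors of a 3D torus, and that the 512-node machine is an 8 × 8 × 8 torus. None of these three assumptions is stated in the slides, and all function names are hypothetical.

```python
# Consistency check of the HTIS and network headline figures.
# Assumptions (not stated in the slides): one interaction per pipeline per
# cycle; lanes split evenly over 6 torus neighbors; 512 nodes = 8 x 8 x 8 torus.

LANES = 96                  # bi-directional lanes per ASIC (slide figure)
LANE_GBPS_PER_DIR = 14.0    # Gb/s per lane per direction (slide figure)
TORUS_NEIGHBORS = 6         # +/-x, +/-y, +/-z in a 3D torus


def peak_interactions_per_s(pipelines: int = 152, clock_hz: float = 1.65e9) -> float:
    """Peak pairwise interactions per second, assuming one per pipeline per cycle."""
    return pipelines * clock_hz


def total_channel_bandwidth_gbps() -> float:
    """Aggregate off-chip bandwidth per ASIC, counting both directions."""
    return LANES * LANE_GBPS_PER_DIR * 2


def bisection_bandwidth_gbps(side: int = 8) -> float:
    """Bisection bandwidth of a side^3 torus, counting both directions."""
    lanes_per_link = LANES / TORUS_NEIGHBORS             # 16 lanes to each neighbor
    link_gbps = lanes_per_link * LANE_GBPS_PER_DIR * 2   # both directions
    # Cutting a torus in half severs two planes of links (wraparound included).
    links_cut = 2 * side * side
    return links_cut * link_gbps


if __name__ == "__main__":
    print(f"peak interactions/s: {peak_interactions_per_s():.3e}")
    print(f"channel bandwidth:   {total_channel_bandwidth_gbps() / 1000:.2f} Tb/s")
    print(f"bisection bandwidth: {bisection_bandwidth_gbps() / 1000:.1f} Tb/s")
```

Under these assumptions the results come to about 2.5 × 10^11 interactions/s, 2.69 Tb/s per ASIC, and 57.3 Tb/s of bisection bandwidth, which agree with the slides' "over 250 billion", "2.7 Tb/s", and "57 Tb/s" figures. Sixteen lanes per neighbor also matches two slices of 8 lanes per direction, as labeled in the channel figure.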
Host and Debug Interfaces

• Standard Host Interface
  – High-bandwidth PCIe Gen3 ×16
  – Direct handling of GC cache misses to host-mapped memory
  – Supports DMA to/from on-chip Flex tile memory
  – Full-swing CMOS interface for configuration and debug
• On-chip Logic Analyzer (LA)
  – Flexible triggering and monitoring of on-chip state
  – Trace capacity of over 650,000 cycles
  – Extensively used for performance tuning

