Cluster performance: how to get the most out of Abel
Ole W. Saastad, Dr.Scient
USIT / UAV / FI
April 18th 2013

Introduction
• Architecture: x86-64 and NVIDIA
• Compilers
• MPI
• Interconnect
• Storage
• Batch queue system

Installed compute hardware
• 630 Supermicro nodes
• Two-socket Intel E5-2670, 2.6 GHz, octa core
• 64 GiB memory
• FDR InfiniBand

Universitetets senter for informasjonsteknologi

Compute nodes - performance
• CPU performance
  – 332 Gflops/s theoretical
  – 318 Gflops/s practical – HPL (top500)
• Memory bandwidth
  – 63 GiB/s practical (STREAM)
• Memory latency
  – 115 nanoseconds (random access)

Node performance
High Performance Linpack performance (top500 test):

T/V               N    NB  P  Q     Time     Gflops
--------------------------------------------------------------------------------
WR11R2R4      87500   180  4  4  1404.46  3.180e+02
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0033152 ...... PASSED
================================================================================

[olews@login-0-0 hpl]$ ./xhpl-max.sh HPL.single.node.log
No clock freq given, setting it to 2.6 GHz
High perf. linpack results:
Params      size  block  nxm     time  Tflops  %peak
WR11R2R4   87500    180  4x4  1404.46   0.318   95.6
WR11R2R4   87500    200  4x4  1408.52   0.317   95.3
WR11R2R4   85000    200  4x4  1301.79   0.315   94.5

Installed compute hardware, GPU
• 16 Supermicro nodes with GPUs
• Two sockets, two GPUs
• Intel E5-2670, 2.2 GHz, quad core
• 64 GiB memory
• FDR InfiniBand
• Tesla K20Xm: 6 GiB memory, 2688 SP cores, 896 DP cores

Node performance, all hardware
High Performance Linpack performance (top500 test):

T/V               N     NB  P  Q    Time     Gflops
--------------------------------------------------------------------------------
WR10L2L2      85000   1280  1  2  223.20  1.844e+03
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0033152 ...... PASSED
================================================================================

[olews@login-0-0 hpl]$ ./xhpl-max.sh HPL.single.node.log
High perf. linpack results:
Params      size  block  nxm    time  Tflops  %peak
WR10L2L2   85000   1280  1x2  223.20   1.844   66.5
WR10L2L2   85000   1024  1x2  224.62   1.823   65.7
WR10L2L2   85000   1408  1x2  232.41   1.762   63.5

Node performance, two K20Xm
DGEMM performance GPU vs.
CPU (Tesla K20X vs. Intel Sandy Bridge).

[Figure: SGEMM and DGEMM performance, CUDA BLAS vs. MKL BLAS, plotted against total matrix footprint in MiB. Single precision (32 bit): the GPU reaches about 2.6 Tflops/s. Double precision (64 bit): the GPU reaches about 1 Tflops/s.]

InfiniBand Basics
• Ping-pong latency, key for performance
  – Intra rack: 0.95 microseconds
  – Inter rack: 1.40 microseconds
• Ping-pong bandwidth
  – 6.14 GiB/s
• All numbers measured under full production using OpenMPI

InfiniBand Basics
• TCP/IP over InfiniBand – IPoIB on all nodes
• Both GbE and IB interfaces, named eth0 and ib0
• Example
  – Node compute-x-y has two interfaces
  – cx-y is the eth0 interface
  – ib-x-y is the ib0 interface

scp gamessplus.tar.bz2 compute-9-1.local:/tmp/
gamessplus.tar.bz2                  100%  722MB  55.5MB/s  00:13
scp gamessplus.tar.bz2 ib-9-1:/tmp/
gamessplus.tar.bz2                  100%  722MB  90.2MB/s  00:08
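The CPU peak figures quoted on the "Compute nodes - performance" slide can be reproduced from the node specification. A minimal sketch, assuming Sandy Bridge's 8 double-precision flops per core per cycle with AVX (that per-cycle factor is an assumption, not stated on the slides):

```python
# Theoretical DP peak of one Abel compute node: dual-socket E5-2670,
# 8 cores per socket at 2.6 GHz.  The 8 flops/cycle/core factor (AVX on
# Sandy Bridge) is an assumption; it does not appear on the slides.
sockets, cores, ghz, flops_per_cycle = 2, 8, 2.6, 8
peak_gflops = sockets * cores * ghz * flops_per_cycle
print(f"Theoretical peak: {peak_gflops:.1f} Gflops/s")  # 332.8, quoted as 332

# HPL efficiency from the measured 318 Gflops/s (slide table: %peak 95.6)
hpl_gflops = 318.0
print(f"HPL efficiency:   {100 * hpl_gflops / peak_gflops:.1f} %")  # 95.6
```

The same arithmetic explains why the GPU nodes show a much lower %peak in their HPL table: accelerator runs are measured against the combined CPU+GPU theoretical peak, which is far harder to approach than a CPU-only peak.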