
SGE Admin PDF

83 Pages·2009·0.99 MB·English

Preview SGE Admin

Grid Engine Administration Overview

SGE training, consulting and special projects - BioTeam Inc. - http://www.bioteam.net

This module covers
  - Grid Problem Types
  - Distributed Resource Management
  - Grid Engine 6 Architecture
  - How it works
  - Grid Engine Scheduling
  - Grid Engine 6 Variants

Grid Problem Types

"Grid" Problem Types

  Tightly Coupled      Embarrassingly Parallel
  1 + 1 = X            1 + 1 = X
  X + 2 = Y            2 + 2 = Y
  Y + 3 = Z            3 + 3 = Z
  N + Z = W            N + N = W

Traditional Parallelism
  - Single process in which several parallel elements (threads) must communicate to solve a single problem
  - Parallel programming is complicated and far from automatic
  - Users must explicitly use parallel programming tools and application environments
  - Speedup on clusters not guaranteed
  - Depends on code, interconnect latency & the problem space

Traditional Parallelism
  - Used to be a much more fractured space
  - PVM, LINDA, MPI, etc.
  - MPI rules present day (*some exceptions)
  - Infinite possibilities for admin headaches
  - MPI flavor "A" using interconnect type "B" via integration method "C" using SGE CPU $allocation_rule="D"
  - Ouch.

Embarrassingly Parallel
  - Also known as "Serial" or "Batch" Computing
  - Large numbers of nearly identical jobs
  - Possibly only the input file changes
  - Identical analysis on large pools of input data or queries
  - Easily subdivided, mostly independent tasks

Embarrassingly Parallel
  - Primary optimization focus
  - Often less focus on interconnect and latency issues
  - More focus on speed, size and scaling headroom of the storage infrastructure (or memory)
  - Performance tuning the grid scheduler ('DRM') and batch subsystem for high throughput
  - Data locality issues often important

Generic Use Case
  - Scientist has a problem
  - She does not want or need to run ONE parallel application stretching across a 100 CPU cluster
  - She wants to run a standalone program 100,000 or 1,000,000 times with slightly different input and output values

  [Side panel: Embarrassingly Parallel: 1 + 1 = X, 2 + 2 = Y, 3 + 3 = Z, N + N = W]

Cliche Life Science use case

  [Diagram: query sequence(s) (>Sequence1, >Sequence2, >Sequence3) are run against a "database" (DB) of known DNA or protein sequences, producing a result (Result1)]

  Comparing sequences of interest to known repositories in order to identify areas of biologically significant similarity
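The contrast between the two problem types on the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything from the deck or from Grid Engine itself: the function names (tightly_coupled, independent_task, embarrassingly_parallel) are hypothetical, and a local thread pool stands in for cluster nodes. The first function mirrors the left column (each step needs the previous result), the second mirrors the right column (each task depends only on its own input, so tasks can run anywhere, in any order).

```python
from concurrent.futures import ThreadPoolExecutor


def tightly_coupled(n_steps):
    """Chain of dependent steps: X = 1 + 1, Y = X + 2, Z = Y + 3, ...

    Each step consumes the previous result, so the steps must run in order.
    """
    value = 1 + 1
    results = [value]
    for k in range(2, n_steps + 1):
        value = value + k  # needs the value computed in the previous step
        results.append(value)
    return results


def independent_task(k):
    """One standalone job: k + k. It depends only on its own input."""
    return k + k


def embarrassingly_parallel(n_tasks, workers=4):
    """Run n_tasks independent jobs; a thread pool stands in for cluster nodes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if tasks finish out of order
        return list(pool.map(independent_task, range(1, n_tasks + 1)))


if __name__ == "__main__":
    print(tightly_coupled(3))
    print(embarrassingly_parallel(3))
```

In the generic use case above, the scientist's workload has the second shape: the same standalone program repeated many times with slightly different inputs, which is why a batch scheduler can spread it across a cluster without any parallel programming on the user's side.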

Description:
Tools for reporting Job/Host/Cluster status. Job Arrays. All code is open source and hosted at http://gridengine.sunsource.net. Only the courtesy