Grid Engine Administration

Overview

This module covers
- Grid Problem Types
- Distributed Resource Management
- Grid Engine 6 Architecture
- Grid Engine 6 Variants
- How it works
- Grid Engine Scheduling

SGE training, consulting and special projects - BioTeam Inc. - http://www.bioteam.net

Grid Problem Types

“Grid” Problem Types

Tightly Coupled        Embarrassingly Parallel
1 + 1 = X              1 + 1 = X
X + 2 = Y              2 + 2 = Y
Y + 3 = Z              3 + 3 = Z
N + Z = W              N + N = W

Traditional Parallelism
- Single process in which several parallel elements (threads) must communicate to solve a single problem
- Parallel programming is complicated and far from automatic
- Users must explicitly use parallel programming tools and application environments
- Speedup on clusters not guaranteed
  - Depends on code, interconnect latency & the problem space

Traditional Parallelism
- Used to be a much more fractured space
  - PVM, LINDA, MPI, etc.
- MPI rules present day (*some exceptions)
- Infinite possibilities for admin headaches
  - MPI flavor “A” using interconnect type “B” via integration method “C” using SGE CPU $allocation_rule=“D”
  - Ouch.

Embarrassingly Parallel
- Also known as “Serial” or “Batch” Computing
- Large numbers of nearly identical jobs
  - Possibly only the input file changes
- Identical analysis on large pools of input data or queries
- Easily subdivided, mostly independent tasks

Embarrassingly Parallel
- Primary optimization focus
  - Often less focus on interconnect and latency issues
  - More focus on speed, size and scaling headroom of the storage infrastructure (or memory)
  - Performance tuning the grid scheduler (‘DRM’) and batch subsystem for high throughput
- Data locality issues often important

Generic Use Case
- Scientist has a problem
- She does not want or need to run ONE parallel application stretching across a 100 CPU cluster
- She wants to run a standalone program 100,000 or 1,000,000 times with slightly different input and output values

Embarrassingly Parallel
1 + 1 = X
2 + 2 = Y
3 + 3 = Z
N + N = W

Cliché Life Science use case

[Diagram: DNA or protein sequence(s) query (>Sequence1, >Sequence2, >Sequence3) -> “database” of known sequences (DB) -> Result (Result1)]

Comparing sequences of interest to known repositories in order to identify areas of biologically significant similarity
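The two columns on the “Grid” Problem Types slide can be sketched in a few lines of Python. This is only an illustration, not Grid Engine code: the function names are hypothetical, and a local thread pool stands in for the cluster scheduler purely to show that the independent tasks can be handed to any worker in any order, while the chained steps cannot.

```python
from concurrent.futures import ThreadPoolExecutor


def tightly_coupled(n):
    """Chain of steps: each one needs the previous result, so order is fixed."""
    value = 1 + 1                  # 1 + 1 = X
    for k in range(2, n + 1):
        value = value + k          # X + 2 = Y, Y + 3 = Z, ...
    return value


def independent_task(k):
    """One standalone job: depends only on its own input value."""
    return k + k                   # 1 + 1 = X, 2 + 2 = Y, 3 + 3 = Z, ...


def embarrassingly_parallel(n, workers=4):
    """Tasks share nothing, so they can be spread across workers freely.

    Assumption: the thread pool is a stand-in for a batch scheduler;
    on a real cluster each task would be a separate batch job.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, whatever the execution order
        return list(pool.map(independent_task, range(1, n + 1)))


if __name__ == "__main__":
    print(tightly_coupled(4))
    print(embarrassingly_parallel(4))
```

In the Generic Use Case, the scientist's workload has the second shape: the same standalone program run many times, with only the input changing between runs.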