
Job Scheduling Strategies for Parallel Processing: 7th International Workshop, JSSPP 2001 Cambridge, MA, USA, June 16, 2001 Revised Papers PDF

213 Pages·2001·4.004 MB·English


Lecture Notes in Computer Science 2221
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen

Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo

Dror G. Feitelson, Larry Rudolph (Eds.)

Job Scheduling Strategies for Parallel Processing
7th International Workshop, JSSPP 2001
Cambridge, MA, USA, June 16, 2001
Revised Papers

Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors
Dror G. Feitelson
The Hebrew University
School of Computer Science and Engineering
91904 Jerusalem, Israel
E-mail: [email protected]

Larry Rudolph
Massachusetts Institute of Technology
Laboratory for Computer Science
Cambridge, MA 02139, USA
E-mail: [email protected]

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Job scheduling strategies for parallel processing: 7th international workshop; revised papers / JSSPP 2001, Cambridge, MA, USA, June 16, 2001. Dror G. Feitelson; Larry Rudolph (ed.). - Berlin; Heidelberg; New York; Barcelona; Hong Kong; London; Milan; Paris; Tokyo: Springer, 2001
(Lecture notes in computer science; Vol. 2221)
ISBN 3-540-42817-8

CR Subject Classification (1998): D.4, D.1.3, F.2.2, C.1.2, B.2.1, B.6, F.1.2
ISSN 0302-9743
ISBN 3-540-42817-8 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
Springer-Verlag Berlin Heidelberg New York
a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2001
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Steingräber Satztechnik GmbH
Printed on acid-free paper
SPIN: 10840923 06/3142 543210

Preface

This volume contains the papers presented at the seventh workshop on Job Scheduling Strategies for Parallel Processing, which was held in conjunction with the SIGMETRICS 2001 Conference in Cambridge, MA, on June 16, 2001. The papers have been through a complete refereeing process, with the full version being read and evaluated by five to seven members of the program committee. We would like to take this opportunity to thank the program committee, Andrea Arpaci-Dusseau, Steve Chapin, Allen Downey, Wolfgang Gentzsch, Allan Gottlieb, Atsushi Hori, Richard Lagerstrom, Virginia Lo, Cathy McCann, Bill Nitzberg, Uwe Schwiegelshohn, Mark Squillante, and John Towns, for an excellent job. Thanks are also due to the authors for their submissions, presentations, and final revisions for this volume. Finally, we would like to thank the MIT Laboratory for Computer Science and the School of Computer Science and Engineering at the Hebrew University for the use of their facilities in the preparation of these proceedings.

This was the seventh annual workshop in this series, which reflects the continued interest in this field. The previous six were held in conjunction with IPPS'95 through IPDPS'00. Their proceedings are available from Springer-Verlag as volumes 949, 1162, 1291, 1459, 1659, and 1911 of the Lecture Notes in Computer Science series. The last three are also available on-line from Springer LINK.

This year saw an emphasis on workloads and their effect, with two invited papers related to heavy tails, and a contributed paper presenting a comprehensive analysis of the workload on a large-scale SMP cluster.
We also kept the tradition of having papers that describe real systems, with a much anticipated paper describing the MAUI scheduler. In addition, we saw the maturation of the field as reflected by papers that delved into deeper topics, such as the interaction of scheduling with memory management and communication, the use of SMP clusters, and considerations regarding what metrics to use.

We hope you find these papers interesting and useful.

August 2001
Dror Feitelson
Larry Rudolph

Table of Contents

Performance Evaluation with Heavy Tailed Distributions
    M.E. Crovella
SRPT Scheduling for Web Servers
    M. Harchol-Balter, N. Bansal, B. Schroeder, and M. Agrawal
An Efficient and Scalable Coscheduling Technique for Large Symmetric Multiprocessor Clusters
    A.B. Yoo and M.A. Jette
Coscheduling under Memory Constraints in a NOW Environment
    F. Giné, F. Solsona, P. Hernández, and E. Luque
The Influence of Communication on the Performance of Co-allocation
    A.I.D. Bucur and D.H.J. Epema
Core Algorithms of the Maui Scheduler
    D. Jackson, Q. Snell, and M. Clement
On the Development of an Efficient Coscheduling System
    B.B. Zhou and R.P. Brent
Effects of Memory Performance on Parallel Job Scheduling
    G.E. Suh, L. Rudolph, and S. Devadas
An Integrated Approach to Parallel Scheduling Using Gang-Scheduling, Backfilling, and Migration
    Y. Zhang, H. Franke, J.E. Moreira, and A. Sivasubramaniam
Characteristics of a Large Shared Memory Production Workload
    S.-H. Chiang and M.K. Vernon
Metrics for Parallel Job Scheduling and Their Convergence
    D.G. Feitelson
Author Index

Performance Evaluation with Heavy Tailed Distributions (Extended Abstract)*

Mark E. Crovella
Department of Computer Science
Boston University
111 Cummington St.
Boston MA USA 02215
[email protected]

1 Introduction

Over the last decade an important new direction has developed in the performance evaluation of computer systems: the study of heavy-tailed distributions. Loosely speaking, these are distributions whose tails follow a power-law with low exponent, in contrast to traditional distributions (e.g., Gaussian, Exponential, Poisson) whose tails decline exponentially (or faster). In the late '80s and early '90s experimental evidence began to accumulate that some properties of computer systems and networks showed distributions with very long tails [7,28,29], and attention turned to heavy-tailed distributions in particular in the mid '90s [3,9,23,36,44].

To define heavy tails more precisely, let X be a random variable with cumulative distribution function $F(x) = P[X \le x]$ and its complement $\bar{F}(x) = 1 - F(x) = P[X > x]$. We say here that a distribution F(x) is heavy tailed if

$\bar{F}(x) \sim c\,x^{-\alpha}, \quad 0 < \alpha < 2 \qquad (1)$

for some positive constant c, where $a(x) \sim b(x)$ means $\lim_{x \to \infty} a(x)/b(x) = 1$. This definition restricts our attention somewhat narrowly to distributions with strictly polynomial tails; broader classes such as the subexponential distributions [19] can be defined and most of the qualitative remarks we make here apply to such broader classes.

Heavy tailed distributions behave quite differently from the distributions more commonly used in performance evaluation (e.g., the Exponential). In particular, when sampling random variables that follow heavy tailed distributions, the probability of very large observations occurring is non-negligible. In fact, under our definition, heavy tailed distributions have infinite variance, reflecting the extremely high variability that they capture; and when α ≤ 1, these distributions have infinite mean.

* This is a revised version of a paper originally appearing in Lecture Notes in Computer Science 1786, pp. 1–9, March 2000.
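
As an illustration of definition (1), the following minimal sketch draws samples from a Pareto distribution, one concrete distribution with a strictly polynomial tail, and compares the empirical tail with the exact one. The function names, the scale parameter x_min, and the choice α = 1.5 are assumptions made for this example and do not come from the paper.

```python
import random


def sample_pareto(alpha, x_min, n, seed=0):
    """Draw n samples with tail P[X > x] = (x_min / x)**alpha for x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling: if U is uniform on (0, 1], then
    # x_min * U**(-1/alpha) has the Pareto tail above.
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1].
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def empirical_tail(samples, x):
    """Fraction of samples strictly greater than x, an estimate of P[X > x]."""
    return sum(1 for s in samples if s > x) / len(samples)


def pareto_tail(alpha, x_min, x):
    """Exact tail probability P[X > x] for the Pareto distribution."""
    return 1.0 if x < x_min else (x_min / x) ** alpha


if __name__ == "__main__":
    alpha, x_min = 1.5, 1.0  # illustrative values with 0 < alpha < 2
    xs = sample_pareto(alpha, x_min, 200_000)
    for x in (10.0, 100.0, 1000.0):
        print(x, empirical_tail(xs, x), pareto_tail(alpha, x_min, x))
```

On a log-log plot the printed tail probabilities would fall roughly on a straight line with slope −α, which is the power-law behavior the definition describes.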
D.G. Feitelson and L. Rudolph (Eds.): JSSPP 2001, LNCS 2221, pp. 1–10, 2001.
© Springer-Verlag Berlin Heidelberg 2001

2 Evidence

The evidence for heavy-tailed distributions in a number of aspects of computer systems is now quite strong. The broadest evidence concerns the sizes of data objects stored in and transferred through computer systems; in particular, there is evidence for heavy tails in the sizes of:

– Files stored on Web servers [3,9];
– Data files transferred through the Internet [9,36];
– Files stored in general-purpose Unix filesystems [25]; and
– I/O traces of filesystem, disk, and tape activity [21,38,39,40].

This evidence suggests that heavy-tailed distributions of data objects are widespread, and these heavy-tailed distributions have been implicated as an underlying cause of self-similarity in network traffic [9,30,35,44].

Next, measurements of job service times or process execution times in general-purpose computing environments have been found to exhibit heavy tails [17,23,28].

A third area in which heavy tails have recently been noted is in the distribution of node degree of certain graph structures. Faloutsos et al. [14] show that the inter-domain structure of the Internet, considered as a directed graph, shows a heavy-tailed distribution in the outdegree of nodes. These studies have already influenced the way that Internet-like graph topologies are created for use in simulation [32,26]. Another study shows that the same is true (with respect to both indegree and outdegree) for certain sets of World Wide Web pages which form a graph due to their hyperlinked structure [1]; this result has been extended to the Web as a whole in [6].

Finally, a phenomenon related to heavy tails is the so-called Zipf's Law [45]. Zipf's Law relates the "popularity" of an object to its location in a list sorted by popularity. More precisely, consider a set of objects (such as Web servers, or Web pages) to which repeated references are made. Over some time interval, count the number of references made to each object, denoted by R.
Now sort the objects in order of decreasing number of references made and let an object's place on this list be denoted by n. Then Zipf's Law states that

$R = c\,n^{-\beta}$

for some positive constants c and β. In its original formulation, Zipf's Law set β = 1 so that popularity (R) and rank (n) are inversely proportional. In practice, various values of β are found, with values often near to or less than 1. Evidence for Zipf's Law in computing systems (especially the Internet) is widespread [2,13,18,33]; a good overview of such results is presented in [5].

3 Implications of Heavy Tails

Unfortunately, although heavy-tailed distributions are prevalent and important in computer systems, their unusual nature presents a number of problems for performance analysis.

The fact that even low-order distributional moments can be infinite means that many traditional system metrics can be undefined. As a simple example, consider the mean queue length in an M/G/1 queue, which (by the Pollaczek-Khinchin formula) is proportional to the second moment of service time. Thus, when service times are drawn from a heavy-tailed distribution, many properties of this queue (mean queue length, mean waiting time) are infinite. Observations like this one suggest that performance analysts dealing with heavy tails may need to turn their attention away from means and variances and toward understanding the full distribution of relevant metrics. Most early work in this direction has focused on the shape of the tail of such distributions (e.g., [34]).

Some heavy-tailed distributions apparently have no convenient closed-form Laplace transforms (e.g., the Pareto distribution), and even for those distributions possessing Laplace transforms, simple systems like the M/G/1 must be evaluated numerically, and with considerable care [41].
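
The M/G/1 remark earlier in this section can be made concrete with a short worked derivation. It is only a sketch under assumptions that are not in the text: the service time S is taken to be exactly Pareto with scale $x_m$ and exponent α, and the symbols $x_m$, λ (arrival rate), ρ (utilization), and L (mean number in system) are introduced here for illustration.

```latex
% Pareto service time S with scale x_m > 0 and exponent \alpha (assumed model):
\bar{F}(x) = \left(\frac{x_m}{x}\right)^{\alpha}, \qquad
f(x) = \alpha\, x_m^{\alpha}\, x^{-\alpha-1}, \qquad x \ge x_m .

% Second moment of the service time:
E[S^2] = \int_{x_m}^{\infty} x^2 f(x)\,dx
       = \alpha\, x_m^{\alpha} \int_{x_m}^{\infty} x^{1-\alpha}\,dx
       = \begin{cases}
           \dfrac{\alpha\, x_m^{2}}{\alpha-2}, & \alpha > 2,\\[1ex]
           \infty, & \alpha \le 2 .
         \end{cases}

% Pollaczek-Khinchin formula for the mean number in an M/G/1 system:
L = \rho + \frac{\lambda^{2}\, E[S^{2}]}{2\,(1-\rho)}, \qquad
\rho = \lambda\, E[S] < 1 .
```

For 1 < α < 2 the mean service time $E[S] = \alpha x_m/(\alpha-1)$ is finite, so the queue can be stable with ρ < 1, yet $E[S^2]$ diverges and with it the mean queue length L.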
In practice, random variables that follow heavy tailed distributions are characterized as exhibiting many small observations mixed in with a few large observations. In such datasets, most of the observations are small, but most of the contribution to the sample mean or variance comes from the rare, large observations. This means that those sample statistics that are defined converge very slowly. This is particularly problematic for simulations involving heavy tails, which may be very slow to reach steady state [12].

Finally, because arbitrarily large observations cannot be ruled out, issues of scale should enter into any discussion of heavy tailed models. No real system can experience arbitrarily large events, and generally one must pay attention to the practical upper limit on event size, whether determined by the timescale of interest, the constraints of storage or transmission capacity, or other system-defined limits. On the brighter side, a useful result is that it is often reasonable to substitute finitely-supported distributions for the idealized heavy-tailed distributions in analytic settings, as long as the approximation is accurate over the range of scales of interest [16,20,22].

4 Taking Advantage of Heavy Tails

Despite the challenges they present to performance analysis, heavy tailed distributions also exhibit properties that can be exploited in the design of computer systems. Recent work has begun to explore how to take advantage of the presence of heavy tailed distributions to improve computer systems' performance.

4.1 Two Important Properties

In this regard, there are two properties of heavy tailed distributions that offer particular leverage in the design of computer systems. The first property is related to the fact that heavy tailed distributions show declining hazard rate, and is most concisely captured in terms of conditional expectation:

$E[X \mid X > k] \sim k$
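
The conditional-expectation property can be sketched numerically. The example below assumes an exact Pareto distribution with exponent α > 1, for which E[X | X > k] = αk/(α − 1) and therefore grows in proportion to k, and contrasts it with the memoryless Exponential, for which the expected excess over k stays constant. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def pareto_conditional_mean(alpha, k):
    """E[X | X > k] for a Pareto tail with exponent alpha > 1 and k >= scale."""
    if alpha <= 1:
        raise ValueError("the conditional mean is infinite when alpha <= 1")
    return alpha * k / (alpha - 1.0)


def exponential_conditional_mean(rate, k):
    """E[X | X > k] for an Exponential(rate) variable, which is memoryless."""
    return k + 1.0 / rate


def empirical_conditional_mean(samples, k):
    """Average of the samples that exceed k, or None if none do."""
    tail = [s for s in samples if s > k]
    return sum(tail) / len(tail) if tail else None


if __name__ == "__main__":
    alpha, x_min = 1.5, 1.0  # illustrative values, not taken from the paper
    rng = random.Random(0)
    # Inverse-transform sampling of a Pareto(alpha, x_min) variable.
    xs = [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(200_000)]
    for k in (1.0, 10.0, 100.0):
        print(
            k,
            pareto_conditional_mean(alpha, k),
            exponential_conditional_mean(1.0, k),
            empirical_conditional_mean(xs, k),
        )
```

Because the variance is infinite for α < 2, the empirical column fluctuates noticeably from run to run, which is itself an instance of the slow convergence described in Section 3.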
