edited by
Walter Freiberger
Division of Applied Mathematics
Brown University
Providence, Rhode Island

with the collaboration of

Ulf Grenander
Division of Applied Mathematics
Brown University
Providence, Rhode Island

Barry H. Margolin
Department of Statistics
Yale University
New Haven, Connecticut

Rhett F. Tsao
Thomas J. Watson Research Center, IBM
Yorktown Heights, New York

Proceedings of a Conference held at Brown University, Providence, Rhode Island, November 22-23, 1971, under the Auspices of the Division of Applied Mathematics and the Center for Computer and Information Sciences, and supported by the Office of Naval Research

Academic Press, New York and London, 1972

COPYRIGHT © 1972, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED.
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, RETRIEVAL SYSTEM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.

ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 72-77727

PRINTED IN THE UNITED STATES OF AMERICA

AUTHORS, DISCUSSANTS, AND SESSION CHAIRMEN

Harold A. Anderson, Jr., Thomas J. Watson Research Center, IBM, Yorktown Heights, New York 10598
F. J. Anscombe, Department of Statistics, Yale University, New Haven, Connecticut 06520
Sant R. Arora, Sperry Rand Corporation, Univac Data Processing Division, Roseville, Minnesota 55113
J. L. Baer, Computer Science Group, University of Washington, Seattle, Washington 98195
Y. Bard, IBM Scientific Center, Cambridge, Massachusetts 02139
H. Beilner, Department of Computer Sciences, University of Stuttgart, Stuttgart, Germany
L. A. Belady, Department of Computer Science, University of California, Berkeley, California 94720
L. W. Comeau, IBM Federal Systems Division, Gaithersburg, Maryland 20760
Cuthbert Daniel, Box 150, R.D. 2, Rhinebeck, New York 12572
Marvin Denicoff, Office of Naval Research, Arlington, Virginia 22217
John M. Feeley, Computer Sciences Corporation, Pasadena, California 91107
Walter Freiberger, Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912
H. P. Friedman, IBM Systems Research Institute, New York, New York 10017
J. Gerald, University of Chicago, Chicago, Illinois 61822
Ulf Grenander, Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912
Jerrold M. Grochow, Office of the Director of Information Processing Services, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
D. Hatfield, IBM Scientific Center, Cambridge, Massachusetts 02139
H. Hellerman, State University of New York, Binghamton, New York 13901
Z. Jelinski, McDonnell Douglas Astronautics Co., Huntington Beach, California 92647
Edwin R. Lassettre, IBM Systems Development Division, Poughkeepsie, New York 12602
M. M. Lehman, Thomas J. Watson Research Center, IBM, Yorktown Heights, New York 10598
P. A. W. Lewis, Naval Postgraduate School, Monterey, California 93940
Barry H. Margolin, Department of Statistics, Yale University, New Haven, Connecticut 06520
P. Moranda, McDonnell Douglas Astronautics Co., Huntington Beach, California 92647
J. F. Mount, National Cash Register Company, Dayton, Ohio 45401
R. G. Munck, Computing Laboratory, Brown University, Providence, Rhode Island 02912
R. P. Parmelee, IBM Scientific Center, Cambridge, Massachusetts 02139
Emanuel Parzen, Department of Statistics, State University of New York, Buffalo, New York 14214
M. P. Racite, IBM Systems Development Division, Poughkeepsie, New York 12002
N. Rasmussen, IBM Scientific Center Complex, White Plains, New York 10601
G. R. Sager, Computer Science Group, University of Washington, Seattle, Washington 98195
Robert G. Sargent, Department of Industrial Engineering, Syracuse University, Syracuse, New York 13210
M. Schatzoff, IBM Scientific Center, Cambridge, Massachusetts 02139
Allan L. Scherr, IBM Systems Development Division, Poughkeepsie, New York 12002
Akira Sekino, Project MAC, Cambridge, Massachusetts 02139
Martin L. Shooman, Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
K. V. Suryanarayana, Fayetteville State University, Fayetteville, North Carolina 28301
W. Timlake, IBM Scientific Center, Cambridge, Massachusetts 02139
Rhett F. Tsao, Thomas J. Watson Research Center, IBM, Yorktown Heights, New York 10598
G. Waldbaum, Thomas J. Watson Research Center, IBM, Yorktown Heights, New York 10598
Peter Wegner, Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912
Frank L. Wu, Sperry Rand Corporation, Univac Data Processing Division, Roseville, Minnesota 55113
A. C. Yeh, IBM Systems Development Division, Poughkeepsie, New York 12002
P. C. Yue, Thomas J. Watson Research Center, IBM, Yorktown Heights, New York 10598

PREFACE

It was the purpose of this conference to investigate a new and promising field in computer science: the application of quantitative, and particularly statistical, methods to the study of computer performance. The present state of the art of the field is somewhat paradoxical: there is, on the one hand, a wealth of data, gathered, usually, in rather haphazard fashion; and there are, on the other hand, theoretical models, mostly of queueing-theory type, whose stringent assumptions are not examined and whose conclusions are not verified. We have tried to exclude both types of papers and to accept only those that dealt with real data in a reasonably sophisticated manner.

The organizing committee of the conference consisted of Professor Walter Freiberger (Chairman), Brown University; Professor Ulf Grenander, Brown University; Professor Barry H. Margolin, Yale University; and Dr. Rhett F. Tsao, Thomas J. Watson Research Center, IBM.

We are very grateful to the Office of Naval Research, particularly to Dr. Robert J. Lundegard, Director of its Mathematical and Information Sciences Division, and to Dr. Marvin Denicoff, Director of its Information Systems Program, for their advice and support, which made the conference possible. Mrs. Katrina Avery edited and typed the entire manuscript with dedication, skill, and dispatch. We are, of course, particularly indebted to the speakers, discussants, and chairmen who made the conference a success, and who helped open up a field which will not only enhance our understanding of computer systems but possibly also contribute to statistical methodology.

Walter Freiberger
Professor of Applied Mathematics
Director, Center for Computer and Information Sciences
Brown University

FOREWORD

We have lived through two decades in which challenges for measures and proofs of computer system efficiency were thrust aside, or nimbly avoided by relentless and even dazzling innovation.
In response to those nasty questions about efficiency, we provided more speed, larger memories; 1st, 2nd, and 3rd generation hardware; paging, time-sharing, multiprogramming, multiprocessing; the progression from assembly through natural language. We performed. Certainly, we did perform. But I would suggest that this conference symbolizes the fact that, at long last, we are about to get down to the serious business of measuring that performance.

The time has come to raise questions, as this conference will, of how we go about defining not simply what the measures are, but what should be measured, and how the measuring should be done. The legacy of the past is frightening complexity as well as breathtaking development. From whose standpoint do we evaluate efficiency: the programmer, the individual user, the manager of the computer installation, the functional organization? In what time-frame do we apply our measurement instruments: a single day, a month, a year of computation? Are we concerned about optimizing the throughput of one job, some unique combination of jobs, all of the jobs? Are these concerns and interests even compatible?

In addition to these global questions, there remains the practical matter of measuring the utilization of computing resources; this interest is tremendously complicated by the advent of the multiprogramming, multiprocessing environment. In opposition to the philosophy that inherent in these, or any, hardware or software innovations was the promise of greater efficiency, we are beginning to see doubts and even suspicions of inefficiency in these same developments. The managers once again are breathing down our necks. The time has come to bring in the statisticians and the mathematicians. I am pleased to be in the company of such brave men.

Marvin Denicoff
Program Director, Information Systems
Office of Naval Research

QUANTITATIVE METHODS FOR EVALUATING COMPUTER SYSTEM PERFORMANCE: A REVIEW AND PROPOSALS

U. Grenander and R. F. Tsao

ABSTRACT: The purpose of this paper is to appraise the state of the art of computer system evaluation and to review some quantitative methods (i.e., analytical, simulation and empirical methods) which are applicable to the problem. The main theme of the paper is that statistical techniques will have to be used to reach satisfactory results.

I. INTRODUCTION: STATE OF THE ART

1.1 The problems: comparing, tuning and designing

The problem with which we are concerned is of more recent origin than the digital computer. In the early days of computer technology, the question of evaluating a computer reduced to finding some of the physical parameters of the machine, such as the speed of fundamental arithmetic operations, memory capacity and I/O limitations. As computer technology advanced and the architecture of the systems tended to become more complex, it gradually became clear that one or a few figures of merit like the ones mentioned would no longer suffice for evaluating computing systems. One then faced a formidable problem and, although it has received much attention in the literature, no definitive conclusions have yet been reached as to its solution. Much controversy surrounds this question, and it is the purpose of this paper to appraise the state of the art and to evaluate the methods that can be used. The main theme of the paper is that modern statistical methodology offers us powerful tools enabling us to reach satisfactory results.

Let us first take a look at what constitutes the problem.
We shall describe it on three different levels. Say, to begin with, that we are given two systems and we simply want to compare them from the point of view of a potential buyer. Physical parameters measuring speed and capacity are certainly necessary, but they are not sufficient, and the reason for this is simply that the value of the system to a particular owner depends to a high degree on what sort of workload he will put on it. It is not possible to base the comparison only on such parameters, since the performance of the system cannot be evaluated in a vacuum without taking the workload into account. The difficulty, or at least a part of it, lies in characterizing workloads and relating them to the characteristics of the system.

On the next level we start from a given system and wish to tune it: i.e., to change algorithms or other hardware and/or software components in such a way that higher efficiency is obtained. The performance of a system is typically a multiparameter phenomenon, and the problem becomes very complicated when the interrelationships among these system parameters have to be taken into account. Again, the workload appears as a crucial part of the problem. Therefore, it is far from easy to advance a single convincing numerical criterion as a measure of efficiency. We need a set of rationally selected criteria for performance with meaningful economic interpretation.

On the third, and highest, level of ambition, we start already with the design of the system. An additional complexity is introduced at this level by the fact that in the process of designing a system there exist enormous numbers of alternative decisions. It is often difficult, if not impossible, to trace the effect of a particular design decision on the overall performance of a system after it is built. A thorough analysis should lead to a method of predicting changes in the efficiency of a system due to changes in its design. Until now, the emphasis at this stage has been on making the system work, with less attention paid to efficiency, but the attitude, both of the computer manufacturers and of the buyers, is clearly changing. There is no doubt that a more systematic and quantitative approach is needed.

It would be too optimistic to hope for a drastic breakthrough on all three levels via quantitative methodology alone. We shall, however, show how some significant improvements in methodology can help at each level (i.e., comparison, tuning and design) of system performance evaluation. Before turning to this, it will be useful to look at some ways in which the problem has been approached.
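As a small numerical illustration of the first-level (comparison) problem, the sketch below scores two systems by a workload-weighted throughput and shows the ranking reversing as the job mix changes. It is given here in Python, and every figure, job class and the weighted-throughput score itself is purely hypothetical; none of it is drawn from the conference papers.

    # Hypothetical sketch only: all numbers are invented to make one point,
    # namely that the ranking of two systems depends on the workload mix.
    rates = {
        # jobs completed per hour, by job class (hypothetical figures)
        "A": {"cpu_bound": 120.0, "io_bound": 40.0},
        "B": {"cpu_bound": 80.0, "io_bound": 70.0},
    }

    def weighted_throughput(system, mix):
        # mix maps job class -> fraction of the workload (fractions sum to 1)
        return sum(frac * rates[system][cls] for cls, frac in mix.items())

    workloads = {
        "cpu-heavy installation": {"cpu_bound": 0.8, "io_bound": 0.2},
        "io-heavy installation": {"cpu_bound": 0.2, "io_bound": 0.8},
    }

    for name, mix in workloads.items():
        scores = {s: weighted_throughput(s, mix) for s in rates}
        best = max(scores, key=scores.get)
        print(f"{name}: A = {scores['A']:.0f}, B = {scores['B']:.0f} -> prefer {best}")

With these invented figures, system A appears superior to a CPU-heavy installation and system B to an I/O-heavy one, which is precisely why speed and capacity parameters alone cannot settle the comparison.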
1.2 Current approach to evaluation: an appraisal

There are quite a few papers presenting a general survey and review of computer system performance evaluation; e.g., Calingaert [4] and Drummond [9]. The main purpose of this section, however, is to present our personal views on the appraisal of current approaches and to set the stage for discussion of some quantitative methods in the next section. Our appraisal of current approaches to evaluation can be summarized as follows:

(1) There appear to be two divergent efforts, which we shall refer to as analytical studies and system-oriented studies. Analytical studies include, for the most part, the probabilistic model building of queueing theory. System-oriented studies are mainly concerned with the actual measurement and simulation of a real computing system. There is no question that both efforts have proven worthwhile, and we shall discuss them at length in the next section. There exists, however, a large gap between the complexity of the models theoretically analyzable and that of even the simplest real systems. It is questionable whether the actual design of a real system has benefited from the analytical studies of oversimplified models. On the other hand, for reasons which will become clear later, we doubt that the measurement and simulation activities of a particular system have significantly improved the general understanding of computer systems.

(2) As computer systems become more complex, the choice of criteria for evaluating systems becomes more difficult. Some effort has been made in the search for reasonable criteria for either tuning or comparing systems; e.g., Stimler [22] has proposed six criteria for time-sharing system performance, and Schatzoff et al. [19] have applied another set of six measures for comparing time-sharing and batch-processing systems. It is clear from these studies that by its very nature the evaluation of system performance is a multidimensional problem. However, most efforts stop short of a quantitative investigation of the trade-offs among these performance measures. The measures themselves are usually in terms of averages, whose use is often necessitated by the primitive stage of the methodology applied. For example, one could argue that, from a time-sharing user's viewpoint, the variance of the response time is as important as the mean of the response time. One common characteristic which