
Computer Architecture Technology Trends PDF

53 Pages·1991·9.738 MB·English

Preview Computer Architecture Technology Trends

COMPUTER ARCHITECTURE TECHNOLOGY TRENDS

SECOND EDITION
SEPTEMBER 1991

ARCHITECTURE TECHNOLOGY CORPORATION
SPECIALISTS IN COMPUTER ARCHITECTURE
P.O. BOX 24344 · MINNEAPOLIS, MINNESOTA 55424 · (612) 935-2035

DISTRIBUTED OUTSIDE THE USA/CANADA BY:
ELSEVIER ADVANCED TECHNOLOGY
MAYFIELD HOUSE
256 BANBURY ROAD
OXFORD OX2 7DH
UNITED KINGDOM

© Copyright 1991 Architecture Technology Corporation. All rights reserved. No part of this publication may be reproduced, photocopied, stored on a retrieval system, or transmitted without the express prior written consent of the publisher.

DISCLAIMER

Architecture Technology Corporation makes no representations or warranties with respect to the contents hereof and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. Further, reasonable care has been taken to ensure the accuracy of this report, but errors and omissions could have occurred. Architecture Technology assumes no responsibility for any incidental or consequential damages caused thereby. Further, Architecture Technology Corporation reserves the right to revise this guide and to make changes from time to time in the content thereof without obligation to notify any person or organization of such revision or changes. This disclaimer applies to all parts of this document.

List of Figures

Figure 1: Flynn Machine Organization
Figure 2: Skillicorn Classification Method
Figure 3: Skillicorn Architecture Classification
Figure 4: SISD Machine
Figure 5: SISD Machine State Diagram
Figure 6: Harvard/Princeton Class Computers
Figure 7: Type 1 Array Processor
Figure 8: Type 2 Array Processor
Figure 9: Tightly Coupled Processor
Figure 10: Loosely Coupled Multiprocessor
Figure 11: High-level Taxonomy of Parallel Computer Architectures
Figure 12: Systolic Flow of Data From and To Memory
Figure 13: MIMD Distributed Memory Architecture Structure
Figure 14: MIMD Interconnection Network Topologies
Figure 15: Graph Reduction Machine
Figure 16: Graph Reduction Machine
Figure 17: Dataflow Machine
Figure 18: System Classification
Figure 19: Arithmetic Processors
Figure 20: Block Diagram of the PIPE Machine
Figure 21: One-to-many Communications
Figure 22: Inter- and Intragroup Communications
Figure 23: Many-to-many (Group-to-group) Communications
Figure 24: Communication Pattern for Overlapping Groups
Figure 25: Multiprocessor Interconnection Structures

1. Introduction

This report is about the trends which are taking place in the architecture of computing systems. It serves as an introduction to a series of reports detailing the recent advances in computing.
Here, we provide insight into the fundamentals of computer architecture - what it is, how it is applied to fit a particular problem definition, and where the future leads, given current trends in computing architecture. Webster defines architecture as "the art or science of building; specifically, the art or practice of designing and building structures". The connotation of architecture as art is particularly appropriate in the context of computing systems and their requirements. For some, computer architecture is the interface seen by the machine language programmer, namely, the instruction set. For others, computer architecture is a subset of computer systems organization (such as the classification given in ACM Computing Reviews). We shall view architecture in a two-tiered hierarchical fashion, first classifying computing engine architectures, then building upon this classification to encompass system-level architectures. This is a rather broad view of computer architecture which encompasses structure, organization, implementation, and performance metrics. The differences between these attributes can be subtle: structure is the interconnection of the various hardware elements of a computer system; organization, the dynamic interactions and management of the elements; implementation, the design of specific pieces; and performance, the behavior of the system.

The selection of an appropriate computing architecture for any given processing element is dependent upon a number of factors - these are introduced in section 4 of the paper. This is an area which for years has lacked any scientific discipline. Typically, one hears of all sorts of figures measured in MIPS and MFLOPS. Granted, there are several features common to computers which contribute to a system's performance. These include the number and size/speed of central processing units, the amount of real memory, the amount of virtual memory (if any), disk capacity and I/O speed, and the number and speed of internal busses. However, these technical figures of merit address only the smallest part of the overall picture required to make an informed decision on which architecture best fits the problem at hand. While there is no magic formula one can apply in answering the question even today, there is a basic set of guiding principles which can ensure a more objective decision-making process.

Finally, we survey the current trends in computer architecture. Due to the sheer number of different applications to which computers are being applied today, there seems no end to the different adaptations which proliferate. There are, however, some underlying trends which appear. Decision makers should be aware of these trends when specifying architectures, particularly for future applications. The degree to which these trends affect individual system designs varies. Highly specialized applications, such as embedded systems, may not be as concerned with these trends since the primary objective of such systems is to pay a minimum cost for sufficient performance in a given market window. Yet, even in such circumstances, the need to offer scalable solutions may be an important factor which requires some understanding of these trends.

By introducing the basic parlance of computer architecture and the process by which an architecture is selected, industry trends take on new meaning. This baseline of knowledge is applicable to all of the market segments which make up the computing industry today.
2. Machine Architecture

The description of a computing machine's architecture captures the organizational aspects which allow us to describe its operation and compare it with other machines. An architecture consists of a machine organization and an instruction set; it resolves the division of labor between hardware and software. The classification of architectures is useful for three primary reasons. First, it helps us to understand what has already been achieved. Second, it reveals possible configurations which may not have originally occurred to designers. Third, it allows useful models of performance to be built and tested.

In very simple terms, a computing machine can be thought of as applying a sequence (stream) of instructions to a sequence (stream) of data. To achieve better performance, it is necessary at some point to find ways to do more than one thing at a time within this machine. In order to define and better understand the parallelism possible within computing machines, Flynn [1] categorized machine organization into a generally accepted set based on instruction and data stream multiplicity. Flynn allows for both single and multiple data and instruction streams, giving rise to four categories of architectures (Figure 1).

                                     Single Data Stream (SD)    Multiple Data Stream (MD)
  Single Instruction Stream (SI)     SISD (uniprocessor)        SIMD (parallel processor)
  Multiple Instruction Stream (MI)   MISD                       MIMD (multiprocessor)

Figure 1: Flynn Machine Organization

Recently, Skillicorn [2] further extended this taxonomy in order to categorize and relate the growing variety of multiprocessors. His classification scheme involves four abstraction levels (Figure 2).

Figure 2: Skillicorn Classification Method (model of computation; abstract machine model: no. of instruction processors, no. of data processors, connection structure; performance model: simple or pipelined, state diagram; implementation model: implementation technology, speed)

The highest level classifies the model of computation being used. Most computing architectures to date have used the traditional von Neumann machine model of computation. This type of machine is based upon the sequential execution of threads of instructions operating upon data stored in named locations. There are, however, other models of computation possible (with corresponding machines in existence) which we will discuss later in this paper.

The next (second) level addressed is the abstract machine level - this level is roughly equivalent to Flynn's classification. Here, machines are classified using four types of functional units as building blocks. These units consist of:

1. An instruction processor (IP), a functional unit which acts as an interpreter of instructions. This device is sometimes referred to as the control unit within a processor.

2. A data processor (DP), a functional unit that transforms data, as in the executor of an arithmetic operation. The DP can be considered an arithmetic-logic unit or ALU.

3. A memory hierarchy, an intelligent device that passes data to and from the instruction and data processors. The real-world analogy to this device is the use of a memory cache, layered above main memory, layered above disk storage and any other types of I/O.

4. A switch, an abstract device that acts as a connector between other functional units. A switch may be one of four types:

   a) 1-to-1 - this type of switch acts as a conductor for information between any two functional units. Information may flow in either direction.

   b) n-to-n - this type of switch is a 1-to-1 connection replicated n times.

   c) 1-to-n - in this configuration, 1 functional unit connects to n functional units.

   d) n-by-n - in this configuration, each functional unit can communicate with any other functional unit.
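To make the abstract machine level concrete, the following minimal sketch (in Python, purely for illustration) encodes a machine as its counts of IPs and DPs plus the switches connecting them, following the column headings of Figure 3 below; the class, its field names, and the small helper that guesses the corresponding Flynn category are conveniences of this sketch, not part of Skillicorn's notation.

from dataclasses import dataclass
from typing import Optional

@dataclass
class AbstractMachine:
    """One entry at Skillicorn's abstract machine level, using the column
    headings of Figure 3 (switch values: "1-1", "n-n", "1-n", "nxn", or None)."""
    ips: int                        # number of instruction processors
    dps: int                        # number of data processors
    ip_dp: Optional[str] = None     # switch between IPs and DPs
    ip_dm: Optional[str] = None     # column "IP-DM" in Figure 3
    dp_dm: Optional[str] = None     # column "DP-DM" in Figure 3
    dp_dp: Optional[str] = None     # switch among DPs

    def flynn_class(self) -> str:
        # Rough mapping onto Flynn's categories; machines with no IP
        # (the reduction/dataflow classes) fall outside Flynn's scheme.
        if self.ips == 0:
            return "outside Flynn (reduction/dataflow)"
        if self.ips == 1:
            return "SISD" if self.dps == 1 else "SIMD"
        return "MISD" if self.dps == 1 else "MIMD"

# Class 6 of Figure 3: the von Neumann uniprocessor.
von_neumann = AbstractMachine(ips=1, dps=1, ip_dp="1-1", ip_dm="1-1", dp_dm="1-1")
print(von_neumann.flynn_class())    # -> SISD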
The third abstraction level in the taxonomy addresses machine implementations. This is not used to differentiate the physical implementations of machines, but rather to illustrate machine operation through the use of state diagrams. Skillicorn proposes a fourth (and lowest) abstraction level in this taxonomy in which the physical implementations of a machine are qualified.

The additional SIMD and MIMD classes identified in Skillicorn's taxonomy give rise to twenty-eight possible architectures, as shown in Figure 3.

Class  IPs  DPs  IP-DP  IP-DM  DP-DM  DP-DP  Name                                        Flynn Class
  1     0    1   none   none   1-1    none   reduct/dataflow uniprocessor
  2     0    n   none   none   n-n    none   separate machines
  3     0    n   none   none   n-n    nxn    loosely coupled reduct/dataflow
  4     0    n   none   none   nxn    none   tightly coupled reduct/dataflow
  5     0    n   none   none   nxn    nxn
  6     1    1   1-1    1-1    1-1    none   von Neumann uniprocessor                    SISD
  7     1    n   1-n    1-1    n-n    none                                               SIMD
  8     1    n   1-n    1-1    n-n    nxn    Type 1 array processor                      SIMD
  9     1    n   1-n    1-1    nxn    none   Type 2 array processor                      SIMD
 10     1    n   1-n    1-1    nxn    nxn                                                SIMD
 11     n    1   1-n    n-n    1-1    none                                               MISD
 12     n    1   1-n    n-n    1-1    none                                               MISD
 13     n    n   n-n    n-n    n-n    none   separate von Neumann uniprocessors
 14     n    n   n-n    n-n    n-n    nxn    loosely coupled von Neumann                 MIMD
 15     n    n   n-n    n-n    nxn    none   tightly coupled von Neumann                 MIMD
 16     n    n   n-n    n-n    nxn    nxn                                                MIMD
 17     n    n   n-n    nxn    n-n    none                                               MIMD
 18     n    n   n-n    nxn    n-n    nxn                                                MIMD
 19     n    n   n-n    nxn    nxn    none   Denelcor Heterogeneous Element Processor    MIMD
 20     n    n   n-n    nxn    nxn    nxn                                                MIMD
 21     n    n   nxn    n-n    n-n    none
 22     n    n   nxn    n-n    n-n    nxn
 23     n    n   nxn    n-n    nxn    none
 24     n    n   nxn    n-n    nxn    nxn
 25     n    n   nxn    nxn    n-n    none
 26     n    n   nxn    nxn    n-n    nxn
 27     n    n   nxn    nxn    nxn    none
 28     n    n   nxn    nxn    nxn    nxn

Figure 3: Skillicorn Architecture Classification

2.1 Machine Organization

The following sections will begin by discussing the four architecture types within Flynn's classification. As mentioned earlier, this classification corresponds to the second highest level of architecture classification presented by Skillicorn. Within each of the four sections, individual architectures will be presented utilizing Skillicorn's taxonomy, as it provides a finer level of detail between the various SIMD and MIMD architectures. A fifth and final section is presented for machine architectures which do not fit well within Flynn's classification; these are the machines addressed by Skillicorn's highest level of taxonomy - the model of computation. Skillicorn has also shown that locality-based computation, the foundation of an architecture-independent programming language grounded in the Bird-Meertens formalism, makes architecture-independent parallel programming possible [3]. This is an important result as users try to exploit advanced parallel architectures.

2.1.1 SISD

The SISD or single instruction, single data stream architecture is the simplest of the four machine types in Flynn's classification. It consists of a single instruction stream and a single data stream. This is represented using a single IP, a single DP, and two memory hierarchies as arranged in Figure 4.

Figure 4: SISD Machine
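As an illustration of this structure, the minimal sketch below models the Figure 4 arrangement - one IP, one DP, and separate instruction and data memory hierarchies - and steps through the fetch/dispatch/execute cycle described in the next paragraph. The instruction format, opcodes, and names are invented for the sketch and are not taken from the report.

# Minimal SISD sketch: one IP, one DP, separate instruction and data
# memory hierarchies (the Figure 4 arrangement). Illustrative only.
def run_sisd(instruction_memory, data_memory):
    """IP loop: fetch an instruction, hand the operation and operand
    addresses to the DP, and use the DP's result to continue."""
    pc = 0                                            # IP holds the next instruction address
    while pc < len(instruction_memory):
        op, dst, src1, src2 = instruction_memory[pc]  # fetch from the instruction hierarchy
        # The DP performs the operation on operands named by address.
        if op == "add":
            data_memory[dst] = data_memory[src1] + data_memory[src2]
        elif op == "mul":
            data_memory[dst] = data_memory[src1] * data_memory[src2]
        elif op == "halt":
            break
        pc += 1                                       # proceed to the next instruction
    return data_memory

program = [("add", 2, 0, 1), ("mul", 3, 2, 2), ("halt", 0, 0, 0)]
print(run_sisd(program, [3, 4, 0, 0]))                # -> [3, 4, 7, 49]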
In operation, the IP repeatedly provides the instruction memory hierarchy with the address of the instruction desired, then fetches the instruction from the instruction memory hierarchy. The IP then sends operational commands to the DP, also specifying the address of the operands within the data memory hierarchy. The DP, after performing the operations, provides the IP with result codes (see Figure 5).

Figure 5: SISD Machine State Diagram

The usefulness of Skillicorn's fourth level of taxonomy becomes apparent when we examine two machines of the same architectural type (SISD in our case) which differ in implementation. The Harvard class computer utilizes dedicated address and operand busses to increase throughput. The Princeton class computer operates with a single non-dedicated bus (Figure 6).

The SISD architecture typifies most conventional computing equipment in use today. To date, there is no better architecture for general-purpose problem solving than the SISD. However, given a certain level of technology, the only way to increase performance is to introduce some form of parallelism into the computing process. Performance requires parallelism; generality restricts the need for special-purpose computing.

The performance of an SISD machine can be improved by overlapping certain operations taking place within the machine. This process is commonly referred to as pipelining. Consider the state diagrams presented in Figure 5. If the number of stages in a state diagram is n, and the time required to execute each stage is t, then the time required to execute an average instruction is n*t. If we allow more than one token to be present within a state diagram at a time, given that no more than one token is present at any given stage at any one time, then we have introduced pipelining. The time required to execute each instruction is still n*t. However, the average rate at which instructions are completed is 1/t - an n-fold increase in throughput (given that n tokens are always in process). Such a performance scenario is ideal, and actual performance is limited by a number of issues. These issues include the degradation due to branch instructions (instructions in process may have to be flushed due to a branch condition), I/O latency problems, and the relative power of different processor instruction sets.

Several metrics regarding SISD performance have been developed. Flynn [4] characterized the performance, or number of instructions executed per unit time, of an overlapped SISD machine based upon the amount of instruction stream turbulence introduced by conditional branching:

perf. = (J / (L * t)) * (1 / [1 + p(J - N - 1)])

where:

L = average instruction execution time
t = single instruction decode time
J = number of instructions in a single instruction stream that are being processed during the latency time for one instruction (the stream inertia factor); in other words, the number of instructions being operated upon during one L.
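To make the pipelining arithmetic and the turbulence formula concrete, here is a small hedged sketch: the first helper simply restates the n-stage argument above, and the second evaluates Flynn's expression exactly as printed, treating p and N as opaque parameters. The helper names and the sample numbers are illustrative only.

def pipeline_rates(n_stages, stage_time):
    """Ideal pipelining arithmetic from the discussion above: per-instruction
    latency stays n*t, while the completion rate rises from 1/(n*t) to 1/t."""
    latency = n_stages * stage_time            # n*t, unchanged by pipelining
    unpipelined_rate = 1.0 / latency           # one completion every n*t
    pipelined_rate = 1.0 / stage_time          # one completion every t (n tokens in flight)
    return latency, unpipelined_rate, pipelined_rate

def flynn_perf(J, L, t, p, N):
    """Flynn's overlapped-SISD performance, evaluated exactly as printed:
    perf = (J / (L * t)) * (1 / [1 + p(J - N - 1)]).
    p and N are passed through as given; with p = 0 the bracketed term is 1
    and the expression reduces to J / (L * t)."""
    return (J / (L * t)) * (1.0 / (1.0 + p * (J - N - 1)))

latency, slow, fast = pipeline_rates(n_stages=5, stage_time=10e-9)
print(latency, slow, fast)                     # 50 ns latency; completion rate improves 5-fold
print(flynn_perf(J=5, L=5, t=10e-9, p=0.02, N=1))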
