US008904398B2

(12) United States Patent
Chung et al.

(10) Patent No.: US 8,904,398 B2
(45) Date of Patent: Dec. 2, 2014

(54) HIERARCHICAL TASK MAPPING

(75) Inventors: I-Hsin Chung, Chappaqua, NY (US); David J. Klepacki, New Paltz, NY (US); Che-Rung Lee, PingTung (TW); Hui-Fang Wen, Chappaqua, NY (US)

(73) Assignee: International Business Machines Corporation, Armonk, NY (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 213 days.

(21) Appl. No.: 13/413,286

(22) Filed: Mar. 6, 2012

(65) Prior Publication Data: US 2012/0254879 A1, Oct. 4, 2012

Related U.S. Application Data
(60) Provisional application No. 61/470,232, filed on Mar. 31, 2011.

(51) Int. Cl.: G06F 15/76 (2006.01); G06F 9/46 (2006.01); G06F 15/173 (2006.01); G06F 9/50 (2006.01)

(52) U.S. Cl.: CPC G06F 9/5066 (2013.01); USPC 718/102; 718/104; 718/105; 712/12; 712/13; 709/226

(58) Field of Classification Search: CPC G06F 9/4881. See application file for complete search history.

Primary Examiner: Lewis A Bullock, Jr.
Assistant Examiner: Kevin X Lu
(74) Attorney, Agent, or Firm: Scully, Scott, Murphy & Presser, P.C.; Daniel P. Morris, Esq.

(57) ABSTRACT

Mapping tasks to physical processors in a parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.

24 Claims, 4 Drawing Sheets

Sheet 1 of 4: FIG. 1 (flow diagram: 102, partition tasks into supernodes; 104, map supernodes to physical processors; 106, fine tune mapping)
Sheet 2 of 4: FIG. 2
Sheet 3 of 4: FIG. 3A, FIG. 3B
Sheet 4 of 4: FIG. 4 (block diagram: processors 12 with task mapping module 10, memory 16, bus 14, storage system 18, I/O interface(s) 20, network adaptor 22, network 24, device(s) 26, display 28)

HIERARCHICAL TASK MAPPING

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 61/470,232 filed on Mar. 31, 2011, which is incorporated by reference herein in its entirety.

FIELD

The present application relates generally to computers, and computer applications, parallel computing and more particularly to task mapping in parallel computing systems.

BACKGROUND

As high performance computing systems scale up, mapping the tasks of a parallel application onto physical processors to allow efficient communication becomes one of the challenging problems. Many mapping techniques have been developed to improve the application communication performance. First, graph embedding has been studied and applied to optimize very large scale integrated (VLSI) circuits. See, e.g., John A. Ellis. Embedding rectangular grids into square grids. IEEE Trans. Comput., 40(1):46-52, 1991; Rami G. Melhem and Ghil-Young Hwang. Embedding rectangular grids into square grids with dilation two. IEEE Trans. Comput., 39(12):1446-1455, 1990. The graph embedding for VLSI circuits tries to minimize the longest path. Second, space filling curves (See, e.g., Space-Filling Curves. Springer-Verlag, 1994) are applied to map parallel programs onto parallel computing systems. The use of space filling curves to improve proximity for mapping is well studied and has been found useful in parallel computing. The paper, Masood Ahmed and Shahid Bokhari. Mapping with space filling surfaces. IEEE Trans. Parallel Distrib. Syst., 18:1258-1269, September 2007, extends the concept of space filling curves to space filling surfaces. It describes three different classes of space filling surfaces and calculates the distance between facets.

There are methods using graph-partitioning and search based optimization to solve the mapping problem. For example, G. Bhanot, A. Gara, P. Heidelberger, E. Lawless, J. C. Sexton, and R. Walkup. Optimizing task layout on the blue gene/l supercomputer. IBM Journal of Research and Development, 49(2):489-500, March 2005, uses an off-line simulated annealing to explore different mappings on Blue Gene/L™.

The work in Hao Yu, I-Hsin Chung, and Jose Moreira. Topology mapping for blue gene/l supercomputer. In SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, page 116, New York, N.Y., USA, 2006. ACM, developed topology mapping libraries. The mapping techniques are based on folding heuristics. The methods based on folding heuristics require that the topologies of guest and host be known in advance.

Recently, new mapping techniques have been developed. See, e.g., Abhinav Bhatelé, Eric Bohm, and Laxmikant V. Kalé. A case study of communication optimizations on 3d mesh interconnects. In Euro-Par '09: Proceedings of the 15th International Euro-Par Conference on Parallel Processing, pages 1015-1028, Berlin, Heidelberg, 2009. Springer-Verlag.

In terms of supporting message passing interface (MPI) topology functions, there are works done for specific systems: Jesper Larsson Traff. Implementing the mpi process topology mechanism. In Supercomputing, pages 1-14, 2002, uses a graph-partitioning based approach for embedding, and Sangman Moh, Chansu Yu, Dongsoo Han, Hee Yong Youn, and Ben Lee. Mapping strategies for switch-based cluster systems of irregular topology. In 8th IEEE International Conference on Parallel and Distributed Systems, Kyongju City, Korea, June 2001, describes embedding techniques for switch-based networks.

BRIEF SUMMARY

A method for mapping tasks to physical processors in a parallel computing system in a hierarchical manner may be provided. In one aspect, a method for mapping tasks to physical processors in parallel computing may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication pattern and frequency. The method may also include mapping the groups of tasks to groups of physical processors, respectively. The method may further include fine tuning the mapping of tasks to processors within each of the groups.

A system for mapping tasks to physical processors in a parallel computing system, in one aspect, may include a module operable to execute on a processor, and further operable to partition tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication pattern and frequency. The module may be further operable to map the groups of tasks to groups of physical processors, respectively. The module may be further operable to fine tune the mapping of tasks to processors within each of the groups.

A computer readable storage medium storing a program of instructions executable by a machine to perform one or more methods described herein also may be provided.

Further features as well as the structure and operation of various embodiments are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 is a flow diagram illustrating a method of the present disclosure in one embodiment.

FIG. 2 illustrates an example of the Moore's space filling curve constructed recursively.

FIGS. 3A and 3B show two examples of mapping 64 tasks into a 4×4×4 cube.

FIG. 4 illustrates a schematic of an example computer or processing system that may implement the task mapping system in one embodiment of the present disclosure.

DETAILED DESCRIPTION

In the present disclosure in one embodiment, a hierarchical mapping algorithm is disclosed. The hierarchical mapping algorithm in one embodiment includes partitioning tasks into groups of related tasks, assigning the groups of tasks to groups of physical processors and refining or fine tuning the assignments within each group. A group of tasks is referred to as a supernode.

In one embodiment of the present disclosure, the tasks that are grouped together are related by the frequency of communication among one another. For instance, the hierarchical mapping algorithm may partition the tasks by utilizing a run-time communication matrix to preserve the locality of communication. The algorithm then may extend the Moore's space filling curve on the task partitions for global mapping. Each partition may be further fine tuned using a local search method to improve the communication performance.

The method of the present disclosure in one embodiment tries to preserve the locality and reduce the communication time. An example of the method of the present disclosure in one embodiment further extends the efforts of the known methods so the mapping can be handled efficiently on large scale systems while run-time communication performance data is taken into consideration. In one embodiment, the method of the present disclosure may use heuristics with better initial mapping and explore different mappings in parallel in different supernodes. In addition to the folding heuristics, the method of the present disclosure in one embodiment may integrate the run-time measurement into mapping consideration. The methods based on folding heuristics require previous knowledge of the topologies for guest and host. The method of the present disclosure in one embodiment may be based on run-time measurements, which allows mapping to be done more dynamically.

The hierarchical mapping algorithm of the present disclosure in one embodiment may first group nearby tasks that are related, e.g., frequently communicate with each other based on MPI trace collected during run-time, into a "supernode". Similar measurement (e.g., bandwidth and latency) and grouping are done for the physical processors. Then in the global mapping, we apply mapping methods such as the Moore's space filling curve to map the supernodes onto the processor groups on the host machine. After the supernodes are mapped onto the host machine, we swap the tasks within a supernode to explore better mapping configurations (which can be done in parallel). Moore's space filling curve is described in Eliakim Hastings Moore. On certain crinkly curves. Transactions of the American Mathematical Society, 1(1):72-90, 1900.

In one embodiment, the local search method described in Jon Kleinberg and Eva Tardos. Algorithm Design. Addison Wesley Longman Publishing Co., Inc., Boston, Mass., USA, 2005, may be used as an optimization technique for swapping tasks within a supernode. The technique is effective in solving NP-hard problems. In one embodiment of the present disclosure, to make the method more efficient, tasks in a supernode are classified into two types: the boundary tasks and the interior tasks. The boundary tasks are selected to maintain the continuity; and the interior tasks are swapped with a greedy method to explore possible improvements.

When the number of tasks increases, finding the optimal mapping for those two kinds of problems becomes a challenge since tuning cannot be done by hand. Considering the scalability and the automation, the present disclosure proposes the hierarchical mapping algorithm. In one embodiment, the algorithm may include three parts: the task partition, the global mapping, and the local tuning. In the task partition, the algorithm evenly groups tasks that have strong relations into "supernodes". In the global mapping, those "supernodes" are mapped onto processor groups of the host machine. The mapping is fine tuned locally by optimization methods. Hierarchical mapping in the present disclosure refers to grouping of tasks into supernodes, mapping groups and then fine tuning within a group.

FIG. 1 is a flow diagram illustrating a method of the present disclosure in one embodiment. At 102, tasks are partitioned into groups of tasks, each group into a supernode. At 104, global mapping is performed to assign the partitioned tasks, or supernodes, to physical processors of a host machine or target machine. At 106, the mapping of tasks to processors within a supernode is fine tuned. Each of the steps is explained in further detail below.

Task Partition

The task partition may be done via the analysis of the communication pattern, represented by a matrix, e.g., matrix A where $a_{ij}$ represents the size of data transferred between MPI rank i and j. In MPI programming, rank refers to a unique identifier (ID) given to a task. The matrix may be collected during run-time using an MPI tracing tool such as the one described in H. Wen, S. Sbaraglia, S. Seelam, I. Chung, G. Cong, and D. Klepacki. A productivity centered tools framework for application performance tuning. In QEST '07: Proceedings of the Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007), pages 273-274, Washington, DC, USA, 2007. IEEE Computer Society. The task partition problem is transformed into the problem of finding the blocks of a sparse matrix, for example, as described in Richard Vuduc and Hyun-Jin Moon. Fast sparse matrix-vector multiplication by exploiting variable blocks. In Proceedings of the International Conference on High-Performance Computing and Communications, 2005. The task partition explores the structure of the nonzero elements. The matrix is partitioned into four submatrices, two in each dimension. If the ratio between nonzero and zero elements in a submatrix exceeds some threshold, the partition stops. Otherwise, the partition continues recursively until some preset block size is reached. This procedure is done automatically.

There are cases when the user with domain knowledge may know the problem properties. Then the task partition can be decided directly by the user. For instance, if there are block structures coming naturally from the problem, then the tasks should be partitioned according to the structure. Another partition example that comes naturally is when the MPI problem uses different MPI communicators for different tasks. This approach gives the user more freedom to choose proper blocks, since a block can be formed by the elements across the entire matrix, not just by the adjacent elements.

Global Mapping

The global mapping works on supernodes in one embodiment of the present disclosure. The dimension of the supernode is used as a unit to measure the dimension of the host machine. For instance, suppose the number of tasks in a supernode is 16, and the topology of the host machine is an 8×8×8 cube. If the dimension of a supernode is decided to be 2×2×4, then the problem becomes mapping 32 supernodes onto a 4×4×2 cube.

For the mapping of a ring or a chain of supernodes to the reduced host machine, the Moore's space filling curve may be used. The space filling curves can be constructed recursively, which means they have a hierarchical structure, as shown in FIG. 2. Also, many applications use periodic boundary conditions, which makes the communication pattern a ring or a torus. Moore's space filling curve can map a ring to a square or to a cube, which is more versatile than other kinds of space filling curves. We extend its idea to allow the host space to be rectangular.

High Dimensional Mapping

In this section, we demonstrate how the high dimensional mapping problem can be solved by using the hierarchical mapping algorithm. For the simplicity of illustration, we only use the problem of mapping a two dimensional mesh (or a torus) into a cube as examples. However, this idea can be extended to solve higher dimensional problems.

In the task partition step, the tasks in one side of a mesh (or a torus) are partitioned into a supernode. In the global mapping step, the chain (or the ring) of supernodes is then stuffed into the host machine. The idea is just like rolling the mesh into a tube (or a torus), in which a supernode is formed by the tasks along a circumference, and then stuffing the tube into a box.

FIGS. 3A and 3B show two examples of mapping 64 tasks into a 4×4×4 cube. Each point of intersection of lines (corner) represents a processor. The first example, shown in FIG. 3A, is for an 8×8 torus. When it is rolled into a tube, each supernode is of size 8. If the dimension of a supernode is set to 4×2×1, the problem becomes putting an 8 node long ring onto a 2×4 plane, which can be done straightforwardly. The conceptual mapped torus is shown in FIG. 3A. FIG. 3B shows the example of mapping a 4×16 torus into a 4×4×4 cube. If the mesh is rolled from the short side, it becomes a tube of circumference 4 and 16 supernodes long. Since each supernode is of dimension 2×2×1, the global mapping problem becomes stuffing a 16 node long ring into a 2×2×4 cube. Using the space filling curve for three dimensional space, one can obtain a mapping like the one shown in FIG. 3B.

This mapping approach may encounter problems when the tube is turned around the corner. Similar problems have been studied in Hao Yu, I-Hsin Chung, and Jose Moreira. Topology mapping for blue gene/l supercomputer. In SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, page 116, New York, N.Y., USA, 2006, ACM, in which the nodes around the corners are twisted to minimize the dilation distances. The same technique may be used for that. However, the mapping is further evaluated and improved by optimization methods.

The Local Tuning

The local tuning step of the hierarchical mapping fine tunes the mapping by local swapping. The framework of the local tuning is sketched as follows.

1) Given an initial mapping $\phi$, compute the evaluation function $C(\phi)$.
2) For k = 1, 2, . . . until $C(\phi)$ converges
   a) Propose a new $\phi'$.
   b) Evaluate $C(\phi')$.

In the framework, three things can be varied. The first is the definition of the evaluation function; the second is the method of proposing a new $\phi'$; and the third is the decision of the parameter $p_k$. Many optimization methods are conformable to this framework, such as the local search algorithm and the simulated annealing method. The idea is to find a better mapping from the existing one. In the present disclosure in one embodiment, we use the simple local search algorithm, which fixes $p_k = 1$, and proposes a new $\phi'$ by swapping tasks with their neighbors. The evaluation function $C(\phi)$ used is defined as follows.

When the message size of communication is taken into consideration, the dilation distance may not be the best metric for a mapping. Here we propose a new metric, called communication cost, to measure the quality of mappings. The communication cost is composed of two factors: the traffic pattern of tasks and the processor distance.

The traffic pattern of tasks is modeled by a traffic matrix, e.g., matrix T, whose element $T_{i,j}$ represents the message size sent from task i to task j. The content of matrix T can be obtained from the analysis of programs, in which function calls for communications, such as MPI_SEND or MPI_REDUCE, provide hints of the traffic pattern and message size. A more expensive, but more robust way to obtain T is from measurement of the sample execution of programs.

The processor distance is also represented by a matrix, e.g., matrix D. Element $D(i, j)$ is the cost, which may mean time taken, of sending a unit message from processor i to processor j. A simple model to formulate the matrix D is by the number of links on the shortest path between two processors, which is also called the hopping distance. For more accurate measurement, matrix D can be evaluated via experiments.

With the traffic matrix T and the distance matrix D, the communication cost of a mapping is defined as

$$C(\phi) = \sum_{i,j} T_{i,j}\, D(\phi(i), \phi(j))$$

which is the summation of the communication time over all pairs of tasks mapped to the host machine.

FIG. 4 illustrates a schematic of an example computer or processing system that may implement the task mapping system in one embodiment of the present disclosure. The computer system is only one example of a suitable processing system and is not intended to suggest any limitation as to the scope of use or functionality of embodiments of the methodology described herein. The processing system shown may be operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the processing system shown in FIG. 4 may include, but are not limited to, personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems or devices, and the like.

The computer system may be described in the general context of computer system executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, and so on that perform particular tasks or implement particular abstract data types. The computer system may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

The components of computer system may include, but are not limited to, one or more processors or processing units 12, a system memory 16, and a bus 14 that couples various system components including system memory 16 to processor 12. The processor 12 may include a task mapping module 10 that performs the methods described herein. The module 10 may be programmed into the integrated circuits of the processor 12, or loaded from memory 16, storage device 18, or network 24 or combinations thereof.

Bus 14 may represent one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnects (PCI) bus.

Computer system may include a variety of computer system readable media. Such media may be any available media that is accessible by computer system, and it may include both volatile and non-volatile media, removable and non-removable media.

System memory 16 can include computer system readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory or others. Computer system may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 18 can be provided for reading from and writing to a non-removable, non-volatile magnetic media (e.g., a "hard drive"). Although not shown, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable, non-volatile optical disk such as a CD-ROM, DVD-ROM or other optical media can be provided. In such instances, each can be connected to bus 14 by one or more data media interfaces.

Computer system may also communicate with one or more external devices 26 such as a keyboard, a pointing device, a display 28, etc.; one or more devices that enable a user to interact with computer system; and/or any devices (e.g., network card, modem, etc.) that enable computer system to communicate with one or more other computing devices. Such communication can occur via Input/Output (I/O) interfaces 20.

Still yet, computer system can communicate with one or more networks 24 such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet) via network adapter 22. As depicted, network adapter 22 communicates with the other components of computer system via bus 14. It should be understood that although not shown, other hardware and/or software components could be used in conjunction with computer system. Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems, etc.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system." Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages, a scripting language such as Perl, VBS or similar languages, and/or functional languages such as Lisp and ML and logic-oriented languages such as Prolog. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable appara
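By way of illustration only, the recursive task partition described under Task Partition (split the communication matrix into four submatrices, stop when the ratio of nonzero to zero elements exceeds a threshold or a preset block size is reached) might be sketched in Python as follows. This is a minimal sketch and not the disclosed implementation: the function name `partition_blocks`, the parameters `threshold` and `min_block`, and the choice to discard blocks with no nonzero elements are all assumptions made for this example.

```python
def partition_blocks(matrix, threshold=1.0, min_block=2):
    """Return dense blocks as (row_start, row_end, col_start, col_end), end-exclusive.

    matrix is a list of lists in which matrix[i][j] is the size of data
    transferred between rank i and rank j (hypothetical input layout).
    """
    blocks = []

    def count_nonzero(r0, r1, c0, c1):
        return sum(1 for i in range(r0, r1) for j in range(c0, c1) if matrix[i][j])

    def recurse(r0, r1, c0, c1):
        size = (r1 - r0) * (c1 - c0)
        nonzero = count_nonzero(r0, r1, c0, c1)
        if nonzero == 0:
            return  # assumption: blocks without any communication are dropped
        zero = size - nonzero
        # "nonzero / zero > threshold", written without a division by zero
        dense = nonzero > threshold * zero
        smallest = (r1 - r0) <= min_block or (c1 - c0) <= min_block
        if dense or smallest:
            blocks.append((r0, r1, c0, c1))
            return
        # Otherwise split into four submatrices, two in each dimension.
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        for rs, re in ((r0, rm), (rm, r1)):
            for cs, ce in ((c0, cm), (cm, c1)):
                recurse(rs, re, cs, ce)

    recurse(0, len(matrix), 0, len(matrix[0]))
    return blocks


# Two groups of four ranks that only communicate within their own group.
A = [[1 if (i < 4) == (j < 4) else 0 for j in range(8)] for i in range(8)]
blocks = partition_blocks(A)
```

For the 8×8 example matrix, the whole matrix is half zeros and is therefore split once; the two diagonal 4×4 quadrants are fully nonzero and are kept as blocks, each of which would correspond to one candidate supernode.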
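By way of illustration only, the communication cost and the local tuning step described under The Local Tuning might be sketched in Python as follows. The sketch assumes the cost is the sum, over all task pairs, of the message size T[i][j] multiplied by the hopping distance between the processors the two tasks are mapped to, and it assumes an acceptance rule in which a swap is kept only if it strictly lowers the cost. The function names, the coordinate-tuple representation of processors, and the exhaustive pairwise swap (rather than swapping only with neighbors, or separating boundary and interior tasks) are simplifications chosen for this example, not details taken from the disclosure.

```python
def hop_distance(a, b, dims, torus=False):
    """Number of links on a shortest path between processor coordinates a and b."""
    total = 0
    for x, y, n in zip(a, b, dims):
        d = abs(x - y)
        if torus:
            d = min(d, n - d)  # wrap-around links shorten the path
        total += d
    return total


def communication_cost(traffic, mapping, dims, torus=False):
    """Sum over task pairs (i, j) of traffic[i][j] * distance(mapping[i], mapping[j])."""
    n = len(traffic)
    return sum(
        traffic[i][j] * hop_distance(mapping[i], mapping[j], dims, torus)
        for i in range(n)
        for j in range(n)
        if traffic[i][j]
    )


def local_search(traffic, mapping, dims, torus=False):
    """Greedy local search: swap two tasks, keep the swap only if the cost drops."""
    mapping = list(mapping)
    best = communication_cost(traffic, mapping, dims, torus)
    improved = True
    while improved:
        improved = False
        for i in range(len(mapping)):
            for j in range(i + 1, len(mapping)):
                mapping[i], mapping[j] = mapping[j], mapping[i]
                cost = communication_cost(traffic, mapping, dims, torus)
                if cost < best:
                    best = cost
                    improved = True
                else:
                    mapping[i], mapping[j] = mapping[j], mapping[i]  # undo the swap
    return mapping, best


# Four tasks communicating in a ring, placed badly on a 2x2 mesh.
T = [[0] * 4 for _ in range(4)]
for i in range(4):
    j = (i + 1) % 4
    T[i][j] = T[j][i] = 1
initial = [(0, 0), (1, 1), (0, 1), (1, 0)]
tuned, cost = local_search(T, initial, dims=(2, 2))
```

In this small example the initial placement has cost 12, and the search reaches cost 8, which is the minimum for a four-task ring on a 2×2 mesh because every communicating pair is then one hop apart. Because each supernode is tuned independently, such a routine could in principle be run on different supernodes in parallel.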