LOOP PARALLELIZATION

A Book Series On
LOOP TRANSFORMATIONS FOR RESTRUCTURING COMPILERS
Utpal Banerjee

Series Titles:
  Loop Transformations for Restructuring Compilers: The Foundations
  Loop Parallelization

LOOP PARALLELIZATION
Utpal Banerjee
Intel Corporation

Loop Transformations for Restructuring Compilers

SPRINGER SCIENCE+BUSINESS MEDIA, LLC

ISBN 978-1-4419-5141-0
ISBN 978-1-4757-5676-0 (eBook)
DOI 10.1007/978-1-4757-5676-0

Library of Congress Cataloging-in-Publication Data
A C.I.P. Catalogue record for this book is available from the Library of Congress.

Copyright © 1994 by Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers in 1994

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photo-copying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.

Contents

Preface
Acknowledgments

1 Background
  1.1 Introduction
  1.2 Program Model
  1.3 Dependence
  1.4 Loop Transformation
  1.5 Loop Parallelization

2 Loop Permutations
  2.1 Introduction
  2.2 Basic Concepts
  2.3 Preventing Permutations
  2.4 Parallelization by Permutation
  2.5 Computation of Loop Limits
  2.6 Optimization Problems

3 Unimodular Transformations
  3.1 Introduction
  3.2 Basic Concepts
  3.3 Elementary Transformations
  3.4 Inner Loop Parallelization
  3.5 Outer Loop Parallelization
  3.6 Computation of Loop Limits

4 Remainder Transformations
  4.1 Introduction
  4.2 Single-Loop Transformation
  4.3 GCD Transformation
  4.4 Echelon Transformation

5 Program Partitioning
  5.1 Introduction
  5.2 Vertical Partitions
  5.3 Horizontal Partitions
  5.4 Vertical and Horizontal Parallelism
  5.5 Suggested Reading
Bibliography
Index

List of Figures

1.1  Index space of Example 1.1.
1.2  Index space of Example 1.2.
1.3  Dependence graph of Example 1.4 and the major weakly connected components.
1.4  Dependence graph of Example 1.5.
1.5  Iterations of individual loops (Example 1.6).
1.6  Execution order for iterations of $L$ (Example 1.6).
2.1  Index space of $L$ in Example 2.1.
2.2  Index space of $L_P$ in Example 2.1.
2.3  Index space of $(L_1, L_2)$ in Example 2.7.
2.4  Index space of $(L_1, L_2)$ in Example 2.8.
3.1  Dependence graph for Example 3.1.
3.2  A wave through the index space of Example 3.1.
3.3  Index space of $L_U$.
3.4  Index space with typical distance vectors.
3.5  Index space after outer loop reversal.
3.6  Index space after inner loop reversal.
3.7  Index space after an upper loop skewing.
3.8  Index space after a lower loop skewing.
4.1  Dependence graphs of the loop nests of Example 4.1.

List of Tables

1.1  Iterations of individual loops (Example 1.3).
2.1  Permutations and direction vectors in a triple loop.
2.2  Level change under a permutation in a triple loop.
2.3  Direction vectors, permutations, and change in dependence levels in a triple loop.
4.1  Values of $I$ and $(K, Y)$ in Example 4.1.

List of Notations

In the following, $\mathbf{i} = (i_1, i_2, \ldots, i_m)$ and $\mathbf{j} = (j_1, j_2, \ldots, j_m)$ are two vectors of size $m$, and $1 \le \ell \le m$.

$x \mapsto y$          A function that maps an element $x$ of its domain to an element $y$ of its range
$u^+$                  $\max(u, 0)$ (positive part of $u$)
$u^-$                  $\max(-u, 0)$ (negative part of $u$)
$a/b$                  The exact result of division of $a$ by $b$
$\mathrm{sig}(i)$      Sign of an integer $i$
$Z$                    Set of all integers
$Z^m$                  Set of all integer $m$-vectors
$R$                    Set of all real numbers
$R^m$                  Set of all real $m$-vectors
$\mathrm{sig}(\mathbf{i})$   Sign of a vector $\mathbf{i}$
$\mathbf{0}$           Zero vector (size implied by context)
$(\mathbf{i}; \mathbf{j})$   Vector formed by concatenating elements of $\mathbf{i}$ with elements of $\mathbf{j}$
$\mathbf{i} \le \mathbf{j}$  $i_r \le j_r$ for each $r$ in $1 \le r \le m$
$\mathbf{i} \prec_\ell \mathbf{j}$   $i_1 = j_1, i_2 = j_2, \ldots, i_{\ell-1} = j_{\ell-1}, i_\ell < j_\ell$
$\mathbf{i} \prec \mathbf{j}$        $\mathbf{i} \prec_\ell \mathbf{j}$ for some $\ell$
$\mathbf{i} \succ_\ell \mathbf{j}$   $\mathbf{j} \prec_\ell \mathbf{i}$
$\mathbf{i} \succ \mathbf{j}$        $\mathbf{j} \prec \mathbf{i}$
$\det(A)$              Determinant of a (square) matrix $A$
$A'$                   Transpose of a matrix $A$
                       Column $r$ of matrix $U$
                       Row $t$ of matrix $U$
                       The $m \times m$ identity matrix
                       An identity matrix (size implied by context)
$P$                    A typical permutation matrix
$U$                    A typical unimodular matrix
$L$                    A typical loop nest $(L_1, L_2, \ldots, L_m)$
$I$                    Index vector $(I_1, I_2, \ldots, I_m)$ of $L$
$H(I)$                 Body of $L$
$R$                    Index space of $L$
$p_0$                  Lower limit vector of $L$
$P$                    Lower limit matrix of $L$
$q_0$                  Upper limit vector of $L$
$Q$                    Upper limit matrix of $L$
$d$                    A typical distance vector of $L$
$N$                    Number of distance vectors of $L$
$D$                    Set of distance vectors of $L$
$\mathcal{D}$          Distance matrix of $L$
                       A typical direction vector of $L$
                       Direction matrix of $L$
$L$                    doall loop corresponding to a do loop $L$
$L'$                   A typical mixed loop nest
$L_P$                  Transformed program of $L$ defined by a permutation matrix $P$
$L_U$                  Transformed program of $L$ defined by a unimodular matrix $U$
$L_G$                  Transformed program of $L$ defined by the gcd transformation
$L_S$                  Transformed program of $L$ defined by the echelon transformation
$L_S$                  Mixed loop nest obtained from $L_S$