Practical Parallel Rendering PDF

370 Pages·2002·4.376 MB·English

Preview Practical Parallel Rendering

Practical Parallel Rendering

Edited by
Alan Chalmers
Timothy Davis
Erik Reinhard

A K Peters
Natick, Massachusetts

Editorial, Sales, and Customer Service Office
A K Peters, Ltd.
63 South Avenue
Natick, MA 01760
www.akpeters.com

Copyright © 2002 by A K Peters, Ltd.

All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

Library of Congress Cataloging-in-Publication Data

Practical parallel rendering / edited by Alan Chalmers, Timothy Davis, Erik Reinhard.
p. cm.
Includes bibliographical references and index.
ISBN 1-56881-179-9
1. Parallel processing (Electronic computers) 2. Electronic data processing-Distributed processing. I. Chalmers, Alan II. Davis, Timothy, 1965- III. Reinhard, Erik, 1968- IV. Title.
QA76.58 .C44 2002
004'.35-dc21
2002070381

Printed in Canada
06 05 04 03 02    10 9 8 7 6 5 4 3 2 1

Contents

Preface

I Parallel Rendering

1 Introduction to Parallel Processing
  1.1 Concepts
    1.1.1 Dependencies
    1.1.2 Scalability
    1.1.3 Control
  1.2 Classification of Parallel Systems
    1.2.1 Parallel versus Distributed Systems
  1.3 The Relationship of Tasks and Data
    1.3.1 Inherent Difficulties
    1.3.2 Tasks
    1.3.3 Data
  1.4 Evaluating Parallel Implementations
    1.4.1 Realization Penalties
    1.4.2 Performance Metrics
    1.4.3 Efficiency

2 Task Scheduling and Data Management
  2.1 Problem Decomposition
    2.1.1 Algorithmic Decomposition
    2.1.2 Domain Decomposition
    2.1.3 Abstract Definition of a Task
    2.1.4 System Architecture
  2.2 Computational Models
    2.2.1 Data Driven Model
    2.2.2 Demand Driven Model
    2.2.3 Hybrid Computational Model
  2.3 Task Management
    2.3.1 Task Definition and Granularity
    2.3.2 Task Distribution and Control
    2.3.3 Algorithmic Dependencies
  2.4 Task Scheduling Strategies
    2.4.1 Data Driven Task Management Strategies
    2.4.2 Demand Driven Task Management Strategies
    2.4.3 Task Manager Process
    2.4.4 Distributed Task Management
    2.4.5 Preferred Bias Task Allocation
  2.5 Data Management
    2.5.1 World Model of the Data: No Data Management Required
    2.5.2 Virtual Shared Memory
    2.5.3 The Data Manager
    2.5.4 Consistency
    2.5.5 Minimizing the Impact of Remote Data Requests
    2.5.6 Data Management for Multistage Problems

3 Parallel Global Illumination Algorithms
  3.1 Rendering
  3.2 Parallel Processing
  3.3 Ray Tracing
  3.4 Spatial Subdivisions
    3.4.1 Parallel Ray Tracing
    3.4.2 Demand Driven Ray Tracing
    3.4.3 Data Parallel Ray Tracing
    3.4.4 Hybrid Scheduling
  3.5 Radiosity
    3.5.1 Form Factors
    3.5.2 Parallel Radiosity
  3.6 Full Matrix Radiosity
    3.6.1 Setting Up the Matrix of Form Factors
    3.6.2 Solving the Matrix of Form Factors
    3.6.3 Group Iterative Methods
  3.7 Progressive Refinement
    3.7.1 Parallel Shooting
  3.8 Hierarchical Radiosity
    3.8.1 Parallel Hierarchical Radiosity
  3.9 Particle Tracing
    3.9.1 Parallel Particle Tracing
    3.9.2 Density Estimation
  3.10 Data Distribution and Data Locality
    3.10.1 Data Distribution
    3.10.2 Visibility Preprocessing
    3.10.3 Environment Mapping
    3.10.4 Geometric Simplification
    3.10.5 Directional Caching
    3.10.6 Reordering Computations
  3.11 Discussion

4 Overview of Parallel Graphics Hardware
  4.1 Pipelining
  4.2 Parallelism in Graphics Cards
    4.2.1 3DLABS Products
    4.2.2 Hewlett-Packard Products
    4.2.3 SGI Products (Silicon Graphics, Inc.)
    4.2.4 UNC Products
    4.2.5 Pomegranate Graphics Chip
  4.3 Conclusion

5 Coherence in Ray Tracing
  5.1 Scene Analysis
    5.1.1 Distribution of Data Accesses
    5.1.2 Temporal Characteristics
    5.1.3 Temporal Behaviour per Ray Type
    5.1.4 Conclusions
  5.2 Animation Analysis
    5.2.1 Background
    5.2.2 Related Work
    5.2.3 Frame Coherence Algorithm
    5.2.4 Parallel Frame Coherence Algorithm
    5.2.5 Results
    5.2.6 Summary

II Case Studies

6 Interactive Ray Tracing on a Supercomputer
  6.1 System Architecture
    6.1.1 Conventional Operation
    6.1.2 Frameless Rendering
    6.1.3 Performance
  6.2 Ray Tracing for Volume Visualization
    6.2.1 Background
    6.2.2 Traversal Optimizations
    6.2.3 Algorithms
    6.2.4 Results
    6.2.5 Discussion
  6.3 Ray Tracing for Terrain Visualization
  6.4 Conclusions

7 Interactive Ray Tracing on PCs
  7.1 Introduction
    7.1.1 Previous Work
  7.2 An Optimized Ray Tracing Implementation
    7.2.1 Code Complexity
    7.2.2 Caching
    7.2.3 Coherence through Packets of Rays
    7.2.4 Parallelism through SIMD Extensions
  7.3 Ray Triangle Intersection Computation
    7.3.1 Optimized Barycentric Coordinate Test
    7.3.2 Evaluating Instruction Level Parallelism
    7.3.3 SIMD Barycentric Coordinate Test
  7.4 BSP Traversal
    7.4.1 Traversal Algorithm
    7.4.2 Memory Layout for Better Caching
    7.4.3 Traversal Overhead
  7.5 SIMD Phong Shading
  7.6 Performance of the Ray Tracing Engine
    7.6.1 Comparison to Other Ray Tracers
    7.6.2 Reflection and Shadow Rays
    7.6.3 Comparison with Rasterization Hardware
  7.7 Interactive Ray Tracing on PC Clusters
    7.7.1 Overview
  7.8 Distributed Data Management
    7.8.1 Explicit Data Management
    7.8.2 Preprocessing
  7.9 Load Balancing
  7.10 Implementation
  7.11 Results
  7.12 Conclusions

8 The "Kilauea" Massively Parallel Ray Tracer
  8.1 What Is the Kilauea Project?
  8.2 Basic Idea
  8.3 System Design
    8.3.1 Hardware Environment
    8.3.2 Pthreads
    8.3.3 Message Passing
    8.3.4 Front-End Process
    8.3.5 Launching Kilauea
    8.3.6 Single Executable Binary
    8.3.7 Multiframe Rendering
    8.3.8 Global Illumination Renderer
  8.4 The ShotData File Format
  8.5 Parallel Ray Tracing
  8.6 Implementation
    8.6.1 Low-Level Data Structure
    8.6.2 MPI (Message Passing Interface) Layer
    8.6.3 Tcl Command Interface
    8.6.4 Rank and Task
    8.6.5 Details of Ray Tracing
    8.6.6 Shading Computation
    8.6.7 Photon Map Method
    8.6.8 Things to Note in Shading Computation
    8.6.9 Development in General
  8.7 Rendering Results
    8.7.1 Sample 1: Quatro
    8.7.2 Sample 2: Jeep
    8.7.3 Sample 3: Jeep 8
    8.7.4 Consideration of Rendering Results
  8.8 Conclusion
  8.9 Future Plans and Tasks

9 Parallel Ray Tracing on a Chip
  9.1 The Smart Memories Chip
  9.2 The SHARP Ray Tracer
  9.3 Simulation Results
    9.3.1 Caching
    9.3.2 Estimated Performance
  9.4 Conclusions

Bibliography
Index
Author Biographies

Preface

The ever increasing computational demands associated with rendering have meant that parallel rendering is almost as old as rendering itself. However, the full potential that parallel processing offers for producing wonderful images in reasonable times has always seemed to elude those striving for this "holy grail." While parallel processing on a low number of processors is relatively straightforward, the challenge comes when confronting an implementation on a large system. Here the overheads associated with the processors working together can rapidly dominate and lead to the frustration of a solution time longer than that achievable on a single processor. And yet it is precisely these larger systems which offer the computational performance we seek.

The aim of this book is to describe the problems associated with parallel rendering, provide a methodology as to how these problems can be minimized and demonstrate how, with care, it is indeed possible to achieve efficient parallel rendering.

The book is structured into two parts. The first part is intended to provide textbook material that introduces generic parallel processing issues in Chapters 1 and 2. With this background knowledge, the reader is then introduced to high-end graphics algorithms such as ray tracing, radiosity and particle tracing. The specifics of these algorithms and their implications for parallel processing are discussed in Chapter 3, while Chapter 4 deals with recent developments in hardware. For many of these algorithms it turns out that the proper exploitation of coherence is essential. For this reason, an in-depth analysis of coherence for single images as well as image sequences is presented in Chapter 5.

In the second part of this book, a number of case studies are presented. Each case study uses parallelism to speed up ray tracing, but in the process they all take very different approaches to solving the problems faced in par-
