
Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning ... PDF

172 Pages·2009·2.28 MB·English
by William Chun Yip Lo

Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning

by William Chun Yip Lo

A thesis submitted in conformity with the requirements for the degree of Master of Science
Graduate Department of Medical Biophysics
University of Toronto

Copyright © 2009 by William Chun Yip Lo

Abstract

Hardware Acceleration of a Monte Carlo Simulation for Photodynamic Therapy Treatment Planning
William Chun Yip Lo
Master of Science
Graduate Department of Medical Biophysics
University of Toronto
2009

Monte Carlo (MC) simulations are widely used in the field of medical biophysics, particularly for modelling light propagation in biological tissue. The iterative nature of MC simulations and their high computation time currently limit their use to solving the forward solution for a given source configuration and optical properties of the tissue. However, applications such as photodynamic therapy treatment planning or image reconstruction in diffuse optical tomography require solving the inverse problem given a desired light dose distribution or absorber distribution, respectively. A faster means for performing MC simulations would enable the use of MC-based models for such tasks. In this thesis, a gold standard MC code called MCML was accelerated using two distinct hardware-based approaches, namely designing custom hardware on field-programmable gate arrays and programming commodity graphics processing units (GPUs). Currently, the GPU-based approach is promising, offering approximately 1000-fold speedup with 4 GPUs compared to an Intel Xeon CPU.

Acknowledgements

This thesis is truly a journey. Along its path, I met a number of individuals who have assisted me and taught me a great deal, including different ways to approach problems. Through this interdisciplinary research project, I had the pleasure of working with and being instructed by experts from different research areas.

First, I am indebted to both of my supervisors, Prof. Lothar Lilge and Prof. Jonathan Rose, for providing me with an excellent learning environment as well as their guidance and mentorship throughout my project. Together, they have helped realize my goal of applying my engineering skills in the medical field. Furthermore, Dr. David Jaffray’s insights into clinical treatment planning have been instrumental in the development of my thesis.

Through this interdisciplinary research, I also had the chance to work with fellow graduate students in the Department of Electrical and Computer Engineering. Specifically, I would like to acknowledge Jason Luu and Keith Redmond’s assistance with the FPGA-based hardware design. In addition, Prof. Chow, my instructor for a number of computer hardware design courses, has offered me insightful advice during this initial phase of my thesis.

During the second phase of my thesis, David Han’s expertise in the NVIDIA GPU architecture and his assistance with optimizing the CUDA program have been invaluable. The friendship we have established is something that I value very much.

Finally, I would like to thank my parents for bringing me to Canada so that I can meet with all these great individuals. Without their sacrifice, none of this would have been possible today.

This thesis is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) through an NSERC postgraduate scholarship.

Contents

List of Tables
List of Figures
List of Abbreviations and Symbols

1 Introduction
  1.1 Progress in Photodynamic Therapy
  1.2 Clinical Dosimetry for PDT Treatment Planning
  1.3 Light Dosimetry Models
  1.4 MC-based Light Dosimetry
  1.5 Organization of Dissertation

2 The MCML Light Dose Computation Method
  2.1 The Monte Carlo Method
  2.2 The MCML Algorithm
    2.2.1 Photon Initialization
    2.2.2 Position Update
    2.2.3 Direction Update
    2.2.4 Fluence Update
    2.2.5 Photon Termination
  2.3 Summary

3 FPGA-based Acceleration of the MCML Code
  3.1 Field-Programmable Gate Arrays
  3.2 Related Work
  3.3 Hardware Design Method
    3.3.1 Overview
    3.3.2 Hardware Acceleration Techniques
  3.4 FPGA-based Hardware Implementation
    3.4.1 Modifications to the MCML code
    3.4.2 System Overview
    3.4.3 Overview of Hardware Design
    3.4.4 Pipeline Stages in the Fluence Update Core
    3.4.5 Design Challenges for the Direction Update Engine
    3.4.6 Importance of Managing Resource Usage
    3.4.7 Trade-offs
  3.5 Validation
    3.5.1 FPGA System-Level Validation Procedures
    3.5.2 Results
  3.6 Performance
    3.6.1 Multi-FPGA Platform: TM-4
    3.6.2 Modern FPGA Platform: DE3 Board
  3.7 Resource Utilization
  3.8 Power
  3.9 Summary

4 GPU-based Acceleration of the MCML Code
  4.1 Graphics Processing Units
  4.2 Related Work
  4.3 CUDA-based GPU Programming
    4.3.1 GPU Hardware Architecture
    4.3.2 Programming with CUDA
    4.3.3 CUDA-specific Acceleration Techniques
  4.4 GPU-accelerated MCML Code
    4.4.1 Parallelization Scheme
    4.4.2 Key Performance Bottleneck
    4.4.3 Solution to Performance Issue
    4.4.4 Other Key Optimizations
    4.4.5 Scaling to Multiple GPUs
  4.5 Performance
    4.5.1 GPU and CPU Platforms
    4.5.2 Speedup
    4.5.3 Effect of Optimizations
    4.5.4 Effect of Grid Geometry
  4.6 Validation
    4.6.1 Test Cases
    4.6.2 Error Distribution
    4.6.3 Light Dose Contours
  4.7 Summary

5 Conclusions
  5.1 Summary of Contributions
  5.2 Future Work
    5.2.1 Extension to 3-D and Support for Multiple Sources
    5.2.2 Sources of Uncertainties
    5.2.3 PDT Treatment Planning using FPGA or GPU Clusters

A Source Code for the Hardware
B Source Code for the CUDA program
Bibliography

List of Tables

3.1 Photon packet data in shared pipeline registers
3.2 Resource usage statistics and number of stages per module
3.3 Optical properties of the five-layer skin tissue
3.4 Specifications of the TM-4 and DE3 FPGA platforms
3.5 Specifications of two Intel-based server platforms
3.6 Runtime of software vs. hardware for 10^8 photon packets at λ=633 nm
3.7 Runtime of software vs. hardware for 10^8 photon packets at λ=337 nm
3.8 Performance comparison of Stratix, Stratix III, and Xeon processor
3.9 Resource utilization of the hardware on TM-4 and DE3
3.10 Power-delay product of Stratix III, Xeon CPU, and CPU cluster
4.1 Mapping MCML variables to GPU memories
4.2 Performance comparison between GPU-MCML and CPU-MCML
4.3 Effect of local memory usage on simulation time
4.4 Effect of optimizations on simulation time
4.5 Effect of grid geometry on simulation time
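
The contents above name five steps of the MCML algorithm: photon initialization, position update, direction update, fluence update, and photon termination. As a rough illustration of how such a photon-packet loop fits together, here is a minimal Python sketch. It is not the thesis code and not MCML itself. It assumes an infinite homogeneous medium, isotropic scattering, scoring along depth only, and Russian-roulette termination. The function name `simulate`, its parameters, the ordering of steps inside the loop, and the roulette constants are all hypothetical choices made for this example.

```python
import math
import random


def simulate(n_packets, mu_a, mu_s, dz=0.05, nz=100, seed=1):
    """Toy photon-packet random walk (illustrative only, not MCML).

    mu_a, mu_s: absorption and scattering coefficients (assumed 1/cm).
    dz, nz: depth-bin width (assumed cm) and number of bins.
    Returns (absorbed weight per depth bin, weight absorbed off-grid).
    """
    rng = random.Random(seed)
    mu_t = mu_a + mu_s             # total interaction coefficient
    absorbed = [0.0] * nz          # absorbed weight scored per depth bin
    outside = 0.0                  # absorbed weight that fell outside the grid
    threshold, chance = 1e-4, 0.1  # roulette constants (assumed values)

    for _ in range(n_packets):
        # Photon initialization: launch at z = 0 along +z with unit weight.
        z, uz, w = 0.0, 1.0, 1.0
        while True:
            # Position update: sample a free path from an exponential law.
            step = -math.log(1.0 - rng.random()) / mu_t
            z += uz * step

            # Fluence update: deposit the absorbed fraction of the weight.
            dw = w * mu_a / mu_t
            w -= dw
            iz = math.floor(z / dz)
            if 0 <= iz < nz:
                absorbed[iz] += dw
            else:
                outside += dw

            # Direction update: isotropic scattering, so the new z
            # direction cosine is uniform on [-1, 1].
            uz = 2.0 * rng.random() - 1.0

            # Photon termination: Russian roulette on low-weight packets.
            if w < threshold:
                if rng.random() < chance:
                    w /= chance    # survivor carries the boosted weight
                else:
                    break
    return absorbed, outside


if __name__ == "__main__":
    bins, off_grid = simulate(1000, mu_a=1.0, mu_s=9.0)
    print("mean absorbed weight per packet:", (sum(bins) + off_grid) / 1000)
```

Because the sketch has no boundaries, all launched weight is eventually absorbed, so the mean absorbed weight per packet should come out close to 1. That gives a simple sanity check on the bookkeeping. Each packet is independent of the others, which is the property that makes this kind of simulation a natural candidate for parallel hardware.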

Description:
... an excellent learning environment as well as their guidance and mentorship ... 2.1 Monte Carlo simulation of photon propagation in a skin model ... Compared to radiation therapy treatment planning, PDT treatment planning is still a ...
