
Parallel computation techniques for virtual acoustics and physical modelling synthesis

191 Pages·2014·7.54 MB·English

Parallel computation techniques for virtual acoustics and physical modelling synthesis

Craig J. Webb

Doctor of Philosophy
Acoustics and Audio Group
University of Edinburgh
2014

Abstract

The numerical simulation of large-scale virtual acoustics and physical modelling synthesis is a computationally expensive process. Time stepping methods, such as finite difference time domain, can be used to simulate wave behaviour in models of three-dimensional room acoustics and virtual instruments. In the absence of any form of simplifying assumptions, and at high audio sample rates, this can lead to simulations that require many hours of computation on a standard Central Processing Unit (CPU). In recent years the video game industry has driven the development of Graphics Processing Units (GPUs) that are now capable of multi-teraflop performance using highly parallel architectures. Whilst these devices are primarily designed for graphics calculations, they can also be used for general purpose computing. This thesis explores the use of such hardware to accelerate simulations of three-dimensional acoustic wave propagation, and embedded systems that create physical models for the synthesis of sound.

Test case simulations of virtual acoustics are used to compare the performance of workstation CPUs to that of Nvidia’s Tesla GPU hardware. Using representative multi-core CPU benchmarks, such simulations can be accelerated in the order of 5X for single precision and 3X for double precision floating-point arithmetic. Optimisation strategies are examined for maximising GPU performance when using single devices, as well as for multiple device codes that can compute simulations using billions of grid points. This allows the simulation of room models of several thousand cubic metres at audio rates such as 44.1kHz, all within a useable time scale. The performance of alternative finite difference schemes is explored, as well as strategies for the efficient implementation of boundary conditions.
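As a rough illustration of the kind of time-stepping finite difference computation described above, the following is a minimal sketch in plain Python. It assumes the standard seven-point explicit scheme for the 3D wave equation with fixed (zero) edges, which is a common textbook formulation and not necessarily the exact scheme or boundary treatment used in the thesis. The function names `make_grid`, `fdtd_step` and `grid_points`, and the sound speed of 344 m/s, are hypothetical choices made for this example.

```python
import math


def make_grid(nx, ny, nz):
    """Return an nx-by-ny-by-nz grid of zeros as nested lists."""
    return [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]


def fdtd_step(u_prev, u_curr, lam2):
    """Advance one time step of the seven-point scheme (an assumed scheme).

    lam2 is the squared Courant number (c*T/h)**2; this scheme is stable
    for lam2 <= 1/3. Edge points stay at zero, a simple fixed boundary
    chosen here only for brevity.
    """
    nx, ny, nz = len(u_curr), len(u_curr[0]), len(u_curr[0][0])
    u_next = make_grid(nx, ny, nz)
    for x in range(1, nx - 1):
        for y in range(1, ny - 1):
            for z in range(1, nz - 1):
                # Sum of the six nearest neighbours at the current time step.
                neighbours = (
                    u_curr[x - 1][y][z] + u_curr[x + 1][y][z]
                    + u_curr[x][y - 1][z] + u_curr[x][y + 1][z]
                    + u_curr[x][y][z - 1] + u_curr[x][y][z + 1]
                )
                u_next[x][y][z] = (
                    (2.0 - 6.0 * lam2) * u_curr[x][y][z]
                    + lam2 * neighbours
                    - u_prev[x][y][z]
                )
    return u_next


def grid_points(volume_m3, sample_rate_hz, c=344.0):
    """Estimate the grid size for a room volume at the stability limit.

    Uses grid spacing h = sqrt(3) * c / fs; c = 344 m/s is an assumed value.
    """
    h = math.sqrt(3.0) * c / sample_rate_hz
    return volume_m3 / h ** 3


if __name__ == "__main__":
    # A unit impulse in the centre of a tiny 5x5x5 grid, stepped once.
    previous = make_grid(5, 5, 5)
    current = make_grid(5, 5, 5)
    current[2][2][2] = 1.0
    following = fdtd_step(previous, current, 1.0 / 3.0)
    print(following[1][2][2])
    print(grid_points(1000.0, 44100.0))
```

Under these assumptions, the estimate comes to roughly 4 x 10^8 grid points per 1000 cubic metres at 44.1 kHz, which is in line with the abstract's mention of billions of grid points for rooms of several thousand cubic metres. Every interior point is updated independently of the others within a time step, which is what makes the scheme a natural fit for highly parallel hardware.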
Creating physical models of acoustic instruments requires embedded systems that often rely on sparse linear algebra operations. The performance efficiency of various sparse matrix storage formats is detailed in terms of the fundamental operations that are required to compute complex models, with an optimised storage system achieving substantial performance gains over more generalised formats. An integrated instrument model of the timpani drum is used to demonstrate the performance gains that are possible using the optimisation strategies developed through this thesis.

Declaration

I declare that this thesis was composed by myself, that the work contained herein is my own except where explicitly stated otherwise in the text, and that this work has not been submitted for any other degree or professional qualification except as specified.

(Craig J. Webb)

Publications

Elements of this thesis have been published in the following papers. The contribution of the current author is detailed in each case.

1. C. Webb and S. Bilbao, “Computing Room Acoustics with CUDA - 3D FDTD schemes with boundary losses and viscosity”, In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 317-320.
The author produced the entire paper, with the exception of Section 2 which details the numerical design of the finite difference scheme.

2. C. Webb and S. Bilbao, “Virtual Room Acoustics: A Comparison of techniques for computing 3D-FDTD schemes using CUDA”, In Proceedings of the 130th Convention of the Audio Engineering Society, vol. 2, London, UK, 2011, pp. 1163-1169.
The author produced the entire paper, with the exception of the numerical design of the finite difference scheme as detailed in the previous paper.

3. C. Webb and S. Bilbao, “Binaural Simulations using audio rate FDTD schemes and CUDA”, In Proceedings of the 15th International Conference on Digital Audio Effects, York, UK, 2012, pp. 97-100.
The author is responsible for the entire paper, with the exception of the numerical design of the Engquist-Majda boundary condition.

4. S. Bilbao and C. Webb, “Timpani Drum Synthesis in 3D on GPGPUs”, In Proceedings of the 15th International Conference on Digital Audio Effects, York, UK, 2012, pp. 269-276.
In this paper the numerical design and initial prototype modelling is by Bilbao. The author is responsible for the GPU implementation and optimisation techniques, and the resulting performance data.

5. J. Sheaffer, C. Webb and B. Fazenda, “Modeling Binaural Receivers in Finite Difference Simulation of Room Acoustics”, In Proceedings of Meetings on Acoustics: International Congress on Acoustics, vol. 19, Montreal, Canada, 2013, p. 15098.
The theoretical and analysis sections of this paper are by Sheaffer. The author contributed the large-scale GPU testing data on which the results are based, having implemented the computational models.

6. C. Webb and A. Gray, “Large-scale virtual acoustics simulation at audio rates using three-dimensional finite difference time domain and multiple GPUs”, In Proceedings of Meetings on Acoustics: International Congress on Acoustics, vol. 19, Montreal, Canada, 2013, p. 70092.
The author produced the entire paper, with manuscript editing by Gray.

7. S. Bilbao, B. Hamilton, A. Torin, C. Webb, P. Graham, A. Gray, K. Kavoussanakis and J. Perry, “Large scale physical modeling sound synthesis”, In Proceedings of the Stockholm Music Acoustics Conference, Stockholm, Sweden, 2013, pp. 593-600.
For this review paper, the author contributed Section 5 on GPU implementation, and elements of Section 6.4 showing benchmark CPU and GPU performance analysis.

8. C. Webb, “Computing virtual acoustics using the 3D finite difference time domain method and Kepler architecture GPUs”, In Proceedings of the Stockholm Music Acoustics Conference, Stockholm, Sweden, 2013, pp. 648-653.
The author is responsible for the entire paper.

9. B. Hamilton and C. Webb, “Room acoustics modelling using GPU accelerated finite difference and finite volume methods on a face-centered cubic grid”, In Proceedings of the 16th International Conference on Digital Audio Effects, Maynooth, Ireland, 2013, pp. 336-343.
For this paper, the author produced GPU implementations of the various finite schemes that are tested, and the performance data used in Section 8.

10. S. Bilbao and C. Webb, “Physical modeling of timpani drums in 3D on GPGPUs”, Journal of the Audio Engineering Society, vol. 61, no. 10, pp. 737-748, 2013.
In this paper the numerical design and initial prototype modelling is by Bilbao. The author is responsible for the GPU implementation and optimisation techniques, the resulting performance data, and the abstracted multi-instrument model.

Acknowledgements

This research has been supported by University of Edinburgh scholarships awarded by the College of Humanities and Social Science and the Edinburgh College of Art, and by the European Research Council under Grant StG-2011-279068-NESS.

The NESS project (Next generation sound synthesis) began in the second year of my PhD studies, and I am very fortunate to have been a part of it from the outset. When I began my PhD as the only student working with Stefan I had no idea that I would very soon be part of a large, dedicated, research group.

I would like to thank the many colleagues that I have met along the way, both from the MSc course and the NESS project group, that really made my time at Edinburgh the experience that it was. In particular I have to thank Brian Hamilton, Alberto Torin, and Dr. Michele Ducceschi for their collaboration on various papers and elements of my work.

I would also like to thank my family and Eleni for their continued support throughout my four and a half years of MSc and then PhD study. Finally I wish to thank my excellent supervisors Dr. Stefan Bilbao and Dr. Alan Gray, without whom none of this would have been possible.

Table of Contents

I Introduction and background theory

1 Introduction
  1.1 Introductory remarks
  1.2 Thesis objectives and outline

2 Background theory and literature
  2.1 Virtual acoustic simulations
    2.1.1 Principles of acoustics and the wave equation
    2.1.2 Geometric methods
    2.1.3 Wave-based methods
    2.1.4 3D finite difference time domain method
    2.1.5 Hybrid and alternative methods
  2.2 Physical modelling synthesis
    2.2.1 Modal synthesis
    2.2.2 Digital waveguides
    2.2.3 Finite difference time domain method
  2.3 Parallel computing using graphics processing units
    2.3.1 Parallel computing
    2.3.2 Evolution of GPU computing
    2.3.3 GPU architectures
    2.3.4 CUDA device memory
    2.3.5 CUDA thread model
    2.3.6 Performance optimisation
    2.3.7 GPUs and 3D FDTD simulations
    2.3.8 GPUs for audio processing
  2.4 Sparse linear algebra
    2.4.1 Definitions
    2.4.2 Sparse matrices
    2.4.3 Solution to systems of linear equations
    2.4.4 Sparse linear algebra and GPUs
  2.5 Timing methods

II Accelerating virtual acoustics

3 Computing solutions to the 3D wave equation
  3.1 The basic 3D scheme
  3.2 Test simulation
  3.3 Linear decomposition of three-dimensional data
  3.4 CPU benchmarks
    3.4.1 Single thread code
    3.4.2 Multi-threaded code
    3.4.3 Performance evaluation
  3.5 GPU kernel design and optimisation
    3.5.1 Overview of the CUDA program design
    3.5.2 Mapping threads to the data set
    3.5.3 Use of shared memory
    3.5.4 Cache optimisation
    3.5.5 Performance evaluation
  3.6 The use of multiple GPUs with CUDA
    3.6.1 Data partitioning over multiple GPUs
    3.6.2 Non-asynchronous implementation
    3.6.3 Asynchronous implementation
    3.6.4 Performance evaluation
  3.7 Summary

4 Performance of alternative schemes
  4.1 Benchmark simulation
  4.2 Staggered grid formation
    4.2.1 Implementation
    4.2.2 Performance evaluation
  4.3 Interpolated wideband and face-centred cubic
    4.3.1 Description of schemes
    4.3.2 Implementation
    4.3.3 Performance evaluation
  4.4 Summary

5 Virtual acoustic simulations
  5.1 State-free boundary conditions
    5.1.1 Frequency-independent lossy boundary
    5.1.2 Implementation methods
    5.1.3 Performance evaluation
  5.2 Boundaries requiring state memory
    5.2.1 Engquist-Majda boundary condition
    5.2.2 Implementation methods
    5.2.3 Performance evaluation
  5.3 Attenuation of sound in air
    5.3.1 Description of scheme with viscosity
    5.3.2 Performance evaluation
  5.4 Large-scale models
    5.4.1 Maximum GPU memory usage
    5.4.2 Single precision issues
    5.4.3 Dispersion in the standard scheme
  5.5 Summary

III Integrated physical models of instruments

6 Basic linear algebra operations
  6.1 Sparse matrix storage formats
    6.1.1 CSR and CSC formats
    6.1.2 DIAgonal format
    6.1.3 ELLPACK format
  6.2 Vector addition
    6.2.1 CPU performance evaluation
    6.2.2 GPU performance evaluation
  6.3 The dot product
    6.3.1 Parallel algorithm design
    6.3.2 Performance evaluation
  6.4 Matrix by vector multiplication
    6.4.1 Performance comparison of matrix storage formats
    6.4.2 Single and multi-threaded CPU performance of DIA format
    6.4.3 GPU performance comparison of DIA format
  6.5 Matrix vs non-matrix forms
    6.5.1 Categories of schemes
    6.5.2 2D wave equation system
    6.5.3 CPU performance evaluation
    6.5.4 GPU performance evaluation
  6.6 Summary

7 An integrated model of the timpani drum
  7.1 Overview of the model
  7.2 Computational elements
    7.2.1 Simulation setup
    7.2.2 Time iteration matrices
    7.2.3 Time iteration operations
  7.3 Implementation designs
    7.3.1 Parallel implementation of the time iteration stages
    7.3.2 CSC format using CSparse
    7.3.3 CSR format and CUSPARSE
    7.3.4 DIA and ELLPACK format
    7.3.5 Unrolled matrix-free operations
  7.4 Performance analysis
    7.4.1 Test simulation
    7.4.2 Summary comparisons
    7.4.3 Analysis of a single iteration in time
  7.5 A multi-instrument model in a virtual room
    7.5.1 System abstraction
    7.5.2 Example simulation
    7.5.3 Further parallel implementations
  7.6 Summary
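
The outline lists the DIAgonal (DIA) sparse storage format (Section 6.1.2) and matrix by vector multiplication (Section 6.4). As a hedged illustration of what such a format involves, the sketch below stores each non-zero diagonal as one dense row together with its offset from the main diagonal, and multiplies the matrix by a vector. The function name `dia_matvec` and the row-indexed layout (`data[d][i]` holds the entry in row `i` of diagonal `d`) are assumptions made for this example. The thesis's own layout may differ, and some libraries index diagonals by column instead.

```python
def dia_matvec(offsets, data, x):
    """Compute y = A @ x for a square matrix A stored by diagonals.

    offsets[d] is the diagonal's offset (0 = main, +1 = first super-diagonal,
    -1 = first sub-diagonal). data[d][i] holds A[i][i + offsets[d]]; entries
    whose column index falls outside the matrix are padding and are ignored.
    This row-indexed layout is an assumption for illustration.
    """
    n = len(x)
    y = [0.0] * n
    for d, offset in enumerate(offsets):
        for i in range(n):
            j = i + offset
            if 0 <= j < n:
                y[i] += data[d][i] * x[j]
    return y


if __name__ == "__main__":
    # A 4x4 second-difference matrix: 1 on the off-diagonals, -2 on the main.
    offsets = [-1, 0, 1]
    data = [
        [1.0, 1.0, 1.0, 1.0],
        [-2.0, -2.0, -2.0, -2.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
    print(dia_matvec(offsets, data, [1.0, 2.0, 3.0, 4.0]))
```

Because no column indices are stored and each diagonal is read contiguously, a layout of this kind suits the banded matrices that finite difference schemes tend to produce. That is one plausible reason for the outline giving the DIA format its own performance sections.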
