Languages and Compilers for Parallel Computing: 22nd International Workshop, LCPC 2009, Newark, DE, USA, October 8-10, 2009, Revised Selected Papers

435 pages · 2010


Lecture Notes in Computer Science 5898
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max-Planck Institute of Computer Science, Saarbruecken, Germany

Guang R. Gao, Lori L. Pollock, John Cavazos, Xiaoming Li (Eds.)

Languages and Compilers for Parallel Computing
22nd International Workshop, LCPC 2009
Newark, DE, USA, October 8-10, 2009
Revised Selected Papers

Volume Editors

Guang R. Gao
University of Delaware, Department of Electrical and Computer Engineering
Newark, DE 19716, USA
E-mail: [email protected]

Lori L. Pollock
University of Delaware, Department of Computer and Information Sciences
Newark, DE 19716, USA
E-mail: [email protected]

John Cavazos
University of Delaware, Department of Computer and Information Sciences
Newark, DE 19716, USA
E-mail: [email protected]

Xiaoming Li
University of Delaware, Department of Electrical and Computer Engineering
Newark, DE 19716, USA
E-mail: [email protected]

Library of Congress Control Number: 2010927404
CR Subject Classification (1998): D.1.3, C.2.4, D.4.2, H.3.4, D.2
LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues
ISSN 0302-9743
ISBN-10 3-642-13373-8 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-13373-2 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180

Preface

It is our pleasure to present the papers accepted for the 22nd International Workshop on Languages and Compilers for Parallel Computing, held during October 8–10, 2009 in Newark, Delaware, USA. Since 1986, LCPC has become a valuable venue for researchers to report on work in the general area of parallel computing, high-performance computer architecture and compilers. LCPC 2009 continued this tradition and in particular extended the area of interest to new parallel computing accelerators such as the IBM Cell Processor and Graphics Processing Units (GPUs).

This year we received 52 submissions from 15 countries. Each submission received at least three reviews and most had four. The PC also sought additional external reviews for contentious papers. The PC held an all-day phone conference on August 24 to discuss the papers. PC members who had a conflict of interest were asked to leave the call temporarily when the corresponding papers were discussed. From the 52 submissions, the PC selected 25 full papers and 5 short papers to be included in the workshop proceedings, representing a 58% acceptance rate.

We were fortunate to have three keynote speeches, a panel discussion and a tutorial in this year’s workshop.
First, Thomas Sterling, Professor of Computer Science at Louisiana State University, gave a keynote talk titled “HPC in Phase Change: Towards a New Parallel Execution Model.” Sterling argued that a new multi-dimensional research thrust was required to realize the design goals with regard to power, complexity, clock rate and reliability in the new parallel computer systems. ParalleX, an exploratory execution model developed by Sterling’s group, was introduced to guide the co-design of new architectures, programming methods and system software.

The second keynote talk, “The Polytope Model, Past, Present, Future,” presented by Paul Feautrier from École Normale Supérieure de Lyon, France, reviewed the history of the polytope model from the perspectives of its motivations, its applications in program transformations, and its limitations. Feautrier also shared with the audience his visions of the future of the polytope model and highlighted several important unsolved problems.

Third, Bill Carlson from the IDA Center for Computing Sciences gave a keynote talk on the parallel programming paradigm UPC. As one of the originators of UPC, Carlson illustrated the intentions of UPC, its applications, and its role in the new generation of high-performance computer architectures. This talk provided LCPC attendees with an insightful perspective on the history, current status and future of UPC.

A special panel was held on Thursday evening to stimulate discussion among the LCPC attendees on the meaning of compiler optimizations in the new world of many-core-based computer systems. This panel was organized and moderated by Xiaoming Li from the University of Delaware, and five leading researchers from both academia and industry shared their perspectives on the major challenges of compiler optimization in view of the rapid evolution of computer architecture and system software. The panel included Albert Cohen (INRIA, France), Hironori Kasahara (Waseda University, Japan), Rishi Khan (ETI), David Padua (University of Illinois at Urbana-Champaign), and Nicolas Vasilache (Reservoir Labs, Inc.).
We were also fortunate to be able to invite a distinguished group of researchers to give a tutorial on “SSA-Based Register Allocation” on the last day of the workshop. The tutorial introduced the SSA-based register allocation technique, its properties and complexities, enabling analysis techniques, and its applications in compilers. The presenters were Philip Brisk (EPFL), Jens Palsberg (UCLA), Fabrice Rastello (ENS Lyon), Sebastian Hack (Saarland University, Germany), and Florent Bouchez (Indian Institute of Science, India).

We would like to thank the many people whose valuable time and effort made LCPC 2009 a success. We first want to thank all authors who contributed papers to the workshop. Furthermore, the success of LCPC is unimaginable without the passionate commitment of David Padua, the Steering Committee, as well as the great effort of the Program Committee members and external reviewers. We also want to express our gratitude to ET International, HP, IBM, NVIDIA and Reservoir Labs, whose financial support made the workshop a pleasant experience. Finally, the quality organization of the workshop owed much to a group of outstanding volunteers led by Xu Wang.

October 2009
Lori Pollock
Guang R. Gao

Organization

LCPC 2009 was organized by the Department of Computer and Information Science and the Department of Electrical and Computer Engineering at the University of Delaware.

Steering Committee
Rudolf Eigenmann, Purdue University
Alex Nicolau, University of California at Irvine
David Padua, University of Illinois at Urbana-Champaign
Lawrence Rauchwerger, Texas A&M University

Program Committee
Jose Nelson Amaral, University of Alberta, Canada
Saman Amarasinghe, MIT, USA
Eduard Ayguadé, UPC, Spain
Hans J. Boehm, HP, USA
Calin Cascaval, IBM, USA
John Cavazos, University of Delaware, USA
Dan Connors, University of Colorado, USA
Keith Cooper, Rice University, USA
Maria Garzaran, University of Illinois, USA
Mary Hall, University of Utah, USA
William Jalby, University of Versailles, France
Hironori Kasahara, Waseda University, Japan
Jenq Kuen Lee, National Tsing Hua University, Taiwan
Xiaoming Li, University of Delaware, USA
John Mellor-Crummey, Rice University, USA
Michael O’Boyle, University of Edinburgh, UK
Paul Petersen, Intel, USA
Keshav Pingali, University of Texas, USA
Vivek Sarkar, Rice University, USA
Vugranam Sreedhar, IBM, USA

Sponsoring Institutions
ET International, Inc.
Hewlett-Packard Corp.
IBM Corp.
NVIDIA Corp.
Reservoir Labs, Inc.

Table of Contents

A Communication Framework for Fault-Tolerant Parallel Execution
Nagarajan Kanna, Jaspal Subhlok, Edgar Gabriel, Eshwar Rohit, and David Anderson

The STAPL pList
Gabriel Tanase, Xiabing Xu, Antal Buss, Harshvardhan, Ioannis Papadopoulos, Olga Pearce, Timmie Smith, Nathan Thomas, Mauro Bianco, Nancy M. Amato, and Lawrence Rauchwerger

Hardware Support for OpenMP Collective Operations
Soohong P. Kim, Samuel P. Midkiff, and Henry G. Dietz

Loop Transformation Recipes for Code Generation and Auto-Tuning
Mary Hall, Jacqueline Chame, Chun Chen, Jaewook Shin, Gabe Rudy, and Malik Murtaza Khan

MIMD Interpretation on a GPU
Henry G. Dietz and B. Dalton Young

TL-DAE: Thread-Level Decoupled Access/Execution for OpenMP on the Cyclops-64 Many-Core Processor
Ge Gan and Joseph Manzano

Mapping Streaming Languages to General Purpose Processors through Vectorization
Raymond Manley and David Gregg

A Balanced Approach to Application Performance Tuning
Souad Koliai, Stéphane Zuckerman, Emmanuel Oseret, Mickaël Ivascot, Tipp Moseley, Dinh Quang, and William Jalby

Automatically Tuning Parallel and Parallelized Programs
Chirag Dave and Rudolf Eigenmann

DFT Performance Prediction in FFTW
Liang Gu and Xiaoming Li

Safe and Familiar Multi-core Programming by Means of a Hybrid Functional and Imperative Language
Ronald Veldema and Michael Philippsen

Hierarchical Place Trees: A Portable Abstraction for Task Parallelism and Data Movement
Yonghong Yan, Jisheng Zhao, Yi Guo, and Vivek Sarkar

OSCAR API for Real-Time Low-Power Multicores and Its Performance on Multicores and SMP Servers
Keiji Kimura, Masayoshi Mase, Hiroki Mikami, Takamichi Miyamoto, Jun Shirako, and Hironori Kasahara

Programming with Intervals
Nicholas D. Matsakis and Thomas R. Gross

Adaptive and Speculative Memory Consistency Support for Multi-core Architectures with On-Chip Local Memories
Nikola Vujic, Lluc Alvarez, Marc Gonzalez Tallada, Xavier Martorell, and Eduard Ayguadé

Synchronization-Free Automatic Parallelization: Beyond Affine Iteration-Space Slicing
Anna Beletska, Wlodzimierz Bielecki, Albert Cohen, and Marek Palkowski

Automatic Data Distribution for Improving Data Locality on the Cell BE Architecture
Miao Wang, François Bodin, and Sébastien Matz

Automatic Restructuring of Linked Data Structures
Harmen L.A. van der Spek, C.W. Mattias Holm, and Harry A.G. Wijshoff

Using the Meeting Graph Framework to Minimise Kernel Loop Unrolling for Scheduled Loops
Mounira Bachir, David Gregg, and Sid-Ahmed-Ali Touati

Efficient Tiled Loop Generation: D-Tiling
DaeGon Kim and Sanjay Rajopadhye

Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization
Chunhua Liao, Daniel J. Quinlan, Richard Vuduc, and Thomas Panas

Speculative Optimizations for Parallel Programs on Multicores
Vijay Nagarajan and Rajiv Gupta

Fastpath Speculative Parallelization
Michael F. Spear, Kirk Kelsey, Tongxin Bai, Luke Dalessandro, Michael L. Scott, Chen Ding, and Peng Wu

PSnAP: Accurate Synthetic Address Streams through Memory Profiles
Catherine Mills Olschanowsky, Mustafa M. Tikir, Laura Carrington, and Allan Snavely

Enforcing Textual Alignment of Collectives Using Dynamic Checks
Amir Kamil and Katherine Yelick

A Code Generation Approach for Auto-Vectorization in the Spade Compiler
Huayong Wang, Henrique Andrade, Buğra Gedik, and Kun-Lung Wu

Portable Just-in-Time Specialization of Dynamically Typed Scripting Languages
Kevin Williams, Jason McCandless, and David Gregg

Reducing Training Time in a One-Shot Machine Learning-Based Compiler
John Thomson, Michael O’Boyle, Grigori Fursin, and Björn Franke

Optimizing Local Memory Allocation and Assignment through a Decoupled Approach
Boubacar Diouf, Ozcan Ozturk, and Albert Cohen

Unrolling Loops Containing Task Parallelism
Roger Ferrer, Alejandro Duran, Xavier Martorell, and Eduard Ayguadé

Author Index
