High-Performance Scientific Computing
Algorithms and Applications

Editors

Michael W. Berry, Dept. Electrical Eng. & Computer Science, University of Tennessee, Knoxville, TN, USA
Kyle A. Gallivan, Department of Mathematics, Florida State University, Tallahassee, FL, USA
Efstratios Gallopoulos, Dept. Computer Engineering & Informatics, University of Patras, Patras, Greece
Ananth Grama, Department of Computer Science, Purdue University, West Lafayette, IN, USA
Bernard Philippe, IRISA, INRIA Rennes-Bretagne Atlantique, Rennes, France
Yousef Saad, Dept. of Computer Science & Engineering, University of Minnesota, Minneapolis, MN, USA
Faisal Saied, Department of Computer Science, Purdue University, West Lafayette, IN, USA

ISBN 978-1-4471-2436-8
e-ISBN 978-1-4471-2437-5
DOI 10.1007/978-1-4471-2437-5
Springer London Dordrecht Heidelberg New York

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Control Number: 2012930017

© Springer-Verlag London Limited 2012

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licenses issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc., in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.
Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

This collection is a tribute to the intellectual leadership and legacy of Prof. Ahmed H. Sameh. His significant contributions to the field of Parallel Computing, over his long and distinguished career, have had a profound influence on high performance computing algorithms, applications, and systems. His defining contributions to the field of Computational Science and Engineering, and its associated educational program, resulted in a generation of highly trained researchers and practitioners. His high moral character and fortitude serve as exemplars for many in the community and beyond.

Prof. Sameh did his graduate studies in Civil Engineering at the University of Illinois at Urbana-Champaign (UIUC). Upon completion of his Ph.D. in 1966, he was recruited by Daniel L. Slotnick, Professor and Director of the Illiac IV project, to develop various numerical algorithms. Prof. Sameh joined the Department of Computer Science as a Research Assistant Professor, subsequently becoming a Professor, and along with Profs. Duncan Lawrie, Daniel Gajski and Edward Davidson served as the Associate Director of the Center for Supercomputing Research and Development (CSRD). CSRD was established in 1984 under the leadership of Prof. David J. Kuck to build the University of Illinois Cedar multiprocessor. Prof. Sameh directed the CSRD Algorithms and Applications Group. His visionary, yet practical outlook, in which algorithms were never isolated either from real applications or from architecture and software, resulted in seminal contributions. By 1995 CSRD's main mission had been accomplished, and Prof. Sameh moved to the University of Minnesota as Head of the Computer Science Department and William Norris Chair for Large-Scale Computing. After a brief interlude, back at UIUC, to lead CSRD, during which he was very active in planning the establishment of Computational Science and Engineering as a discipline and an associated graduate program at UIUC, he returned to Minnesota, where he remained until 1997.
He moved to Purdue University as the Head and Samuel D. Conte Professor of Computer Science. Prof. Sameh, who is a Fellow of SIAM, ACM and IEEE, was honored with the IEEE 1999 Harry H. Goode Memorial Award “For seminal and influential work in parallel numerical algorithms”.

It was at Purdue that over 50 researchers and academic progeny of Prof. Sameh gathered in October 2010 to celebrate his 70th birthday. The occasion was the Conference on High Performance Scientific Computing: Architectures, Algorithms, and Applications held in his honor. The attendees recalled Prof. Sameh's many academic achievements, including not only his research but also his efforts in defining the interdisciplinary field of Computational Science and Engineering and his leadership and founding Editor-in-Chief role in the IEEE CS&E Magazine, as well as the many doctoral candidates that he has graduated: at UIUC, Jonathan Lermit (1971), John Larson (1978), John Wisniewski (1981), Joseph Grcar (1981), Emmanuel Kamgnia (1983), Chandrika Kamath (1986), Mark Schaefer (1987), Hsin-Chu Chen (1988), Randall Bramley (1988), Gung-Chung Yang (1990), Michael Berry (1990), Felix G. Lou (1992), Bart Semeraro (1992) and Vivek Sarin (1997); Ananth Grama (1996) at the University of Minnesota; and Zhanye Tong (1999), Matt Knepley (2000), Abdelkader Baggag (2003), Murat Manguoglu (2009) and Carl Christian Kjelgaard Mikkelsen (2009) at Purdue.

This volume consists of a survey of Prof. Sameh's contributions to the development of high performance computing and sixteen editorially reviewed papers written to commemorate the occasion of his 70th birthday.

Michael W. Berry, Knoxville, USA
Kyle A. Gallivan, Tallahassee, USA
Stratis Gallopoulos, Patras, Greece
Ananth Grama, West Lafayette, USA
Bernard Philippe, Rennes, France
Yousef Saad, Minneapolis, USA
Faisal Saied, West Lafayette, USA

Acknowledgements

We are especially grateful to Profs. Zhiyuan Li, Alex Pothen, and Bob Skeel for many arrangements that made the conference possible.
We are also grateful to Dr. Eric Cox, who undertook the heavy load of making many of the local arrangements, and Dr. George Kollias and Ms. Eugenia-Maria Kontopoulou for their help in compiling this volume. Finally, we thank Springer and especially Mr. Simon Rees for patiently working with us on this project and Donatas Akmanavičius of VTeX Book Production for great editing work in compiling the volume.

Contents

1 Parallel Numerical Computing from Illiac IV to Exascale—The Contributions of Ahmed H. Sameh
  Kyle A. Gallivan, Efstratios Gallopoulos, Ananth Grama, Bernard Philippe, Eric Polizzi, Yousef Saad, Faisal Saied, and Danny Sorensen

2 Computational Capacity-Based Codesign of Computer Systems
  David J. Kuck

3 Measuring Computer Performance
  William Jalby, David C. Wong, David J. Kuck, Jean-Thomas Acquaviva, and Jean-Christophe Beyler

4 A Compilation Framework for the Automatic Restructuring of Pointer-Linked Data Structures
  Harmen L.A. van der Spek, C.W. Mattias Holm, and Harry A.G. Wijshoff

5 Dense Linear Algebra on Accelerated Multicore Hardware
  Jack Dongarra, Jakub Kurzak, Piotr Luszczek, and Stanimire Tomov

6 The Explicit Spike Algorithm: Iterative Solution of the Reduced System
  Carl Christian Kjelgaard Mikkelsen

7 The Spike Factorization as Domain Decomposition Method; Equivalent and Variant Approaches
  Victor Eijkhout and Robert van de Geijn

8 Parallel Solution of Sparse Linear Systems
  Murat Manguoglu

9 Parallel Block-Jacobi SVD Methods
  Martin Bečka, Gabriel Okša, and Marián Vajteršic