Grid Computing: Software Environments and Tools PDF

357 Pages·2006·3.111 MB·English

Preview Grid Computing: Software Environments and Tools

Grid Computing: Software Environments and Tools
José C. Cunha and Omer F. Rana (Eds)

With 121 Figures

José C. Cunha, CITI Centre, Department of Computer Science, Faculty of Science and Technology, New University of Lisbon, Portugal
Omer F. Rana, School of Computer Science, Cardiff University, UK

A catalogue record for this book is available from the British Library
Library of Congress Control Number: 2005928488
ISBN-10: 1-85233-998-5
ISBN-13: 978-1-85233-998-2
Printed on acid-free paper

© Springer-Verlag London Limited 2006

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Printed in the United States of America (SPI/MVY)
9 8 7 6 5 4 3 2 1
Springer Science+Business Media
springeronline.com

Preface

Grid computing combines aspects from parallel computing, distributed computing and data management, and has been playing an important role in pushing forward the state-of-the-art in computer science and information technologies. There is considerable interest in Grid computing at present, with a significant number of Grid projects being launched across the world.
Many countries have started to implement their own Grid computing programmes – such as in the Asia Pacific region (including Japan, Australia, South Korea and Thailand), the European Union (as part of the Framework 5 and 6 programmes, and national activities such as the UK eScience programme), and the US (as part of the NSF CyberInfrastructure and the DDDAS programmes). The rising interest in Grid computing can be seen by the increase in the number of participants at the Global Grid Forum (http://www.gridforum.org/), as well as through regular sessions on this theme at several conferences.

Many existing Grid projects focus on deploying common infrastructure (such as Globus, UNICORE, and Legion/AVAKI). Such efforts are primarily aimed at implementing specialist middleware infrastructure that can be utilized by application developers, without providing any details about how such infrastructure can best be utilized. As Grid computing infrastructure matures, however, the next phase will require support for deploying and developing applications and associated tools and environments which can utilize this core infrastructure effectively. It is therefore important to explore software engineering themes which will enable computer scientists to address the concerns arising from the use of this middleware.

However, approaches to software construction for Grid computing are ad hoc at the present time. There is either deployment of existing tools not really meant for Grid environments, or tools that are not robust – and therefore not likely to be re-used in communities other than those within which they have been developed (examples include specialized libraries for BioInformatics and Physics, for instance). On the other hand, a number of projects are exploring the development of applications using specialist tools and approaches that have been explored within a particular research project, without considering the wider implications of using and deploying these tools.
As a consequence, there is little shared understanding of the common needs of software construction, development, deployment and re-use. The main motivation for this book is to help identify what these common themes are, and to provide a series of chapters offering a more detailed perspective on these themes.

Recent developments in parallel and distributed computing: In the past two decades, advances in parallel and distributed computing allowed the development of many applications in Science and Engineering with computational and data intensive requirements. Soon it was realized that there was a need for developing generic software layers and integrated environments which could facilitate the problem solving process, generally in the context of a particular functionality. For example, such efforts have enabled applications involving complex simulations with visualization and steering, design optimization and application behavior studies, rapid prototyping, decision support, and process control (both from industry and academia). A significant number of projects in Grid computing build upon this earlier work.

Recent efforts in Grid computing infrastructure have increased the need for high-level abstractions for software development, due to the increased complexity of Grid systems and applications. Grid applications are addressing several challenges which had not been faced previously by parallel and distributed computing: large scale systems allowing transparent access to remote resources; long running experiments and more accurate models; increased levels of interaction, e.g. multi-site collaboration for increased productivity in application development.

Distributed computing: The capability to physically distribute computation and data has been explored for a long time. One of its main goals has been to be able to adapt to the geographical distribution of an application (in terms of users, processing or archiving ability). Increased availability and reliability of the systems architectures has also been successfully achieved through distribution of data and control.
A fundamental challenge in the design of a distributed system has been to determine how a convenient trade-off can be achieved between transparency and awareness at each layer of its software architecture. The levels of transparency, as provided by distributed computing systems, have changed (and will continue to change) over time, depending on the application requirements and on the evolution of the supporting technologies. The latter aspect is confirmed when we analyze Grid computing systems. Advances in processing and communication technologies have enabled the provision of cost-effective computational and storage nodes, and higher bandwidths in message transmission. This has allowed more efficient access to remote resources, supercomputing power, or large scale data storage, and opened the way to more complex distributed applications. Such technology advances have also enabled the exploitation of more tightly coupled forms of interactions between users (and programs), and pushed forward novel paradigms based on Web computing, Peer-2-Peer computing, mobile computing and multi-agent systems.

Parallel computing: The goal of reducing application execution time through parallelism has pushed forward many significant developments in computer system architectures, and also in parallel programming models, methods, and languages. A successful design for task decomposition and cooperation, when developing a parallel application, depends critically on the internal layers of the architecture of a parallel computing system, which include algorithms, programming languages, compilers and runtime systems, operating systems and computer system architectures. Two decades of research and experimentation have contributed to significant speedup improvements in many application domains, by supporting the development of parallel codes for simulation of complex models and for interpretation of large volumes of data. Such developments have been supported by advanced tools and environments, supporting processing and visualization, computational steering, and access through distinct user interfaces and standardized application programming interfaces.
Developments in parallel application development have also contributed to improvement in methods and techniques supporting the software life cycle, such as improved support for formal specification and structured program development, in addition to performance engineering issues. Component-based models have enabled various degrees of complexity, granularity, and heterogeneity to be managed for parallel and distributed applications – generally by reducing dependencies between different software libraries. For example, simulators and mathematical packages, data processing or visualization tools were wrapped as software components in order to be more effectively integrated into a distributed environment. Such developments have also allowed a clear identification of distinct levels of functionalities for application development and deployment: from problem specification, to resource management and execution support services. Developments in portable and standard programming platforms (such as those based on the Java programming language) have also helped in the handling of heterogeneity and interoperability issues.

In order to ease the computational support for scientific and engineering activities, integrated environments, usually called Problem-Solving Environments (PSEs), have been developed for solving classes of related problems in specific application domains. They provide the user interfaces and the underlying support to manage an increasingly complex life cycle of activities for application development and execution. This starts with the problem specification steps, followed by successive refinements towards component development and selection (for computation, control, and visualization). This is followed by the configuration of experiments, through component activation and mapping onto specific parallel and distributed computing platforms (including the set up of application parameters), followed by execution monitoring and control, possibly supported through visualization facilities.
As applications exhibit more complex requirements (intensive computation, massive data processing, higher degrees of interaction), many efforts have been focusing on easing the integration of heterogeneous components, and providing more transparent access to distributed resources available in wide-area networks, through (Web-enabled) portal interfaces.

Grid computing: The layers of a Grid architecture are similar to those of a distributed computing system:

1. User interfaces, applications and PSEs.
2. Programming and development models, tools and environments.
3. Middleware, services and resource management.
4. Heterogeneous resources and infrastructure.

However, researchers in Grid computing are pursuing higher levels of transparency, aiming to provide unifying abstractions to the end-user, with single access points to pools of virtual resources. Virtual resources provide support for launching distributed jobs involving computation, data access and manipulation of scientific instruments, with virtual access to remote databases, catalogues and archives, as well as cooperation based on virtual collaboration spaces. In this view, the main distinctive characteristic of Grid computing, when compared to previous generations of distributed computing systems, is this (more) ambitious goal of providing increased transparency and “virtualization” of resources, over a large scale distributed infrastructure.

Indeed, ongoing developments within Grid computing are addressing the deployment of large scale application and user profiles, supported by computational Grids for high-performance computing, intelligent data Grids for accessing large datasets and distributed data repositories – all based on the general concept of “virtual organizations” which enable resource sharing across organizational boundaries.
Recent interest in a “Grid Ecosystem” also places emphasis on the need to integrate tools at different software layers from a variety of different vendors, enabling a range of different solutions to co-exist for solving the same problem. This view also allows a developer to combine tools and services, and enables the use of different services which exist at the same software layer at different times. Suitable abstractions to facilitate such a Grid Ecosystem do not yet exist, however.

Due to the above aspects, Grids are very complex systems, whose design and implementation involve multiple dimensions, such as large scale, distribution, heterogeneity, openness, multiple administration domains, security and access control, and dynamic and unpredictable behavior. Although there have been significant developments in Grid infrastructures and middleware, support is still lacking for effective Grid applications development, and to assist software developers in managing the complexity of Grid applications and systems. Such applications generally involve large numbers of distributed, and possibly mobile and intelligent, computational components, agents or devices. This requires appropriate structuring, interaction and coordination methods and mechanisms, and new concepts for their organization and management. Workflow tools to enable application composition, common ways to encode interfaces between software components, and mechanisms to connect sets of components to a range of different resource management systems are also required. Grid applications will access large volumes of data, hopefully relying upon efficient and possibly knowledge-based data mining approaches.
New problem-solving strategies with adaptive behavior will be required in order to react to changes at the application level, and changes in the system configuration or in the availability of resources, due to their varying characteristics and behavior. Intelligent expert and assistance tools, possibly integrated in PSEs, will also play an increasingly important role in enabling the user-friendly interfacing to such systems.

As computational infrastructure becomes more powerful and complex, there is a greater need to provide tools to support the scientific computing community to make better use of such infrastructure. The last decade has also seen an unprecedented focus on making computational resources sharable (parallel machines and clusters, and data repositories) across national boundaries. Significantly, the emergence of Computational Grids in the last few years, and the tools to support scientific users on such Grids (sometimes referred to as “eScience”), provide new opportunities for the scientific community to undertake collaborative and multi-disciplinary research. Often tools for supporting application scientists have been developed to support a particular community (Astrophysics, Biosciences, etc.); a common perspective on the use of these tools and on making them more generic is often missing.

Further research and developments are therefore needed in several aspects of the software development process, including software architecture, specification languages and coordination models, organization models for large scale distributed applications, and interfaces to distributed resource management and execution services. The specification, composition, development, deployment, and control of the execution of Grid applications require suitable flexibility in the software life cycle, along its multiple stages, including application specification and design, program transformation and refinement, simulation and code generation, configuration and deployment, and the coordination and control of distributed execution.
New abstractions, models and tools are required to support the above stages in order to provide a diversity of functionalities, such as:

– Specification and modelling of the application structure and behavior, with incremental refinement and composition, and allowing reasoning about global functional and non-functional properties.
– Abstractions for the organization of dynamic large scale systems.
– Representation and management of interaction patterns among components and services.
– Enabling of alternative mappings between the layers of the software architecture, supported by pattern or template repositories, that can be manipulated during the software development and execution stages.
– Flexible interaction with resource management, scheduling and discovery services for flexible application configuration and deployment, and awareness to Quality of Service.
– Coordination of distributed execution, with adaptability and dynamic reconfiguration.

Such types of functionalities will provide the foundations for building environments and frameworks, developed on top of the basic service layers that are provided by Grid middleware and infrastructures.

Outline of the book: The aim of this book is to identify software engineering techniques for Grid environments, along with specialist tools that encapsulate such techniques, and case studies that illustrate the use of these tools. With the emergence of regional, national and global programmes to establish Grid computing infrastructure, it is important to be able to utilize this infrastructure effectively. Specialist software is therefore necessary to both enable the deployment of applications over such infrastructure, and to facilitate software developers in constructing software components for such infrastructure. We feel the second of these is a particularly important concern, as the uptake of Grid computing technologies will be restricted by the availability of suitable abstractions, methodologies, and tools.

This book will be useful for:

– Software developers who are primarily responsible for developing and integrating components for Grid environments.
– It will also be of interest to application scientists and domain experts, who are primarily users of the Grid software and need to interact with the tools.
– The book will also be useful for deployment specialists, who are primarily responsible for managing and configuring Grid environments.

We hope the book will contribute to increasing the reader’s appreciation for:

– Software engineering and modelling tools which will enable better conceptual understanding of the software to be deployed across Grid infrastructure.
– Software engineering issues that must be supported to compose software components for Grid environments.
– Software engineering support for managing Grid applications.
– Software engineering life cycle to support application development for Grid Environments (along with associated tools).
– How novel concepts, methods and tools within Grid computing can be put to work in the context of existing experiments and application case studies.

As many universities are now also in the process of establishing courses in Grid Computing, we hope this book will serve as a reference to this emerging area, and will help promote further developments both at university and industry. The chapters presented in this book are divided into four sections:

– Abstractions: chapters included in this section represent key modelling approaches that are necessary to enable better software development for deployment over Grid computing infrastructure. Without such abstractions, one is likely to see the continuing use of ad-hoc approaches.
– Programming and Process: chapters included in this section focus on the overall software engineering process necessary for application construction. Such a process is essential to channel the activity of a team of programmers working on a Grid application.
– User Environments and Tools: chapters in this section discuss existing application environments that may be used to implement Grid applications, or provide a discussion of how applications may be effectively deployed across existing Grid computing infrastructure.
– Applications: the final section provides sample applications in Engineering, Science and Education, and demonstrates some of the ideas discussed in other sections with reference to specific application domains.

José Cunha, Universidade Nova de Lisboa, Portugal
Omer F. Rana, Cardiff University, UK

Contents

Preface

Chapter 1 Virtualization in Grids: A Semantical Approach
Zsolt Nemeth and Vaidy Sunderam

Chapter 2 Using Event Models in Grid Design
Anthony Finkelstein, Joe Lewis-Bowen, Giacomo Piccinelli, and Wolfgang Emerich

Chapter 3 Intelligent Grids
Xin Bai, Han Yu, Guoqiang Wang, Yongchang Ji, Gabriela M. Marinescu, Dan C. Marinescu, and Ladislau Bölöni

Programming and Process

Chapter 4 A Grid Software Process
Giovanni Aloisio, Massimo Caffaro, and Italo Epicoco

Chapter 5 Grid Programming with Java, RMI, and Skeletons
Sergei Gorlatch and Martin Alt

User Environments and Tools

Chapter 6 A Review of Grid Portal Technology
Maozhen Li and Mark Baker

Chapter 7 A Framework for Loosely Coupled Applications on Grid Environments
Andreas Hoheisel, Thilo Ernst, and Uwe Der
