NATO UNCLASSIFIED + SWE + AUS

Parallel Unsteady Overset Mesh Methodology for Adaptive and Moving Grids with Multiple Solvers

Matthew W. Floros∗, Army Research Laboratory, Hampton, Virginia
Jayanarayanan Sitaraman, National Institute of Aerospace, Hampton, Virginia

ABSTRACT

This paper describes a new domain connectivity module developed to support Chimera-based interfacing of different CFD solvers for performing time-dependent, adaptive, and moving-body calculations of external aerodynamic flows. The capabilities of the domain connectivity module provide a powerful tool for CFD analysis of morphing vehicles. The domain connectivity module coordinates the data transfer between different solvers applied in different parts of the computational domain: body-fitted structured or unstructured grids to capture viscous near-wall effects, and Cartesian adaptive mesh refinement to capture effects away from the wall. The CFD solvers and the domain connectivity module are executed within a Python-based computational infrastructure. The domain connectivity module is fully parallel and performs all its operations (identification of holes and fringe points, donor cell searches, and data interpolation) on the partitioned grid data. In addition, the connectivity procedures are completely automated using the implicit hole-cutting methodology such that no user intervention or explicit hole-map specification is necessary. The capabilities and performance of the package are presented for several test problems, including flow over a NACA 0015 wing, an AGARD A2 slotted airfoil, a hover simulation of a scaled V-22 rotor, and a dynamic simulation of a UH-60A rotor in forward flight. A modification to the procedure for selecting the best data from multiple overlapping grids is also presented.

1.0 INTRODUCTION

Morphing vehicles present several unique challenges for computational fluid dynamics analysis. By definition, a morphing vehicle changes shape with time. The shape change may be a gross deformation of the entire vehicle, or the motion may involve only a small control surface or support structure.
In either case, constructing a single grid that captures both the important flow features around the entire vehicle and the details of its geometry while in motion can be challenging. Such problems lend themselves to multiple, overset grids. In the first case, where large parts of the vehicle are moving, grids wrapped around these large parts must undergo significant deformation to accommodate the shape change. In the latter case, only a small deformation may be required, but the purpose of the deformation may be to take advantage of unique flow physics that require localized refinement or even a specific solution methodology that is not necessary for the rest of the vehicle. The ability to combine heterogeneous grid types and solvers for an unsteady, moving-body simulation is of great utility.

Traditional CFD codes are often written to support a single gridding and solution paradigm. Grids fall under three main classifications: Cartesian (structured or unstructured), structured curvilinear (body-fitted), or unstructured (tetrahedral, hexahedral, or prismatic). Each meshing paradigm has specific advantages and disadvantages. For example, Cartesian grids are easy to generate, to adapt, and to extend to higher-order spatial accuracy, but they are not well-suited for resolving boundary layers around complex geometries. Structured curvilinear grids work well for resolving boundary layers, but the grid generation process for complex geometries remains tedious and requires considerable user expertise. General unstructured grids are well-suited for complex geometries and are relatively easy to generate, but their spatial accuracy is often limited to second order, and the associated data structures are less computationally efficient than their structured-grid counterparts.

Thus, while a single gridding paradigm brings certain advantages in some portions of the flow field, it also imposes an undue burden on others. A computational platform that supports multiple mesh paradigms provides the potential for optimizing the gridding strategy on a local basis.
However, integrating different meshing paradigms into a single large monolithic code is complex and usually relegates at least one of the models to be less accurate, less optimized, or less flexible than the original standalone solver.

∗ Presenting author, matt.fl[email protected], NASA Langley Research Center MS 266, 6C East Taylor Street, Hampton, VA 23681

RTO-MP-AVT-168

Report Documentation Page (Standard Form 298): report date 2010; dates covered 00-00-2010 to 00-00-2010; performing organization: Army Research Laboratory, Hampton, VA; sponsor/monitor's acronym: NATO/RTO; distribution/availability statement: approved for public release, distribution unlimited; security classification of report, abstract, and this page: unclassified; limitation of abstract: Public Release; number of pages: 23.

This paper describes an approach to mitigate these issues by using a multiple-mesh strategy that is implemented through the use of multiple CFD codes, each optimized for a particular mesh type. It was developed for analysis of rotorcraft, but features several enabling technologies for morphing vehicles and other types of unsteady, moving-body problems. In addition to supporting heterogeneous grids and solvers, the software was developed to minimize the analysis burden on the user. While an interface procedure is required for each specific solver, this interface must only be written once. Once the interface exists, the domain connectivity software retrieves the information it needs directly from the flow solvers and the grids themselves, so the human analyst does not need to provide input specific to each problem.

For rotorcraft analysis, the approach is to apply unstructured or body-fitted curvilinear grids near the body surface to capture complex geometry and viscous boundary layers. A short distance from the body surface, the flow is determined by a high-order block-structured adaptive Cartesian solver that adapts time-dependently to capture wake effects. Previous work [1, 2] has addressed the development of a Python-based multi-solver infrastructure using both unstructured (NSU3D [3]) and structured (UMTURNS [4]) near-body solvers and an adaptive Cartesian off-body solver (SAMARC). Using this high-level framework, discrete codes can be coupled together to obtain a contiguous solution over the entire computational domain. The Python code merely manages data pointers to pass information between the solvers and the domain connectivity package and adds negligible overhead to the analysis.
Figure 1 shows an example calculation of flow over a sphere using this approach, for which both the viscous boundary layer and the shed vorticity are accurately captured.

Figure 1: Unsteady flow over a sphere at Re = 1000 using the multiple-solver approach. The NSU3D unstructured solver is used near the body surface, the high-order Cartesian AMR solver SAMARC is used in the field, and data is interpolated between the solvers using Chimera-based interpolation [1].

A critical aspect of the multi-mesh/multi-solver approach is the need for data exchange between the different meshes, which is facilitated in this work using the well-established Chimera-based overset procedure. As shown in Fig. 2, the near-body and off-body meshes are constructed to be overlapping and the fringe data are interpolated using a domain connectivity module. Specifically, the domain connectivity procedures involve the evaluation of inter-grid boundary data points (points that receive data), donor cells (cells that provide data), hole points (points that do not need to be solved), and interpolation weights. The focus of the present paper is on the development of a new domain connectivity module to support the overset multiple-mesh paradigm efficiently for large-scale unsteady computations and its applicability to unsteady moving-body problems such as morphing vehicles.

Several domain connectivity approaches have been investigated in the past by various research groups. Prominent among them are PEGASUS5 [5], OVERFLOW-DCF [6, 7], SUGGAR/DiRTlib [8], CHIMPS [9], BEGGAR [10], FASTRAN [11], and Overture [12]. Prior to the inception of the current effort, an evaluation of these packages showed that all of them had certain deficiencies that made them non-ideal for the multi-solver paradigm.
For example, PEGASUS5 and SUGGAR, while being robust and validated, are intended to be integrated with a single solver as opposed to interfacing multiple solvers, and the domain connectivity operation is not parallel, which limits their applicability to large-scale moving-body problems. OVERFLOW-DCF proved to be an efficient parallel solver for moving-body problems, but it is tightly integrated within the OVERFLOW [7] code, making its application to a modular multiple-solver paradigm difficult. Also, there is no support currently available in the module for unstructured meshes. The CHIMPS package was quite modular with an excellent API and had support for parallel execution in moving-body environments, but it suffered from lack of support for automated hole-cutting and inefficiency in search procedures. All of the above packages required user input in some form or other for specifying hole regions (i.e., regions of the grid where flow solutions are not performed) and typically performed explicit hole cutting.

Figure 2: Overset near/off-body gridding. Unstructured or curvilinear grids capture geometric features and the boundary layer near the body surface; adaptive block-structured Cartesian grids capture far-field flow features.

The explicit hole-cutting methodology is known to be prone to several failure modes for moving-body problems. An alternate approach is known as “implicit hole-cutting”, in which holes and fringe points are determined as part of a donor search process rather than a priori by the analyst. This concept was initially investigated by PEGASUS5 [5] researchers and eventually deemed too inefficient for dynamic problems. However, more recently a research code, NAVAIR-IHC [13], was able to show efficient usage of this methodology for application to moving-body problems. The NAVAIR-IHC code was also evaluated for potential application within the multi-solver paradigm. However, lack of parallelism, implementations that are specific to a structured grid topology, and failure modes for concave grid topologies made it unsuitable for general application in the multi-solver paradigm with partitioned grid data.
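As a rough, one-dimensional illustration of the implicit hole-cutting idea (not the PUNDIT or NAVAIR-IHC implementation), the sketch below classifies the nodes of two overlapping grids by comparing local cell sizes: a node stays a solved field point unless the other grid covers it with a finer cell, in which case it becomes a receiver. The function names, the 1D setting, and the use of cell length as the resolution measure are all assumptions made for this example.

```python
from bisect import bisect_right

def cell_sizes(nodes):
    """Length of each 1D cell defined by consecutive node coordinates."""
    return [b - a for a, b in zip(nodes, nodes[1:])]

def node_resolution(nodes):
    """Resolution measure at each node: mean size of the cells touching it."""
    sizes = cell_sizes(nodes)
    res = []
    for i in range(len(nodes)):
        touching = sizes[max(i - 1, 0):i + 1]
        res.append(sum(touching) / len(touching))
    return res

def find_donor_cell(nodes, x):
    """Index of the cell of `nodes` containing x, or None if x is outside."""
    if x < nodes[0] or x > nodes[-1]:
        return None
    return min(bisect_right(nodes, x) - 1, len(nodes) - 2)

def classify(own, other):
    """Tag each node of `own` as 'field' (solved) or 'receiver' (interpolated)."""
    other_sizes = cell_sizes(other)
    tags = []
    for x, res in zip(own, node_resolution(own)):
        donor = find_donor_cell(other, x)
        # A node becomes a receiver only if the other grid covers it
        # with a strictly finer cell; otherwise it keeps being solved.
        if donor is not None and other_sizes[donor] < res:
            tags.append("receiver")
        else:
            tags.append("field")
    return tags

coarse = [0.0, 1.0, 2.0, 3.0, 4.0]   # spacing 1.0
fine = [1.5, 1.75, 2.0, 2.25, 2.5]   # spacing 0.25, overlaps the coarse grid
coarse_tags = classify(coarse, fine)
fine_tags = classify(fine, coarse)
```

No hole map is supplied anywhere in this sketch: the classification falls out of the donor search and the resolution comparison, which is the property that makes the approach attractive for automated moving-body calculations.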
For seamless usage in the computational infrastructure, the domain connectivity software should be modular, parallel, fully automated, and efficient for unsteady moving-body problems [14]. The primary subject of this paper concerns the development and application of such a methodology in the multi-solver context. In addition, this paper demonstrates the capabilities of the multi-solver paradigm in modeling complex problems with optimal grid distribution.

The new package developed as part of this work is termed PUNDIT, which is an acronym for Parallel Unsteady Domain Information Transfer. The objective of the PUNDIT developers is to create a package which is modular, parallel, and supports efficient automated domain connectivity operations for all participant grid types (unstructured, structured curvilinear, and adaptive Cartesian). To minimize user inputs, the implicit hole-cutting methodology for identification of hole and fringe regions is used. In addition, PUNDIT operates in the same computational infrastructure as the participant codes and uses data pointers for grids and solution variables directly, thereby promoting ease of interfacing as well as reducing the memory footprint.

2.0 Methodology

The coupling of the near-body solver, off-body grid manager, off-body solver, and domain connectivity module is accomplished through a Python-based infrastructure with emphasis on preserving the modularity of the participating solvers. In addition to the advantages in efficiency and ease in code development, coupling existing mature simulation codes through a common high-level infrastructure provides a natural way to reduce the complexity of the coupling task and to leverage the large amount of verification, validation, and user experience that typically go into the development of each separate model.
The infrastructure discussed here currently couples the following codes through Python: (a) the parallel NSU3D code [3] for the unstructured near-body solver, (b) the parallel UMTURNS code as the structured near-body solver, (c) the SAMRAI framework [15] for the off-body Cartesian grid generation, adaptation, and parallel communication, (d) the serial high-order ARC3DC code for solution on the Cartesian blocks, and (e) the domain connectivity module (PUNDIT).

Brief descriptions of the first four components are outlined below. More detailed descriptions of the implementations, as well as performance statistics and validation, are given in Refs. [1, 2]. This paper is devoted to detailed descriptions of the algorithms and implementation of the domain connectivity module (PUNDIT). The infrastructure which integrates the simulation capabilities of all the modules mentioned above is called HELIOS (Helicopter Overset Simulations) and is developed under the DoD HPC HI-ARMS program. Because of the modular development approach being taken, the individual components of HELIOS, such as the solvers and the domain connectivity package, can be used independently for moving-body problems other than rotorcraft.

2.1 Python Infrastructure

Python-based computational frameworks have been developed previously by several researchers [16] as a means of coupling together existing legacy codes or modules. Such a framework has a number of advantages over a traditional monolithic code structure: (1) it is easier to incorporate well-tested and validated legacy codes rather than to build the capabilities into an entirely new code, (2) there is less code complexity in the infrastructure itself, so maintenance and modification costs are lower, and (3) it is easier to test and optimize the performance of each module separately, often yielding better performance for the code as a whole.
Essentially, Python enables the legacy solvers to execute independently of one another and reference each other's data without memory copies or file I/O. Further, the Python-wrapped code may be run in parallel using pyMPI or myMPI, with each of the solvers following its native parallel implementation. Further details of the Python infrastructure used here can be found in Wissink et al. [1] and Sitaraman et al. [2].

2.2 Flow Solver Modules

Well-established legacy codes are employed for all of the independent modules in this study. For the unstructured solver in the near-body region, we employ the NSU3D [3] code, which is an implicit node-centered Reynolds-Averaged Navier-Stokes (RANS) code capable of handling arbitrary unstructured mesh elements. In addition, we also use a curvilinear structured mesh solver for the near-body region, namely the UMTURNS code, which is also an implicit RANS solver. For the Cartesian grids in the off-body region, we utilize a Cartesian grid derivative of the well-known ARC3D [17] code, referred to here as ARC3DC. The ARC3DC code employs third-order temporal discretization using a multi-stage Runge-Kutta time-stepping framework and is capable of up to fifth-order accurate spatial discretization. Further, the Cartesian grids in the off-body region are automatically generated and managed for parallel execution by the SAMRAI infrastructure [18, 15]. As mentioned earlier, these codes or modules are combined using a Python-based framework that orchestrates the execution of these modules and the associated data transfers.

2.3 Meshing Paradigm

The meshing paradigm consists of separate near-body and off-body grid systems. The near-body grid typically extends a short distance from the body, sufficient to contain the boundary layer. This grid can be a structured curvilinear grid or an unstructured tetrahedral or prismatic grid that has been extracted from a standard unstructured volume grid or generated directly from a surface triangulation using hyperbolic marching.
The reason for using curvilinear or unstructured grids in the near-body region is to properly capture the geometry and viscous boundary layer effects, which are difficult or impossible to capture with Cartesian grids alone. We further note that either structured or unstructured grids will work equally well in the near-body region from the point of view of our infrastructure. Away from the body, the near-body grid solution is interpolated onto a Cartesian background mesh with the aid of the domain connectivity algorithm. This transition normally occurs at a distance where the sizing of the near-body grid cells is approximately commensurate with the sizing of the Cartesian mesh in the off-body region.

Figure 3: Block-structured AMR grid composed of a hierarchy of nested levels of refinement. Each level contains uniformly spaced, logically rectangular regions, defined over a global index space.

The off-body grid system consists of a hierarchy of nested refinement levels, generated from coarsest to finest. In the Structured Adaptive Mesh Refinement (SAMR) paradigm [19], the coarsest level defines the physical extents of the computational domain. Each finer level is formed by selecting cells on the coarser level and then clustering the marked cells together to form the regions that will constitute the new finer level. Solution-based refinement progresses as follows: physical quantities and gradients are computed at each Cartesian cell using the latest available solution, and those cells that hold values deemed to require refinement are marked. The marked cells are then clustered to form the new set of patches that constitute the next finer level. The process is repeated at each new grid level until the geometry and solution features are adequately resolved. We note that this entire procedure is automated within the software and no user intervention is required. The procedure of adaptive mesh refinement is graphically illustrated in Fig. 3.
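The tag-and-cluster cycle described above can be sketched in a few lines. This is a minimal 2D illustration and not the SAMRAI algorithm: the gradient-magnitude criterion, the threshold value, the refinement ratio of 2, and the single bounding-box "clustering" are assumptions chosen for brevity (production SAMR frameworks use more elaborate clustering that can produce several patches per level).

```python
def tag_cells(u, threshold):
    """Mark interior cells whose central-difference gradient magnitude exceeds threshold."""
    ni, nj = len(u), len(u[0])
    tagged = []
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            gx = 0.5 * (u[i + 1][j] - u[i - 1][j])
            gy = 0.5 * (u[i][j + 1] - u[i][j - 1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                tagged.append((i, j))
    return tagged

def cluster(tagged):
    """Cluster tagged cells into one patch: their index-space bounding box."""
    if not tagged:
        return None
    i_vals = [i for i, _ in tagged]
    j_vals = [j for _, j in tagged]
    return (min(i_vals), min(j_vals)), (max(i_vals), max(j_vals))

def refine_patch(patch, ratio=2):
    """Index box on the next finer level covering the patch (inclusive bounds)."""
    (i0, j0), (i1, j1) = patch
    return (i0 * ratio, j0 * ratio), ((i1 + 1) * ratio - 1, (j1 + 1) * ratio - 1)

# Coarse 8x8 field with a sharp jump between i = 3 and i = 4.
u = [[0.0 if i < 4 else 1.0 for j in range(8)] for i in range(8)]
tagged = tag_cells(u, threshold=0.25)
patch = cluster(tagged)
fine_box = refine_patch(patch)
```

Repeating the same three calls on the data of the new level would build the next finer level, mirroring the coarsest-to-finest construction of the hierarchy.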
An example of the overset near-body/off-body meshing strategy is given in Fig. 2, which shows the meshes for flow computations over a helicopter fuselage. Here, the mixed-element unstructured near-body grid envelops the fuselage, while a multi-level Cartesian off-body grid extends from the outer boundary of the near-body grid to the far-field boundary. The two sets of meshes overlap in the so-called fringe region, where data are exchanged between the grids.

2.4 Overset Methodology

The overall solution procedure for the overset meshes is as follows. At each iteration step, the solution of the fluid equations in each mesh is obtained independently, with the solution in the fringe region being specified by interpolation from the overlapping “donor” mesh as Dirichlet boundary conditions. At the end of the iteration, the fringe data is exchanged between the solvers so that the evolution of the global solution is faithfully represented in the overset methodology. Further, if one or more of the meshes is moving or changing, the fringe regions and the interpolation weights are recalculated at the beginning of the time step. This procedure is repeated for each iteration step until solution convergence is attained in both the near- and off-body meshes.

2.5 Domain Connectivity Module (PUNDIT)

For each solver, the solution at the fringe cells is obtained by interpolation from the overlapping “donor” mesh. The domain connectivity module (PUNDIT) is responsible for determining the appropriate interpolation weights for each fringe point. Further, in the case of multiple overlapping meshes, PUNDIT must also identify one mesh as the donor for each fringe point in each mesh. For static meshes, these operations are done once, at the beginning of the computation, while, for the more general case of moving or adapting meshes, the determination of donors and weights has to be repeated within the time-stepping or iteration loop.

PUNDIT adopts the implicit hole-cutting procedure followed by NAVAIR-IHC [13].
The core idea of this approach is to retain the grids with the finest resolution at any location in space as part of the computational domain and interpolate data at all coarser grids in this region from the solution on the fine grid. This results in the automatic generation of optimal holes without the user specification required in explicit hole-cutting. Moreover, the implicit hole-cutting procedure produces an arbitrary number of fringe points based on mesh density, compared to traditional methods which use a fixed fringe width (usually single or double). The critical parameter that quantifies the quality of a grid cell or node is termed the “resolution capacity”. PUNDIT nominally uses the cell volume as the resolution capacity for a grid cell and the average of the cell volumes of all associated grid cells as the resolution capacity for a grid node. A modification to this procedure to account for wall proximity is described later. In addition, PUNDIT separates the near-body to near-body and near-body to off-body domain connectivity procedures to facilitate automatic off-body grid generation and to improve efficiency (Meakin et al. [14]).

The following steps are performed by PUNDIT to determine the domain connectivity information in a parallel environment. Partitioning of grid and solution data is assumed to be performed using an appropriate solver-based load balancing scheme before these steps are executed.

Figure 4: Oriented bounding boxes and vision space bins that are created during preprocessing to accelerate donor search processes. (a) Intersecting spheres (partitioned grids shown). (b) Oriented bounding boxes created using inertial bisection. (c) Vision space bins are created by dividing the bounding boxes into smaller partitions. (d) Only vision space bins that contain mesh cells are shown.

1. Registration: On each processor, the flow solvers register grid and solution data pointers for every grid block (both near-body and off-body) with PUNDIT.

2. Profiling: PUNDIT profiles each of the grid blocks and forms meta-data representations to facilitate faster donor search operations.
The main procedures that are executed in this step are:

   • Minimal bounding box computation: Oriented bounding boxes are constructed instead of axis-aligned bounding boxes to minimize the search space (Fig. 4(b)).
   • Division of minimal bounding box into vision space bins: The vision space bins are smaller Cartesian boxes within the bounding box. The size of the vision space bins is determined by dividing the volume of the bounding box by the total number of cells contained in it (Fig. 4(c) and Fig. 4(d)).
   • Generation of a cumulative fill-table: The cumulative fill-table is a bin-wise reordering of the cell indices that are contained in each vision space bin. It facilitates fast identification of all cells contained within each vision space bin.
   • Estimation of element resolution capacity: Element resolution capacity is the grid quality measure for each cell and each grid node. Nominally, cell volumes are used as the resolution capacity for each grid cell, and the average of all the associated cell volumes is used as the resolution capacity for each grid node. A method to modify resolution capacity to retain cells near a solid wall in the grid containing the wall is described in Section 4.0.

3. Near-body to near-body domain connectivity: In this step the domain connectivity operations are performed between all near-body blocks on all processors. The sequence of the operations is as follows: The bounding boxes constructed in each processor in the previous step are gathered in every processor. Each processor checks for intersection of bounding boxes of its grid blocks with the global set of bounding boxes, which aids in identifying candidate donor grid blocks. The bounding box intersection checks are optimized such that detection of intersection as well as identification of vision space bins containing potential receiver grid points are simultaneously processed.
The identification of vision space bins accelerates the identification of potential receiver grid points, as only those nodes which belong in these vision space bins are checked for containment in the donor bounding box.

Following the intersection checks, a communication packet is set up and exchanged between all the processors. The communication packet consists of a list of coordinates of all potential receiver grid points and their resolution capacity, organized such that the donor search can be directed to the appropriate grid block in the candidate processor. Upon completion of this communication, each grid block in each processor will form a list of points for which it needs to locate donor cells. The donor search is then conducted following the algorithm outlined below. Once a list of donors is obtained, it is communicated back to the processor that requested the donor search, where the potential donor cells that were found are evaluated, such that the donor with the best resolution capacity is chosen for every receiver grid point. Indices of all the potential donors that are not acceptable are communicated back to the appropriate donor processor such that it can update its donor list. Additionally, a quality check is performed to make sure that no donor cell has a receiver point as one of its vertices. If such a cell is located, it is deleted from the donor list and this information is communicated to the receiver processor, which adjusts its receiver list accordingly as well. This process rigorously establishes donor quality, since all donors will be composed of only nodes which are being solved and are not themselves receiver points. The final product of this step is a communication table consisting of a list of donors and receivers in each processor which is synchronized for data interpolation.

Figure 5 shows a graphical illustration and flow chart of the donor search algorithm outlined below. The donor search algorithm proceeds as follows: For any potential receiver point, a localization is performed by locating the vision space bin that contains it. This is a trivial operation since the vision space bins follow a Cartesian structure within the oriented bounding box.
Subsequently, a spiral search is performed beginning at the vision space bin until a bin with at least one grid cell is located. This grid cell is chosen as the seed for initiating the so-called “stencil walk process”. A line is created by connecting a point inside a potential donor cell (centroid for the initial cell and face intersection point for subsequent cells) with the receiver point. A check is made to determine if this line intersects any of the faces of the cell. If an intersection is located, the cell which forms the neighbor at that face is chosen as the next potential donor cell. This procedure continues until a cell with no intersections is located, which is the donor cell for the given point. For a face which shows intersection but has no neighbor (i.e., a boundary face), a check is conducted using the spiral search technique to determine whether the line intersects any other boundary faces in the neighborhood. If an intersection is found, then stencil walking proceeds to the cell which owns that boundary face. This procedure addresses complex grid boundaries such as those found in partitioned unstructured grids or thin trailing edges. If a cell is not found, the receiver point is pronounced to have no donor cell in the current grid block. In addition, if the final boundary face that the line intersected is a wall boundary face, the receiver point will be tagged as a hole point (i.e., it is inside the solid body). Once a point is tagged as a hole by a near-body grid, it will no longer receive data even if a suitable donor is found by another grid. This ensures that the volume within the body is blanked for the nodes in all grids which overlap within the body.

4. Off-body grid generation: The adaptive Cartesian mesh generation is dictated by the resolution of the near-body meshes at their boundaries. In order to achieve this, a list is generated which is composed of all outer boundary points of near-body meshes across processors that did not find a near-body donor in the previous step. This list is communicated to the off-body grid manager (SAMRAI), which automatically generates/adjusts nested Cartesian meshes such that the resolution is commensurate at the near-body boundaries.

Figure 5: Flow chart of donor finding algorithm that uses vision space bin based localization and stencil-walk based search. Illustration labels: vision space bins, spiral search path, stencil walk path, potential receiver point. Flow chart boxes: locate the vision space bin coordinates of the receiver point; spiral search to locate the closest bin containing grid cells and choose the closest cell; form a line between the cell point and the receiver point (cell point = cell center the first time, otherwise = face intersection point); check for intersection with all faces of the grid cell that do not contain the cell point; valid face neighbor; boundary face intersection; donor located; donor not located.

5. Off-body to near-body connectivity: First, valid donors (i.e., those with better resolution capacity) for all points in near-body grid blocks are searched in the off-body Cartesian blocks. This search process is extremely efficient because of the isotropic nature of the Cartesian blocks and the global grid indexing followed by the off-body grid manager. A donor Cartesian cell can be located in just a single step. In addition, this process identifies all the Cartesian blocks which might contain potential receiver points. Following this, valid donors in the near-body grids are searched for all potential off-body grid points using the same process as that outlined for the near-body to near-body domain connectivity. Once the donors and receiver points are located for both the off-body and near-body grids, a communication pattern is followed which updates the communication tables that synchronize the donor and receiver lists in each processor. Donor quality is again guaranteed by removing low-quality donors and their corresponding receivers from the communication tables. Once the communication tables are finalized, the interpolation weights for each grid node of each donor cell in the donor list are computed.
Tri-linear basis functions are used for data interpolation and the interpolation weights are computed using a Newton-Raphson procedure.

6. Data interpolation: Data is interpolated in a three-step process. The first step consists of populating a communication buffer using the solution data and the donor list from the communication tables. These data buffers are exchanged between processors through interprocessor communication in the second step. In the third and final step, the solution data is updated in each processor based on the receiver list from the communication tables. In contrast to many of the existing domain connectivity solvers, PUNDIT minimizes the communication overhead by maintaining the interpolation weights in the host processor. The client processor only receives solution data and is capable of assigning them appropriately because of the synchronization in communication tables. Moreover, interpolation weights are evaluated only for the final list of donors, minimizing the number of floating point operations. In order to support interfacing of solvers which follow different non-dimensionalizations for flow variables, PUNDIT maintains non-dimensionalizing factors for each flow variable from each participant solver. The data buffers that are exchanged are in dimensional form and the appropriate non-dimensionalization is applied at the update step based on the factors provided by each participant code. It is worth noting that the implementation of data interpolation is quite general and does not impose any restriction on the type or number of flow variables.

3.0 Software Verification and Application Results

We present results using PUNDIT within the flow solver package HELIOS for several examples. The results section is organized as follows: first, test cases for interpolation accuracy and scalability are presented. Then, application results are presented which demonstrate the use of the software for adapting and moving-body meshes.
The first result is a NACA 0015 fixed-wing test case which demonstrates the advantages of adaptive grids and higher-order methods for wake capturing. The next set of results illustrates the capability of PUNDIT to perform domain connectivity operations which automate hole cutting and generate optimal overlaps. The domain connectivity solutions are shown for the AGARD A2 test case (NHLP-2D slotted airfoil) in steady and unsteady motion. Following that, we present results for rotorcraft simulations for the 1/4-scale V-22 and full-scale UH-60A rotors. Additional application results are presented in Ref. [20].

3.1 Interpolation Accuracy

All the polyhedra that are common in unstructured mixed-element meshes (tetrahedra, pyramids, prisms, and hexahedra) are supported in PUNDIT. Tri-linear basis functions are used to perform data interpolation inside each of these polyhedra. The accuracy of interpolation was verified by constructing unstructured meshes of successively finer resolution in a unit cube. PUNDIT is then used to perform search and interpolation on a random set of points distributed in the unit cube using the data at the grid nodes of the unstructured meshes (which are prescribed using chosen test functions). The interpolated values are compared with exact test function values to estimate the interpolation error. Figure 6 shows the variation of interpolation error with improving resolution. For linear test functions, the tri-linear interpolation gives exact solutions within machine precision. For non-linear test functions, interpolation in all types of polyhedra shows 2nd-order accuracy. Interpolation accuracy is also verified to be 2nd order for a test function that uses the Lamb-vortex solution.
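To make the weight computation concrete, the following sketch inverts the tri-linear map of a single hexahedron with Newton-Raphson iteration and checks that a linear test function is reproduced. It is an illustrative reconstruction under stated assumptions, not PUNDIT source code: the node ordering, the function names, the starting guess at the cell center, and the Cramer's-rule solve are choices made here, and only the hexahedral case is shown.

```python
def shape(r, s, t):
    """Tri-linear weights of the 8 hexahedron nodes (node k has bits a, b, c)."""
    w = []
    for k in range(8):
        a, b, c = k & 1, (k >> 1) & 1, (k >> 2) & 1
        w.append((r if a else 1 - r) * (s if b else 1 - s) * (t if c else 1 - t))
    return w

def shape_derivs(r, s, t):
    """Derivatives of each weight with respect to (r, s, t)."""
    d = []
    for k in range(8):
        a, b, c = k & 1, (k >> 1) & 1, (k >> 2) & 1
        fr, fs, ft = (r if a else 1 - r), (s if b else 1 - s), (t if c else 1 - t)
        sr, ss, st = (1 if a else -1), (1 if b else -1), (1 if c else -1)
        d.append((sr * fs * ft, fr * ss * ft, fr * fs * st))
    return d

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(m)
    sol = []
    for col in range(3):
        mc = [[rhs[i] if j == col else m[i][j] for j in range(3)] for i in range(3)]
        sol.append(det3(mc) / d)
    return sol

def interpolation_weights(nodes, p, tol=1e-12, max_iter=20):
    """Newton-Raphson for (r, s, t) such that sum_k N_k(r, s, t) * nodes[k] = p."""
    xi = [0.5, 0.5, 0.5]  # start at the cell center
    for _ in range(max_iter):
        w = shape(*xi)
        res = [sum(w[k] * nodes[k][i] for k in range(8)) - p[i] for i in range(3)]
        if max(abs(v) for v in res) < tol:
            break
        dw = shape_derivs(*xi)
        jac = [[sum(dw[k][j] * nodes[k][i] for k in range(8)) for j in range(3)]
               for i in range(3)]
        step = solve3(jac, [-v for v in res])
        xi = [xi[i] + step[i] for i in range(3)]
    return shape(*xi), xi

# A mildly skewed hexahedron (hypothetical coordinates) and a receiver point.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.2, 1.1, 0.0),
         (0.0, 0.0, 1.0), (1.0, 0.0, 1.1), (0.0, 1.0, 1.0), (1.1, 1.2, 1.3)]
point = (0.4, 0.5, 0.6)
weights, xi = interpolation_weights(nodes, point)
f_interp = sum(w * (x + y + z) for w, (x, y, z) in zip(weights, nodes))
```

Because the weights sum to one and reproduce the receiver coordinates, the linear function f = x + y + z is recovered to within the Newton tolerance, consistent with the machine-precision result reported for linear test functions.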
Figure 6: Variation of interpolation errors for the polyhedra shown in subplot (d) (tetrahedron, pyramid, prism, hexahedron). Each error plot shows the L2 error against the grid spacing dx on logarithmic axes, with slope = 2 reference lines: subplot (a) shows the error variation for a linear test function, f = x + y + z; subplot (b) shows the error variation for a non-linear test function, f = cos(x)*cos(y)*cos(z); subplot (c) shows the error variation for the Lamb-vortex solution; subplot (d) shows the topologies of the polyhedra.
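The slope-2 behavior in Figure 6 can be reproduced in miniature. The sketch below is a simplified stand-in for the verification described in this section: it uses uniform Cartesian cells in the unit cube rather than unstructured meshes, the non-linear test function f = cos(x)*cos(y)*cos(z), and a seeded random point set. The grid spacings, sample count, and function names are assumptions of this example.

```python
import math
import random

def f(x, y, z):
    return math.cos(x) * math.cos(y) * math.cos(z)

def trilinear(h, x, y, z):
    """Interpolate f from the nodes of a uniform grid of spacing h."""
    n = round(1.0 / h)
    idx, frac = [], []
    for coord in (x, y, z):
        i = min(int(coord / h), n - 1)  # cell index along this axis
        idx.append(i)
        frac.append(coord / h - i)      # local coordinate in [0, 1]
    value = 0.0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                w = ((frac[0] if a else 1 - frac[0])
                     * (frac[1] if b else 1 - frac[1])
                     * (frac[2] if c else 1 - frac[2]))
                value += w * f((idx[0] + a) * h, (idx[1] + b) * h, (idx[2] + c) * h)
    return value

def l2_error(h, points):
    """Root-mean-square interpolation error over the sample points."""
    err2 = sum((trilinear(h, *p) - f(*p)) ** 2 for p in points)
    return math.sqrt(err2 / len(points))

rng = random.Random(0)
points = [(rng.random(), rng.random(), rng.random()) for _ in range(2000)]
spacings = [1 / 4, 1 / 8, 1 / 16]
errors = [l2_error(h, points) for h in spacings]

# Observed order of accuracy between successive grids (slope on a log-log plot).
orders = [math.log(errors[i] / errors[i + 1]) / math.log(spacings[i] / spacings[i + 1])
          for i in range(len(errors) - 1)]
```

Each entry of `orders` is the log-log slope between two successive spacings; values close to 2 correspond to the second-order reference lines in the figure.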