ebook img

NASA Technical Reports Server (NTRS) 19930005696: Preconditioned conjugate-gradient methods for low-speed flow calculations PDF

14 Pages·1.1 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview NASA Technical Reports Server (NTRS) 19930005696: Preconditioned conjugate-gradient methods for low-speed flow calculations

NASA Technical Memorandum 105929 - p,/3 AIAA-93-0881 ICOMP-92-22 _ _ _ 5_ _ _ _ • Preconditioned Conjugate-Gradient Methods _ for L0w-Speed FlowCalculations Kumud Ajmanil Institute for Computational Mechanics in Propulsion Lewis Research Center Cleveland, Ohio Wing-Fai Ng -- -_-_ -_: _ _ _ _ Department of Mechanical Engineering Virginia Polytechnic Institute and State University Blacksburg, Virginia ........ --_--------_ .............. _ .................... and Meng-Sing Liou National Aeronautics an d Space Adm_istration Lewis Research Center Cleveland, Ohio ! Prepared for the 31 st Aerospace Sciences Meeting and Exhibit .......... :,--_onsored b37ih_ Amefican institute of Aeronautics and=)_stronautics- .............................. .................... Reno_ Nevada, January 11-11; I993 + . --it .... £=k=q£ ..--_-._S 517____]____--=L_;== =--L._/_;- ..S ? ._- ................................................ CONJUGATe-GRADIeNT ME-THODS FOR LOW-SPEED FLOW CALCULATIONS (NASA) "_- =....... _ 13 p G_ _ /34 0136516 E PRECONDITIONED CONJUGATE GRADIENT METIIODS FOR LOW SPEED FLOW CALCULATIONS Kumud Ajmani* Institute for Computational Mechanics in Propulsion NASA Lewis Research Center, Cleveland, Ohio and Wing-Fai Ngt Department of Mechanical Engineering Virginia Polytechnic Institute and State University, Blacksburg, VA and Meng-Sing Lion _t Internal Fluid Mechanics Division NASA Lewis Research Center, Cleveland, Ohio Abstract able popttlarity as tools for solving the large sys- tems of simultaneous, linear equations that appear An investigation is conducted into the viabiUty of at each time-step of an implicit time-integration using a generalized Conjugate Gradient-like method scheme. However, these conventional solvers exhibit as an iterative solver to obtain steady-state solu- several documented weaknesses 3, namely, (i) con- tions of very low-speed fluid flow problems. Low- vergence of LGSR depends on the choice of the speed flow at Mach 0.1 over a backward-facing step relaxation parameter (under-relaxation is often re- is chosen as a representative test problem. The un- quired for stability), (ii) LGSR and AF encounter steady form of the two-dimensional, compressible a maximum time-step restriction (even in an im- Navier-Stokes equations is integrated in time using plicit formulation) in complex flow problems, and, discrete time-steps. The Navier-Stokes equations (iii) convergence of AF is sensitive to the time-step are cast in an implicit, upwind finite-volume, flux employed in the time-integration. These weaknesses split formulation. The new iterative solver is used hmit the applicability of these solvers, and they are to solve a linear systems of equations at each step particularly inefficient when applied to 'stiff' prob- 'of the time-integxation. Preconditioning techniques lems (e. g. low-speed flows). are used with the new solver to enhance the stability The past few years have seen a resurgence of and convergence rate of the solver, and are found interest in using other methods like the Precondi- to be critical to the overall success of the solver. tioned Conjugate-Gradient Method 4 (PCGM), and A study of various preconditioners reveals that a its generalizations, as iterative solvers. The earli- preconditioner based on the Lower-Upper Succes- est instances of the application of PCGMs to fluid sive Symmetric Over-Relaxation iterative scheme is flow problems are found in the works of Wong and more efficient than a preconditioner based on In- Hafez 5. The work in reference 5 used PCGMs to complete L-U factorizations of the iteration matrix. The performance of the new preconditioned solver solve the potential flow equations for calculations is compared with a conventional Line Gauss-Seidel of flow over transonic airfoils. Wigton et. ale used Relaxation (LGSR) solver. Overall speed-up factors the GMRES r algorithm to improve the robustness of 28 (in terms of global time,sieps required to con- and convergence of existing CFD codes. Reference 6 verge to a steady-state solution) and 20 (in terms of used GMRES to obtain efficient potential and Euler solutions for subsonic and transonic flow over air- total CPU time on one processor of a CRAY-YMP) are found in favor of the new preconditioned solver, foils. The work of Venkatakrishnan s has provided when compared with the LGSR solver. positive evidence of the viability of preconditioned GMRES for the compressible Navier-Stokes equa- tions. Reference 8 used PCGMs to obtain viscous Introduction solutions for subsonic and transonic flow over air- foils. Conventional iterative solvers like Line Gauss- Seidel Relaxation 1 (LGSR) and the Approximate Inspire of the efforts mentioned above, the fam- Factorization 2 (AF) scheme have enjoyed consider- ily of PCGMs has not attracted the full attention of researchers working in the area of algorithm devel- opment for linear system solvers. (i) The power of * Research Associate, AIAA Member generalized CGMs (like GMRES) has not been har- t Professor, Senior AIAA Member nessed to develop a common computational tool for low-speed /or incompressible ) and compressible flow Senior Research Scientist, AIAA Member -- the regime of low-speed flows is virtually unex- Copyright @ 1992 by K. Ajmani. Published plored. (ii) The issues of how and when to terminate by American Institute of Aeronautics and Astronau- the preconditioned GMRES solver need to be exam- tics, Inc. with permission. ined. The termination criteria determines the num- berofsub-iterateast each global iteration (or time- coordinates, which results in step), and directly affects the storage requirements and computational efficiency of the solver. (iii) The performance of preconditioned GMRES (and other 7 0-7+ _ + a_ = a_ (2) related algorithms) needs to be documented against the performance of conventional solvers for upwind schemes, particularly for incompressible flow prob- where lems. None of these issues has been examined in detail in the above mentioned references 5-8, or else- O =Gop,_,p., peo]r ; ] : &ny- eyn_ where in the literature. This paper has endeavored to examine the is- sues listed in the above paragraph. An attempt has J J ' j J thus been made to fill a gap in the existing literature by conducting an investigation of the applicability of preconditioned generalized Conjugate Gradient like d_= _ methods, for solving flow problems in the low-speed regime. This paper also serves to complement the where (_,_,_,77_ are the metrics of the transforma- earlier work of the authors _, where the regime of in- tion. In addition, terest was transonic and hypersonic flow problems. The success of preconditioned CGMs was clearly F =F(Q) = [pu,p,_+p,puv,(_o +p)a]r demonstrated in reference 9 for compressible flow problems. This encouraged the authors to investi- gate the possibility of using preconditioned CGMs G : G(Q) : [pv,p,,v,pv2+v,(_0 +p),,]r for solving incompressible flow problems, with the goal of developing a universal code for incompress- ible and compressible flows. p=(7-1)[peo-p(U'_v-'-)] A full and more eomptete description of the the- ory will follow this Introduction in the next section. _%, = O, _ = at un + aav,7, _, = ctzu, I+ azv n The governing equations of fluid flow, the Conjugate Gradient Methods, and the details of their imple- mentation in this research will be presented. The importance of preconditioning, and some possible = 2 ,u ". _- _Pr(7- 1) preeonditioners will be discussed. The results of the application of the new solver to a low-speed flow problems will be presented and compared to al=l-b_, 1_ a2 = l +'_-1j-,_/y2 aa -- 13_/_j7/_ results using a conventional line relaxation algo- rithm (i.e., LGSR) on the same test problem. The The specific heat ratio, % is taken to be 1.4. results demonstrate the effectiveness of using pre- The molecular viscosity is given by #, a is the speed conditioned, generalized CGMs as iterative solvers. of sound and ReL is the Reynolds number per unit The final section enumerates the conclusions drawn length. Nondimensionalization is with respect to from this work. the freestream density and velocity. The physical coordinates (x,y) and viscosity are nondimension- alized by a reference length L and the molecular Presentation of Theory viscosity of the freestream, respectively. It must be remarked that, in this research, no correction has Napier-Stokes Equations been made to the compressible flow equations to account for the incompressible effects of low-speed The governing equations of compressible fluid flow. This is expected to affect the overall perfor- flow in 2-D are the Napier-Stokes equations written mance of the code (i. e. slower convergence rates for as all solvers). However, the quality of the computed solution is not affected, and this is borne out by the excellent comparisons obtained with experimental aQ aF aG_ OF. aG, 0-7+ b_ + oy a_ + a--_ (1) data. In short, the compressible flow equations are used 'as is' for low-speed flow, and this is consis- tent with the goal of developing a common compu- In this research, the thin-layer form of the above tational tool for all regimes of flow (incompressible equations is used, i.e. all viscous terms which in- and compressible.). volve a gradient in the streamwise direction are ne- The governing equations are solved computa- glected. The use of the thin-layer equations is jus- tionally in their integral, conservation law form, us- tified because the coarseness of the computational ing a cell-centered finite volume formulation. In- mesh used in this research precludes the resolution viscid flux terms are upwinded using Van Lear's 1° of the viscous fluxes in the streamwise direction. fiu_x-splitting scheme. The thin-layer viscous fluxes The thin-layer equations are transformed from are evaluated with second order accurate central dif- cartesian coordinate form (z,y) to generalized (_,n) ferences. Equation 2 can be rewritten in compact form The classical Conjugate Gradient Method (CGM) was proposed by Hestenes and Steifel 4, to solve the as system Az = b,where A is a symmetric and positive definite (SPD) matrix. The idea of using CGMs as 10q _ -R (3) J 0t iterative methods was first discussed by Reid it. The condition that A has to be an SPD ma- where the right-hand-side vector is written as trix is seemingly restrictive because it is difficult to guarantee an SPD matrix for general fluid flow prob- lems which incorporate the Navier-Stokes equations. aP aO ad_, However, the method can be generalized for solving R= R(Q) : _ + o. o,7 linear systems where the coefficient matrix is not symmetric and/or positive definite. Several gener- R is called the residual, and equals to zero for a alizations have been proposed in the literature, par- steadystate solution. The Euler implicit discretiza- ticularly for non-symmetric systems, by Saad and tion of equation 3 in time gives Sehultz r, Young and Jea 12, and Axelsson 13,to name a few. The particular generalization used in this paper is the Generalized Minimal RESidual (GM- Aq" _ _R.+I (4) RES) method of Saad and SchultzL As the name JAr of the method suggests, GMRES seeks to mini- mize the norm of the computed residual vector, r", where AQ" is the incremental change in the cell- (r'_ - b- Az") at each iteration. centered values of the vector Q between the n + 1th The GMRES method is directly applicable to time level and the known nth time level, i.e. solve linear systems with non-symmetric coefficient matrices. The generalization is effected by a two /',q'_ = q"+_ - q" (s) step procedure -- the generation of a set of or- thonormal vectors from a given initial guess using Arnoldi's Method 14, and the solution of a minimiza- R'_+1 is linearized in time about the nth time level tion problem. Arnoldi's Method 14 uses tile well which results in known Gram-Schmidt algorithm for computing an orthonormal basis of vectors for the Krylov subspace I OR\ A " -R" K(A, vl, k) - span{v1, Avl,..., Ak-'vl }. tt It is possible to build linear system solvers for sparse matrices, with the help of the Arnoldi pro- cess. In order to solve the linear system Ax : b, we where I/JAr is a block-diagonal matrix and aR is calculate an approximate solution z_ of the form a large, sparse, block, banded matrix. zk = Zo + zk, where zo is some arbitrary initial Equation 6 can be rewritten as guess to the exact solution 2 (= A-lb). The vector zk lies in the Krylov subspace collectively defined V"AQ"=-R" (7) by A, ro (= b- Axo) and k, i.e., Equation 7 represents the system of linear simul- taneous algebraic equations that has to be solved z_ E K(A, ro, k) = span{ro, Aro,. ..,A_-t,'_),. at each global time-step in the computation. This = span{,'o, r,, ..., ,'_} system of equations can be solved by direct ma- trix inversion; however, this requires extensive com- These methods are referred to as Krylov subspace puter memory and computation effort at each time methods. step. Iterative schemes are attractive because of The iterative scheme based on Krylov subspace their computational efficiency and relatively mea- methods will be most successful when the iterate zk ger memory requirements. In this paper, a compar- minimizes the corresponding residual norm, Ilrk]l. ison is made between the efficiency of two iterative Mathematically, this is equivalent to a minimization solvers -- a conventional Line Gauss-Seidel Relax- of I[",,11over zk c K(A, ro, k), i.e., ation (LGSR) solver and a preconditioned general- ized conjugate-gradient type solver. Results of com- minllrkl I _ mini[(b-A=k)l I (10) parisons with the Approximate Factorization (AF) zh _/, solver may be found in reference 3, but are not pre- sented in this paper. The solution of this minimization problem forms the second part of the GMRES algorithm. Generalized Conjugate Gradient Methods The complete GMRES algorithm can be sum- marized as follows: Recall, that we are interested in solving the lin- 1. For any starting vector zo, form the initial vec- ear system of equations tor _o: b- A_o ; _ = 11_o11=; vl = _o/_. 2. Generate the basis of orthonormal vectors de- V=AQ" =-R = ¢_ Ax = b (8) fined in equation 9 3.Formtheapproximatseolution: (iii) Additional computer storage to store M -- a) Findthevectorzk which solves the mini- which may be of the order of storage requirements mization problem of equation 10 for the coefficient matrix A. The selection of an 'ettl- b) Compute zk = z0 + z/, cient' preconditioner is motivated by the minimiza- In summary, the GMRES method is a mini- tion of the afore-mentioned costs, and is considered mization process to solve linear matrix systems like crucial to the success of the preconditioned GMRES Az = b. The minimization proceeds as a sequence algorithm. of k sub-iterations, and the minimizer is obtained The cheapest preconditioner is the identity ma- by a simple upper-trlangular solve in step 3 of the trix (of appropriate rank). However, for M = 1, algorithm. One major practical difficulty with GM- the original, unprecondltioned iterative scheme is RES is that as k increases, both storage and oper- recovered! The costliest preconditioner results when ation cost increase as O(k) and O(k2), respectively. M = A. This is equivalent to performing a direct Hence, the number of sub-iterations has to be re- solver approach and inverting A to obtain z = A-lb stricted to minimize the cost of each global iteration. -- an approach which is unacceptable for the solu- This issue, which has not received much attention tion of large, sparse linear systems of equations. It is in the literature, is addressed later in this paper. apparent that a practical preconditioner lies some- Further mathematical details, implementation where between the two extremal choices of M - 1 techniques and convergence analyses of the method and M -- A. It is also evident that 3I should be in are contained in references 3 and 8. Since this pa- some sense "close" to A --so that 3I-_A is close to per is also concerned with convergence acceleration, the identity matrix. For effective preconditioning, it may be remarked that the speed of convergence this means that the eigenstructure of A should be of tile algorithm depends on the condition number close to that of the identity matrix [, i.e. A should of the matrix A (_2(A)), and the distribution of be well-conditioned. slngular-values of A (i.e. the spectrum of A). tc2(A) For practical implementations, the 'explicit' is defined as the ratio of the maximum to minimum computation of M -1 is not advisable because (i) singular-value of the matrix A. If _2(A) is large Even though 3I may be sparse (corresponding to and/or the spectrum of singular-values of A is wide and scattered, A is said to be poorly conditioned, the sparsity of A), M -1 may be a dense matrix. and convergence may be very slow. Precondition- Storage requirements for M-1 may thus far exceed ing is employed to improve the conditioning of the those for 3I, and, (it) If M is ill-conditioned (corre- coefficient matrix. The preconditioning technique sponding to ill-conditioning of A), computation of chosen may hence be critical to the success of any M -1 in computer arithmetic will be highly error- generalized CGM used to solve linear systems of prone. equations. In practice, 'explicit' preconditioning is re- placed by an equivalent but highly effective tech- nique called 'implicit' preconditioning, hnplicit pre- Preconditioning Techniques for CGMs conditioning transforms the 'explicit' problem of One of the most effective iterative methods for forming matrix-vector products of the type M-lu solving large, sparse linear systems of equations is (= fi) to an equivalent 'implicit' problem of solv- a combination of a generalized Conjugate Gradient ing a linear system for fi, with M as the coefficient like procedure with some appropriate precondition- matrix. This can be written as ing technique. Assuming that a preconditioning ma- trix M is used on the left of the original unprecon- 3I-lu = fi ¢_ Mfi- u (12) ditioned system, this involves solving the precondi- tioned linear system This linear system has to be solved for each sub- iteration of the preconditioned GMRES algorithm. M-tAx = _I-lb ¢_ _Ax = b (11) Thus, any matrix M which produces easy-to-solve (i.e., computationally efficient) linear systems of the instead of the original system Ax = b. type Mfi = u, is a potentially acceptable precondi- The motivation for preconditioning is twofold tioner. -- to reduce the computational effort required to In the light of the preceding discussions, a good solve the linearized system of equations at each preconditioning matrix M should time-step, and, to reduce the total number of time- i) produce an iteration matrix _1which is better steps (or global iterations) required to obtain a steady-state solution. Preconditioning will be cost- conditioned than A, and effective only if the additional computational work ii) produce easy-to-solve linear systems of tt,e form Mfi= u. incurred for each sub-iteration is compensated for by a reduction in the total number of iterations to On the basis of the above criteria, several pre- conditioners that can be derived from the coefficient convergence -- so that the total cost of solving the overall non-linear system is reduced. matrix A are diagonal, block-diagonal, incomplete The costs associated with preconditioning can L-U factorization (ILUF) and block-ILUF 1_ -- in be enumerated as (i) Computing the precondition- increasing order of computational cost. Iterative ing matrix M, (it) Matrix-vector multiplies or equiv- methods used in existing CFD codes can also be alent linear system solves associated with M, and, used as effective preconditioners 6. Some examples areLineGauss-SeidReel laxation(LGSR),spatial after k steps, in step 2 of the GMRES algorithm. ApproximateFactorization(AF),and variants of The use of a 'stopping criteria' provides a flexible, the LUSSOR scheme of Yoon and :lameson 16. rather than a fixed k. Thus, the issue is to choose an The choice of an effective, stable preconditioner (or accuracy value) for each global iteration which will provide a globally stable and efficient iterative is extremely important to the success of GMRES. scheme. For this work, several preconditioners were investi- During this research, it was observed through gated for their effectiveness and stability. An "effec- numerical experimentation that e can be related to tive" preconditioner is defined here as one that as- sists GMRES in obtaining rapid convergence, while the Courant number (A) of the time-integration by the following relation: requiring a minimal overhead cost for the precon- ditioning. A "stable" preconditioner is one that can be successfully computed from the matrix A E=0.5 0<A<10 (e. g. the computation of the ILUF of A may be 1 (13) unstable if A is severely ill-conditioned). Diago- - A> I0 nal preconditioners -- both scalar and block ver- logl0(A 2) sions -- were found to be 'unstable' when applied to the test case in this paper. LGS relaxation and The use of the above criteria for _ provides a stable spatial AF, representing the use of existing flow GMRES scheme for the low-speed test case in this solvers in CFD codes as preconditioners, were found research. It is also valid for both the precondition- to be 'stable' preconditioners. They were how- ers (LUSSOR and Block ILUF) used with GMRES. ever not as 'effective' in accelerating convergence as It should be remarked that the use of equation 13 to the LUSSOR (Lower-Upper Symmetric Successive specify _as the stopping criterion successfully limits Over-Relaxation) or the BILUF (Block Incomplete k to values below 10, for values of A upto 1000 (as Lower-Upper Factorization with zero fill-in) precon- used in this research). This criteria was also suc- ditioners. LUSSOR and BILUF were thus selected cessful when subsequently applied to the transonic as primary preconditioners for use with GMRES and hypersonic flow test cases of reference 9. in this research. The original LUSSOR scheme of Some issues related to storage requirements are reference 16 was modified, and then incorporated in now discussed. Recall that the global implicit ma- a block version, for use as a preconditioner in this trix has a block, banded structure with a known research. Complete details of LUSSOR and BILUF sparsity pattern. All results presented in this work preconditioning may be found in reference 3. involve a first-order accurate discretization of the implicit operator, thus creating an implicit matrix with five well-defined diagonals. Hence, for a prob- _Implementation of Preconditioned GMR_E_S lem size of N (i.e., N grid points), storage corre- sponding to 5*N blocks (each of size 4*4) is required In this paper, the preconditioned GMRES al- for the implicit coefficient matrix. This storage is gorithm has been implemented for solving the sys- required for both the solvers (GMRES and LGSR) tem of equations represented by equation 8. The tested in this research. The use of precondition- coefficient matrix A is non-symmetric, banded and ing in conjunction with the GMRES solver neces- sparse. The knowledge of the sparse, banded struc- sitates the storage of the corresponding precondi- ture of A is exploited by storing only the five (for a tioning matrix, M. The additional storage depends first-order left-hand-side implicit formulation) non- on the particular preconditioner, and varies from N zero block diagonals of the matrix -- each element blocks (for block diagonal and LUSSOR precondi- in a diagonal being a 4,4 block matrix. All matrix- tioning) to 5*N blocks (for block ILU precondition- vector multiplications are performed using the algo- ing with no fill-in). rithm of Madsen et. alIt, which uses only the diago- Recall, that each linear system solve with GM- nals of the sparse matrix to effect the multiplication. RES requires k sub-iteratlons, which translates to This algorithm is used since it is Consistent with£he k *N *4 additional storage locations for the sub- storage scheme for the matrix A in this paper. iterate vectors. As k becomes large (for high As described earlier, GMRES solves the sys- Courant numbers and stiff problems), the storage tem of linear equations at each global iteration by cost associated with GMRES may become signifi- performing a number of sub-iterations within each cant. In this research, it has been demonstrated global iteration. For the overall efficiency of the that the use of equation 13 establishes an upper preconditioned GMRES solver, the number of sub- limit of k = 10. The corresponding storage of iterations, k, needs to be 'limited', at each time- size 40, N locations has been extracted from ex- step. One approach often adopted is to fix k6,s, isting 'temporary' storage in the code, i.e., storage i.e., perform a predetermined (user-specified) num- is shared by GMRES and other subroutines in the ber of sub-iterations at each time step. A different code. Hence, the need for additional storage for the approach, which is used in this research, is to as- sub-iterate vectors has been successfully eliminated sign a 'stopping criteria' based on the reduction in in the implementation of preconditioned GMRES in the norm of the initial residual vector of the linear this research. system, i.e., to terminate the sub-iterations when lrkll/llr0ll <_e, where _ < 1. In practice, this trans- ares to truncation of the orthogonalizatlon process Test Results and Discussion non-linear residual is sufficient to obtain solutions within the limits of engineering accuracy. Hence, The problem of computing low-speed flow over there seems to be no practical justification in im- a backward-facing step was selected to demonstrate posing a seemingly strict criteria of a ten orders- the effectiveness of preconditioned GMRES over a of-magnitude reduction for the residual. However, conventional Line Gauss Seidel Relaxation (LGSR) there are several instances when this strict criteria solver. This flow problem illustrates the phenom- is required, and often necessary. ena of flow separation and recirculation in internal flows. Extensive experimental and theoretical in- Firstly, if the initial guess to the solution is close to the true solution, then a greater (than 2- vestigations have been performed for this flow by 3 orders) reduction in the residual is necessary to Armaly at. alls. provide the correct steady-state solution. Such in- All computations are done for laminar flow and stances arise when, say, a small perturbation (by have been performed using the optimum vector pro- changing the set of boundary conditions or the grid) eessing facilities on a single processor of a Cray- is applied to a known solution (on a particular set of YMP/832. All flow variables are second-order ac- boundary conditions and grid). Secondly, very high curate, fully-upwinded in the _ direction, and third- residual reductions are often required when results order accurate, upwind-biased in the r/ direction. from CFD codes are used as inputs to perform opti- All boundary conditions are first-order accurate. mization of design parameters. In such cases, results Conserved variables are used for interpolation of to engineering accuracy are unacceptable since they cell-centered values to cell-face values. The implicit may introduce large errors while evaluating sensi- left-hand-side), global operator is discretized in a tivities of critical design parameters. Thirdly, if the rst-order accurate manner. This is done to assure underlying non-llnear problem is stiff, then a 2-3 or- stability of the LGSR solver, and to save on storage ders residual reduction may not be sufficient, even and computational costs at each global iteration. to provide results to engineering accuracy. This All boundary conditions are linearlzed consistently, has been clearly illustrated by the results of the and are included in the implicit coefficient matrix. backward-facing step test case in this research. The computation is started with freestream flow as the initial guess. The relaxation parame- One of the issues that can determine the suc- ter (w) for LGSR is set equal to one. It must be cess of a solver is the amount of computational or remarked that w > 1 (i.e., over-relaxation) is not Central Processing Unit (CPU) time required by the a stable choice for the LGSR solver for the partic- solver to perform each global iteration. The CPU ular test ease investigated here, and w < 1 (i.e., time of an algorithm can be heavily influenced by under-relaxation) can only serve to slow down the the skills of the individual programmer. Efficient convergence rate. Alternate sweeps in both the hor- implementations often require investments in 'real' izontal and vertical directions are used for stability time and patience by the programmer. In addition, of the LGSR solver. an implementation on one particular machine may Excellent qualitative comparisons with experi- run faster or slower on another machine, i.e., the mental data have been obtained for the test problem CPU time may be machine dependent. Matters are examined. However, it is not the intent of this work further complicated by factors like compiler opti- to present detailed comparisons of computational mization, vectorization and paraUelization of the al- and experimental data. Detailed code validation re- gorithm. suits have been presented for the code being used in All the test results presented in this research this research in an earlier paper 19. The results pre- were obtained with vector implementations of algo- sented in this work stress the relative performance rithms executed on a computer with vector process- of the solvers and preconditioners being examined, ing facilities. Hence, the CPU time comparisons of which is consistent with the overall goals of this pa- the preconditioned GMRES and LGSR solvers (and per. the comparisons of preconditioners) do account for the intrinsic differences in their levels of vectoriza- Performance Indices for Comparisons of Solvers tion. Parallel and distributed computing can also afford rapid reductions in the 'turn-around' time of A standard and widely accepted approach to an algorithm. This research has focussed on vec- jndse the performance of linear system solvers is to torization issues in detail, and no efforts have been study the convergence rate provided by the particu- presently made to parallelize any of the solvers or lar solver. The idea is to examine the rate of conver- preconditioners. It is however well accepted that gence of the global non-linear problem as a function Conjugate Gradient llke algorithms can be success- of the number of global iterations (or time steps or fully adapted for use with parallel architectures 2°. solution updates). In this research, the compari- son of solvers is based on the number of iterations It may be remarked that the number of iter- required to reduce the 12norm of the residual vec- ations to convergence is completely independent of tor (normalized by the norm of the initial residual all the factors that influence the CPU time, and can vector) of the non-linear problem by ten orders-of- thus be said to reflect the true convergence charac- magnitude. teristics of a solver. The CPU time comparisons It is often argued, and apparently correctly are useful, and in conjunction with the number of so, that a 2-3 order-of-magnitude reduction of the iterations comparisons, provide an estimate of the practical utility of a solver. This, in turn, created a considerable challenge for the specification of the inflow boundary condition at the step. Details of Numerical Computations According to characteristic wave theory, the proper specification of boundary conditions for sub- Armaiy et. alls have presented detailed mea- sonic inflow requires one physical boundary condi- surements of velocity distribution and reattachment tion and three numerical boundary conditions to be length for the incompressible flow of air downstream satisfied. Note that only one physical quantity can of a single backward facing step in a 2-D chan- be fixed, and the complete specification of a fixed nel. The results show that the various flow regimes inflow (parabolic in this case) velocity profile re- (in the Reynolds number range of 70< Re <8000), quires both components of velocity (u and v) to are characterized by typical variations of separation be fixed. Hence, it seems unlikely that a purely length with Reynolds number. The Reynolds num- parabolic velocity profile, that will remain fixed as ber is based on twice the height of the inlet channel, the time-integration proceeds, can be maintained at and two-thirds of the maximum inflow velocity at the inflow boundary. the step. The particular test cases chosen for this = One approach often used in external flow cal- research corresponds to Re < 400, since the exper- culations is to specify a variation of stagnation en- imentai data suggests that Re > 400 produces 3-D variations or turbulence in the flowfield. thaipy at the inflow, which in turn provides the de- sired velocity profile. This, however, did not prove The numerical computations are performed on to be a stable inflow boundary condition for this in- an R-mesh with 61 and 51 points in the _ (stream- ternal flow case. Severai other variations of the in- wise) and 71(normal) directions, respectively. The flow boundary condition were attempted, but none grid is shown in Figure 1. Grid points are clustered, both in the normal and streamwise directions, to of them yielded satisfactory results. Some of these variations are listed below. Note, that the stagna- resolve the various viscous gradients and the reat- tion enthalpy and entropy are fixed, for all these tachment point of the separated flow. A freestream variations. The subscripts band, refer to boundary Macli number of Moo = 0.1 is specified. Four differ- and interior values, respectively. ent cases with Reynolds number of Reoo = 100,200, 300 and 389 are computed. The thin-layer approxi- Specify profile of u velocity ; vb = vi mation to the Navier-Stokes equations is used. The full Navier-Stokes terms are not included during the Specify profile of u velocity ;Pb = Pi total computations because the coarseness of the grid in iiil_ SSppeecciiffyy pprrooffiillee ooff Mach vneulomcbiteyr ;;vtb,b== '[Ivi_ the streamwise direction will prevent resolution of vb _ 0 ; Ub _ Ui these terms. Moreover, as will be shown later, ex- vi) vt,=O;pb----pi The problem is resolved by imposing a profile tremely accurate physical results are obtained with of Reimann invariants at the inflow boundary. The the thin-layer Navier-Stokes terms. velocity profile obtained with this boundary con- dition does vary in time, but is sufficiently stable BACK_,VARD-FACI NG STEP to simulate the incoming flow in an accurate man- 61x51 GRID ner. Since the Reimann invariants act as a non- refecting boundary condition 21, their use helps in rapidly eliminating the initial transients in the so- Ip+tm+P,s-m__71:_+_+++JlLUf_li_++_t,__+___+++w_o_-++w_ lution, as will be shown later by the convergence history results. Figure 1. Computational Grid Figure 2shows Mach number contours obtained from the computations on the 61.51 grid for the Re = 389 case. The nature and size of the sep- Adiabatic, no slip boundary conditions are used aration and recirculation behind the step closely on the top and bottom walls forming the boundaries matches the physical description of the flow as ob- of the channel, and on the lower portion (which de- fines the step) of the inflow boundary. For fully tained in the experiments of Armaly at. aP s. Sim- developed subsonic flow at the outflow boundary, ilar results were also obtained for lower Reynolds three variables (p, u and v) are extrapolated and numbers of Re = 100, 200 and 300. the pressure is determined by fixing the stagnation MAClt NIIMBER enthalpy. It must be remarked that this particular RANGF_'_)0, 0.1 outflow boundary condition performed considerably INTERVA].-0003 better than a fixed static pressure condition at the outflow. •+ The computational code used in this research is designed to perform single-grid calculations only. ttenee, it is not possible to simulate the entire exper- imental setup of reference ts, as this would require a Figure 2. Mach Number Contours minimum of two separate grids. It was thus decided It should be remarked that the reattachment to omit the inlet channel (before the step) in the computations, and simulate its effect by imposing a point is the primary flow feature used to charac- fully-developed profile for laminar flow, at the step. terize the flow in the experimental data is. For Re = 389, the experimental results suggest a down- _n - 1o0 stream reattachment length (XR) to step height (S) ratio of Xn/S : 7.9_ The numerical computation predicts XR/S : 7.95. The corresponding com- puted values for Re : 100, 200, and 300 are 2.97, 5.09 and &58, respectively. The experimental val- ues are 3.1, 5.2 and 6.7, respectively. A comparison of experimental and numerical reattachment lengths is presented in figure 3. It can be seen that the _ti_O_t_ _/_ _ _/i_ 05 experimental and numerical reattachment lengths are close to each other. This represents particu- larly good numerical solutions, in light of the fact that onset of reattachment is more difficult to pre- dict than onset of separation. The reattachment point for each case is confirmed by inspecting the velocity profiles for each case. An example set of velocity profiles is shown in figure 4 (for Re : 389), Velocity profiles obtained from numerical solutions are compared with available experimental data in Figure 5. Velocity Profiles for Re 100 figure 5 (Re : I00) and figure 6 (Re : 389). The differences between the experimental and computed profiles can be attributed to the excessive numerical diffusion introduced by the Van-Leer flux-splitting scheme used in this research. Recent tests with the flux-splltting scheme of Liou and Steffen 22show much better agreement with experimental data. BACKWARD FACING STEP 10 Mach Number=0.1 Grid Size=61 x51 oi _" 8 / / x ol/ os ./ / g / / / oz / E 4 Go / 9' Figure 6. Velocity Profiles for Re : 389 2 CE 0 Rcf.[18 tioning (GMILU) and the Line Gauss-Seidel Relax- o ,t I I I ation (LGSR) solver. The 'logarithm of the lz norm o 100 200 500 400 500 of the residual' has been plotted against the 'number Reynolds Number (Re) of global iterations' (or time steps). It can be clearly Figure 3. Reattachment Length vs. Reynolds No. seen that both GMLUS and GMILU converge at a much faster rate than the LGSR solver: The 'correct' physical solution (i.e., the proper VELOCITY BACKWARD-FACING STEP reattachment point) is obtained after a six order re- SEPARATION REGION duction of the residual is achieved. The rapid reduc- tion in the initial residual (from zero to three orders) _ Lr',r4i',, .'i,.t-!l_t4l!!l!l!l!!!!!t!t_t!tI't_!itii!:i!itti!til!i; : i : ', ' i : ', i !'. is a result of using the Reimann invariants at the in- flow boundary (as discussed earlier). This rapid re- duction represents elimination of part of the initial transient. The maximum (or asymptotic) Courant Figure 4. Velocity Vectors for Re : 389 numbers ($) for GMLUS, GMILU and LGSR are 200, 200 and I0, respectively' "I_e max/mum ,_for LGSR is the value of $ above which LGSR becomes Comparisons of Solvers and Preconditioners unstable. The maximum $ for GMLUS and GMILU is the value of A above which the number of sub- Figure 7 presents the convergence history com- iterations exceeds 10. Curves A (for GMLUS) and B parisons between GMRES with LUSSOR precondi- (for GMILU) are generated with an initial _ = 100, tioning (GMLUS), GMRES with BILUF precondi- with an increase to _ = 200 after 100 globai itera-

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.