ebook img

Parallel Solution of the Linear Elasticity problem with Applications in Topology Optimization PDF

0.3 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Parallel Solution of the Linear Elasticity problem with Applications in Topology Optimization

PARALLEL SOLUTION OF THE LINEAR ELASTICITY PROBLEM WITH APPLICATIONS IN TOPOLOGY OPTIMISATION J. TURNER‡, M. KOCˇVARA†, AND D. LOGHIN‡ Abstract. In this paper, we aim to solve the system of equations governing linear elasticity in parallel using domain decomposition. Through a non-overlapping decomposition of the domain, 5 our approach aims to target the resulting interface problem, allowing for the parallel computation 1 of solutions in an efficient manner. As a major application of our work, we apply our results to 0 thefieldoftopologyoptimisation,wheretypicalsolversrequirerepeatedsolutionsoflinearelasticity 2 problemsresultingfromtheuseofaPicardapproach. n a Key words. Linear elasticity, topology optimisation, domain decomposition, preconditioning, J Krylovmethods 8 2 1. Introduction. Consider a solid elastic body occupying an open and con- nected domain Ω ⊂ Rd with Lipschitz boundary ∂Ω = ∂ΩD ∪∂ΩN, where clamping ] C and traction are imposed on ∂ΩD and ∂ΩN respectively. Under the application of O both body forces f: Ω → Rd and boundary tractions g: ∂ΩN → Rd the material is subject to deformation so that a given reference point x of the initial undeformed . h material is translated to the vector x′ = x+u(x) of the deformed material, with u t a denoting the displacement. Through the assumption of linearly elastic material be- m haviour, the governing equations for u correspond to the following mixed boundary [ value problem 2 (1.1a) Lu..=∇·σσσ(u)=f, in Ω, v 1 (1.1b) σσσ(u)=E:ǫǫǫ(u), in Ω, 1 (1.1c) u=0, on ∂Ω , 2 D 6 (1.1d) σσσ(u)·nˆ =g, on ∂ΩN, 0 1. In the above, the strain caused as a result of the displacements u is characterisedby 0 the symmetric linearised strain tensor 5 1 1 ∂u ∂u v: ǫǫǫ(u)={ǫij(u)}di,j=1, ǫij(u)= 2 ∂xi + ∂xj . (cid:18) j i(cid:19) i X Additionally, nˆ corresponds to the unit outward pointing normal vector on ∂Ω and ar E ..= E(x) denotes the fourth order elasticity tensor, describing the elastic stiffness of Ω as a result of the load placed upon it. Wewillconsiderthecasewhereourbodyconsistsofoneormoreisotropicmateri- als(i.e: rotationalanddirectionalindependence). Equation(1.1b)describingHooke’s Law can be written as σσσ(u)=2µǫǫǫ(u)+λˆtr(ǫǫǫ(u))Iˆ, †School of Mathematics, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK, and Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod vod´arenskou vˇeˇz´ı 4, 18208 Praha 8, Czech Republic. The work of this author has been partly supported by the EU FP7 project AMAZE and by grant A100750802 of the Czech Academy of Sciences ‡SchoolofMathematics,UniversityofBirmingham,Edgbaston, BirminghamB152TT,UK 1 2 ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation where Iˆ represents the identity matrix of appropriate size, tr(A) denotes the trace of a matrix A and both µ and λˆ correspond to Lam´e constants defined in the usual manner µ..= E¯ , λˆ ..= νE¯ , 2(1+ν) (1+ν)(1−2ν) with E¯ > 0 corresponding to Young’s modulus and −1 < ν < 1/2 the Poisson ratio. We now look to apply domain decomposition to the problem (1.1). To do this, we divide our domain Ω into N nonoverlapping subdomains Ω with local boundaries i ∂Ω and outer unit normals nˆ . We denote by Γ the resulting skeletal interface i i Γ = ∪N Γ where Γ ..= ∂Ω \∂Ω and by I ..= Ω¯\Γ the set of interior nodes, with u =..i=u1. iAssumingithattheirestrictionofu tocomponents ofthe skeletalinterface |Ωi i i u is known, problem (1.1) is equivalent to the following set of subproblems i |Γi Lu =f , in Ω , i i i u =0, on ∂Ω ∩∂Ω , i D i (1.2)  σσσ(ui)·ni =gi, on ∂ΩN ∩∂Ωi, u =λ , on Γ , i i i  (1) (2) where i = 1,...,N. By writing u = u +u , we look to describe an appropriate i i i interface operator that will allow problems to be decoupled and thus solved strictly on subdomains in parallel. Through the definition of matrix extension operators H i that map interface data to relevant subdomains via u(2) =H λ, the system (1.2) can i i be decoupled into the following 2N +1 subproblems Lu(1) =f , in Ω , i i i u(1) =0, on ∂Ω ∩∂Ω , (1.3a)  i D i σσσ(u(i1))·ni =gi, on ∂ΩN ∩∂Ωi, (1) u =0, on Γ . i i (1.3b)  N σσσ(H λ)·n =− N σσσ(u(1))·n , on Γ. i i i i ( i=1 i=1 X X (2) Lu =0, in Ω , i i u(2) =0, on ∂Ω ∩∂Ω , (1.3c)  i D i σσσ(u(i2))·ni =0, on ∂ΩN ∩∂Ωi, u(2) =λ , on Γ . i i i The associated weak form to problem (1.3b) is referred to as the Steklov-Poincar´e equation, where the so-called Steklov-Poincar´e pseudo-differential operator S: Λ → θ Λ′ defined in the following manner [11] θ N N s(λ,η)=hSλ,ηi..= (σσσ(H λ )·n )η ds =.. hS λ ,η i, i i i i i i i i=1(cid:20)ZΓi (cid:21) i=1 X X where λ,η ∈ Λ and λ =.. λ , η =.. η . The space Λ is chosen to be a suitable θ |Γi i |Γi i θ fractionalSobolev spaceofindex θ basedonthe boundaryconditions ofthe problem, dependent on the intersection of Γ with ∂Ω [6, 10]. ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation 3 2. Matrix Formulation. Through a finite element discretisation of the weak formulation,itcanbeshownthatthediscreteformulationtotheoriginalproblem(1.1) requires the solution of a matrix-vector system [11]. By distributing nodes based on their location within the domain, we can view this system as follows K K u f (2.1) Ku= II IΓ I = I =f, K K u f ΓI ΓΓ Γ Γ (cid:18) (cid:19)(cid:18) (cid:19) (cid:18) (cid:19) where N (2.2) K ..= K , u=(u ,u )T ∈Rn=nI+nΓ. II IiIi I Γ i=1 M In comparison, the corresponding matrix formulations for each of the discrete weak formulations to the problems presented in (1.2) can be written down as K u(1) =f , II I I (2.3)  SuΓ =fΓ−KΓIu(I1), KIIu(I2) =−KIΓuΓ,  T T with globalsolution u= u(1),0 + u(2),u . In the above, the matrix S corre- I I Γ sponds to the Schur comp(cid:16)lement(cid:17), and(cid:16)so the di(cid:17)scretisation of the decoupled problem (1.2)canbeviewedasaSchurcomplementapproachtothediscretisationoftheglobal problem (1.1). Using (2.1) and (2.2), we are able to view (2.3) in terms of 2N +1 subproblems in the following way K u(1) =f , i=1,...,N, IiIi Ii Ii N (2.4)  Su =f − K u(1), K u(2Γ) =−ΓK Xi=u1 ,ΓIi Ii i=1,...,N. IiIi Ii IiΓ Γ We therefore look to solve (2.1) by exploiting the potential for parallelisationpresent in (2.4). 3. Preconditioning. The systems we expect to solve will typically be both sparseandlargescale,duetotheexpectedfinenessofthefiniteelementdiscretisation required in modern design processes, allowing for the computation of resolute solu- tions. Thisisofparticularimportancefordomainscontainingsharpjumps,occurring for instance due to predefined fixed or void regions. Therefore, it is appropriate to consideriterativesolutiontechniqueswhensolvingsystemsofthe form(2.1). Forour problem, we will consider GMRES [8] for reasons to be described below. InordertoavoidthedirectconstructionandapplicationoftheSchurcomplement matrix, and to improve the spectral properties of the system matrix, we seek an appropriate preconditioner for the system (2.1). Through the following choice of P, we see that K K Iˆ 0 P = II IΓ , KP−1 = II 0 S K K−1 Iˆ (cid:18) (cid:19) (cid:18) ΓI II ΓΓ(cid:19) 4 ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation The minimum polynomial of KP−1 is (λ − 1)d, suggesting that iterative solution methods such as GMRES will converge in at most d iterations [5]. Based on this, we propose to precondition from the right with an approximation P of P as follows KP−1u˜ =f, u˜ =Pu, e where e e P = KII KIΓ , P−1 = KI−I1 0 IˆII −KIΓ IˆII 0 0 S 0 Iˆ 0 Iˆ 0 S−1 (cid:18) (cid:19) (cid:18) ΓΓ(cid:19)(cid:18) ΓΓ (cid:19)(cid:18) (cid:19) e e with S representineg an approximation to the discrete Steklov-Poincar´eopereator. We therefore seek a representation of S that is not only practical to invert but can also be seeen to provide an appropriate preconditioning strategy for the resulting interface problem. e The form of S chosen is based on work in [1], where discrete norm representa- tions for projections of the interpolation spaces Λ onto suitable finite dimensional θ subspaces are desceribed and analysed. Discrete norms of the form (3.1) H =M +M(M−1L)1−θ, θ ∈[0,1], θ are shown to be equivalent to their continuous counterparts on Λ , where M and L θ denote the mass and Laplacian matrices respectively assembled on the interface Γ. One particular example corresponds to θ = 1/2, which is shown in [1] to adhere to the same coercivity and continuity bounds as the discrete Steklov-Poincar´eoperator, leading to mesh independent performance of GMRES. The norms presented in (3.1) can be shown to be spectrally equivalent to H =M(M−1L)1−θ. θ Both of the above can be applied component-wise to a system, suggesting an appro- e priate form of S as d e (3.2) H = H . θ θ 1 M b e From the above, it is clear that fractional powers of matrices must be determined in ordertoapplythe discretenorms. Forrelativelysmallproblems,thiscanbeachieved throughdirectmethodssuchasageneralisedeigenvaluedecomposition. However,the complexity involved is O n3 suggesting instead the use of iterative approaches for Γ larger problems. In [1], approximations through the use of truncated Lanczos and (cid:0) (cid:1) inverse Lanczos algorithms are described, and will also be employedwithin this work through the use of flexible GMRES [7] to account for the changing nature of the preconditioner. 4. Results. We presentvarious results in this sectionto illustrate our approach in practice. It should be noted that while certain examples involve symmetric sys- tem matrices, our choice of non-symmetric preconditioner suggests GMRES as an appropriate Krylov solver. The test problem consideredinvolves a cantilever beam overthe 2D domain Ω= (0,2) × (0,1), with downward force f = 0.75 and outward traction g = 1. The domainwillbeclampedontherighthandsidethroughtheapplicationofhomogeneous ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation 5 g Fig.1. Originaldecomposed undeformedlayout (left)anddeformedlayout(right). Locationof clamping and traction illustrated by cyan and black nodes, respectively. Dirichletconditionsontherelevantboundary. Anillustrationofthedeflectionaswell as a pictorial example of a division of the domain (into 16 subdomains) is provided in Figure 1. As discussed in the previous section, iterative approaches will be used for the application of H . For this work, it was found that the inverse Lanczos approach θ deliveredthemostpromisingresults,largelyduetotherelativelysmallnumberofbasis vectorsrequiredbtoapplythediscretenorms. Table1illustratesourresultsfordiffering mesh parameters h and subdomains. The column labelled S = Iˆillustrates results for both test problems in the absence of interface preconditioning. By reading this column fromtop to bottom (for eachproblem), we observeaelogarithmic dependence on the number of GMRES iterations for increasing mesh parameters. Reading this column from left to right also suggests a logarithmic dependence on the number of subdomains. In comparison,the column labelledS =H of the table providesresults with the θ interfacepreconditionerasdiscussedin(3.2). Here,itcanbe seenthatthe numberof iterations are independent of the chosenemeshbparameter. Whilst there is a logarith- mic dependence onthe number ofiterationsfor anincreasingnumber ofsubdomains, adirectcomparisonwiththecolumnlabelledS =Iˆsuggeststhatourpreconditioning strategy provides significant savings in the number of iterations required for conver- gence. e ThefinalcolumnlabelledS =H illustratesresultsforselectedvaluesoftheta OPT basedontesting. Itwasfoundthattherecordedvalueswereabletoprovideimproved results over the other two coluemns,bsuggesting that different values of theta are able to provide a closer approximation to the decay of the associated Steklov-Poincar´e operator. S˜=I S˜=Hθ S˜=HOPT #Domains 4 16 64 4 16 64 4 16 64 b b θ - - - 0.5 0.5 0.5 0.5 0.6 0.7 h=1/32 28 47 68 12 18 27 12 17 22 1/64 41 66 96 12 19 27 12 18 23 1/128 59 93 137 12 19 27 12 19 24 Table 1 Results forthe test problem. Tolerance of GMRES set at 10−6. In order to observe the computational benefits of our method, we look to pro- vide rough estimates in order to gauge how our derived approach will perform in a parallel environment. Due to the non-overlapping nature of our approach, all subdo- 6 ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation main solves can be carried out in parallel. As mentioned previously, the main issue surrounds the solution to the resulting interface problem. Within each application of our preconditioner to this problem, we are required to invert the discrete interface Laplacian. This issue is present in the Lanczos process, and also in the subsequent generalisedeigenvaluedecompositionthatfollows. Duetothestructureofthismatrix, these inversions can lead to a computational bottleneck for an increasing number of subdomains, and so we would like to consider an iterative approach to alleviate this issue. The structureofthe involvedmatrixsuggestsconjugategradientas asuitable al- ternative,coupledwithanappropriatepreconditioningstrategy(PCG).Inthis work, we propose to precondition by using the relevant contributions of L restricted to Γ , i with the cross points removed to enable construction in parallel. The parallel CPU time takenfor eachGMRES iterationcan then be realisedby dividing the number of PCGiterationsmultiplied by the CPUtime takento apply the preconditionerby the total number of faces involved in the construction of Γ. By adding this contribution to the CPU time taken for one parallel subdomain solve, we calculate the total CPU time by multiplying the result to the total number of GMRES iterations required to achieve convergence. The results for the investigation are displayed in Table 2 where CPU times (in seconds)are providedfor differing meshand subdomainsizes. A Linux machine with an Intel(cid:13)R CoreTM i7 CPU 870@ 2.93 GHz with 8 cores was used to obtain the data. S˜=H OPT #Domains 4 16 64 256 b θ 0.5 0.6 0.7 0.75 h=1/16 0.0169 0.0168 0.0176 0.0254 1/32 0.0635 0.0273 0.0199 0.0232 1/64 0.4455 0.1238 0.0384 0.0238 1/128 3.8716 1.0804 0.2529 0.0623 1/256 50.2476 13.2858 3.3204 0.7295 Table 2 Total CPU times(seconds) anticipated through the use of parallel computing. From the table, it can be seen that for relatively coarse meshes, we do not see a significant enough decrease in the CPU time to warrant the use of parallelism. This behaviour can be attributed to the computational complexity of sparse matrix inversion (O(k2n), k is the bandwidth) for relatively small values of n, and also the efficiency of the backslash command in MATLAB. However, notable savings in CPU time equating to roughly factor 4 can be seen for finer meshes. These figures are encouraging, as they suggest that our approach is capable of significant speedup throughtheuseofparallelarchitecturewhencompareddirectlytosolvingtheproblem globally on a single processor. After collating the results in Table 2, a generalincreasewas notedin the number ofGMRESiterationswhencompareddirectlytothe figuresobtainedinTable 1. The reason for this can be attributed to the use of inner PCG iterations. In particular, a logarithmic dependence on the mesh parameter was observed for cases involving smaller numbers of subdomains. However, the deterioration can be seen as an ac- ceptable compromise,asthe results forlargermeshessuggestthe use ofanincreasing ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation 7 number ofsubdomainsforimprovedperformance. Itshouldbe notedthatthe results obtained above were done so with a relatively coarse tolerance for PCG of 10−3, as well as a reasonably modest number of PCG iterations (typically between 2 and 12) at each GMRES iteration for each of the test cases considered. It should not be expected that continual speedup can be gained through the use of an increasing number of subdomains, as certain factors such as inter-processor communication between each of the three steps will begin to play an important role. Therefore,intermsofaregularsubdivision,thiswouldsuggestanoptimaldecomposi- tionofthedomainbasedonthemeshparameter,andalsopossiblyothercontributing factors relating to computer hardware. 5. Topology Optimisation. Asanapplicationofourfindings,wewilldescribe how our workcan be incorporatedinto commonly used solversfromproblems arising in topology optimisation. The problem we consider here is the so-called Variable Thickness Sheet problem [2, pp. 54 – 57], which can be described mathematically using finite elements by the following nonlinear optimisation problem min fTu u,ρ m  (5.1) subject to: K(ρρ)ui ≤=Vf, K(ρ)=Xi=1ρiKi!, i∈D In theabove, the n−v0Xec≤toρr≤ofρnio≤daρl displace∀mie∈ntDva..=lue{s1,u2,=...u,(mρ)}.denotes the solution to the elasticity equations, with f ∈ Rn representing the corresponding dis- cretisation of the load linear form. The density ρ ∈ Rm is subject to upper and lower bounds ρ and ρ respectively, with the volume of the body being denoted by V. Additionally, K(ρ) represents the finite element stiffness matrix for the elasticity equations, with each K , i=1,...,m denoting elemental stiffness matrices. i By consideringthe methodofLagrangemultipliers, minima to (5.1)areobtained through a nonlinear system of equations. The nonlinearities can be dealt with using a number of commonly used approaches. For instance, one could consider the use of interior point methods [4]. The fairly standard solution technique used by the communityinvolvestheconsiderationoffixedpointtypeupdateschemesforaninitial guess for the density in the following way 1. Finite Element Analysis (FEA) solve equations of linear elasticity. 2. Density update e.g.: OC (see [2]), MMA (see [9]). 3. Check for convergence. If not satisfied, rerun 1 and 2 using updated density. It can be expected that a reasonably large number of fixed point iterations are re- quired to obtain a suitable final design. The bulk of computational effort will be concentrated on the Finite Element Analysis step, namely the repeated process of obtaining updated displacement variables through the use of the equations of linear elasticity[3]. Therefore,weproposetoapplyourpreconditioningstrategyasdiscussed inSection3tothisproblemcoupledwiththefairlystraightforwardOptimalityCrite- ria (OC) method for the density update. No attempt will be made here to carry out Step 2 above in parallel; however [3] describe an appropriate implementation using the Method of Moving Asymptotes (MMA). 8 ParallelSolutionoftheLinearElasticityproblemwithapplicationsinTopologyOptimisation #Domains 4 16 64 256 θ 0.5 0.6 0.7 0.75 h=1/16 10(10) 10(18) 10(33) 10(56) 1/32 17(11) 17(18) 17(34) 19(54) 1/64 23(11) 23(18) 24(32) 27(54) 1/128 29(12) 30(17) 32(31) 32(52) 1/256 33(13) 36(17) 37(31) 41(49) Table 3 Results for the cantilever beam problem solved using our preconditioning strategy for the FEA coupled withthe OCmethod for the density update. In Table 3, results are provided illustrating the performance of our approach for the cantilever beam problem. The results were obtained using an adaptive tolerance for GMRES based on successive compliance values. The total number of fixed point iterations are given, along with the average number of GMRES iterations per fixed point step (bracketed). Whilst the number of fixed point iterations appears to in- crease for finer meshes, the average number of GMRES iterations remains roughly constant. Whilst we still see a logarithmic dependence on the average number of GMRESiterationsforanincreasingnumberofsubdomains,the fixedpointiterations remain roughly constant. Future workinvolves validation of our approachon a parallelmachine, as well as consideration of further problems (possibly to include 3D domains) and alternative solutionmethodstotrytosolvetopologyoptimisationproblemscompletelyinparallel. We expect our approach to adapt well in parallel, with potential speedup for 3D problems of factor 8 anticipated. REFERENCES [1] M.ArioliandD. Loghin,Discrete Interpolation Norms with Applications, SIAMJ.Numer. Anal.,47(2009), pp.2924–2951. [2] M.P.BendsøeandO.Sigmund,TopologyOptimization: Theory,Methods,andApplications, SpringerVerlag,Providence,RhodeIsland,2003. [3] T. Borrvall and J.Petersson, Large-scale topology optimization in 3D using parallel com- puting, Computer Methods inAppliedMechanics andEngineering, 190(2001), pp. 6201– 6229. [4] R.H.W.HoppeandS.I.Petrova,Primal–DualNewtonInteriorPointMethodsinShapeand Topology Optimization, Numerical Linear Algebrawith Applications, 11(2004), pp. 413– 429. [5] I.C.F. Ipsen,A note on preconditioning nonsymmetric matrices,SIAMJ.Numer.Anal.,23 (2002), pp.1050–1051. [6] A.QuarteroniandA.Valli,DomainDecomposition MethodsforPartialDifferentialEqua- tions, Numerical Mathematics and Scientific Computation, The Clarendon Press Oxford UniversityPress,NewYork,1999. OxfordSciencePublications. [7] Y. Saad, A Flexible Inner-Outer Preconditioned GMRES Algorithm, SIAM J. Sci. Comput., 14(1993), pp.461–469. [8] Y.SaadandM.H.Schultz,GMRES:AGeneralizedMinimalResidualAlgorithmforsolving Nonsymmetric Linear Systems,SIAMJ.Sci.Statist.Comput.,7(1986), pp.856–869. [9] K. Svanberg, A Class of Globally Convergent Optimization Methods based on Conservative Convex Separable Approximations, SIAM J. Optim., 12 (2001/02), pp. 555–573 (elec- tronic). [10] A.ToselliandO.Widlund,DomainDecompositionMethods–AlgorithmsandTheory,vol.34 ofSpringerSeriesinComputational Mathematics, Springer-Verlag,Berlin,2005. [11] J. A. Turner, Application of Domain Decomposition to Problems in Topology Optimisation, PhDthesis,UniversityofBirmingham,2014.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.