On fast Fourier solvers for the tensor product high-order FEM for a generalized Poisson equation Alexander Zlotnik 1 and Ilya Zlotnik 2 Abstract We present direct logarithmically optimal in theory and fast in practice algorithms to implement the tensor product high order finite element method on multi-dimensional rectangular parallelepipeds for solving PDEs of the Poisson kind. They are based on the well-known Fourier approaches. The key new points are the fast direct and inverse FFT-based algorithms for expansion in eigenvectors of the 1D eigenvalue problems for the high order FEM. The algorithms can further be used for numerous applications, in particular, to implement the tensor product high order finite element methods for various 7 time-dependentPDEs. Resultsofnumericalexperimentsin2Dand3Dcasesarepresented. 1 0 2 Keywords. Fast direct algorithm, high order finite element method, FFT, Poisson equation. n MSC subject classifications. 65F05, 65F15, 65M60, 65T99. a J 4 1 Introduction 1 ] We present direct fast algorithms to implement nth order (n (cid:62) 2) finite element method A (FEM) on rectangular parallelepipeds [4] for solving a N-dimensional generalized Poisson equa- N tion (N (cid:62) 2) with the Dirichlet boundary condition. The algorithms are based on the well- . h known Fourier approaches, e.g., see [2,10–12] and references therein. The key new points are t a the fast direct and inverse algorithms for expansion in eigenvectors of the 1D eigenvalue prob- m lems for the high order FEM utilizing several versions of the discrete fast Fourier transform [ (FFT) [3]. This solves the old known problem, see [2, p. 271], and makes the full algorithms 1 v logarithmically optimal with respect to the number of elements as in the case of the bilinear 7 elements (n = 1) or standard finite-difference schemes. The algorithms are fast in practice 6 (faster than the theoretical expectations) and demonstrate only a mild growth in n starting 9 3 from the standard case n = 1. For example, in the 9th order case, the 2D FEM system for 0 220 elements containing almost 85 · 106 unknowns and the 3D FEM system for 218 elements . 1 containing more than 190·106 unknowns are solved respectively in less than 2 and 15 min on 0 7 an ordinary laptop using Matlab R2016a code, see details below. 1 The algorithms can further serve for a variety of applications including general 2nd order : v elliptic equations (as preconditioners), for the N-dimensional heat, wave or time-dependent i X Schro¨dinger PDEs, etc. They can be applied for some non-rectangular domains, in particular, r by involving meshes topologically equivalent to rectangular ones [7]. Other standard boundary a conditions can be covered as well, see a brief description in [14]. Moreover, the Fourier structure of algorithms is especially valuable for solving some wave physics problems, in particular, in- volving non-local boundary conditions, e.g., see [2,6,13], whence our own interest arose. Clearly the algorithms are also highly parallelizable. 1Corresponding author. Department of Mathematics at Faculty of Economic Sciences, National Re- search University Higher School of Economics, Myasnitskaya 20, 101000 Moscow, Russia. E-mail: [email protected] 2Settlement Depository Company, 2-oi Verkhnii Mikhailovskii proezd 9, building 2, 115419 Moscow, Russia. E-mail: [email protected] 1 The paper is organized as follows. In Section 2, the statement of the 1D FEM eigenvalue problem together with auxiliary FEM eigenvalue problems on and inside the reference element are given. The basic Section 3 is devoted to a description of eigenvalues and eigenvectors of the 1D FEM eigenvalue problem and the fast direct and inverse algorithms for expansion in these eigenvectors. Applications to the generalized Poisson equation in a N-dimensional rectangular parallelepiped with the Dirichlet boundary condition are described in Section 4. Results of numerical experiments for N = 2 and 3 are presented in detail in Section 5; all of them include the standard case n = 1 for comparison. 2 The statement of 1D FEM eigenvalue problem We first consider in detail the FEM for the simplest 1D eigenvalue ODE problem −u(cid:48)(cid:48)(x) = λu(x) on [0,X], u(0) = u(X) = 0, u(x) (cid:54)≡ 0. (2.1) We take the uniform mesh ω¯ with the nodes x = jh, j = 0,K (i.e., 0 (cid:54) j (cid:54) K) and h j the step h = X/K. Let H(n)[0,X] be the FEM space of the piecewise-polynomial functions h ϕ ∈ C[0,X] such that ϕ(x) ∈ P for x ∈ [x ,x ], j = 1,K, with ϕ(0) = ϕ(X) = 0; here P n j−1 j n is the space of polynomials having at most nth degree, n (cid:62) 2. Let S(n) be the space of vector functions w such that w ∈ R for j = 0,K with w = w = 0 K j 0 K and w ∈ Rn−1, j = 1,K. Clearly dimS(n) = nK−1. A function ϕ ∈ H(n)[0,X] is uniquely j−1/2 K h defined by its values at the mesh nodes ϕ = ϕ(x ), j = 0,K, with ϕ = ϕ = 0, and inside j j 0 K the elements ϕ = {ϕ(x +(l/n)h)}n−1, j = 1,K, that form the element in S(n). j−1/2 j−1 l=1 K We use the following scaled operator form of the standard FEM discretization for problem (2.1) Av = λCv, v ∈ S(n), v (cid:54)= 0. (2.2) K HereA = AT > 0andC = CT > 0aretheglobal(scaled)stiffnessandmassoperators(matrices) acting in S(n) and together with λ independent on h; the true approximate eigenvalues are K λ = 4h−2λ. h Let A = {A }n and C = {C }n be the local stiffness and mass matrices related to kl k,l=0 kl k,l=0 the reference element σ = [−1,1] with the following entries 0 (cid:90) (cid:90) A = e(cid:48)(x)e(cid:48)(x)dx, C = e (x)e (x)dx, kl k l kl k l σ0 σ0 (cid:0) (cid:1) where {e }n is the Lagrange basis in P such that e −1+(2k)/n = δ , for k,l = 0,n, and l l=0 n l kl δ is the Kronecker delta. The matrices A, C and the related matrix pencil G(λ) := A−λC kl have the following 3×3–block form a aT a c cT c g (λ) gT(λ) g (λ) 0 n 0 n 0 n A = a A(cid:101) aˇ , C = c C(cid:101) cˇ, G(λ) = g(λ) G(cid:101)(λ) gˇ(λ) . (2.3) a aˇT a c cˇT c g (λ) gˇT(λ) g (λ) n 0 n 0 n 0 Here A(cid:101), C(cid:101) and G(cid:101)(λ) = A(cid:101)− λC(cid:101) are square matrices of order n − 1 with the column vectors a,c,g(λ) = a−λc ∈ Rn−1 whereas pˇ ≡ (Pp) = p , l = 1,n−1, for p ∈ Rn−1. The matrices l l n−l A, C and G(λ) are bisymmetric (i.e. symmetric with respect to the main and secondary diagonals). 2 Notice that P = δ and PT = P−1 = P. Let Rn−1 and Rn−1 be the subspaces of even ij i(n−j) e o and odd vectors in Rn−1, i.e. such that respectively Pp = p and Pp = −p. The decomposition Rn−1 = Rn−1 ⊕Rn−1 (for n (cid:62) 3) is implemented by the formulas e o p = p +p , p := 0.5(p+pˇ), p := 0.5(p−pˇ). (2.4) e o e o Notice that dimRn−1 = [n/2] and dimRn−1 = [(n−1)/2], with Rn−1 = {0} for n = 2. Clearly e o o pˇ·q = p·qˇ, pˇ·qˇ= p·q for any p,q ∈ Rn−1. (2.5) Hereafter the symbol · denotes the inner product of vectors in Rn−1. Then problem (2.2) can be written in the following explicit form g (λ)v +gˇ(λ)·v +2g (λ)v +g(λ)·v +g (λ)v = 0, j = 1,K −1, (2.6) n j−1 j−1/2 0 j j+1/2 n j+1 g(λ)v +G(cid:101)(λ)v +gˇ(λ)v = 0, j = 1,K, (2.7) j−1 j−1/2 j with v = v = 0, v (cid:54)≡ 0. 0 K We also consider the auxiliary eigenvalue problems on and inside the reference element σ 0 Ae = λCe, e ∈ Rn+1, e (cid:54)= 0; (2.8) A(cid:101)e = λC(cid:101)e, e ∈ Rn−1, e (cid:54)= 0, (2.9) where clearly A (cid:62) 0, C > 0 and A(cid:101) = A(cid:101)T > 0, C(cid:101) = C(cid:101)T > 0; see some their properties in [13] (wheretheproblemsimilarto(2.6), (2.7)ontheuniformmeshon[0,∞)forλ ∈ Cwasstudied). Denote by S and S˜ their spectra. Let {λ(l),e(l)}n−1 be eigenpairs of problem (2.9). n n 0 l=1 Lemma 2.1. 1. The subspaces Rn−1 and Rn−1 are invariant with respect to A(cid:101) and C(cid:101). Thus e e each eigenvector in {e(l)}n−1 can be chosen either even or odd. Also λ(l) > 0, l = 1,n−1. l=1 0 2. Similar properties are valid for problem (2.8) with the exception of one simple zero eigenvalue. Proof. Any bisymmetric matrix B of the order n−1 commutes with P, i.e. BP = PB, (2.10) that implies the main result of Item 1. The property λ(l) > 0, l = 1,n−1 is well-known. 0 For Item 2, the argument is similar taking into account that A(1...1)T = 0 (concerning simplicity of λ = 0, see Proposition 5 in [13]). ˜ One can check by the direct computation that all the eigenvalues in S and S are simple n n and S ∩S˜ = ∅ at least for 1 (cid:54) n (cid:54) 9, see [13]. n n ˜ For low n, one can find S and S analytically (with the help of Mathematica), in particular, n n √ √ √ ˜ ˜ ˜ ˜ S = {2.5}, S = {2.5,10.5}, S = {14 ± 133,10.5} and S = {14 ± 133,30 ± 9 5} 2 3 4 5 (repeatability of the eigenvalues is not occasional, see [13]). We choose {e(l)}n−1 as in Lemma 2.1 using scaling C(cid:101)e(l) ·e(l) = 1. l=1 ˜ Lemma 2.2. Let G(cid:101)(λ)p = −g(λ), where λ (cid:54)∈ S . Let the vectors a and c be expanded as n n−1 n−1 (cid:88) (cid:88) a = a(l)C(cid:101)e(l), c = c(l)C(cid:101)e(l), with a(l) = a·e(l), c(l) = c·e(l). (2.11) l=1 l=1 See G(cid:101)(λ), g(λ), a and c in (2.3). Then the following formulas hold (cid:88)n−1 a(l) −λc(l) (cid:88)n−1 a(l) −λ(l)c(l) p = e(l) = 0 e(l) −C(cid:101)−1c. λ−λ(l) λ−λ(l) l=1 0 l=1 0 3 Proof. For the expansions p = (cid:80)n−1p e(l) and (2.11), we have l=1 l n−1 n−1 G(cid:101)(λ)p = (cid:88)(cid:0)λ(l) −λ(cid:1)p C(cid:101)e(l), g(λ) = (cid:88)(cid:0)a(l) −λc(l)(cid:1)C(cid:101)e(l), 0 l l=1 l=1 and the result easily follows. 3 Solving of the 1D FEM eigenvalue problem and related FFT-based algorithms Below we impose the following assumption: ˜ ˜ (A) the eigenvalues in S and S are simple and S ∩S = ∅. n n n n Recall that it is valid at least for 2 (cid:54) n (cid:54) 9 (below in Section 5 we verify it up to n = 21). We introduce the auxiliary equation γ(λ) ≡ −(g −g ·G(cid:101)−1g)(λ)/(g −gˇ·G(cid:101)−1g)(λ) = θ, (3.1) (cid:98) 0 n ˜ whereλ (cid:54)∈ S , withtheparameterθ. Itssolvingisequivalenttofindingtherootsofapolynomial n having at most nth degree, see [13]. In particular, owing to Lemma 2.2 this equation can be rewritten as (cid:88)n−1 (a(l) −λc(l))2 (cid:16) (cid:88)n−1 (aˇ(l) −λcˇ(l))(a(l) −λc(l))(cid:17) a −λc + = −θ a −λc + . (3.2) 0 0 λ−λ(l) n n λ−λ(l) l=1 0 l=1 0 Hereaˇ(l) = aˇ·e(l) andcˇ(l) = cˇ·e(l). Moreover,for2 (cid:54) n (cid:54) 9computationshelptoconfirmthatthe vectors e(l) are even and odd respectively for odd and even l provided that λ(1) < ... < λ(n−1); 0 0 therefore aˇ(l) = (−1)la(l) and cˇ(l) = (−1)lc(l), l = 1,n−1. We define the simplest inner product in S(n) and the corresponding squared C-norm K K−1 K (cid:88) (cid:88) (y,v) := y v + y ·v , (cid:107)v(cid:107)2 := (Cv,v) . S(n) j j j−1/2 j−1/2 C S(n) K K j=1 j=1 Next theorem describes eigenvalues and eigenvectors of problem (2.2). Theorem 3.1. 1. The spectrum of problem (2.2) consists in S˜ and the numbers (cid:8)λ(l)(cid:9)n n k l=1 that are all n (and all positive real) solutions to equation (3.2) with θ = θ := cos πk for k K k = 1,K −1. The numbers (cid:8)λ(l)(cid:9)n differ from {λ(l)}n−1 and are different for fixed k . k l=1 0 l=1 2. The following eigenvector corresponds to the eigenvalue λ(l): 0 s(l) = 0, j = 1,K −1, s(l) = (−P)j−1e(l), j = 1,K, (3.3) 0,j 0,j−1/2 for l = 1,n−1. Here (−P)j−1e = (−1)j−1e for even e, (−P)j−1e = e for odd e. 3. The following eigenvector corresponds to the eigenvalue λ(l): k πk(j −1) πkj s(l) = s , j = 1,K −1, s(l) = p(l)sin +pˇ(l)sin , j = 1,K, (3.4) k,j k,j k,j−1/2 k K k K 4 where s := sin πkj and p(l) ∈ Rn−1 solves the non-degenerate algebraic system G(cid:101)(cid:0)λ(l)(cid:1)p(l) = k,j K k k k −g(cid:0)λ(l)(cid:1), for k = 1,K −1, l = 1,n. k 4. The introduced eigenvectors are C-orthogonal, i.e. (cid:0)Cs(l),s(˜l)(cid:1) = 0 (3.5) k k˜ S(n) K ˜ ˜ ˜ ˜ for any k,k ∈ 0,K −1, l ∈ 1,n−δ and l ∈ 1,n−δ such that k (cid:54)= k and/or l (cid:54)= l. k0 k˜0 Consequently they form the basis in S(n), i.e. any w ∈ S(n) can be uniquely expanded as K K n−1 K−1 n (cid:88) (cid:88)(cid:88) w = w s(l) + w s(l). (3.6) 0l 0 kl k l=1 k=1 l=1 ˜ Proof. 1. We distinguish between two cases. Let first λ ∈ S and e satisfy (2.9). Then, for any n j = 1,K, using equation (2.7) we get (cid:2) (cid:3) 0 = v ·G(cid:101)(λ)e = G(cid:101)(λ)v ·e = − (g(λ)·e)v +(gˇ(λ)·e)v . j−1/2 j−1/2 j−1 j Clearly 0 g(λ)·e G(λ) e = 0 . 0 gˇ(λ)·e Since λ (cid:54)∈ S by assumption (A), we have g(λ) · e (cid:54)= 0 or gˇ(λ) · e (cid:54)= 0. Owing to assumption n ˜ (A) and Lemma 2.1, λ is simple in S and e is either even or odd. Correspondingly either n gˇ(λ)·e = g(λ)·e (cid:54)= 0 and v = −v , or gˇ(λ)·e = −g(λ)·e (cid:54)= 0 and v = v . Since v = 0, j j−1 j j−1 0 in both cases we get v = 0, j = 0,K. (3.7) j Thus equation (2.7) is reduced to G(cid:101)v = 0 and implies that v = c e. j−1/2 j−1/2 j−1/2 Now equation (2.6) is reduced to c gˇ(λ)·e+c g(λ)·e = 0, j = 1,K −1. j−1/2 j+1/2 Therefore c = −c for even e or c = c for odd e. Consequently the sought j+1/2 j−1/2 j+1/2 j−1/2 eigenvector v satisfies (3.7) together with v = (−1)j−1e for even e, v = e for odd e, j = 1,K j−1/2 j−1/2 (v is defined up to a non-zero constant multiplier). Thus we come to eigenvectors (3.3). ˜ 2. Next let λ (cid:54)∈ S . Then from equation (2.7) we get n v = −G−1(λ)[v g(λ)+v gˇ(λ)], j = 1,K. (3.8) j−1/2 j−1 j Inserting this into equation (2.6), we find the three-point equation gˆ (λ)v +2gˆ (λ)v +gˆ (λ)v = 0, j = 1,K −1, (3.9) n j−1 0 j n j+1 where gˆ (λ) = (g −g ·G(cid:101)−1g)(λ), gˆ (λ) = (g −gˇ·G(cid:101)−1g)(λ). 0 0 n n 5 Straightforwardly the following equalities hold 1 gˆ (λ) 0 gˆ (λ) 0 n G(λ) −(G(cid:101)−1g)(λ) = 0 , G(λ) −(G(cid:101)−1gˇ)(λ) = 0 . 0 gˆ (λ) 1 gˆ (λ) n 0 If gˆ (λ) = gˆ (λ) = 0, then the equalities mean that λ is at least double eigenvalue for problem n 0 (2.8) that contradicts assumption (A). If gˆ (λ) = 0 and gˆ (λ) (cid:54)= 0, then equation (3.9) together with (3.8) lead to v = 0 thus such n 0 λ does not satisfy (2.2). Therefore gˆ (λ) (cid:54)= 0 and equation (3.9) is simplified to n v −2γˆ(λ)v +v = 0, j = 1,K −1, (3.10) j−1 j j+1 with the function γˆ(λ) = gˆ (λ)/gˆ (λ) (see it also in (3.1)). Since v = v = 0, we can use the 0 n 0 n expansion K−1 (cid:88) v = v˜ s , j = 0,K, j k k,j k=1 and define the vector v˜ := (v˜ ,...,v˜ ) of its coefficients. Using the expansion in (3.10) gives 1 K−1 K−1 (cid:88) (cid:0) (cid:1) 2 v˜ θ −γˆ(λ) s = 0, j = 1,K −1. (3.11) k k k,j k=1 Clearly this equality is valid for some v˜ (cid:54)= 0 if and only if γˆ(λ) = θ for some k = 1,K −1. (3.12) k Notice that v˜ = 0 is equivalent to v = 0 in S(n) (taking into account formula (3.8)). K Therefore λ satisfies (3.12); moreover, v˜ = δ and consequently v = s , j = 0,K, together j kj j k,j with v = −s (G−1g)(λ)−s (G−1gˇ)(λ), j = 1,K, j−1/2 k,j−1 k,j see (3.8) (all last three equalities are valid up to the same non-zero multilplier). Thus we come to eigenvectors (3.4). ˜ The total amount of eigenvalues λ (cid:54)∈ S (taking into account their possible multiplicity) is n dimS(n)−(n−1) = n(K−1). The maximal amount of roots algebraic equations (3.12) for all K k is the same so that each equation (3.12) has to possess exactly n distinct roots (for fixed k, the written eigenvector v is defined by λ uniquely). 3. Property (3.5) is knowingly valid for eigenvectors s(l) and s(˜l) corresponding to different k k˜ ˜ ˜ ˜ eigenvalues of problem (2.2), in particular, for k = 0 and k (cid:54)= 0, or k = k and l (cid:54)= l. The remaining case will be covered below in Corollary 3.1 of the related Lemma 3.1. Notice that: (1) the vectors s(l) are used only to describe the algorithm, and only the 0 vectors e(l) are applied in its implementation; (2) s(l) are independent on l; (3) the vectors p(l) k,j k are independent on j and can also be computed owing to Lemma 2.2. 6 Lemma 3.1. Let w ∈ S(n) and w = qw +qˇw , j = 1,K, for some q ∈ Rn−1. Then K j−1/2 j−1 j (cid:0)Cs(l),w(cid:1) = 0, l = 1,n−1. (3.13) 0 S(n) K (cid:0)Cs(l),w(cid:1) k S(n) K = 2(cid:8)c +c·p(l) +(cid:0)C(cid:101)p(l) +c(cid:1)·q +θ (cid:2)c +c·pˇ(l) +(cid:0)C(cid:101)p(l) +c(cid:1)·qˇ(cid:3)(cid:9)(s ,w) , (3.14) 0 k k k n k k k ωh for k = 1,K −1, l = 1,n, where (s ,w) := (cid:80)K−1s w . k ωh j=1 k,j j Proof. 1. For any v,w ∈ S(n), recalling notation (2.3) we have K K−1 (cid:88)(cid:2) (cid:3) (Cv,w) = 2c v +c (v +v )+cˇ·v +c·v w S(n) 0 j n j−1 j+1 j−1/2 j+1/2 j K j=1 K (cid:88)(cid:0) (cid:1) + cv +C(cid:101)v +cˇv ·w . (3.15) j−1 j−1/2 j+1 j−1/2 j=1 2. According to formulas (3.15) and (3.3), for even e(l), we get K−1 K (cid:0)Cs(l),w(cid:1) = (cid:88)(−1)j(c−cˇ)·e(l)w +(cid:88)(−1)j−1C(cid:101)e(l) ·(qw +qˇw ) 0 S(n) j j−1 j K j=1 j=1 K−1 (cid:88) = C(cid:101)e(l) ·(q −qˇ) (−1)jw = 0 j j=1 since (c−cˇ)·e(l) = 0 and C(cid:101)e(l) ·(q −qˇ) = 0 for any c,q ∈ Rn−1 as well as w = w = 0. 0 K For odd e(l), we similarly get K−1 K K−1 (cid:0)Cs(l),w(cid:1) = (cid:88)(c+cˇ)·e(l)w +(cid:88)C(cid:101)e(l) ·(qw +qˇw ) = C(cid:101)e(l) ·(q +qˇ)(cid:88)w = 0 0 S(n) j j−1 j j K j=1 j=1 j=1 since (c+cˇ)·e(l) = 0 and C(cid:101)e(l)·(q+qˇ) = 0 for any c,q ∈ Rn−1 as well as w = w = 0. Equality 0 K (3.13) is proved. 3. Formula (3.15) together with (3.4) and (2.5) imply that K−1 (cid:0)Cs(l),w(cid:1) = (cid:88)(cid:2)2(cid:0)c +c·p(l)(cid:1)s +(cid:0)c +c·pˇ(l)(cid:1)(s +s )(cid:3)w k S(n) 0 k k,j n k k,j−1 k,j+1 j K j=1 K +(cid:88)(cid:2)(cid:0)C(cid:101)p(l) +c(cid:1)s +(cid:0)C(cid:101)pˇ(l) +cˇ(cid:1)s (cid:3)·w =: (cid:88)(cid:48)+(cid:88)(cid:48)(cid:48). (3.16) k k,j−1 k k,j j−1/2 j=1 Owing to formula s +s = 2θ s (3.17) k,j−1 k,j+1 k k,j we first derive (cid:88)(cid:48) = 2(cid:2)c +c·p(l) +θ (c +c·pˇ(l))(cid:3)(s ,w) . (3.18) 0 k k n k k ωh 7 Second, using formulas (2.5) and (2.10), we get K (cid:88)(cid:48)(cid:48) = (cid:88)(cid:0)C(cid:101)p(l) +c(cid:1)·q(s w +s w )+(cid:0)C(cid:101)p(l) +c(cid:1)·qˇ(s w +s w ). k k,j−1 j−1 k,j j k k,j−1 j k,j j−1 j=1 Owing to s = s = 0, w = w = 0 as well as formulas (3.17), we further derive k,0 k,K 0 K K−1 (cid:88)(cid:48)(cid:48) = 2(cid:0)C(cid:101)p(l) +c(cid:1)·q(s ,w) +(cid:0)C(cid:101)p(l) +c(cid:1)·qˇ(cid:88)(s +s )w k k ωh k k,j−1 k,j+1 j j=1 = 2(cid:2)(cid:0)C(cid:101)p(l) +c(cid:1)·q +θ (cid:0)C(cid:101)p(l) +c(cid:1)·qˇ(cid:3)(s ,w) . (3.19) k k k k ωh Adding (3.18) and (3.19), we prove (3.14). Corollary 3.1. The orthogonality property (3.5) from Theorem 3.1, Item 4 is valid. ˜ ˜ Proof. It remains to consider the case k,k ∈ 1,K −1 and k (cid:54)= k. Since then (s ,s ) = 0, the k k˜ ωh result follows from (3.14). Wecallthecalculationofw ∈ S(n) bythecoefficientsw oftheexpansion(3.6)asthe inverse K kl F -transform and the calculation of the coefficients w by w ∈ S(n) as the direct F -transform. n kl K n We also consider related expansion of y ∈ S(n) K n−1 K−1 n (cid:88) (cid:88)(cid:88) y = y Cs(l) + y Cs(l) (3.20) (cid:101)0l 0 (cid:101)kl k l=1 k=1 l=1 and the calculation of the coefficients y by y ∈ S(n) that we call as the direct FC -transform. (cid:101)kl K n Let us describe their fast FFT-based implementation. Theorem 3.2. 1. The inverse F -transform can be implemented according to the following n formulas K−1 n (cid:88)(cid:16)(cid:88) (cid:17) πkj w = w sin , j = 1,K −1, (3.21) j kl K k=1 l=1 n−1 (cid:88) w = (−P)j−1 w e(l) j−1/2 0l l=1 K−1 K−1 (cid:88) πk πk(j −1/2) (cid:88) πk πk(j −1/2) +2 d cos sin −2 d sin cos , j = 1,K, (3.22) k,e k,o 2K K 2K K k=1 k=1 where d and d are respectively even and odd components of the vector d := (cid:80)n w p(l). k,e k,o k l=1 kl k Note that (−P)j−1e = e for odd j and (−P)j−1e = −eˇ for even j for any e ∈ Rn−1. The collection {w }K−1 can be computed by the standard inverse FFT with respect to sines. j j=1 The collection {w }K can be computed by n−1 modified inverse FFT related to the centers j−1/2 j=1 of elements in the amount of [n/2] with respect to sines and [(n−1)/2] with respect to cosines using extensions d := 0 and d := 0, see algorithms DST-I, DST-III and DCT-III in [3]. K,e 0,o 2. The direct FC -transform can be implemented basing on the standard formula n y = (cid:0)y,s(l)(cid:1) /(cid:107)s(l)(cid:107)2. (3.23) (cid:101)kl k S(n) k C K 8 Here, first, for k = 0, l = 1,n−1, the following formulas hold K (cid:16)(cid:88) (cid:17) (y,s(l)) = (−P)j−1y ·e(l), (cid:107)s(l)(cid:107)2 = K. (3.24) 0 S(n) j−1/2 0 C K j=1 Second, for k = 1,K −1, l = 1,n, the following formulas hold K−1 (cid:88) πkj (y,s(l)) = y sin k S(n) j K K j=1 K−1 K−1 (cid:88) πkj (cid:88) πkj +p(l) · (y +y ) sin +p(l) · (y −y ) sin , (3.25) k,e j−1/2 j+1/2 e K k,o j+1/2 j−1/2 o K j=1 j=1 (cid:110) (cid:111) (cid:107)s(l)(cid:107)2 = K c +(cid:0)C(cid:101)p(l) +2c(cid:1)·p(l) +θ (cid:2)c +(cid:0)C(cid:101)p(l) +2c(cid:1)·pˇ(l)(cid:3) . (3.26) k C 0 k k k n k k Notice that the sums in formula (3.25) are independent on l. The collection of all these coeffi- cients can be computed using n standard direct FFTs with respect to sines. 3. Similarly to Item 2, the direct F -transform can be implemented basing on the standard n formula w = (cid:0)Cw,s(l)(cid:1) /(cid:107)s(l)(cid:107)2. (3.27) kl k S(n) k C K Here, for k = 0, l = 1,n−1, the following formula holds K (cid:16) (cid:88) (cid:17) (Cw,s(l)) = C(cid:101) (−P)j−1w ·e(l). (3.28) 0 S(n) j−1/2 K j=1 For k = 1,K −1, l = 1,n, formula (3.25) with y := Cw is applicable. Alternatively, the following formula holds as well K−1 (Cw,s(l)) = 2(cid:2)c +c·p(l) +θ (cid:0)c +c·pˇ(l)(cid:1)(cid:3)(cid:88)w sin πkj k S(n) 0 k k n k j K K j=1 K−1 K−1 (cid:88) πkj (cid:88) πkj +q(l) · (w +w ) sin +q(l) · (w −w ) sin , (3.29) k,e j−1/2 j+1/2 e K k,o j+1/2 j−1/2 o K j=1 j=1 where q(l) and q(l) are respectively even and odd components of the vector q(l) := C˜p(l)+c. Once k,e k,o k k again all these coefficients can be computed using n standard direct FFTs with respect to sines. Proof. 1. Let the coefficients w of expansion (3.6) be known. According to the first formulas kl (3.3) and (3.4), the values of w for integer indices in (3.6) are reduced to (3.21). To compute w for half-integer indices, we transform the second sum in (3.6). Owing to decomposition (2.4) we rewrite the second formula (3.4) in the form (cid:16) πk(j −1) πkj(cid:17) (cid:16) πkj πk(j −1)(cid:17) s(l) = p(l) sin +sin −p(l) sin −sin k,j−1/2 k,e K K k,o K K πk πk(j −1/2) πk πk(j −1/2) = 2cos p(l) sin −2sin p(l) cos , j = 1,K. 2K k,e K 2K k,o K 9 Then using also the second formula (3.3), we obtain formula (3.22). 2. Now we consider the computation of the coefficients in expansion (3.20) for given y ∈ S . K Owing to the orthogonality property (3.5), they first can be expressed in the form (3.23) for k = 0, l = 1,n−1 and k = 1,K −1, l = 1,n. Formulas (3.15) and (3.3) imply that (cid:107)s(l)(cid:107)2 = KC(cid:101)e(l) ·e(l) = K, l = 1,n−1. 0 C Lemma 3.1 immediately implies formula (3.26) since (s ,s ) = K/2. k k ω h By virtue of formulas (3.3) for the numerator of formula (3.23) for k = 0 we can write K K (cid:88) (cid:16)(cid:88) (cid:17) (y,s(l)) = y ·(−P)j−1e(l) = (−P)j−1y ·e(l). 0 S(n) j−1/2 j−1/2 K j=1 j=1 By virtue of formulas (3.4) for the same numerator for k = 1,K −1 we get K−1 K K (cid:88) (cid:88) (cid:88) (y,s(l)) = y s + y s ·p(l) + y s ·pˇ(l). (3.30) k S(n) j k,j j−1/2 k,j−1 k j−1/2 k,j k K j=1 j=1 j=1 Therefore shifting by 1 the index in the second of these sums, and applying the identity a b + 1 1 a b = 0.5(a +a )(b +b )+0.5(a −a )(b −b ) and recalling decomposition (2.4), we derive 2 2 1 2 1 2 1 2 1 2 (y,s(l)) k S(n) K K−1 K−1 K−1 (cid:88) (cid:88) (cid:88) = y s +p(l) · (y +y )s +p(l) · (y −y )s . j k,j k,e j−1/2 j+1/2 k,j k,o j+1/2 j−1/2 k,j j=1 j=1 j=1 Since also p ·q = p ·q , p ·q = p ·q for any q,p ∈ Rn−1, e e e o o o we obtain formula (3.25). 3. Owing to the orthogonality property (3.5), formula (3.27) is valid. By virtue of formulas (3.3), (3.15) as well as C(cid:101)T = C(cid:101) and P = PT for its numerator for k = 0 we can write K K (cid:0)Cw,s(l)(cid:1) = (cid:0)Cs(l),w(cid:1) = (cid:88)C(cid:101)(−P)j−1e(l) ·w = (cid:16)C(cid:101)(cid:88)(−P)j−1w (cid:17)·e(l). 0 S(n) 0 S(n) j−1/2 j−1/2 K K j=1 j=1 By virtue of formulas (3.16), (3.18) for the same numerator for k = 1,K −1 we get (cid:0)Cw,s(l)(cid:1) = (cid:0)Cs(l),w(cid:1) k S(n) k S(n) K K K = 2(cid:2)c +c·p(l) +θ (cid:0)c +c·pˇ(l)(cid:1)(cid:3)(s ,w) +(cid:88)(cid:0)q(l)s +qˇ(l)s (cid:1)·w , 0 k k n k k ωh k k,j−1 k k,j j−1/2 j=1 where q(l) = C˜p(l)+c. Transforming the last sum in the same manner as above the second and k k third terms in (3.30), we obtain (3.29). 10