Efficient cyclic reduction for QBDs with rank structured blocks∗ Dario A. Bini ([email protected]), Dipartimento di Matematica, University of Pisa, Italy Stefano Massei ([email protected]), Scuola Normale Superiore, Pisa, Italy Leonardo Robol ([email protected]), 6 1 Scuola Normale Superiore, Pisa, Italy 0 2 January 6, 2016 n a J 5 Abstract ] WeprovideeffectivealgorithmsforsolvingblocktridiagonalblockToeplitzsystemswith A m×mquasiseparableblocks,aswellasquadraticmatrixequationswithm×mquasiseparable N coefficients, based on cyclic reduction and on the technology of rank-structured matrices. The algorithms rely on the exponential decay of the singular values of the off-diagonal . h submatrices generated by cyclic reduction. We provide a formal proof of this decay in the t a Markovian framework. The results of the numerical experiments that we report confirm a m significantspeedupoverthegeneralalgorithms,alreadystartingwiththemoderatelysmall size m≈102. [ 1 v 1 Introduction 1 6 Cyclic reduction (CR) is an effective tool that can be used for solving several problems in linear 8 algebra and in polynomial computations [4]. It has been originally introduced by G.H. Golub 0 andR.W.Hockneyinthemid1960s[13,8],forthenumericalsolutionofblocktridiagonallinear 0 . systems stemming from the finite differences solution of elliptic problems, and has been gener- 1 alized to solve nonlinear matrix equations associated with matrix power series with applications 0 6 to queuing problems, Markov chains and spectral decomposition of polynomials. In fact, an 1 important application of CR concerns the computation of the minimal nonnegative solution of : thematrixequationX =A +A X+A X2,encounteredinQuasiBirth-Death(QBD)Markov v −1 0 1 i chains, where A−1, A0, and A1 are given m×m nonnegative matrices such that A−1+A0+A1 X is irreducible and stochastic and where X is the matrix unknown [2, 4]. The computation of the r solution X allows to recover the steady state vector π of the Markov chain. a RewritingthematrixequationasA +(A −I)X+A X2 =0,CRcomputesfoursequences −1 0 1 ∗Work supported by Gruppo Nazionale di Calcolo Scientifico (GNCS) of INdAM, and by the PRA project “Mathematicalmodelsandcomputationalmethodsforcomplexnetworks”fundedbytheUniversityofPisa. 1 of matrices, A(h), i=−1,0,1 and A(cid:98)(h), according to the following equations i 0 A(h+1) =−A(h) S(h) A(h), S(h) =(A(h)−I)−1 1 1 1 0 A(h+1) =A(h)−A(h) S(h) A(h) − A(h) S(h) A(h), (1) 0 0 1 −1 −1 1 A(h+1) =−A(h) S(h) A(h), A(cid:98)(h+1) =A(cid:98)(h)−A(h) S(h) A(h), −1 −1 −1 0 0 1 −1 for h=0,1,..., with A(i0) =Ai, i=−1,0,1 and A(cid:98)(00) =A0−I. It can be proved, in the context of Markov chains, [2] that the sequence −(A(cid:98)0(h))−1A−1 converges to the minimal nonnegative solution G of the matrix equation. If the QBD process is not null recurrent, applicability and quadratic convergence of the algorithm are guaranteed [2][Theorems 7.5, 7.6]. In the null recurrent case, it has been proved in [11] that convergence is linear with factor 1. 2 Without any further assumption on the structure of the blocks, each step of CR requires a smallnumberofmatrixmultiplicationsandonematrixinversionfortheresultingcomputational costof O(m3)arithmeticoperations(ops)perstep. Ontheotherhand, thereareseveralmodels from the applications in which the blocks A exhibit special structures. In order to decrease the i computational complexity of the iterations, variations of CR which exploit these structures have been proposed, see for instance [3], [17], [1]. Here,weareinterestedinanalysingthecasewheretheblocksA arequasiseparablematrices. i That is, the case where the off-diagonal submatrices of A , A and A , strictly contained in −1 0 1 the upper or in the lower triangular part, have low rank with respect to m. The maximum of the ranks of the off-diagonal submatrices is called quasiseparable rank and a matrix with quasiseparablerankk iscalledk-quasiseparable. Observethatk-quasiseparablematricesinclude bandedmatrices. Thesestructuresareencounteredinwideandimportantclassesofapplications like, bidimensional random walks [16], the Jackson tandem queue model [14] and other QBD processes, or, for instance, in the finite differences discretization of elliptic PDEs. Our goal is to design a version of CR which exploits the rank structures of the blocks A and i which can be implemented at a substantially lower cost. This way, we may arrive at designing effective solvers both for block tridiagonal block Toeplitz systems and for the quadratic matrix equations encountered in QBD Markov chains. Indeed, a way to reach this goal is to find out if some structure of the blocks A(h) is maintained during the CR steps (1), and then to try to i exploit this structure in order to design an effective implementation of CR. Looking at the iterative scheme (1), one can find out that the quasiseparable rank can grow exponentially at each step. Despite that, plotting the singular values of the off-diagonal blocks of the matrices A(h) shows an interesting behaviour as reported in Figure 1. For a randomly i generatedQBDprocesswithtridiagonalblocksofsize1600,thesingularvaluesofanoff-diagonal submatrix of size 799×800 in A(h) have an exponential decay in the first 20 steps of CR. 0 It is evident that, even though the number of nonzero singular values grows at each step of CR, the number of singular values above the machine precision – denoted by a horizontal line in Figure 1– is bounded by a moderate constant. Moreover, the singular values seem to stay below a straight-line which constitutes an asymptotic bound. That is, they get closer to this line as h→∞. Thelogarithmscalesuggestthatthecomputedsingularvaluesσ(h) decayexponentially i with i and the basis of the exponential grows with h but has a limit less than 1. Wewillprovethisasymptoticpropertyrelyingonthetechnologyofrank-structuredmatrices and relate the basis of the exponential decay to the width of the domain of analiticity of the matrix function ϕ(z)−1 for ϕ(z)=−z−1A +I−A −zA . −1 0 1 The paper is organized as follows. In Section 2 we recall some useful properties of CR. In Section 3 we prove some preliminary lemmas for bounding the singular values of sums, products 2 100 Exponential decay of the Singular values 10−5 10−10 10−15 10−20 10−25 10−30 10−35 10−400 5 10 15 20 25 30 35 40 n−th Singular value Figure 1: Log-scale plot of the largest singular values of the largest south-western submatrix of A containedinthelowertriangularpart,form=1600. Thehorizontallinedenotesthemachine 0 precision threshold. Matrices are randomly generated so that A (cid:62) 0 are tridiagonal matrices i and A +A +I+A is stochastic. −1 0 1 andinversesofmatrices. Section4containstheproofofthemainresultconcerningtheexponen- tial decay of the singular values. In Section 5 we describe our algorithm which implements CR relying on the package H2Lib [5], concerning H-matrices, and the related code together with the results of the numerical experiments. Finally, in Section 6, we give some concluding comments and possible future developments. 2 Main properties of cyclic reduction We recall a functional interpretation of cyclic reduction introduced in the Markov chains frame- work [2], [4] to prove applicability and convergence properties. Associate the matrices A(h), i = −1,0,1 defined in (1) with the matrix Laurent polynomial i ϕ(h)(z) := −z−1A(h) +(I −A(h))−zA(h), where ϕ(0)(z) = ϕ(z) = z−1A +(I −A )−zA , −1 0 1 −1 0 1 and define the matrix rational function ψ(h)(z)=ϕ(h)(z)−1. The following property holds (cid:40) ψ(0)(z):=ψ(z), ψ(h+1)(z2):= 1(ψ(h)(z)+ψ(h)(−z)). 2 In particular, making explicit the recurrence relation in the sequence {ψ(h)}, we find that ψ(h)(z2h) = 21h (cid:80)2j=h−01ψ(0)(ωNj z) where ωN = e2Nπi is a principal N-th root of the unity for N =2h, and i denotes the imaginary unit, so that −1 2h−1 ϕ(h)(z2h)= 1 (cid:88) ψ(0)(ωj z) . (2) 2h N j=0 Observe that in the case where A , A and A are tridiagonal, then ϕ(z) is tridiagonal as −1 0 1 well, so that for any value of z such that detϕ(z)(cid:54)=0, the matrix ψ(z) is semiseparable, that is, tril(ψ(z))=tril(L), triu(ψ(z))=triu(U), where L and U are matrices of rank 1 [18]. 3 Relying on this property, in the next section we will show that ψ (z), as well as the blocks h A(h),A(h) andA(h),hasoff-diagonalsubmatriceswithsingularvalueswhichdecayexponentially. −1 0 1 3 Singular values of sums, products and inverses Inthissection,werecallsomebasicfactsonthesingularvaluesdecomposition(SVD)andprovide some technical lemmas. Let us denote by σ (A) the j-th singular value of the m×n matrix A j sorted in non-ascending order and by A = UΣV∗, Σ = diag(σ ,...,σ ), p = min{m,n}, the 1 p SVD of A where U and V are unitary and V∗ denotes the transpose conjugate of V. Moreover, denotebyu andv theithcolumnofU andV, respectively. Werecallthefollowingwell-known i i property concerning the SVD of a matrix A. Property 3.1. For any matrix A the function f(X) = (cid:107)A−X(cid:107) takes its minimum over the 2 class of matrices of rank (cid:96)−1 for X =(cid:80)(cid:96)−1σ u v∗ and the value of the minimum is exactly σ . i=1 i i i (cid:96) 3.1 Some technical lemmas Let (cid:107)A(cid:107) be the Euclidean norm of A and µ (A) = (cid:107)A(cid:107) ·(cid:107)A−1(cid:107) be the condition number of 2 2 2 2 A. The following result relates the singular values of the matrices A, B and C =AB. Lemma 3.2. Consider two matrices A∈Cm×n, B ∈Cn×n, such that B is invertible. Then σ (A) σ (A) j (cid:54)σ (AB)(cid:54)(cid:107)B(cid:107) ·σ (A), j (cid:54)σ (BA∗)(cid:54)(cid:107)B(cid:107) ·σ (A), (cid:107)B−1(cid:107) j 2 j (cid:107)B−1(cid:107) j 2 j 2 2 Proof. We prove only the first statement since the other follows directly from it. Consider the SVD decompositions A = U Σ V∗, B = U Σ V∗. Recall that the singular values of a generic A A A B B B matrix M ∈ Cm×n are the square roots of the eigenvalues of M∗M, so that by the minimax theorem, we can write x∗M∗Mx σ (M)2 = max min . j dim(V)=j x∈V x∗x V⊆Cn x(cid:54)=0 Now note that AB = U Σ V∗U Σ V∗ and since unitary matrices do not change the singular A A A B B values we have σ (AB)=σ (Σ QΣ ) where Q=V∗U . Thus, we can express σ (AB)2 as j j A B A B j x∗Σ∗Q∗Σ∗Σ QΣ x (Σ x)∗Q∗Σ2Q(Σ x) σ (AB)2 = max min B A A B = max min B A B . j dim(V)=jx∈V x∗x dim(V)=jx∈V x∗x V⊆Cn V⊆Cn By setting y =Σ x and recalling that Σ is invertible by hypothesis we have B B y∗Q∗Σ2Qy y∗y σ (AB)2 = max min A · j dim(V)=jy∈V y∗y x∗x V⊆Cn so that, by using the fact that Q is unitary and that x∗x (cid:54) y∗y (cid:54) (cid:107)B(cid:107)2x∗x, we obtain (cid:107)B−1(cid:107)2 2 2 σj(A)2 = σj(ΣA)2 (cid:54)σ (AB)2 (cid:54)(cid:107)B(cid:107)2·σ (Σ )2 =(cid:107)B(cid:107)2·σ (A)2. (cid:107)B−1(cid:107)2 (cid:107)B−1(cid:107)2 j 2 j A 2 j 2 2 The following lemma relates the singular values of an off-diagonal submatrix of the inverse of a given matrix A with those of the corresponding submatrix of A. 4 Lemma 3.3. Let A∈Cn×n be an invertible matrix and consider the block partitioning (cid:18) A B (cid:19) (cid:18) A˜ B˜ (cid:19) A= , A−1 = , C D C˜ D˜ where A and A˜ are i×i matrices for 2(cid:54)i(cid:54)n−1. We have the following properties 1. If D is invertible then 1 σ (C) (cid:54) σ (C˜) (cid:54) (cid:107)D−1(cid:107) ·(cid:107)S−1(cid:107) ·σ (C), where S = (cid:107)D(cid:107)2(cid:107)SD(cid:107)2 j j 2 D 2 j D A−BD−1C is the Schur complement of D. 2. If A is invertible then 1 σ (C) (cid:54) σ (C˜) (cid:54) (cid:107)A−1(cid:107) ·(cid:107)S−1(cid:107) ·σ (C), where S = (cid:107)A(cid:107)2(cid:107)SA(cid:107)2 j j 2 A 2 j A D−CA−1B is the Schur complement of A. Proof. Let us consider part 1. If D is invertible we can write (cid:18) A B (cid:19)−1 (cid:18) S−1 S−1BD−1 (cid:19) = D D C D −D−1CS−1 D−1+D−1CS−1BD−1 D D and in particular we have C˜ = −D−1CS−1. Repeatedly applying Lemma 3.2 to C˜ yields D 1 σ (CS−1) (cid:54) σ (C˜) (cid:54) (cid:107)D−1(cid:107) · σ (CS−1) and eventually 1 σ (C) (cid:54) σ (CS−1) (cid:54) (cid:107)D(cid:107)2 j D j 2 j D (cid:107)SD(cid:107)2 j j D (cid:107)S−1(cid:107) ·σ (C). Combining these inequalities gives us the thesis. For proving part 2, we can D 2 j proceed in the same manner relying on the inversion formula (cid:18) A B (cid:19)−1 (cid:18) A−1+A−1BS−1CA−1 −A−1BS−1 (cid:19) = A A . C D −S−1CA−1 S−1 A A Lemma 3.4. Let A = (cid:80)+∞ A and A+ = (cid:80)+∞A where A ∈ Cm×n have rank at most k i=−∞ i i=0 i i and suppose that (cid:107)Ai(cid:107)2 (cid:54)Me−α|i|. Then σj(A)(cid:54) 1−2Me−α ·e−αj2−kk, σj(A+)(cid:54) 1−Me−α ·e−αj−kk. Proof. Note that B = (cid:80)s−1 A has rank at most k(2s − 1). For j positive integer, set s i=−s+1 i s=(cid:100)j−k(cid:101) and observe that, since 2k(cid:100)j−k(cid:101)<j+k, we have k(2s−1)(cid:54)j−1 so that B is an 2k 2k s approximation to A of rank at most j−1. By Property 3.1 it follows that σ (A)(cid:54)(cid:107)A−B (cid:107) . j s 2 Moreover, since A−B =(cid:80) A , we have σ (A)(cid:54)(cid:80) (cid:107)A (cid:107) (cid:54)2Me−αs/(1−e−α) which s |i|(cid:62)s i j |i|(cid:62)s i 2 completes the proof of the first bound. The second bound is proved similarly. Remark 3.5. In the particular case where k = 1, Lemma 3.4 yields σs(A) (cid:54) 1−2Me−α ·e−α2s, σ (A+)(cid:54) M ·e−α(s−1). s 1−e−α 4 Exponential decay of the singular values We can now state the main result about the decay of the singular values. It is clear that, if the blocks A i = −1,0,1 have an off-diagonal rank structure, then the matrix ϕ(0)(z) also enjoys i this property. We will show that this fact implies the exponential decay of the singular values of the off-diagonal blocks of ϕ(h)(z2h) for every h and for any z ∈T where T={z ∈C: |z|=1} denotes the unit circle in the complex plane. Given an integer N > 0, let ω = ei2π/N and observe that the quantity 1 (cid:80)N−1(zωj )k N N j=0 N coincides with zk if k ≡ 0 mod N and with 0 otherwise. This way, if A(z) = (cid:80) ziA i∈Z i is a matrix Laurent series analytic on the annulus A(r ,r ) = {z ∈ C : r < |z| < r } for 1 2 1 2 5 0 < r < 1 < r , then 1 (cid:80)N−1A(ωjz) = B(zN) where B(z) = (cid:80) ziA is analytic on 1 2 N j=0 n i∈Z Ni A(rN,rN). WedenotebyI theoperatorwhichmapsA(z)intoB(z)andwriteB(z)=I (A(z)). 1 2 N N Observe that I is linear and continuous on the space of analytic functions on A(r ,r ). N 1 2 Moreover,inviewof (2),wehaveψ(h) =I (ψ(0))forN =2h. Thisway,ifweprovethatany N off-diagonalsubmatrixB(z)ofψ(0)(z)issuchthatI (B(z))hastheexponentialdecayproperty N for its singular values, then we have shown this property also for ψ(h)(z). Partition ϕ(z) and ϕ(z)−1 as follows (cid:20)I−E(z) −B(z) (cid:21) (cid:20)E˜(z) B˜(z)(cid:21) ϕ(z)= , ψ(z):=ϕ(z)−1 = , (3) −C(z) I−D(z) C˜(z) D˜(z) where E(z) and D(z) are square matrices of any compatible size. Theorem 4.1. Let ϕ(z)=−z−1A +I−A −zA be an m×m matrix function such that −1 0 1 (i) The A are non-negative and I−ϕ(z) has spectral radius smaller than 1 for any z ∈T. i (ii) The blocks A are k-quasiseparable and (cid:107)A (cid:107) (cid:54)L, i=−1,0,1. i i 2 (iii) There exist t>1 and δ >0 such that detϕ(z)(cid:54)=0 and (cid:107)ϕ(z)−1(cid:107) (cid:54)δ for z ∈A(t−1,t). 2 Then ρ(I − ϕ(z)) < 1 for any z ∈ A and in the partitioning (3), both blocks I − E(z) and I −D(z) are invertible for any z ∈ A. Moreover, for any z ∈ T and for any h, the singular values of C˜(h) :=I (C˜(z)), with N =2h, are such that N 4Lδ2 σs(C˜(h)(z))(cid:54)3Me−s−6k3klogt, M = (1−e−Nlogt)(1−t−1). (4) Moreover, if A , A , A are tridiagonal then the above bound turns into −1 0 1 σs(C˜(h)(z))(cid:54)Me−2slogt. (5) Proof. Let us prove that ρ(I −ϕ(z)) < 1 for any z ∈ A. By contradiction, assume that there exists ξ ∈A such that ρ(I−ϕ(ξ))(cid:62)1. Since A (cid:62)0 for i=−1,0,1 then |I−ϕ(ξ)|(cid:54)I−ϕ(|ξ|), i and by the monotonicity of the spectral radius we get 1 (cid:54) ρ(I −ϕ(ξ)) (cid:54) ρ(I −ϕ(|ξ|)). Thus, sinceρ(I−ϕ(1))<1(cid:54)ρ(I−ϕ(|ξ|))andρisacontinuousfunction,thenthereexists1/t<ξˆ<t such that ρ(I−ϕ(ξˆ))=1. Since I−ϕ(ξˆ) is nonnegative, then by the Perron-Frobenius theorem thereexistsaneigenvalueofI−ϕ(ξˆ)equalto1,thatisϕ(ξˆ)wouldbesingular,whichcontradicts the assumptions. NowweprovethatI−D(z)andI−E(z)areinvertibleforanyz ∈A. Since|D(z)|(cid:54)D(|z|), for the monotonicity of the spectral radius, we have ρ(D(z)) (cid:54) ρ(|D(z)|) (cid:54) ρ(D(|z|)). On the other hand, D(|z|) is a principal submatrix of the nonnegative matrix I −ϕ(|z|) so that ρ(D(|z|)) (cid:54) ρ(I −ϕ(|z|)) which is less than 1 since |z| ∈ A. We conclude that ρ(D(z)) < 1 for any z ∈ A so that I −D(z) is nonsingular. The same argument can be used to deduce that I−E(z) is nonsingular. Now we prove the bound (4) on the singular values. For simplicity we assume that k = 1, the general case can be treated similarly. Since the off-diagonal blocks of A have rank at i most 1 then C = u vT, i = −1,0,1, for suitable vectors u ,v where we assume that (cid:107)u (cid:107) = i i i i i i 2 (cid:107)C (cid:107) , (cid:107)v (cid:107) = 1. Thus, we have C(z) = (cid:80)1 ziu vT. Since I −D(z) is invertible on A, we i 2 i 2 i=−1 i i have C˜(z) = H(z)(cid:80)1 ziu vTK(z) where H(z) = (I −D(z))−1, K(z) = S (z)−1 = E˜(z), i=−1 i i D and H(z), K(z) are analytic for z ∈ A. Consider the Fourier series of H(z) and K(z), that 6 is, H(z) = (cid:80) zsH , K(z) = (cid:80) zsK , and recall that the coefficients H , K have an s∈Z s s∈Z s s s exponential decay [12][Theorem 4.4c], that is, |(Hs)i,j|(cid:54)maxz∈A|(H(z))i,j|e−|s|logt, |(Ks)i,j|(cid:54) maxz∈A|(K(z))i,j|e−|s|logt. Since for any matrix norm induced by an absolute norm (cid:107)·(cid:107) and for any matrix A it holds that |a |(cid:54)(cid:107)A(cid:107) so that we may write i,j (cid:107)H (cid:107)(cid:54)max(cid:107)H(z)(cid:107)e−|s|logt, (cid:107)K (cid:107)(cid:54)max(cid:107)K(z)(cid:107)e−|s|logt, (6) s s z∈A z∈A Now recall that C˜ = H(z)(cid:80) ziu vTK(z), set z ∈ T and consider the generic ith term i=−1,0,1 i i ziH(z)u vTK(z) in the above summation. We have i i (cid:88) (cid:88) (cid:88) ziH(z)u vTK(z)= zs+h+iH u v K = H u zp+ivTK , i i s i i h s i i p−s s,h∈Z s∈Z p∈Z where we have set p = s+h. Now, applying the operator I to the above matrix cancels out N the terms in zp+i such that p+i is not multiple of N, so that we are left with the terms where p+i=Nq and we get (cid:88) (cid:88) (cid:88) I (ziH(z)u vTK(z))= H u zqvTK =: uˆ(i)vˆ(i)(z), N i i s i i Nq−i−s s s s∈Z q∈Z s∈Z for uˆ(i) =H u , vˆ(i)(z)=(cid:80) zqvTK . Thus we may write s s i s q∈Z i Nq−i−s I (C˜)=(cid:88)Uˆ Vˆ (z)T, Uˆ =(cid:104)uˆ(−1),uˆ(0),uˆ(1)(cid:105), Vˆ (z)=(cid:104)vˆ(−1)(z),vˆ(0)(z),vˆ(1)(z)(cid:105). N s s s s s s s s s s s∈Z To complete the proof, recall that z ∈ T and apply Lemma 3.4 with k = 3 to the series (cid:80) Uˆ VT. In order to do this, we have to provide upper bounds to (cid:107)Uˆ Vˆ (z)T(cid:107) for z ∈ T. s∈Z s s s s 2 We have (cid:107)Uˆ Vˆ (z)T(cid:107) (cid:54)(cid:107)Uˆ (cid:107) (cid:107)Vˆ (z)(cid:107) . Concerning (cid:107)Uˆ (cid:107) , since Uˆ =H [u ,u ,u ], we have s s 2 s 2 s√ 2 s 2 s s −1 0 1 (cid:107)Uˆ (cid:107) (cid:54)(cid:107)H (cid:107) (cid:107)[u ,u ,u ](cid:107) (cid:54) 3(cid:107)H (cid:107) max (cid:107)C (cid:107) , where the latter inequality follows from s 2 s 2 −1 0 1 2 s 2 i i 2 √ the fact that (cid:107)u (cid:107) = (cid:107)C (cid:107) and that consequently, (cid:107)[u ,u ,u ](cid:107) (cid:54) 3max (cid:107)C (cid:107) . Thus i 2 i 2 −1 0 1 2 i i 2 from (6) we get √ (cid:107)Uˆ (cid:107) (cid:54) 3Lmax(cid:107)H(z)−1(cid:107) e−|s|logt. s 2 2 z∈A Similarly, since (cid:107)v (cid:107) =1 and |z|=1, we have i 2 (cid:88) (cid:88) (cid:107)vˆ(i)(cid:107) (cid:54) (cid:107)K (cid:107) (cid:54)max(cid:107)K(z)(cid:107) e−|Nq−i−s|logt, s 2 Nq−i−s 2 z∈A 2 q∈Z q∈Z where the last inequality follows from (6). Define r the remainder of the division of i + s by N, so that i + s = Nqˆ+ r, and get (cid:80) e−|Nq−i−s|logt = (cid:80) e−|N(q−qˆ)+r|logt = q∈Z q∈Z (cid:80) e−|Nq+r|logt = e−rlogt +(cid:80) e−(Nq−r)logt +(cid:80) e−(Nq+r)logt = e−rlogt +(erlogt + q∈Z q(cid:62)1 q(cid:62)1 e−rlogt)( 1 −1)(cid:54) 2 . Whence we deduce that 1−e−Nlogt 1−e−Nlogt √ 2 3 (cid:107)Vˆ (cid:107) (cid:54) max(cid:107)K(z)(cid:107) . s 2 1−e−Nlogt z∈A 2 Combining the two bounds yields 6L (cid:107)Uˆ Vˆ (z)(cid:107) (cid:54) max(cid:107)K(z)(cid:107) max(cid:107)H(z)(cid:107) e−|s|logt. (7) s s 2 1−e−Nlogt z∈A 2 z∈A 2 7 It remains to estimate (cid:107)K(z)(cid:107) and (cid:107)H(z)(cid:107) . Concerning K(z) = E˜(z), observe that this is 2 2 a principal submatrix of ψ(z) so that (cid:107)K(z)(cid:107) (cid:54) (cid:107)ψ(z)(cid:107) . Concerning H(z) = (I −D(z))−1, 2 2 observe that from the condition A (cid:62) 0 it follows that |D(z)| (cid:54) D(|z|) and that ρ(D(z)) (cid:54) i ρ(|D(z)|)(cid:54)ρ(D(|z|))(cid:54)ρ(ϕ(|z|))<1 since I −D(z) is a principal submatrix of ϕ(z). Thus we may write (I−D(z))−1 =(cid:80)∞ D(z)j and |(I−D(z))−1|(cid:54)(I−D(|z|))−1. Now, since A (cid:62)0 j=0 i for i=−1,0,1, then D˜(|z|)=(I−D(|z|))−1+(I−D(|z|))−1C(|z|)S−1 B(|z|)(I−D(|z|))−1 (cid:62)(I−D(|z|))−1 I−D(|z|) (cid:124) (cid:123)(cid:122) (cid:125)(cid:124) (cid:123)(cid:122) (cid:125)(cid:124) (cid:123)(cid:122) (cid:125)(cid:124) (cid:123)(cid:122) (cid:125)(cid:124) (cid:123)(cid:122) (cid:125) (cid:62)0 (cid:54)0 (cid:62)0 (cid:54)0 (cid:62)0 so that (cid:107)(I −D(z))−1(cid:107)2 (cid:54) (cid:107)(I −D(|z|))−1(cid:107)2 (cid:54) (cid:107)D˜(|z|)(cid:107)2 (cid:54) maxz∈A(cid:107)ψ(z)(cid:107)2. Thus, applying Lemma 3.4 together with the bound (7) and rank of the blocks 3 yields 12Lδ2 σs(C˜(h)(z))(cid:54) (1−e−Nlogt)(1−t−1)e−s−63logt. If the blocks A are k-quasiseparable, then Lemma 3.4 is applied with rank of the blocks 3k so i thattheexponent(s−3)/6isreplacedby(s−3k)/(6k),Ifϕ(z)istridiagonal,thenu =u =u −1 0 1 and v =v =v , so that Uˆ and Vˆ are formed by a single column, i.e., Lemma 3.4 is applied −1 0 1 j j with rank of the blocks 1. This provides (5). 4.1 Exponential decay of the singular values in ϕ(z) Inthissection,weprovethedecaypropertyofthesingularvaluesintheoff-diagonalsubmatrices of ϕ(h)(z) when |z| = 1. The proof is obtained by combining the decay property for the matrix functionψ(h),statedinTheorem4.1,withasuitablelemmawhichallowstoextendthisproperty to the matrix inverse. Lemma 4.2. Letϕ(h)(z)=−z−1A(h)+I−A(h)−zA(h) bethem×m-matrixLaurentpolynomial −1 0 1 obtained at the hth step of CR. Under the hypotheses of Theorem 4.1, for every z ∈ T we have the following bound: σ (C(h))(cid:54)K(L ,ϕ)·σ (C˜(h)), K(L ,ϕ)=(1+3L )(1+L +L2(cid:107)ϕ(1)−1(cid:107) ) j h j h h h h 2 where ϕ(h)(z) and ϕ(h)(z)−1 are partitioned as in (3) and L is such that (cid:107)A(h)(cid:107) (cid:54)L . h i 2 h Proof. With the notation of the partitioning (3) applied to ϕ(h)(z), from Lemma 3.3 applied to ϕ(h)(z) we have σ (C(h)) (cid:54) (cid:107)I −E(h)(z)(cid:107) (cid:107)S (z)(cid:107) σ (C˜(h)). Thus, since z ∈ T and I − j 2 I−E(h) 2 j E(h)(z)isasubmatrixofϕ(h)(z),wehave(cid:107)I−E(h)(z)(cid:107) (cid:54)(cid:107)ϕ(h)(z)(cid:107) (cid:54)1+3L . Moreover,tak- 2 2 h ingthenormsinS (z)=I−D(h)(z)−C(h)(z)(I−E(h)(z))−1B(h)(z)weget(cid:107)S (z)(cid:107) (cid:54) I−E(h) I−E(h) 2 1+L +L (cid:107)(I−E(h)(z))−1(cid:107) L . Moreover, forz ∈Twehave|I−E(h)(z))−1|(cid:54)(cid:80)∞ E(h)(1)i h h 2 h i=0 so that (cid:107)I −E(h)(z))−1(cid:107) (cid:54) (cid:107)(I −E(h)(1))−1(cid:107) = (cid:107)(cid:80)∞ E(h)(1)i(cid:107) (cid:54) (cid:107)(cid:80)∞ A(h)(1)i(cid:107) = 2 2 i=0 2 i=0 2 (cid:107)ϕ(h)(1)−1(cid:107) , where we have set A(h)(z) = z−1A(h) +A(h) +zA(h). Here, we have used the 2 −1 0 1 property that the conditions A(h),A(h),A(h) (cid:62)0 and ρ(A(h)+A(h)+A(h))<1 are preserved at −1 0 1 −1 0 1 each step of CR (see [2]). Finally, since ϕ(h)(1)−1 =ψ(h)(1)= 1 (cid:80)N−1ψ(ωi ), for N =2h (see N i=0 N Section 2), we have (cid:107)ϕ(h)(1)−1(cid:107) (cid:54)(cid:107)ψ(1)(cid:107) . 2 2 Remark 4.3. Note that the previous bound still holds with (cid:107)ϕ(h)(1)−1(cid:107) , in place of (cid:107)ϕ(1)−1(cid:107) . 2 2 Experimentally, (cid:107)ϕ(h)(1)−1(cid:107) is much smaller than (cid:107)ϕ(1)−1(cid:107) just after few steps h. 2 2 8 Observe that L depends on the step h of CR. However, since under the assumptions of h Theorem 4.1, the sequences generated by CR are such that lim A(h) = 0, for i = 1,−1 while k i lim A(h) is finite (see [2]), then there exists L such that L(cid:62)L . Thus, Combining Lemma 4.2 h 0 h and Theorem 4.1 we obtain the following result. Corollary 4.4. Let ϕ(h)(z) = −z−1A(h) + I − A(h) − zA(h) be the m × m-matrix Laurent −1 0 1 polynomial obtained at the hth step of CR and assume the hypothesis of Theorem 4.1. Then for any off-diagonal submatrix C(h)(z) of ϕ(h)(z) we have σs(C(h)) (cid:54) 3MK ·es−6k3klogt, where K =(1+3L)(1+L+L2(cid:107)ϕ(1)−1(cid:107) ), M is the constant defined in Theorem 4.1 and L(cid:62)(cid:107)A(h)(cid:107) , 2 i 2 fori=−1,0,1. Inparticular, ifAi istridiagonalfori=−1,0,1thenσs(C(h))(cid:54)MK·e−(2s)logt 4.2 Exponential decay of the singular values in A(h) i To prove the decay of the singular values in the off-diagonal submatrices of A(h) for i=−1,0,1 i we rely on the following result of which we omit the elementary proof. Lemma 4.5. LetA(z)=z−1A +A +zA andletξ beaprimitive6-throotoftheunity. Then −1 0 1 A = 1(cid:0)ξA(ξ)+ξ5A(ξ5)−A(−1)(cid:1),A = 1(A(z)+A(−z)),A = 1(cid:0)ξ5A(ξ)+ξA(ξ5)−A(−1)(cid:1). −1 3 0 2 1 3 Lemma 4.6. Let A = 1 (cid:80)k A ∈ Cn×n where σ (A ) (cid:54) γe−αj, for j = 1,...,n. Then k i=1 i j i σj(A)(cid:54)γ˜e−αj−kk, γ˜ = 1−γe−α. Proof. RelyingontheSVD,wewriteA =(cid:80)∞ σ (A )u v∗ whereu andv arethesingular i j=1 j i i,j i,j i,j i,j vectors of A and where, for convenience, we have expanded the sum to an infinite number of i terms by setting σ (A )=0 for j >n. This allows us to write j i k ∞ (cid:32) k (cid:33) ∞ A= 1 (cid:88)A =(cid:88) 1 (cid:88)σ (A )u v∗ =(cid:88)A˜ . k i k j i i,j i,j j i=1 j=1 i=1 j=1 Observe that A˜ have rank k and (cid:107)A (cid:107)(cid:54)γe−αj. Applying Lemma 3.4 completes the proof. j j We may conclude with the decay property for the singular values of the off-diagonal subma- trices of A(h), for i=−1,0,1. i Lemma 4.7. Let ϕ(h)(z) be the matrix function genertaed at the hth step of CR with the prop- erty that every offdiagonal submatrix B(z) of ϕ(h)(z) has decaying singular values such that σ (B(z)) (cid:54) γe−αs. Then every coefficient B of B(z) = z−1B + B + zB is such that s i −1 0 1 σs(B0)(cid:54)γe−αj−22, σs(Bi)(cid:54)γe−αj−33, for i=1,−1. Proof. By Lemma 4.5, we have an expression for B based on evaluations of B(z). In particular, i wehaveA = 1(B(i)+B(−i)), A = 1(ξ∓1B(ξ)+ξ∓5B(ξ5)−B(−1)),whereξ isaprimitive 0 2 ±1 3 6-th root of the unity. Applying Lemma 4.6 completes the proof. 4.3 The Markovian case One of the most interesting application of CR algorithm is in the Markovian framework, in whichapplicabilityandconvergencepropertiesareguaranteed. Inthatcase,thematrixfunction ϕ satisfies almost all the hypotheses made in the previous subsections but it is singular at z =1 since 1 is always an eigenvalue of ϕ(z). Nevertheless we will show that Corollary 4.4 can still be applied considering a rescaled version of ϕ(z). 9 When the coefficients A for i = −1,0,1 represent the blocks of the transition matrix of an i irreduciblenotnullrecurrentQBDprocess,theeigenvaluesofϕ(z)enjoythefollowingproperties [9, 4]: (i) |λ | (cid:54) |λ | (cid:54) ... (cid:54) |λ | (cid:54) λ < λ (cid:54) |λ | (cid:54) ... (cid:54) |λ |, with λ ,λ ∈ R and 1 2 m−1 m m+1 m+2 2m m m+1 one of the two equal to 1. (ii) In the annulus {λ < |z| < λ } ϕ is invertible and the spectral radius of I −ϕ(z) is m m+1 strictly less than 1. Hence we consider the rescaled version of ϕ, that is, ϕ (z) := ϕ(θz), and we choose θ = θ (cid:113) (cid:112)λ λ . We obtain a matrix function invertible on A={1 <|z|<t} where t= λm+1. m m+1 t λm Observe that ϕ(h)(z) := ϕ(h)(α2hz) so applying CR to ϕ one obtains the same matrix α α sequences up to a rescaling factor. In particular the exponential decay of the singular values is left unchanged as shown in the following. Theorem 4.8. For given t > 1 and δ (cid:62) 0, consider the following class of matrix functions associated with QBD stochastic processes with k-quasiseparable blocks: χ :=(cid:8)ϕ(z): (cid:107)ϕ−1(z)(cid:107) (cid:54)δ t−1 (cid:54)|z|(cid:54)t, t<λ /λ (cid:9). δ,t 2 m+1 m Thenthereexistsauniformconstantγ(δ,t)suchthatforanyoff-diagonalblockC(h)(z)ofϕ(h)(z), with ϕ∈χδ,t, its s-th singular value is bounded by σs(C(h)(z))(cid:54)γ(δ,t)·e−s−6k3klogt. Remark4.9. Observethatinthecaseofnull-recurrentQBDprocessesonehasλ =λ =1, m m+1 so that there is no open annulus including T where ϕ(z) is nonsingular and we cannot apply Theorem 4.1. This drawback can be partially overcome by applying the shift technique of [2, 4]. This technique allows to construct a new matrix function ϕ˜(z) which has the same eigenvalues of ϕ(z) except for the eigenvalue 1 which is shifted to 0. So that ϕ˜(z) has an open annulus containingTwhereitisnonsingular. Moreover,applyingCRtoϕ˜(z)generatesmatrixsequences which easily allow to recover the corresponding matrix sequences obtained by applying CR to ϕ(z). The sequences associated with ϕ(z) differ from the sequences associated with ϕ˜(z) by a rank-1 correction. This way, if the exponential decay of the singular values holds for the latter sequences, it holds also for the former ones. The difficulty that still remains is that the nonnegativityoftheblocksA ,A andA isnotgenerallysatisfiedbythefunctionϕ˜(z)sothat −1 0 1 in principle Theorem 4.1 cannot be applied and a different version specific for this case should be formulated. Remark 4.10. The bounds that we have given to the decay of the singular values of the off- diagonalsubmatricesarenotstrict. Experimentally,singularvaluesseemtodecayslightlyfaster. Even in the null-recurrent case where λ = λ = 1, the decay still occurs even though in a m m+1 deterioratedform. Thedecaypropertiesclearlydependonthedomainofanalyticityofψ(z)but this is not the only reason of the decay. More investigation is needed in this direction. 5 An algorithm using H-matrices We have provided an implementation of CR, which applies to matrix functions ϕ(z) having quasiseparable blocks, and relies on the approximate quasiseparable structure induced by the decay of the singular values. We relied on the H-matrix representation of [6, 7, 10]. 10