A projector-splitting integrator for dynamical low-rank approximation*

Ch. Lubich† and I. V. Oseledets‡

†Mathematisches Institut, Universität Tübingen, Auf der Morgenstelle 10, D-72076 Tübingen, Germany, [email protected]
‡Institute of Numerical Mathematics, Russian Academy of Sciences, Gubkina Street 8, Moscow, Russia, [email protected]

arXiv:1301.1058v2 [math.NA] 8 Jan 2013

December 11, 2013

The dynamical low-rank approximation of time-dependent matrices is a low-rank factorization updating technique. It leads to differential equations for the factors of the matrices, which need to be solved numerically. We propose and analyze a fully explicit, computationally inexpensive integrator that is based on splitting the orthogonal projector onto the tangent space of the low-rank manifold. As is shown by theory and illustrated by numerical experiments, the integrator enjoys robustness properties that are not shared by any standard numerical integrator. This robustness can be exploited to change the rank adaptively. Another application is in optimization algorithms for low-rank matrices, where truncation back to the given low rank can be done efficiently by applying a step of the integrator proposed here.

1 Introduction

Low-rank approximation of large matrices and tensors is a basic model reduction technique in a wide variety of applications ranging from quantum physics to information retrieval. In the present paper, we consider the low-rank approximation of time-dependent matrices $A(t)$, $t_0 \le t \le \bar t$, which are either given explicitly or are the unknown solution of a differential equation $\dot A(t) = F(A(t))$.
*This work was partially supported by DFG, SPP 1324, by RFBR grants 12-01-00546-a, 11-01-00549-a, 12-01-33013-mol-ved-a, RFBR-DFG grant 12-01-91333, and by the Federal program "Scientific and scientific-pedagogical personnel of innovative Russia" (contracts 14.740.11.1067, 16.740.12.0727, grants 8500 and 8235).

In the first case, instead of performing an (expensive) low-rank approximation via singular value decompositions independently for every $t$, one would prefer a computationally inexpensive updating procedure that works only with the increments of $A$ and is free from decompositions of large matrices. In the second case, one would like to find an approximate solution to the differential equation working only with its low-rank approximation.

Both cases can be dealt with by dynamical low-rank approximation, where $A(t)$ is approximated by $Y(t)$ of fixed rank $r$ by imposing that its time derivative should satisfy

    $\|\dot Y(t) - \dot A(t)\| = \min$,    (1)

where the minimization is over all matrices $\dot Y(t)$ that are tangent to $Y(t)$ on the manifold $\mathcal{M}_r$ of matrices of rank $r$, and the norm is the Frobenius norm. In the case of a differential equation $\dot A = F(A)$, the above minimization is replaced with

    $\|\dot Y(t) - F(Y(t))\| = \min$.    (2)

In both cases, this leads to a differential equation on the manifold of matrices of a given rank $r$, which is then solved numerically.

For large time-dependent matrices, this approach was proposed and analyzed in [5], and first numerical applications were given in [12]. The approach and its error analysis were extended to tensors in the Tucker format in [6]. Very recently, in [1, 8, 14] the dynamical low-rank approximation approach was extended to tensors in the tensor train (TT) format studied in [13] and the hierarchical Tucker (HT) format studied in [3].
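The first, expensive route mentioned above (recomputing a best rank-$r$ approximation by a truncated singular value decomposition independently for every $t$) is the baseline that an updating procedure is meant to replace. A minimal NumPy sketch of this baseline; the function name and matrix sizes are our illustrative choices:

```python
import numpy as np

def best_rank_r(A, r):
    """Best rank-r approximation of A in the Frobenius norm (Eckart-Young),
    via a full SVD.  Cost is O(m n min(m, n)) per snapshot, which is exactly
    what one wants to avoid paying at every time t."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 40))
Y = best_rank_r(A, 10)
print(np.linalg.matrix_rank(Y))  # 10
```

The approximation error of this baseline, $\|A - Y\|_F = \big(\sum_{k > r} \sigma_k^2\big)^{1/2}$, is the reference quantity plotted alongside the integrator errors in the experiments of Section 5.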
In quantum dynamics, such an approach was used previously in the multiconfiguration time-dependent Hartree (MCTDH) method [10, 11] to determine an approximate solution to the time-dependent multi-particle Schrödinger equation, and the above minimization process is known there as the Dirac–Frenkel time-dependent variational principle (see, e.g., [7]).

The dynamical low-rank approximation leads to differential equations that need to be solved numerically. In this paper we present a numerical integration technique that is fully explicit and, in contrast to standard integrators such as (explicit or implicit) Runge–Kutta methods, does not suffer from a possible ill-conditioning of certain matrices arising in the differential equations. This new method is based on a splitting of the projector onto the tangent space of the low-rank manifold at the current position. A different splitting algorithm was recently proposed in [14], where the differential equations are split along different components. The projector splitting discussed here offers, however, several advantages: it leads to a much simpler and less expensive time-stepping algorithm, and it can be shown to enjoy remarkable robustness properties under the ill-conditioning mentioned before.

In Section 2 we recapitulate the dynamical low-rank approximation of matrices, and in Section 3 we describe the novel splitting integrator. In Section 4 we analyze the robustness under over-approximation, that is, under the dreaded ill-conditioning mentioned above. Section 5 presents numerical experiments. The final section addresses some perspectives for the use of the proposed integrator and extensions.

2 Dynamical low-rank approximation of matrices

Let $r$ be a given rank, and let $\mathcal{M}_r$ denote the manifold of all real $m \times n$ matrices of rank $r$ (typically, $r \ll m, n$).
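Elements of $\mathcal{M}_r$ are handled throughout via factorizations rather than full matrices. A small NumPy sketch (sizes and variable names are illustrative, not from the paper) of the storage savings, and of the elementary identity $\|U S V^\top\|_F = \|S\|_F$ for factors $U$, $V$ with orthonormal columns:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 100, 80, 10

# Build a generic element of M_r as Y = U S V^T with U, V having
# orthonormal columns (via QR) and S an invertible r x r matrix.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
S = np.diag(1.0 + rng.random(r))  # diagonal entries in [1, 2]: invertible
Y = U @ S @ V.T

print(np.linalg.matrix_rank(Y))       # 10: Y lies on M_r
print(m * r + r * r + n * r, Y.size)  # 1900 stored factor entries vs 8000
```

Since $U$ and $V$ have orthonormal columns, the Frobenius norm of $Y$ equals that of the small core $S$, so norms and inner products can be evaluated in the factors alone.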
In the dynamical low-rank approximation to matrices $A(t) \in \mathbb{R}^{m \times n}$, the approximation $Y(t) \in \mathcal{M}_r$ is represented in a non-unique way as

    $Y(t) = U(t) S(t) V(t)^\top$,    (3)

where $U(t) \in \mathbb{R}^{m \times r}$ and $V(t) \in \mathbb{R}^{n \times r}$ each have $r$ orthonormal columns, and $S(t) \in \mathbb{R}^{r \times r}$ is an invertible matrix. This looks similar to the singular value decomposition, but $S(t)$ is not assumed diagonal.

It is shown in [5, Prop. 2.1] that, given such a decomposition $Y_0 = U_0 S_0 V_0^\top$ of the starting value and imposing the gauge conditions

    $U(t)^\top \dot U(t) = 0, \qquad V(t)^\top \dot V(t) = 0$,    (4)

the solution $Y(t) \in \mathcal{M}_r$ to the time-dependent variational principle (1) admits a unique decomposition (3), and the factors $U(t)$, $S(t)$, $V(t)$ satisfy the following system of differential equations:

    $\dot U(t) = (I - U(t) U(t)^\top)\, \dot A(t)\, V(t)\, S(t)^{-1}$
    $\dot V(t) = (I - V(t) V(t)^\top)\, \dot A(t)^\top U(t)\, S(t)^{-\top}$    (5)
    $\dot S(t) = U(t)^\top \dot A(t)\, V(t)$.

This system of differential equations is to be solved numerically. The numerical solution of (5) by standard integrators (e.g., of Runge–Kutta type) becomes cumbersome if $S(t)$ is nearly singular. This situation occurs in the case of over-approximation, when the true rank of the solution (or its approximate rank) is smaller than the chosen rank $r$. This is a realistic case: the effective rank may not be known in advance, and it is often reasonable to overestimate the rank for accurate approximation. To avoid the possible singularity of $S(t)$, usually some sort of regularization is used. However, such a regularization introduces errors that are poorly understood.

The minimization condition (1) states that at an approximation $Y(t) \in \mathcal{M}_r$, the derivative $\dot Y(t)$ is obtained by orthogonally projecting $\dot A(t)$ onto the tangent space $T_{Y(t)} \mathcal{M}_r$ of the rank-$r$ manifold at $Y(t)$:

    $\dot Y(t) = P(Y(t))\, \dot A(t)$,    (6)

where $P(Y)$ is the orthogonal projector onto the tangent space $T_Y \mathcal{M}_r$ of the manifold $\mathcal{M}_r$ at $Y \in \mathcal{M}_r$. The projector has a simple representation [5, Lemma 4.1]: for $Y = U S V^\top$ as in (3),

    $P(Y) Z = Z V V^\top - U U^\top Z V V^\top + U U^\top Z$.
    (7)

Note that $U U^\top$ is the orthogonal projector onto the range $R(Y)$ of $Y = U S V^\top$, and $V V^\top$ is the orthogonal projector onto the range $R(Y^\top)$, so that we can also write

    $P(Y) Z = Z P_{R(Y^\top)} - P_{R(Y)} Z P_{R(Y^\top)} + P_{R(Y)} Z$.    (8)

3 The integrator

3.1 First-order splitting method, abstract formulation

Let a rank-$r$ approximation $Y_0$ to $A(t_0)$ be given and consider a step of the standard Lie–Trotter splitting of (6) with (8) from $t_0$ to $t_1 = t_0 + h$:

1. Solve the differential equation $\dot Y_{\mathrm{I}} = \dot A\, P_{R(Y_{\mathrm{I}}^\top)}$ with initial value $Y_{\mathrm{I}}(t_0) = Y_0$ on the interval $t_0 \le t \le t_1$.

2. Solve the differential equation $\dot Y_{\mathrm{II}} = -P_{R(Y_{\mathrm{II}})}\, \dot A\, P_{R(Y_{\mathrm{II}}^\top)}$ with initial value $Y_{\mathrm{II}}(t_0) = Y_{\mathrm{I}}(t_1)$ on the interval $t_0 \le t \le t_1$.

3. Solve the differential equation $\dot Y_{\mathrm{III}} = P_{R(Y_{\mathrm{III}})}\, \dot A$ with initial value $Y_{\mathrm{III}}(t_0) = Y_{\mathrm{II}}(t_1)$ on the interval $t_0 \le t \le t_1$.

Finally, take $Y_1 = Y_{\mathrm{III}}(t_1)$ as an approximation to $Y(t_1)$, the solution of (6) at $t_1$. By standard theory, this is a method of first-order accuracy. Remarkably, each of the split differential equations can be solved exactly in a trivial way.

Lemma 3.1. The solution of 1. is given by

    $Y_{\mathrm{I}}(t) = U_{\mathrm{I}}(t) S_{\mathrm{I}}(t) V_{\mathrm{I}}(t)^\top$ with $\tfrac{d}{dt}(U_{\mathrm{I}} S_{\mathrm{I}}) = \dot A\, V_{\mathrm{I}}$, $\dot V_{\mathrm{I}} = 0$.    (9)

The solution of 2. is given by

    $Y_{\mathrm{II}}(t) = U_{\mathrm{II}}(t) S_{\mathrm{II}}(t) V_{\mathrm{II}}(t)^\top$ with $\dot S_{\mathrm{II}} = -U_{\mathrm{II}}^\top \dot A\, V_{\mathrm{II}}$, $\dot U_{\mathrm{II}} = 0$, $\dot V_{\mathrm{II}} = 0$.    (10)

The solution of 3. is given by

    $Y_{\mathrm{III}}(t) = U_{\mathrm{III}}(t) S_{\mathrm{III}}(t) V_{\mathrm{III}}(t)^\top$ with $\tfrac{d}{dt}(V_{\mathrm{III}} S_{\mathrm{III}}^\top) = \dot A^\top U_{\mathrm{III}}$, $\dot U_{\mathrm{III}} = 0$.    (11)

Proof. We first notice that each of the terms on the right-hand side of (8) is in the tangent space $T_Y \mathcal{M}_r$, because for the first term the orthonormality $V^\top V = I$ yields

    $P(Y)(Z V V^\top) = Z V V^\top + U U^\top Z V V^\top - U U^\top Z V V^\top = Z V V^\top$,

so that $Z V V^\top \in T_Y \mathcal{M}_r$. Similarly we also have $U U^\top Z \in T_Y \mathcal{M}_r$ and $U U^\top Z V V^\top \in T_Y \mathcal{M}_r$. It therefore follows that the solutions of 1.–3. all stay of rank $r$.
Hence, $Y_{\mathrm{I}}(t)$ can be factorized as $Y_{\mathrm{I}}(t) = U_{\mathrm{I}}(t) S_{\mathrm{I}}(t) V_{\mathrm{I}}(t)^\top$ with an invertible $r \times r$ matrix $S_{\mathrm{I}}$ and $U_{\mathrm{I}}$, $V_{\mathrm{I}}$ having orthonormal columns. All the matrices can be chosen to be differentiable, and we thus have

    $\dot Y_{\mathrm{I}} = \tfrac{d}{dt}(U_{\mathrm{I}} S_{\mathrm{I}})\, V_{\mathrm{I}}^\top + (U_{\mathrm{I}} S_{\mathrm{I}})\, \dot V_{\mathrm{I}}^\top$,

which by 1. must equal $\dot Y_{\mathrm{I}} = \dot A\, V_{\mathrm{I}} V_{\mathrm{I}}^\top$. We observe that this is satisfied if $\tfrac{d}{dt}(U_{\mathrm{I}} S_{\mathrm{I}}) = \dot A\, V_{\mathrm{I}}$ and $\dot V_{\mathrm{I}} = 0$. The proofs for Steps 2. and 3. are similar.

3.2 First-order splitting method, practical algorithm

Lemma 3.1 leads us to the following algorithm. Given a factorization (3) of the rank-$r$ matrix $Y_0 = U_0 S_0 V_0^\top$ and denoting the increment $\Delta A = A(t_1) - A(t_0)$, one step of the integrator reads as follows:

1. Set $K_1 = U_0 S_0 + \Delta A\, V_0$ and compute the factorization $U_1 \widehat S_1 = K_1$, with $U_1$ having orthonormal columns and an $r \times r$ matrix $\widehat S_1$ (using QR or SVD).

2. Set $\widetilde S_0 = \widehat S_1 - U_1^\top \Delta A\, V_0$.

3. Set $L_1 = V_0 \widetilde S_0^\top + \Delta A^\top U_1$ and compute the factorization $V_1 S_1^\top = L_1$, with $V_1$ having orthonormal columns and an $r \times r$ matrix $S_1$ (using QR or SVD).

The algorithm computes a factorization of the rank-$r$ matrix

    $Y_1 = U_1 S_1 V_1^\top$,

which is taken as an approximation to $Y(t_1)$. Note that $Y_1$ is identical to the result of the abstract splitting algorithm of Section 3.1, without any further approximation.

3.3 Higher-order schemes

Higher-order extensions can be obtained from the above first-order algorithm by the standard technique of composition rules. The usual symmetric composition is obtained by first taking a step of the above integrator with step size $h/2$, followed by taking its substeps in reverse order. The resulting scheme looks as follows (here $A_0 = A(t_0)$, $A_{1/2} = A(t_0 + \frac h2)$, $A_1 = A(t_0 + h)$):

    $K_{1/2} = U_0 S_0 + (A_{1/2} - A_0) V_0$,
    $(U_{1/2}, \widehat S_{1/2}) = \mathrm{QR}(K_{1/2})$,
    $\widetilde S_0 = \widehat S_{1/2} - U_{1/2}^\top (A_{1/2} - A_0) V_0$,
    $L_1 = V_0 \widetilde S_0^\top + (A_1 - A_0)^\top U_{1/2}$,
    $(V_1, \widehat S_1^\top) = \mathrm{QR}(L_1)$,    (12)
    $\widetilde S_{1/2} = \widehat S_1 - U_{1/2}^\top (A_1 - A_{1/2}) V_1$,
    $K_1 = U_{1/2} \widetilde S_{1/2} + (A_1 - A_{1/2}) V_1$,
    $(U_1, S_1) = \mathrm{QR}(K_1)$,
    $Y_1 = U_1 S_1 V_1^\top$.
This symmetrized splitting is a second-order scheme for (6). Higher-order schemes are obtained by suitable further compositions; see, e.g., [4, 9].

3.4 The integrator for matrix differential equations

The basic first-order scheme extends straightforwardly to an explicit method for the low-rank approximation of solutions $A(t)$ of matrix differential equations $\dot A = F(A)$, where now $A(t)$ is not known beforehand. The only change is that $\Delta A = A(t_1) - A(t_0)$ is replaced, in a way resembling the explicit Euler method, with

    $\Delta A = h F(Y_0)$.

The symmetric composition with the adjoint method now yields an implicit method. An explicit method of order 2 is obtained by first taking a step with the basic first-order integrator, which yields an approximation $\widetilde Y_1$, and then taking a step with the above second-order method in which $\dot A(t)$ is replaced, at $t = t_0 + \theta h$, with the linear function $B(t_0 + \theta h) = (1 - \theta) F(Y_0) + \theta F(\widetilde Y_1)$, and correspondingly $A(t)$ with the quadratic function $Y_0 + \int_{t_0}^t B(s)\, ds$, that is,

    $A(t_0 + \theta h) \approx Y_0 + h \int_0^\theta B(t_0 + \vartheta h)\, d\vartheta = Y_0 + \frac h2\, \theta (2 - \theta)\, F(Y_0) + \frac h2\, \theta^2 F(\widetilde Y_1)$.

Higher-order methods can again be obtained by composition of steps of the second-order method.

4 Robustness under over-approximation

The equations of motion (5) break down when $S$ becomes singular, and standard numerical integrators applied to (5) run into problems with stability and accuracy when $S$ is ill-conditioned. Such a situation arises when the matrix $A(t)$ to be approximated by a rank-$r$ matrix has rank less than $r$, or is close to a rank-deficient matrix. It is a remarkable property of the integrator proposed here that it behaves much better than a standard integrator applied to (5) in such a situation of over-approximation. On the one hand, this is already apparent from the observation that there is no matrix inversion in the algorithm. There is in fact more to it.

The following result depends on the ordering of the splitting of the projector (8), so that we first compute $K = US$, then $S$, then $L = VS^\top$.
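The practical first-order step of Section 3.2 and the symmetrized scheme (12) are short enough to state in code. A NumPy sketch (the variable names and the test matrices are ours; the algorithms are from the text):

```python
import numpy as np

def ksl_step(U0, S0, V0, dA):
    """One step of the first-order KSL splitting integrator of Section 3.2,
    with increment dA = A(t1) - A(t0).  U0, V0 have orthonormal columns."""
    K1 = U0 @ S0 + dA @ V0              # 1. K-step: factorize U1 * S1_hat = K1
    U1, S1_hat = np.linalg.qr(K1)
    S0_tilde = S1_hat - U1.T @ dA @ V0  # 2. S-step (note the minus sign)
    L1 = V0 @ S0_tilde.T + dA.T @ U1    # 3. L-step: factorize V1 * S1^T = L1
    V1, S1_T = np.linalg.qr(L1)
    return U1, S1_T.T, V1

def ksl2_step(U0, S0, V0, A0, Ah, A1):
    """The symmetrized second-order scheme (12); Ah denotes A(t0 + h/2)."""
    Uh, Sh_hat = np.linalg.qr(U0 @ S0 + (Ah - A0) @ V0)
    S0_tilde = Sh_hat - Uh.T @ (Ah - A0) @ V0
    V1, S1_hat_T = np.linalg.qr(V0 @ S0_tilde.T + (A1 - A0).T @ Uh)
    Sh_tilde = S1_hat_T.T - Uh.T @ (A1 - Ah) @ V1
    U1, S1 = np.linalg.qr(Uh @ Sh_tilde + (A1 - Ah) @ V1)
    return U1, S1, V1

# Demonstration on a matrix family of rank <= r, with exact starting value.
rng = np.random.default_rng(1)
m, n, r = 60, 40, 8
B = rng.standard_normal((m, r))
C0, C1 = rng.standard_normal((n, r)), rng.standard_normal((n, r))
A = lambda t: B @ (C0 + t * C1).T    # rank at most r for every t
U0, Rb = np.linalg.qr(B)
V0, Rc = np.linalg.qr(C0)
S0 = Rb @ Rc.T                       # U0 @ S0 @ V0.T equals A(0) exactly
U1, S1, V1 = ksl_step(U0, S0, V0, A(1.0) - A(0.0))
print(np.linalg.norm(U1 @ S1 @ V1.T - A(1.0)))  # near machine precision
```

The printed residual illustrates the exactness property established in Section 4: when $A(t)$ has rank at most $r$ for all $t$ and $Y_0 = A(t_0)$, one step reproduces $A(t_1)$ up to roundoff, even for a step size as large as $h = 1$.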
For a different ordering, such as computing subsequently $K$, $L$, $S$, the following surprising exactness result is not valid.

Theorem 4.1. Suppose that $A(t)$ has rank at most $r$ for all $t$. With the initial value $Y_0 = A(t_0)$, the splitting algorithm of Section 3.2 is then exact: $Y_1 = A(t_1)$.

Proof. We decompose $A(t) = U(t) S(t) V(t)^\top$, where both $U(t)$ and $V(t)$ have $r$ orthonormal columns, and $S(t)$ is an $r \times r$ matrix. We assume that $V(t_1)^\top V(t_0)$ is invertible. If this is not satisfied, then we make the following argument with a small perturbation of $A(t_1)$ such that $V(t_1)^\top V(t_0)$ becomes invertible, and let the perturbation tend to zero in the end.

The first substep of the algorithm, starting from $Y_0 = U_0 S_0 V_0^\top = U(t_0) S(t_0) V(t_0)^\top = A(t_0)$, yields

    $U_1 \widehat S_1 = A(t_1) V_0 = U(t_1) S(t_1) \left( V(t_1)^\top V(t_0) \right)$,

so that the range of

    $A_1 = A(t_1) = \left( U(t_1) S(t_1) \right) V(t_1)^\top = U_1 \widehat S_1 \left( V(t_1)^\top V(t_0) \right)^{-1} V(t_1)^\top$

is contained in the range of $U_1$, and hence we have

    $U_1 U_1^\top A_1 = A_1$, as well as $A_0 V_0 V_0^\top = A_0$.

Using the formulas of the splitting scheme we then calculate

    $Y_1 = U_1 S_1 V_1^\top$
        $= U_1 \widetilde S_0 V_0^\top + U_1 U_1^\top \Delta A$
        $= U_1 \widehat S_1 V_0^\top - U_1 U_1^\top \Delta A\, V_0 V_0^\top + U_1 U_1^\top \Delta A$
        $= U_0 S_0 V_0^\top + \Delta A\, V_0 V_0^\top - U_1 U_1^\top \Delta A\, V_0 V_0^\top + U_1 U_1^\top \Delta A$
        $= A_0 + A_1 V_0 V_0^\top - A_0 - A_1 V_0 V_0^\top + U_1 U_1^\top A_0 + A_1 - U_1 U_1^\top A_0 = A_1$,

which is the stated result.

Consider now a small perturbation to a matrix of rank less than $r$: with a small parameter $\varepsilon$, assume that (with primes as notational symbols, not derivatives)

    $A(t) = A'(t) + \varepsilon A''(t)$ with $\mathrm{rank}(A'(t)) = q < r$,

where $A'$ and $A''$ and their derivatives are bounded independently of $\varepsilon$. We factorize

    $A'(t) = U'(t) S'(t) V'(t)^\top$

with $U'(t)$ and $V'(t)$ having $q$ orthonormal columns and with an invertible $q \times q$ matrix $S'(t)$.
We apply the splitting integrator for the dynamical rank-$r$ approximation of $A(t)$ with starting value

    $Y_0 = A'(t_0) + \varepsilon A_0''$, $\quad \mathrm{rank}(Y_0) = r$,

where $A_0''$ is bounded independently of $\varepsilon$ (but may differ from $A''(t_0)$). We compare the result of the rank-$r$ algorithm with that of the rank-$q$ algorithm starting from

    $\bar Y_0 = A'(t_0) + \varepsilon \bar A_0''$, $\quad \mathrm{rank}(\bar Y_0) = q < r$.

Theorem 4.2. In the above situation, let $Y_n$ and $\bar Y_n$ denote the results of $n$ steps of the splitting integrator for the rank-$r$ approximation and rank-$q$ approximation, respectively, applied with step size $h$. Then, as long as $t_0 + nh \le T$,

    $\|Y_n - \bar Y_n\| \le C (\varepsilon + h)$,

where $C$ is independent of $n$, $h$ and $\varepsilon$ (but depends on $T - t_0$).

We note that by the standard error estimates of splitting methods, the integration error of the rank-$q$ approximation is $\bar Y_n - \bar Y(t_n) = O(h^p)$, uniformly in $\varepsilon$, for an integrator of order $p$. Furthermore, it follows from the over-approximation lemma in [5], Section 5.3, that the difference of the rank-$r$ and rank-$q$ approximations is bounded by $Y(t) - \bar Y(t) = O(\varepsilon)$.

Proof. (a) We factorize

    $Y_0 = U_0 S_0 V_0^\top$,

where $U_0 = (U_0', U_0'') \in \mathbb{R}^{m \times r} = \mathbb{R}^{m \times q} \times \mathbb{R}^{m \times (r-q)}$ and $V_0 = (V_0', V_0'') \in \mathbb{R}^{n \times r} = \mathbb{R}^{n \times q} \times \mathbb{R}^{n \times (r-q)}$ have orthonormal columns. $S_0$ is chosen as an $r \times r$ matrix in block-diagonal form

    $S_0 = \begin{pmatrix} S_0' & 0 \\ 0 & S_0'' \end{pmatrix}$ with $S_0' = S'(t_0)$ and $S_0'' = O(\varepsilon)$.

We consider the differential equation in the first splitting step, $\dot Y_{\mathrm{I}}(t) = \dot A(t)\, P_{R(Y_{\mathrm{I}}^\top(t))}$. We factorize (omitting the subscript I and the argument $t$)

    $Y = U S V^\top$,

where $U$ and $V$ have $r$ orthonormal columns, so that we have the equation

    $\dot U S V^\top + U \dot S V^\top + U S \dot V^\top = \dot A\, V V^\top$.    (13)

As is shown in the proof of Lemma 5.4 of [5] (see also [2]), the decomposition becomes unique if we impose that $S$ stays block-diagonal,

    $S = \begin{pmatrix} S' & 0 \\ 0 & S'' \end{pmatrix}$,

and
    $U^\top \dot U = H, \qquad V^\top \dot V = K$

with $r \times r$ matrices of the block form

    $H = \begin{pmatrix} 0 & H_{12} \\ H_{21} & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & K_{12} \\ K_{21} & 0 \end{pmatrix}$,

which are skew-symmetric, $H_{21} = -H_{12}^\top$ and $K_{21} = -K_{12}^\top$. With the corresponding decompositions $U = (U', U'')$ and $V = (V', V'')$, the proof of Lemma 5.4 of [5] yields that

    $H_{12} = -(S')^{-\top} V'^\top \dot A^\top U'' + O(\varepsilon), \qquad K_{12} = -(S')^{-1} U'^\top \dot A\, V'' + O(\varepsilon)$,

and in particular $H_{12} = O(1)$, $K_{12} = O(1)$. Multiplying (13) with $U^\top$ from the left and with $V$ from the right, we obtain

    $H S + \dot S + S K^\top = U^\top \dot A\, V$,

which yields on the diagonal blocks

    $\dot S' = U'^\top \dot A\, V', \qquad \dot S'' = U''^\top \dot A\, V''$.

From (13) we further obtain

    $\tfrac{d}{dt}(U' S') + U'' S'' K_{12}^\top = \dot A\, V'$

and, using also the above equation for $\dot S'$,

    $S' \dot V'^\top + H_{12} S'' V''^\top = U'^\top \dot A\, V'' V''^\top$.

We show in part (b) of the proof below that $U''^\top \dot A\, V'' = O(\varepsilon + h)$ (but we cannot conclude the same for $U'^\top \dot A\, V''$). In summary, we get (indicating now the subscript I)

    $\dot S_{\mathrm{I}}' = U_{\mathrm{I}}'^\top \dot A\, V_{\mathrm{I}}', \qquad \dot S_{\mathrm{I}}'' = O(\varepsilon + h)$
    $\tfrac{d}{dt}(U_{\mathrm{I}}' S_{\mathrm{I}}') = \dot A\, V_{\mathrm{I}}' + O(\varepsilon + h)$
    $S_{\mathrm{I}}' \dot V_{\mathrm{I}}'^\top = U_{\mathrm{I}}'^\top \dot A\, V_{\mathrm{I}}'' V_{\mathrm{I}}''^\top + O(\varepsilon + h)$.

For the second step in the splitting we obtain similarly

    $\dot S_{\mathrm{II}}' = -U_{\mathrm{II}}'^\top \dot A\, V_{\mathrm{II}}', \qquad \dot S_{\mathrm{II}}'' = O(\varepsilon + h)$
    $\dot U_{\mathrm{II}}' S_{\mathrm{II}}' = -U_{\mathrm{II}}'' U_{\mathrm{II}}''^\top \dot A\, V_{\mathrm{II}}' + O(\varepsilon + h)$
    $S_{\mathrm{II}}' \dot V_{\mathrm{II}}'^\top = -U_{\mathrm{II}}'^\top \dot A\, V_{\mathrm{II}}'' V_{\mathrm{II}}''^\top + O(\varepsilon + h)$,

and for the third step we obtain

    $\dot S_{\mathrm{III}}' = U_{\mathrm{III}}'^\top \dot A\, V_{\mathrm{III}}', \qquad \dot S_{\mathrm{III}}'' = O(\varepsilon + h)$
    $\dot U_{\mathrm{III}}' S_{\mathrm{III}}' = U_{\mathrm{III}}'' U_{\mathrm{III}}''^\top \dot A\, V_{\mathrm{III}}' + O(\varepsilon + h)$
    $\tfrac{d}{dt}(S_{\mathrm{III}}' V_{\mathrm{III}}'^\top) = U_{\mathrm{III}}'^\top \dot A + O(\varepsilon + h)$.

Comparing these equations with those for $\bar S$, $\bar U$, $\bar V$ in the rank-$q$ splitting method, it follows that

    $S_1' = S_{\mathrm{III}}'(t_1) = \bar S_1 + O(h\varepsilon + h^2)$
    $U_1' = U_{\mathrm{III}}'(t_1) = \bar U_1 + O(h\varepsilon + h^2)$
    $V_1' = V_{\mathrm{III}}'(t_1) = \bar V_1 + O(h\varepsilon + h^2)$.

Since stable error propagation in the rank-$q$ splitting method is obtained by the standard argument using the Lipschitz continuity of the right-hand side of the differential equation (uniformly in $\varepsilon$), the assertion of the theorem follows.

(b) We decompose the rank-$q$ matrix

    $A'(t) = \hat U'(t) \hat S'(t) \hat V'(t)^\top$ with $\hat U'(t)^\top \dot{\hat U}'(t) = 0$, $\hat V'(t)^\top \dot{\hat V}'(t) = 0$

and $\hat U'(t_0) = U_0' + O(\varepsilon + h)$, $\hat V'(t_0) = V_0' + O(\varepsilon + h)$. This choice of initial factors is possible because of the condition $A'(t_0) = Y_0 + O(\varepsilon + h)$, and the factorization at later $t$ exists by solving the rank-$q$ differential equations (5). For $t_0 \le t \le t_0 + h$ we have $U'(t) = U'(t_0) + O(h) = \hat U'(t_0) + O(\varepsilon + h) = \hat U'(t) + O(\varepsilon + h)$ and $V'(t) = \hat V'(t) + O(\varepsilon + h)$, and hence

    $\dot A'(t) = \dot{\hat U}'(t) \hat S'(t) V'(t)^\top + U'(t) \dot{\hat S}'(t) V'(t)^\top + U'(t) \hat S'(t) \dot{\hat V}'(t)^\top + O(\varepsilon + h)$.

Since $U''(t)^\top U'(t) = 0$ and $V''(t)^\top V'(t) = 0$, this yields the desired bound

    $U''(t)^\top \dot A(t)\, V''(t) = O(\varepsilon + h)$,

and the proof is complete.

5 Numerical experiments

5.1 Problem setting

The example is taken from [5]. We generate time-dependent matrices $A(t)$ as

    $A(t) = Q_1(t) \left( A_1 + e^t A_2 \right) Q_2(t)$,

where the orthogonal matrices $Q_1(t)$ and $Q_2(t)$ are generated as the solutions of the differential equations

    $\dot Q_i = T_i Q_i, \qquad i = 1, 2$,

with random skew-symmetric matrices $T_i$. The matrices $A_1$, $A_2$ are generated as follows.
First, we generate a $10 \times 10$ matrix as an identity matrix plus a matrix with random entries distributed uniformly over $[0, 0.5]$. This matrix is then set as the leading block of a $100 \times 100$ matrix. After that, we add a perturbation to this enlarged matrix. The perturbation is generated as a matrix with uniformly distributed entries on $[0, \varepsilon]$. For small $\varepsilon$ this matrix is close to a rank-10 matrix. If the rank of the approximation is chosen larger than 10, the norm of $S^{-1}$ in (5) will be large, and this leads to instability with standard time discretizations.

5.2 Numerical comparisons

We compare the following schemes. The first scheme is the standard implicit midpoint rule combined with fixed-point iteration, applied directly to the system (5). We test the proposed splitting schemes with different orderings of the splitting and with/without symmetrization. We denote by KSL the scheme of Section 3.2, where first $K = US$ is updated, then $S$, then $L = VS^\top$. By KLS we denote the scheme where first $K$, then $L$, then $S$ are updated. We thus consider the following methods:

1. KLS scheme (first order)
2. KLS scheme with symmetrization (second order)
3. KSL scheme (first order)
4. KSL scheme with symmetrization (second order)

We are interested in the approximation errors $\|Y(t) - A(t)\|$ for each particular scheme and different values of $r$ and $\varepsilon$. The results are presented in Figure 1, where we also plot the error of the best rank-$r$ approximation to $A(t)$ computed by SVD. Note that both KSL schemes perform remarkably better in the over-approximation case (subplot d). The midpoint rule is unstable in case d), whereas both KLS schemes have significantly higher error than the KSL schemes. All computations are done with constant step size $h = 10^{-3}$.

To test the convergence properties of the schemes with respect to the time step $h$, we have computed the numerical order of the different schemes using the Runge rule:

    $\|y(h) - y(h/2)\| \, / \, \|y(h/2) - y(h/4)\| \approx 2^p$

in the case of order $p$.

Table 1: $\varepsilon = 10^{-3}$, $r = 10$

                  p        Appr. err.
    Midpoint      2.0023   0.2200
    KLS           1.0307   1.8133
    KLS (symm)    1.8226   0.2215
    KSL           1.0089   0.2188
    KSL (symm)    2.005    0.2195

Table 2: $\varepsilon = 10^{-6}$, $r = 10$

                  p        Appr. err.
    Midpoint      2.0024   0.0188
    KLS           1.0309   1.8030
    KLS (symm)    1.8231   0.0324
    KSL           1.0082   0.0002
    KSL (symm)    2.0049   0.0002

Table 3: $\varepsilon = 10^{-3}$, $r = 20$

                  p        Appr. err.
    Midpoint      0.0001   0.1006
    KLS           0.8154   1.4224
    KLS (symm)    1.4911   0.3142
    KSL           1.0354   0.0913
    KSL (symm)    1.9929   0.0913

Table 4: $\varepsilon = 10^{-6}$, $r = 20$

                  p        Appr. err.
    Midpoint      -        failed
    KLS           0.9633   1.3435
    KLS (symm)    0.3127   1.5479
    KSL           1.0362   9.1316e-05
    KSL (symm)    1.993    9.1283e-05

The approximation error listed is the error at $t = 1$ with respect to the given matrix $A(t)$, not with respect to the rank-$r$ solution $Y(t)$ of the differential equations (5). In the over-approximation
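The matrices of Section 5.1 can be generated along these lines. A hedged NumPy sketch, in which the random seed, the scalings, and the eigendecomposition-based matrix exponential used to form $Q_i(t) = e^{t T_i}$ are our illustrative choices rather than details fixed by the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1e-3

def skew(n):
    """Random skew-symmetric n x n matrix."""
    M = rng.standard_normal((n, n))
    return M - M.T

def expm_skew(T, t):
    """exp(t*T) for skew-symmetric T via the (complex) eigendecomposition;
    the result is a real orthogonal matrix."""
    w, P = np.linalg.eig(T)
    return (P @ np.diag(np.exp(t * w)) @ np.linalg.inv(P)).real

def near_low_rank(eps):
    """100 x 100 matrix: identity-plus-random 10 x 10 leading block,
    perturbed by entries uniform on [0, eps], as in Section 5.1."""
    M = np.zeros((100, 100))
    M[:10, :10] = np.eye(10) + 0.5 * rng.random((10, 10))
    return M + eps * rng.random((100, 100))

A1, A2 = near_low_rank(eps), near_low_rank(eps)
T1, T2 = skew(100), skew(100)
A = lambda t: expm_skew(T1, t) @ (A1 + np.exp(t) * A2) @ expm_skew(T2, t)

def best_err(M, r):
    """Frobenius error of the best rank-r approximation (tail of the SVD)."""
    s = np.linalg.svd(M, compute_uv=False)
    return np.sqrt((s[r:] ** 2).sum())

err10, err20 = best_err(A(1.0), 10), best_err(A(1.0), 20)
print(err10, err20)  # r = 20 can only improve on r = 10
```

With $\varepsilon$ small, the best-approximation error for $r = 10$ is already at the level of the perturbation, and raising the rank to $r = 20$ gains little; this is precisely the over-approximation regime in which $S^{-1}$ blows up for the system (5) but the KSL splitting integrator remains well behaved.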