ebook img

A Strictly Single-Site DMRG Algorithm with Subspace Expansion PDF

0.64 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview A Strictly Single-Site DMRG Algorithm with Subspace Expansion

A Strictly Single-Site DMRG Algorithm with Subspace Expansion C. Hubig,1,∗ I. P. McCulloch,2 U. Schollwöck,1 and F. A. Wolf1 1Department of Physics and Arnold Sommerfeld Center for Theoretical Physics, Ludwig-Maximilians-Universität München, Theresienstrasse 37, 80333 München, Germany 2Centre for Engineered Quantum Systems, School of Physical Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia (Dated: 14th April 2015) We introduce a strictly single-site DMRG algorithm based on the subspace expansion of the AlternatingMinimalEnergy(AMEn)method. TheproposednewMPSbasisenrichmentmethodis sufficienttoavoidlocalminimaduringtheoptimisation,similarlytothedensitymatrixperturbation method, but computationally cheaper. Each application of Hˆ to |Ψ(cid:105) in the central eigensolver is 5 reducedincostforaspeed-upof≈(d+1)/2,withdthephysicalsitedimension. Furtherspeed-ups 1 result from cheaper auxiliary calculations and an often greatly improved convergence behaviour. 0 Runtime to convergence improves by up to a factor of 2.5 on the Fermi-Hubbard model compared 2 to the previous single-site method and by up to a factor of 3.9 compared to two-site DMRG. The r method is compatible with real-space parallelisation and non-abelian symmetries. p A 2 I. INTRODUCTION matrix wavefunction formalism (CWF),14 we achieve a 1 speed-up of ≈ (d + 1)/2 during each application of Hˆ Since its introduction in 1993,1,2 the Density Matrix to |Ψ(cid:105) in the eigensolver during the central optimisation ] Renormalisation Group method (DMRG) has seen tre- routine, where d is the dimension of the physical state l e mendous use in the study of one-dimensional systems.3,4 space on each site. - r Variousimprovementssuchasreal-spaceparallelisation,5 The layout of this paper is as follows: Section II will st the use of abelian and non-abelian symmetries6 and establish the notation. Section III will recapitulate the . multi-grid methods7 have been proposed. Most density matrix perturbation method and the CWF. Sec- t a markedly, the introduction8 of density matrix perturb- tion IV will introduce the subspace expansion method m ation steps allowed the switch from two-site DMRG to and the heuristic expansion term with a simple two- - single-siteDMRGin2005,whichprovidedamajorspeed- spin example. The strictly single-site DMRG algorithm d up and improved convergence in particular for systems (DMRG3S) will be presented in Section V alongside a n o with long-range interactions. comparison with the existing CWF. As both the original c Nevertheless, despite some progress,9–11 (nearly) two- perturbation method and the heuristic subspace expan- [ dimensional systems, such as long cylinders, are still a sionrequireamixingfactor α,8 SectionVIdescribeshow hardproblemforDMRG.Themainreasonforthisisthe to adaptively choose α for fastest convergence. Numer- 2 v differentscalingofentanglementduetothearealaw:12,13 ical comparisons and examples will be given in Section 4 In one dimension, entanglement and hence matrix di- VII. 0 mensions in DMRG are essentially size-independent for 5 ground states of gapped systems, whereas in two dimen- 5 sions, entanglement grows linearly and matrix dimen- II. DMRG BASICS 0 sions roughly exponentially with system width. . 1 As a result, the part of the Hilbert space considered Thenotationestablishedherecloselyfollowsthereview 0 by DMRG during its ground state search increases dra- article Ref. 4. Consider a state |Ψ(cid:105) of a system of l sites. 5 matically, resulting mainly in three problems: firstly, Eachsitehasaphysicalstatedimensiond , e.g.∀i:d = 1 i i the DMRG algorithm becomes numerically more chal- 3, l=50 for a system of 50 S =1 spins: : v lenging as the sizes of matrices involved grow (we will (cid:88) Xi assume matrix-matrix multiplications to scale as O(m3) |Ψ(cid:105)= cσ1...σl|σ1...σl(cid:105) . (1) throughout the paper). Secondly, the increased search σ1...σl r a space size makes it more likely to get stuck in local min- Inpractice,thedimensionofthephysicalbasisisusually ima. Thirdly, while sequential updates work well in 1-D constant, ∀i : d = d, but we will keep the subscript to i chains with short-range interactions, nearest-neighbour refer to one specific basis on site i where necessary. sites in the 2-D lattice can be separated much farther in It is then possible to decompose the coefficients the DMRG chain. Therefore, improvements to the core c as a series of rank-3 tensors M ,...,M of size σ1,...,σl 1 l DMRG algorithm are still highly worthwhile. (d ,m ,m ) respectively, withm =m =1. Thecoef- i i−1 i 0 l In this paper, we will adopt parts of the AMEn ficientc canthenbewrittenasthematrixproduct σ1,...,σl method15 developed in the tensor train/numerical lin- of the corresponding matrices in M ,...,M : 1 l ear algebra community to construct a strictly single-site (cid:88) |Ψ(cid:105)= Mσ1···Mσl|σ ...σ (cid:105) . (2) DMRGalgorithmthatworkswithoutaccessingthe(full) 1 l 1 l reduceddensitymatrix. Comparedtotheexistingcenter- σ1...σl(cid:124) cσ1(cid:123).(cid:122)..σl (cid:125) 2 The maximal dimension m = max {m } is called the right MPS basis of M is changed in some way. Depend- i i i MPS bond dimension. In typical one-dimensional calcu- ing on the exact implementation, updates may work on lations, m=200, but for e.g. 32×5 cylinders, m>5000 one (single-site DMRG) or two sites (two-site DMRG) is often necessary. It is in these numerically demanding at a time. The enrichment step may be missing or im- cases that our improvements are of particular relevance. plemented via Density Matrix Perturbation or Subspace Similarly, a Hamiltonian operator can be written as a Expansion. matrix product operator (MPO),whereeachtensorW is i now of rank 4, namely (d ,d ,w ,w ): i i i−1 i III. PERTURBATION STEP AND Hˆ = (cid:88) Wσ1τ1···Wσlτl|σ ...σ (cid:105)(cid:104)τ ...τ |. (3) CENTERMATRIX WAVEFUNCTION 1 l 1 l 1 l FORMALISM (CWF) σ1...σl τ1...τl A. Convergence Problems of Single-Site DMRG w = max {w } is called the MPO bond dimension. We i i willusuallyassumethatformosti,m =mandw =w. i i In practice, this holds nearly everywhere except at the During single-site DMRG, only a single MPS tensor ends of the chain, where the mi grow exponentially from Mi on site i is optimised at once. Compared to two- 1 to m. The basis of M (W ) of dimension m (w ) site DMRG, the search space is reduced by a factor of i i i−1 i−1 iscalledtheleft-handside(LHS)basis,whereasthebasis d ≈ 2...5, leading to a speed-up of at least O(d) per of dimension m (w ) is the right-hand side (RHS) basis iteration.8 However, since the left and right bases of the i i ofthistensor. Forsimplicity,mi,di andwi canalsorefer tensorsMiarefixedanddefinedbytheenvironment(li−1 to the specific basis (and not only its dimension) when andri+1),thisapproachislikelytogetstuck. Whilealso unambiguous. occurringiftherearenosymmetriesimplementedonthe Instead of M , we will also write A (B ) for a left level of the MPS, this issue is most easily visible if one i i i (right) normalised MPS tensor: considers U(1) symmetries:4 assume that all basis states to the right of the RHS bond of M transform as some i (cid:88)Aσi†Aσi =I (4) quantum number sz. If we now target a specific sector, i i e.g. S = 0 overall, then on the LHS of this bond (i.e. σi from tzhe left edge up to and including M ), all states (cid:88)BiσiBiσi† =I . (5) must transform as −sz. In this configuratiion, it is im- σi possible for a local change of Mi to add a new state that transforms as, say, s(cid:48), to its right basis states, as there If we then define the contractions z would be nocorresponding state −s(cid:48) to therightof that z l =(cid:0)Aσ1···Aσi−1Mσi(cid:1)∈(d ,...,d ,m ) (6) bond, rendering the addition of the state moot from the i 1 i−1 i 1 i i perspectiveofthelocaloptimiser,asitsnormwillbezero ri =(cid:0)MiσiBiσ+i+11···Blσl(cid:1)∈(mi−1,di,...,dl) , (7) identically. A concrete example of this issue is given in Section VIIA. we can rewrite |Ψ(cid:105) from (2) as DMRG is a variational approach on the state space available to MPS of a given bond dimension. As such, (cid:88) |Ψ(cid:105)= liri+1|σ1...σi(cid:105)⊗|σi+1...σl(cid:105) . (8) the algorithm must converge into either the global or a σ1...σl local minimum of the energy in this state space. Hence, we will call all cases where DMRG converges on an en- Thatis,whenonlyconsideringonespecificbond(i,i+1), ergysubstantiallyhigherthantheminimalenergyachiev- the left and right MPS bases at this bond are built up able with the allowed MPS bond dimension cases where from the states generated by the MPS tensor chains to DMRG is stuck in local minima. the left and right of the bond. Individual elements of an MPS basis are therefore called “state”. Furthermore, define L = 1 and L = L A†W A 0 i i−1 i i i B. Density Matrix Perturbation with summation over all possible indices. Similarly, R = 1 and R = R B†W B . With these contrac- l+1 i i+1 i i i This convergence problem has been solved by White tions, it is possible to write (2005).8 In the following, we will assume a left-to-right sweep, sweeping in the other direction works similarly, (cid:104)Ψ|Hˆ|Ψ(cid:105)=L M†W M R (9) i−1 i i i i+1 but on the left rather than right bonds. After the local optimisationofthetensorM ,thereduceddensitymatrix for any i∈[0,l]. i DMRG then works by sweeping over the system mul- ρ =l M M†l† (10) tiple times. During each sweep, each site tensor M is i,R i−1 i i i−1 i sequentially updated once with each update consisting of is built on the next bond. This is the reduced density one optimisation step via e.g. a sparse eigensolver and matrix resulting from tracing out the part of the system possibly one enrichment step during which the left or to the left of bond (i,i+1). 3 ρ is then perturbed as small cylinders. However, both the cost of the applica- i,R tions of Hˆ to |Ψ(cid:105) as O(w(d2+d)m3) as well as the large (cid:16) (cid:17) ρ →ρ(cid:48) =ρ +αTr L ρ L† . (11) density matrix ρ∈(dm,dm) cause problems if m and w i,R i,R i,R i i,R i become large. The new ρ(cid:48) is then used to decide on a new set of i,R basisstatesontheRHSofM , withtheinversemapping i IV. SUBSPACE EXPANSION from the new to the old basis being multiplied into each componentofB . Themixingfactorαisasmallscalar i+1 The idea of using subspace expansion instead of dens- used to control the perturbation. A new scheme to find ity matrix perturbation originates15,16 in the tensor the optimal choice of α is discussed in Section VI. train/numerical linear algebra community. There, a stringent proof was given regarding the convergence properties of this method when the local tensor Z of C. Centermatrix Wavefunction Formalism (CWF) i the residual In a standard single-site DMRG calculation, the re- |Z(cid:105)≡Hˆ|Ψ(cid:105)−E|Ψ(cid:105)= (cid:88) Zσ1···Zσl|σ ...σ (cid:105) (13) duced density matrix ρ is never used. More import- 1 l 1 l i,R σ1...σl antly, even building ρ on a given bond (i,i+1) will i,R not yield a density matrix that can be used in (11), as it isusedastheexpansionterm. Here, wewillonlyusethe onlycontainsthem statesexistingonthatbondalready methodofsubspaceexpansionandsubstituteanumeric- i without knowledge of the m states on the bond one ally much more cheaply available expansion term. i−1 step to the left. In other words, it is not possible to Thefollowingsectionisdividedintothreeparts: firstly, choose the optimal set m based only on m , rather, one we will explain the concept of subspace expansion acting (cid:102)i i requires also di and mi−1. on two neighbouring MPS tensors Mi, Mi+1. Secondly, The centermatrix wavefunction formalism14 was de- the expansion term employed in DMRG3S is introduced veloped to cope with this problem. Given a site tensor and motivated. Thirdly, a simple example is described. M ∈(d ,m ,m )onaleft-to-rightsweep,itintroduces i i i−1 i a “centermatrix” C ∈ (d m ,m ) and replaces the i,R i i−1 i original site tensor as A. Subspace Expansion with an Arbitrary Expansion Term M →A ∈(d ,m ,d m ) s.t. M =A C . (12) i i i i−1 i i−1 i i i,R Inthefollowing,wewilldescribesubspaceexpansionof Ai is constructed to be left-orthogonal and is essentially the RHS basis of the current working tensor, as it would an identity matrix mapping the left basis mi−1 and the occur during a left-to-right sweep. physical basis di onto a complete basis containing all Assume a state |Ψ(cid:105) described by a set of tensors dimi−1 states on its right. The new basis is “complete” {A1,...,Ai−1,Mi,Bi+1,...,Bl}. At the bond (i,i+1), in the sense that all states reachable from the left bond we can then decompose the state as a sum over left and basis mi−1 and the local physical basis di are contained right basis states as in Eq. (8). within it. Nowweexpand thetensorM ∈(d,m ,m )bysome i i−1 i The contents of Mi are placed in Ci,R accordingly and expansion term Pi ∈ (d,mi−1,mPi) for each individual theoriginalstateremainsunchanged. Thereduceddens- physical index component: ity matrix is then ρ = C C† and has access to all dimi−1 states, as reqi,uRired aib,Rovei.,RA perturbation of ρi,R Miσi →M˜iσi =(cid:2)Miσi Piσi(cid:3) . (14) according to (11) hence allows the introduction of new This effectively expands the RHS MPS basis of M from states. i m to m +m . Similarly, expand the components of TheDMRGoptimisationstepcanworkonCi,R alone, Bi ∈(di,m ,mPi ) with zeros: with L built prior to optimisation of C from the i+1 i i+1 i i,R expanded Ai. During each eigensolver step, the ef- (cid:20)Bσi+1(cid:21) fective Hamiltonian on site i has to be applied onto Bσi+1 →B˜σi+1 = i+1 . (15) i+1 i+1 0 C . The application is done by contraction of L ∈ i,R i (w,dimi−1,dimi−1), Ri+1 ∈ (w,mi,mi) and Ci,R ∈ The appropriately-sized block of zeros only multiplies (dimi−1,mi) at cost O(w(d2+d)m3) per step. After op- with the expansion term Piσi. In terms of a decomposi- timisation, the perturbation is added. Its computational tion as in (8), this is equivalent to costisdominatedbythecalculationofαTr{L ρ L†}at i i,R i (cid:20) (cid:21) O(wd3m3). The bond between Ai and Ci,R can then be |Ψ(cid:105)= (cid:88) [l p] ri+1 |σ ...σ (cid:105)⊗|σ ,...,σ (cid:105) (16) truncated down to m using ρ(cid:48) and the remaining parts i 0 1 i i+1 l of C are multiplied into Bi,R to the right. σ1,...,σl i,R i+1 The resulting algorithm converges quickly for one- where p is the result of multiplying l and P , with i−1 i dimensional problems and performs reasonably well for the 0 in the second expression similarly resulting from 4 the 0 in B . While the state |Ψ(cid:105) remains unchanged, newstatesanddisturbtheresultoftheeigensolver,which i+1 the local optimiser on the new site B can now choose isoptimalatthisspecificvalueofmbutnotaneigenstate i+1 the initially-zero components differently if so required: of Hˆ yet. The necessary flexibility in the left-/right basis states to The cost of a single subspace expansion is O(wdm3+ escape local minima has been achieved without referring w2d2m2)forthecalculationofP ,potentiallyO(2dwm2) i to the density matrix. for the addition to M and B respectively and i i+1 Note that while orthonormality of Bi+1 is lost, we do O(dw2m3 +d2m2) for the SVD of an (dm,wm) matrix not need it between the enrichment step on site i and formed from M˜ . If we restrict the SVD to m singular the optimisation step on site i+1. The orthonormality i values, then the resulting matrices will be of dimension of M can be restored via singular value decomposition i (dm,m), (m,m) and (m,wm) respectively. Thefirst can asusual. Furthermore,itisusuallynecessarytotruncate the RHS basis of M˜ down from m +m to m imme- be reformed into A˜i at cost O(dm2) and the second and i i Pi third multiplied into B at cost O(m3dw+m3d). The diately following the expansion: this preserves the most i+1 total cost of this step is dominated by the cost of the relevant states of the expansion term while avoiding an SVDatO(dw2m3), whichisstillcheaperthanthecalcu- exponential explosion of bond dimensions. lation of the perturbation term in (11), not considering When sweeping from right to left, the left rather than theothercostsassociatedtousingthedensitymatrixfor right MPS basis of the current working tensor is expan- truncation. ded, with the left tensor A being zero-padded as op- i−1 posed to the right tensor B : i+1 Miσi →M˜iσi =(cid:20)MPσiσii(cid:21) (17) i C. Subspace Expansion at the Example of a Aσi−1 →A˜σi−1 =(cid:2)Aσi−1 0(cid:3) . (18) d=l=2 Spin System i−1 i−1 i−1 Inthefollowing,wewilldemonstrateandillustratethe B. Expansion Term method of subspace expansion at the simple example of a system of two spins with S = 1 from m = 1 to m = 2 Usingtheexactresidualastheexpansiontermiscom- as it would occur during a left-to2-right sweep. putationally expensive: The term Hˆ|Ψ(cid:105) can be updated Assume the Hamiltonian locally and is mostly unproblematic, but the subtraction of E|Ψ(cid:105) and subsequent re-orthonormalisation is costly and has to be done after each local optimisation, as the H =S1S2+S1S2+S1S2 (20) x x y y z z current value of E changes. This exact calculation is hence only possible for m ≈ 100, which is far too small = 1(cid:8)S1S2 +S1S2(cid:9)+S1S2 (21) 2 + − − + z z to tackle difficult two-dimensional problems. Instead, we propose the very cheaply available terms with MPO-components P =αL M W ∈(d ,m ,w m ) (19) i i−1 i i i i−1 i i (cid:20) (cid:21) 1 1 to be used during left-to-right sweeps and Pi = W = √ S √ S S (22) 1 + − z αR M W foruseduringright-to-leftsweepswithsome 2 2 i+1 i i scalar mixing factor α. In the regime where the exact (cid:20) 1 1 (cid:21)T residual can be computed, these terms work essentially W2 = √ S− √ S+ Sz . (23) 2 2 equally well. This expression for P can be heuristically motivated i as follows: (19) is equivalent to the partial projection of Let the initial state be an m = 1 MPS, described by H|Ψ(cid:105) onto |Ψ(cid:105) to the left of the current bond. Hence, in components thegroundstateandignoringnumericalerrors, theRHS basis of this Pi is identical to that of Mi. Truncation (cid:104)(cid:112) (cid:105) from m +m to m is then possible without inducing A↑ =[a] A↓ = 1−a2 (24) i Pi i 1 1 errors. (cid:104)(cid:112) (cid:105) Numerically, it seems possible to choose α arbitrarily B2↑ =[b] B2↓ = 1−b2 (25) large without hindering convergence or perturbing the state too much in simple (one-dimensional) problems. where square brackets denote matrices in the MPS bond However, if the chosen maximal bond dimension m is indices. Due to the standard normalisation constraints, insufficient to faithfully capture the ground state of the there are only two free scalar variables here, a and b. given system, α has to be taken to zero eventually to allow convergence. Otherwise, P will continuously add Subspace expansion of A is straightforward (keep in i 1 5 mind that L ≡1 for convenience): The inner loop sweeps over the system, iterating over 0 and updating the tensors on each site sequentially. Each (cid:88) Pτ1 = Wτ1σ1Aσ1 (26) local update during a left-to-right sweep (right-to-left 1 1 1 σ1 sweeps work analogously) consists of the following steps: P↑ =W↑↑A↑+W↑↓A↓ (27) 1 1 1 1 1 1. Optimise the tensor M : Use an eigensolver tar- (cid:104)√ (cid:105) i = 1√−a2 0 a (28) geting the smallest eigenvalue to find a solution 2 (M(cid:63),λ(cid:63)) to the eigenvalue problem i P↓ =W↓↑A↑+W↓↓A↓ (29) 1 1 1 1 1 =(cid:104)0 √a −√1−a2(cid:105) (30) Li−1Ri+1WiMi =λMi . (39) 2 λ(cid:63) is the new current energy estimate. This first resulting in A(cid:48)1 and B2(cid:48) directly after the expansion: step dominates the computational cost. (cid:104) √ (cid:105) A(cid:48)1↑ = a 1√−2a2 0 a (31) 2. Build αPi according to (19) using Mi(cid:63). Build an (cid:104)√ √ (cid:105) appropriately-sizedzeroblock0i+1afterthedimen- A(cid:48)↓ = 1−a2 0 √a − 1−a2 (32) sions of P are known. 1 2 i √ b  1−b2 3. Subspace-expand M(cid:63) → M˜(cid:63) with αP and B i i i i+1 B(cid:48)↑ =0 B(cid:48)↓ = 0  . (33) with 0i+1. 2 0 2  0  0 0 4. ApplyaSVDtoM˜(cid:63) andtruncateitsrightbasisto i m again, resulting in A˜(cid:63). i i Normalising A(cid:48) via a singular value decomposition as 1 A(cid:48) →A(cid:48)(cid:48)SV† and multiplying SV†B(cid:48) →B(cid:48)(cid:48) gives: 5. Multiply the remainder of the SVD (SV†) into 1 1 2 2 B →B˜ . A(cid:48)(cid:48)↑ =(cid:2)1 0(cid:3) (34) i+1 i+1 1 A(cid:48)(cid:48)↓ =(cid:2)0 1(cid:3) (35) 6. Build Li from A˜(cid:63)i, Li−1 and Wi. 1 (cid:34) √ (cid:35) a 1√−a2 0 a 7. Calculateanewenergyvalueaftertruncationbased SV† = √ 2 √ (36) onL ,B˜ ,W andR . Usethisenergyvalue 1−a2 0 √a − 1−a2 i i+1 i+1 i+1 2 and λ(cid:63) to adapt the current value of α (cf. Sec- (cid:20) (cid:21) ab tion VI). B(cid:48)(cid:48)↑ = √ (37) 2 1−a2b √ 8. Continue on site i+1. (cid:20) (cid:21) a 1−b2 B(cid:48)(cid:48)↓ = √ √ . (38) 2 1−a2 1−b2 Of these, step 2 and 3 implement the actual subspace expansion, whereas all others are identical to standard As expected, the final state |Ψ(cid:105) = (cid:80) A(cid:48)(cid:48)σ1B(cid:48)(cid:48)σ2 is single-site DMRG. σ1σ2 1 2 still entirely unchanged, but there is now a one-to-one It is important to note that the only change from correspondence between the four entries of B(cid:48)(cid:48) and the standard single-site DMRG is the addition of an en- 2 coefficients c in the computational basis, mak- richment step via subspace expansion. Therefore, this {↑,↓},{↑,↓} ing the optimisation towards cii =0,ci(cid:54)=j = √1 trivial. method does not interfere with e.g. real-space parallel- 2 ised DMRG,5,17 the use of nonabelian symmetries6,14 or multi-grid methods.7 V. STRICTLY SINGLE-SITE DMRG To analyse the computational cost, we have to take special care to ensure optimal ordering of the multiplic- ations during each eigensolver iteration in (39). The We can now combine standard single-site DMRG (e.g. problem is to contract L R W M , with L and Ref. 4, p. 67) with the subspace expansion method i−1 i+1 i i i−1 R ∈ (w,m,m), W ∈ (d,d,w,w) and M ∈ (d,m,m). as a way to enrich the local state space, leading to a i+1 i i The optimal ordering is then (((L M )W )R ): strictly single-site DMRG implementation (DMRG3S) i−1 i i i+1 that works without referring to the density matrix at 1. Contract L and M over the left MPS bond at any point. i−1 i cost O(mw·m·dm=m3wd). With the notation from Section II, the steps follow mostlystandardsingle-siteDMRG.Inanoutermostloop, 2. Multiply in W over the physical bond of M i i the algorithm sweeps over the system from left-to-right and the left MPO bond at cost O(m2·wd·dw = and right-to-left until convergence is reached. Criteria m2d2w2). forconvergencearee.g.diminishingchangesinenergyor an overlap close to 1 between the states at the ends of 3. FinallycontractwithR overtherightMPOand i+1 subsequent sweeps. MPS bonds at cost O(md·wm·m=m3dw). 6 Fig. 1 displays the individual steps within a single up- date from the energy perspective: Let ∆E denote the O gaininenergyduringtheoptimisationstepandlet∆E T denote the subsequent rise in energy during the trun- E i cation following the enrichment step. ∆E (cid:54)= 0 only T occursifsomeenrichment(eitherviadensitymatrixper- turbationorsubspaceexpansion)hasoccurred,otherwise there would be no need for any sort of truncation. We Optimisation can hence control the approximate value of ∆E via α, T y g which leads to a simple adaptive and computationally r ΔE ne O cheap algorithm: E Ef If ∆ET was very small or even negative (after chan- ging the optimised state by expansion of its right basis) during the current update, we can increase α during the Truncation ΔE T next update step on the next site. If, on the other hand, |∆E | ≈ |∆E |, that is, if the error incurred during E T O min truncation nullified the gain in energy during the optim- isation step, we should reduce the value of α at the next iteration to avoid making this mistake again. Time In practice, it seems that keeping ∆E ≈ −0.3∆E T O gives the fastest convergence. Given the order-of- Figure 1. (Colour online) Energies of the state at different magnitude nature of α, it is furthermore best to in- points during a single update: Before optimisation, the state crease/decrease it via multiplication with some factor has some initial energy E . Local optimisation via the eigen- i greater/smallerthan1asopposedtoaddingorsubtract- solver takes this energy down by ∆E to E . Subsequent O min ing fixed values. truncationcausesariseinenergyby∆E withthefinalvalue T Some special cases for very small ∆E (stuck in a at the end of this update being E . O f local minimum or converged to the ground state?) and ∆E >0 or ∆E <∆E have to be considered, mostly T T O The total cost of this procedure to apply Hˆ to |Ψ(cid:105) depending on the exact implementation. is O(2m3wd + d2m2w2). Assuming large d2w/m is Itisunclearwhetherthereisacausalrelationbetween small, this gives a speed-up in the eigensolver multiplic- the optimal choice of α and the ratio of ∆ET/∆EO or ations of (d+1)/2 over the CWF approach, which takes whetherboth simplycorrelate withaproceeding DMRG O(m3wd(d+1)). calculation: at the beginning, gains in energy are large andαisoptimallychosenlarge,whereaslateron,energy In addition to this speed-up, the subspace expansion decreases more slowly and smaller values of α are more isconsiderablycheaperthanthedensitymatrixperturb- appropriate. ation. Since the perturbation/truncation step can often take up to 30% of total computational time, improve- It is important to note that this is a tool to reach con- ments there also have a high impact. At the same time, vergence more quickly. If one is primarily interested in a thenumberofsweepsatlargemneededtoconvergedoes wavefunction representing the ground state, the calcula- notseemtoincreasecomparedtotheCWFapproach(cf. tionofanewαateachiterationcomesatessentiallyzero Section VII) and sometimes even decreases. cost. If, however, the aim is to extrapolate in the trun- cationerrorduringthecalculation,thenafixedvaluefor α is of course absolutely necessary. VI. ADAPTIVE CHOICE OF MIXING FACTOR VII. NUMERICAL EXAMPLES Both density matrix perturbation and subspace ex- pansion generally require some small mixing factor α to moderate the contributions of the perturbation terms. A. DMRG Stuck in a Local Minimum The optimal choice of this α depends on the number of states available and those required to represent the Inthissub-section,wewillgiveashortexampleofhow groundstate,aswellasthecurrentspeedofconvergence. DMRG can get stuck in a local minimum even on a very Too large values for α hinder convergence by destroying small system. Consider 20 S = 1 spins with isotropic 2 the improvements made by the local optimiser, whereas antiferromagnetic interactions and open boundary con- too small values lead to the calculation being stuck in ditions. The U(1) symmetry of the system is exploited local minima with vital states not added for the reasons on the MPS basis, with the overall S forced to be zero. z given in Section IIIB. The correct choice of α hence af- Theinitialstateisconstructedfrom20linearlyindepend- fectscalculationstoalargedegree,butisalsodifficultto ent states, all with 3 sites on the very right at S = 0.5 z estimate before the start of the calculation. and m=20 in total. The quantum number distribution 7 bosonicsystemwithanopticallatticepotential,aFermi- 5 HubbardmodelatU =1andquarter-fillingandasystem of free fermions at half-filling. 4 Each algorithm is run at three different values of 3 m=mmax,mmax/2,mmax/4 from the same initial state and run to convergence. This way, it is possible to both Sz 2 observe the behaviour of the methods at low and high ber accuracies. m 1 u The usual setup in DMRG calculations of starting at N m small m and increasing m slowly while the calculation u 0 nt progresses makes it unfortunately very difficult to com- a u Q -1 pare between the three methods. This is because differ- ent methods require different configurations to converge -2 optimally. We therefore restrict ourselves to fixed m throughout an entire calculation, even though all meth- -3 αIn =pu 0t odscouldbespedupfurtherbyincreasingmslowlydur- -4 α > 0 ing the calculation. 0 2 4 6 8 10 12 14 16 18 20 Errors in energy compared to a numerically exact ref- Bond Index i erence value E are plotted as a function of sweeps and 0 CPU time. It should be stressed that this error in en- Figure 2. (Colour online) The quantum number distribution ergy is not directly comparable to the truncation error ascountedfromtherightateachbondofal=20systemwith traditionally used in two-Site DMRG or the variance S = 12 andSztotal =0. Theartificialinputstateisshownwith (cid:104)Hˆ2(cid:105)−(cid:104)Hˆ(cid:105)2 sometimes considered in single-site DMRG. black circles. Two DMRG calculations have then been done Even small differences in energy can lead to vastly dif- onthisinputstate,oncewithnoenrichmentterm(α=0,red ferent physical states and reaching maximal accuracy in squares) and once with subspace expansion enabled (α (cid:54)= 0, energyiscrucialtoensurethatthetruegroundstatehas bluediamonds). Itisclearlyvisiblethatwithoutenrichment, DMRG3S can reduce some weights to zero, but cannot add been reached. new states – red only occurs together with black. As soon Furthermore, a traditional two-site DMRG (2DMRG) as enrichment is enabled, DMRG3S restores ±S symmetry calculation without perturbations is done and its error z andreflectivesymmetryoverthe10th bondandfindsamuch inenergyandruntimetoconvergenceiscomparedtothe better ground state. two single-site algorithms. Here, convergence is defined asanormalisedchangeinenergylessthan10−9 (form= m ) resp. 10−8 (for m < m ). The runtime to max max at each bond is plotted in Fig. 2 as black circles. convergence is the CPU time used until that energy was DMRG3Sisrunwithsubspaceexpansiondisabled,i.e. output by the eigensolver for the first time. α = 0 throughout the calculation. The algorithm “con- All calculations were performed on a single core of a verges” to some high-energy state at Eα=0 = −6.35479. Xeon E5-2650. Theresultingquantumnumberdistribution(redsquares in Fig. 2) shows clear asymmetry both between the left and right parts of the system and the +Sz and −Sz sec- 1. S =1 Heisenberg Chain tors at any given bond. It is also visible that while some states are removed by DMRG3S without enrichment, it Firstly, we consider a S = 1 Heisenberg spin chain cannot add new states: the red squares only occur to- with l =100 sites and periodic boundary conditions im- gether with the black filled circles from the input state. plemented on the level of the Hamiltonian as a simple If we enable enrichment via subspace expansion, i.e. link between the first and last site: take α (cid:54)= 0, DMRG3S quickly converges to a much bet- ter ground state at Eα(cid:54)=0 = −8.6824724. The quantum numbersarenowevenlydistributedbetweentheleft-and 100 right parts of the system and ±Sz symmetry is also re- Hˆ =(cid:88)Sˆi·Sˆ(i+1)%100 . (40) stored. i=1 U(1)symmetriesareexploitedandthecalculationsare forced in the S =0 sector. z B. Application to Physical Systems Thissystemisofparticularinterestas,firstly,itisone of the standard benchmarking systems with well-known In the following subsections, we will compare the two analytic values for the ground-state energy. Secondly, single-site DMRG algorithms CWF and DMRG3S when it is a one-dimensional system where the case of peri- applied to four different physical systems: a S =1 Heis- odic boundary conditions can still be tackled by DMRG. enberg spin chain with periodic boundary conditions, a The larger MPO bond dimension resulting from these 8 100 of 2.6,1.3 and 2.7 for m = 200,400 and 800 respect- DMRG3S; 800 ively between CWF and DMRG3S when considering the DMRG3S; 400 10-1 DMRG3S; 200 runtime to convergence. CWF; 800 Incomparison,the2DMRGalgorithmdoesnothandle CWF; 400 10-2 CWF; 200 theperiodicboundaryconditionswellandyieldsenergies higherthanthesingle-sitealgorithmswithperturbations 10-3 (cf. Tab. I). Runtime to convergence is hence not com- E|0 10-4 parable. /| E|0 E- 10-5 | 2. Dilute Bosons on an Optical Lattice 10-6 We carry on to study bosons in a modulated potential 10-7 of 10 unit cells, each with 16 sites. The cutoff for local 10-8 occupation numbers is nmax =5, resulting in a local site dimension of d=6. The Hamiltonian is given as 10-9 0 10 20 30 0 3000 6000 9000 12000 160 (cid:26) (cid:18) (cid:19) (cid:27) Half-Sweeps CPU Time [s] Hˆ =+(cid:88)nˆ cos2 2πi−0.5 +(nˆ −1) i 16 i i=1 Figure 3. (Colour online) Spin Chain Eq. (40): Normal- 159 ised error in energy as a function of sweeps (left) and CPU (cid:88)(cid:110) (cid:111) − cˆ†cˆ +h.c. . (41) timeused(right)ofthetwosingle-sitealgorithmsatdifferent i i+1 m = 200,400,800. DMRG3S shows both a speed-up and an i=1 improved convergence per sweep compared to CWF, with a This system should be fairly easy for DMRG to handle, long tail of slow convergence very visible for CWF at high as there are only nearest-neighbour interactions. How- accuracies. ever, the large-scale order due to the modulated poten- tial and a very small energy penalty paid for an uneven distribution of bosons was observed to cause badly con- Table I. Spin Chain Eq. (40): Normalised error in energy at convergence and runtime to convergence of all three meth- verged results.7 Manual checks of the states returned by ods. DMRG3S is consistently faster than CWF, whereas the each method were hence done to ensure a proper, equal energiesprovidedby2DMRGarenotcomparableinaccuracy. distribution of bosons throughout the whole system. m=200 m=400 m=800 Thestateisinitialisedwithn=80bosonsintotal. We DMRG3S Energy Error 2.1×10−6 1.0×10−7 7.1×10−9 allowm=50,100,200statesandusetheenergyreference valueE =−103.646757. Allalgorithmsconvergetothis CWF Energy Error 2.8×10−6 1.7×10−7 7.1×10−9 0 value at m=200. 2DMRG Energy Error 1.1×10−5 8.6×10−7 1.0×10−7 Fig. 4 compares CWF and DMRG3S whereas Tab. II DMRG3S Runtime 583s 1935s 3990s additionallylists2DMRG.Sincethebonddimensionsare CWF Runtime 1519s 2695s 11133s relatively small, wedo notexpecta speed-upfrom faster 2DMRG Runtime 762s 3181s 21963s numericaloperations. Instead,theimprovedconvergence behaviour per sweep is responsible for the speed-up of 2 of DMRG3S over CWF at small m. At larger m, CWF PBC similarly arises during the simulation of quasi two- converges better, but numerical operations also become dimensional systems as cylinders. The same applies cheaper for DMRG3S for a speed-up of 2 again. to the non-nearest-neighbour interactions in this system As there are no long-range interactions, 2DMRG also (between the first and last site) and cylindrical systems. fares well with regard to energy accuracy. However, it takes longer to converge than the single-site methods es- Fig. 3 compares the error in energy with respect to pecially at large m, mainly because the eigenvalue prob- thereferencevalueE =−140.148404forDMRG3Sand 0 lem in two-site DMRG is of dimension d larger than in CWF for m = 200,400,800 as a function of sweeps and single-siteDMRG.AcomparisonbetweenDMRG3Sand computation time. 2DMRG leads to a speed-up of up to 3.3 for the case of During the first three to four sweeps, DMRG3S ex- m=200. hibits a smaller convergence rate per sweep, however, compared to the first sweeps of CWF, they also cost negligible CPU time. Afterwards, DMRG3S offers com- parable (at medium accuracies) or much improved (at 3. Fermi-Hubbard Model highaccuracies)convergenceratepersweepascompared to CWF together with a still reduced average runtime As a third example, substantially more expensive cal- per sweep. Combined, these effects lead to a speed-up culations are carried out for a substantially stronger en- 9 100 100 DMRG3S; 200 DMRG3S; 1200 DMRG3S; 100 DMRG3S; 600 10-1 DMRG3S; 50 10-1 DMRG3S; 300 CWF; 200 CWF; 1200 CWF; 100 CWF; 600 10-2 CWF; 50 10-2 CWF; 300 10-3 10-3 E|0 10-4 E|0 10-4 /| /| E|0 E|0 E- 10-5 E- 10-5 | | 10-6 10-6 10-7 10-7 10-8 10-8 10-9 10-9 0 8 16 24 0 150 300 450 600 0 10 20 30 0 1500 3000 4500 6000 7500 9000 Half-Sweeps CPU Time [s] Half-Sweeps CPU Time [s] Figure 4. (Colour online) Bosonic System Eq. (41): Normal- Figure 5. (Colour online) Fermi-Hubbard Eq. (42): Normal- isederrorinenergyfromCWFandDMRG3Sasafunctionof ised error in energy from DMRG3S and CWF as a function sweeps(left)andCPUtimeused(right)form=50,100,200. ofsweeps(left)andCPUtimeused(right)fordifferentbond Again,animprovedconvergencebehaviourathighaccuracies dimensions m = 300,600,1200. The same basic behaviour can be observed, in particular at smaller values of m. The as for the previous systems is repeated, with both improved smallbonddimensionsleadtoasmallerspeed-upduetofaster convergencebehaviourathighaccuraciesandfasternumerical numericaloperations,whichonlybecomesvisibleatm=200. operations. Table II. Bosonic System Eq. (41): Normalised error in en- ergy at convergence and runtime to convergence of all three methods. DMRG3S is again the fastest method with a very Table III. Fermi-Hubbard Eq. (42): Normalised error in en- constantspeed-upof2overCWFandupto3.3over2DMRG. ergy at convergence and runtime to convergence of all three methods. Accuracies are comparable between the different m=50 m=100 m=200 methods, but runtimes vary greatly. DMRG3S Energy Error 2.9×10−6 4.8×10−8 <10−9 m=300 m=600 m=1200 CWF Energy Error 2.3×10−6 3.9×10−8 <10−9 DMRG3S Energy Error 1.5×10−6 7.5×10−8 <10−9 2DMRG Energy Error 1.9×10−6 2.8×10−8 <10−9 CWF Energy Error 1.5×10−6 7.6×10−8 <10−9 DMRG3S Runtime 124s 171s 469s 2DMRG Energy Error 1.3×10−6 6.4×10−8 <10−9 CWF Runtime 260s 397s 951s DMRG3S Runtime 474s 1367s 3955s 2DMRG Runtime 210s 462s 1550s CWF Runtime 1215s 3917s 10122s 2DMRG Runtime 727s 2950s 15596s tangledFermi-Hubbardmodelof100siteswithHamilto- nian   Hˆ =(cid:88)100− (cid:88) (cid:104)cˆ† cˆ +h.c.(cid:105)+nˆ nˆ  . (42) Tab. III summarises all three DMRG implementations. i,σ i+1,σ i,↑ i,↓   Sincethesystemonlyexhibitslocalinteractions,2DMRG i=1 σ=↑,↓ fares well and all methods generally provide compar- Both U(1) and U(1) symmetries are employed, able energies. The difference is therefore in the runtime charge Sz with 50 fermions and Stotal = 0 enforced through the needed to achieve these energies. Compared to CWF, z choice of initial state. Together with the free fermi- DMRG3S achieves a speed-up of ≈ 2.6 consistently at ons from the next section, we can use this system to all m, as the smallest m = 300 is already large enough study how criticality and increased entanglement affect to justify the assumption d2w (cid:28) m in the speed-up of the three methods. numerical operations. In particular, it continues to con- Calculations are done for m = 300,600,1200. All verge quickly at high accuracies whereas CWF develops methods converge to the same value E =−84.2555254 a long tail of slow convergence. The speed-up compared 0 at large m. to 2DMRG is smaller at lower values of m, but increases Fig. 5 compares the two single-site methods while to 3.9 at m=1200. 10 100 is the same as for the Fermi-Hubbard system, namely DMRG3S; 1200 m = 300,600,1200. We used E = −126.602 376 as DMRG3S; 600 0 10-1 DMRG3S; 300 the reference value, since all methods converged to this CWF; 1200 ground-state energy at m=1200. CWF; 600 10-2 CWF; 300 TheresultsinTab.IVandFig.6mostlyfollowthepre- vious results for locally interacting systems: Accuracies 10-3 of all methods are essentially identical, whereas time to E|0 convergence varies between the methods. At small m, E|/|0 10-4 tdhueeretoabreetstoemrceosnpveeerdg-eunpcseobfeDhaMviRoGur3pSeorvsewreCepW,wF,hlearregaeslya E- | significantadvantageofDMRG3Sbecomesvisibleatlar- 10-5 ger m, when numerical operations become cheaper com- pared to the CWF method. Correspondingly, the speed- 10-6 upfromCWFtoDMRG3Sincreasesfrom1.6atm=300 to 2 at m=1200. 10-7 Similarly, the larger numerical cost of two-site DMRG becomes more noticeable at larger m, with the speed-up 10-8 0 10 20 30 0 2000 4000 6000 8000 10000 between 2DMRG and DMRG3S increasing from 1.5 at Half-Sweeps CPU Time [s] m=300 to more than 6 at m=1200. Compared to the non-critical Fermi-Hubbard system Figure 6. (Colour online) Free Fermions Eq. (43): Norm- from Section VIIB3, we observe larger errors in energy alised error in energy from CWF and DMRG3S as a func- atfixedm,asexpected. Correspondingly,asmoreeigen- tion of sweeps (left) and CPU time used (right) at m = values contribute significantly, convergence of both the 300,600,1200. CWF again exhibits a long tail of slow con- eigenvalue solver and the singular value decompositions vergence while DMRG3S converges quickly at all m and all becomesslower,leadingtoaslow-downofallthreemeth- accuracies. ods. Table IV. Free Fermions Eq. 43): Normalised error in energy atconvergenceandruntimetoconvergenceofallthreemeth- VIII. CONCLUSIONS ods. m=300 m=600 m=1200 The new strictly single-site DMRG (DMRG3S) al- DMRG3S Energy Error 5.0×10−6 2.8×10−7 <10−9 gorithm results in a theoretical speed-up of ∼ (d+1)/2 CWF Energy Error 3.8×10−6 2.8×10−7 <10−9 during the optimisation steps compared to the center- 2DMRG Energy Error 3.7×10−6 2.6×10−7 <10−9 matrix wavefunction formalism (CWF), provided that d2w/missmall. Further,convergenceratespersweepare DMRG3S Runtime 533s 1452s 4643s improvedintheimportantandcomputationallymostex- CWF Runtime 863s 2590s 9586s pensive high-accuracy/large-m phase of the calculation. 2DMRG Runtime 794s 4584s 29698s In addition, auxiliary calculations (enrichment, normal- isation, etc.) are sped up and memory requirements are relaxed. 4. Free Fermions Numerical experiments confirm a speed-up within the theoretical expectations compared to the CWF method. Finally,weconsideramodeloffreefermionsonachain The efficiency of single-site DMRG in general compared of 100 sites with Hamiltonian tothetraditionaltwo-siteDMRGwassubstantiatedfur- ther by a large speed-up at comparable accuracies in en- 100 Hˆ =−(cid:88) (cid:88) (cid:104)cˆ† cˆ +h.c.(cid:105) . (43) ergy. i,σ i+1,σ i=1σ=↑,↓ The maximally delocalised wavefunction found in the ACKNOWLEDGMENTS ground-state of this system is notoriously difficult for MPS formats in general to reproduce faithfully. At the WewouldliketothankS.Dolgov,D.Savostyanovand same time, most other parameters are identical (d, l, I.Kuprovforveryhelpfuldiscussions. C.Hubigacknow- m) or very close (w) to those in the Fermi-Hubbard ledges funding through the ExQM graduate school and model from Section VIIB3. The calculation is done us- the Nanosystems Initiate Munich. F. A. Wolf acknow- ing U(1) and U(1) symmetries at half-filling with ledges support by the research unit FOR 1807 of the charge Sz N = 100 fermions and Stotal = 0. The choice of m DFG. z

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.