Table Of Content

Selfconsistent order-N density-functional calculations for very large systems Pablo Ordejoń,1,2 Emilio Artacho,3 and José M. Soler3 1Department of Physics and Materials Research Laboratory, University of Illinois, Urbana, Illinois 61801, USA 2Departamento de F´ısica, Universidad de Oviedo, C/ Calvo Sotelo s/n, 33007 Oviedo, Spain 3Instituto de Ciencia de Materiales Nicolás Cabrera and Departamento de F´ısica de Materia Condensada, C-III Universidad Autónoma de Madrid, 28049 Madrid, Spain. (February 6, 2008) We present a method to perform fully selfconsistent density-functional calculations, which scales linearly with the system size and which is well suited for very large systems. It uses strictly localized pseudoatomic orbitals as basis functions. The sparse Hamiltonian and overlap matrices are calculated with anO(N) effort. Thelongrangeselfconsistent potentialanditsmatrixelementsare computed in a real-space grid. The other matrix elements are directly calculated and tabulated 6 as a function of the interatomic distances. The computation of the total energy and atomic forces 9 is also done in O(N) operations using truncated, Wannier-like localized functions to describe the 9 occupied states, anda band-energyfunctional which is iteratively minimized with noorthogonality 1 constraints. Weillustratethemethodwithseveralexamples,includingcarbonandsiliconsupercells n with up to 1000 Si atoms and supercells of β-C3N4. We apply the method to solve the existing a controversy about the faceting of large icosahedral fullerenes by performing dynamical simulations J on C60, C240, and C540. 2 1 v A large effort has been devoted in the last few years ing always in the range of a few valence orbitals per 2 to develop approximate methods to solve the electronic atom.6 As a first step, in this work we use minimal basis 0 structureoflargesystemswithacomputationalcostpro- sets of one s and three p orbitals per atom, the exten- 0 portional to its size.1 Several approaches are now suf- sion to larger bases being perfectly possible within the 1 ficiently accurate and robust to obtain reliable results present formulation. The choice of a basis obviously im- 0 6 for systems with thousands of atoms. So far, how- pliesa possibleerrorassociatedto itsincompleteness. In 9 ever, most of these schemes have been useful only with the same way as for the error due to the linear scaling / simple Hamiltonians, like empirical tight-binding mod- algorithm, the error in the basis can be reduced at the t a els, which provide an ideal setting for order-N calcu- expense of an increase in computational effort. Its mag- m lations. First-principles order-N calculations have been nitude should be carefully checked, but also compared - performed mainly in the non-selfconsistent Harris func- with other sources of error to ensure that an increase of d 2 tionalversion ofthelocaldensityapproximation(LDA) the basis is really worthwhile. n o for electronic exchange and correlation (XC) using min- Our method uses standard LDA techniques7 for the c imal bases.1,3 Linear scaling algorithms in fully selfcon- valence electrons, the core electrons being replaced by v: sistentLDAhavealsobeentried,4 butthe resultsarefar pseudopotentials.8 The basis orbitals used in this work fromthelinearscalingregime,duetotherelativelysmall i are the s andp pseudoatomic orbitalsdefined by Sankey X number of manageable atoms in those simulations. Her- andNiklewski,9 inthe contextof non-selfconsistentHar- r nandez et al.5 havesuccessfullyproducedLDAresultsin a ris functional methods. These are slightly excited pseu- largesiliconsystemsusingareal-spacegridmethod. The doatomic orbitals φ (r), obtained by solving the valence µ computational requirements that this kind of approach electron problem in the isolated atom, with the same demands are, however,extremely large, and calculations pseudopotential and LDA approximations, and with the must be performed in massive computational platforms. boundary condition that the atomic orbitals are strictly We have developed a selfconsistent density-functional localized, vanishing outside a given radius r . This ra- c formulation with linear scaling, capable of producing re- diuscutoffisinprincipleorbitaldependent,butwedonot sults for very large systems, whose computational de- make explicit this dependence in the equations only for mands are not overwhelmingly large, so that systems simplicity in the notation. The great advantage of these with many hundreds of atoms can be treated in mod- orbitalsisthattheygiverisetosparseoverlapandHamil- estcomputationalplatformslikeworkstations,andmuch tonian matrices (since matrix elements between distant larger systems can be treated in massive platforms. The orbitalsexactlyvanishexactly)andtheydisplaythesame method is based on the linear combination of atomic or- structureas inconventionaltight-binding. The extentof bitals (LCAO) approximation as basis of expansion of the interactions and the sparseness of the matrices de- the electronic states. Non-orthogonal LCAO bases are pendonthe cutoff radiusr ofeachatom. Thesearenot c very efficient, reducing the number of variables dramat- critical as long as the maxima of the atomic wave func- ically, compared to plane-wave (PW) or real-space-grid tions are well within r . For ananalysis of the quality of c approaches,so that larger systems can be studied. Also, pseudoatomic orbitals as a basis for solid state calcula- LCAOcanprovideupto extremelyaccuratebases,stay- tions we refer the reader to Ref. 10. 1 dependentofthechargedensityn(r),andtheirmatrixel- e 0.3 ementsbetweenatomicorbitalscanbeexpressedassums 0.2 3 eeeee coefnttweorc(ehnφtµe|rV(NSLµ(νr=−hφRµi|)φ|φνiνianadndhφµhφ|pµ2|/V2NmA|(φrν−i)Rori)t|φhνreie) ∆E (eV) u e integrals, which only depend on the relative positions of pairs or triplets of atoms. We follow the method pro- 0.1 e posedbySankeyandNiklewski9 tocompute allthesein- e e tegrals: they arecalculatedbeforehandandtabulated as 0.0 3u3u3u 3u 3u 3u e e3ue a function of the relative position of the centers. These tables are used during the simulation, to calculate all 0 25 50 75 100 the non-zero integrals by interpolation. The details of E cut (Ry) the procedure can be found in Ref. 9. Since all these integrals are zero for distant enough atoms, their number scales linearly with the size of the system, as well as FIG.1. Convergence of the total energy per carbon atom the computation time. The contributions of these terms vs grid fineness (given by the cutoff Ecut of the plane waves to the Hamiltonian are computed only once for a given that it can represent). Theresults of thepresent method are atomic configuration, since they do not depend on the shown for a diamond supercell with 64 atoms (full circles) selfconsistent charge. andforaC3 cluster(diamonds). Opencirclesshowresultsof conventional plane-wavecalculations for diamond.12 The matrix elements of the Hartree potential VHδ(r) createdbythechargeδn(r)andtheexchange-correlation TheLDAHamiltonianmatrixelementsforagivenpar- potential VXC[n(r)] both depend on the selfconsistent ticle density are obtained using a combination of tech- charge. To calculate these integrals we compute n0(r), niques, adopting the most convenient one for each term n(r) and δn(r) for a given LCAO density matrix at of the Hamiltonian. In a prior step, to avoid dealing the points of a regular grid in real space. This is with the long range of the pseudopotentials, we rewrite straightforwardsincethebasisorbitalsaredefinedinreal the Kohn-Sham Hamiltonian by adding and subtracting space. Poisson’s equation for the Hartree potential can theHartreepotentialcreatedbytheneutral-atomcharge be then solved by the standard fast Fourier transform n0(r), defined as (FFT)method,assumingasupercellgeometry,orbythe multigrid method.11 In spite of its NlogN scaling, we n0(r)= nNi A(r−Ri) (1) presently use FFT’s for simplicity, since this part rep- Xi resents a minor contribution to the total computational load. Note that only two FFT’s are necessary per cy- NA whereirunsovertheatomsinthesystem,andn isthe i cle of selfconsistency (SCF cycle), in contrast with PW spherical atomic charge density of the atom i in its neu- based calculations, where an FFT is required for each 0 tral, isolated state with ρ electrons on each orbital φ . µ µ electronic state. The LDA XC potential is trivially com- Ifwedefineδn(r)=n(r)−n0(r)wheren(r)istheactual puted on each point of the grid. Once the value of the chargedensity,the Hartreepotentialcanbe decomposed Hartreeandthe XCpotentialsareknownateverypoint, into two contributions VHδ and VH0, created by δn(r) and the integralshφ |Vδ|φ i and hφ |V |φ i are computed n0(r) respectively. Using Eq. (1), VH0 can be expressed by direct summµatiHon oνn the griµd. XTCheseν sums are care- as a sum of atomic contributions. Also, the pseudopo- fullydonetominimizetheamountofnumericalworkload tential is decomposed into a short-range non-local term involved. Only the non-zero integrals (between orbitals VNL and a long-range local term VL. Following Sankey on atoms closer than 2r ) are computed, and only the and Niklewski,9 we define the neutral atom potential of c points of the grid for which both orbitals are non-zero a given atom at R as i contribute to each integral. We use sparse-matrix mul- nNA(r−R ) tiplication techniques optimized for this class of opera- VNA(r−Ri)=VL(r−Ri)+e2Z i |r−r′| i dr′. (2) tions. Asaresult,thecomputationoftheintegralsscales linearly with the size of the system. VNA is short ranged, since the core attraction and the Itisimportanttostressthattheconvergencewithgrid electron Coulomb repulsion of the neutral atom charge spacing of our method is different from that in standard cancel each other beyond rc. The Kohn-Sham Hamilto- PW calculations, which are known to require large PW nian is finally obtained as cutoffsforsystemscontainingatomswithhardpseudopo- p2 tentials. In Figure 1 we show the convergence of the to- HKS = + [VNL(r−Ri)+VNA(r−Ri)] talenergyper atom(referredto the convergedvalue) for 2m Xi carbon,asafunctionofEcut,the kineticenergycutoffof +VHδ(r)+VXC(r) (3) the plane waves that the grid can represent. Full circles are for a diamond supercell of 64 atoms, whereas dia- The overlap,kinetic energy term, neutral atompoten- monds are for a cluster of 3 carbon atoms in a supercell tial andnon-localpartof the pseudopotential, are allin- of15×15×15˚A3. Inbothcases,theresultsareconverged 2 to below 2 meV/atom for a cutoff of 30 Ry. This is in sharp contrast with results of PW calculations12 (open 200 c 250 (cid:27) circles) in which the cutoff necessary to achieve conver- 150 200 gence (with the same pseudopotential) is much higher. SIZE s CPU Note, moreover,that the energy cutoff in our case refers 100 c 150 time (Mb) to the grid representationof the charge density, whereas 50 c s 100 (min) inthePWcaseitreferstothewavefunctions,whichim- c - pliesanevenhigher(fourtimes)cutoffinthechargeden- 0 s 50 sity. Thereasonforthefastconvergenceofourapproach s 0 isthatmostoftheHamiltonianterms(mostimportantly 0 250 500 750 1000 thekineticenergyandtheneutralatompotential)arenot Number of atoms computed in the grid. FIG.2. CPUtimeperSCFcycleandjobmemoryforasim- Once the Kohn-ShamHamiltonian has been obtained, we use a recently proposedorder-N method13,14 to com- ulation of Si supercells with different sizes. Times measured in an IBM PowerPC with 17 Mflops (Linpack 100x100). pute the band structure energy EBS (sum of occupied eigenvalues). In this approach, a modified band energy functional13,14 is minimized15 with respect to the elec- In molecular dynamics (MD) simulations and geomet- rical optimizations the atomic forces are needed. We tronic orbitals by means of a conjugate gradients (CG) computethemusingavariationoftheHellman-Feynman algorithm,toyieldEBS. Theorthonormalityoftheoccu- theorem,whichincludesPulay-likecorrectionstoaccount piedstatesdoesnotneedtobeimposed,butitisobtained for the fact that the basis set is not complete and moves as a result of the minimization of the energy functional. with the atoms. The force on atom i is The elimination of the orthogonalizationis the first step to achieve an order-N scaling. The second is the use of F =− ρ ∂Hµ0ν + E ∂Sµν − ∂Uii−ee localized,Wannier-likewavefunctions(LWF)todescribe i µν ∂R µν ∂R ∂R X i X i i µν µν the electronicstatesenteringthe minimizationofthe en- ∂φ ∂φ ebregyyonfudnactigoinvaenl. cTurtuonffcaRticonfroomf ththeseecleonctaelrizeodf tfuhnecLtiWonFs +2Xµ ρ0µh∂Rµi|VHδ|φµi−2Xµνρµνh∂Rµi|VHδ+VXC|φνi (6) provides a linear scaling algorithm. The errors involved in this truncation, which can be reduced arbitrarily by where H0 = p2/2m+VNL +VNA, and ρµν and Eµν are 9 the density and energy-density matrices, respectively. increasingthevalueofR ,areanalyzedindetailinRef1. c The first three terms are calculated interpolating the ta- After the band energy has been minimized and the 9 ble data, whereas the last two terms are computed by LWF’s obtained, the new charge density is computed, numerical integration in the grid, as was done for the completing a so-called SCF cycle. From the density, a matrix elements of the Hartree and XC potentials in the new Hamiltonian is produced, the procedure being re- Kohn-Sham Hamiltonian. peated until selfconsistency in the charge density or the In order to show the linear scaling of the method, we Hamiltonian is achieved. At this point, the total energy have performed calculations on supercells of silicon in can be computed as the diamond structure, with different numbers of atoms e2 e2 from 64 to 1000. Only the Γ point was used to sam- Etot = EBS− VH(r)n(r)dr+ VH0(r)n0(r)dr ple the Brillouin zone, the cutoff for the charge density 2 Z 2 Z grid was 12 Ry, and the LWF’s were truncated at 4.5 ˚A. + [ǫXC(n)−VXC(n)]n(r)dr+Uii−ee (4) Figure 2 shows the linear behavior of the CPU time and Z memory requirements with the number of atoms. The CPU time shown represents the average cost to perform whereVH(r)is theHartreepotentialofthe selfconsistent a SCF step in a MD simulation, including the calcula- charge n(r), and, following Sankey and Niklewski,9 we tion of the charge density and Hamiltonian matrix el- have defined ements, the minimization of the band structure energy, andthecalculationoftheatomicforces. Thebandstruc- Uii−ee = e22Xll′ ′|RZl−lZRl′ l′| − e22 Z VH0(r)n0(r) (5) taunraeveenrearggeyomf2in0imCiGzaittieornatwioitnhsi,nwehaiclehtShCeFnucymcbleerreoqfuSirCeFd cycles depends largely on the simulation temperature, As in the case of the Hamiltonian, we have added and lengthofthe time stepandmixing algorithmforselfcon- subtracted the electrostatic energy of the neutral atom sistency. So far,in comparablesimulationconditions,no charge n0(r) to obtain Eq. (4). The advantage,again, is significant dependence of the number of CG iterations thatUii−ee canbeexpressedasasumofshortrangecon- and SCF cycles with the size of the system has been ob- tributions,whichiseasytoevaluateinO(N)operations,9 served. As we can see, in the present method both the avoidingtheproblemsrelatedwiththelongrangecharac- CPUtimeandmemoryrequirementsaresmallenoughto teroftheioniccoreinteractions. Theintegralsappearing permit the calculation of a system of 1000 silicon atoms in Eq. (4) are calculated in the real space grid. in a very modest workstation. 3 TABLEI. Averageradius(r¯),standard(σs)andmaximum In conclusion, we have presented an efficient method deviation [σm =(rmax−rmin)/2] of radii, and non-planarity for selfconsistent LDA calculations with linear scaling. angle3φaroundpentagons(goingfrom0◦foraplanarpentag- Wehaveanalyzedtheperformanceversussystemsizeand onalsiteto12◦ for atruncatedicosahedron) forthefullerene grid cutoff, and shown that simulations of systems with clusters. We compare the results of the present work with hundreds of atoms are possible with small workstations. those of Itoh et al.,16 obtained with theHarris functional This should open the possibility of very large scale ab This work Itoh et al.16 initio simulations in the near future. r¯(˚A) σs/r¯ σm/r¯ φ r¯(˚A) σs/r¯ σm/r¯ φ We acknowledge R. M. Martin and Paul von Allmen ◦ ◦ C60 3.59 0.000 0.000 12.0 3.55 0.000 0.000 12.0 for many useful discussions, and D. A. Drabold and O. C240 7.18 0.023 0.027 8.5◦ 7.06 0.021 0.028 7.9◦ F.Sankey for allowingus the use ofmany oftheir codes. ◦ ◦ C540 10.69 0.038 0.054 9.6 10.53 0.033 0.053 9.2 P. O. is indebted to R. M. Martin and J. B. Adams for continuous support and encouragement. This work was partially supported by DOE Grant No. DEFG 02- As an example of a system with partially ionic char- 91ER45439andDGICYT(Spain)GrantNo. PB92-0169. acter, and with atoms with compact orbitals, we have performed calculations on the β phase of C3N4, which was proposed as a potentially very hard material by Liu and Cohen.17 The calculations were done in supercells of 42 and 224 atoms, with nearly identical results. A cutoff of 200 Ry for the charge density grid was used. We obtain an accuracy better than 1% in both the lat- 1P.Ordejón,D.A.Drabold,R.M.MartinandM.P.Grum- tice constantsand the severalinequivalent bond lengths, bach,Phys.Rev.B51,1456(1995),andreferencestherein. acanldcu1la0t%ionins.1t7heTbhuelske mreosudlutlsusc,oncotrmasptarwedithtothoothseerofLDthAe 23JD..HYaorrrkis,,JP.hPy.s.LRu,eva.nBd1W3., 1Y7a7n0g,(1P9h8y5)s.. Rev. B 49, 8526 non-selfconsistentHarrisfunctional,whichyielderrorsof (1994); J. P. Lu and W. Yang, Phys. Rev. B 49, 11421 5%and16%for the distances andbulk modulus, respec- (1994). tively, showing that selfconsistency is essential to obtain 4W. Hierse and E. B. Stechel, Phys. Rev. B. 50, 17811 reliable results in this partially polar system. (1994). We have applied our method to study the struc- 5E. Hernandez and M. Gillan, Phys. Rev. B 51, 10157 ture of large, single shell icosahedral fullerene clusters. (1995), and unpublished. Theseareimportanttounderstandtheobservedspheric- 6R. Poirier, R. Kari, and I. G. Csizmadia, Handbook for ity of multishell fullerenes (buckyonions). For the sin- gaussian basis sets, Elsevier Science (1985). gle shell clusters, elasticity theory, as well as empiri- 7W.KohnandL.J.Sham,Phys.Rev.140,1133(1965).The cal potential calculations, predict markedly polyhedral exchange-correlation potential is taken from D. M. Ceper- shapes. CalculationsperformedbyItohetal.,16usingthe ley and G. J. Alder, Phys. Rev. Lett. 45, 566 (1980), as Harris-functional order-N method,1 agree qualitatively parametrized by J. Perdew and A. Zunger, Phys. Rev. B with these results. However, similar non-selfconsistent 23, 5048 (1981). calculations3 predict that even the large clusters are 8G. B. Bachelet, D. R. Hamman, and M. Schlu¨ter, Phys. spherical. Here we have repeated the calculations with Rev.B 26, 4199 (1982). selfconsistent LDA using the present method, thus im- 9O. F. Sankey and D. J. Niklewski, Phys. Rev. B 40, 3979 (1989). proving over the non-selfconsistent nature of the former 10D. Sanchez-Portal, E. Artacho, and J. M. Soler, unpub- calculations. Usingadynamicalquenchingalgorithm,we lished; Solid State Commun. 95, 685 (1995). have computed the equilibrium structure of three icosa- 11W.H.Press et al.,Numerical Recipes (CambridgeUniver- hedralfullereneclusters: C60,C240 andC540. Asupercell sity,New York,1989). geometrywasassumed,with acubic cellwithsides of12 12N. Troullier and J. L. Martins, Phys. Rev. B 43, 1993 ˚A for C60, 22 ˚A for C240 and 34 ˚A for C540. The calcu- (1991). lations were done using a cutoff of 100 Ry for the repre- 13P. Ordejón, D. A. Drabold, M. P. Grumbach, and R. M. sentation of the charge density in reciprocal space, and Martin,Phys.Rev.B48,14646(1993);F.Mauri,G.Galli, a different localization radius for σ and π type Wannier and R. Car, Phys. Rev.B 47, 9973 (1993). functions (2.5 and 4.0 ˚A, respectively).16 Increasing the 14J. Kim, F. Mauri, and G. Galli, Phys. Rev. B. 52, 1640 localization radius to 4.1 ˚A (both for σ and π), and/or (1995). increasing the grid cutoff to 150 Ry in the simulations 15The variables of the minimization are the coefficients cµi changesthefinalrelativedistancesinlessthan0.4%. The oftheexpansionofthelocalized wavefunctionsintermsof rsuesltusltasraerevesruymsmimairliazredtointhToasbeleobI.taWineedsebeythItaothoeutrarle.-, 16tSh.eItaothom, Pic.oOrbrditeajlósn:,ψDi=. AP. µDcrµaibφoµld., and R. M. Martin, andconfirmthat,exceptforC60,the singleshellclusters unpublished. tend to be polyhedral,insteadofspherical,andthatthis 17A.Y.LiuandM.L.Cohen,Phys.Rev.B41,10727(1989); polyhedral character is more pronounced as the cluster A. Reyes-Serrato, D. H. Galván, and I. L. Garzón, Phys. size increases. Rev.B 52, 6293 (1995), and references therein. 4

Selfconsistent order-N density-functional calculations for very large systems PDF

4 Pages·0.13 MB·

by Pablo Ordejon

#additional_collections #journals #arxiv

Checking for file health...

Save to my drive

Quick download

Download

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Selfconsistent order-N density-functional calculations for very large systems

See more

The list of books you might like

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.