
The regularization theory of the Krylov iterative solvers LSQR and CGLS for linear discrete ill-posed problems, part I: the simple singular value case


my journal manuscript No. (will be inserted by the editor)

The regularization theory of the Krylov iterative solvers LSQR and CGLS for linear discrete ill-posed problems, part I: the simple singular value case

Zhongxiao Jia

arXiv:1701.05708v1 [math.NA] 20 Jan 2017

Received: date / Accepted: date

Abstract  For the large-scale linear discrete ill-posed problem $\min\|Ax-b\|$ or $Ax=b$ with $b$ contaminated by a white noise, the Lanczos bidiagonalization based LSQR method and its mathematically equivalent Conjugate Gradient (CG) method for $A^TAx=A^Tb$, i.e., CGLS, are most commonly used. They have intrinsic regularizing effects, where the number $k$ of iterations plays the role of the regularization parameter. However, there has been no answer to the long-standing fundamental concern of Björck and Eldén in 1979: for which kinds of problems can LSQR and CGLS find best possible regularized solutions? Here a best possible regularized solution means one that is at least as accurate as the best regularized solution obtained by the truncated singular value decomposition (TSVD) method or standard-form Tikhonov regularization. In this paper, assuming that the singular values of $A$ are simple, we analyze the regularization of LSQR for severely, moderately and mildly ill-posed problems. We establish accurate estimates for the 2-norm distance between the underlying $k$-dimensional Krylov subspace and the $k$-dimensional dominant right singular subspace of $A$. For the first two kinds of problems, we then prove that LSQR finds a best possible regularized solution at semi-convergence, occurring at iteration $k_0$, and that, for $k=1,2,\ldots,k_0$: (i) the $k$-step Lanczos bidiagonalization always generates a near best rank $k$ approximation to $A$; (ii) the $k$ Ritz values always approximate the first $k$ large singular values in natural order; (iii) the $k$-step LSQR always captures the $k$ dominant SVD components of $A$. For the third kind of problem, we prove that LSQR generally cannot find a best possible regularized solution.
We derive estimates for the entries of the bidiagonal matrices generated by Lanczos bidiagonalization, which can be practically exploited to identify whether LSQR finds a best possible regularized solution at semi-convergence. Numerical experiments confirm our theory.

This work was supported in part by the National Science Foundation of China (No. 11371219).

Zhongxiao Jia
Department of Mathematical Sciences, Tsinghua University, 100084 Beijing, China.
E-mail: [email protected]

Keywords  Discrete ill-posed · full or partial regularization · best or near best rank $k$ approximation · TSVD solution · semi-convergence · Lanczos bidiagonalization · LSQR · CGLS

Mathematics Subject Classification (2000)  65F22 · 65R32 · 15A18 · 65J20 · 65R30 · 65F10 · 65F20

1 Introduction and Preliminaries

Consider the linear discrete ill-posed problem

$$\min_{x\in\mathbb{R}^n}\|Ax-b\| \quad\text{or}\quad Ax=b, \qquad A\in\mathbb{R}^{m\times n},\ b\in\mathbb{R}^m, \tag{1}$$

where the norm $\|\cdot\|$ is the 2-norm of a vector or matrix, and $A$ is extremely ill conditioned with its singular values decaying to zero without a noticeable gap. (1) mainly arises from the discretization of the first kind Fredholm integral equation

$$Kx=(Kx)(s)=\int_{\Omega}k(s,t)x(t)\,dt=g(s)=g,\qquad s\in\Omega\subset\mathbb{R}^q, \tag{2}$$

where the kernel $k(s,t)\in L^2(\Omega\times\Omega)$ and $g(s)$ are known functions, while $x(t)$ is the unknown function to be sought. If $k(s,t)$ is non-degenerate and $g(s)$ satisfies the Picard condition, there exists the unique square integrable solution $x(t)$; see [23,44,47,69,76]. Here for brevity we assume that $s$ and $t$ belong to the same set $\Omega\subset\mathbb{R}^q$ with $q\geq 1$. Applications include image deblurring, signal processing, geophysics, computerized tomography, heat propagation, biomedical and optical imaging, groundwater modeling, and many others; see, e.g., [1,22,23,47,57,63,64,69,76,77,104]. The theory and numerical treatments of integral equations can be found in [69,70].
The right-hand side $b=\hat b+e$ is noisy and assumed to be contaminated by a white noise $e$, caused by measurement, modeling or discretization errors, where $\hat b$ is noise-free and $\|e\|<\|\hat b\|$. Because of the presence of the noise $e$ and the extreme ill-conditioning of $A$, the naive solution $x_{naive}=A^{\dagger}b$ of (1) bears no relation to the true solution $x_{true}=A^{\dagger}\hat b$, where $\dagger$ denotes the Moore-Penrose inverse of a matrix. Therefore, one has to use regularization to extract a best possible approximation to $x_{true}$.

In principle, regularizing an ill-posed problem is to replace it by a well-posed one, such that the error is compensated by the gain in stability. In other words, regularization is to compromise the error and stability as best as possible. For a white noise $e$, throughout the paper, we always assume that $\hat b$ satisfies the discrete Picard condition $\|A^{\dagger}\hat b\|\leq C$ with some constant $C$ for $n$ arbitrarily large [1,27,41,42,44,47,64]. It is an analog of the Picard condition in the finite dimensional case; see, e.g., [41], [44, p.9], [47, p.12] and [64, p.63]. The two dominating regularization approaches are to solve the following two essentially equivalent problems

$$\min_{x\in\mathbb{R}^n}\|Lx\|\quad\text{subject to}\quad\|Ax-b\|=\min \tag{3}$$

and general-form Tikhonov regularization (cf. [88,97,98])

$$\min_{x\in\mathbb{R}^n}\{\|Ax-b\|^2+\lambda^2\|Lx\|^2\} \tag{4}$$

with $\lambda>0$ the regularization parameter [44,47]. A suitable choice of the matrix $L$ is based on a-priori information on $x_{true}$; typically $L$ is either the identity matrix, a diagonal weighting matrix, or a $p\times n$ discrete approximation of a first or second order derivative operator. Particularly, if $L=I$, the identity matrix, (4) is standard-form Tikhonov regularization.
The case $L=I$ is of most common interest and is our concern in this paper. From now on, we always assume $L=I$, for which the solutions to (1), (3) and (4) can be fully analyzed by the singular value decomposition (SVD) of $A$. Let

$$A=U\begin{pmatrix}\Sigma\\ 0\end{pmatrix}V^T \tag{5}$$

be the SVD of $A$, where $U=(u_1,u_2,\ldots,u_m)\in\mathbb{R}^{m\times m}$ and $V=(v_1,v_2,\ldots,v_n)\in\mathbb{R}^{n\times n}$ are orthogonal, $\Sigma=\mathrm{diag}(\sigma_1,\sigma_2,\ldots,\sigma_n)\in\mathbb{R}^{n\times n}$ with the singular values $\sigma_1>\sigma_2>\cdots>\sigma_n>0$ assumed to be simple throughout the paper, and the superscript $T$ denotes the transpose of a matrix or vector. Then

$$x_{naive}=\sum_{i=1}^n\frac{u_i^Tb}{\sigma_i}v_i=\sum_{i=1}^n\frac{u_i^T\hat b}{\sigma_i}v_i+\sum_{i=1}^n\frac{u_i^Te}{\sigma_i}v_i=x_{true}+\sum_{i=1}^n\frac{u_i^Te}{\sigma_i}v_i \tag{6}$$

with $\|x_{true}\|=\|A^{\dagger}\hat b\|=\left(\sum_{i=1}^n\frac{|u_i^T\hat b|^2}{\sigma_i^2}\right)^{1/2}$.

The discrete Picard condition means that, on average, the Fourier coefficients $|u_i^T\hat b|$ decay faster than the $\sigma_i$ and enables regularization to compute useful approximations to $x_{true}$, which results in the following popular model that is used throughout Hansen's books [44,47] and the current paper:

$$|u_i^T\hat b|=\sigma_i^{1+\beta},\qquad \beta>0,\quad i=1,2,\ldots,n, \tag{7}$$

where $\beta$ is a model parameter that controls the decay rate of $|u_i^T\hat b|$. Hansen [47, p.68] points out, "while this is a crude model, it reflects the overall behavior often found in real problems." One precise definition of the discrete Picard condition is $|u_i^T\hat b|=\tau_i\sigma_i^{1+\zeta_i}$ with certain constants $\tau_i>0$, $\zeta_i>0$, $i=1,2,\ldots,n$. We remark that once the $\tau_i$ and $\zeta_i$ do not differ greatly, such a discrete Picard condition does not affect our claims; rather it complicates the derivations and the forms of the results.

The white noise $e$ has a number of attractive properties which play a critical role in the regularization analysis: its covariance matrix is $\eta^2I$, the expected values $E(\|e\|^2)=m\eta^2$ and $E(|u_i^Te|)=\eta$, $i=1,2,\ldots,n$, and $\|e\|\approx\sqrt{m}\,\eta$ and $|u_i^Te|\approx\eta$, $i=1,2,\ldots,n$; see, e.g., [44, p.70-1] and [47, p.41-2]. The noise $e$ thus affects the $u_i^Tb$, $i=1,2,\ldots,n$, more or less equally. With (7), relation (6) shows that for large singular values $|u_i^T\hat b|/\sigma_i$ is dominant relative to $|u_i^Te|/\sigma_i$.
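The decomposition (6) and the Picard model (7) are easy to reproduce numerically. The following sketch is purely illustrative: the dimension, the severely ill-posed spectrum $\sigma_i=2^{-i}$, the value $\beta=0.5$ and the noise level $\eta$ are hypothetical choices, not taken from the paper.

```python
import numpy as np

# Hypothetical test problem: severely ill-posed spectrum sigma_i = 2^{-i},
# b-hat obeying the Picard model |u_i^T b-hat| = sigma_i^{1+beta}, white noise e.
rng = np.random.default_rng(0)
n = 40
sigma = 2.0 ** -np.arange(1, n + 1)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal factors
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(sigma) @ V.T

beta = 0.5
b_hat = U @ sigma ** (1 + beta)     # Fourier coefficients u_i^T b-hat = sigma_i^{1.5}
x_true = V @ sigma ** beta          # = A^dagger b-hat, cf. (6)
eta = 1e-6
b = b_hat + eta * rng.standard_normal(n)           # |u_i^T e| ~ eta

# The naive solution A^dagger b: the small sigma_i magnify u_i^T e / sigma_i,
# so it bears no relation to x_true.
x_naive = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```

Even a noise level as small as $\eta=10^{-6}$ makes $x_{naive}$ differ from $x_{true}$ by orders of magnitude, since $u_i^Te/\sigma_i$ blows up for the small $\sigma_i$.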
Once $|u_i^T\hat b|\leq|u_i^Te|$ from some $i$ onwards, the small singular values magnify $|u_i^Te|/\sigma_i$, and the noise $e$ dominates $|u_i^Tb|/\sigma_i$ and must be suppressed. The transition point $k_0$ is such that

$$|u_{k_0}^Tb|\approx|u_{k_0}^T\hat b|>|u_{k_0}^Te|\approx\eta,\qquad |u_{k_0+1}^Tb|\approx|u_{k_0+1}^Te|\approx\eta; \tag{8}$$

see [47, p.42, 98] and a similar description [44, p.70-1]. The $\sigma_k$ are then divided into the $k_0$ large ones and the $n-k_0$ small ones.

The truncated SVD (TSVD) method [41,44,47] deals with (3) by solving

$$\min\|x\|\quad\text{subject to}\quad\|A_kx-b\|=\min,\qquad k=1,2,\ldots,n, \tag{9}$$

where $A_k=U_k\Sigma_kV_k^T$ is the best rank $k$ approximation to $A$ with respect to the 2-norm, with $U_k=(u_1,\ldots,u_k)$, $V_k=(v_1,\ldots,v_k)$ and $\Sigma_k=\mathrm{diag}(\sigma_1,\ldots,\sigma_k)$; it holds that $\|A-A_k\|=\sigma_{k+1}$ (cf. [10, p.12]), and $x_k^{tsvd}=A_k^{\dagger}b$, called the TSVD solution, solves (9). A crucial observation is that $x_k^{tsvd}$ is the minimum-norm least squares solution to $\min_{x\in\mathbb{R}^n}\|A_kx-b\|$, which perturbs $A$ to $A_k$ in (1), and we will frequently exploit this interpretation later.

Based on the above properties of the white noise $e$, it is known from [44, p.70-1] and [47, p.71, 86-8, 95] that the TSVD solutions satisfy

$$x_k^{tsvd}=A_k^{\dagger}b=\begin{cases}\displaystyle\sum_{i=1}^k\frac{u_i^Tb}{\sigma_i}v_i\approx\sum_{i=1}^k\frac{u_i^T\hat b}{\sigma_i}v_i, & k\leq k_0;\\[2mm] \displaystyle\sum_{i=1}^k\frac{u_i^Tb}{\sigma_i}v_i\approx\sum_{i=1}^{k_0}\frac{u_i^T\hat b}{\sigma_i}v_i+\sum_{i=k_0+1}^k\frac{u_i^Te}{\sigma_i}v_i, & k>k_0,\end{cases} \tag{10}$$

and $x_{k_0}^{tsvd}$ is the best TSVD regularized solution to (1), which balances the regularization and perturbation errors optimally and stabilizes the residual norms $\|Ax_k^{tsvd}-b\|$ for $k$ not close to $n$ after $k>k_0$. The index $k$ plays the role of the regularization parameter that determines how many large SVD components of $A$ are used to compute a regularized solution $x_k^{tsvd}$ to (1).

The solution $x_{\lambda}$ of the Tikhonov regularization has a filtered SVD expansion

$$x_{\lambda}=\sum_{i=1}^n f_i\frac{u_i^Tb}{\sigma_i}v_i, \tag{11}$$

where the $f_i=\frac{\sigma_i^2}{\sigma_i^2+\lambda^2}$ are called filters. The TSVD method is a special parameter filtered method, where, in $x_k^{tsvd}$, we take $f_i=1$, $i=1,2,\ldots,k$ and $f_i=0$, $i=k+1,\ldots,n$.
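The semi-convergence of the truncation index in (10) can also be seen directly. In the sketch below (the same kind of hypothetical setup: an illustrative severely ill-posed spectrum, Picard-type $\hat b$, white noise), the TSVD error $\|x_k^{tsvd}-x_{true}\|$ first decreases, attains its minimum at an intermediate index playing the role of $k_0$, and then grows rapidly as noise-dominated terms $u_i^Te/\sigma_i$ enter the sum.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
sigma = 2.0 ** -np.arange(1, n + 1)               # severely ill-posed (illustrative)
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
b_hat = U @ sigma ** 1.5                          # Picard model (7) with beta = 0.5
x_true = V @ sigma ** 0.5
eta = 1e-6
b = b_hat + eta * rng.standard_normal(n)

coeffs = U.T @ b                                  # Fourier coefficients u_i^T b
errors = []
x = np.zeros(n)
for i in range(n):                                # x_k^tsvd = sum_{i<=k} (u_i^T b / sigma_i) v_i
    x = x + (coeffs[i] / sigma[i]) * V[:, i]
    errors.append(np.linalg.norm(x - x_true))
k_best = int(np.argmin(errors)) + 1               # plays the role of k_0 here
print(k_best, errors[k_best - 1], errors[-1])
```

With these illustrative choices the best truncation index lands roughly where $\sigma_i^{1+\beta}$ drops below $\eta$, matching the transition-point picture in (8).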
The error $x_{\lambda}-x_{true}$ can be written as the sum of the regularization and perturbation errors, and an optimal $\lambda_{opt}$ aims to balance these two errors and minimize the sum of their norms [44,47,69,104]. The best possible regularized solution $x_{\lambda_{opt}}$ retains the $k_0$ dominant SVD components and dampens the other $n-k_0$ small SVD components as much as possible [44,47]. Apparently, the ability to acquire only the largest SVD components of $A$ is fundamental in solving (1).

A number of parameter-choice methods have been developed for finding $\lambda_{opt}$ or $k_0$, such as the discrepancy principle [75], the L-curve criterion, whose use goes back to Miller [74] and Lawson and Hanson [72] and which was termed much later and studied in detail in [43,49], and the generalized cross validation (GCV) [33,105]; see, e.g., [5,44,47,64,66,68,79,89,104] for numerous comparisons. All parameter-choice methods aim to make $f_i/\sigma_i$ not small for $i=1,2,\ldots,k_0$ and $f_i/\sigma_i\approx 0$ for $i=k_0+1,\ldots,n$. Each of these methods has its own merits and disadvantages, and none is absolutely reliable for all ill-posed problems. For example, some of the mentioned parameter-choice methods may fail to find accurate approximations to $\lambda_{opt}$; see [37,103] for an analysis of the L-curve method and [44] for some other parameter-choice methods. A further investigation of parameter-choice methods is not our concern in this paper.

The TSVD method and the standard-form Tikhonov regularization produce very similar solutions with essentially the minimum 2-norm error, i.e., the worst-case error [69, p.13]; see [102], [42], [44, p.109-11] and [47, Sections 4.2 and 4.4].
Indeed, for an underlying linear compact equation $Kx=g$, e.g., (2), with the noisy $g$ and true solution $x_{true}(t)$, under the source condition that its solution $x_{true}(t)\in\mathcal{R}(K^*)$ or $x_{true}(t)\in\mathcal{R}(K^*K)$, the range of the adjoint $K^*$ of $K$ or that of $K^*K$, which amounts to assuming that $x_{true}(t)$ or its derivative is square integrable, the errors of the best regularized solutions by the TSVD method and the Tikhonov regularization are order optimal, i.e., of the same order as the worst-case error [69, p.13,18,20,32-40], [77, p.90] and [104, p.7-12]. These conclusions carry over to (1) [104, p.8]. Therefore, both $x_{\lambda_{opt}}$ and $x_{k_0}^{tsvd}$ are best possible solutions to (1) under the above assumptions, and either of them can be taken as the reference standard when assessing the regularizing effects of an iterative solver. For the sake of clarity and analysis, we will take $x_{k_0}^{tsvd}$ as the standard reference.

For (1) large, the TSVD method and the Tikhonov regularization method are generally too demanding, and only iterative regularization methods are computationally viable. A major class of methods has been Krylov iterative solvers that project (1) onto a sequence of low dimensional Krylov subspaces and compute iterates to approximate $x_{true}$ [1,23,32,36,44,47,69]. Of the Krylov iterative solvers, the CGLS (or CGNR) method, which implicitly applies the CG method [34,51] to $A^TAx=A^Tb$, and its mathematically equivalent LSQR algorithm [85] have been most commonly used.
The Krylov solvers CGME (or CGNE) [10,11,19,36,38] and LSMR [11,25] are also choices, which amount to the CG method applied to $\min\|AA^Ty-b\|$ or $AA^Ty=b$ with $x=A^Ty$ and to MINRES [84] applied to $A^TAx=A^Tb$, respectively. These Krylov solvers have been intensively studied and are known to have general regularizing effects [1,21,32,36,38,44,47,52,53] and to exhibit semi-convergence [77, p.89]; see also [10, p.314], [11, p.733], [44, p.135] and [47, p.110]: The iterates converge to $x_{true}$ and their norms increase steadily, and the residual norms decrease in an initial stage; afterwards the noise $e$ starts to deteriorate the iterates so that they start to diverge from $x_{true}$ and instead converge to $x_{naive}$, while their norms increase considerably and the residual norms stabilize. If we stop at the right time, then, in principle, we have a regularization method, where the iteration number plays the role of the regularization parameter. Semi-convergence is due to the fact that the projected problem starts to inherit the ill-conditioning of (1) from some iteration onwards, and a small singular value of the projected problem amplifies the noise considerably.

The regularizing effects of CG type methods were noticed by Lanczos [71] and were rediscovered in [62,92,96]. Based on these works and motivated by a heuristic explanation of the good numerical results obtained with very few CGLS iterations in [62], and realizing that such an excellent performance can only be expected if convergence to the regular part of the solution, i.e., $x_{k_0}^{tsvd}$, takes place before the effects of ill-posedness show up, Björck and Eldén in 1979, on page 13 of [12], foresightedly expressed a fundamental concern about CGLS (and LSQR): More research is needed to tell for which problems this approach will work, and what stopping criterion to choose. See also [44, p.145]. As remarked by Hanke and Hansen [39], the paper [12] was the only extensive survey on algorithmic details until that time, and a strict proof of the regularizing properties of conjugate gradients is extremely difficult. An enormous effort has long been devoted to the study of the regularizing effects of LSQR and CGLS (cf.
[24,30,31,36,38,44,47,52,53,56,78,81,86,91,100]), but hitherto there has been no definitive answer to the above long-standing fundamental question, and the same is true for CGME and LSMR.

For $A$ symmetric, MINRES and MR-II applied to $Ax=b$ directly are alternatives and have been shown to have regularizing effects [14,36,40,47,58,67], but MR-II seems preferable since the noisy $b$ is excluded from the underlying subspace [55,58]. For $A$ nonsymmetric or multiplication with $A^T$ difficult to compute, GMRES and RRGMRES are candidate methods [3,15,16,80], and the latter may be better [58]. The hybrid approaches based on the Arnoldi process were first proposed in [17] and studied in [14,18,73,82]. Gazzola and her coauthors [26,27,28,29] have described a general framework of the hybrid methods and presented various Krylov-Tikhonov methods with different parameter-choice strategies. The regularizing effects of these methods are highly problem dependent, and it appears that they require that the mixing of the left and right singular vectors of $A$ be weak, that is, that $V^TU$ be close to a diagonal matrix; for more details, see, e.g., [58] and [47, p.126].

The behavior of ill-posed problems critically depends on the decay rate of the $\sigma_j$. The following characterization of the degree of ill-posedness of (1) was introduced in [54] and has been widely used [1,23,44,47,76]: If $\sigma_j=\mathcal{O}(j^{-\alpha})$, then (1) is mildly or moderately ill-posed for $\frac{1}{2}<\alpha\leq 1$ or $\alpha>1$, respectively. If $\sigma_j=\mathcal{O}(\rho^{-j})$ with $\rho>1$, $j=1,2,\ldots,n$, then (1) is severely ill-posed. Here for mildly ill-posed problems we add the requirement $\alpha>\frac{1}{2}$, which does not appear in [54] but must be met for $k(s,t)\in L^2(\Omega\times\Omega)$ in (1) [39,44].
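For concreteness, the three regimes can be compared through the consecutive ratios $\sigma_{j+1}/\sigma_j$; the particular values $\rho=2$, $\alpha=2$ and $\alpha=0.8$ below are illustrative choices only.

```python
import numpy as np

j = np.arange(1, 101, dtype=float)
severe = 2.0 ** -j        # sigma_j = rho^{-j} with rho = 2 (illustrative)
moderate = j ** -2.0      # sigma_j = j^{-alpha} with alpha = 2
mild = j ** -0.8          # 1/2 < alpha <= 1

# Consecutive ratios sigma_{j+1}/sigma_j: constant 1/rho in the severe case,
# (j/(j+1))^alpha -> 1 in the power-decay cases, faster for smaller alpha.
for name, s in [("severe", severe), ("moderate", moderate), ("mild", mild)]:
    r = s[1:] / s[:-1]
    print(f"{name:8s} first ratio {r[0]:.4f}, last ratio {r[-1]:.4f}")
```

The severe ratios stay at $1/\rho$, while the power-decay ratios $(j/(j+1))^{\alpha}$ creep toward one, more quickly for the mild case.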
In the one-dimensional case, i.e., $q=1$, (1) is severely ill-posed when $k(s,t)$ is sufficiently smooth, and it is moderately ill-posed with $\sigma_j=\mathcal{O}(j^{-p-1/2})$, where $p$ is the highest order of continuous derivatives of $k(s,t)$; see, e.g., [44, p.8] and [47, p.10-11]. Clearly, the singular values $\sigma_j$ of a severely ill-posed problem decay at the same rate $\rho^{-1}$, while those of a moderately or mildly ill-posed problem decay at the decreasing rate $\left(\frac{j}{j+1}\right)^{\alpha}$, which approaches one more quickly with $j$ for the mildly ill-posed problem than for the moderately ill-posed problem.

If a regularized solution to (1) is at least as accurate as $x_{k_0}^{tsvd}$, then it is called a best possible regularized solution. Given (1), if the regularized solution of an iterative regularization solver at semi-convergence is such a best possible one, then, in the words of Björck and Eldén, the solver works for the problem and is said to have the full regularization. Otherwise, the solver is said to have only the partial regularization.

Because it has long been unknown whether or not LSQR, CGLS, LSMR and CGME have the full regularization for a given (1), one commonly combines them with some explicit regularization, hoping that the resulting hybrid variants find best possible regularized solutions [1,44,47]. A hybrid CGLS runs CGLS for several trial regularization parameters $\lambda$ and picks the best one among the candidates [1]. Its disadvantages are that regularized solutions cannot be updated with different $\lambda$ and that there is no guarantee that the selected regularized solution is a best possible one. The hybrid LSQR variants have been advocated by Björck and Eldén [12] and O'Leary and Simmons [83], and improved and developed by Björck [9] and Björck, Grimme and van Dooren [13]. A hybrid LSQR first projects (1) onto Krylov subspaces and then regularizes the projected problems explicitly.
It aims to remove the effects of small Ritz values and expands the Krylov subspaces until they capture the $k_0$ dominant SVD components of $A$ [9,13,39,83]. The explicit regularization for the projected problems should be introduced and come into effect only after semi-convergence rather than from the very first iteration. If it works, the error norms of the regularized solutions and the residual norms further decrease until they ultimately stabilize. The hybrid LSQR and CGME have been intensively studied in, e.g., [6,7,8,20,38,39,73,80,90] and [1,47,50]. Within the framework of such hybrid solvers, however, it is hard to find a near-optimal regularization parameter [13,90].

In contrast, if an iterative solver is theoretically proved and practically identified to have the full regularization, one simply stops it after semi-convergence, and no complicated hybrid variant and further iterations are needed. Obviously, we cannot emphasize too much the importance of proving the full or partial regularization of LSQR, CGLS, LSMR and CGME. By the definition of the full or partial regularization, we now recast the concern of Björck and Eldén as: Do LSQR, CGLS, LSMR and CGME have the full or partial regularization for severely, moderately and mildly ill-posed problems? How can their full or partial regularization be identified in practice?

In this paper, we focus on LSQR and analyze its regularization for severely, moderately and mildly ill-posed problems. Due to the mathematical equivalence of CGLS and LSQR, the assertions on the full or partial regularization of LSQR apply to CGLS as well. We prove that LSQR has the full regularization for severely and moderately ill-posed problems once $\rho>1$ and $\alpha>1$ suitably, and that it generally has only the partial regularization for mildly ill-posed problems.
In Section 2, we describe the Lanczos bidiagonalization process and LSQR, and make an introductory analysis. In Section 3, we establish $\sin\Theta$ theorems for the 2-norm distance between the underlying $k$-dimensional Krylov subspace and the $k$-dimensional dominant right singular subspace of $A$. We then derive some follow-up results that play a central role in analyzing the regularization of LSQR. In Section 4, for the first two kinds of problems we prove that a $k$-step Lanczos bidiagonalization always generates a near best rank $k$ approximation to $A$, that the $k$ Ritz values always approximate the first $k$ large singular values in natural order, and that no small Ritz value appears for $k=1,2,\ldots,k_0$. This will show that LSQR has the full regularization. For mildly ill-posed problems, we prove that, for some $k\leq k_0$, the $k$ Ritz values generally do not approximate the first $k$ large singular values in natural order and LSQR generally has only the partial regularization. In Section 5, we derive bounds for the entries of the bidiagonal matrices generated by Lanczos bidiagonalization, showing how fast they decay and how to use them to identify whether LSQR has the full regularization when the degree of ill-posedness of (1) is unknown in advance. In Section 6, we report numerical experiments to confirm our theory on LSQR. Finally, we summarize the paper with further remarks in Section 7.

Throughout the paper, we denote by $\mathcal{K}_k(C,w)=\mathrm{span}\{w,Cw,\ldots,C^{k-1}w\}$ the $k$-dimensional Krylov subspace generated by the matrix $C$ and the vector $w$, and by $I$ and the bold letter $\mathbf{0}$ the identity matrix and the zero matrix with orders clear from the context, respectively. For the matrix $B=(b_{ij})$, we define $|B|=(|b_{ij}|)$, and for $C=(c_{ij})$, $|B|\leq|C|$ means $|b_{ij}|\leq|c_{ij}|$ componentwise.

2 The LSQR algorithm

The LSQR algorithm is based on the Lanczos bidiagonalization process, Algorithm 1, which computes two orthonormal bases $\{q_1,q_2,\ldots,q_k\}$ of $\mathcal{K}_k(A^TA,A^Tb)$ and $\{p_1,p_2,\ldots,p_{k+1}\}$ of $\mathcal{K}_{k+1}(AA^T,b)$ for $k=1,2,\ldots,n$, respectively.

Algorithm 1  $k$-step Lanczos bidiagonalization process
1. Take $p_1=b/\|b\|\in\mathbb{R}^m$, and define $\beta_1q_0=0$.
2. For $j=1,2,\ldots,k$
   (i) $r=A^Tp_j-\beta_jq_{j-1}$
   (ii) $\alpha_j=\|r\|$; $q_j=r/\alpha_j$
   (iii) $z=Aq_j-\alpha_jp_j$
   (iv) $\beta_{j+1}=\|z\|$; $p_{j+1}=z/\beta_{j+1}$.

Algorithm 1 can be written in the matrix form

$$AQ_k=P_{k+1}B_k, \tag{12}$$
$$A^TP_{k+1}=Q_kB_k^T+\alpha_{k+1}q_{k+1}e_{k+1}^T, \tag{13}$$

where $e_{k+1}$ is the $(k+1)$-th canonical basis vector of $\mathbb{R}^{k+1}$, $P_{k+1}=(p_1,p_2,\ldots,p_{k+1})$, $Q_k=(q_1,q_2,\ldots,q_k)$ and

$$B_k=\begin{pmatrix}\alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \beta_3 & \ddots & \\ & & \ddots & \alpha_k \\ & & & \beta_{k+1}\end{pmatrix}\in\mathbb{R}^{(k+1)\times k}. \tag{14}$$

It is known from (12) that

$$B_k=P_{k+1}^TAQ_k. \tag{15}$$

We remind the reader that the singular values of $B_k$, called the Ritz values of $A$ with respect to the left and right subspaces $\mathrm{span}\{P_{k+1}\}$ and $\mathrm{span}\{Q_k\}$, are all simple.

At iteration $k$, LSQR solves the problem $\|Ax^{(k)}-b\|=\min_{x\in\mathcal{K}_k(A^TA,A^Tb)}\|Ax-b\|$ and computes the iterate $x^{(k)}=Q_ky^{(k)}$ with

$$y^{(k)}=\arg\min_{y\in\mathbb{R}^k}\|B_ky-\|b\|e_1^{(k+1)}\|=\|b\|B_k^{\dagger}e_1^{(k+1)}, \tag{16}$$

where $e_1^{(k+1)}$ is the first canonical basis vector of $\mathbb{R}^{k+1}$, and the residual norm $\|Ax^{(k)}-b\|$ decreases monotonically with respect to $k$. We have $\|Ax^{(k)}-b\|=\|B_ky^{(k)}-\|b\|e_1^{(k+1)}\|$ and $\|x^{(k)}\|=\|y^{(k)}\|$, both of which can be cheaply computed. Note that $\|b\|e_1^{(k+1)}=P_{k+1}^Tb$. We have

$$x^{(k)}=Q_kB_k^{\dagger}P_{k+1}^Tb, \tag{17}$$

that is, the iterate $x^{(k)}$ by LSQR is the minimum-norm least squares solution to the perturbed problem that replaces $A$ in (1) by its rank $k$ approximation $P_{k+1}B_kQ_k^T$. Recall that the best rank $k$ approximation $A_k$ to $A$ satisfies $\|A-A_k\|=\sigma_{k+1}$. Furthermore, analogous to (9), LSQR now solves

$$\min\|x\|\quad\text{subject to}\quad\|P_{k+1}B_kQ_k^Tx-b\|=\min,\qquad k=1,2,\ldots,n \tag{18}$$

for the regularized solutions $x^{(k)}$ to (1).
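A minimal Python transcription of Algorithm 1 (without the reorthogonalization a production implementation would add) lets one check the matrix relations (12) and (15) directly on a small random example; the sizes and data below are arbitrary.

```python
import numpy as np

def lanczos_bidiag(A, b, k):
    """k-step Lanczos bidiagonalization (Algorithm 1): returns P_{k+1}, Q_k and
    the (k+1) x k lower bidiagonal B_k with A Q_k = P_{k+1} B_k."""
    m, n = A.shape
    P = np.zeros((m, k + 1))
    Q = np.zeros((n, k))
    B = np.zeros((k + 1, k))
    P[:, 0] = b / np.linalg.norm(b)               # p_1 = b / ||b||
    beta, q_prev = 0.0, np.zeros(n)               # beta_1 q_0 = 0
    for j in range(k):
        r = A.T @ P[:, j] - beta * q_prev         # step (i)
        alpha = np.linalg.norm(r)                 # step (ii)
        Q[:, j] = r / alpha
        z = A @ Q[:, j] - alpha * P[:, j]         # step (iii)
        beta = np.linalg.norm(z)                  # step (iv)
        P[:, j + 1] = z / beta
        B[j, j], B[j + 1, j] = alpha, beta
        q_prev = Q[:, j]
    return P, Q, B

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 6))
b = rng.standard_normal(8)
P, Q, B = lanczos_bidiag(A, b, 4)
print(np.linalg.norm(A @ Q - P @ B))              # relation (12): ~ rounding error
print(np.linalg.norm(B - P.T @ A @ Q))            # relation (15)
```

In floating point the columns of $P$ and $Q$ gradually lose orthogonality as Ritz values converge, which is why practical codes reorthogonalize; for a few steps on a well-conditioned random matrix the relations hold to rounding error.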
If $P_{k+1}B_kQ_k^T$ is a near best rank $k$ approximation to $A$ with approximation accuracy $\sigma_{k+1}$ and the $k$ singular values of $B_k$ approximate the first $k$ large ones of $A$ in natural order for $k=1,2,\ldots,k_0$, these two facts relate LSQR and the TSVD method naturally and closely in two ways: (i) $x_k^{tsvd}$ and $x^{(k)}$ are the regularized solutions to the two perturbed problems of (1) that replace $A$ by its two rank $k$ approximations of the same quality, respectively; (ii) $x_k^{tsvd}$ and $x^{(k)}$ solve almost the same two regularization problems (9) and (18), respectively. As a consequence, the LSQR iterate $x^{(k_0)}$ is as accurate as $x_{k_0}^{tsvd}$, and LSQR has the full regularization. Otherwise, as will be clear later, under the discrete Picard condition (7), $x^{(k_0)}$ cannot be as accurate as $x_{k_0}^{tsvd}$ if either $P_{k+1}B_kQ_k^T$ is not a near best rank $k$ approximation to $A$, $k=1,2,\ldots,k_0$, or $B_k$ has at least one singular value smaller than $\sigma_{k_0+1}$ for some $k\leq k_0$. Precisely, if either of these conditions is violated for some $k\leq k_0$ and $\theta_k^{(k)}<\sigma_{k_0+1}$, then $x^{(k)}$ has been deteriorated by the noise $e$, and LSQR has only the partial regularization. We will give a precise definition of a near best rank $k$ approximation to $A$ soon.

3 $\sin\Theta$ theorems for the distances between $\mathcal{K}_k(A^TA,A^Tb)$ and $\mathrm{span}\{V_k\}$ as well as the others related

van der Sluis and van der Vorst [99] prove the following result, which has been used in Hansen [44] and the references therein to illustrate the regularizing effects of LSQR and CGLS. We will also investigate it further in our paper.

Proposition 1  LSQR with the starting vector $p_1=b/\|b\|$ and CGLS applied to $A^TAx=A^Tb$ with the starting vector $x^{(0)}=0$ generate the same iterates

$$x^{(k)}=\sum_{i=1}^n f_i^{(k)}\frac{u_i^Tb}{\sigma_i}v_i,\qquad k=1,2,\ldots,n, \tag{19}$$

where

$$f_i^{(k)}=1-\prod_{j=1}^k\frac{(\theta_j^{(k)})^2-\sigma_i^2}{(\theta_j^{(k)})^2},\qquad i=1,2,\ldots,n, \tag{20}$$

and the $\theta_j^{(k)}$ are the singular values of $B_k$ labeled as $\theta_1^{(k)}>\theta_2^{(k)}>\cdots>\theta_k^{(k)}$.

(19) shows that $x^{(k)}$ has a filtered SVD expansion of the form (11).
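Proposition 1 can be checked numerically on a small dense example. Rather than running Lanczos bidiagonalization, the sketch below builds an orthonormal basis of $\mathcal{K}_k(A^TA,A^Tb)$ directly (any orthonormal basis of the subspace yields the same Ritz values, since $B_k^TB_k=Q_k^TA^TAQ_k$), forms the filters (20) from those Ritz values, and compares the filtered expansion (19) with the least squares minimizer over the subspace, which is the LSQR iterate $x^{(k)}$. The matrix and data are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 12, 8, 3
s = np.linspace(2.0, 1.0, n)                      # modest, well separated spectrum
Ua, _ = np.linalg.qr(rng.standard_normal((m, n)))
Va, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Ua @ np.diag(s) @ Va.T
b = rng.standard_normal(m)

# Orthonormal basis of K_k(A^T A, A^T b)
M = A.T @ A
w = A.T @ b
Qk, _ = np.linalg.qr(np.column_stack([np.linalg.matrix_power(M, j) @ w for j in range(k)]))

# LSQR iterate: minimizer of ||Ax - b|| over the Krylov subspace, cf. (16)-(17)
y, *_ = np.linalg.lstsq(A @ Qk, b, rcond=None)
x_k = Qk @ y

# Squared Ritz values (theta_j^{(k)})^2 = eigenvalues of Q_k^T A^T A Q_k
theta2 = np.linalg.eigvalsh(Qk.T @ M @ Qk)

# Filters (20) and the filtered SVD expansion (19)
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
f = 1.0 - np.prod((theta2[:, None] - sigma[None, :] ** 2) / theta2[:, None], axis=0)
x_filtered = Vt.T @ (f * (U.T @ b) / sigma)

print(np.linalg.norm(x_k - x_filtered))           # agrees up to rounding error
```

The agreement reflects the fact that the residual polynomial of CG on the normal equations has its roots exactly at the squared Ritz values, which is the content of Proposition 1.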
If all the Ritz values $\theta_j^{(k)}$ approximate the first $k$ singular values $\sigma_j$ of $A$ in natural order, the filters satisfy $f_i^{(k)}\approx 1$, $i=1,2,\ldots,k$, and the other $f_i^{(k)}$ monotonically approach zero for $i=k+1,\ldots,n$. This indicates that if the $\theta_j^{(k)}$ approximate the first $k$ singular values $\sigma_j$ of $A$ in natural order for $k=1,2,\ldots,k_0$, then the $k_0$-step LSQR has the full regularization. However, if a small Ritz value appears before some $k\leq k_0$, i.e., $\theta_{k-1}^{(k)}>\sigma_{k_0+1}$ and $\sigma_{j^*}<\theta_k^{(k)}\leq\sigma_{k_0+1}$ with the smallest integer $j^*>k_0+1$, then $f_i^{(k)}\in(0,1)$ tends to zero monotonically for $i=j^*,j^*+1,\ldots,n$; on the other hand, we have

$$\prod_{j=1}^k\frac{(\theta_j^{(k)})^2-\sigma_i^2}{(\theta_j^{(k)})^2}=\frac{(\theta_k^{(k)})^2-\sigma_i^2}{(\theta_k^{(k)})^2}\prod_{j=1}^{k-1}\frac{(\theta_j^{(k)})^2-\sigma_i^2}{(\theta_j^{(k)})^2}\leq 0,\qquad i=k_0+1,\ldots,j^*-1,$$

since the first factor is non-positive and the second factor is positive. Then we get $f_i^{(k)}\geq 1$, $i=k_0+1,\ldots,j^*-1$, indicating that $x^{(k)}$ is already deteriorated and LSQR has only the partial regularization.

The standard $k$-step Lanczos bidiagonalization method computes the $k$ Ritz values $\theta_j^{(k)}$, which are used to approximate some singular values of $A$. It is mathematically equivalent to the symmetric Lanczos method for the eigenvalue problem of $A^TA$ starting with $q_1=A^Tb/\|A^Tb\|$; see [4,10,11,87,101], or [2,60,61] for several variations that are based on standard, harmonic, and refined projection [4,94,101] or a combination of them [59]. It is known that, for a general singular value distribution and $b$, some Ritz values become good approximations to the extreme singular values of $A$ as $k$ increases. If large singular values are well separated but small singular values are clustered, large Ritz values converge fast but small Ritz values converge slowly.

For (1), $A^Tb$ contains more information on the dominant right singular vectors than on the ones corresponding to small singular values. Therefore, $\mathcal{K}_k(A^TA,A^Tb)$ hopefully contains richer information on the first $k$ right singular vectors $v_i$ than on the other $n-k$ ones, at least for $k$ small. Furthermore, note that $A$ has many small singular values clustered at zero.
Due to these two basic facts, all the Ritz values are expected to approximate the large singular values of $A$ in natural order until some iteration $k$, at which a small Ritz value shows up. In this case, the iterates $x^{(k)}$ by LSQR capture only the largest $k$ dominant SVD components of $A$, and they are deteriorated by the noise $e$ dramatically after that iteration. This is why LSQR and CGLS have general regularizing effects; see, e.g., [1,44,46,47,50] and the references therein. Unfortunately, these arguments cannot help us draw any definitive conclusion on the full or partial regularization of LSQR because there has been no quantitative result on the size of such $k$ for any kind of ill-posed problem and noise $e$. For a severely ill-posed example from seismic tomography, it is reported in [100] that the desired convergence of the Ritz values actually holds as long as the discrete Picard condition is satisfied and there is a good separation among the large singular values of $A$. Yet there has been no mathematical justification of these observations.

A complete understanding of the regularization of LSQR includes accurate solutions of the following problems: How accurately does $\mathcal{K}_k(A^TA,A^Tb)$ approximate the $k$-dimensional dominant right singular subspace of $A$? How accurate is the rank $k$ approximation $P_{k+1}B_kQ_k^T$ to $A$? Can it be a near best rank $k$ approximation to $A$? How does the noise level $\|e\|$ affect the approximation accuracy of $\mathcal{K}_k(A^TA,A^Tb)$ and $P_{k+1}B_kQ_k^T$ for $k\leq k_0$ and $k>k_0$, respectively? What sufficient conditions on $\rho$ and $\alpha$ are needed to guarantee that $P_{k+1}B_kQ_k^T$ is a near best rank $k$ approximation to $A$?
