Nearest matrix with prescribed eigenvalues and its applications

E. Kokabifar∗, G.B. Loghmani∗ and S.M. Karbassi†

Abstract. Consider an n×n matrix A and a set Λ consisting of k ≤ n prescribed complex numbers. Lippert (2010), in a challenging article, studied geometrically the spectral norm distance from A to the set of matrices having Λ in their spectrum and constructed a perturbation matrix ∆ with minimum spectral norm such that A+∆ has Λ in its spectrum. This paper presents an easy, practical computational method for constructing the optimal perturbation ∆ by extending the necessary definitions and lemmas of previous works. Some conceivable applications of this problem are also provided.

Keywords: Matrix, Eigenvalue, Perturbation, Singular value.

AMS Classification: 15A18, 65F35, 65F15.

1 Introduction

Let A be an n×n complex matrix and let $\mathcal{L}$ be the set of complex n×n matrices that have λ ∈ C as a prescribed multiple eigenvalue. In 1999, Malyshev [16] obtained the following formula for the spectral norm distance from A to $\mathcal{L}$:

$$\min_{B\in\mathcal{L}}\|A-B\|_2=\max_{\gamma\ge 0}\, s_{2n-1}\!\left(\begin{bmatrix} A-\lambda I & \gamma I\\ 0 & A-\lambda I\end{bmatrix}\right),$$

where ‖·‖_2 denotes the spectral matrix norm and $s_1(\cdot)\ge s_2(\cdot)\ge s_3(\cdot)\ge\cdots$ are the singular values of the corresponding matrix in nonincreasing order. He also constructed a perturbation ∆ of the matrix A such that A+∆ belongs to $\mathcal{L}$ and ∆ is the optimal perturbation of A. Malyshev's work can be considered as a solution of Wilkinson's problem, that is, the computation of the distance from a matrix A ∈ C^{n×n} having only simple eigenvalues to the set of n×n matrices with multiple eigenvalues. Wilkinson introduced this distance in [25], and bounds for it were computed by Ruhe [23], Wilkinson [26–29] and Demmel [6].

However, in the non-generic case where A is a normal matrix, Malyshev's formula is not directly applicable. Ikramov and Nazari [11] pointed this out and obtained an extension of Malyshev's formula for normal matrices. They also extended Malyshev's formula [13] to the spectral norm distance from A to the matrices with a prescribed triple eigenvalue. In 2011, under some conditions, Mengi [20] constructed a perturbation ∆ of minimum spectral norm such that A+∆ belongs to the set of matrices having a prescribed eigenvalue of prespecified algebraic multiplicity. Moreover, Malyshev's work was also extended by Lippert [15] and Gracia [8], who computed the spectral norm distance from A to the matrices with two prescribed eigenvalues. Recently, Lippert [14] introduced a geometric motivation for the results obtained in [8,15] and computed the smallest perturbation in the spectral norm such that the perturbed matrix has some given eigenvalues. On the other hand, in [14] and in Section 6 of [15] it was shown that the optimal perturbations are not always computable in the case of fixing three or more distinct eigenvalues.

Denote by $\mathcal{M}_k$ the set of n×n matrices that have k ≤ n prescribed eigenvalues. This article concerns the spectral norm distance from A to $\mathcal{M}_k$ and describes a clear computational technique for constructing ∆ having minimum spectral norm and satisfying A+∆ ∈ $\mathcal{M}_k$. Some possible applications of this topic are also considered.

∗Department of Mathematics, Faculty of Science, Yazd University, Yazd, Iran ([email protected], [email protected]).
†Department of Mathematics, Yazd Branch, Islamic Azad University, Yazd, Iran ([email protected]).
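To make Malyshev's formula concrete, the following sketch evaluates the singular value function and maximizes it over γ ≥ 0. This is a minimal numerical illustration, assuming NumPy and SciPy are available; the function names (malyshev_sv, malyshev_distance), the search interval, and the bounded scalar search are our choices, not part of [16], and since the objective need not be unimodal the search is only a heuristic.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def malyshev_sv(A, lam, gamma):
    """Evaluate s_{2n-1} of the 2x2 block matrix in Malyshev's formula."""
    n = A.shape[0]
    B = A - lam * np.eye(n)
    Q = np.block([[B, gamma * np.eye(n)],
                  [np.zeros((n, n)), B]])
    s = np.linalg.svd(Q, compute_uv=False)   # nonincreasing order
    return s[2 * n - 2]                      # s_{2n-1}, 0-based index

def malyshev_distance(A, lam, gamma_max=1e3):
    """Heuristic maximization of s_{2n-1} over gamma in [0, gamma_max]."""
    res = minimize_scalar(lambda g: -malyshev_sv(A, lam, g),
                          bounds=(0.0, gamma_max), method='bounded')
    return -res.fun

A = np.array([[1.0, 2.0], [0.0, 3.0]])
print(malyshev_distance(A, 2.0))  # distance from A to a double eigenvalue 2
```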
First, some lower bounds for this distance are obtained. Then, two assumptions are provided under which the optimal perturbation is always computable. It is worth noting that if one or both assumptions fail, we still have A+∆ ∈ $\mathcal{M}_k$, but ∆ does not necessarily have minimum spectral norm; in that case we can give lower and upper bounds for the spectral norm distance from A to $\mathcal{M}_k$. Note that if, as a special case, A is a normal matrix, i.e., A^*A = AA^*, then the method described in this paper cannot be used immediately for the computation of the perturbation. In this case, by following the analysis performed in [11,12,21], one can derive a refinement of our results for normal matrices. Therefore, throughout this paper it is assumed that A is not a normal matrix.

Suppose now that an n×n matrix A and a set of complex numbers Λ = {λ_1, λ_2, ..., λ_k}, where k ≤ n, are given. For

$$\gamma=[\gamma_{1,1},\gamma_{2,1},\ldots,\gamma_{k-1,1},\gamma_{1,2},\gamma_{2,2},\ldots,\gamma_{k-2,2},\ldots,\gamma_{1,k-1}]\in\mathbb{C}^{k(k-1)/2},$$

define the nk×nk upper triangular matrix Q_A(γ) as

$$Q_A(\gamma)=\begin{bmatrix}
B_1 & \gamma_{1,1}I_n & \gamma_{1,2}I_n & \cdots & \gamma_{1,k-1}I_n\\
0 & B_2 & \gamma_{2,1}I_n & \cdots & \gamma_{2,k-2}I_n\\
\vdots & & B_3 & \ddots & \vdots\\
 & & & \ddots & \gamma_{k-1,1}I_n\\
0 & \cdots & & 0 & B_k
\end{bmatrix}_{nk\times nk},\qquad(1)$$

where the γ_{i,1} (i = 1,...,k−1) are purely real variables and B_j = A−λ_jI_n (j = 1,...,k). Clearly, Q_A(γ) can be regarded as a matrix function of the variables γ_{i,j}, i = 1,...,k−1, j = 1,...,k−i. Hereafter, for the sake of simplicity, the positive integer nk−(k−1) is denoted by κ. Let ρ_2(A, M_k) denote the spectral norm distance from A to $\mathcal{M}_k$, i.e.,

$$\rho_2(A,\mathcal{M}_k)=\|\Delta\|_2=\min_{M\in\mathcal{M}_k}\|A-M\|_2.$$

In what follows, some lower bounds for the optimal perturbation ∆ such that A+∆ ∈ $\mathcal{M}_k$ are obtained.

2 Lower bounds for the optimal perturbation

Let us begin by considering s_κ(Q_A(γ)), the κth singular value of Q_A(γ). First, note that s_κ(Q_A(γ)) is a continuous function of the variable γ. Also, if we define the unitary matrix U of the form

$$U=\begin{bmatrix} I_n & & & 0\\ & -I_n & & \\ & & \ddots & \\ 0 & & & (-1)^{k-1}I_n\end{bmatrix}_{nk\times nk},$$

where I_n is the n×n identity matrix, then it is straightforward to see that

$$UQ_A(\gamma)U^*=\begin{bmatrix}
B_1 & -\gamma_{1,1}I_n & \gamma_{1,2}I_n & \cdots & \gamma_{1,k-1}I_n\\
0 & B_2 & -\gamma_{2,1}I_n & \cdots & \gamma_{2,k-2}I_n\\
\vdots & & B_3 & \ddots & \vdots\\
 & & & \ddots & -\gamma_{k-1,1}I_n\\
0 & \cdots & & 0 & B_k
\end{bmatrix}.$$

Since the singular values of a square matrix are invariant under unitary transformations, it follows that s_κ(Q_A(γ)) is an even function with respect to each of its real variables γ_{i,1} (i = 1,...,k−1). Therefore, without loss of generality, in the remainder of the paper we assume that γ_{i,1} ≥ 0 for every i = 1,...,k−1.
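The construction (1) is easy to assemble numerically. The sketch below, assuming NumPy (the name build_Q and the dictionary indexing of γ are ours), builds Q_A(γ) and also previews Lemma 2.1 below: when the λ_i are eigenvalues of A, the κth singular value vanishes for any choice of γ.

```python
import numpy as np

def build_Q(A, lambdas, gamma):
    """Assemble Q_A(gamma) as in (1); gamma[(i, j)] is the scalar on
    block row i, superdiagonal j (1-based, j = 1, ..., k-i)."""
    n = A.shape[0]
    k = len(lambdas)
    Q = np.zeros((n * k, n * k), dtype=complex)
    I = np.eye(n)
    for i in range(1, k + 1):
        Q[(i-1)*n:i*n, (i-1)*n:i*n] = A - lambdas[i-1] * I   # B_i block
        for j in range(1, k - i + 1):
            Q[(i-1)*n:i*n, (i+j-1)*n:(i+j)*n] = gamma[(i, j)] * I
    return Q

# Numerical preview of Lemma 2.1: prescribe k = 3 actual eigenvalues of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
lambdas = np.linalg.eigvals(A)[:3]
gamma = {(1, 1): 0.7, (1, 2): 0.3 + 0.2j, (2, 1): 0.5}
Q = build_Q(A, lambdas, gamma)
n, k = 5, 3
kappa = n * k - (k - 1)                 # kappa = nk - (k-1)
s = np.linalg.svd(Q, compute_uv=False)
print(s[kappa - 1])                     # close to machine precision
```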
Lemma 2.1. If A ∈ C^{n×n} has λ_1, λ_2, ..., λ_k as some of its eigenvalues, then for all γ ∈ C^{k(k−1)/2} it holds that s_κ(Q_A(γ)) = 0.

Proof. Suppose that λ_1, ..., λ_k are some of the eigenvalues of A with associated eigenvectors e_1, e_2, ..., e_k, respectively. Evidently,

$$Ae_i=\lambda_ie_i,\quad\text{and}\quad B_je_i=(\lambda_i-\lambda_j)e_i,\qquad i,j=1,\ldots,k.\qquad(2)$$

We show that Q_A(γ) has k linearly independent eigenvectors corresponding to zero as one of its eigenvalues; this implies that s_κ(Q_A(γ)) = 0. Two cases are considered.

Case 1. Let λ_1, ..., λ_k be k distinct eigenvalues of A. Consequently, e_1, ..., e_k are k linearly independent eigenvectors. Consider now the nk×1 vectors v^1, v^2, ..., v^k such that v^1 = [e_1, 0, ..., 0]^T, and introduce the remaining vectors by the formula

$$v^m_j=\begin{cases}0, & j>m,\\[2pt] e_m, & j=m,\\[2pt] \dfrac{1}{\lambda_j-\lambda_m}\displaystyle\sum_{p=1}^{m-j}\gamma_{j,p}\,v^m_{p+j}, & j=m-1,\ldots,1,\end{cases}\qquad(3)$$

where v^m_j denotes the jth component of v^m. The elements of each vector v^m are computed recursively, starting from the last component. Clearly, v^1, v^2, ..., v^k are k linearly independent vectors. Note that, by (3), every component of v^m is a scalar multiple of e_m, so (2) gives B_i v^m_{p+i} = (λ_m−λ_i)v^m_{p+i}. Using (2) and (3), a straightforward calculation yields

$$\begin{aligned}
(Q_A(\gamma)v^m)_i&=B_iv^m_i+\sum_{p=1}^{m-i}\gamma_{i,p}\,v^m_{p+i}\\
&=\frac{1}{\lambda_i-\lambda_m}\sum_{p=1}^{m-i}\gamma_{i,p}\,B_iv^m_{p+i}+\sum_{p=1}^{m-i}\gamma_{i,p}\,v^m_{p+i}\\
&=-\sum_{p=1}^{m-i}\gamma_{i,p}\,v^m_{p+i}+\sum_{p=1}^{m-i}\gamma_{i,p}\,v^m_{p+i}\\
&=0.
\end{aligned}$$

Thus, the vectors v^1, v^2, ..., v^k satisfy Q_A(γ)v^m = 0 (m = 1,...,k). Consequently, the dimension of the null space of Q_A(γ) is at least k, which means that s_κ(Q_A(γ)) = 0.

Case 2. Assume that some of the complex numbers λ_1, λ_2, ..., λ_k are equal. Without loss of generality, we can assume that λ_1 is an eigenvalue of algebraic multiplicity l, i.e., λ_1 = λ_2 = ... = λ_l. In this case, a different construction is provided for the vectors associated with λ_1, i.e., v^1, v^2, ..., v^l. Obviously, it is enough to construct a set of linearly independent vectors {v^1, ..., v^l, v^{l+1}, ..., v^k} for which Q_A(γ)v^m = 0 (m = 1,...,l). To do this, let e_1 be the right eigenvector for λ_1 and let e_1, ē_1, ē_2, ..., ē_{l−1} form a chain of generalized eigenvectors of length l associated with λ_1, so that

$$(A-\lambda_1I)e_1=0,\quad (A-\lambda_1I)\bar e_1=e_1,\quad\text{and}\quad (A-\lambda_1I)\bar e_j=\bar e_{j-1}\quad(j=2,\ldots,l-1).$$

Define the operator V as follows:

$$V[e_1]=\bar e_1\quad\text{and}\quad V[\bar e_i]=\bar e_{i+1},\qquad i=1,\ldots,l-2.\qquad(4)$$

Now, let v^1 = [e_1, 0, ..., 0]^T and define the vectors v^2, ..., v^l by

$$v^m_j=\begin{cases}0, & j>m,\\[2pt] e_1, & j=m,\\[2pt] -\displaystyle\sum_{p=1}^{m-j}\gamma_{j,p}\,V\!\left[v^m_{p+j}\right], & j=m-1,\ldots,1,\end{cases}$$

where v^m_j denotes the jth component of v^m; as in Case 1, the elements of each vector v^m can be computed by this recursive relation, beginning from the last element. The remaining vectors, associated with the simple eigenvalues, are constructed by the method presented in Case 1. Linear independence of the vectors v^i (i = 1,...,l) follows from their definition; thus v^1, ..., v^k are linearly independent vectors. Also, calculations similar to those performed in Case 1 show that v^1, ..., v^l are l eigenvectors associated with zero as an eigenvalue of Q_A(γ), i.e., Q_A(γ)v^i = 0 for every i = 1,...,l. □

The following corollary is obtained by calculations similar to those of the above lemma.

Corollary 2.2. Let λ_1, λ_2, ..., λ_k be some eigenvalues of A ∈ C^{n×n}. Then for all γ ∈ C^{k(k−1)/2} we have

$$\begin{aligned}
&s_n(A-\lambda_iI)=0, && i=1,\ldots,k,\\
&s_{2n-1}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I\\ 0 & A-\lambda_jI\end{bmatrix}\right)=0, && i,j=1,\ldots,k,\\
&s_{3n-2}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I & \gamma_{1,2}I\\ 0 & A-\lambda_jI & \gamma_{2,1}I\\ 0 & 0 & A-\lambda_lI\end{bmatrix}\right)=0, && i,j,l=1,\ldots,k,\\
&\qquad\vdots\\
&s_\kappa(Q_A(\gamma))=0.
\end{aligned}\qquad(5)$$

Lemma 2.3. Let λ_1, λ_2, ..., λ_k be the prescribed complex numbers. If ∆ is the minimum norm perturbation such that A+∆ ∈ $\mathcal{M}_k$, then for every γ,

$$\|\Delta\|_2\ge s_\kappa(Q_A(\gamma)).$$

Proof. Since A+∆ has λ_1, ..., λ_k among its eigenvalues, Lemma 2.1 gives s_κ(Q_{A+∆}(γ)) = 0. Applying the Weyl inequalities for singular values (see, for example, Corollary 5.1 of [5]) to the relation Q_{A+∆}(γ) = Q_A(γ) + I_k ⊗ ∆ yields

$$s_\kappa(Q_A(\gamma))=\left|s_\kappa(Q_A(\gamma))-s_\kappa(Q_{A+\Delta}(\gamma))\right|\le\|I_k\otimes\Delta\|_2=\|\Delta\|_2.\qquad\Box$$

Calculations similar to those performed above for the singular value s_κ(Q_A(γ)) can be carried out for the remaining singular values appearing on the left-hand side of equations (5). Thus, the following relations are deduced:

$$\begin{aligned}
\|\Delta\|_2&\ge s_n(A-\lambda_iI), && i=1,\ldots,k,\\
\|\Delta\|_2&\ge s_{2n-1}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I\\ 0 & A-\lambda_jI\end{bmatrix}\right), && i,j=1,\ldots,k,\\
\|\Delta\|_2&\ge s_{3n-2}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I & \gamma_{1,2}I\\ 0 & A-\lambda_jI & \gamma_{2,1}I\\ 0 & 0 & A-\lambda_lI\end{bmatrix}\right), && i,j,l=1,\ldots,k,\\
&\;\;\vdots\\
\|\Delta\|_2&\ge s_\kappa(Q_A(\gamma)).
\end{aligned}$$
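The first two of these lower bounds are cheap to evaluate. A sketch, assuming NumPy and SciPy (alpha1, alpha2 and the bounded scalar search over γ_{1,1} are our naming and a heuristic, since the inner maximization may have several local maxima):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def alpha1(A, lambdas):
    """alpha_1 = max_i s_n(A - lambda_i I): the smallest singular value,
    maximized over the prescribed eigenvalues."""
    n = A.shape[0]
    return max(np.linalg.svd(A - lam * np.eye(n), compute_uv=False)[-1]
               for lam in lambdas)

def alpha2(A, lambdas, gamma_max=1e3):
    """alpha_2: for each pair (lambda_i, lambda_j), maximize s_{2n-1}
    of the 2x2 block matrix over the real scalar gamma >= 0."""
    n = A.shape[0]
    I, Z = np.eye(n), np.zeros((n, n))
    best = 0.0
    for li in lambdas:
        for lj in lambdas:
            f = lambda g: -np.linalg.svd(
                np.block([[A - li * I, g * I], [Z, A - lj * I]]),
                compute_uv=False)[2 * n - 2]
            res = minimize_scalar(f, bounds=(0.0, gamma_max),
                                  method='bounded')
            best = max(best, -res.fun)
    return best
```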
The next corollary gives the main result of this section.

Corollary 2.4. Assume that A ∈ C^{n×n} and a set Λ consisting of k ≤ n complex numbers are given. Then for all γ ∈ C^{k(k−1)/2}, the optimal perturbation ∆ satisfies

$$\|\Delta\|_2\ge\max\{\alpha_1,\alpha_2,\ldots,\alpha_k\},\qquad(6)$$

where

$$\begin{aligned}
\alpha_1&=\max\{s_n(A-\lambda_iI):\ i=1,\ldots,k\},\\
\alpha_2&=\max\left\{s_{2n-1}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I\\ 0 & A-\lambda_jI\end{bmatrix}\right):\ i,j=1,\ldots,k\right\},\\
\alpha_3&=\max\left\{s_{3n-2}\!\left(\begin{bmatrix}A-\lambda_iI & \gamma_{1,1}I & \gamma_{1,2}I\\ 0 & A-\lambda_jI & \gamma_{2,1}I\\ 0 & 0 & A-\lambda_lI\end{bmatrix}\right):\ i,j,l=1,\ldots,k\right\},\\
&\;\;\vdots\\
\alpha_k&=s_\kappa(Q_A(\gamma)).
\end{aligned}$$

From this point, the construction of an optimal perturbation is considered. This depends entirely on which of the α_i (i = 1,...,k) is dominant. First, let the maximum of the right-hand side of (6) occur at α_k = s_κ(Q_A(γ)).

3 Properties of s_κ(Q_A(γ)) and its corresponding singular vectors

In this section, we obtain further properties of s_κ(Q_A(γ)) and its associated singular vectors. In the next section, these properties are applied to construct the optimal perturbation ∆. For our discussion, it is necessary to adapt some definitions and lemmas of [8,15,16,20].

Definition 3.1. Suppose that the vectors

$$u(\gamma)=\begin{bmatrix}u_1(\gamma)\\ \vdots\\ u_k(\gamma)\end{bmatrix},\qquad v(\gamma)=\begin{bmatrix}v_1(\gamma)\\ \vdots\\ v_k(\gamma)\end{bmatrix}\in\mathbb{C}^{nk}\qquad(u_j(\gamma),v_j(\gamma)\in\mathbb{C}^n,\ j=1,\ldots,k),$$

are a pair of left and right singular vectors of s_κ(Q_A(γ)), respectively. Define

$$U(\gamma)=[u_1(\gamma),\ldots,u_k(\gamma)]_{n\times k},\qquad V(\gamma)=[v_1(\gamma),\ldots,v_k(\gamma)]_{n\times k}.$$

By the definition of the vectors u(γ) and v(γ), we have the relations

$$Q_A(\gamma)v(\gamma)=s_\kappa(Q_A(\gamma))\,u(\gamma),\qquad(7)$$

$$Q_A(\gamma)^*u(\gamma)=s_\kappa(Q_A(\gamma))\,v(\gamma);\qquad(8)$$

moreover, without loss of generality, assume that u(γ) and v(γ) are unit vectors.

The following lemma, which can be verified by considering Lemma 3.5 of [15] (see also Lemma 3 of [8]), implies that there exists a finite point γ ∈ C^{k(k−1)/2} where the function s_κ(Q_A(γ)) attains its maximum value.

Lemma 3.2. s_κ(Q_A(γ)) → 0 as |γ| → ∞.

Definition 3.3. Let γ_* be a point where the singular value s_κ(Q_A(γ)) attains its maximum value and such that $\prod_{i=1}^{k-1}\gamma_{*i,1}>0$ at this point. We set $\alpha^*_k=s_\kappa(Q_A(\gamma_*))$.

It is easy to verify that if α*_k = 0, then λ_1, λ_2, ..., λ_k are some eigenvalues of A. Therefore, in what follows we assume that α*_k > 0. Moreover, suppose that α*_k is a simple (not repeated) singular value of Q_A(γ_*).

Investigating other properties of s_κ(Q_A(γ)) and its associated singular vectors requires the results deduced in Theorems 2.10 and 2.11 of [20].

Theorem 3.4. Let A(t): R → C^{n×n} be an analytic matrix-valued function. There exists a decomposition A(t) = U(t)S(t)V(t)^*, where U(t): R → C^{n×n} and V(t): R → C^{n×n} are unitary and analytic and S(t) is diagonal and analytic for all t. Assume that u_l(t), v_l(t) are the lth columns of U(t) and V(t), respectively, and s_l(t) is the signed singular value at the lth diagonal entry of S(t). Using the product rule and the fact that A(t)v_l(t) = s_l(t)u_l(t), it is straightforward to deduce

$$\frac{ds_l(t)}{dt}=\mathrm{Re}\!\left(u_l(t)^*\,\frac{dA(t)}{dt}\,v_l(t)\right).\qquad(9)$$

In particular, if s_l(t_*) > 0 at a local extremum t_*, and s_l(t_*) is a simple singular value, then

$$\frac{ds_l(t_*)}{dt}=\mathrm{Re}\!\left(u_l(t_*)^*\,\frac{dA(t_*)}{dt}\,v_l(t_*)\right)=0.\qquad(10)$$

Sun [24] has shown that a simple singular value admits an analytic expansion with a derivative formula similar to (9). In this case, we can differentiate a singular value along the tangent dA(t)/dt from A(t) and find a singular value decomposition that expresses the derivative as the right-hand side of (9).
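Formula (9) is easy to validate numerically for a one-parameter family A(t) = A_0 + tE, for which dA/dt = E. A small check, assuming NumPy (the matrix size, the index l, and the step h are arbitrary choices of ours; agreement holds when s_l is simple):

```python
import numpy as np

rng = np.random.default_rng(1)
n, l = 6, 4                      # check the l-th singular value (1-based)
A0 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
E  = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def svd_l(t):
    """Return s_l and the corresponding left/right singular vectors."""
    U, s, Vh = np.linalg.svd(A0 + t * E)
    return s[l - 1], U[:, l - 1], Vh[l - 1, :].conj()

h = 1e-6
s_plus, _, _ = svd_l(h)
s_minus, _, _ = svd_l(-h)
_, u, v = svd_l(0.0)
lhs = (s_plus - s_minus) / (2 * h)          # numerical ds_l/dt at t = 0
rhs = np.real(u.conj() @ E @ v)             # Re(u_l^* (dA/dt) v_l), as in (9)
print(abs(lhs - rhs))                       # small when s_l is simple
```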
The first part of the above theorem (Theorem 2.10 of [20]) implies that for one-parameter families A(t), Sun's formulas still work, as long as the U's and V's are chosen correctly. Moreover, the second part of Theorem 3.4 (Theorem 2.11 of [20]) ensures that singular vectors u_l(t_*) and v_l(t_*) satisfying equation (10) when s_l(t_*) > 0 can be found. On the other hand, suppose we are given an n-parameter family A(t_1,...,t_n) and it is desired to find singular vectors u and v such that $\frac{\partial s}{\partial t_i}=\mathrm{Re}\!\left(u^*\frac{\partial A}{\partial t_i}v\right)$, i = 1,...,n. In this case, Theorem 3.4 cannot always be applied: the k-eigenvalue case has to deal with this failure of analyticity. Fortunately, we can cope with this essential difficulty by assuming that α*_k is an isolated (simple) singular value, as this qualification has already been imposed. Now, it should be noted that the γ_{i,1} (i = 1,...,k−1) are purely real variables, while the other components of the vector γ are complex variables. If we set γ_{i,j} = γ_{i,j,R} + iγ_{i,j,I} for j > 1, then γ can be regarded as having (k−1)^2 real components.

The following lemma is deduced by applying Theorem 3.4 to Q_A(γ).

Lemma 3.5. Let γ_* and α*_k be as defined in Definition 3.3, and let α*_k > 0 be a simple singular value of Q_A(γ_*). Then there exists a pair

$$\begin{bmatrix}u_1(\gamma_*)\\ \vdots\\ u_k(\gamma_*)\end{bmatrix},\qquad\begin{bmatrix}v_1(\gamma_*)\\ \vdots\\ v_k(\gamma_*)\end{bmatrix}\in\mathbb{C}^{nk}\qquad(u_j(\gamma_*),v_j(\gamma_*)\in\mathbb{C}^n,\ j=1,\ldots,k),$$

of left and right singular vectors of α*_k, respectively, such that

1. $u_i(\gamma_*)^*v_{i+j}(\gamma_*)=0$, i = 1,...,k−1, j = 1,...,k−i, and

2. $u_i(\gamma_*)^*u_j(\gamma_*)=v_i(\gamma_*)^*v_j(\gamma_*)$, i,j = 1,...,k.

Proof. To prove the first part of the lemma, two cases are considered.

Case 1. First assume that j > 1. Applying Theorem 3.4 (Theorem 2.10 of [20]) to the real and imaginary parts of γ_{*i,j} = γ_{*i,j,R} + iγ_{*i,j,I}, we obtain Re(u_i(γ_*)^*v_{i+j}(γ_*)) = 0 and Re(i u_i(γ_*)^*v_{i+j}(γ_*)) = 0, respectively. This means that u_i(γ_*)^*v_{i+j}(γ_*) = 0.

Case 2. Consider the case j = 1. Applying Theorem 3.4 to the real numbers γ_{*i,1} (i = 1,...,k−1) shows that Re(u_i(γ_*)^*v_{i+1}(γ_*)) = 0. So, the first part of the lemma is proved if we show that u_i(γ_*)^*v_{i+1}(γ_*) is a real number. Let us introduce the 1×nk vectors u_{i,i}(γ_*) = [0,...,0, u_i(γ_*)^*, 0,...,0] and v_{i,i}(γ_*) = [0,...,0, v_i(γ_*)^*, 0,...,0], where u_i(γ_*)^* and v_i(γ_*)^* are the ith components. Multiplying (7) by u_{i,i}(γ_*) and (8) by v_{i,i}(γ_*), both from the left, yields, respectively,

$$u_i(\gamma_*)^*B_iv_i(\gamma_*)+\sum_{p=1}^{k-i}\gamma_{*i,p}\,u_i(\gamma_*)^*v_{i+p}(\gamma_*)=\alpha^*_k\,u_i(\gamma_*)^*u_i(\gamma_*),\qquad(11)$$

$$v_i(\gamma_*)^*B_i^*u_i(\gamma_*)+\sum_{p=1}^{i-1}\bar\gamma_{*p,i-p}\,v_i(\gamma_*)^*u_p(\gamma_*)=\alpha^*_k\,v_i(\gamma_*)^*v_i(\gamma_*).\qquad(12)$$

Conjugating (11), subtracting it from (12) and using the results obtained in Case 1 leads to the relation

$$\gamma_{*i,1}\,u_i(\gamma_*)^*v_{i+1}(\gamma_*)=\alpha^*_k\left(u_i(\gamma_*)^*u_i(\gamma_*)-v_i(\gamma_*)^*v_i(\gamma_*)\right).\qquad(13)$$

Clearly, the right-hand side of (13) is a real number, which implies that u_i(γ_*)^*v_{i+1}(γ_*) is also real. Thus, the first part of the lemma is proved.

We now turn to the second part of the lemma. Multiplying (7) from the left by u_{i,j}(γ_*), where u_i(γ_*)^* is in the jth place, and multiplying (8) from the left by v_{j,i}(γ_*), where v_j(γ_*)^* is in the ith place, leads to results similar to (11) and (12), respectively.
Performing calculations similar to those in the first part of the proof leads to

$$u_i(\gamma_*)^*B_jv_j(\gamma_*)-u_i(\gamma_*)^*B_iv_j(\gamma_*)=\alpha^*_k\left(u_i(\gamma_*)^*u_j(\gamma_*)-v_i(\gamma_*)^*v_j(\gamma_*)\right),$$

which can be written as

$$(\lambda_i-\lambda_j)\,u_i(\gamma_*)^*v_j(\gamma_*)=\alpha^*_k\left(u_i(\gamma_*)^*u_j(\gamma_*)-v_i(\gamma_*)^*v_j(\gamma_*)\right).$$

The left-hand side of this equation is equal to zero. Indeed, the assertion is obvious when i = j; for i ≠ j, assuming i < j, the first part of the proof gives u_i(γ_*)^*v_j(γ_*) = 0. The proof is completed by keeping in mind that α*_k > 0. □

Corollary 3.6. Let the assumptions of Lemma 3.5 hold. Then the matrices U(γ_*) and V(γ_*) (recall Definition 3.1, but for γ = γ_*) satisfy U(γ_*)^*U(γ_*) = V(γ_*)^*V(γ_*).

Lemma 3.7. Suppose that the assumptions of Lemma 3.5 are satisfied. Then the two matrices U(γ_*) and V(γ_*) have full rank.

Proof. This result can be verified by considering Lemma 4.4 of [15]. □

4 Construction of the optimal perturbation

Let γ_* and α*_k be as defined in Definition 3.3, and let α*_k > 0 be a simple singular value of Q_A(γ_*). In this section, a perturbation ∆ ∈ C^{n×n} is constructed that satisfies ‖∆‖_2 = α*_k and is such that the perturbed matrix A+∆ has the k prescribed eigenvalues. Suppose that the matrices U(γ_*) and V(γ_*) are as in Corollary 3.6 and define

$$\Delta=-\alpha^*_k\,U(\gamma_*)V(\gamma_*)^\dagger,\qquad(14)$$

where V(γ_*)^† denotes the Moore–Penrose pseudoinverse of V(γ_*). From Corollary 3.6 it can be deduced that the two matrices U(γ_*) and V(γ_*) have the same nonzero singular values, so there exists a unitary matrix W ∈ C^{n×n} such that U(γ_*) = WV(γ_*). Notice that Lemma 3.7 implies V(γ_*)^†V(γ_*) = I_k. Therefore,

$$\|\Delta\|_2=\left\|-\alpha^*_k\,U(\gamma_*)V(\gamma_*)^\dagger\right\|_2=\alpha^*_k\left\|WV(\gamma_*)V(\gamma_*)^\dagger\right\|_2=\alpha^*_k,$$

and

$$\Delta V(\gamma_*)=-\alpha^*_kU(\gamma_*)\ \Longleftrightarrow\ \Delta v_i(\gamma_*)=-\alpha^*_ku_i(\gamma_*),\quad i=1,\ldots,k.\qquad(15)$$

Using (7) and (15), for every i = 1,...,k we obtain

$$(B_i+\Delta)v_i(\gamma_*)=-\sum_{p=1}^{k-i}\gamma_{*i,p}\,v_{i+p}(\gamma_*).\qquad(16)$$

By Lemma 3.7, the vectors v_i(γ_*) (i = 1,...,k) are linearly independent. Consequently, (B_i+∆)v_j(γ_*) is a nonzero vector for every i,j ∈ {1,...,k}, except for the case i = j = k. Now, define the k vectors ψ_k, ψ_{k−1}, ..., ψ_1 by

$$\psi_k=v_k(\gamma_*),\quad\text{and}\quad\psi_i=\left(\prod_{p=i+1}^{k}(B_p+\Delta)\right)v_i(\gamma_*),\qquad i=k-1,\ldots,1.\qquad(17)$$
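Once a maximizer γ_* and the associated singular vectors are in hand, forming ∆ from (14) is immediate. A sketch, assuming NumPy; locating γ_* (an optimization over the real components of γ, not shown here) and extracting the n×k blocks u_i(γ_*), v_i(γ_*) from the singular vectors of Q_A(γ_*) are prerequisites:

```python
import numpy as np

def optimal_perturbation(U_star, V_star, alpha_star):
    """Delta = -alpha_k^* U(gamma_*) V(gamma_*)^dagger, as in (14).
    U_star, V_star: the n x k matrices of Definition 3.1 at gamma_*.
    alpha_star: the maximal singular value s_kappa(Q_A(gamma_*))."""
    return -alpha_star * U_star @ np.linalg.pinv(V_star)

# Sanity checks once Delta is formed:
#   np.linalg.norm(Delta, 2) should equal alpha_star        (norm of (14))
#   np.allclose(Delta @ V_star, -alpha_star * U_star)       (relation (15))
```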