Analysis of the Equivalence Relationship in Joint Sparse Recovery

Changlong Wang, Jigen Peng

Abstract. The joint sparse recovery problem is a generalization of the single measurement vector (SMV) problem widely studied in compressed sensing; it aims to recover a set of jointly sparse vectors, i.e., vectors whose nonzero entries are concentrated at common locations. l_{2,p}-minimization over matrices is widely used in a large number of algorithms designed for this problem, and the main contribution of this paper is two theoretical results about this technique. The first is a proof that for every system of multiple linear equations there exists a constant p* such that the original unique sparse solution can also be recovered by minimization in the l_{2,p} quasi-norm over matrices whenever 0 < p < p*. The second is an analytic expression for such a p*. Finally, we display the results of one example to confirm the validity of our conclusions.

Keywords: sparse recovery, multiple measurement vectors, joint sparse recovery, null space property, lp-minimization

1 INTRODUCTION

In sparse information processing, one of the central problems is to recover a sparse solution of an underdetermined linear system, with applications such as visual coding [18], matrix completion [1], source localization [15], and face recognition [23]. Letting A be an underdetermined matrix of size m × n and b ∈ R^m a vector representing some signal, the single measurement vector (SMV) problem is popularly modeled as the following l_0-minimization:

    min_{x ∈ R^n} ||x||_0   s.t.   Ax = b,   (1)

where ||x||_0 denotes the number of nonzero elements of x. However, l_0-minimization has been proved to be NP-hard [17] because of the discrete and discontinuous nature of ||x||_0. To overcome this difficulty, many researchers have suggested replacing ||x||_0 with ||x||_p^p. Instead of l_0-minimization, they consider l_p-minimization with 0 < p ≤ 1,

    min_{x ∈ R^n} ||x||_p^p   s.t.   Ax = b,   (2)

where ||x||_p^p = Σ_{i=1}^n |x_i|^p ([8] [2]). Due to the fact that ||x||_0 = lim_{p→0} ||x||_p^p, it seems more natural to consider l_p-minimization.

Furthermore, a natural extension of the single measurement vector problem is the joint sparse recovery problem, also known as the multiple measurement vector (MMV) problem, which arises naturally in source localization [14], neuromagnetic imaging [3], and equalization of sparse communication channels [6] [4]. Instead of a single measurement b, we are given a set of r measurements,

    Ax^{(k)} = b^{(k)},   k = 1,...,r,   (3)

in which the vectors x^{(k)} (k = 1,...,r) are jointly sparse, i.e., the solution vectors share a common support and have their nonzero entries concentrated at common locations.

Let A ∈ R^{m×n} and B = [b^{(1)},...,b^{(r)}] ∈ R^{m×r}. The MMV problem looks for the row-sparse solution matrix, and it can be modeled as the following l_{2,0}-minimization problem:

    min_{X ∈ R^{n×r}} ||X||_{2,0}   s.t.   AX = B,   (4)

where ||X||_{2,0} = Σ_{i=1}^n ||X_{row i}||_{2,0}, X_{row i} denotes the i-th row of X, and ||X_{row i}||_{2,0} = 1 if ||X_{row i}||_2 ≠ 0, while ||X_{row i}||_{2,0} = 0 if ||X_{row i}||_2 = 0.

We define the support of X by support(X) = S = {i : ||X_{row i}||_2 ≠ 0}, and we call the solution X k-sparse when |S| ≤ k, where |S| is the cardinality of the set S. We also say that X can be recovered by model (4) if X is the unique solution to model (4). It needs to be emphasized that we cannot regard the solution of the multiple measurement vector (MMV) problem as a combination of several solutions of single measurement vector problems; that is, the solution matrix X of l_{2,0}-minimization is not always composed of the solution vectors of l_0-minimization. For example:

Example 1. We consider an underdetermined system AX = B, where

        [ 2    0     0    1     0   ]            [ 1  1 ]
        [ 0    0.5   0    1     0   ]            [ 1  1 ]
    A = [ 0    0     1    2    -0.5 ]   and  B = [ 0  1 ].
        [ 0    0     0    1    -0.5 ]            [ 0  0 ]

If we treat AX = B = [b_1 b_2] as a combination of two single measurement vector problems, Ax_1 = b_1 and Ax_2 = b_2, it is easy to verify that the sparsest solutions of these two problems are x_1 = [0.5 2 0 0 0]^T and x_2 = [0 0 0 1 2]^T. So let X* = [x_1 x_2]; it is easy to check that ||X*||_{2,0} = 4. In fact, it is easy to verify that

        [ 0.5  0.5 ]
        [ 2    2   ]
    X = [ 0    1   ]
        [ 0    0   ]
        [ 0    0   ]

is the solution of l_{2,0}-minimization, since ||X||_{2,0} = 3 < ||X*||_{2,0} = 4.
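The claims of Example 1 are easy to check numerically. The following minimal sketch (Python/NumPy; our illustration, not part of the paper) verifies that both candidate matrices satisfy AX = B and compares their numbers of nonzero rows:

```python
import numpy as np

A = np.array([[2, 0,   0, 1,  0.0],
              [0, 0.5, 0, 1,  0.0],
              [0, 0,   1, 2, -0.5],
              [0, 0,   0, 1, -0.5]])
B = np.array([[1.0, 1], [1, 1], [0, 1], [0, 0]])

# X* stacks the two SMV solutions x_1, x_2 column by column.
X_star = np.array([[0.5, 0], [2, 0], [0, 0], [0, 1], [0, 2]])
# X is the jointly row-sparse MMV solution from Example 1.
X = np.array([[0.5, 0.5], [2, 2], [0, 1], [0, 0], [0, 0]])

def l20(M):
    """||M||_{2,0}: the number of nonzero rows of M."""
    return int(np.sum(np.linalg.norm(M, axis=1) > 1e-12))

assert np.allclose(A @ X_star, B) and np.allclose(A @ X, B)
print(l20(X_star), l20(X))  # prints "4 3": X is strictly row-sparser
```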
With this simple Example 1, we should be aware that the MMV problem asks for a jointly sparse solution, not a solution that is merely composed of sparse vectors. The MMV problem is therefore more complex than the SMV problem, and it needs its own theoretical work. Inspired by l_p-minimization, a popular approach to finding the sparsest solution of the MMV problem is to solve the following l_{2,p}-minimization problem:

    min_{X ∈ R^{n×r}} ||X||_{2,p}^p   s.t.   AX = B,   (5)

where the mixed norm ||X||_{2,p}^p = Σ_{i=1}^n ||X_{row i}||_2^p and p ∈ (0,1].

1.1 Related Work

Many researchers have contributed to the existence, uniqueness, and other properties of l_{2,p}-minimization [13][7][12][22]. Eldar [5] gives a sufficient condition for MMV when p = 1, and Unser [21] analyses some properties of the solution of l_{2,p}-minimization when p = 1. Foucart and Gribonval [7] studied the MMV setting with r = 2 and p = 1; they gave a sufficient and necessary condition for a k-sparse matrix X to be recoverable by l_{2,p}-minimization. Furthermore, Lai and Liu [12] considered the MMV setting with r ≥ 2 and p ∈ [0,1]; they improved the condition of [7] and gave a sufficient and necessary condition for r ≥ 2.

On the other hand, numerous algorithms have been proposed and studied for l_{2,0}-minimization (e.g. [11] [10]). Orthogonal Matching Pursuit (OMP) algorithms have been extended to the MMV problem [20], and convex optimization formulations with mixed norms extend the corresponding SMV solutions [16]. Hyder [10] provides a robust algorithm for l_{2,p}-minimization which shows a clear improvement in both noiseless and noisy environments.

Due to the fact that ||X||_{2,0} = lim_{p→0} ||X||_{2,p}^p, it seems more natural to consider l_{2,p}-minimization instead of the NP-hard l_{2,0}-minimization. However, whether there exists a general equivalence relationship between l_{2,p}-minimization and l_{2,0}-minimization is an important theoretical problem.

In the case r = 1, Peng [19] has given a definite answer to this theoretical problem: there exists a constant p(A,b) > 0 such that every solution of l_p-minimization is also a solution of l_0-minimization whenever 0 < p < p(A,b), with

    p(A,b) = [ ln( min_{Ax=b} ||x||_0 + 1 ) − ln( min_{Ax=b} ||x||_0 ) ] / ( ln r − ln r_m ),   (6)

where r and r_m are quantities associated with (A,b) in [19]. However, this range cannot be calculated explicitly. Moreover, Peng [19] only proves the conclusion when r = 1, so it is urgent to extend this conclusion to the MMV problem; furthermore, he proves only the existence of such a p and does not give a computable expression for it. Therefore, the main purpose of this paper is not only to prove the equivalence relationship between l_{2,p}-minimization and l_{2,0}-minimization, but also to present an analytic expression for such a p, in Section 2 and Section 3.
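Before stating our contribution, the limit ||X||_{2,0} = lim_{p→0} ||X||_{2,p}^p that motivates this program can be observed numerically. A minimal sketch (Python/NumPy; our illustration, reusing the matrix X of Example 1):

```python
import numpy as np

def l2p(M, p):
    """Mixed quasi-norm ||M||_{2,p}^p = sum_i ||row_i(M)||_2^p."""
    row_norms = np.linalg.norm(M, axis=1)
    return float(np.sum(row_norms[row_norms > 0] ** p))

X = np.array([[0.5, 0.5], [2, 2], [0, 1], [0, 0], [0, 0]])
for p in [1.0, 0.5, 0.1, 0.01]:
    print(p, l2p(X, p))  # 4.54..., 3.52..., 3.08..., 3.007...: tends to ||X||_{2,0} = 3
```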
1.2 Main Contribution

In this paper, we focus on the equivalence relationship between l_{2,p}-minimization and l_{2,0}-minimization. Furthermore, an analytic expression for such a p* is needed in applications, especially in designing algorithms for l_{2,p}-minimization. In brief, this paper gives answers to two problems which urgently need to be solved:

(I). There exists a constant p* such that every k-sparse solution matrix X of l_{2,0}-minimization is also the solution of l_{2,p}-minimization whenever 0 < p < p*.

(II). We give an analytic expression for such a p*, formulated in terms of the dimensions of the matrix A ∈ R^{m×n}, the eigenvalues of the matrix A^T A, and B ∈ R^{m×r}.

Our paper is organized as follows. In Section 2, we present some preliminaries on the null space condition, which plays a core role in the proof of our main theorem, and prove the equivalence relationship between l_{2,p}-minimization and l_{2,0}-minimization. In Section 3, we focus on proving the other main result of this paper: an analytic expression for such a p*. Finally, we summarize our findings in the last section.

1.3 Notation

For convenience, for x ∈ R^n, we define its support by support(x) = {i : x_i ≠ 0} and the cardinality of a set S by |S|. Let Ker(A) = {x ∈ R^n : Ax = 0} be the null space of the matrix A; denote by λ_min^+(A) the minimum nonzero absolute-value eigenvalue of A^T A and by λ_max(A) the maximum one. We use the subscript notation x_S to denote the vector that is equal to x on the index set S and zero everywhere else, and the subscript notation X_S to denote the matrix whose rows are those rows of X indexed by S and zero everywhere else. Let X_{col i} be the i-th column of X and X_{row i} the i-th row of X, i.e., X = [X_{col 1}, X_{col 2},...,X_{col r}] = [X_{row 1}, X_{row 2},...,X_{row n}]^T for X ∈ R^{n×r}. We use ⟨A,B⟩ = tr(A^T B) and ||A||_F = (Σ_{i,j} |a_{ij}|^2)^{1/2}.

2 EQUIVALENCE RELATIONSHIP BETWEEN l_{2,p}-MINIMIZATION AND l_{2,0}-MINIMIZATION

In the single measurement vector (SMV) problem, there exists a sufficient and necessary condition, namely the null space condition, for judging whether a k-sparse vector can be recovered by l_0-minimization and l_p-minimization.

Theorem 1. [9] Given a matrix A ∈ R^{m×n} with m ≤ n, every x* with ||x*||_0 = k can be recovered by l_p-minimization (0 ≤ p ≤ 1) if and only if

    ||x_S||_p < ||x_{S^C}||_p   (7)

for every x ∈ Ker(A)\{0} and every set S ⊂ {1,2,3,...,n} with |S| ≤ |T*|, where T* = support(x*).

The null space condition is widely used in sparse recovery theory; however, it only considers a single measurement, which we can treat as the case r = 1 of the MMV problem. In [12], the well-known null space condition has been extended to the case r > 1.

Theorem 2. (Theorem 1.3 of [12]) Let A be a real matrix of size m × n and let S ⊆ {1,2,...,n} be a fixed index set. Fix p ∈ [0,1] and r ≥ 1. Then the following conditions are equivalent:

(a) All x^{(k)} with support in S for k = 1,...,r can be uniquely recovered by l_{2,p}-minimization.

(b) For all matrices Z = [z^{(1)}, z^{(2)},...,z^{(r)}] ∈ (N(A))^r \ {(0,0,...,0)},

    ||Z_S||_{2,p} < ||Z_{S^C}||_{2,p}.   (8)

(c) For all vectors z ∈ N(A)\{0}, we have Σ_{j∈S} |z_j|^p < Σ_{j∈S^C} |z_j|^p.
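For a concrete A and S, condition (b) can be probed numerically by sampling the null space; a passing test is only evidence, not a proof, since (8) must hold for all null space matrices Z. A minimal sketch (Python with NumPy/SciPy; our illustration):

```python
import numpy as np
from scipy.linalg import null_space

def l2p_pow(M, p):
    """||M||_{2,p}^p for p > 0, and the nonzero-row count for p = 0."""
    rn = np.linalg.norm(M, axis=1)
    return np.sum(rn > 1e-12) if p == 0 else np.sum(rn ** p)

def nsc_sample_test(A, S, p, r, trials=10000, seed=0):
    """Sample Z in (N(A))^r and test ||Z_S||_{2,p}^p < ||Z_{S^C}||_{2,p}^p."""
    rng = np.random.default_rng(seed)
    N = null_space(A)                    # orthonormal basis of Ker(A)
    Sc = np.setdiff1d(np.arange(A.shape[1]), S)
    for _ in range(trials):
        Z = N @ rng.standard_normal((N.shape[1], r))
        if np.linalg.norm(Z) < 1e-12:
            continue
        if l2p_pow(Z[np.array(S)], p) >= l2p_pow(Z[Sc], p):
            return False                 # found a violating Z: (8) fails
    return True                          # no violation found among samples
```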
It is worth pointing out that Theorem 2 not only provides a sufficient and necessary condition in its MMV version, but also proves the equivalence between the cases r = 1 and r > 1. From Theorem 2 we can get the following corollary, which is very easy to prove.

Corollary 1. Given a matrix A ∈ R^{m×n}, if every X* ∈ R^{n×r} with ||X*||_{2,0} = k can be recovered by l_{2,0}-minimization, then we have the following conclusions:

(a) For any X ∈ (N(A))^r \ {(0,0,...,0)}, we have ||X||_{2,0} ≥ 2k + 1.

(b) k ≤ ⌊(n − 2.5)/2⌋ + 1, where ⌊a⌋ represents the integer part of a.

(c) The number of measurements needed to recover every k-sparse matrix always satisfies m ≥ 2k; furthermore, k ≤ ⌊m/2⌋.

Proof. (a) According to Theorem 2, for any X ∈ (N(A))^r \ {(0,0,...,0)} and |S| ≤ k, we have

    ||X_S||_{2,0} < ||X_{S^C}||_{2,0}.   (9)

Choosing S to contain k nonzero rows of X whenever possible, (9) forces more than k nonzero rows outside S, and it is easy to get that

    ||X||_{2,0} ≥ 2k + 1.   (10)

(b) According to the proof of (a), we have n ≥ ||X||_{2,0} ≥ 2k + 1. Due to the integer value of k, we get k ≤ (n − 1)/2 when n is an odd number and, similarly, k ≤ (n − 2)/2 when n is an even number. In brief, k ≤ ⌊(n − 2.5)/2⌋ + 1, where ⌊a⌋ represents the integer part of a.

(c) For any x̃ ∈ N(A)\{0}, consider X̃ = [x̃, x̃,...,x̃] ∈ (N(A))^r. According to the proof of (a), it is obvious that ||x̃||_0 = ||X̃||_{2,0} ≥ 2k + 1, so every sub-matrix A_S formed by 2k columns of A has full column rank; otherwise N(A) would contain a nonzero vector supported on 2k indices. Therefore 2k ≤ rank(A) ≤ m < n. Due to the integer value of k, we can also say that k ≤ ⌊m/2⌋.

To further clarify the meaning of the new version of the null space condition, and to use it more conveniently, it is necessary to introduce a new concept named the M-null space constant (M-NSC).

Definition 1. Given an underdetermined matrix A ∈ R^{m×n}, for every p ∈ [0,1] and a positive integer k, the M-null space constant h(p,A,r,k) is the smallest number such that

    ||X_S||_{2,p}^p ≤ h(p,A,r,k) ||X_{S^C}||_{2,p}^p,   when 0 < p ≤ 1,

and

    ||X_S||_{2,0} ≤ h(0,A,r,k) ||X_{S^C}||_{2,0},   when p = 0,

for every index set S ⊂ {1,2,...,n} with |S| ≤ k and every X ∈ (Ker(A))^r \ {(0,0,...,0)}.

According to the definition of the M-NSC, it is easy to get the following corollary, which is also very easy to prove; we leave the proof to the reader.

Corollary 2. Every k-sparse matrix X ∈ R^{n×r} can be recovered by l_{2,p}-minimization if and only if h(p,A,r,k) < 1.

As shown in Corollary 2, the M-NSC provides a sufficient and necessary condition for recovery by l_{2,0}-minimization and l_{2,p}-minimization, and it is important for proving the equivalence relationship between them. Furthermore, we emphasize a few important properties of h(p,A,r,k).
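Although h(p,A,r,k) is difficult to compute exactly, a sampled lower bound is straightforward to obtain, and this is how a curve such as Figure 1 can be approximated. A rough sketch (Python with NumPy/SciPy; our own heuristic, exact only in the limit of exhaustive search):

```python
import numpy as np
from itertools import combinations
from scipy.linalg import null_space

def ratio(p, X, S, Sc):
    """||X_S||_{2,p}^p / ||X_{S^C}||_{2,p}^p (nonzero-row counts when p = 0)."""
    rn = np.linalg.norm(X, axis=1)
    num = np.sum(rn[S] > 1e-12) if p == 0 else np.sum(rn[S] ** p)
    den = np.sum(rn[Sc] > 1e-12) if p == 0 else np.sum(rn[Sc] ** p)
    return num / den if den > 0 else np.inf

def mnsc_estimate(A, r, k, p, samples=5000, seed=1):
    """Sampled lower bound on h(p, A, r, k); it suffices to scan |S| = k."""
    rng = np.random.default_rng(seed)
    N = null_space(A)
    n = A.shape[1]
    best = 0.0
    for _ in range(samples):
        X = N @ rng.standard_normal((N.shape[1], r))
        for S in combinations(range(n), k):
            S = np.array(S)
            best = max(best, ratio(p, X, S, np.setdiff1d(np.arange(n), S)))
    return best
```

Evaluating mnsc_estimate on the matrix A of Example 1 over a grid of p, for k = 1 and k = 2, should reproduce the two curves of Figure 1 up to sampling error.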
Proposition 1. The M-NSC, as defined in Definition 1, is nondecreasing in p ∈ [0,1].

Proof. The proof is divided into two steps.

Step 1: To prove h(p,A,r,k) ≤ h(1,A,r,k) for any p ∈ [0,1].

For any X ∈ (N(A))^r \ {(0,0,...,0)}, without loss of generality, we assume that ||X_{row 1}||_2 ≥ ||X_{row 2}||_2 ≥ ... ≥ ||X_{row n}||_2. We define a function θ(p,X,k) by

    θ(p,X,k) = ( Σ_{i=1}^k ||X_{row i}||_2^p ) / ( Σ_{i=k+1}^n ||X_{row i}||_2^p );   (11)

then it is easy to see that the definition of h(p,A,r,k) is equivalent to

    h(p,A,r,k) = max_{|S|≤k} sup_{X ∈ (N(A))^r \ {(0,0,...,0)}} θ(p,X,k).   (12)

For any p ∈ [0,1], the function f(t) = t^{p−1} (t > 0) is non-increasing. For any j ∈ {k+1,...,n} and i ∈ {1,2,...,k}, we have ||X_{row j}||_2 ≤ ||X_{row i}||_2 and hence

    ||X_{row j}||_2^p / ||X_{row j}||_2 ≥ ||X_{row i}||_2^p / ||X_{row i}||_2.   (13)

We can rewrite inequality (13) as

    ||X_{row i}||_2^p / ||X_{row j}||_2^p ≤ ||X_{row i}||_2 / ||X_{row j}||_2.   (14)

Summing over i, we get

    ( Σ_{i=1}^k ||X_{row i}||_2^p ) / ||X_{row j}||_2^p ≤ ( Σ_{i=1}^k ||X_{row i}||_2 ) / ||X_{row j}||_2,   (15)

and we can conclude that

    ( Σ_{j=k+1}^n ||X_{row j}||_2^p ) / ( Σ_{i=1}^k ||X_{row i}||_2^p ) ≥ ( Σ_{j=k+1}^n ||X_{row j}||_2 ) / ( Σ_{i=1}^k ||X_{row i}||_2 ),   (16)

so that 1/θ(p,X,k) ≥ 1/θ(1,X,k), i.e., θ(p,X,k) ≤ θ(1,X,k). Because h(p,A,r,k) = max_{|S|≤k} sup_X θ(p,X,k), we get h(p,A,r,k) ≤ h(1,A,r,k).

Step 2: To prove h(pq,A,r,k) ≤ h(p,A,r,k) for any p ∈ [0,1] and q ∈ (0,1).

According to the definition of θ(p,X,k) in Step 1, we have

    θ(pq,X,k) = ( Σ_{i=1}^k ||X_{row i}||_2^{pq} ) / ( Σ_{j=k+1}^n ||X_{row j}||_2^{pq} )
              = ( Σ_{i=1}^k (||X_{row i}||_2^p)^q ) / ( Σ_{j=k+1}^n (||X_{row j}||_2^p)^q )
              ≤ ( Σ_{i=1}^k ||X_{row i}||_2^p ) / ( Σ_{j=k+1}^n ||X_{row j}||_2^p ).   (17)

It needs to be pointed out that we proved in Step 1 the fact that

    ( Σ_{i=1}^k |u_i|^q ) / ( Σ_{j=k+1}^n |u_j|^q ) ≤ ( Σ_{i=1}^k |u_i| ) / ( Σ_{j=k+1}^n |u_j| )   (18)

for any |u_1| ≥ |u_2| ≥ ... ≥ |u_n| and q ∈ (0,1); applying (18) with u_i = ||X_{row i}||_2^p yields (17).

Therefore, θ(pq,X,k) ≤ θ(p,X,k); in other words, θ(p_1,X,k) ≤ θ(p_2,X,k) as long as p_1 ≤ p_2. Because h(p,A,r,k) = max_{|S|≤k} sup_X θ(p,X,k), we get that h(p,A,r,k) is nondecreasing in p ∈ [0,1]. The proof is completed.
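Proposition 1 can be sanity-checked numerically: for a fixed X, the quantity θ(p,X,k) of (11) should be nondecreasing in p. A small sketch (Python/NumPy; our illustration):

```python
import numpy as np

def theta(p, X, k):
    """theta(p, X, k) from (11), with rows sorted by decreasing l2 norm."""
    rn = np.sort(np.linalg.norm(X, axis=1))[::-1]
    return np.sum(rn[:k] ** p) / np.sum(rn[k:] ** p)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))      # any matrix with nonzero rows will do
vals = [theta(p, X, 2) for p in np.linspace(0.05, 1.0, 20)]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))  # nondecreasing in p
```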
Proposition 2. The M-NSC, as defined in Definition 1, is a continuous function of p ∈ [0,1].

Proof. As proved in Proposition 1, h(p,A,r,k) is nondecreasing in p ∈ [0,1], so any discontinuity of h(p,A,r,k) would be a jump discontinuity. Therefore, it is enough to prove that h(p,A,r,k) cannot have jump discontinuities. For convenience, we still use θ(p,X,S) as defined in the proof of Proposition 1, and the following proof is divided into three steps.

Step 1. To prove that there exist X ∈ (N(A))^r and a set S ⊂ {1,2,...,n} such that θ(p,X,S) = h(p,A,r,k).

Let V = {X ∈ (N(A))^r : Σ_{i=1}^n ||X_{row i}||_2 = 1}; since θ is invariant under scaling of X, it is easy to get that

    h(p,A,r,k) = max_{|S|≤k} sup_{X∈V} θ(p,X,S).

It needs to be pointed out that there are only finitely many choices of the set S ⊂ {1,2,...,n} with |S| ≤ k, so there exists a set S′ with |S′| ≤ k such that h(p,A,r,k) = sup_{X∈V} θ(p,X,S′). On the other hand, θ(p,X,S′) is obviously continuous in X on V. Because of the compactness of V, there exists X′ ∈ V such that h(p,A,r,k) = θ(p,X′,S′).

Step 2. To prove that lim_{p→p_0^−} h(p,A,r,k) = h(p_0,A,r,k).

Assume that lim_{p→p_0^−} h(p,A,r,k) ≠ h(p_0,A,r,k). According to Proposition 1, h(p,A,r,k) is nondecreasing in p ∈ [0,1]; therefore, we can get a sequence {p_n} with p_n → p_0^− such that

    lim_{n→∞} h(p_n,A,r,k) = M < h(p_0,A,r,k).   (19)

According to the proof in Step 1, there exist X′ ∈ (N(A))^r and S′ ⊂ {1,2,...,n} such that h(p_0,A,r,k) = θ(p_0,X′,S′). It is easy to get that

    lim_{n→∞} θ(p_n,X′,S′) = θ(p_0,X′,S′) = h(p_0,A,r,k).   (20)

According to the definition of θ(p,X,S), it is obvious that

    h(p_n,A,r,k) ≥ θ(p_n,X′,S′);   (21)

however, (19), (20), and (21) contradict each other. Therefore, lim_{p→p_0^−} h(p,A,r,k) = h(p_0,A,r,k).

Step 3. To prove that lim_{p→p_0^+} h(p,A,r,k) = h(p_0,A,r,k) for any p_0 ∈ [0,1).

We consider a sequence {p_n} with p_0 ≤ p_n < 1 and p_n → p_0^+. According to Step 1, there exist X_n ∈ V and S_n with |S_n| ≤ k such that

    h(p_n,A,r,k) = θ(p_n,X_n,S_n).   (22)

Since there are only finitely many choices of S ⊂ {1,2,...,n} with |S| ≤ k, there exist subsequences {p_{n_i}} of {p_n} and {X_{n_i}} of {X_n} and a set S′ such that

    θ(p_{n_i},X_{n_i},S′) = h(p_{n_i},A,r,k).   (23)

Furthermore, since X_{n_i} ∈ V, it is easy to get a subsequence of {X_{n_i}} which is convergent; without loss of generality, we assume that X_{n_i} → X′. Therefore, we can get that h(p_{n_i},A,r,k) = θ(p_{n_i},X_{n_i},S′) → θ(p_0,X′,S′). According to the definition of h(p_0,A,r,k), we have θ(p_0,X′,S′) ≤ h(p_0,A,r,k), so that lim_{p→p_0^+} h(p,A,r,k) = h(p_0,A,r,k).

Combining Step 2 and Step 3, we see that it is impossible for h(p,A,r,k) to have a jump discontinuity. The proof is completed.

The concept of the M-NSC is very important in this paper, and it offers tremendous help in illustrating the performance of l_{2,0}-minimization and l_{2,p}-minimization; however, the M-NSC is difficult to calculate for a large-scale matrix. We show the M-NSC of Example 1 in Figure 1.

[Figure 1: M-NSC in Example 1; the curves of h(p,A,r,k) for k = 1 and k = 2 over p ∈ (0,1].]

Combining Propositions 1 and 2, we can get the first main theorem, which shows the equivalence relationship between l_{2,0}-minimization and l_{2,p}-minimization.

Theorem 3. If every k-sparse matrix X can be recovered by l_{2,0}-minimization, then there exists a constant p(A,B,r) such that X can also be recovered by l_{2,p}-minimization whenever 0 < p < p(A,B,r).

Proof. According to Propositions 1 and 2, we get h(0,A,r,k) < 1 if l_{2,0}-minimization can recover every k-sparse matrix X. Since h(p,A,r,k) is continuous and nondecreasing at the point p = 0, there exist a constant p(A,B,r) and a small enough number δ such that h(0,A,r,k) ≤ h(p,A,r,k) ≤ h(0,A,r,k) + δ < 1 for any p ∈ (0,p(A,B,r)). The proof is completed.

3 AN ANALYTIC EXPRESSION OF SUCH p

In Section 2, we proved the fact that there exists a constant p(A,B,r) such that l_{2,p}-minimization and l_{2,0}-minimization have the same solution; however, it is also important to give an analytic expression for p(A,B,r). In this section, we focus on giving an analytic expression of such a p(A,B,r).
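As a numerical companion to Theorem 3, the sampled M-NSC bound mnsc_estimate sketched after Corollary 2 can be combined with the monotonicity of Proposition 1 to bisect for the largest p at which the estimate stays below 1. This heuristic (Python; our illustration, not the analytic expression derived in this section):

```python
def p_star_estimate(A, r, k, tol=1e-3):
    """Bisect for sup{p in (0,1]: h(p,A,r,k) < 1}, using the sampled
    lower bound mnsc_estimate from the earlier sketch (heuristic only)."""
    if mnsc_estimate(A, r, k, 1.0) < 1.0:
        return 1.0                     # estimate stays below 1 on all of (0, 1]
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mnsc_estimate(A, r, k, mid) < 1.0:
            lo = mid                   # h(mid) < 1: recovery holds, move up
        else:
            hi = mid                   # h(mid) >= 1: move down
    return lo
```

Since the sampled value underestimates the true h(p,A,r,k), the returned p should be treated as an optimistic estimate of p(A,B,r).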
