epl draft

Statistical Mechanics of Dictionary Learning

Ayaka Sakata and Yoshiyuki Kabashima

Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama 226-8502, Japan.

arXiv:1203.6178v3 [cond-mat.dis-nn] 25 Jan 2013

PACS 89.20.Ff – Computer science and technology
PACS 75.10.Nr – Spin-glass and other random models

Abstract – Finding a basis matrix (dictionary) by which objective signals are represented sparsely is of major relevance in various scientific and technological fields. We consider a problem to learn a dictionary from a set of training signals. We employ techniques of statistical mechanics of disordered systems to evaluate the size of the training set necessary to typically succeed in the dictionary learning. The results indicate that the necessary size is much smaller than previously estimated, which theoretically supports and/or encourages the use of dictionary learning in practical situations.

Introduction. – In various fields of science and technology, such as earth observation, astronomy, medicine, civil engineering, materials science, and in compiling image databases [1], it is of major relevance to recover original signals from deficient signals obtained by a limited number of measurements. The Nyquist-Shannon sampling theorem [2] provides the necessary and sufficient number of measurements for recovering arbitrary band-limited signals. However, techniques based on this theorem sometimes do not match restrictions and/or demands of today's front-line applications [3,4], and much effort is still being made to find more efficient methodologies.

The concept of sparse representations has recently drawn great attention in such research. Many real world signals such as natural images are represented sparsely in Fourier/wavelet domains; namely, many components vanish or are negligibly small in amplitude when the signals are represented by Fourier/wavelet bases. This empirical property is exploited in the signal recovery paradigm of compressed sensing (CS), enabling the recovery of sparse signals from much fewer measurements than those the sampling theorem estimates [5–10].

However, the effectiveness of CS relies considerably on the assumption that a basis by which the objective signals look sparse is known in advance. Therefore, in applying CS to general signals of interest, whose bases for sparse representation are unknown, the primary task to accomplish is to identify an appropriate basis (dictionary) for the sparse representation from an available set of training signals. This is often termed dictionary learning (DL) [11–13].

Let us denote the training set of $M$-dimensional signals as an $M \times P$ matrix $Y = \{Y_{\mu l}\}$, where each column vector $Y_l$ represents a sample signal and $P$ is the number of the samples. In a simple scenario, DL is formulated as a problem to find a pair of an $M \times N$ matrix (dictionary) $D = \{D_{\mu i}\}$ and an $N \times P$ sparse matrix $X = \{X_{il}\}$ such that $Y = DX$ holds. By DL, the characteristics/trends underlying $Y$ are extracted into $D$, and $\{Y_l\}$ can be compactly represented as a superposition of a few dictionary columns, whose combination and strength are specified by the sparse matrix $X$. DL suits not only efficient signal processing such as CS, but also extraction of non-trivial regularities from high-dimensional data. For instance, DL has been successfully applied to facial image processing for the efficient storage of large databases, where standard algorithms fail [12,14]. In this case, $Y_l$ and $D$ correspond to a facial image and a collection of patches of facial patterns learned from the $P$ samples, respectively. A variant of DL has also been employed in gene expression analysis to estimate transcription factor activity $D$ from gene expression data of a small size $\{Y_l\}$ [15].

An important question of DL is how large a sample size $P$ is necessary to uniquely identify an appropriate dictionary $D$, because the ambiguity of the dictionary is fatal in use for signal/data analysis after learning. As the first answer to this question, an earlier study based on linear algebra showed that when the training set $Y$ is generated by a pair of matrices $D^0$ and $X^0$ (planted solution) as $Y = D^0 X^0$, one can perfectly learn these as a unique solution except for the ambiguities of signs and permutations of matrix elements if $P > P_c = (k+1)\,{}_N C_k$ and $k$ is sufficiently small, where $k$ is the number of non-zero elements in each column of $X^0$ [16]. This result is significant as it is the first proof that guarantees the learnability with a finite size sample set for DL. However, the estimate of $P_c$ is supposed to enable a considerable improvement; the authors of [16] speculated that $P_c$ could be reduced substantially to $O(N^2)$ or even smaller, although providing a mathematical proof was technically difficult. The improvement of the estimate of $P_c$ is practically significant because it will lead to a considerable reduction of the necessary cost for DL in terms of both sample and computational complexities.

Fig. 1: (color online) (a) and (b) show distributions of the local field $h$: (a) $P(h|X^0 \neq 0)$ for $X^0 \neq 0$ and (b) $P(h|X^0 = 0)$ for $X^0 = 0$. (c) and (d) show $X^*$ and $\partial X^*/\partial h$ as functions of $h$, respectively.

In this Letter, we take an alternative approach to estimating $P_c$. Specifically, we examine the typical behavior of DL using the replica method in the limit of $N, M, P \to \infty$. The main result of our analysis is that the planted solution is typically learnable by $O(N)$ training samples if negligible mean square errors per element are allowed and $M/N$ is sufficiently large. This theoretically supports and/or encourages the employment of DL in practical applications.
Problem setting. – We focus on the learning strategy

$$
\min_{D,X} \left\| Y \left(= N^{-1/2} D^0 X^0\right) - N^{-1/2} D X \right\|^2 \quad \text{subj. to} \quad \|D\|^2 = MN, \ \ \|X\|_0 = NP\theta \tag{1}
$$

[11,12,16–19], where $\|A\|^2 = \sum_{\mu l} A_{\mu l}^2$ for a matrix $A = \{A_{\mu l}\}$, and $\|A\|_0$ represents the number of non-zero elements of $A$. The parameter $\theta \in [0,1]$ denotes the rate of non-zero elements assumed by the learner, and $N^{-1/2}$ is introduced for convenience in taking the large system limit. For simplicity, we assume that $D^0$ and $X^0$ of the planted solution are uniformly generated under the constraints of $\|D^0\|^2 = MN$, $\|X^0\|_0 = NP\rho$ and $\|X^0\|^2 = NP\rho$. We consider that the correct non-zero density $\rho$ can differ from $\theta$ for generality, but we assume $\rho \le \theta$; otherwise, the correct identification of $D^0$ and $X^0$ is trivially impossible. The main goal of our study is to evaluate the critical sample ratio $\gamma_c = P_c/N$ above which the planted solution can be learned typically.

Statistical mechanics approach. – The partition function

$$
Z_\beta(D^0,X^0) = \int dD\, dX \exp\left( -\frac{\beta}{2N} \left\| DX - D^0 X^0 \right\|^2 \right) \delta\left(\|D\|^2 - NM\right) \delta\left(\|X\|_0 - NP\theta\right) \tag{2}
$$

constitutes the basis of our approach since the minimized cost of eq. (1) can be identified with the zero temperature free energy $F = -\lim_{\beta\to\infty} \beta^{-1} \ln Z_\beta(D^0,X^0)$. This statistically fluctuates depending on $D^0$ and $X^0$. However, as $N, M, P \to \infty$, one can expect that the self-averaging property is realized; i.e., the free energy density $N^{-2}F$ converges to the typical value $f \equiv N^{-2}[F]_0$ with probability unity, where $[\cdots]_0$ stands for the average with respect to $D^0$ and $X^0$. Consequently, this property is also expected to hold for other relevant macroscopic variables of the solution of eq. (1), $D^*$ and $X^*$. Therefore, assessing $f$ is the central issue in our analysis.

This assessment can be carried out systematically using the replica method [20] in the limit of $N \to \infty$ while keeping $\alpha = M/N \sim O(1)$ and $\gamma = P/N \sim O(1)$. Under the replica symmetric (RS) ansatz, where the solution space of eq. (1) is assumed to be composed of at most a few pure states [21], the free energy density is given as

$$
f = \mathop{\mathrm{extr}}_{\Omega,\hat\Omega} \left\{ -\alpha \left( \frac{\hat Q_D - \hat\chi_D \chi_D}{2} - \hat m_D m_D + \frac{\hat\chi_D + \hat m_D^2}{2 \hat Q_D} \right) - \gamma \left( \frac{\hat Q_X Q_X - \hat\chi_X \chi_X}{2} - \hat m_X m_X + \lambda\theta - \langle\langle \phi(h; \hat Q_X, \lambda) \rangle\rangle_h \right) + \frac{\alpha\gamma \left( Q_X - 2 m_D m_X + \rho \right)}{2 \left( 1 + Q_X \chi_D + \chi_X \right)} \right\}, \tag{3}
$$

where $\mathrm{extr}_{\Omega,\hat\Omega}\{G(\Omega,\hat\Omega)\}$ stands for the extremization of a function $G(\Omega,\hat\Omega)$ with respect to a set of macroscopic variables $\Omega \equiv \{\chi_D, m_D, Q_X, \chi_X, m_X\}$ and that of their conjugates $\hat\Omega \equiv \{\hat Q_D, \hat\chi_D, \hat m_D, \hat Q_X, \hat\chi_X, \hat m_X, \lambda\}$, and

$$
\phi(h; \hat Q_X, \lambda) = \min_X \lim_{\epsilon \to +0} \left\{ \frac{\hat Q_X}{2} X^2 - hX + \lambda |X|^\epsilon \right\}. \tag{4}
$$

Notation $\langle\langle \cdots \rangle\rangle_h$ represents the average with respect to $h$ according to the distribution $P(h) = \rho P(h|X^0 \neq 0) + (1-\rho) P(h|X^0 = 0)$, where $P(h|X^0 \neq 0)$ and $P(h|X^0 = 0)$ are given by zero-mean Gaussian distributions with variances $\hat\chi_X + \hat m_X^2$ and $\hat\chi_X$, respectively (Fig. 1 (a),(b)). The details of the derivation of the free energy density are shown in the Appendix.

Physical implications. – At the extremum of eq. (3), the relationships

$$
m_D = \frac{1}{MN} \left[ \mathrm{Tr}\, (D^0)^{\rm T} D^* \right]_0, \tag{5}
$$

$$
m_X = \frac{1}{NP} \left[ \mathrm{Tr}\, (X^0)^{\rm T} X^* \right]_0, \tag{6}
$$

$$
Q_X = \frac{1}{NP} \left[ \mathrm{Tr}\, (X^*)^{\rm T} X^* \right]_0 = \frac{1}{NP} \left[ \|X^*\|^2 \right]_0 \tag{7}
$$

hold, where ${\rm T}$ denotes the matrix transpose. These provide the mean square errors (per element), which measure the performance of DL, as

$$
\epsilon_D \equiv \frac{1}{MN} \left[ \|D^* - D^0\|^2 \right]_0 = 2(1 - m_D), \tag{8}
$$

$$
\epsilon_X \equiv \frac{1}{NP} \left[ \|X^* - X^0\|^2 \right]_0 = \rho - 2 m_X + Q_X. \tag{9}
$$

The variables $\chi_D$ and $\chi_X$ physically mean the sensitivity of the estimates $D^*$ and $X^*$ when the cost of eq. (1) is linearly perturbed.

Eq. (4) represents the effective single-body minimization problem concerning an element of $X$ that is statistically equivalent to eq. (1) [22]. Here, the randomness of $D^0$ and $X^0$ is effectively replaced by the random local field $h$. The first and second terms of $P(h)$ correspond to the cases where an element of $X^0$ is given as $X^0 \neq 0$ and $X^0 = 0$, respectively. Under a given $h$, the solution $X^*$ that minimizes the cost of eq. (4) is offered as $X^* = h/\hat Q_X$ for $|h| > h_{\rm th} \equiv (2 \hat Q_X \lambda)^{1/2}$ and $0$ otherwise (Fig. 1 (c)). We refer to the cases of $|h| > h_{\rm th}$ and $|h| < h_{\rm th}$ as “active” and “inactive,” respectively. When $X^0 \neq 0$, $h$ is generated from a Gaussian distribution $P(h|X^0 \neq 0)$ of zero mean and variance $\hat\chi_X + \hat m_X^2$, and $X^*$ is more likely to be active than when $X^0 = 0$, for which $h$ is characterized by another zero-mean Gaussian $P(h|X^0 = 0)$ of a smaller variance $\hat\chi_X$ (Fig. 1 (a),(b)). Therefore, one can expect that the hard-thresholding scheme based on $h_{\rm th}$ represents proper assignment of zero/non-zero elements in $X^*$ so as to accurately estimate $X^0$ and $D^0$ if $\hat m_X$ is sufficiently large.

A distinctive feature of $X^*$ is the divergence of the local susceptibility $\partial X^*/\partial h$ at “border” cases of $h = \pm h_{\rm th}$ (Fig. 1 (d)). This affects the increase in the effective degree of freedom (ratio) as follows: $\theta_{\rm eff} = \theta + \langle\langle h_{\rm th}\, \delta(|h| - h_{\rm th}) \rangle\rangle_h$, whereas $h_{\rm th}$ is determined so as to satisfy $\theta = \int_{|h| > h_{\rm th}} dh\, P(h)$, indicating the sparsity condition $\|X\|_0 = NP\theta$. The excess $\langle\langle h_{\rm th}\, \delta(|h| - h_{\rm th}) \rangle\rangle_h$ is supposed to represent a combinatorial complexity for classifying each element of $X^*$ that corresponds to the border case $|h| = h_{\rm th}$ into the active case, $|h| > h_{\rm th}$ and $X^* \neq 0$, or the inactive case, $|h| < h_{\rm th}$ and $X^* = 0$. The divergence of $\partial X^*/\partial h|_{h = \pm h_{\rm th}}$ is also accompanied by the instability of the RS solution against perturbations that break the replica symmetry [23]. The influence of this instability is discussed later.

Actual solutions. – We found two types of solutions; the first one is characterized by $m_D = 1$ and $Q_X = m_X = \rho$, while the second is characterized by $m_D = 0$ and $m_X = 0$. The former case provides $\epsilon_D = \epsilon_X = 0$, indicating the correct identification of $D^0$ and $X^0$, and hence we call it the success solution. The latter is referred to as the failure solution since $m_D = 0$ and $m_X = 0$ indicate the complete failure of information extraction of $D^0$ and $X^0$.

Success solution (S) exists when $\gamma > 1$ and

$$
\alpha > \theta^{\rm S}_{\rm eff}(\theta,\rho) = \theta + (1-\rho) \sqrt{\frac{2}{\pi}}\, u \exp\left( -\frac{u^2}{2} \right) \tag{10}
$$

hold, where $u = H^{-1}\left( (\theta - \rho)/(2(1-\rho)) \right)$ and $H^{-1}(x)$ is the inverse function of $H(x) = (2\pi)^{-1/2} \int_x^\infty dt\, e^{-t^2/2}$. S is further classified into two cases depending on $\gamma$. For $\gamma > \gamma_{\rm S}$, where

$$
\gamma_{\rm S}(\alpha,\theta,\rho) = \frac{\alpha}{\alpha - \theta^{\rm S}_{\rm eff}}, \tag{11}
$$

$\chi_D$ and $\chi_X$ are finite. On the other hand, for $1 < \gamma < \gamma_{\rm S}$, $\chi_D$ and $\chi_X$ tend to infinity, keeping $R = \chi_D/\chi_X$ finite.

To physically interpret this classification, let us take a variation around $Y = N^{-1/2} D^0 X^0$, which yields

$$
0 = \delta(DX)\big|_{D^0,X^0} = D^0 \delta X + \delta D\, X^0. \tag{12}
$$

If $\delta D = 0$ and $\delta X = 0$ are the unique solution of eq. (12), the planted solution is locally stable. Otherwise, there are “marginal” modes along which the cost of eq. (1) does not increase locally, and the solution set forms a manifold. The number of constraints of eq. (12), $MP$, coincides with that of the degree of freedom of $\delta D$ and $\delta X$, $MN + NP\theta^{\rm S}_{\rm eff}$, at $P = \gamma_{\rm S} N$. Thus, the classification below/above $\gamma_{\rm S}$ corresponds to the change in the number of marginal modes around the planted solution.

To confirm the validity of this interpretation, we numerically evaluated the number of marginal modes of eq. (12) in the case of $\alpha = 1/2$ and $\theta = \rho = 0.1$, which is shown in Fig. 2. The assessment of $\gamma_{\rm S}$ when $\theta = \rho$ is conjectured to be exact since the effect of the border elements is negligible under this condition. Fig. 2 indicates that the number of marginal modes scales as $O(N^2)$ for $\gamma < \gamma_{\rm S} = 1.25$, while it scales as $O(N)$, and the contribution of the marginal modes approaches zero, for $\gamma > \gamma_{\rm S}$ (inset). This result coincides with our theoretical assessment. At the same time, this implies that identifying the planted solution without any errors by eq. (1) is difficult as long as $\gamma \sim O(1)$, but the discrepancies per element caused by the marginal modes are negligibly small and could be allowed in many practical situations.

Fig. 2: (color online) $\gamma$-dependence of the ratio of marginal modes relative to $N^2$ at $\alpha = 0.5$ and $\rho = \theta = 0.1$, for $(N, M) = (30, 15)$, $(40, 20)$, $(50, 25)$ and $(60, 30)$. The behavior at $N \to \infty$ extrapolated from the results of finite $N$ is denoted by the dashed line. Inset: $N$-dependence of the ratio of marginal modes for $\gamma = 2$. The dashed line stands for $N^{-1}$ as a guide. Each marker represents the average of 100 experiments.

In the case of $\gamma < 1$, for any $N \times P$ matrix $Z$ of $\|Z\|_0 = NP\theta$ with full column rank, $X^* = aZ$ and $D^* = a^{-1} N^{1/2}\, Y (Z^{\rm T} Z)^{-1} Z^{\rm T}$ reduce the cost of eq. (1) to zero, where $a$ is determined such that $\|D^*\|^2 = MN$. This implies that the set of solutions of eq. (1) spreads widely, and the weight of the planted solution is negligibly small in the state space. This may be why S disappears for $\gamma < 1$.

Failure solution (F) exists for $\forall \gamma \ge 0$. If

$$
\alpha < \theta^{\rm F}_{\rm eff}(\theta) = \theta + \sqrt{\frac{2}{\pi}}\, v \exp\left( -\frac{v^2}{2} \right), \tag{13}
$$

where $v = H^{-1}(\theta/2)$, holds, F always offers $\chi_D, \chi_X \to \infty$, making the free energy $f$ vanish. For $\alpha > \theta^{\rm F}_{\rm eff}$, on the other hand, $\chi_D$ and $\chi_X$ become finite, implying that a single solution of eq. (1) is locally stable for most directions and offers $f > 0$, if $\gamma$ is greater than

$$
\gamma_{\rm F}(\alpha,\theta) = \frac{\alpha}{\left( \alpha^{1/2} - (\theta^{\rm F}_{\rm eff})^{1/2} \right)^2}. \tag{14}
$$

The inequality $\theta^{\rm F}_{\rm eff} \ge \theta^{\rm S}_{\rm eff}$ always holds because the influence of the border elements for F is stronger than that for S, which leads to $\gamma_{\rm S} \le \gamma_{\rm F}$.

Fig. 3: (color online) Schematic pictures of the $\gamma$-dependence of the phase space and free energy under the RS assumption.

Fig. 3 illustrates changes in state space that occur for sufficiently large $\alpha$ under the RS assumption. For $\gamma < 1$, F is a unique solution. As $\gamma$ increases, S appears at $\gamma = 1$, and the number of marginal modes changes from $O(N^2)$ to $O(N)$ at $\gamma = \gamma_{\rm S}$. This implies that when negligibly small linear perturbations are added to the cost of eq. (1), the limits $\lim_{N\to\infty} \epsilon_D \sim 0$ and $\lim_{N\to\infty} \epsilon_X \sim 0$ still hold for S of $\gamma > \gamma_{\rm S}$, while they can be boosted to $O(1)$ for S of $\gamma < \gamma_{\rm S}$. For $\gamma < \gamma_{\rm F}$ (with $\gamma_{\rm S} \le \gamma_{\rm F}$), S and F are degenerate, providing $f = 0$. However, at $\gamma = \gamma_{\rm F}$, S becomes thermodynamically dominant by keeping $f = 0$, while F begins to have positive $f$. This means that the planted solution is typically learnable by $P > P_c = N\gamma_{\rm F} \sim O(N)$ training samples if negligible mean square errors per element are allowed.

Fig. 4: (color online) Phase diagram on the $\alpha$–$\theta$ plane.

Fig. 4 plots the phase diagram on the $\alpha$–$\theta$ plane. The region above $\alpha = \theta^{\rm F}_{\rm eff}(\theta)$ (curve) represents the condition under which the planted solution is typically learnable by $O(N)$ training samples. DL is impossible in the region below $\alpha = \theta$ (straight line) because $X^0$ cannot be correctly recovered even if $D^0$ is known [7]. How the sample complexity scales with respect to $N$ in the region $\theta < \alpha < \theta^{\rm F}_{\rm eff}(\theta)$ is beyond the scope of this Letter, but an interesting question nonetheless.

Summary and discussion. – In summary, we have assessed the size of training samples required for correctly learning a planted solution in DL using the replica method. Our analysis indicated that $O(N)$ samples, which are much fewer than estimated in an earlier study [16], are sufficient for learning a planted dictionary with allowance for negligible square discrepancies per element when the number of non-zero signals is sufficiently small compared to that of measurements.

It was shown that the identification of the dictionary can be characterized as a phase transition with respect to the number of training samples. Our RS analysis probably does not describe the exact behavior of DL since the RS solutions are unstable against the replica symmetry breaking (RSB) disturbances. However, we still speculate that the RS estimate of $\gamma_{\rm F}$ serves as an upper bound of the correct critical ratio $\gamma_c$. This is because the free energy value of F assessed under the RSB ansatz should be greater than or equal to that of the RS solution due to the positivity constraint of the entropy of pure states (complexity) [24], whereas that of S is kept to vanish, which always yields a smaller estimate of $\gamma_{\rm F}$.

Promising future research includes an extension of the current framework to more general situations such as noisy cases as well as refinement of the estimates of the critical ratios $\gamma_{\rm S}$ and $\gamma_{\rm F}$ taking RSB into account [25].

∗∗∗

This work was partially supported by a Grant-in-Aid for JSPS Fellow (No. 23–4665) (AS) and KAKENHI No. 22300003 (YK).

Appendix: Derivation of eq. (3). – In general, the configurational average of the free energy density could be evaluated on the basis of the following formula:

$$
f = -\lim_{\beta\to\infty} \frac{1}{\beta} \lim_{N\to\infty} \frac{1}{N^2} \lim_{n\to 0} \frac{\partial}{\partial n} \ln \left[ Z^n_\beta(D^0,X^0) \right]_0. \tag{15}
$$

Unfortunately, assessing $[Z^n_\beta(D^0,X^0)]_0$ for $n \in \mathbb{R}$ in a mathematically rigorous manner is technically difficult, and this fact prohibits us from utilizing eq. (15) in practice. In the replica method, this difficulty is resolved by evaluating $N^{-2} \ln [Z^n_\beta(D^0,X^0)]_0$ for $n \in \mathbb{N}$ as an analytic function of $n$ first in the limit of $N \to \infty$, and taking the $n \to 0$ limit afterward with use of the obtained analytic function for $n \in \mathbb{R}$ as well.

More precisely, we evaluate $[Z^n_\beta(D^0,X^0)]_0$ by averaging the right hand side of an identity

$$
Z^n_\beta(D^0,X^0) = \int \prod_{a=1}^n \left\{ dD^a dX^a\, \delta\left( \|D^a\|^2 - NM \right) \delta\left( \|X^a\|_0 - NP\theta \right) \right\} \exp\left( -\frac{\beta}{2N} \sum_{a=1}^n \left\| D^a X^a - D^0 X^0 \right\|^2 \right), \tag{16}
$$

which is valid for only $n \in \mathbb{N}$, over the distributions of the planted solutions $D^0$ and $X^0$ that are given by

$$
P_{D^0}(D^0) = \frac{1}{\mathcal{N}_D}\, \delta\left( \|D^0\|^2 - NM \right) \tag{17}
$$

and

$$
P_{X^0}(X^0) = \prod_{i,l} \left\{ (1-\rho)\, \delta(X^0_{il}) + \frac{\rho}{\sqrt{2\pi}} \exp\left( -\frac{(X^0_{il})^2}{2} \right) \right\}, \tag{18}
$$

respectively, where $\mathcal{N}_D$ is the normalization constant. In performing the integrals of $2(n+1)$ variables $(D^0, \{D^a\})$ and $(X^0, \{X^a\})$ that come out in the evaluation, we insert trivial identities with respect to all combinations of replicas ($a, b = 0, 1, 2, \ldots, n$),

$$
1 = NM \int dq_D^{ab}\, \delta\left( \mathrm{Tr}\, (D^a)^{\rm T} D^b - NM q_D^{ab} \right) \tag{19}
$$

and

$$
1 = NP \int dq_X^{ab}\, \delta\left( \mathrm{Tr}\, (X^a)^{\rm T} X^b - NP q_X^{ab} \right), \tag{20}
$$

to the integrand. Let us denote $\mathcal{Q}_D \equiv (q_D^{ab})$ and $\mathcal{Q}_X \equiv (q_X^{ab})$, and introduce two joint distributions

$$
P_D(\{D^a\}; \mathcal{Q}_D) = \frac{P_{D^0}(D^0)}{V_D(\mathcal{Q}_D)} \prod_{a=1}^n \delta\left( \|D^a\|^2 - NM \right) \prod_{a<b} \delta\left( \mathrm{Tr}\, (D^a)^{\rm T} D^b - NM q_D^{ab} \right), \tag{21}
$$

$$
P_X(\{X^a\}; \mathcal{Q}_X) = \frac{P_{X^0}(X^0)\, \delta\left( \|X^0\|_0 - NP\rho \right)}{V_X(\mathcal{Q}_X)} \prod_{a=1}^n \delta\left( \|X^a\|_0 - NP\theta \right) \prod_{a \le b} \delta\left( \mathrm{Tr}\, (X^a)^{\rm T} X^b - NP q_X^{ab} \right), \tag{22}
$$

where $V_D(\mathcal{Q}_D)$ and $V_X(\mathcal{Q}_X)$ are the normalization constants. The above-mentioned computation provides the following expression:

$$
\left[ Z^n_\beta(D^0,X^0) \right]_0 = \int d(NM \mathcal{Q}_D)\, d(NP \mathcal{Q}_X)\, V_D(\mathcal{Q}_D) V_X(\mathcal{Q}_X) \left[ \prod_{a=1}^n \exp\left( -\frac{\beta}{2} \sum_{\mu,l} (t^a_{\mu l})^2 \right) \right]_{\mathcal{Q}_D,\mathcal{Q}_X},
$$

where $t^a_{\mu l} = N^{-1/2} \sum_{i=1}^N \left( D^a_{\mu i} X^a_{il} - D^0_{\mu i} X^0_{il} \right)$. Notation $[\cdots]_{\mathcal{Q}_D,\mathcal{Q}_X}$ represents the average with respect to $\{D^a\}$ and $\{X^a\}$ over the state space specified by $\mathcal{Q}_D$ and $\mathcal{Q}_X$, whose distributions are given by eqs. (21) and (22).

Distributions (21) and (22) are independent of each other, and provide each entry of $\{D^a\}$ and $\{X^a\}$ with zero mean and a finite variance. This allows us to utilize the central limit theorem, indicating that we can handle $t^a_{\mu l}$ as multivariate Gaussian random variables that follow

$$
P_t(\{t^a_{\mu l}\}) = \prod_{\mu l} \frac{1}{\sqrt{(2\pi)^n \det \mathcal{T}}} \exp\left( -\frac{1}{2} \sum_{a,b} t^a_{\mu l} (\mathcal{T}^{-1})^{ab} t^b_{\mu l} \right), \tag{23}
$$

where $\mathcal{T}$ stands for an $n \times n$ matrix whose entries are given as $\mathcal{T}^{ab} = q_D^{ab} q_X^{ab} - (q_D^{a0} q_X^{a0} + q_D^{b0} q_X^{b0}) + \rho$. Utilizing this and evaluating integrals of $\mathcal{Q}_D$ and $\mathcal{Q}_X$ by means of the saddle point method lead to an expression

$$
\lim_{N\to\infty} \frac{1}{N^2} \ln \left[ Z^n_\beta(D^0,X^0) \right]_0 = \mathrm{extr} \left[ -\frac{\alpha\gamma}{2} \ln\det\left( \mathcal{I}_n + \beta\mathcal{T} \right) + \gamma \left\{ \frac{\mathrm{Tr}\, \hat{\mathcal{Q}}_X \mathcal{Q}_X}{2} + \ln\left( \int \prod_{a=0}^n dX^a\, P_{X^0}(X^0)\, e^{-\Xi} \right) \right\} + \alpha \left\{ \frac{\mathrm{Tr}\, \hat{\mathcal{Q}}_D \mathcal{Q}_D}{2} - \frac{1}{2} \ln\det \hat{\mathcal{Q}}_D \right\} + n\gamma\lambda\theta + \frac{n\alpha}{2} \ln(2\pi) \right]. \tag{24}
$$

Here, $\mathcal{I}_n$ represents the $n \times n$ identity matrix, auxiliary variables $\hat{\mathcal{Q}}_D \equiv (\hat q_D^{ab})$ and $\hat{\mathcal{Q}}_X \equiv (\hat q_X^{ab})$ are introduced in evaluating $V_D(\mathcal{Q}_D)$ and $V_X(\mathcal{Q}_X)$ with use of the saddle point method, and $\Xi \equiv \frac{1}{2} \sum_{a,b=0}^n \hat q_X^{ab} X^a X^b + \lambda \sum_{a=1}^n \lim_{\epsilon\to+0} |X^a|^\epsilon$. Extremization should be taken with respect to $\lambda$ and four kinds of macroscopic variables $\mathcal{Q}_D$, $\mathcal{Q}_X$, $\hat{\mathcal{Q}}_D$, and $\hat{\mathcal{Q}}_X$.

Exactly evaluating eq. (24) should provide the correct leading order estimate of $N^{-2} \ln [Z^n_\beta(D^0,X^0)]_0$ for each $n \in \mathbb{N}$. However, we here restrict the candidate of the dominant saddle point to that of the replica symmetric form as

$$
q_D^{ab} = \begin{cases} 1, & a = b \\ q_D, & a \neq b \ (a, b \neq 0) \\ m_D, & a = 0,\ b \neq 0 \end{cases} \tag{25}
$$

$$
q_X^{ab} = \begin{cases} Q_X, & a = b \\ q_X, & a \neq b \ (a, b \neq 0) \\ m_X, & a = 0,\ b \neq 0 \end{cases} \tag{26}
$$

$$
\hat q_D^{ab} = \begin{cases} \hat Q_D, & a = b \\ -\hat q_D, & a \neq b \ (a, b \neq 0) \\ -\hat m_D, & a = 0,\ b \neq 0 \end{cases} \tag{27}
$$

$$
\hat q_X^{ab} = \begin{cases} \hat Q_X, & a = b \\ -\hat q_X, & a \neq b \ (a, b \neq 0) \\ -\hat m_X, & a = 0,\ b \neq 0 \end{cases} \tag{28}
$$

so as to obtain an analytic expression with respect to $n$. This yields

$$
\ln\det\left( \mathcal{I}_n + \beta\mathcal{T} \right) = n \ln\left( 1 + \beta (Q_X - q_D q_X) \right) + \ln\left( 1 + n\, \frac{\beta \left( q_D q_X - 2 m_D m_X + \rho \right)}{1 + \beta (Q_X - q_D q_X)} \right), \tag{29}
$$

$$
\frac{\mathrm{Tr}\, \hat{\mathcal{Q}}_X \mathcal{Q}_X}{2} + \ln\left( \int \prod_{a=0}^n dX^a\, P_{X^0}(X^0)\, e^{-\Xi} \right) = \frac{n}{2} \hat Q_X Q_X - n \hat m_X m_X - \frac{n(n-1)}{2} \hat q_X q_X + \ln \left\langle\left\langle \left( \int dX\, e^{-\xi} \right)^n \right\rangle\right\rangle_h, \tag{30}
$$

and

$$
\frac{\mathrm{Tr}\, \hat{\mathcal{Q}}_D \mathcal{Q}_D}{2} - \frac{1}{2} \ln\det \hat{\mathcal{Q}}_D = \frac{n}{2} \hat Q_D - n \hat m_D m_D - \frac{n(n-1)}{2} \hat q_D q_D - \frac{n}{2} \ln\left( \hat Q_D + \hat q_D \right) - \frac{1}{2} \ln\left( 1 - n\, \frac{\hat q_D + \hat m_D^2}{\hat Q_D + \hat q_D} \right), \tag{31}
$$

where $\xi \equiv (\hat Q_X + \hat q_X) X^2/2 - hX + \lambda \lim_{\epsilon\to+0} |X|^\epsilon$. Further, the following replacement of variables is convenient in handling our computation in the limit of $\beta \to \infty$: $\hat Q_D + \hat q_D \to \beta \hat Q_D$, $\hat q_D \to \beta^2 \hat\chi_D$, $1 - q_D \to \chi_D/\beta$, $\hat Q_X + \hat q_X \to \beta \hat Q_X$, $\hat q_X \to \beta^2 \hat\chi_X$, $Q_X - q_X \to \chi_X/\beta$, and $\lambda \to \beta\lambda$. In $\beta \to \infty$, the integral with respect to $X$ in eq. (30), $\int dX\, e^{-\xi}$, is replaced by $e^{-\beta\phi(h; \hat Q_X, \lambda)}$ by applying the saddle point method. Inserting eqs. (29)–(31) and the rescaled variables into eq. (24) offers the expression of the zero temperature free energy density (3).

REFERENCES

[1] Starck J.-L., Murtagh F. and Fadili J. M., Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity (Cambridge Univ. Press) 2010.
[2] Nyquist H., Trans. AIEE, 47 (1928) 617.
[3] Malviya S., Voepel-Lewis T., Eldevik O. P., Rockwell D. T., Wong J. H. and Tait A. R., British Journal of Anaesthesia, 84 (2000) 743.
[4] Lian L. Y. and Robert G. (Eds.), Protein NMR Spectroscopy: Principal Techniques and Applications (John Wiley & Sons Ltd.) 2011.
[5] Donoho D., IEEE Trans. Inform. Theory, 52 (2006) 1289.
[6] Candes E. J. and Tao T., IEEE Trans. Inform. Theory, 51 (2005) 4203.
[7] Kabashima Y., Wadayama T. and Tanaka T., J. Stat. Mech., (2009) L09003.
[8] Donoho D. L., Maleki A. and Montanari A., PNAS, 106 (2009) 18914.
[9] Ganguli S. and Sompolinsky H., Phys. Rev. Lett., 104 (2010) 188701.
[10] Krzakala F., Mézard M., Sausset F., Sun Y. F. and Zdeborová L., Phys. Rev. X, 2 (2012) 021005.
[11] Rubinstein R., Bruckstein A. M. and Elad M., Proc. of IEEE, 98 (2010) 1045.
[12] Elad M., Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing (Springer-Verlag) 2010.
[13] Gleichman S. and Eldar Y. C., IEEE Info. Theor., 57 (2011) 6958.
[14] Bryt O. and Elad M., J. Vis. Commu. Image Rep., 19 (2008) 270.
[15] Gong T., Xuan J., Chen L., Riggins R. B., Li H., Hoffman E. P., Clarke R. and Wang Y., BMC Bioinformatics 2011, 12 (2011) 82.
[16] Aharon M., Elad M. and Bruckstein A. M., Linear Algebra and its Applications, 416 (2006) 48.
[17] Olshausen B. A. and Field D. J., Vision Res., 37 (1997) 3311.
[18] Engan K., Aase S. O. and Hakon Husoy J., IEEE Acoustic, Speech and Signal Processing, (1999) 2443.
[19] Aharon M., Elad M. and Bruckstein A. M., IEEE Trans. Signal Processing, 2006 (54) 11.
[20] Dotzenko V., Introduction to the Replica Theory of Statistical Systems (Cambridge Univ. Press) 2001.
[21] Mézard M., Parisi G. and Virasoro M. A., Spin Glass Theory and Beyond (World Sci. Pub.) 1987.
[22] Guo D. and Verdú S., IEEE Trans. Inform. Theory, 51 (2005) 1983.
[23] de Almeida J. R. L. and Thouless D. J., J. Phys. A: Math. Gen., 11 (1978) 983.
[24] Mézard M. and Montanari A., Information, Physics, and Computation (Oxford Univ. Press) 2009.
[25] Sakata A. and Kabashima Y., unpublished.
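As a numerical companion to eqs. (10), (11), (13) and (14), the sketch below evaluates the effective densities $\theta^{\rm S}_{\rm eff}$, $\theta^{\rm F}_{\rm eff}$ and the ratios $\gamma_{\rm S}$, $\gamma_{\rm F}$ with the Python standard library only. All function names are illustrative assumptions; the inverse tail function $H^{-1}$ is obtained from `statistics.NormalDist.inv_cdf` through $H(x) = 1 - \Phi(x)$, and the limit $u \to \infty$ (reached when $\theta = \rho$) is treated explicitly, with $\rho < 1$ assumed.

```python
from math import exp, inf, pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def h_inverse(p: float) -> float:
    """Inverse of H(x) = P(t > x) for a standard Gaussian t; H^{-1}(0) = +inf."""
    if p <= 0.0:
        return inf
    # H(x) = 1 - Phi(x) = Phi(-x), hence H^{-1}(p) = -Phi^{-1}(p).
    return -_STD_NORMAL.inv_cdf(p)


def border_term(u: float) -> float:
    """sqrt(2/pi) * u * exp(-u^2 / 2); the u -> inf limit is 0."""
    if u == inf:
        return 0.0
    return sqrt(2.0 / pi) * u * exp(-0.5 * u * u)


def theta_eff_success(theta: float, rho: float) -> float:
    """Right-hand side of eq. (10); assumes rho <= theta and rho < 1."""
    u = h_inverse((theta - rho) / (2.0 * (1.0 - rho)))
    return theta + (1.0 - rho) * border_term(u)


def theta_eff_failure(theta: float) -> float:
    """Right-hand side of eq. (13)."""
    return theta + border_term(h_inverse(theta / 2.0))


def gamma_success(alpha: float, theta: float, rho: float) -> float:
    """Eq. (11); only defined when alpha exceeds theta_eff^S."""
    t = theta_eff_success(theta, rho)
    if alpha <= t:
        raise ValueError("eq. (11) requires alpha > theta_eff^S")
    return alpha / (alpha - t)


def gamma_failure(alpha: float, theta: float) -> float:
    """Eq. (14); only defined when alpha exceeds theta_eff^F."""
    t = theta_eff_failure(theta)
    if alpha <= t:
        raise ValueError("eq. (14) requires alpha > theta_eff^F")
    return alpha / (sqrt(alpha) - sqrt(t)) ** 2


if __name__ == "__main__":
    alpha, theta, rho = 0.5, 0.1, 0.1  # the setting used for Fig. 2
    print("gamma_S =", gamma_success(alpha, theta, rho))
    print("gamma_F =", gamma_failure(alpha, theta))
```

For $\alpha = 0.5$ and $\theta = \rho = 0.1$ the border term of eq. (10) vanishes, so `gamma_success` reduces to $\alpha/(\alpha - \theta) = 1.25$, the value quoted for Fig. 2.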

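The single-body problem of eq. (4) and its hard-thresholding solution can also be checked directly. The sketch below implements $X^*(h)$ and $\phi(h; \hat Q_X, \lambda)$ in closed form, using the fact that the $\epsilon \to +0$ limit turns $\lambda |X|^\epsilon$ into a cost $\lambda$ for any non-zero $X$ and $0$ for $X = 0$, and compares $\phi$ with a brute-force grid minimization. Function names, the grid parameters and the sample values of $\hat Q_X$ and $\lambda$ are illustrative assumptions.

```python
from math import sqrt


def threshold(q_hat: float, lam: float) -> float:
    """h_th = (2 * Q_hat_X * lambda)^(1/2)."""
    return sqrt(2.0 * q_hat * lam)


def x_star(h: float, q_hat: float, lam: float) -> float:
    """Minimizer of eq. (4): h / Q_hat_X when active, 0 when inactive."""
    return h / q_hat if abs(h) > threshold(q_hat, lam) else 0.0


def phi(h: float, q_hat: float, lam: float) -> float:
    """Minimum of eq. (4): 0 at X = 0 versus lambda - h^2 / (2 Q_hat_X) at X = h / Q_hat_X."""
    return min(0.0, lam - h * h / (2.0 * q_hat))


def phi_brute_force(h: float, q_hat: float, lam: float,
                    half_width: float = 10.0, steps: int = 200001) -> float:
    """Grid search over X; any non-zero X pays the constant penalty lambda."""
    best = 0.0  # cost at X = 0
    for i in range(steps):
        x = -half_width + 2.0 * half_width * i / (steps - 1)
        if x != 0.0:
            best = min(best, 0.5 * q_hat * x * x - h * x + lam)
    return best


if __name__ == "__main__":
    q_hat, lam = 2.0, 1.0  # hypothetical values, giving h_th = 2
    for h in (0.5, 1.9, 2.1, 3.0):
        print(h, x_star(h, q_hat, lam), phi(h, q_hat, lam),
              phi_brute_force(h, q_hat, lam))
```

The jump of `x_star` from $0$ to $h_{\rm th}/\hat Q_X$ at $|h| = h_{\rm th}$ is the discontinuity responsible for the diverging local susceptibility shown in Fig. 1 (d).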