MULTI-PROCESSOR APPROXIMATE MESSAGE PASSING USING LOSSY COMPRESSION

Puxiao Han^v, Junan Zhu^n, Ruixin Niu^v, and Dror Baron^n

v: Virginia Commonwealth University, Dept. Electrical and Computer Engineering, Richmond, VA 23284, U.S.A. Email: {hanp, rniu}@vcu.edu
n: North Carolina State University, Dept. Electrical and Computer Engineering, Raleigh, NC 27695, U.S.A. Email: {jzhu9, barondror}@ncsu.edu

arXiv:1601.04595v1 [cs.DC] 18 Jan 2016

ABSTRACT

In this paper, a communication-efficient multi-processor compressed sensing framework based on the approximate message passing algorithm is proposed. We perform lossy compression on the data being communicated between processors, resulting in a reduction in communication costs with a minor degradation in recovery quality. In the proposed framework, a new state evolution formulation takes the quantization error into account, and analytically determines the coding rate required in each iteration. Two approaches for allocating the coding rate, an online back-tracking heuristic and an optimal allocation scheme based on dynamic programming, provide significant reductions in communication costs.

Index Terms: lossy compression, multi-processor approximate message passing, rate distortion function.

1. INTRODUCTION

Compressed sensing (CS) [1,2] has numerous applications in various areas of signal processing. Due to the curse of dimensionality, it can be demanding to perform CS on a single processor. Furthermore, clusters comprised of many processors have the potential to accelerate computation. Hence, multi-processor CS (MP-CS) has become of recent interest [3–5].

We consider MP-CS systems comprised of two parts: (i) local computation (LC) is performed at each processor, and (ii) global computation (GC) obtains an estimate of the unknown signal after processors exchange the results of LC. In our previous work [6], we developed an MP-CS framework based on the approximate message passing (AMP) algorithm [7], in which a GC approach performs AMP in an MP system, providing the same recovery result as centralized AMP. We chose AMP because it is analytically tractable due to the state evolution (SE) [8,9] formalism, and can be extended to Bayesian CS [9,10], matrix completion [11], and non-negative principal component analysis (PCA) [12].

Compared with the many results on distributed computation, optimization, and network topology [3,13] in MP-CS, only a modest subset of the literature considers the communication costs of the GC step [4,5,14,15]. In this paper, we still consider multi-processor AMP (MP-AMP), and focus more on the communication costs. In contrast to our prior work [6], we are willing to accept a minor decrease in recovery quality while providing significant and often dramatic communication savings. Such results are especially well-suited to clusters where communication between servers is costly. To achieve this reduction, we apply lossy compression to the data exchanged between processors, and provide a modified SE formulation that accounts for quantization error. Two approaches for allocating the coding rate, an online back-tracking heuristic and an optimal allocation scheme based on dynamic programming, provide significant reductions in communication costs. Furthermore, we consider Bayesian AMP, which achieves better recovery accuracy than non-Bayesian AMP [7] by assuming that the unknown signal follows a known prior distribution.

In the following, bold capital and bold lower-case letters are used to denote matrices and vectors respectively, capital letters without bold typically refer to dimensionality or random variables, and $[\cdot]^T$ denotes vector or matrix transposition. The $\ell_2$ norm of a vector is denoted by $\|\cdot\|$, $\mathcal{N}(\mu,\sigma^2)$ is a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, and $\mathcal{U}[a,b]$ is a continuous uniform distribution within $[a,b]$.

2. THE CENTRALIZED AMP ALGORITHM

Approximate message passing (AMP) [7] is a statistical algorithm derived from the theory of probabilistic graphical models [16]. Given noisy measurements $\mathbf{y} = \mathbf{A}\mathbf{s}_0 + \mathbf{e}$ of the unknown signal $\mathbf{s}_0 \in \mathbb{R}^N$, where elements in $\mathbf{s}_0$ are independent and identically distributed (i.i.d.)
realizations of a scalar random variable $S_0 \sim p_{S_0}$, $\mathbf{A} \in \mathbb{R}^{M \times N}$ is the sensing matrix with i.i.d. $\mathcal{N}(0, 1/M)$ entries, and $\mathbf{e} \in \mathbb{R}^M$ is additive measurement noise with i.i.d. $\mathcal{N}(0, \sigma_e^2)$ entries, AMP iteratively recovers $\mathbf{s}_0$, starting from an initial estimate $\mathbf{x}_0 = \mathbf{0}$ and residual $\mathbf{z}_0 = \mathbf{y}$:

$$\mathbf{f}_t = \mathbf{x}_t + \mathbf{A}^T \mathbf{z}_t, \tag{1}$$
$$\mathbf{x}_{t+1} = \eta_t(\mathbf{f}_t), \tag{2}$$
$$\mathbf{z}_{t+1} = \mathbf{y} - \mathbf{A}\mathbf{x}_{t+1} + (N/M)\,\overline{\eta_t'(\mathbf{f}_t)}\,\mathbf{z}_t, \tag{3}$$

where $t$ is the iteration number, the bar above the vector in (3) denotes its empirical average, $\eta_t$ is known as the denoising function or denoiser, and $\eta_t'$ denotes its derivative.

According to Bayati and Montanari [9], as $N \to \infty$ and $M/N = \kappa > 0$, the elements of $\mathbf{f}_t$ in (1) follow i.i.d. $F_t = S_0 + \sigma_t Z$, where $Z \sim \mathcal{N}(0,1)$ and the sequence $\{\sigma_t^2\}$ satisfies

$$\sigma_{t+1}^2 = \sigma_e^2 + (1/\kappa)\,E\left[\left(\eta_t(S_0 + \sigma_t Z) - S_0\right)^2\right] = \sigma_e^2 + (1/\kappa)\,E\left[\|\mathbf{x}_{t+1} - \mathbf{s}_0\|^2\right]/N. \tag{4}$$

Note that $\sigma_0^2 = \sigma_e^2 + (1/\kappa)E[S_0^2]$; equation (4) is known as state evolution (SE), and the optimal denoiser for mean square error (MSE) is the conditional mean [9,17]:

$$\eta_t(F_t) = E[S_0 \mid S_0 + \sigma_t Z = F_t]. \tag{5}$$

In this paper, we assume that $S_0$ follows the Bernoulli Gaussian distribution:

$$p_{S_0}(s) = \epsilon\,\mathcal{N}(s; \mu_s, \sigma_s^2) + (1-\epsilon)\,\delta(s), \tag{6}$$

where $\delta(s)$ denotes the Dirac delta function, and $S_0$ typically has mean $\mu_s = 0$. The denoiser is easily derived using (5).

As a measure of the measurement noise level and recovery accuracy, we define the signal-to-noise-ratio (SNR) as

$$\mathrm{SNR} = 10\log_{10}\left(E[\|\mathbf{A}\mathbf{s}_0\|^2]/E[\|\mathbf{e}\|^2]\right) \approx 10\log_{10}\left(E[\|\mathbf{s}_0\|^2]/E[\|\mathbf{e}\|^2]\right) = 10\log_{10}\left(\rho/\sigma_e^2\right),$$

where $\rho = \epsilon/\kappa$, and the signal-to-distortion-ratio (SDR) at iteration $t$ as

$$\mathrm{SDR}(t) = 10\log_{10}\left(E[\|\mathbf{s}_0\|^2]/E[\|\mathbf{x}_t - \mathbf{s}_0\|^2]\right).$$

Using the SE equation in (4), we have

$$\mathrm{SDR}(t) = 10\log_{10}\left(\rho/(\sigma_t^2 - \sigma_e^2)\right).$$

Note that the Bernoulli Gaussian assumption in this paper is only for illustration, and our work is easily extended to other prior distributions $p_{S_0}$.

The work of P. Han and R. Niu was supported in part by the VCU Presidential Research Quest Fund. The work of J. Zhu and D. Baron was supported in part by the National Science Foundation under Grant CCF-1217749 and the U.S. Army Research Office under Contract W911NF-14-1-0314.

3. MULTI-PROCESSOR AMP FRAMEWORK

3.1. Communication in Multi-Processor AMP

Consider a system with $P$ processors and one fusion center. Each processor $p \in \{1, \cdots, P\}$ takes $M/P$ rows of $\mathbf{A}$, namely $\mathbf{A}^p$, and obtains $\mathbf{y}^p = \mathbf{A}^p\mathbf{s}_0 + \mathbf{e}^p$. The procedures in (1)–(3) can then be rewritten in a distributed manner:

Local Computation (LC) performed by each processor $p$:

$$\mathbf{z}_t^p = \mathbf{y}^p - \mathbf{A}^p\mathbf{x}_t + (1/\kappa)\,\overline{\eta_{t-1}'(\mathbf{f}_{t-1})}\,\mathbf{z}_{t-1}^p,$$
$$\mathbf{f}_t^p = \mathbf{x}_t/P + (\mathbf{A}^p)^T\mathbf{z}_t^p.$$

Global Computation (GC) performed by the fusion center:

$$\mathbf{f}_t = \sum_{p=1}^{P}\mathbf{f}_t^p, \quad \overline{\eta_t'(\mathbf{f}_t)}, \quad \text{and} \quad \mathbf{x}_{t+1} = \eta_t(\mathbf{f}_t).$$

It can be seen that in the GC step of MP-AMP, each processor $p$ sends $\mathbf{f}_t^p$ to the fusion center, and the fusion center sums them to obtain $\mathbf{f}_t$ and $\mathbf{x}_{t+1}$, and sends $\mathbf{x}_{t+1}$ to each processor.[Footnote 1] Our goal in this paper is to reduce these communication costs while barely impacting recovery performance.

[Footnote 1] In order to calculate each $\mathbf{z}_{t+1}^p$, the fusion center also needs to send $\overline{\eta_t'(\mathbf{f}_t)}$ to all the processors. This is a scalar, and the corresponding communication cost is negligible compared with that of transmitting a vector.

Suppose that all the elements in $\mathbf{f}_t^p$ are computed as 32-bit single-precision floating-point numbers. Because SE is robust to small perturbations [7,8], we can compress $\mathbf{f}_t^p$ lossily up to some reasonable distortion level, and send the compressed output to the fusion center. To ensure that this error is indeed a "small perturbation," we require the error to be additive and, if possible, white and Gaussian, so that we can analyze the relationship between the error and coding rate.

3.2. Lossy Compression of $\mathbf{f}_t^p$

Before we propose specific lossy compression approaches, we describe an important property of MP-AMP. In addition to the well-known Gaussianity of the vector $\mathbf{f}_t - \mathbf{s}_0$ [9], numerical results show that elements of $\mathbf{f}_t^p - (1/P)\mathbf{s}_0$ are also i.i.d. Gaussian with mean 0 and variance $\sigma_t^2/P$. Furthermore, $\mathbf{f}_t^p - (1/P)\mathbf{s}_0$ and $\mathbf{f}_t^q - (1/P)\mathbf{s}_0$ are independent for different processors $p$ and $q$. In light of this property, $\mathbf{f}_t^p$ can be described as a scalar channel:

$$F_t^p = S_0/P + (\sigma_t/\sqrt{P})\,Z_p, \quad \text{where } Z_p \sim \mathcal{N}(0,1).$$

For the Bernoulli Gaussian distribution (6),

$$F_t^p \sim \epsilon\,\mathcal{N}\left(\mu_s/P,\ (\sigma_s^2 + P\sigma_t^2)/P^2\right) + (1-\epsilon)\,\mathcal{N}\left(0,\ \sigma_t^2/P\right).$$

Scalar Quantization: Next, we propose a uniform quantizer with entropy coding, also known as entropy coded scalar quantization (ECSQ) [18]. Let $\Psi(u)$ denote the characteristic function of $F_t^p$. It can be shown that

$$|\Psi(u)| \le \epsilon\exp\left(-0.5\,(\sigma_s^2 + P\sigma_t^2)\,u^2/P^2\right) + (1-\epsilon)\exp\left(-0.5\,\sigma_t^2u^2/P\right) \le \exp\left(-0.5\,\sigma_t^2u^2/P\right),$$

so $\Psi(u)$ is nearly band-limited. Due to this property, it is possible to develop a uniform quantizer of $\mathbf{f}_t^p \sim$ i.i.d. $F_t^p$, where the quantization error $\mathbf{v}_t^p$ is approximately statistically equivalent to a uniformly distributed noise $V_t^p \sim \mathcal{U}[-0.5\Delta_Q, 0.5\Delta_Q]$ uncorrelated to $F_t^p$. In fact, a quantization bin size $\Delta_Q \le 2\sigma_t/\sqrt{P}$ suffices for the model $\mathbf{v}_t^p \sim$ i.i.d. $V_t^p$ to be valid [19].

The fusion center will receive the quantized data $\tilde{\mathbf{f}}_t^p \sim$ i.i.d. $\tilde{F}_t^p$, and calculate $\tilde{\mathbf{f}}_t = \sum_{p=1}^{P}\tilde{\mathbf{f}}_t^p \sim$ i.i.d. $\tilde{F}_t$, where

$$\tilde{F}_t = \sum_{p=1}^{P}\tilde{F}_t^p = F_t + V_t, \quad \text{and} \quad V_t = \sum_{p=1}^{P}V_t^p. \tag{7}$$

Applying the central limit theorem, $V_t$ approximately follows $\mathcal{N}(0, P\sigma_Q^2)$ for large $P$, where $\sigma_Q^2 = \Delta_Q^2/12$.

Entropy Coding and Optimum Bit Rate: Let $p_i$ be the probability that $F_t^p$ falls into the $i$-th quantization bin. The entropy of the quantized $F_t^p$, denoted $\tilde{F}_t^p$, is $H_Q = -\sum_i p_i\log_2(p_i)$ [20]; that is, each processor needs $H_Q$ bits on average to represent each element of $\tilde{\mathbf{f}}_t^p$ to the fusion center, which is achievable through entropy coding [20].

In rate distortion (RD) theory [20], we are given a length-$n$ random sequence $Y_n = \{Y_{n,i}\}_{i=1}^n \sim$ i.i.d. $Y$, and our goal is to identify a reconstruction sequence $\hat{Y}_n = \{\hat{Y}_{n,i}\}_{i=1}^n$ that can be encoded at low rate while the distortion $d(Y_n, \hat{Y}_n) = \frac{1}{n}\sum_i d(Y_{n,i}, \hat{Y}_{n,i})$ (e.g., squared error distortion) between the input and reconstruction sequences is small. RD theory has characterized the fundamental best-possible trade-off between the distortion $D = d(Y_n, \hat{Y}_n)$ and coding rate $R(D)$, which is called the rate distortion function. The RD function $R(D)$ can be computed numerically (cf. Blahut [21] and Arimoto [22]). For the uniform quantizer that yields a quantization MSE of $\sigma_Q^2$ with a coding rate of $H_Q$ bits per element, the RD function will give a bit rate $R(D = \sigma_Q^2) < H_Q$, which is achievable through vector quantization [18].

New SE Equation: For both ECSQ and RD-based vector quantization that lead to a quantization MSE of $\sigma_Q^2$, the fusion center will have $\tilde{F}_t = S_0 + \sqrt{\sigma_t^2 + P\sigma_Q^2}\,\tilde{Z}$, where $\tilde{Z} \sim \mathcal{N}(0,1)$. The new denoiser and SE equation become

$$\eta_t^Q(\tilde{F}_t) = E\left[S_0 \,\middle|\, S_0 + \sqrt{\sigma_t^2 + P\sigma_Q^2}\,\tilde{Z} = \tilde{F}_t\right]$$

and

$$\sigma_{t+1}^2 = \sigma_e^2 + (1/\kappa)\,E\left[\left(\eta_t^Q\left(S_0 + \sqrt{\sigma_t^2 + P\sigma_Q^2}\,\tilde{Z}\right) - S_0\right)^2\right]. \tag{8}$$

Currently, we only consider compression of $\mathbf{f}_t^p$. When broadcast from the fusion center to the $P$ processors is allowed in the network topology, the communication cost of sending $\mathbf{x}_t$, even uncompressed, is smaller than that of communicating the $P$ vectors $\mathbf{f}_t^p$. We are considering the case where broadcast is not allowed in our ongoing work.

3.3. Online Back-tracking (BT-MP-AMP)

Let $\sigma_{t,C}^2$ and $\sigma_{t,D}^2$ denote the $\sigma_t^2$ obtained by centralized AMP (4) and MP-AMP (8), respectively. In order to reduce communication while maintaining high fidelity, we first constrain $\sigma_{t,D}^2$ so that it will not deviate much from $\sigma_{t,C}^2$, and then determine the minimum coding rate required in each iteration. This can be done through an online back-tracking algorithm, which we name BT-MP-AMP and present below.

In each iteration $t$, before quantizing $\mathbf{f}_t^p$, we first compute $\sigma_{t+1,C}^2$ for the next iteration. Then we find the maximum quantization MSE $\sigma_Q^2$ allowed so that the ratio $\sigma_{t+1,D}^2/\sigma_{t+1,C}^2$ does not exceed some constant, provided that the required bit rate does not exceed some threshold. Based on the obtained $\sigma_Q^2$, we construct the corresponding quantizer.

Note that the SE in (8) is only an approximation, and we do not know the true value of $\sigma_{t,D}^2$ in the current iteration. To better predict $\sigma_{t+1,D}^2$, we use the empirical estimate $\hat{\sigma}_{t,D}^2 = \sum_{p=1}^{P}\|\mathbf{z}_t^p\|^2/M$, which is a good estimator for $\sigma_{t,D}^2$ [8,9], to compute $\sigma_{t+1,D}^2$. To obtain $\hat{\sigma}_{t,D}^2$, each processor $p$ sends the scalar $\|\mathbf{z}_t^p\|^2$ to the fusion center, which then sends the scalar $\hat{\sigma}_{t,D}^2$ to all the processors. The corresponding communication cost is also negligible compared with that of communicating $\mathbf{f}_t^p$.

3.4. Dynamic Programming (DP-MP-AMP)

While back-tracking is a useful heuristic, it is possible, for a given coding budget $R$ per element, total number of AMP iterations $T$, and initial noise level $\sigma_0^2$ in the scalar channel, to compute the coding rate allocations among the AMP iterations that minimize the final MSE, $\sigma_{T,D}^2$.

To do so, first note that we can evaluate $\sigma_{t,C}^2$ offline and hence obtain the number of iterations required to reach the steady state, which would be a reasonable choice for $T$. Second, recalling the new SE equation in (8), $\sigma_{t,D}^2$ depends on $\sigma_{t-1,D}^2$ and $\sigma_Q^2$, which is a function of $R_t$, the coding rate allocated in the $t$-th iteration. Therefore, we can rewrite $\sigma_{t,D}^2$ as follows:

$$\sigma_{t,D}^2 = f_1(\sigma_{t-1,D}^2, R_t) = f_2(\sigma_{t-2,D}^2, R_{t-1}, R_t) = \cdots = f_t(\sigma_0^2, R_1, \cdots, R_{t-1}, R_t), \tag{9}$$

that is, given $\sigma_0^2$, $\sigma_{T,D}^2$ is only a function of $R_t$ for $t \in \{1, 2, \cdots, T\}$. Denoting $\mathcal{F}_T(R) = \{R_1, \cdots, R_T \ge 0 : \sum_{t=1}^{T}R_t = R\}$, minimizing $\sigma_{T,D}^2$ for a given $R$ can be formulated as the following optimization problem:

$$\min_{\mathcal{F}_T(R)}\sigma_{T,D}^2 = \min_{\mathcal{F}_T(R)} f_T(\sigma_0^2, R_1, \cdots, R_T). \tag{10}$$

Since $\sigma_{t,D}^2$ is increasing with $\sigma_{t-1,D}^2$, it is easy to verify the following recursive relationship:

$$\min_{\mathcal{F}_T(R)}\sigma_{T,D}^2 = \min_{0 \le R_T \le R} f_1\left(\min_{\mathcal{F}_{T-1}(R - R_T)}\sigma_{T-1,D}^2,\ R_T\right) = \cdots,$$

which makes the problem solvable through dynamic programming (DP).

To implement DP, we need to discretize $\mathcal{F}_T(R)$ into $\{R_1, \cdots, R_T \in \Omega : \sum_{t=1}^{T}R_t = R\}$, where $\Omega = \{R^{(1)}, \cdots, R^{(S)}\}$ with $R^{(s)} = R(s-1)/(S-1)$, $\forall s \in \{1, \cdots, S\}$. In this paper, we set the bit-rate resolution $\Delta R = R/(S-1) = 0.1$ bits per element. Then, we create an $S \times T$ array $\Sigma$, with the element in the $s$-th row ($s \in \{1, \cdots, S\}$) and $t$-th column ($t \in \{1, \cdots, T\}$) denoted as $\sigma_D^2(s,t)$, storing the optimal value of $\sigma_{t,D}^2$ when a total of $R^{(s)}$ bits per element are used in the first $t$ iterations. By definition of $\sigma_D^2(s,t)$, we have

$$\sigma_D^2(s,t) = \min_{r \in \{1, 2, \cdots, s\}} f_1\left(\sigma_D^2(r, t-1),\ R^{(s-r+1)}\right), \tag{11}$$

and the first column of elements in $\Sigma$ is obtained by:

$$\sigma_D^2(s,1) = f_1\left(\sigma_0^2,\ R^{(s)}\right), \quad \forall s \in \{1, 2, \cdots, S\}. \tag{12}$$

After obtaining $\Sigma$, the optimal value of $\sigma_{T,D}^2$, by definition, is $\sigma_D^2(S,T)$. Meanwhile, to obtain the optimal bit allocation strategy, we need another $S \times T$ array $\mathbf{R}$ to store the optimal bit rate $R_{DP}(s,t)$ that is allocated at iteration $t$ when a total of $R^{(s)}$ bits per element are used in the first $t$ iterations. Similar to BT-MP-AMP, we name the proposed MP-AMP approach combined with DP as DP-MP-AMP.
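The ECSQ step of Section 3.2 can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the function names `ecsq` and `sample_ftp` are hypothetical, the values of `eps`, `sigma_t`, and the sample size are illustrative assumptions, and the entropy is only an empirical estimate of $H_Q$ (no actual entropy coder is run). It draws samples of the scalar channel $F_t^p$ with $\mu_s = 0$, quantizes them with bin size $\Delta_Q = 2\sigma_t/\sqrt{P}$, and compares the empirical quantization MSE with $\Delta_Q^2/12$.

```python
import math
import random
from collections import Counter


def ecsq(samples, delta):
    """Uniform scalar quantizer with bin size `delta` (bins centred on k * delta).

    Returns the reconstruction values, the empirical quantization MSE, and the
    empirical entropy (bits per element) of the bin indices.
    """
    indices = [round(x / delta) for x in samples]
    recon = [k * delta for k in indices]
    n = len(samples)
    mse = sum((x - r) ** 2 for x, r in zip(samples, recon)) / n
    counts = Counter(indices)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return recon, mse, entropy


def sample_ftp(n, eps, sigma_s, sigma_t, P, rng):
    """Draw n samples of F_t^p = S_0 / P + (sigma_t / sqrt(P)) * Z_p with mu_s = 0."""
    out = []
    for _ in range(n):
        s0 = rng.gauss(0.0, sigma_s) if rng.random() < eps else 0.0
        out.append(s0 / P + sigma_t / math.sqrt(P) * rng.gauss(0.0, 1.0))
    return out


rng = random.Random(0)
P, eps, sigma_s, sigma_t = 30, 0.05, 1.0, 0.3   # sigma_t is an illustrative value
delta = 2.0 * sigma_t / math.sqrt(P)            # largest bin size allowed in Section 3.2
f_tp = sample_ftp(20_000, eps, sigma_s, sigma_t, P, rng)
_, mse, h_q = ecsq(f_tp, delta)
print(f"empirical quantization MSE: {mse:.3e}")
print(f"Delta_Q^2 / 12:             {delta ** 2 / 12:.3e}")
print(f"empirical entropy H_Q:      {h_q:.3f} bits per element")
```

Shrinking `delta` lowers the quantization MSE and raises the entropy, which is the rate versus distortion trade-off that the rate allocation schemes act on.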
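The table recursion (11)-(12) of Section 3.4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: `dp_rate_allocation` is a hypothetical name, indices are 0-based (so grid point `s` corresponds to $R^{(s+1)}$ in the text), and `toy_f1` is a stand-in for $f_1$ that assumes a Gaussian prior $\mathcal{N}(0, v_s)$ with closed-form MMSE and the Gaussian rate distortion relation $D = \mathrm{var}\cdot 2^{-2R}$, instead of the Bernoulli Gaussian prior and the numerically computed RD function used in the paper.

```python
import math


def dp_rate_allocation(f1, sigma0_sq, total_rate, num_iters, num_levels):
    """Fill the S x T table of (11)-(12) and back-track the best rates.

    f1(prev_var, rate) -> next_var is the one-step SE map; it must be
    non-decreasing in prev_var for the recursion to be valid.
    Grid point s (0-based) corresponds to a rate of s * total_rate / (S - 1).
    """
    S, T = num_levels, num_iters
    step = total_rate / (S - 1)
    best = [[math.inf] * T for _ in range(S)]   # best[s][t] plays the role of sigma_D^2(s, t)
    choice = [[0] * T for _ in range(S)]        # grid units spent at iteration t
    for s in range(S):                          # first column, as in (12)
        best[s][0] = f1(sigma0_sq, s * step)
        choice[s][0] = s
    for t in range(1, T):                       # remaining columns, as in (11)
        for s in range(S):
            for r in range(s + 1):              # r units spent before iteration t
                cand = f1(best[r][t - 1], (s - r) * step)
                if cand < best[s][t]:
                    best[s][t], choice[s][t] = cand, s - r
    rates, s = [], S - 1                        # back-track from the full budget
    for t in range(T - 1, -1, -1):
        rates.append(choice[s][t] * step)
        s -= choice[s][t]
    return best[S - 1][T - 1], rates[::-1]


def toy_f1(prev_var, rate, v_s=0.05, kappa=0.3, sigma_e_sq=0.05 / 30.0, P=30):
    """Hypothetical one-step SE map; a stand-in for f_1 in (9), not the paper's."""
    var_ftp = v_s / P ** 2 + prev_var / P        # variance of F_t^p
    sigma_q_sq = var_ftp * 2.0 ** (-2.0 * rate)  # assumed Gaussian RD relation
    eff_var = prev_var + P * sigma_q_sq          # noise variance entering (8)
    mmse = v_s * eff_var / (v_s + eff_var)       # MMSE of N(0, v_s) in Gaussian noise
    return sigma_e_sq + mmse / kappa


T, R_total, S = 4, 8.0, 81                       # R = 2T and a 0.1-bit grid
sigma0_sq = 0.05 / 30.0 + 0.05 / 0.3             # sigma_e^2 + E[S_0^2] / kappa
best_var, rates = dp_rate_allocation(toy_f1, sigma0_sq, R_total, T, S)
print("final variance:", best_var)
print("rates per iteration:", [round(r, 1) for r in rates])
```

The triple loop costs on the order of $S^2 T$ evaluations of $f_1$, so in practice the cost is dominated by how expensive the SE map is to evaluate.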
4. NUMERICAL RESULTS

We evaluate BT-MP-AMP and DP-MP-AMP in an MP system with $P = 30$ processors at SNR = 20 dB, where we set $N = 10{,}000$, $M = 3{,}000$, i.e., $\kappa = 0.3$, and generate Bernoulli-Gaussian sequences $\mathbf{s}_0$ with $\epsilon \in \{0.03, 0.05, 0.1\}$, $\mu_s = 0$, and $\sigma_s = 1$.

We first evaluate the SE equation (4) of centralized AMP for the three sparsity levels. As shown in Fig. 1, they reach the steady state after $T = 8$, 10, and 20 iterations respectively. Then, we run BT-MP-AMP and DP-MP-AMP, where for the latter the total rates are $R = 2T$ bits per element and the RD function models the relation between $R_t$ and $\sigma_Q^2$.

According to RD theory, in the high rate limit, we should expect a gap of roughly 0.255 bits per element between the entropy and RD function for a given distortion level [18]. Therefore, in an implementation of DP-MP-AMP where we apply ECSQ, we add 0.255 bits per element to the results in each iteration obtained by DP. Note that the two solid curves in the top three panels are obtained through offline calculation and optimization, and the two dash-dotted curves are obtained through AMP simulations.

[Figure 1 plot not reproduced; the top three panels show SDR and the remaining panels show bit rate per element, both versus iteration number t.]

Fig. 1. SDR and bit rates as functions of iteration number $t$. ($N = 10{,}000$, $M = 3{,}000$, $\kappa = 0.3$, $\mu_s = 0$, $\sigma_s = 1$, SNR = 20 dB.)

As shown in Fig. 1, BT-MP-AMP uses fewer than 6 bits per element in each iteration, which is more than 80% communication savings compared with 32-bit single-precision floating-point transmission, while achieving almost the same SDRs as centralized AMP. On the other hand, there are clear gaps between the SDRs of DP-MP-AMP and centralized AMP during the first few iterations, but they vanish quickly as $t$ approaches $T$, in return for over 50% communication reduction beyond that provided by BT-MP-AMP, as shown in Table 1.

Table 1. Total bits per element of MP-AMP

ε                               0.03     0.05     0.10
T                               8        10       20
BT-MP-AMP (RD prediction)       33.82    46.43    96.16
BT-MP-AMP (ECSQ simulation)     36.09    49.19    101.50
DP-MP-AMP (RD prediction)       16       20       40
DP-MP-AMP (ECSQ simulation)     18.04    22.55    45.10

Note also that the ECSQ implementation of DP-MP-AMP has lower SDRs at the beginning than those predicted by the DP results based on the RD function. This is because the 0.255-bit gap only holds in the high rate limit. However, due to the robustness of SE to disturbances, and the increasingly high rates as $t$ approaches $T$, the ECSQ implementation matches the predicted DP results at the last iteration.

5. CONCLUSION

In this paper, we proposed a multi-processor approximate message passing framework with lossy compression. We used a uniform quantizer with entropy coding to reduce communication costs, and reformulated the state evolution formalism while accounting for quantization noise. Combining the quantizers and modified state evolution equation, an online back-tracking approach and another method based on dynamic programming determine the coding rate in each iteration by controlling the induced error. The numerical results suggest that our approaches can maintain a high signal-to-distortion-ratio despite a significant and often dramatic reduction in inter-processor communication costs.

6. REFERENCES

[1] D. L. Donoho, "Compressed sensing," IEEE Trans. Info. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.

[2] E. J. Candes, J. K. Romberg, and T. Tao, "Stable signal recovery from incomplete and inaccurate measurements," Comm. Pure Appl. Math., vol. 59, no. 8, pp. 1207–1223, Mar. 2006.

[3] J. Mota, J. Xavier, P. Aguiar, and M. Puschel, "Distributed basis pursuit," IEEE Trans. Sig. Proc., vol. 60, no. 4, pp. 1942–1956, Apr. 2012.

[4] S. Patterson, Y. C. Eldar, and I. Keidar, "Distributed sparse signal recovery for sensor networks," in IEEE Int. Conf. on Acoust., Speech, and Sig. Proc. (ICASSP), 2013, pp. 4494–4498.

[5] S. Patterson, Y. C. Eldar, and I. Keidar, "Distributed compressed sensing for static and time-varying networks," IEEE Trans. Sig. Proc., vol. 62, no. 19, pp. 4931–4946, Oct. 2014.

[6] P. Han, R. Niu, M. Ren, and Y. C. Eldar, "Distributed approximate message passing for sparse signal recovery," in IEEE Global Conf. Sig. Info. Proc. (GlobalSIP), 2014, pp. 497–501.

[7] D. L. Donoho, A. Maleki, and A. Montanari, "Message passing algorithms for compressed sensing," in Proc. Natl. Acad. Sci., Madrid, Spain, Sep. 2009, vol. 106, pp. 18914–18919.

[8] D. L. Donoho, A. Maleki, and A. Montanari, "The Noise-Sensitivity Phase Transition in Compressed Sensing," IEEE Trans. Info. Theory, vol. 57, pp. 6920–6941, Oct. 2011.

[9] M. Bayati and A. Montanari, "The Dynamics of Message Passing on Dense Graphs, with Applications to Compressed Sensing," IEEE Trans. Info. Theory, vol. 57, pp. 764–785, Feb. 2011.

[10] J. Tan, Y. Ma, and D. Baron, "Compressive imaging via approximate message passing with image denoising," IEEE Trans. Sig. Proc., vol. 63, no. 8, pp. 2085–2092, Apr. 2015.

[11] J. T. Parker, P. Schniter, and V. Cevher, "Bilinear generalized approximate message passing," arXiv preprint arXiv:1310.2632, 2013.

[12] A. Montanari and E. Richard, "Non-negative principal component analysis: Message passing algorithms and sharp asymptotics," arXiv preprint arXiv:1406.4775, 2014.

[13] J. F. Mota, J. M. Xavier, P. M. Aguiar, and M. Puschel, "D-ADMM: A communication-efficient distributed algorithm for separable optimization," IEEE Trans. Sig. Proc., vol. 61, no. 10, pp. 2718–2723, May 2013.

[14] P. Han, R. Niu, and Y. C. Eldar, "Modified distributed iterative hard thresholding," in IEEE Int. Conf. on Acoust., Speech, and Sig. Proc. (ICASSP), 2015, pp. 3766–3770.

[15] P. Han, R. Niu, and Y. C. Eldar, "Communication-efficient distributed IHT," in Proc. SPARS, 2015.

[16] D. Koller and N. Friedman, Probabilistic graphical models: principles and techniques, MIT Press, 2009.

[17] J. Tan, Y. Ma, and D. Baron, "Compressive imaging via approximate message passing with wavelet-based image denoising," in IEEE Global Conf. Sig. Info. Proc. (GlobalSIP), 2014, pp. 424–428.

[18] A. Gersho and R. M. Gray, Vector quantization and signal compression, vol. 159, Springer Science & Business Media, 2012.

[19] B. Widrow and I. Kollár, Quantization Noise: Roundoff Error in Digital Computation, Signal Processing, Control, and Communications, Cambridge University Press, 2008.

[20] T. M. Cover and J. A. Thomas, Elements of information theory, John Wiley & Sons, 2012.

[21] R. E. Blahut, "Computation of channel capacity and rate-distortion functions," IEEE Trans. Info. Theory, vol. 18, no. 4, pp. 460–473, Jul. 1972.

[22] S. Arimoto, "An algorithm for computing the capacity of arbitrary discrete memoryless channels," IEEE Trans. Info. Theory, vol. 18, no. 1, pp. 14–20, Jan. 1972.