Table Of Content

COMPRESSIVESENSINGWITHREDUNDANTDICTIONARIESANDSTRUCTURED MEASUREMENTS FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD ABSTRACT. Consider theproblemofrecoveringan unknownsignalfromundersampled measurements, giventheknowledgethatthesignalhasasparserepresentationinaspecifieddictionaryD. Thisproblem isnowunderstoodtobewell-posedandefficientlysolvableundersuitableassumptionsonthemeasure- 5 mentsanddictionary,ifthenumberofmeasurementsscalesroughlywiththesparsitylevel.Onesufficient 1 conditionforsuchistheD-restrictedisometryproperty(D-RIP),whichasksthatthesamplingmatrixap- 0 proximatelypreservethenormofallsignalswhicharesufficientlysparseinD. Whilemanyclassesofran- 2 dommatricesareknowntosatisfysuchconditions,suchmatricesarenotrepresentativeofthestructural n constraintsimposedbypracticalsensingsystems. Weclosethisgapinthetheorybydemonstratingthat u onecansubsampleafixedorthogonalmatrixinsuchawaythattheD-RIPwillhold,providedthisbasisis J sufficientlyincoherentwiththesparsifyingdictionaryD.Wealsoextendthisanalysistoallowforweighted 8 sparseexpansions.Consequently,wearriveatcompressivesensingrecoveryguaranteesforstructuredmea- surementsandredundantdictionaries,openingthedoortoawidearrayofpracticalapplications. ] T I . s c [ 1. INTRODUCTION 2 1.1. CompressiveSensing. Thecompressive sensingparadigm, asfirstintroducedbyCandèsandTao v [CT05b]andDonoho[Don06],isbasedonusingavailabledegreesoffreedominasensingmechanismto 8 tunethemeasurementsystemsoastoallowforefficientrecoveryofaparticulartypeofsignalorimageof 0 2 interestfromasfewmeasurementsaspossible. Amodelassumptionthatallowsforsuchsignalrecovery 3 issparsity:thesignalcanbewell-approximatedbyjustafewelementsofagivenrepresentationsystem. 0 Often, a near-optimal strategy is to choose the measurements completely at random, for example . 1 following a Gaussian distribution [RV08]. Typically, however, some additional structure is imposed by 0 theapplicationathand,andrandomnesscanonlybeinfusedintheremainingdegreesoffreedom. For 5 1 example,magneticresonanceimaging(MRI)isknowntobewellmodeledbyinnerproductswithFourier : basisvectors. Thisstructurecannotbechanged,andtheonlyaspectthatcanbedecidedatrandomis v i whichFourierbasisvectorstoselect. X Animportantdifferencebetweencompletelyrandommeasurementsystemsandstructuredrandom r a measurement systems is in the aspect of universality. Gaussian measurement systems, among many othersystemswithminimalimposedstructure,areoblivioustothebasisinwhichtheunderlyingsignal issparse, andachieve equalreconstructionquality foralldifferentorthonormalbasis representations. Forstructuredmeasurementsystems,thisis,ingeneral,nolongerthecase.Whenthemeasurementsare uniformlysubsampledfromanorthonormalbasissuchastheFourierbasis,onerequires,forexample, thatthemeasurement basis andthesparsity basis areincoherent; theinnerproducts between vectors fromthetwobasesaresmall. Ifthetwobasesarenotincoherent,morerefinedconceptsofincoherence are required [KW14, AHPR13]. An important example is that of the Fourier measurement basis and a waveletsparsitybasis.Sincebothcontaintheconstantvector,theyaremaximallycoherent. 1.2. Motivation. All of the above mentioned works, however, exclusively cover the case of sparsity in orthonormalbasisrepresentations.Ontheotherhand,thereareavastnumberofapplicationsinwhich sparsity is expressed not in terms of a basis but in terms of a redundant, often highly overcomplete, dictionary. Specifically, if f Cn is the signal of interest to be recovered, then one expresses f Dx, ∈ = Date:June9,2015. 1 2 FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD whereD Cn N isanovercompletedictionaryandx CN isasparse(ornearlysparse)coefficientvector. × ∈ ∈ Redundancy is widespread in practice, either because no sparsifying orthonormal basis exists for the signal class of interest, or because the redundancy itself is useful and allows for a significantly larger, richerclassofsignalswhichcanbesparselyrepresentedintheresultingdictionary. Forexample,ithas beenwelldocumentedthatovercompletenessisthekeytoadrasticreductioninartifactsandrecovery errorinthedenoisingframework[SED04,SFM07]. Inthecompressivesensingframework,resultsusingovercompletedictionariesaremotivatedbythe broadarrayoftight(andoftenParseval)framesappearinginpracticalapplications. Forexample,ifone assumessparsitywithrespecttotheDiscrete FourierTransform(DFT),thisisimplicitly assumingthat thesignaliswell-representedbyfrequencieslyingalongthelatticeoftheDFT.Toallowformoreflexibility inthisrigidassumption,oneinsteadmayemploytheoversampledDFTframe,containingfrequencies onamuchfinergrid,orevenoverintervalsofvaryingwidths.Gaborframesareusedinimagingaswellas radarandsonarapplications,whichareoftenhighlyredundant[Mal99]. Manyofthemostwidely-used framesinimagingapplicationssuchasundecimatedwaveletframes[Dut89,SED04],curvelets[CD04], shearlets[LLKW05,ELL08],framelets[CCS08],andmanyothers,areovercompletewithhighlycorrelated columns. 1.3. Related Work. Due to the abundance of relevant applications, a number of works have studied compressive sensing for overcomplete frames. The first work on this topic aimed to recover the coef- ficientvectorxdirectly,andthusrequiredstrongincoherenceassumptionsonthedictionaryD[RSV08]. Morerecently,itwasnotedthatifoneinsteadaimstorecover f ratherthanx,recoveryguaranteescanbe obtainedunderweakerassumptions. Namely,oneonlyneedsthatthemeasurementmatrix A respects thenormsofsignalswhicharesparseinthedictionaryD.Toquantifythis,Candèsetal.[CENR10]define theD-restrictedisometryproperty(D-RIPinshort,seeDefinition2.1below).Formeasurementmatrices thathavethisproperty,anumberofalgorithmshavebeenshowntoguaranteerecoveryundercertainas- sumptions. Optimizationapproachessuchasℓ -analysis[EMR07,CENR10,GNE 14,NDEG13,LML12, 1 + RK13,Gir14]andgreedyapproaches[DNW12,GNE 14,GN13,PE13,GN14]havebeenstudied. + Thispaper establishes the D-RIPfor structuredrandommeasurementsformedbysubsampling or- thonormalbases,allowingforthesetypesofrecoveryresultstobeutilizedinmorerealisticsettings. To date,most randommatrixconstructionsknowntoyieldD-RIPmatriceswithhighprobabilityareran- dommatriceswithacertainconcentrationproperty.Asshownin[CENR10],suchapropertyimpliesthat forarbitrarydictionaryD,oneobtainstheD-RIPwithhighprobability.Thiscanbeinterpretedasadic- tionaryversionoftheuniversalitypropertydiscussedabove. Ithasnowbeenshownthatseveralclasses ofrandommatricessatisfythispropertyaswellassubsampledstructuredmatrices,afterapplyingran- domcolumnsigns[DG03,KW11].Thematricesinbothofthesecasesaremotivatedbyapplicationsce- narios,buttypicallyinapplicationstheyappearwithouttherandomizedcolumnsigns. Inmanycases oneisnotabletoapplycolumnsignsinpractice;forexampleincaseswherethemeasurementsarefixed suchasinMRI,onehasnochoicebuttouseunsignedFouriersamplesandcannotpre-processthedata toincorporatecolumnsigns. Withoutthesesignshowever,suchmeasurementensembleswillnotwork forarbitrarydictionariesingeneral.ThisiscloselyrelatedtotheunderlyingRIPmatrixconstructionsnot beinguniversal.Forexample,itisclearthatrandomlysubsampledFouriermeasurementswillfailforthe oversampledFourierdictionary(forreasonssimilartothebasiscase).Inthiswork,weaddressthisissue, derivingrecoveryguaranteesthattakeintoaccountthedictionary.Similartothebasiscase,ouranalysis willbecoherencebased. AsimilarapproachhasalsobeentakenbyPoonin[Poo15b](completedafter thefirstversionofourpaper),foraninfinitedimensionalversionoftheproblem. 1.4. Contribution. Our main result shows that a wide class of orthogonal matrices having uniformly bounded entries can be subsampled to obtain a matrix that has the D-RIP and hence yields recovery guarantees for sparse recovery in the setting of redundant dictionaries. As indicated above, our technical estimates below will imply such guarantees for various algorithms. As an example we focus on the method of ℓ -analysis, for which the first D-RIP based guarantees were available [CENR10]. Our 1 COMPRESSIVESENSINGWITHREDUNDANTDICTIONARIESANDSTRUCTUREDMEASUREMENTS 3 technicalestimateswillalsoprovidemoregeneralguaranteesforweightedℓ -analysisminimization(a 1 weightedversionof(P ),see(P )inSection3fordetails)incaseonehaspriorknowledge oftheun- 1 1,w derlyingsparsitysupport. Recallthatthemethodofℓ -analysisconsistsofestimatingasignal f fromnoisymeasurements y 1 = Af e bysolvingtheconvexminimizationproblem + f♯ argmin D∗f˜ 1 suchthat Af˜ y 2 ε, (P1) = k k k − k ≤ f˜ Cn ∈ whereεisthenoiselevel,thatis, e ε.Theℓ -analysismethod(likealternativeapproaches)assumes 2 1 k k ≤ thatforthesignal f Dx,notonlyistheunderlying(synthesis)coefficientsequencex sparse(typically = unknownandhardtoobtain),butalsotheanalysiscoefficientsD f arenearlysparse, i.e., dominated ∗ byafewlargeentries.WereferthereadertoTheorem2.2belowforthepreciseformulationoftheresult- ing recovery guarantees(as derivedin [CENR10]). The assumption has been observed empirically for manydictionariesusedinpracticesuchastheGaborframe,undecimatedwavelets,curvelets,etc. (see, e.g., [CENR10] for a detailed description of such frames) and is also key for a number of thresholding approachestosignaldenoising[CD04,KL07,ELL08]. A related and slightly stronger signal model, in which the analysis vector D f is sparse or nearly ∗ sparse,hasbeenconsideredindependentlyfromcoefficientsparsity(e.g.,[NDEG13]),andiscommonly calledtheco-sparsitymodel. Theresultsinthispaperneedasimilar,butsomewhatweakerassumptiontoholdforallsignalscor- respondingtosparsesynthesiscoefficientsx.Namely,oneneedstocontrolthelocalizationfactoraswe nowintroduce. Definition1.1. ForadictionaryD Cn N andasparsitylevels,wedefinethelocalizationfactoras × ∈ D Dz η ηdef sup k ∗ k1. (1.1) s,D = = ps kDzk2=1,kzk0≤s In the following we will mainly consider dictionaries, which formParseval frames, that is, D is an ∗ isometry.Thenthetermlocalizationfactorisappropriatebecausethisquantitycanbeviewedasthefac- torbywhichsparsityispreservedunderthegrammatrixmapD D,comparedtothecasewhereD isan ∗ orthonormalbasisandD D I andη 1byCauchy-Schwarz. Forageneralfamilyofsuchframespa- ∗ n = ≡ rameterizedbytheredundancyN/n,ηwillincreasewiththeredundancy;familiesofdictionarieswhere suchgrowth isrelativelyslowwill beofinterest. Forexample,newresultsontheconstructionofunit- normtightframesgiveaconstructivemethodtogeneratespectraltetrisframes[CFM 11],whosegram + matriceshaveatmost2 N/n 6non-zerosperrow,guaranteeingthatηisproportionalonlytothere- ⌈ ⌉+ dundancyfactorN/n.Infact,onecanshowthatthesearesparsestframespossible[CHKK11],suggesting thatfamiliesoftightframeswithlocalizationfactordependinglinearlyontheredundancyfactorshould beessentiallyoptimalforℓ -analysisreconstruction. Weposeasaproblemforsubsequentworktoob- 1 tainsuch estimatesfor dictionariesof practicalinterest, discussing a fewexamplesin Section 3. First, to illustrate that our results go strictly beyond existing theory, we show that harmonicframes (frames constructedbyremovingthehighfrequencyrowsoftheDFT–see(3.3))withsmallredundancyindeed haveaboundedlocalizationfactor.Toourknowledge,thisimportantcaseisnotcoveredbyexistingthe- ories. Inaddition,wealsoboundthelocalizationfactorforredundantHaarwaveletframes. Weexpect, however,thatitwillbedifficulttoefficientlycomputethelocalization factorofanarbitrarydictionary. Forboundedlocalizationfactors,weprovethefollowingtheorem. Theorem1.2. Fixasparsitylevel s N. LetD Cn N bea Parsevalframe –i.e., DD istheidentity– × ∗ < ∈ withcolumns{d ,...,d },andletB {b ,...,b }beanorthonormalbasisofCn whichisincoherenttoD 1 N 1 n = inthesensethat sup sup bi,dj Kn−1/2 (1.2) |〈 〉|≤ i [n]j [N] ∈ ∈ forsomeconstantK 1. ≥ 4 FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD Consider the (unweighted) localization factor η η as in Definition 1.1. Construct B Cm n by s,D × = ∈ samplingrowvectorsfromB i.i.d.uniformlyatrandom.Provided e m CsK2η2log3(sη2)log(N), (1.3) ≥ thenwithprobability1 N log3(2s), nB exhibitsuniformrecoveryguaranteesforℓ -analysis. Thatis, − − m 1 foreverysignal f,thesolution f♯ oftqheminimizationproblem(P )withy nBf e fornoisee with e 1 = m + e 2 εsatisfies q k k ≤ D f (D f) e f♯ f C ε C k ∗ − ∗ sk1. (1.4) 2 1 2 k − k ≤ + ps Here,C,C ,andC areabsoluteconstantsindependentofthedimensionsandthesignal f. 1 2 Above and throughout, (u) denotes the best s-sparse approximation to a signal u, that is, (u) def s s = argmin u z . kzk0≤sk − k2 Remarks. 1. Ourstabilityresult(1.4)forℓ analysisguaranteesrecoveryfortheparticular f Dz onlyuptothe 1 = scaledℓ1normofthetailofD∗Dz. Infact,thequantityEf∗= kD∗f−p(Ds∗f)sk1,referredtoastheunrecover- ableenergyofthesignal f inD [CENR10],iscloselyrelatedtothelocalizationfactorη : s D f ηs= sup k p∗sk1 ≤ sup Ef∗+1 f=Dz:kfk2=1,kzk0≤s f=Dz:kfk2=1,kzk0≤s 2. As mentioned, the proof of this theorem will proceed via the D-RIP, so our analysis yields similar guaranteesforotherrecoverymethodsaswell. 3. Itisinterestingtoremarkontheroleofincoherenceinthissetting. Whilepriorworkincompressed sensinghasrequiredthemeasurementmatrixBitselfbeincoherent,wenowaskinsteadforincoherence betweenthebasisfromwhichmeasurementsareselectedandthedictionaryD,ratherthanincoherence withinD.Ofcourse,notethatthecoherenceofethedictionaryDitselfimpactsthelocalizationfactorη . s,D WewillseelaterinTheorem3.1thattheincoherencerequirementbetweenthebasisanddictionarycan beweakenedevenfurtherbyusingaweightedapproach. 4.TheassumptionthatD isaParsevalframeisnotnecessarybutmadeforconvenienceofpresentation. Severalresultshavebeenshownforframeswhicharenottight,e.g.,[LML12,RK13,Gir14]. Indeed,any frameD Cn N withlinearlyindependentrowsissuchthat × ∈ D˜ : (DD∗)−1/2D = is a tight frame. As observed in the recent paper [Fou15], in this case one can use for sampling the adaptedmatrix A˜ (B˜(DD ) 1/2),asmeasuringthesignal f Dz through y A˜f e isequivalentto ∗ − = = = + measuringthesignal f˜ D˜z through y B˜f˜ e. Workingthroughthisextensionposesaninteresting = = + directionforfuturework. 1.5. Organization. Therest of thepaperis organized asfollows. Weintroduce some notation andre- view some technical backgroundin Section 2 before presenting a technical version of our main result inSection3,whichdemonstratesthatbaseswhichareincoherentwithadictionaryD canberandomly subsampledtoobtainD-RIPmatrices. Inthatsection, wealsodiscuss somerelevantapplicationsand theimplicationsofourresults,includingtheregimewherethesamplingbasisiscoherenttothedictio- nary.Theproofofourmaintheoremispresentedinthefinalsection. 2. NOTATION ANDTECHNICAL BACKGROUND Throughoutthepaper,wewriteC,C ,C ,C ,...todenoteabsoluteconstants;theirvaluescanchange ′ 1 2 between different occurrences. We write I to denote the n n identity matrix and we denote by [n] n × theset {1,2,...,n}. Fora vectoru, we denote the jthindex byu[j]oru(j)oru ,dependingonwhich j notationisthemostclearinanygivencontext;itsrestrictiontoonlythoseentriesindexedbyasubset COMPRESSIVESENSINGWITHREDUNDANTDICTIONARIESANDSTRUCTUREDMEASUREMENTS 5 Λ is donated by u . These notations should not be confused with the notation (u) , which we use to Λ s denotethebests-sparseapproximationtou,thatis,(u) defarginf u z .SimilarlyforamatrixA, s = kzk0≤sk − k2 A isthe jthcolumnandA isthesubmatricesconsistingofthecolumnswithindicesinΛ. j Λ 2.1. The unweighted case with incoherence. Recall that a dictionary D Cn N is a Parseval frame if × ∈ DD I andthatavector x iss-sparseif x : supp(x) s. Thentherestrictedisometryproperty ∗ n 0 = k k =| |≤ withrespecttothedictionaryD (D-RIP)isdefinedasfollows. Definition2.1([CENR10]). FixadictionaryD Cn N andmatrix A Cm n. Thematrix A satisfiesthe × × ∈ ∈ D-RIPwithparametersδandsif (1 δ) Dx 2 ADx 2 (1 δ) Dx 2 (2.1) − k k2≤k k2≤ + k k2 foralls-sparsevectorsx CN. ∈ Note that when D is the identity, this definition reduces to the standard definition of the restricted isometryproperty[CT05a].Undersuchanassumptiononthemeasurementmatrix,thefollowingresults boundthereconstructionerrorfortheℓ -analysismethod. 1 Theorem 2.2 ([CENR10]). Let D be a tight frame, ε 0, and consider a matrix A that has the D-RIP > withparameters2s andδ 0.08. Thenforeverysignal f Cn,thereconstruction f♯ obtainedfromnoisy < ∈ measurementsy Af e, e ε,viatheℓ -analysisproblem(P )abovesatisfies 2 1 1 = + k k ≤ D f (D f) f f♯ ε k ∗ − ∗ sk1. (2.2) 2 k − k ≤ + ps Thusasindicatedintheintroduction,oneobtainsgoodrecoveryforsignals f whoseanalysiscoeffi- cientsD f haveasuitabledecay. ∗ Both the RIP and the D-RIP are closely related to the Johnson-Lindenstrauss lemma [HV11, AC09, BDDW08]. Recalltheclassicalvariantofthelemmastatesthatforanyε (0,0.5)andpointsx ,...,x 1 d ∈ ∈ Rn,thatthereexistsaLipschitzfunction f :Rn Rm forsomem O(ε 2logd)suchthat − → = (1 ε) x x 2 f(x ) f(x ) 2 (1 ε) x x 2 (2.3) − k i− jk2≤k i − j k2≤ + k i− jk2 foralli,j {1,...,d}. Recentimprovementshavebeenmadetothisstatementwhichinparticularshow ∈ thatthemap f canbetakenasarandomlinearmappingwhichsatisfies(2.3)withhighprobability(see, e.g., [Ach03] for an account of such improvements). Indeed, any matrix A Cm n which for a fixed1 × ∈ vectorz Cn satisfies ∈ P (1−δ)kzk22≤kAzk22≤(1+δ)kzk22 ≤Ce−cm willsatisfytheD-RIPwithhigh¡probabilityaslongasm isatlea¢stontheorderof slog(n/s)[CENR10]. Fromthis,anymatrixsatisfyingtheJohnson-LindenstrausslemmawillalsosatisfytheD-RIP(see[BDDW08] forthe proofof theRIPfor such matrices, which directly carriesover totheD-RIP).Randommatrices knowntohavethispropertyincludematriceswithindependentsubgaussianentries(suchasGaussian or Bernoulli matrices), see for example [DG03]. Moreover, it is shown in [KW11] that any matrix that satisfies the classical RIP will satisfy the Johnson-Lindenstrauss lemma and thus the D-RIP with high probability after randomizing the signs of the columns. The latter construction allows for structured randommatriceswithfastmultiplicationpropertiessuchasrandomlysubsampledFouriermatrices(in combinationwiththeresultsfrom[RV08])andmatricesrepresentingsubsampledrandomconvolutions (incombinationwiththeresultsfrom[RRT12,KMR14]);inbothcases,however,againwithrandomized column signs. While this gives an abundance of such matrices, as mentioned above, it is not always practicalorpossibletoapplyrandomcolumnsignsinthesamplingprocedure. 1 By“fixed”wemeantoemphasizethatthisprobabilityboundmustoccurforasinglevectorzratherthanforallvectorsz likeonetypicallyseesintherestrictedisometryproperty. 6 FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD Animportantgeneralset-upforstructuredrandomsensingmatricesknowntosatisfytheregularRIP istheframeworkofboundedorthonormalsystems,whichincludesasaspecialcasethesubsampleddis- creteFouriertransformmeasurements(withoutcolumnsignsrandomized).Suchmeasurementsarethe onlynaturalmeasurementspossibleinmanyphysicalsystemswherecompressivesensingisofinterest, suchasinMRI,radar,andastronomy[LDP07,BS07,HS09,BSO08],aswellasinapplicationstopolyno- mialinterpolation[RW12,BDWZ12,RW11]anduncertaintyquantification[HD15]. Inthefollowing,we recallthisset-up(inthediscretesetting),see[Rau10,Sec.4]foradetailedaccountofboundedorthonor- malsystemsandexamples. Definition2.3(Boundedorthonormalsystem). Consideraprobabilitymeasureνonthediscreteset[n] andasystem{r Cn, j [n]}thatisorthonormalwithrespecttoνinthesensethat j ∈ ∈ r (i)r (i)ν δ k j i j,k = i X 1 if j k whereδ = istheKroneckerdeltafunction. Supposefurtherthatthesystemisuniformly j,k =(0 else. bounded:thereexistsaconstantK 1suchthat ≥ supsup r (i) K. (2.4) j | |≤ i [n]j [n] ∈ ∈ ThenthematrixA Cn n whoserowsarer iscalledaboundedorthonormalsystemmatrix. × j ∈ Drawing m indices i ,i ,...,i independently from the orthogonalization measure ν, the sampling 1 2 m matrix A Cm n whose rows are indexed by the (re-normalized) sampled vectors 1r (i ) Cn will ∈ × m j k ∈ have the restricted isometry property with high probability (precisely, with probabqility exceeding 1 − n Clog3(es))providedthenumberofmeasurementssatisfies − m CK2slog2(s)log(n). (2.5) ≥ ThisresultwasfirstshowninthecasewhereνistheuniformmeasurebyRudelsonandVershynin[RV08], foraslightly worsedependenceofm ontheorderof slog2slog(slogn)log(n). Theseresultswere sub- sequentlyextendedtothegeneralboundedorthonormalsystemset-upbyRauhut[Rau10],andthede- pendenceofmwasslightlyimprovedtoslog3slognin[RW13]. Animportantspecialcasewheretheseresultscanbeappliedisthatofincoherencebetweenthemea- surementandsamplingbases. HerethecoherencebetweentwosetsofvectorsΦ {φ }andΨ {ψ }is i j = = givenbyµ sup φ ,ψ . TwoorthonormalbasesΦandΨofCn arecalledincoherentifµ Kn 1/2. = i,j|〈 i j〉| ≤ − Inthiscase,therenormalizedsystemΦ˜ {pnφ Ψ }isanorthonormalsystemwithrespecttotheuni- i ∗ = formmeasure, which is boundedby K. Then the above results imply thatsignals which are sparse in basisΨcanbereconstructedfrominnerproductswithauniformlysubsampledsubsetofbasisΦ.These incoherence-basedguaranteesareastandardcriteriontoensuresignal recovery, such resultshadfirst beenobservedin[CR07]. 2.2. Generalizationtoaweightedsetupandtolocalcoherence. Recently,thecriterionofincoherence hasbeengeneralizedtothecasewhereonlymost oftheinnerproductsbetweensparsitybasisvectors and measurement basis vectors are bounded. If some vectors in the measurement basis yield larger inner products, one can adjust the sampling measure and work with a preconditioned measurement matrix[KW14],andifsomevectorsinthesparsitybasisyieldlargerinnerproducts,onecanincorporate weights into the recovery schemes and work with a weighted sparsity model [RW13]. Our results can accommodateboththesemodifications.Intheremainderofthissection,wewillrecallsomebackground ontheseframeworksfrom[KW14,RW13]andformulateourdefinitionsforthemoregeneralsetup. COMPRESSIVESENSINGWITHREDUNDANTDICTIONARIESANDSTRUCTUREDMEASUREMENTS 7 Considerasetofpositiveweightsω (ω ) .Associatetotheseweightsthenorms j j [N] = ∈ 1/p kxkω,p:=Ã j |xj|pω2j−p! , 0<p≤2, (2.6) X along with the weighted “ℓ -norm", or weighted sparsity: x : ω2, which equivalently is 0 p k kω,0 = j:|xj|>0 j defined as the limit x lim x . We say that a vector x is weighted s-sparse if x : ω,0 p 0 ω,p P ω,0 ω2 s. In linkekwith=this d→efinkitikon, the weighted size of a finite set Λ N, is given bykωk(Λ):= j:|xj|>0 j ≤ ⊂ = ω2; thusavectorisweighted s-sparseifitssupporthasweightedsize atmost s. Whenω 1for Pj∈Λ j j ≥ all j, the weighted sparsity of a vector is at least as large as its unweighted sparsity, so that the class P ofweighted s-sparsevectorsisa strictsubset ofthe s-sparsevectors. Wemakethisassumption in the remainder. Note in particular the special cases x x ω and x x x 2; by ω,1 j j j ω,2 2 j j k k = | | k k =k k = | | Cauchy-Schwarz, it follows that x ps x if x is weighted s-sparse. Indeed, we caqn extend the ω,1 2 P P k k ≤ k k notions of localization factor (1.1) and D-RIP (2.1) to the weighted sparsity setting. It should be clear fromcontextwhichdefinitionwerefertointheremainder. Definition2.4. ForadictionaryD Cn N,weightsω ,ω ,...,ω 1,andasparsitylevels,wedefinethe × 1 2 N ∈ ≥ (weighted)localizationfactoras D Dz η ηdef sup k ∗ kω,1, (2.7) ω,s,D = = ps kDzk2=1,kzkω,0≤s Definition2.5. FixadictionaryD Cn N,weightsω ,ω ,...,ω 1,andmatrix A Cm n. Thematrix × 1 2 N × ∈ ≥ ∈ AsatisfiestheD-ωRIPwithparametersδandsif (1 δ) Dx 2 ADx 2 (1 δ) Dx 2 (2.8) − k k2≤k k2≤ + k k2 forallweighteds-sparsevectorsx CN. ∈ WhenD isanorthonormalmatrix,thisdefinitionreducestothedefinitionωRIPfrom[RW13]. More generally,weightsallowtheflexibilitytoincorporatepriorinformationaboutthesupportsetandallow forweakerassumptions onthedictionaryD. In particular,alargerweight assigned toadictionaryel- ementwillallowforthiselementtohavelargerinnerproductswiththemeasurementbasisvectors. In thisregard,abasisversionofourresultisthefollowingvariantofatheoremfrom[RW13]: Proposition2.6(from[RW13]). Fixaprobabilitymeasureνon[n],sparsitylevels n,andconstant0 < < δ 1.LetD Cn nbeanorthonormalmatrix.LetAbeanorthonormalsystemwithrespecttoνasinDef- × < ∈ inition2.3withrowsdenotedbyr ,andconsiderweightsω ,ω ,...,ω 1suchthatmax r ,d ω . i 1 2 n i i j j ≥ |〈 〉|≤ Constructanm nsubmatrixAofAbysamplingrowsofAaccordingtothemeasureνandrenormalizing × eachrowby 1.Provided m e q m Cδ−2slog3(s)log(n), (2.9) ≥ thenwithprobability1 nclog3s,thesubmatrixAsatisfiestheωRIPwithparameterssandδ. − Anadjustedsamplingdensityforthecasewhencertainvectorsinthemeasurementbasisyieldlarger e innerproductsisobtainedviathelocalcoherenceasdefinedinthefollowing. Definition2.7(Localcoherence,[KW14,RW12]). ThelocalcoherenceofasetΦ {ϕ }k Cn withre- = i i 1⊂ specttoanothersetΨ {ψ }ℓ Cn isthefunctionµloc(Φ,Ψ) Rk definedcoordinate-=wiseby = j j 1⊆ ∈ = µloc(Φ,Ψ) µloc sup ϕ ,ψ , i 1,2,...,k. i = i = |〈 i j〉| = 1 j ℓ ≤ ≤ 8 FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD WhenΦandΨareorthonormalbases,andasubsetofΦisusedforsamplingwhileΨisthebasisin whichsparsityisassumed,themainpointisthatrenormalizingthedifferentvectorsϕ inthesampling i basisbyrespectivefactors 1 doesnotaffectorthogonalityoftherowsofΦ Ψ.Inthisway,thelargerin- µloc ∗ i nerproductscanbereducedinsize.Tocompensateforthisrenormalizationandretaintheorthonormal systempropertyinthesenseofDefinition2.3,onecanthenadjustthesamplingmeasure. Theresulting preconditionedmeasurementsystemwithvariablesamplingdensitywillthenyieldbetterboundsK in Equation(2.5).Thisyieldsthefollowingestimatefortherestrictedisometryproperty. Proposition 2.8 ([KW14, RW12]). Let Φ {ϕ }n and Ψ {ψ }n be orthonormal bases of Cn. As- = j j 1 = k k 1 sume the local coherence of Φ with respect to Ψ=is pointwise bound=ed by the function κ/pn Cn, that ∈ is sup ϕ ,ψ κ .Suppose j k j |〈 〉|≤ 1 k n ≤ ≤ m≥Cδ−2kκk22slog3(s)log(n), (2.10) andchoosem(possiblynotdistinct)indices j [n]i.i.d.fromtheprobabilitymeasureνon[n]givenby ∈ κ2 j ν(j) . = κ 2 k k2 CallthiscollectionofselectedindicesΩ(whichmaypossiblybeamultiset).ConsiderthematrixA Cm n × ∈ withentries A ϕ ,ψ , j Ω,k [n], (2.11) j,k j k =〈 〉 ∈ ∈ andconsiderthediagonalmatrixW diag(w) Cn n withw κ /κ . Thenwithprobabilityatleast × j 2 j = ∈ =k k 1 n clog3(s), the preconditionedmatrix 1 WA has the restrictedisometryproperty withparameters δ − − pm ands. Remark. In case Φ and Ψ are incoherent, or if κ Kn 1/2 uniformly for all j, then local coherence j − = sampling as above reduces to the previous results for incoherent systems: in this case, the associated probabilitymeasureνisuniform,andthepreconditionedmatrixreducesto 1 WA n A. pm = m Our main result on D-RIP for redundant systems and structuredmeasurements (Tqheorem 3.1) im- pliesastrategyforextendinglocalcoherencesamplingtheorytodictionaries, ofwhichTheorem1.2is aspecialcase. Indeed, ifΨ {ψ }, moregenerally, isaParsevalframeandΦ {ϕ }isanorthonormal j i = = matrix,thenrenormalizingthedifferentvectorsϕ in thesampling basis byrespective factors 1 still i µloc i doesnotaffectorthogonalityoftherowsofΦ Ψ. Wehavethefollowingcorollaryofourmainresult;we ∗ willestablishamoregeneralresult,Theorem3.1inthenextsection,fromwhichbothTheorem1.2and Corollary2.9follow. Corollary2.9. Fixasparsitylevels N,andconstant0 δ 1. LetD Cn N beaParsevalframewith × < < < ∈ columns {d ,...,d }, let B with rows {b ,...,b } be an orthonormal basis of Cn, and assume the local 1 N 1 n coherenceofD withrespecttoB ispointwiseboundedbythefunctionκ Cn,thatis sup d ,b κ . j k k ∈ |〈 〉|≤ 1 j N ≤ ≤ κ2 Consider the probabilitymeasure ν on [n] given by ν(k) k along with the diagonal matrixW = κ 2 = k k2 diag(w) Cn n with w κ /κ . Note that the renormalizedsystemΦ 1 WB is an orthonormal ∈ × k =k k2 k = pm systemwithrespecttoν,boundedby κ 2. ConstructB Cm n bysamplingvectorsfromB i.i.d.fromthe × k k ∈ measureν,andconsiderthelocalizationfactorη η oftheframeD asinDefinition1.1, D,s = Aslongas e m≥Cδ−2η2kκk22slog3(sη2)log(N) thenwithprobability1 N log3(2s),thefollowingholdsforeverysignal f: thesolution f♯oftheweighted − − ℓ -analysisproblem 1 1 f♯ argmin D∗f˜ 1 suchthat W(Bf˜ y) 2 ε, (P1,w) = k k kpm − k ≤ f˜ Cn ∈ e COMPRESSIVESENSINGWITHREDUNDANTDICTIONARIESANDSTRUCTUREDMEASUREMENTS 9 withy Bf e fornoisee withweightederror 1 We εsatisfies = + kpm k2≤ D f (D f) f♯ f C ε C k ∗ − ∗ sk1. (2.12) 2 1 2 k − k ≤ + ps HereC,C ,andC areabsoluteconstantsindependentofthedimensionsandthesignal f. 1 2 The resulting bound involves a weighted noise model, which can, in the worst case, introduce an additionalfactorof n max pν(i). Webelieve howeverthatthisnoisemodelisjustanartifactofthe m i prooftechnique,andqthatthestabilityresultsinCorollary2.9shouldholdforthestandardnoisemodel usingthestandardℓ -analysisproblem(P ). IntheimportantcasewhereD isawaveletframeandB is 1 1 theorthonormaldiscreteFouriermatrix,webelievethattotalvariationminimization,likeℓ -analysis, 1 should give stable and robust error guarantees with the standardmeasurement noise model. Indeed, such results were recently obtained for variable density sampling in the case of orthonormal wavelet sparsitybasis[Poo15a],improvingonpreviousboundsfortotalvariationminimization[KW14,NW13b, NW13a,AHPR13]. Generalizingsuchresultstothedictionarysettingisindeedaninterestingdirection forfutureresearch. 3. OURMAIN RESULT ANDITSAPPLICATIONS As mentionedin the previous section, the case of incoherence between the sampling basis andthe sparsitydictionary,asinTheorem1.2,isaspecialcaseofaboundedorthonormalsystem.Thefollowing moretechnicalformulationofourmainresultintheframeworkofsuchsystemscoversbothweighted sparsitymodelsandlocalcoherenceandaswewillsee,impliesTheorem1.2. TheproofofTheorem3.1 ispresentedinSection4. Theorem3.1. Fixa probabilitymeasureνon[N],sparsitylevel s N,andconstant0 δ 1. LetD < < < ∈ Cn N beaParsevalframe. Let A beanorthonormalsystemsmatrixwithrespecttoνwithrowsr asin × i Definition2.3,andconsiderweightsω ,ω ,...,ω 1suchthat 1 2 N ≥ max r ,d ω . (3.1) i j j i |〈 〉|≤ Define the unrecoverableenergyη η as in Definition2.7. Constructan m n submatrix A of A by D,s = × samplingrowsofAaccordingtothemeasureν.Thenaslongas e m Cδ−2sη2log3(sη2)log(N), and ≥ m Cδ−2sη2log(1/γ) (3.2) ≥ thenwithprobability1 γ,thenormalizedsubmatrix 1 AsatisfiestheD-ωRIPwithparameterss and − m δ. q e ProofofTheorem1.2assumingTheorem3.1: LetB,D andsbegivenasinTheorem1.2.Wewillapply Theorem 3.1 with ω 1 for all j, with ν the uniformmeasure on [n], sparsity level 2s, γ N log3(2s), j − = = δ 0.08, andmatrix A pnB. Wefirst note that since B is an orthonormalbasis, thatpnB is an or- = = thonormalsystemsmatrixwithrespecttotheuniformmeasureνasinDefinition2.3. Inaddition,(1.2) implies(3.1)formatrix A pnB withK ω 1. Furthermore,settingγ N log3(2s),(1.3)impliesboth j − = = = = inequalities of (3.2) (adjusting constants appropriately). Thus the assumptions of Theorem 3.1 are in force.Theorem3.1thenguaranteesthatwithprobability1 γ 1 N log3(2s),theuniformlysubsampled − − = − matrix 1(pnB)satisfiestheD-ωRIPwithparameters2s andδandweightsω 1. Asimplecalcula- m j = tionandqthedefinitionoftheD-ωRIP(Definition2.5)showsthatthisimpliestheD-RIPwithparameters 2sandδ.ByTheeorem2.2,(1.4)holdsandthiscompletestheproof. ä 10 FELIXKRAHMER,DEANNANEEDELL,ANDRACHELWARD ProofofCorollary2.9assumingTheorem3.1. Corollary2.9isimpliedbyTheorem3.1becausethepre- conditionedmatrixΦ 1 WB formedbynormalizingtheithrow ofB by κ /κ constitutesanor- = pm k k2 j thonormalsystem with respect to the probability measure ν(i) anduniformweights ω κ . Then j 2 =k k thepreconditionedsampledmatrixΦsatisfies theD-RIPforsparsity2s andparameterδaccording to Theorem3.1withprobability1 γ 1 N log3(2s).ApplyingTheorem2.2tothepreconditionedsampling − − = − matrixΦandParsevalframeD produecestheresultsinCorollary2.9. ä 3.1. Example:HarmonicframewithLmorevectorsthandimensions. Itremainstofindexamplesofa e dictionarywithboundedlocalizationfactorandanassociatedmeasurementsystemforwhichincoher- encecondition(1.2)holds.Ourmainexampleisthatofsamplingasignalthatissparseinanoversampled Fouriersystem,aso-calledharmonicframe[VW04];themeasurementsystemisthestandardbasis. In- deed,onecanseebydirectcalculationthatthestandardbasisisincoherentinthesenseof(1.2)toany set of Fouriervectors, evenof non-integerfrequencies. Wewill now show thatif thenumberof frame vectorsexceedsthedimensiononlybyaconstant,suchadictionarywillalsohaveboundedlocalization factor.Thissetupisasimpleexample,butourresultsapply,anditisnotcoveredbyprevioustheory. Moreprecisely,wefixL NandconsiderN n Lvectorsindimensionn.WeassumethatLissuch ∈ = + thatLs N. ThentheharmonicframeisdefinedviaitsframematrixD (d ),whichresultsfromthe ≤ 4 = jk N N discreteFouriertransformmatrixF (f )bydeletingthelastLrows. Thatis,wehave jk × = d 1 exp(2πijk) (3.3) jk=pn L n L + + for j 1...n andk 1...N. ThecorrespondingGrammatrixsatisfies(D D) n and,for j k,by = = ∗ ii = n L 6= orthogonalityofF, + N N |(D∗D)jk|=¯¯(F∗F)jk−ℓ=Xn+1f¯ℓjfℓk¯¯=¯¯ℓ=Xn+1n+1Lexp(−2nπ+iLℓj)exp(2nπ+iℓLk)¯¯≤ n+LL ¯ ¯ ¯ ¯ Asaconsequence,wehav¯eforz s-sparseand j ¯su¯ppz ¯ ¯ ∉¯ ¯ ¯ |(D∗Dz)[j]|=¯¯k∈sXuppz(D∗D)jkz[k]¯¯≤n+LLkzk1 ¯ ¯ wherewewritez[k]todenotethekthindex¯ofthevectorz.Simi¯larly,for j suppz,weobtain ¯ ¯ ∈ |(D∗Dz)[j]−z[j]|=¯¯k∈supXpz,k6=j(D∗D)jkz[k]¯¯≤ n+LLkzk1. ¯ ¯ Soweobtain,usingthatD isanisometry, ¯ ¯ ∗ ¯ ¯ kDzk22=kD∗Dzk22≥ (D∗Dz)[k] 2≥ (|z[k]|−|(D∗Dz)[k]−z[k]|)2 k suppz k suppz ∈X ¡ ¢ ∈X z[k]2 2 L z z[k] ≥ | | − n Lk k1| | k suppz + ∈X z 2 2L z 2 (1 2Ls) z 2 1 z 2. =k k2−n Lk k1≥ − N k k2≥ 2k k2 + Thatis,forz with Dz 1,onehas z p2.Consequently, 2 2 k k = k k ≤ η sup kD∗Dzk1 sup k(D∗Dz)|(suppz)k1+k(D∗Dz)|(suppz)ck1 = ps = ps kDzk2=1,kzk0≤s kDzk2=1,kzk0≤s ≤ sup kD∗Dzk2+k(D∗Dzp)|(ssuppz)ck1 =1+ sup p1s |(D∗Dz)[j]| kDzk2=1,kzk0≤s kDzk2=1,kzk0≤s j∈(sXuppz)c 1 sup 1 (N s)Lkzk1 1 (N−s)Lkzk2 1 Lp2. ≤ + ps − N ≤ + N ≤ + kDzk2=1,kzk0≤s ä