COMPRESSIVE SENSING WITH REDUNDANT DICTIONARIES AND STRUCTURED MEASUREMENTS

FELIX KRAHMER, DEANNA NEEDELL, AND RACHEL WARD

arXiv:1501.03208v2 [cs.IT] 8 Jun 2015

ABSTRACT. Consider the problem of recovering an unknown signal from undersampled measurements, given the knowledge that the signal has a sparse representation in a specified dictionary $D$. This problem is now understood to be well-posed and efficiently solvable under suitable assumptions on the measurements and dictionary, if the number of measurements scales roughly with the sparsity level. One sufficient condition for such is the $D$-restricted isometry property ($D$-RIP), which asks that the sampling matrix approximately preserve the norm of all signals which are sufficiently sparse in $D$. While many classes of random matrices are known to satisfy such conditions, such matrices are not representative of the structural constraints imposed by practical sensing systems. We close this gap in the theory by demonstrating that one can subsample a fixed orthogonal matrix in such a way that the $D$-RIP will hold, provided this basis is sufficiently incoherent with the sparsifying dictionary $D$. We also extend this analysis to allow for weighted sparse expansions. Consequently, we arrive at compressive sensing recovery guarantees for structured measurements and redundant dictionaries, opening the door to a wide array of practical applications.

1. INTRODUCTION

1.1. Compressive Sensing. The compressive sensing paradigm, as first introduced by Candès and Tao [CT05b] and Donoho [Don06], is based on using available degrees of freedom in a sensing mechanism to tune the measurement system so as to allow for efficient recovery of a particular type of signal or image of interest from as few measurements as possible. A model assumption that allows for such signal recovery is sparsity: the signal can be well-approximated by just a few elements of a given representation system.

Often, a near-optimal strategy is to choose the measurements completely at random, for example following a Gaussian distribution [RV08]. Typically, however, some additional structure is imposed by the application at hand, and randomness can only be infused in the remaining degrees of freedom.
For example, magnetic resonance imaging (MRI) is known to be well modeled by inner products with Fourier basis vectors. This structure cannot be changed, and the only aspect that can be decided at random is which Fourier basis vectors to select.

An important difference between completely random measurement systems and structured random measurement systems is in the aspect of universality. Gaussian measurement systems, among many other systems with minimal imposed structure, are oblivious to the basis in which the underlying signal is sparse, and achieve equal reconstruction quality for all different orthonormal basis representations. For structured measurement systems, this is, in general, no longer the case. When the measurements are uniformly subsampled from an orthonormal basis such as the Fourier basis, one requires, for example, that the measurement basis and the sparsity basis are incoherent; the inner products between vectors from the two bases are small. If the two bases are not incoherent, more refined concepts of incoherence are required [KW14, AHPR13]. An important example is that of the Fourier measurement basis and a wavelet sparsity basis. Since both contain the constant vector, they are maximally coherent.

Date: June 9, 2015.

1.2. Motivation. All of the above mentioned works, however, exclusively cover the case of sparsity in orthonormal basis representations. On the other hand, there are a vast number of applications in which sparsity is expressed not in terms of a basis but in terms of a redundant, often highly overcomplete, dictionary. Specifically, if $f \in \mathbb{C}^n$ is the signal of interest to be recovered, then one expresses $f = Dx$, where $D \in \mathbb{C}^{n \times N}$ is an overcomplete dictionary and $x \in \mathbb{C}^N$ is a sparse (or nearly sparse) coefficient vector. Redundancy is widespread in practice, either because no sparsifying orthonormal basis exists for the signal class of interest, or because the redundancy itself is useful and allows for a significantly larger, richer class of signals which can be sparsely represented in the resulting dictionary.
For example, it has been well documented that overcompleteness is the key to a drastic reduction in artifacts and recovery error in the denoising framework [SED04, SFM07].

In the compressive sensing framework, results using overcomplete dictionaries are motivated by the broad array of tight (and often Parseval) frames appearing in practical applications. For example, if one assumes sparsity with respect to the Discrete Fourier Transform (DFT), this is implicitly assuming that the signal is well-represented by frequencies lying along the lattice of the DFT. To allow for more flexibility in this rigid assumption, one instead may employ the oversampled DFT frame, containing frequencies on a much finer grid, or even over intervals of varying widths. Gabor frames, which are often highly redundant, are used in imaging as well as radar and sonar applications [Mal99]. Many of the most widely-used frames in imaging applications such as undecimated wavelet frames [Dut89, SED04], curvelets [CD04], shearlets [LLKW05, ELL08], framelets [CCS08], and many others, are overcomplete with highly correlated columns.

1.3. Related Work. Due to the abundance of relevant applications, a number of works have studied compressive sensing for overcomplete frames. The first work on this topic aimed to recover the coefficient vector $x$ directly, and thus required strong incoherence assumptions on the dictionary $D$ [RSV08]. More recently, it was noted that if one instead aims to recover $f$ rather than $x$, recovery guarantees can be obtained under weaker assumptions. Namely, one only needs that the measurement matrix $A$ respects the norms of signals which are sparse in the dictionary $D$. To quantify this, Candès et al. [CENR10] define the $D$-restricted isometry property ($D$-RIP in short, see Definition 2.1 below). For measurement matrices that have this property, a number of algorithms have been shown to guarantee recovery under certain assumptions. Optimization approaches such as $\ell_1$-analysis [EMR07, CENR10, GNE+14, NDEG13, LML12, RK13, Gir14] and greedy approaches [DNW12, GNE+14, GN13, PE13, GN14] have been studied.

This paper establishes the $D$-RIP for structured random measurements formed by subsampling orthonormal bases, allowing for these types of recovery results to be utilized in more realistic settings. To date, most random matrix constructions known to yield $D$-RIP matrices with high probability are random matrices with a certain concentration property. As shown in [CENR10], such a property implies that for arbitrary dictionary $D$, one obtains the $D$-RIP with high probability. This can be interpreted as a dictionary version of the universality property discussed above. It has now been shown that several classes of random matrices satisfy this property as well as subsampled structured matrices, after applying random column signs [DG03, KW11]. The matrices in both of these cases are motivated by application scenarios, but typically in applications they appear without the randomized column signs. In many cases one is not able to apply column signs in practice; for example in cases where the measurements are fixed such as in MRI, one has no choice but to use unsigned Fourier samples and cannot pre-process the data to incorporate column signs. Without these signs however, such measurement ensembles will not work for arbitrary dictionaries in general. This is closely related to the underlying RIP matrix constructions not being universal. For example, it is clear that randomly subsampled Fourier measurements will fail for the oversampled Fourier dictionary (for reasons similar to the basis case). In this work, we address this issue, deriving recovery guarantees that take into account the dictionary. Similar to the basis case, our analysis will be coherence based. A similar approach has also been taken by Poon in [Poo15b] (completed after the first version of our paper), for an infinite dimensional version of the problem.

1.4. Contribution. Our main result shows that a wide class of orthogonal matrices having uniformly bounded entries can be subsampled to obtain a matrix that has the $D$-RIP and hence yields recovery guarantees for sparse recovery in the setting of redundant dictionaries. As indicated above, our technical estimates below will imply such guarantees for various algorithms.
As an example we focus on the method of $\ell_1$-analysis, for which the first $D$-RIP based guarantees were available [CENR10]. Our technical estimates will also provide more general guarantees for weighted $\ell_1$-analysis minimization (a weighted version of $(P_1)$, see $(P_{1,w})$ in Corollary 2.9 for details) in case one has prior knowledge of the underlying sparsity support.

Recall that the method of $\ell_1$-analysis consists of estimating a signal $f$ from noisy measurements $y = Af + e$ by solving the convex minimization problem
\[ f^\sharp = \operatorname*{argmin}_{\tilde f \in \mathbb{C}^n} \|D^* \tilde f\|_1 \quad \text{such that} \quad \|A\tilde f - y\|_2 \le \varepsilon, \tag{$P_1$} \]
where $\varepsilon$ is the noise level, that is, $\|e\|_2 \le \varepsilon$. The $\ell_1$-analysis method (like alternative approaches) assumes that for the signal $f = Dx$, not only is the underlying (synthesis) coefficient sequence $x$ sparse (typically unknown and hard to obtain), but also the analysis coefficients $D^* f$ are nearly sparse, i.e., dominated by a few large entries. We refer the reader to Theorem 2.2 below for the precise formulation of the resulting recovery guarantees (as derived in [CENR10]). The assumption has been observed empirically for many dictionaries used in practice such as the Gabor frame, undecimated wavelets, curvelets, etc. (see, e.g., [CENR10] for a detailed description of such frames) and is also key for a number of thresholding approaches to signal denoising [CD04, KL07, ELL08].

A related and slightly stronger signal model, in which the analysis vector $D^* f$ is sparse or nearly sparse, has been considered independently from coefficient sparsity (e.g., [NDEG13]), and is commonly called the co-sparsity model.

The results in this paper need a similar, but somewhat weaker assumption to hold for all signals corresponding to sparse synthesis coefficients $x$. Namely, one needs to control the localization factor as we now introduce.

Definition 1.1. For a dictionary $D \in \mathbb{C}^{n \times N}$ and a sparsity level $s$, we define the localization factor as
\[ \eta_{s,D} = \eta \overset{\mathrm{def}}{=} \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \frac{\|D^* D z\|_1}{\sqrt{s}}. \tag{1.1} \]

In the following we will mainly consider dictionaries which form Parseval frames, that is, $D^*$ is an isometry. Then the term localization factor is appropriate because this quantity can be viewed as the factor by which sparsity is preserved under the Gram matrix map $D^* D$, compared to the case where $D$ is an orthonormal basis and $D^* D = I_n$ and $\eta \equiv 1$ by Cauchy-Schwarz. For a general family of such frames parameterized by the redundancy $N/n$, $\eta$ will increase with the redundancy; families of dictionaries where such growth is relatively slow will be of interest. For example, new results on the construction of unit-norm tight frames give a constructive method to generate spectral tetris frames [CFM+11], whose Gram matrices have at most $2\lceil N/n \rceil + 6$ non-zeros per row, guaranteeing that $\eta$ is proportional only to the redundancy factor $N/n$. In fact, one can show that these are the sparsest frames possible [CHKK11], suggesting that families of tight frames with localization factor depending linearly on the redundancy factor should be essentially optimal for $\ell_1$-analysis reconstruction. We pose as a problem for subsequent work to obtain such estimates for dictionaries of practical interest, discussing a few examples in Section 3. First, to illustrate that our results go strictly beyond existing theory, we show that harmonic frames (frames constructed by removing the high frequency rows of the DFT, see (3.3)) with small redundancy indeed have a bounded localization factor. To our knowledge, this important case is not covered by existing theories. In addition, we also bound the localization factor for redundant Haar wavelet frames. We expect, however, that it will be difficult to efficiently compute the localization factor of an arbitrary dictionary.

For bounded localization factors, we prove the following theorem.

Theorem 1.2. Fix a sparsity level $s < N$. Let $D \in \mathbb{C}^{n \times N}$ be a Parseval frame (i.e., $DD^*$ is the identity) with columns $\{d_1, \dots, d_N\}$, and let $B = \{b_1, \dots, b_n\}$ be an orthonormal basis of $\mathbb{C}^n$ which is incoherent to $D$ in the sense that
\[ \sup_{i \in [n]} \sup_{j \in [N]} |\langle b_i, d_j \rangle| \le K n^{-1/2} \tag{1.2} \]
for some constant $K \ge 1$.

Consider the (unweighted) localization factor $\eta = \eta_{s,D}$ as in Definition 1.1. Construct $\tilde B \in \mathbb{C}^{m \times n}$ by sampling row vectors from $B$ i.i.d. uniformly at random. Provided
\[ m \ge C s K^2 \eta^2 \log^3(s\eta^2) \log(N), \tag{1.3} \]
then with probability $1 - N^{-\log^3(2s)}$, $\sqrt{\tfrac{n}{m}}\,\tilde B$ exhibits uniform recovery guarantees for $\ell_1$-analysis. That is, for every signal $f$, the solution $f^\sharp$ of the minimization problem $(P_1)$ with $y = \sqrt{\tfrac{n}{m}}\,\tilde B f + e$ for noise $e$ with $\|e\|_2 \le \varepsilon$ satisfies
\[ \|f^\sharp - f\|_2 \le C_1 \varepsilon + C_2 \frac{\|D^* f - (D^* f)_s\|_1}{\sqrt{s}}. \tag{1.4} \]
Here, $C$, $C_1$, and $C_2$ are absolute constants independent of the dimensions and the signal $f$.

Above and throughout, $(u)_s$ denotes the best $s$-sparse approximation to a signal $u$, that is, $(u)_s \overset{\mathrm{def}}{=} \operatorname*{argmin}_{\|z\|_0 \le s} \|u - z\|_2$.

Remarks.

1. Our stability result (1.4) for $\ell_1$-analysis guarantees recovery for the particular $f = Dz$ only up to the scaled $\ell_1$ norm of the tail of $D^* D z$. In fact, the quantity $E_f^* = \frac{\|D^* f - (D^* f)_s\|_1}{\sqrt{s}}$, referred to as the unrecoverable energy of the signal $f$ in $D$ [CENR10], is closely related to the localization factor $\eta_s$:
\[ \eta_s = \sup_{f = Dz:\ \|f\|_2 = 1,\ \|z\|_0 \le s} \frac{\|D^* f\|_1}{\sqrt{s}} \le \sup_{f = Dz:\ \|f\|_2 = 1,\ \|z\|_0 \le s} E_f^* + 1. \]

2. As mentioned, the proof of this theorem will proceed via the $D$-RIP, so our analysis yields similar guarantees for other recovery methods as well.

3. It is interesting to remark on the role of incoherence in this setting. While prior work in compressed sensing has required the measurement matrix $\tilde B$ itself be incoherent, we now ask instead for incoherence between the basis from which measurements are selected and the dictionary $D$, rather than incoherence within $D$. Of course, note that the coherence of the dictionary $D$ itself impacts the localization factor $\eta_{s,D}$. We will see later in Theorem 3.1 that the incoherence requirement between the basis and dictionary can be weakened even further by using a weighted approach.

4. The assumption that $D$ is a Parseval frame is not necessary but made for convenience of presentation. Several results have been shown for frames which are not tight, e.g., [LML12, RK13, Gir14]. Indeed, any frame $D \in \mathbb{C}^{n \times N}$ with linearly independent rows is such that
\[ \tilde D := (DD^*)^{-1/2} D \]
is a tight frame.
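The normalization $\tilde D = (DD^*)^{-1/2} D$ from Remark 4 can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper name `parseval_normalize`, the random complex test matrix, and the sizes $n = 4$, $N = 7$ are illustrative assumptions. It forms the inverse square root of the frame operator through an eigendecomposition and verifies that the result satisfies $\tilde D \tilde D^* = I_n$ up to rounding.

```python
import numpy as np

def parseval_normalize(D):
    """Return (D D^*)^{-1/2} D for a matrix D with linearly independent rows.

    Hypothetical helper for illustration; assumes D D^* is positive definite.
    """
    S = D @ D.conj().T                       # frame operator D D^* (Hermitian)
    evals, evecs = np.linalg.eigh(S)         # S = V diag(evals) V^*
    # Scale column j of V by evals[j]^{-1/2}, then multiply by V^*.
    S_inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.conj().T
    return S_inv_sqrt @ D

rng = np.random.default_rng(0)
n, N = 4, 7                                   # illustrative sizes (assumption)
D = rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))
D_tilde = parseval_normalize(D)

# Distance of D_tilde D_tilde^* from the identity; should be at rounding level.
gram_error = np.linalg.norm(D_tilde @ D_tilde.conj().T - np.eye(n))
print(gram_error)
```

The eigendecomposition route is chosen here only because the frame operator is Hermitian and small; any other way of computing the inverse matrix square root would serve the same purpose.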
As observed in the recent paper [Fou15], in this case one can use for sampling the adapted matrix $\tilde A = \tilde B (DD^*)^{-1/2}$, as measuring the signal $f = Dz$ through $y = \tilde A f + e$ is equivalent to measuring the signal $\tilde f = \tilde D z$ through $y = \tilde B \tilde f + e$. Working through this extension poses an interesting direction for future work.

1.5. Organization. The rest of the paper is organized as follows. We introduce some notation and review some technical background in Section 2 before presenting a technical version of our main result in Section 3, which demonstrates that bases which are incoherent with a dictionary $D$ can be randomly subsampled to obtain $D$-RIP matrices. In that section, we also discuss some relevant applications and the implications of our results, including the regime where the sampling basis is coherent to the dictionary. The proof of our main theorem is presented in the final section.

2. NOTATION AND TECHNICAL BACKGROUND

Throughout the paper, we write $C, C', C_1, C_2, \dots$ to denote absolute constants; their values can change between different occurrences. We write $I_n$ to denote the $n \times n$ identity matrix and we denote by $[n]$ the set $\{1, 2, \dots, n\}$. For a vector $u$, we denote the $j$th entry by $u[j]$ or $u(j)$ or $u_j$, depending on which notation is the most clear in any given context; its restriction to only those entries indexed by a subset $\Lambda$ is denoted by $u_\Lambda$. These notations should not be confused with the notation $(u)_s$, which we use to denote the best $s$-sparse approximation to $u$, that is, $(u)_s \overset{\mathrm{def}}{=} \operatorname*{arginf}_{\|z\|_0 \le s} \|u - z\|_2$. Similarly for a matrix $A$, $A_j$ is the $j$th column and $A_\Lambda$ is the submatrix consisting of the columns with indices in $\Lambda$.

2.1. The unweighted case with incoherence. Recall that a dictionary $D \in \mathbb{C}^{n \times N}$ is a Parseval frame if $DD^* = I_n$ and that a vector $x$ is $s$-sparse if $\|x\|_0 := |\operatorname{supp}(x)| \le s$. Then the restricted isometry property with respect to the dictionary $D$ ($D$-RIP) is defined as follows.

Definition 2.1 ([CENR10]). Fix a dictionary $D \in \mathbb{C}^{n \times N}$ and matrix $A \in \mathbb{C}^{m \times n}$. The matrix $A$ satisfies the $D$-RIP with parameters $\delta$ and $s$ if
\[ (1 - \delta) \|Dx\|_2^2 \le \|ADx\|_2^2 \le (1 + \delta) \|Dx\|_2^2 \tag{2.1} \]
for all $s$-sparse vectors $x \in \mathbb{C}^N$.

Note that when $D$ is the identity, this definition reduces to the standard definition of the restricted isometry property [CT05a]. Under such an assumption on the measurement matrix, the following result bounds the reconstruction error for the $\ell_1$-analysis method.

Theorem 2.2 ([CENR10]). Let $D$ be a tight frame, $\varepsilon > 0$, and consider a matrix $A$ that has the $D$-RIP with parameters $2s$ and $\delta < 0.08$. Then for every signal $f \in \mathbb{C}^n$, the reconstruction $f^\sharp$ obtained from noisy measurements $y = Af + e$, $\|e\|_2 \le \varepsilon$, via the $\ell_1$-analysis problem $(P_1)$ above satisfies
\[ \|f - f^\sharp\|_2 \le C_1 \varepsilon + C_2 \frac{\|D^* f - (D^* f)_s\|_1}{\sqrt{s}}, \tag{2.2} \]
where $C_1$ and $C_2$ are absolute constants.

Thus as indicated in the introduction, one obtains good recovery for signals $f$ whose analysis coefficients $D^* f$ have a suitable decay.

Both the RIP and the $D$-RIP are closely related to the Johnson-Lindenstrauss lemma [HV11, AC09, BDDW08]. Recall that the classical variant of the lemma states that for any $\varepsilon \in (0, 0.5)$ and points $x_1, \dots, x_d \in \mathbb{R}^n$, there exists a Lipschitz function $f : \mathbb{R}^n \to \mathbb{R}^m$ for some $m = O(\varepsilon^{-2} \log d)$ such that
\[ (1 - \varepsilon) \|x_i - x_j\|_2^2 \le \|f(x_i) - f(x_j)\|_2^2 \le (1 + \varepsilon) \|x_i - x_j\|_2^2 \tag{2.3} \]
for all $i, j \in \{1, \dots, d\}$. Recent improvements have been made to this statement which in particular show that the map $f$ can be taken as a random linear mapping which satisfies (2.3) with high probability (see, e.g., [Ach03] for an account of such improvements). Indeed, any matrix $A \in \mathbb{C}^{m \times n}$ which for a fixed¹ vector $z \in \mathbb{C}^n$ satisfies
\[ \mathbb{P}\big( (1 - \delta)\|z\|_2^2 \le \|Az\|_2^2 \le (1 + \delta)\|z\|_2^2 \big) \ge 1 - C e^{-cm} \]
will satisfy the $D$-RIP with high probability as long as $m$ is at least on the order of $s \log(n/s)$ [CENR10]. From this, any matrix satisfying the Johnson-Lindenstrauss lemma will also satisfy the $D$-RIP (see [BDDW08] for the proof of the RIP for such matrices, which directly carries over to the $D$-RIP). Random matrices known to have this property include matrices with independent subgaussian entries (such as Gaussian or Bernoulli matrices), see for example [DG03].
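The single-vector concentration property above can be observed empirically. The sketch below is an illustration under stated assumptions rather than a result from the paper: it uses a real Gaussian matrix with i.i.d. $N(0, 1/m)$ entries, a fixed random unit vector $z$, and the hypothetical function name `concentration_failure_rate`; the values of $m$, $n$, $\delta$ and the number of trials are arbitrary choices. It estimates how often $\|Az\|_2^2$ falls outside $[(1-\delta)\|z\|_2^2, (1+\delta)\|z\|_2^2]$.

```python
import numpy as np

def concentration_failure_rate(m, n, delta, trials, seed=0):
    """Empirical frequency with which ||Az||^2 leaves [(1-delta), (1+delta)] ||z||^2.

    A has i.i.d. N(0, 1/m) entries and is redrawn in every trial; z is one
    fixed unit vector (illustrative assumptions, not the paper's setup).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    z /= np.linalg.norm(z)                    # fixed unit-norm test vector
    failures = 0
    for _ in range(trials):
        A = rng.standard_normal((m, n)) / np.sqrt(m)
        ratio = np.linalg.norm(A @ z) ** 2    # equals ||Az||^2 / ||z||^2
        if not (1 - delta <= ratio <= 1 + delta):
            failures += 1
    return failures / trials

rate_small_m = concentration_failure_rate(m=10, n=50, delta=0.5, trials=500)
rate_large_m = concentration_failure_rate(m=200, n=50, delta=0.5, trials=500)
print(rate_small_m, rate_large_m)
```

Increasing $m$ should drive the failure frequency toward zero, in line with the exponential bound; the experiment says nothing about uniformity over all sparse vectors, which is what the $D$-RIP additionally requires.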
Moreover, it is shown in [KW11] that any matrix that satisfies the classical RIP will satisfy the Johnson-Lindenstrauss lemma and thus the $D$-RIP with high probability after randomizing the signs of the columns. The latter construction allows for structured random matrices with fast multiplication properties such as randomly subsampled Fourier matrices (in combination with the results from [RV08]) and matrices representing subsampled random convolutions (in combination with the results from [RRT12, KMR14]); in both cases, however, again with randomized column signs. While this gives an abundance of such matrices, as mentioned above, it is not always practical or possible to apply random column signs in the sampling procedure.

¹ By "fixed" we mean to emphasize that this probability bound must occur for a single vector $z$ rather than for all vectors $z$ like one typically sees in the restricted isometry property.

An important general set-up for structured random sensing matrices known to satisfy the regular RIP is the framework of bounded orthonormal systems, which includes as a special case the subsampled discrete Fourier transform measurements (without column signs randomized). Such measurements are the only natural measurements possible in many physical systems where compressive sensing is of interest, such as in MRI, radar, and astronomy [LDP07, BS07, HS09, BSO08], as well as in applications to polynomial interpolation [RW12, BDWZ12, RW11] and uncertainty quantification [HD15]. In the following, we recall this set-up (in the discrete setting); see [Rau10, Sec. 4] for a detailed account of bounded orthonormal systems and examples.

Definition 2.3 (Bounded orthonormal system). Consider a probability measure $\nu$ on the discrete set $[n]$ and a system $\{r_j \in \mathbb{C}^n,\ j \in [n]\}$ that is orthonormal with respect to $\nu$ in the sense that
\[ \sum_i r_k(i)\, \overline{r_j(i)}\, \nu_i = \delta_{j,k}, \]
where $\delta_{j,k} = \begin{cases} 1 & \text{if } j = k \\ 0 & \text{else} \end{cases}$ is the Kronecker delta function. Suppose further that the system is uniformly bounded: there exists a constant $K \ge 1$ such that
\[ \sup_{i \in [n]} \sup_{j \in [n]} |r_j(i)| \le K. \tag{2.4} \]
Then the matrix $A \in \mathbb{C}^{n \times n}$ whose rows are $r_j$ is called a bounded orthonormal system matrix.

Drawing $m$ indices $i_1, i_2, \dots, i_m$ independently from the orthogonalization measure $\nu$, the sampling matrix $\tilde A \in \mathbb{C}^{m \times n}$ whose rows are indexed by the (re-normalized) sampled vectors $\sqrt{\tfrac{1}{m}}\,\big(r_j(i_k)\big)_{j \in [n]} \in \mathbb{C}^n$ will have the restricted isometry property with high probability (precisely, with probability exceeding $1 - n^{-C \log^3(s)}$) provided the number of measurements satisfies
\[ m \ge C K^2 s \log^2(s) \log(n). \tag{2.5} \]
This result was first shown in the case where $\nu$ is the uniform measure by Rudelson and Vershynin [RV08], for a slightly worse dependence of $m$ on the order of $s \log^2 s \log(s \log n) \log(n)$. These results were subsequently extended to the general bounded orthonormal system set-up by Rauhut [Rau10], and the dependence of $m$ was slightly improved to $s \log^3 s \log n$ in [RW13].

An important special case where these results can be applied is that of incoherence between the measurement and sparsity bases. Here the coherence between two sets of vectors $\Phi = \{\varphi_i\}$ and $\Psi = \{\psi_j\}$ is given by $\mu = \sup_{i,j} |\langle \varphi_i, \psi_j \rangle|$. Two orthonormal bases $\Phi$ and $\Psi$ of $\mathbb{C}^n$ are called incoherent if $\mu \le K n^{-1/2}$. In this case, the renormalized system $\tilde\Phi = \{\sqrt{n}\, \varphi_i \Psi^*\}$ is an orthonormal system with respect to the uniform measure, which is bounded by $K$. Then the above results imply that signals which are sparse in basis $\Psi$ can be reconstructed from inner products with a uniformly subsampled subset of basis $\Phi$. These incoherence-based guarantees are a standard criterion to ensure signal recovery; such results had first been observed in [CR07].

2.2. Generalization to a weighted setup and to local coherence. Recently, the criterion of incoherence has been generalized to the case where only most of the inner products between sparsity basis vectors and measurement basis vectors are bounded. If some vectors in the measurement basis yield larger inner products, one can adjust the sampling measure and work with a preconditioned measurement matrix [KW14], and if some vectors in the sparsity basis yield larger inner products, one can incorporate weights into the recovery schemes and work with a weighted sparsity model [RW13].
Our results can accommodate both these modifications. In the remainder of this section, we will recall some background on these frameworks from [KW14, RW13] and formulate our definitions for the more general setup.

Consider a set of positive weights $\omega = (\omega_j)_{j \in [N]}$. Associate to these weights the norms
\[ \|x\|_{\omega,p} := \Big( \sum_j |x_j|^p \omega_j^{2-p} \Big)^{1/p}, \qquad 0 < p \le 2, \tag{2.6} \]
along with the weighted "$\ell_0$-norm", or weighted sparsity: $\|x\|_{\omega,0} := \sum_{j : |x_j| > 0} \omega_j^2$, which equivalently is defined as the limit $\|x\|_{\omega,0} = \lim_{p \to 0} \|x\|_{\omega,p}^p$. We say that a vector $x$ is weighted $s$-sparse if $\|x\|_{\omega,0} := \sum_{j : |x_j| > 0} \omega_j^2 \le s$. In line with this definition, the weighted size of a finite set $\Lambda \subset \mathbb{N}$ is given by $\omega(\Lambda) := \sum_{j \in \Lambda} \omega_j^2$; thus a vector is weighted $s$-sparse if its support has weighted size at most $s$. When $\omega_j \ge 1$ for all $j$, the weighted sparsity of a vector is at least as large as its unweighted sparsity, so that the class of weighted $s$-sparse vectors is a strict subset of the $s$-sparse vectors. We make this assumption in the remainder. Note in particular the special cases $\|x\|_{\omega,1} = \sum_j |x_j| \omega_j$ and $\|x\|_{\omega,2} = \|x\|_2 = \sqrt{\sum_j |x_j|^2}$; by Cauchy-Schwarz, it follows that $\|x\|_{\omega,1} \le \sqrt{s}\, \|x\|_2$ if $x$ is weighted $s$-sparse. Indeed, we can extend the notions of localization factor (1.1) and $D$-RIP (2.1) to the weighted sparsity setting. It should be clear from context which definition we refer to in the remainder.

Definition 2.4. For a dictionary $D \in \mathbb{C}^{n \times N}$, weights $\omega_1, \omega_2, \dots, \omega_N \ge 1$, and a sparsity level $s$, we define the (weighted) localization factor as
\[ \eta_{\omega,s,D} = \eta \overset{\mathrm{def}}{=} \sup_{\|Dz\|_2 = 1,\ \|z\|_{\omega,0} \le s} \frac{\|D^* D z\|_{\omega,1}}{\sqrt{s}}. \tag{2.7} \]

Definition 2.5. Fix a dictionary $D \in \mathbb{C}^{n \times N}$, weights $\omega_1, \omega_2, \dots, \omega_N \ge 1$, and matrix $A \in \mathbb{C}^{m \times n}$. The matrix $A$ satisfies the $D$-$\omega$RIP with parameters $\delta$ and $s$ if
\[ (1 - \delta) \|Dx\|_2^2 \le \|ADx\|_2^2 \le (1 + \delta) \|Dx\|_2^2 \tag{2.8} \]
for all weighted $s$-sparse vectors $x \in \mathbb{C}^N$.

When $D$ is an orthonormal matrix, this definition reduces to the definition of the $\omega$RIP from [RW13].
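The weighted quantities in (2.6) are easy to compute directly. The following minimal sketch evaluates the weighted norms and the weighted sparsity for a small example and checks the Cauchy-Schwarz consequence $\|x\|_{\omega,1} \le \sqrt{s}\,\|x\|_2$; the function names `weighted_norm` and `weighted_sparsity` and the example vectors are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def weighted_norm(x, w, p):
    """Weighted norm ||x||_{w,p} = (sum_j |x_j|^p w_j^(2-p))^(1/p), for 0 < p <= 2."""
    x = np.asarray(x)
    w = np.asarray(w, dtype=float)
    return float(np.sum(np.abs(x) ** p * w ** (2 - p)) ** (1.0 / p))

def weighted_sparsity(x, w):
    """Weighted sparsity ||x||_{w,0}: sum of w_j^2 over the support of x."""
    x = np.asarray(x)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w[np.abs(x) > 0] ** 2))

x = np.array([3.0, 0.0, -4.0, 0.0])          # example vector (assumption)
w = np.array([1.0, 2.0, 2.0, 1.0])           # example weights, all >= 1

s = weighted_sparsity(x, w)                  # 1^2 + 2^2 = 5
l1 = weighted_norm(x, w, 1)                  # 3*1 + 4*2 = 11
l2 = weighted_norm(x, w, 2)                  # sqrt(9 + 16) = 5

# Cauchy-Schwarz consequence for a weighted s-sparse vector.
print(l1 <= np.sqrt(s) * l2)
```

In this example the unweighted sparsity is 2 while the weighted sparsity is 5, which illustrates why, for weights at least 1, a weighted $s$-sparse vector is automatically $s$-sparse.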
More generally, weights allow the flexibility to incorporate prior information about the support set and allow for weaker assumptions on the dictionary $D$. In particular, a larger weight assigned to a dictionary element will allow for this element to have larger inner products with the measurement basis vectors. In this regard, a basis version of our result is the following variant of a theorem from [RW13]:

Proposition 2.6 (from [RW13]). Fix a probability measure $\nu$ on $[n]$, sparsity level $s < n$, and constant $0 < \delta < 1$. Let $D \in \mathbb{C}^{n \times n}$ be an orthonormal matrix. Let $A$ be an orthonormal system with respect to $\nu$ as in Definition 2.3 with rows denoted by $r_i$, and consider weights $\omega_1, \omega_2, \dots, \omega_n \ge 1$ such that $\max_i |\langle r_i, d_j \rangle| \le \omega_j$. Construct an $m \times n$ submatrix $\tilde A$ of $A$ by sampling rows of $A$ according to the measure $\nu$ and renormalizing each row by $\sqrt{\tfrac{1}{m}}$. Provided
\[ m \ge C \delta^{-2} s \log^3(s) \log(n), \tag{2.9} \]
then with probability $1 - n^{-c \log^3 s}$, the submatrix $\tilde A$ satisfies the $\omega$RIP with parameters $s$ and $\delta$.

An adjusted sampling density for the case when certain vectors in the measurement basis yield larger inner products is obtained via the local coherence as defined in the following.

Definition 2.7 (Local coherence, [KW14, RW12]). The local coherence of a set $\Phi = \{\varphi_i\}_{i=1}^k \subset \mathbb{C}^n$ with respect to another set $\Psi = \{\psi_j\}_{j=1}^\ell \subseteq \mathbb{C}^n$ is the function $\mu^{\mathrm{loc}}(\Phi, \Psi) \in \mathbb{R}^k$ defined coordinate-wise by
\[ \mu_i^{\mathrm{loc}}(\Phi, \Psi) = \mu_i^{\mathrm{loc}} = \sup_{1 \le j \le \ell} |\langle \varphi_i, \psi_j \rangle|, \qquad i = 1, 2, \dots, k. \]

When $\Phi$ and $\Psi$ are orthonormal bases, and a subset of $\Phi$ is used for sampling while $\Psi$ is the basis in which sparsity is assumed, the main point is that renormalizing the different vectors $\varphi_i$ in the sampling basis by respective factors $\frac{1}{\mu_i^{\mathrm{loc}}}$ does not affect orthogonality of the rows of $\Phi^* \Psi$. In this way, the larger inner products can be reduced in size. To compensate for this renormalization and retain the orthonormal system property in the sense of Definition 2.3, one can then adjust the sampling measure. The resulting preconditioned measurement system with variable sampling density will then yield better bounds $K$ in Equation (2.5). This yields the following estimate for the restricted isometry property.

Proposition 2.8 ([KW14, RW12]).
Let $\Phi = \{\varphi_j\}_{j=1}^n$ and $\Psi = \{\psi_k\}_{k=1}^n$ be orthonormal bases of $\mathbb{C}^n$. Assume the local coherence of $\Phi$ with respect to $\Psi$ is pointwise bounded by the function $\kappa \in \mathbb{C}^n$, that is $\sup_{1 \le k \le n} |\langle \varphi_j, \psi_k \rangle| \le \kappa_j$. Suppose
\[ m \ge C \delta^{-2} \|\kappa\|_2^2\, s \log^3(s) \log(n), \tag{2.10} \]
and choose $m$ (possibly not distinct) indices $j \in [n]$ i.i.d. from the probability measure $\nu$ on $[n]$ given by
\[ \nu(j) = \frac{\kappa_j^2}{\|\kappa\|_2^2}. \]
Call this collection of selected indices $\Omega$ (which may possibly be a multiset). Consider the matrix $A \in \mathbb{C}^{m \times n}$ with entries
\[ A_{j,k} = \langle \varphi_j, \psi_k \rangle, \qquad j \in \Omega,\ k \in [n], \tag{2.11} \]
and consider the diagonal matrix $W = \operatorname{diag}(w) \in \mathbb{C}^{n \times n}$ with $w_j = \|\kappa\|_2 / \kappa_j$. Then with probability at least $1 - n^{-c \log^3(s)}$, the preconditioned matrix $\frac{1}{\sqrt{m}} W A$ has the restricted isometry property with parameters $\delta$ and $s$.

Remark. In case $\Phi$ and $\Psi$ are incoherent, or if $\kappa_j = K n^{-1/2}$ uniformly for all $j$, then local coherence sampling as above reduces to the previous results for incoherent systems: in this case, the associated probability measure $\nu$ is uniform, and the preconditioned matrix reduces to $\frac{1}{\sqrt{m}} W A = \sqrt{\tfrac{n}{m}}\, A$.

Our main result on $D$-RIP for redundant systems and structured measurements (Theorem 3.1) implies a strategy for extending local coherence sampling theory to dictionaries, of which Theorem 1.2 is a special case. Indeed, if $\Psi = \{\psi_j\}$, more generally, is a Parseval frame and $\Phi = \{\varphi_i\}$ is an orthonormal matrix, then renormalizing the different vectors $\varphi_i$ in the sampling basis by respective factors $\frac{1}{\mu_i^{\mathrm{loc}}}$ still does not affect orthogonality of the rows of $\Phi^* \Psi$. We have the following corollary of our main result; we will establish a more general result, Theorem 3.1 in the next section, from which both Theorem 1.2 and Corollary 2.9 follow.

Corollary 2.9. Fix a sparsity level $s < N$, and constant $0 < \delta < 1$. Let $D \in \mathbb{C}^{n \times N}$ be a Parseval frame with columns $\{d_1, \dots, d_N\}$, let $B$ with rows $\{b_1, \dots, b_n\}$ be an orthonormal basis of $\mathbb{C}^n$, and assume the local coherence of $B$ with respect to $D$ is pointwise bounded by the function $\kappa \in \mathbb{C}^n$, that is $\sup_{1 \le j \le N} |\langle d_j, b_k \rangle| \le \kappa_k$.

Consider the probability measure $\nu$ on $[n]$ given by $\nu(k) = \frac{\kappa_k^2}{\|\kappa\|_2^2}$ along with the diagonal matrix $W = \operatorname{diag}(w) \in \mathbb{C}^{n \times n}$ with $w_k = \|\kappa\|_2 / \kappa_k$. Note that the renormalized system $\Phi = \frac{1}{\sqrt{m}} W B$ is an orthonormal system with respect to $\nu$, bounded by $\|\kappa\|_2$. Construct $\tilde B \in \mathbb{C}^{m \times n}$ by sampling vectors from $B$ i.i.d. from the measure $\nu$, and consider the localization factor $\eta = \eta_{D,s}$ of the frame $D$ as in Definition 1.1.

As long as
\[ m \ge C \delta^{-2} \eta^2 \|\kappa\|_2^2\, s \log^3(s \eta^2) \log(N) \]
then with probability $1 - N^{-\log^3(2s)}$, the following holds for every signal $f$: the solution $f^\sharp$ of the weighted $\ell_1$-analysis problem
\[ f^\sharp = \operatorname*{argmin}_{\tilde f \in \mathbb{C}^n} \|D^* \tilde f\|_1 \quad \text{such that} \quad \Big\| \frac{1}{\sqrt{m}} W (\tilde B \tilde f - y) \Big\|_2 \le \varepsilon, \tag{$P_{1,w}$} \]
with $y = \tilde B f + e$ for noise $e$ with weighted error $\big\| \frac{1}{\sqrt{m}} W e \big\|_2 \le \varepsilon$ satisfies
\[ \|f^\sharp - f\|_2 \le C_1 \varepsilon + C_2 \frac{\|D^* f - (D^* f)_s\|_1}{\sqrt{s}}. \tag{2.12} \]
Here $C$, $C_1$, and $C_2$ are absolute constants independent of the dimensions and the signal $f$.

The resulting bound involves a weighted noise model, which can, in the worst case, introduce an additional factor of $\sqrt{\tfrac{n}{m}} \max_i \sqrt{\nu(i)}$. We believe however that this noise model is just an artifact of the proof technique, and that the stability results in Corollary 2.9 should hold for the standard noise model using the standard $\ell_1$-analysis problem $(P_1)$. In the important case where $D$ is a wavelet frame and $B$ is the orthonormal discrete Fourier matrix, we believe that total variation minimization, like $\ell_1$-analysis, should give stable and robust error guarantees with the standard measurement noise model. Indeed, such results were recently obtained for variable density sampling in the case of an orthonormal wavelet sparsity basis [Poo15a], improving on previous bounds for total variation minimization [KW14, NW13b, NW13a, AHPR13]. Generalizing such results to the dictionary setting is indeed an interesting direction for future research.

3. OUR MAIN RESULT AND ITS APPLICATIONS

As mentioned in the previous section, the case of incoherence between the sampling basis and the sparsity dictionary, as in Theorem 1.2, is a special case of a bounded orthonormal system. The following more technical formulation of our main result in the framework of such systems covers both weighted sparsity models and local coherence and, as we will see, implies Theorem 1.2. The proof of Theorem 3.1 is presented in Section 4.

Theorem 3.1. Fix a probability measure $\nu$ on $[n]$, sparsity level $s < N$, and constant $0 < \delta < 1$. Let $D \in \mathbb{C}^{n \times N}$ be a Parseval frame. Let $A$ be an orthonormal systems matrix with respect to $\nu$ with rows $r_i$ as in Definition 2.3, and consider weights $\omega_1, \omega_2, \dots, \omega_N \ge 1$ such that
\[ \max_i |\langle r_i, d_j \rangle| \le \omega_j. \tag{3.1} \]
Define the localization factor $\eta = \eta_{D,s}$ as in Definition 2.4. Construct an $m \times n$ submatrix $\tilde A$ of $A$ by sampling rows of $A$ according to the measure $\nu$. Then as long as
\[ m \ge C \delta^{-2} s \eta^2 \log^3(s \eta^2) \log(N), \quad \text{and} \quad m \ge C \delta^{-2} s \eta^2 \log(1/\gamma), \tag{3.2} \]
then with probability $1 - \gamma$, the normalized submatrix $\sqrt{\tfrac{1}{m}}\, \tilde A$ satisfies the $D$-$\omega$RIP with parameters $s$ and $\delta$.

Proof of Theorem 1.2 assuming Theorem 3.1: Let $B$, $D$ and $s$ be given as in Theorem 1.2. We will apply Theorem 3.1 with $\omega_j = 1$ for all $j$, with $\nu$ the uniform measure on $[n]$, sparsity level $2s$, $\gamma = N^{-\log^3(2s)}$, $\delta = 0.08$, and matrix $A = \sqrt{n} B$. We first note that since $B$ is an orthonormal basis, $\sqrt{n} B$ is an orthonormal systems matrix with respect to the uniform measure $\nu$ as in Definition 2.3. In addition, (1.2) implies (3.1) for matrix $A = \sqrt{n} B$ with $K = \omega_j = 1$. Furthermore, setting $\gamma = N^{-\log^3(2s)}$, (1.3) implies both inequalities of (3.2) (adjusting constants appropriately). Thus the assumptions of Theorem 3.1 are in force. Theorem 3.1 then guarantees that with probability $1 - \gamma = 1 - N^{-\log^3(2s)}$, the uniformly subsampled matrix $\sqrt{\tfrac{1}{m}} (\sqrt{n} \tilde B)$ satisfies the $D$-$\omega$RIP with parameters $2s$ and $\delta$ and weights $\omega_j = 1$. A simple calculation and the definition of the $D$-$\omega$RIP (Definition 2.5) show that this implies the $D$-RIP with parameters $2s$ and $\delta$. By Theorem 2.2, (1.4) holds and this completes the proof. $\square$

Proof of Corollary 2.9 assuming Theorem 3.1.
Corollary 2.9 is implied by Theorem 3.1 because the preconditioned matrix $\Phi = \frac{1}{\sqrt{m}} W B$ formed by normalizing the $i$th row of $B$ by $\|\kappa\|_2 / \kappa_i$ constitutes an orthonormal system with respect to the probability measure $\nu(i)$ and uniform weights $\omega_j = \|\kappa\|_2$. Then the preconditioned sampled matrix $\tilde\Phi$ satisfies the $D$-RIP for sparsity $2s$ and parameter $\delta$ according to Theorem 3.1 with probability $1 - \gamma = 1 - N^{-\log^3(2s)}$. Applying Theorem 2.2 to the preconditioned sampling matrix $\tilde\Phi$ and Parseval frame $D$ produces the results in Corollary 2.9. $\square$

3.1. Example: Harmonic frame with L more vectors than dimensions. It remains to find examples of a dictionary with bounded localization factor and an associated measurement system for which incoherence condition (1.2) holds. Our main example is that of sampling a signal that is sparse in an oversampled Fourier system, a so-called harmonic frame [VW04]; the measurement system is the standard basis. Indeed, one can see by direct calculation that the standard basis is incoherent in the sense of (1.2) to any set of Fourier vectors, even of non-integer frequencies. We will now show that if the number of frame vectors exceeds the dimension only by a constant, such a dictionary will also have bounded localization factor. This setup is a simple example, but our results apply, and it is not covered by previous theory.

More precisely, we fix $L \in \mathbb{N}$ and consider $N = n + L$ vectors in dimension $n$. We assume that $L$ is such that $Ls \le \frac{N}{4}$. Then the harmonic frame is defined via its frame matrix $D = (d_{jk})$, which results from the $N \times N$ discrete Fourier transform matrix $F = (f_{jk})$ by deleting the last $L$ rows. That is, we have
\[ d_{jk} = \frac{1}{\sqrt{n + L}} \exp\Big( \frac{2\pi i\, jk}{n + L} \Big) \tag{3.3} \]
for $j = 1, \dots, n$ and $k = 1, \dots, N$. The corresponding Gram matrix satisfies $(D^* D)_{ii} = \frac{n}{n + L}$ and, for $j \ne k$, by orthogonality of $F$,
\[ |(D^* D)_{jk}| = \Big| (F^* F)_{jk} - \sum_{\ell = n+1}^{N} \bar f_{\ell j} f_{\ell k} \Big| = \Big| \sum_{\ell = n+1}^{N} \frac{1}{n + L} \exp\Big( -\frac{2\pi i\, \ell j}{n + L} \Big) \exp\Big( \frac{2\pi i\, \ell k}{n + L} \Big) \Big| \le \frac{L}{n + L}. \]
As a consequence, we have for $z$ $s$-sparse and $j \notin \operatorname{supp} z$
\[ |(D^* D z)[j]| = \Big| \sum_{k \in \operatorname{supp} z} (D^* D)_{jk}\, z[k] \Big| \le \frac{L}{n + L} \|z\|_1, \]
where we write $z[k]$ to denote the $k$th entry of the vector $z$. Similarly, for $j \in \operatorname{supp} z$, we obtain
\[ |(D^* D z)[j] - z[j]| = \Big| \sum_{k \in \operatorname{supp} z,\ k \ne j} (D^* D)_{jk}\, z[k] \Big| \le \frac{L}{n + L} \|z\|_1. \]
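So we obtain, using that $D^*$ is an isometry,
\[
\begin{aligned}
\|Dz\|_2^2 = \|D^* D z\|_2^2 &\ge \sum_{k \in \operatorname{supp} z} \big| (D^* D z)[k] \big|^2 \ge \sum_{k \in \operatorname{supp} z} \big( |z[k]| - |(D^* D z)[k] - z[k]| \big)^2 \\
&\ge \sum_{k \in \operatorname{supp} z} \Big( |z[k]|^2 - \frac{2L}{n + L} \|z\|_1 |z[k]| \Big) \\
&= \|z\|_2^2 - \frac{2L}{n + L} \|z\|_1^2 \ge \Big( 1 - \frac{2Ls}{N} \Big) \|z\|_2^2 \ge \frac{1}{2} \|z\|_2^2.
\end{aligned}
\]
That is, for $z$ with $\|Dz\|_2 = 1$, one has $\|z\|_2 \le \sqrt{2}$. Consequently,
\[
\begin{aligned}
\eta &= \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \frac{\|D^* D z\|_1}{\sqrt{s}} = \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \frac{\|(D^* D z)|_{\operatorname{supp} z}\|_1 + \|(D^* D z)|_{(\operatorname{supp} z)^c}\|_1}{\sqrt{s}} \\
&\le \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \Big( \|D^* D z\|_2 + \frac{\|(D^* D z)|_{(\operatorname{supp} z)^c}\|_1}{\sqrt{s}} \Big) = 1 + \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \frac{1}{\sqrt{s}} \sum_{j \in (\operatorname{supp} z)^c} |(D^* D z)[j]| \\
&\le 1 + \sup_{\|Dz\|_2 = 1,\ \|z\|_0 \le s} \frac{1}{\sqrt{s}} (N - s) \frac{L}{N} \|z\|_1 \le 1 + \frac{(N - s) L}{N} \|z\|_2 \le 1 + L\sqrt{2}. \qquad \square
\end{aligned}
\]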
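The harmonic frame computation above can be sanity-checked numerically. The sketch below is illustrative only: the function name `harmonic_frame`, the parameters $n = 32$, $L = 2$, $s = 4$ (chosen so that $Ls \le N/4$), and the random sampling of sparse vectors are assumptions, not part of the paper. It builds $D$ from (3.3), checks the Parseval property and the Gram matrix bounds, and computes a Monte Carlo lower estimate of the localization factor. Random sampling can only underestimate the supremum in (1.1), so the check is that no sampled value exceeds $1 + L\sqrt{2}$.

```python
import numpy as np

def harmonic_frame(n, L):
    """Frame matrix with entries exp(2*pi*i*j*k/(n+L)) / sqrt(n+L), j=1..n, k=1..n+L."""
    N = n + L
    rows = np.arange(1, n + 1).reshape(-1, 1)     # index j
    cols = np.arange(1, N + 1).reshape(1, -1)     # index k
    return np.exp(2j * np.pi * rows * cols / N) / np.sqrt(N)

n, L, s = 32, 2, 4                                # illustrative; L*s = 8 <= N/4 = 8.5
N = n + L
D = harmonic_frame(n, L)
G = D.conj().T @ D                                # Gram matrix D^* D

parseval_error = np.linalg.norm(D @ D.conj().T - np.eye(n))
diag_error = np.max(np.abs(np.diag(G) - n / N))
offdiag_max = np.max(np.abs(G - np.diag(np.diag(G))))

# Monte Carlo lower estimate of eta over random s-sparse z with ||Dz||_2 = 1.
rng = np.random.default_rng(0)
eta_lower = 0.0
for _ in range(200):
    support = rng.choice(N, size=s, replace=False)
    z = np.zeros(N, dtype=complex)
    z[support] = rng.standard_normal(s) + 1j * rng.standard_normal(s)
    z /= np.linalg.norm(D @ z)                    # normalize so that ||Dz||_2 = 1
    eta_lower = max(eta_lower, np.linalg.norm(G @ z, 1) / np.sqrt(s))

print(parseval_error, offdiag_max, L / N, eta_lower, 1 + L * np.sqrt(2))
```

Since the estimate is only a lower bound on $\eta$, it does not certify the localization factor; it merely confirms that the sampled vectors are consistent with the bound derived above.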