Deep Convolutional Denoising of Low-Light Images

Tal Remez¹        Or Litany¹        Raja Giryes¹
[email protected]   [email protected]   [email protected]

Alex M. Bronstein²
[email protected]

¹ School of Electrical Engineering, Tel-Aviv University, Israel
² Computer Science Department, Technion-IIT, Israel

Abstract

The Poisson distribution is used for modeling noise in photon-limited imaging. While canonical examples include relatively exotic types of sensing like spectral imaging or astronomy, the problem is relevant to regular photography now more than ever due to the booming market for mobile cameras. Restricted form factor limits the amount of absorbed light, thus computational post-processing is called for. In this paper, we make use of the powerful framework of deep convolutional neural networks for Poisson denoising. We demonstrate how, by training the same network with images having a specific peak value, our denoiser outperforms the previous state-of-the-art by a large margin both visually and quantitatively. Being flexible and data-driven, our solution avoids the heavy ad hoc engineering used in previous methods and is an order of magnitude faster. We further show that by adding a reasonable prior on the class of the image being processed, another significant boost in performance is achieved.

1. Introduction

Poisson noise, also known as shot noise, appears in many applications in various fields ranging from medical imaging to astronomy. It becomes dominant especially in the case of low photon count, as in the case of photography in low-light conditions. The problem becomes even more severe in modern mobile cameras that have a small form factor which reduces the amount of light that reaches the sensor. In view of the fact that today most photos are captured by smartphones, efficient techniques for Poisson noise removal are needed for improving the quality of images captured by these devices in low-light conditions.

In the setup of Gaussian noise removal, which has been studied much more than its Poisson counterpart but is less realistic especially in the low-light regime, the state-of-the-art performance is achieved using neural network-based strategies [8, 9, 36]. Especially appealing are convolutional neural network (CNN) based solutions, which offer several advantages. First, they are known to have a high representation capability, thus potentially enabling the learning of complex priors. In addition, they can be easily adapted to a certain data type merely by training on a specific dataset. Moreover, being highly parallelizable, they lead in many cases to fast computation on GPUs.

Our contribution. In this paper we propose a novel fully convolutional residual neural network for Poisson noise removal. We demonstrate state-of-the-art results on several datasets compared to previous techniques. The improvements are noticeable both visually and quantitatively. The advantage of our proposed strategy stems from its simplicity. It does not rely on data models such as non-local similarity, sparse representation or Gaussian mixture models (GMM), which have been used in previous strategies [10, 22, 34] and only partially explain the structure of the data. By taking a supervised approach and using the powerful representation capabilities demonstrated by deep neural networks, our method learns to remove the Poisson noise without explicitly relying on a model. Moreover, its convolutional structure makes it well suited to run on GPUs and other parallel hardware, thus leading to running times that are an order of magnitude better than those of other, non-convolutional, model-based solutions [10, 22, 34].

The performance of our method is further improved when trained on a specific class of images.
This scenario is particularly important as the vast majority of photos taken by mobile cameras contain a limited set of object types such as faces or landscapes.

Figure 1. Perceptual comparison of the proposed method (panels: ground truth image, noisy image, I+VST+BM3D [4] at 24.45 dB, proposed DenoiseNet at 24.77 dB). The proposed denoiser produces visually more pleasant results and avoids artifacts commonly introduced by previous methods. The reader is encouraged to zoom in to better appreciate the improvement. The presented image has a peak value of 4. PSNR values are reported below the denoised images.

2. The Poisson Denoising Problem

Let X ∈ (ℕ ∪ {0})^{W×H} denote a noisy image produced by a sensor. The goal of denoising is to recover a latent clean image Y ∈ ℝ₊^{W×H} observed by the sensor. In low-light imaging, the noise is dominated by shot noise; consequently, given the true value Y_{ij} of the (i,j)-th pixel expressed in number of photoelectrons, the corresponding value of the observed pixel X_{ij} is an independent Poisson-distributed random variable with mean and variance Y_{ij}, i.e., X_{ij} ∼ Poisson(Y_{ij}):

    P(X_{ij} = n \mid Y_{ij} = \lambda) =
    \begin{cases}
      \dfrac{\lambda^{n} e^{-\lambda}}{n!}, & \lambda > 0 \\
      \delta_{n}, & \lambda = 0.
    \end{cases}
    \qquad (1)

Notice that Poisson noise is neither additive nor stationary, as its strength depends on the image intensity. Lower intensity in the image yields stronger noise, since the SNR in each pixel is \sqrt{Y_{ij}}. Thus, it is natural to define the noise power in an image by the maximum value of Y (its peak value). This is a good measure under the assumption that the intensity values are spread uniformly across the entire dynamic range, which holds for most natural images.

A very popular strategy [6, 15, 28, 40] for recovering Y relies on variance-stabilizing transforms (VST), such as Anscombe [2] and Fisz [20], that convert Poisson noise to be approximately white Gaussian with unit variance. Thus, it is possible to solve the Poisson denoising problem using one of the numerous denoisers developed for Gaussian noise (e.g., [8, 9, 10, 13, 14, 27, 35, 36]). The problem with these approximations is that they cease to hold for very low intensity values [28, 34]. One strategy that has been used to overcome this deficiency corrects the estimated variance by applying the Gaussian denoiser and the Anscombe transform iteratively [4]. Another technique that also relies on Gaussian denoisers bypasses the need for any type of VST by using the "plug and play" scheme [31].

An alternative approach is to develop new methods that are adapted to the Poisson noisy data directly [11, 12, 18, 19, 21, 22, 30, 34, 38, 42]. For example, the work in [34] proposed the non-local sparse PCA (NLSPCA) technique that relies on GMM [39]. In [22], a different strategy has been proposed, the sparse Poisson denoising algorithm (SPDA), which relies on sparse coding and dictionary learning. In [18], a nonlinear diffusion based neural network, previously proposed for Gaussian denoising [9], has been adapted to the Poisson noise setting under the name of trained reaction diffusion models for Poisson denoising (TRDPD).

A popular strategy for improving the performance of many Poisson denoising algorithms is binning [34]. Instead of processing the noisy image directly, a low-resolution version of the image with higher SNR is generated by aggregation of nearby pixels. Then, a given Poisson denoising technique is applied, followed by simple linear interpolation to get back to the original high-resolution image. The binning technique trades off spatial resolution and SNR, and has been shown to be useful in very low SNR regimes.
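To make the noise model and the role of the peak value concrete, the following minimal sketch simulates the degradation described above: a clean image in [0, 1] is scaled so that its maximum equals a chosen peak value, and each pixel is then replaced by a Poisson draw with that mean. The function name and the NumPy implementation are our own illustration choices, not code from the paper.

```python
import numpy as np

def simulate_poisson_noise(clean, peak, rng=None):
    """Simulate photon-limited acquisition of a clean image in [0, 1].

    The image is scaled so that its maximum intensity equals `peak`
    (the noise-power measure used in Section 2), and every pixel is
    replaced by an independent Poisson draw with that mean.
    """
    rng = np.random.default_rng() if rng is None else rng
    scaled = clean / clean.max() * peak              # clean image Y with the desired peak
    noisy = rng.poisson(scaled).astype(np.float64)   # X_ij ~ Poisson(Y_ij)
    return noisy, scaled

# Example: lower peak values give a lower per-pixel SNR (= sqrt(Y_ij)).
clean = np.clip(np.random.rand(64, 64), 0.05, 1.0)
noisy, scaled = simulate_poisson_noise(clean, peak=4)
```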
3. Denoising by DenoiseNet

To recover the clean image Y from its realization X, which is contaminated with Poisson noise, we propose a fully convolutional neural network. Our architecture, denoted as DenoiseNet, is inspired by the network suggested in [36] for the purpose of super-resolution, as the network estimates the difference between the noisy image and its clean counterpart. It also bears resemblance to the residual network introduced in [23], since the weight gradients propagate to each layer both through its following layer and directly from the loss function.

3.1. Network architecture

The DenoiseNet architecture is shown in Figure 2. The network receives a noisy grayscale image as the input and produces an estimate of the original clean image. At each layer, we convolve the previous layer's output with 64 kernels of size 3×3 using a stride of 1. The first 63 output channels are used for calculating the next steps, whereas the last channel is extracted to be directly combined with the input image to predict the clean output. Thus, these extracted layers can be viewed as negative noise components, as their sum cancels out the noise. We build a deep network comprising 20 convolutional layers, where the first 18 use the ReLU nonlinearity, while the last two are kept entirely linear.

Figure 2. DenoiseNet architecture.

3.2. Implementation details

We implemented the network in TensorFlow [1] and trained it for 120K iterations, which roughly took 72 hours on a Titan-X GPU, on a set of 8000 images from the PASCAL VOC dataset [16]. We used mini-batches of 64 patches of size 128×128. Images were converted to YCbCr, and the Y channel was used as the input grayscale image after being scaled by the peak value and shifted by −1/2. For data augmentation purposes, during training, image patches were randomly cropped from the training images and flipped about the vertical axis. Also, the noise realization was randomized. Training was done using the ADAM optimizer [26] with learning rate α = 10⁻⁴, β₁ = 0.9, β₂ = 0.999 and ε = 10⁻⁸. Separate networks were trained for different peak values. To avoid convolution artifacts at the borders of the patches, during training we used an ℓ₂ loss on the central part of the patches, cropping the outer 21 pixels. At test time, images were padded with 21 pixels using symmetric reflection before passing them through the network, and cropped back to their original size afterwards to give the final output.
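The following tf.keras sketch shows one way to realize the architecture and training configuration described in Sections 3.1-3.2: twenty 3×3 convolutions with 64 output channels each, where the last channel of every layer is split off as a negative-noise component and summed with the input, the first 18 layers use ReLU and the last two are linear, and training uses ADAM with the stated hyper-parameters. The function names, the Keras API choice, and the plain MSE loss (standing in for the ℓ₂ loss on centrally cropped patches) are our own assumptions rather than the authors' released code.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_denoisenet(depth=20, features=64):
    """Sketch of DenoiseNet following the description in Section 3.1."""
    noisy = tf.keras.Input(shape=(None, None, 1))  # peak-scaled, shifted grayscale input
    x = noisy
    residuals = []
    for i in range(depth):
        # First 18 layers use ReLU; the last two are kept entirely linear.
        activation = "relu" if i < depth - 2 else None
        x = layers.Conv2D(features, 3, strides=1, padding="same",
                          activation=activation)(x)
        # The last channel of every layer is a per-layer negative-noise component ...
        residuals.append(layers.Lambda(lambda t: t[..., -1:])(x))
        # ... while the remaining 63 channels feed the next layer.
        x = layers.Lambda(lambda t: t[..., :-1])(x)
    # Summing the extracted components with the input cancels the noise.
    clean = layers.Add()([noisy] + residuals)
    return tf.keras.Model(noisy, clean, name="denoisenet_sketch")

model = build_denoisenet()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4,
                                                 beta_1=0.9, beta_2=0.999,
                                                 epsilon=1e-8),
              loss="mse")  # stand-in for the cropped l2 loss of Section 3.2
```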
3.3. Class-aware denoising

Having constructed a supervised framework for Poisson denoising, it is natural to seek additional benefits of its inherent flexibility by fine-tuning it to specific data. One possibility to exploit this property, which we propose here, is to build class-specific denoisers, that is, to restrict the training data to a specific semantic class in order to boost the performance on it. This assumption is rather unrestrictive, as in many low-light imaging applications the data being processed belong to a specific domain. In other settings, the class information can be provided manually by the user, for example, choosing face denoising for cleaning a personal photo collection. Alternatively, one could potentially train yet another deep network for automatic classification of the noisy images.

The idea of combining classification with reconstruction has been previously proposed by [5], which also dubbed it "recogstruction". In their work, the authors set a bound on super-resolution performance and showed it can be broken when a face prior is used. Several other studies have shown that it is beneficial to design a strategy for a specific class. For example, in [7] it has been shown that the design of a compression algorithm dedicated to faces improves over generic techniques targeting general images. Specifically for the class of faces, several face hallucination methods have been developed [37], including face super-resolution and face sketch-photo synthesis techniques. In [25], the authors showed that given a collection of photos of the same person it is possible to obtain a more faithful reconstruction of the face from a blurry image. In [24, 41], class labeling at the pixel level is used for the colorization of grayscale images. In [3], the subspaces attenuated by blur kernels for specific classes are learned, thus improving the deblurring performance.

Building on the success demonstrated in the aforementioned body of work, in this paper we propose to use semantic classes as a prior and build class-aware denoisers. Differently from previous methods, our model is made class-aware via training and not by design, hence it can be automatically extended to any type and number of classes.

4. Experiments

We tested DenoiseNet's performance and compared it to other methods in the following series of experiments. Whenever code for another technique was not publicly available, we evaluated our method on the same test set and compared with the reported scores. As a first experiment, DenoiseNet was tested on the test set of the dataset it had been trained on, namely PASCAL VOC [17]. We then applied it on a commonly used test set of 10 images and compared against 8 other methods for 5 different peak values in the range [1, 30]. To further appreciate the performance on different data, we compared against [18] on 68 images from the Berkeley segmentation dataset [29]. We proceeded with a short exploration suggesting that applying binning and the Anscombe transform, which are common techniques in the Poisson denoising literature, does not improve our network's performance. Lastly, we fine-tuned our network using specific classes of images from ImageNet to demonstrate the additional boost in performance provided by such a prior.

Figure 3. Comparison of performance profile relative to I+VST+BM3D [4] (plot of the improvement in PSNR over I+VST+BM3D, in dB, versus sorted image number, for peak values 1, 2, 4, 8 and 30). Image indices are sorted in ascending order of performance gain relative to I+VST+BM3D. The improvement of our method is demonstrated by (i) a small zero-crossing point, and (ii) consistently higher PSNR values. The distribution reveals the statistical significance of the reported improvement. The comparison was made on images from PASCAL VOC and 15 noise realizations per image.
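As a rough illustration of the evaluation protocol used throughout this section (average PSNR over several independent noise realizations of each peak-scaled image), the sketch below wraps an arbitrary denoiser callable. The helper names, the 15-realization default, and the convention of computing PSNR against the peak-scaled ground truth with the peak as the maximum intensity are our reading of the captions of Table 1 and Figure 3, not code or conventions stated explicitly in the paper.

```python
import numpy as np

def psnr(reference, estimate, peak):
    """PSNR in dB, with `peak` taken as the maximum possible intensity."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def average_psnr(clean, denoiser, peak, realizations=15, rng=None):
    """Average PSNR of `denoiser` over independent Poisson realizations."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = clean / clean.max() * peak
    scores = []
    for _ in range(realizations):
        noisy = rng.poisson(scaled).astype(np.float64)
        scores.append(psnr(scaled, denoiser(noisy), peak))
    return float(np.mean(scores))

# Example with a trivial "denoiser" that returns its input unchanged.
clean = np.clip(np.random.rand(64, 64), 0.05, 1.0)
print(average_psnr(clean, lambda x: x, peak=4))
```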
4.1. PASCAL VOC images

In this experiment we tested our network on a set of 1000 images from PASCAL VOC [17]. A comparison with the recent iterative Poisson image denoising via VST method (I+VST+BM3D [4]), which is considered to be the leading method for Poisson denoising to date, shows an improvement in PSNR across all tested peak values between 1 and 30 of as much as 0.71 dB (see Table 1). A different network was trained to handle each of the peak values using images from PASCAL VOC, as described in Section 3.2. Interestingly, the gain achieved by our method decays as the peak values decrease.

To examine the statistical significance of the improvement our method achieves, in Figure 3 we compare the gain in performance with respect to I+VST+BM3D achieved by our method. Image indices are sorted in ascending order of performance gain. A small zero-crossing value affirms that our method outperforms I+VST+BM3D on the large majority of the images in the dataset. The plot visualizes the significant and consistent improvement in PSNR achieved by our method. An additional summary of the percentage of images from the dataset on which each method outperformed the other is presented in Table 2. It is evident that with our method, a better reconstruction for peak values greater than 2 is almost guaranteed. Denoising results for several images from the test set are visualized in Figure 7. We encourage the reader to zoom in to appreciate the significant improvement our method achieves, resulting in much more aesthetically pleasing images.

Table 1. PSNR performance on PASCAL VOC [17]. Average PSNR values for different peak values on 1000 test images and 15 noise realizations per image. The PSNR gain of the proposed method over I+VST+BM3D is presented in the bottom row.

    Peak            1      2      4      8      30
    I+VST+BM3D      22.71  23.70  24.78  26.08  28.85
    DenoiseNet      22.87  24.09  25.36  26.70  29.56
    PSNR gain       0.16   0.39   0.58   0.62   0.71

Table 2. Wins on PASCAL VOC [17]. Presented is the percentage of images out of the 1000-image test set on which each of the compared algorithms outperformed the other.

    Peak            1      2      4      8      30
    I+VST+BM3D      26.0%  5.2%   1.1%   1%     0.8%
    DenoiseNet      74.0%  94.8%  98.9%  99%    99.2%

4.2. Standard image set

In this experiment we evaluated our method on the standard set of images used by previous works. Evaluation was performed for peak values in the range of 1 to 30. PSNR values and running times presented in Table 3 show that our method outperforms all other methods by a significant margin for the large majority of the images and almost all peak values. In addition, the execution time is orders of magnitude faster on a Titan-X GPU, taking only 37 milliseconds for a 256×256 image, and is comparable to other methods when run on an Intel E5-2630 2.20GHz CPU, taking 1.3 seconds. A qualitative example can be seen in Figure 1, showing the man image denoised by DenoiseNet and by I+VST+BM3D [4] for a peak value of 4.

4.3. Berkeley segmentation dataset

In this experiment we tested our method on 68 test images from the Berkeley dataset [29] (as selected by [32]), and compared it with [18] and I+VST+BM3D [4]. Note that we did not fine-tune our network to fit this dataset but rather used it after it had been trained on PASCAL VOC. Results are summarized in Table 4. The superiority of DenoiseNet over the other methods is evident across all peak values. Especially interesting is the improvement compared to [18], where a trainable nonlinear reaction diffusion network was tuned for Poisson denoising on this data. This suggests that a flexible network architecture may sometimes be preferable to a model-driven one, especially in the case where training data are practically unlimited. Also note that both networks have similar runtime.

Table 4. PSNR performance on the 68-image set from [32]. Average PSNR values for different peak values on the 68-image test set from [32]. Results reported in [18] were copied as is, and values for our method and for I+VST+BM3D [4] were added (averaging over 15 noise realizations per image). Our network was trained on PASCAL VOC images as described in Section 3.2.

    Peak               1      2      4      8
    NLSPCA             20.90  21.60  22.09  22.38
    NLSPCA bin         19.89  19.95  19.95  19.91
    VST+BM3D           21.01  22.21  23.54  24.84
    VST+BM3D bin       21.39  22.14  22.87  23.53
    I+VST+BM3D         21.66  22.59  23.69  24.93
    TRDPD^8_{5×5}      21.49  22.54  23.70  24.96
    TRDPD^8_{7×7}      21.60  22.62  23.84  25.14
    DenoiseNet         21.79  22.90  23.99  25.30

4.4. The influence of binning and VST

Two techniques have been shown in the past to boost reconstruction quality in Poisson denoising: the use of the Anscombe transform (and its inverse), and binning. To check whether these could also improve our proposed network, we changed the architecture in the following way. We added a first layer with 4 channels consisting of constant-valued square kernels of sizes n×n, where n = 1, 3, 5, 7. This has the effect of binning. This layer's weights were kept fixed during the training phase. We further added a second layer realizing the non-linear Anscombe transform. The input was also transformed before adding it to the intermediate residual outputs. Finally, we added the exact inverse transform [28] after summing the residuals and the transformed noisy input image. Interestingly, these modifications had a negligible effect on the network's performance. As for the reasons why, or whether there would have been an effect for shallower networks, we leave these for future work.
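To make the modification of Section 4.4 concrete, here is a small sketch of the two fixed pre-processing stages it describes: a 4-channel binning layer built from constant n×n box kernels (n = 1, 3, 5, 7) and the forward Anscombe transform. For the inverse we use the simple asymptotic closed form rather than the exact unbiased inverse of [28] that the paper employs, and the function names and NumPy/SciPy implementation are our own.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def binning_stack(noisy, sizes=(1, 3, 5, 7)):
    """Aggregate nearby pixels with constant n-by-n box kernels.

    Returns an (H, W, len(sizes)) stack; summing over an n-by-n window
    trades spatial resolution for a higher per-pixel photon count (SNR).
    """
    channels = [uniform_filter(noisy, size=n) * n * n for n in sizes]
    return np.stack(channels, axis=-1)

def anscombe(x):
    """Forward Anscombe VST: noise becomes approximately unit-variance Gaussian."""
    return 2.0 * np.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(z):
    """Simple asymptotic inverse (the paper uses the exact unbiased inverse of [28])."""
    return (z / 2.0) ** 2 - 3.0 / 8.0

# Usage on a synthetic Poisson-noisy image with peak 4.
rng = np.random.default_rng(0)
noisy = rng.poisson(np.full((64, 64), 4.0)).astype(np.float64)
stabilized = anscombe(binning_stack(noisy))
```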
4.5. Class-aware denoising

In order to demonstrate the benefits of having a flexible architecture when an adaptation to a specific data type is required, we selected the following semantic classes: face, flower, street, living-room, and pet. About 1200 images were collected for each of the 5 semantic classes from ImageNet [33]. Starting from the network that had been trained on PASCAL for peak 8, we fine-tuned a separate network for each of the classes, thus making them "class-specific" (as opposed to being "class-agnostic" before the fine-tuning process). The tuning procedure consisted of 45×10³ training iterations with the same parameters used for the initial training. The images of each class were split into training (60%), validation (20%) and test (20%) sets.

The performance of our class-specific networks compared to the class-agnostic baseline and I+VST+BM3D is summarized in Table 5. While the class-agnostic DenoiseNet outperforms I+VST+BM3D by as much as 0.6 dB, its class-specific version boosts performance by an additional 0.15 to 0.31 dB. A visual inspection is presented in Figure 4, which demonstrates an already large improvement of our proposed class-agnostic method compared to previous methods, and yet an additional non-negligible boost attained by the class-aware network. We encourage the reader to zoom in, e.g., on the person's face, the living-room floor and curtains, the street curbs and building facades, and the cat's eyes and fur to fully appreciate the improvement in visual quality. Lastly, in Figure 5 we present a confusion matrix calculated by applying each class-specific denoiser to all of the image classes. It can be seen that the network has indeed learned distinguishable features per class and has specialized on it, yielding a higher PSNR than all other class-aware denoisers for the majority of images.

Table 5. Class-aware denoising on ImageNet data. Presented is the average PSNR performance on specific class images from ImageNet. It is evident that DenoiseNet significantly outperforms I+VST+BM3D [4] even when it is class-agnostic, by as much as 0.6 dB, while the class-specific approach boosts performance by an additional 0.15 to 0.31 dB. We averaged 15 noise realizations per image and 300 images per class.

    Image class       Face   Flower  Livingroom  Pet    Street
    I+VST+BM3D        27.17  26.16   26.39       25.95  24.13
    Class-unaware     27.70  26.68   26.99       26.39  24.69
    Class-specific    28.01  26.93   27.19       26.54  24.85

Figure 4. Denoising examples from ImageNet. Presented are images from ImageNet denoised by I+VST+BM3D and our class-agnostic and class-specific denoisers for peak 8. PSNR values appear below each of the images (I+VST+BM3D [4] / DenoiseNet / class-specific DenoiseNet: 25.85/26.79/26.96 dB, 25.56/26.47/26.75 dB, 24.09/24.96/25.36 dB, and 23.53/24.60/24.78 dB).

Figure 5. Denoiser performance per semantic class. Each row represents a specific semantic class of images, while class-aware denoisers are represented as columns. The (i,j)-th element in the confusion matrix shows the probability of the j-th class-aware denoiser outperforming all other denoisers on the i-th class of images. The diagonally dominant structure indicates that each denoiser specializes on a particular class.

4.6. "Under the hood" of DenoiseNet

This section presents a few examples that we believe give insights about the noise estimation of DenoiseNet. In Figure 6 we show several denoised images with peak value 8 and the error after 5, 10, and 20 layers (rows 4-6). Surprisingly, even though it has not been explicitly enforced during training, the error decreases monotonically with the layers' depth (see plots in row 7 of Figure 6). This non-trivial behavior is consistently produced by the network on almost all test images. To visualize which of the layers was the most dominant in the denoising process, we assign a different color to each layer and color each pixel according to the layer in which its value changed the most. The resulting image is shown in the bottom row of Figure 6. It can be observed that the first few layers govern the majority of the pixels, while the following ones mainly focus on recovering and enhancing the edges and textures that might have been degraded by the first layers.

Figure 6. Gradual denoising process (rows: ground truth, noisy input, denoised image (output), error after 5 layers, error after 10 layers, error after 20 layers, RMSE at different layers, and the layer contributing the most to each pixel). Images are best viewed electronically. The reader is encouraged to zoom in for a better view. Please refer to the text for more details.
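The per-layer analysis of Section 4.6 can be reproduced, in outline, from the per-layer negative-noise components: partial sums of the first k components give the estimate "after k layers", their RMSE against the ground truth gives the curves in row 7, and the index of the component with the largest magnitude at each pixel gives the layer-attribution map in the bottom row. The sketch below assumes the residual maps are available as a list of arrays (e.g., extracted from intermediate outputs of a model such as the one sketched after Section 3.2); the helper names and this particular reading of "the layer in which its value changed the most" are our own.

```python
import numpy as np

def layerwise_analysis(noisy, residuals, ground_truth):
    """Per-layer diagnostics in the spirit of Section 4.6.

    `residuals` is a list of per-layer negative-noise maps, each the same
    shape as `noisy`; their cumulative sum plus the noisy input is the
    estimate after k layers.
    """
    stack = np.stack(residuals, axis=0)                  # (L, H, W)
    partial = noisy[None] + np.cumsum(stack, axis=0)     # estimate after each layer
    rmse = np.sqrt(((partial - ground_truth[None]) ** 2).mean(axis=(1, 2)))
    dominant_layer = np.abs(stack).argmax(axis=0)        # layer with largest change per pixel
    return rmse, dominant_layer

# Toy usage with random stand-ins for the 20 residual maps.
rng = np.random.default_rng(0)
gt = rng.random((32, 32)) * 8
noisy = rng.poisson(gt).astype(np.float64)
residuals = [rng.normal(scale=0.05, size=gt.shape) for _ in range(20)]
rmse_per_layer, attribution = layerwise_analysis(noisy, residuals, gt)
```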
5. Conclusion

In this work we have proposed a CNN-based Poisson image denoiser. Interestingly, our network achieves state-of-the-art performance without explicitly taking into consideration the nature of the noise at hand; rather, it is learned implicitly from the data. We further show how additional knowledge of the image class boosts performance both quantitatively and qualitatively. We believe this offers a flexible learning-based alternative to previous heavily engineered solutions, which is both powerful and fast, and may thus also be adapted to more general types of image enhancement such as Poisson-Gaussian denoising, super-resolution, deblurring, etc.

Figure 7. Denoising examples from PASCAL VOC 2010. Presented are images from PASCAL VOC denoised by I+VST+BM3D and our class-agnostic denoiser for peak 8 (I+VST+BM3D [4] / DenoiseNet: 24.56/25.83 dB, 24.76/26.00 dB, 23.40/24.66 dB, and 30.14/31.61 dB). PSNR values appear below each of the images. See text for more details.

References

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[2] F. J. Anscombe. The transformation of Poisson, binomial and negative-binomial data. Biometrika, 35(3-4):246–254, 1948.
[3] S. Anwar, C. P. Huynh, and F. Porikli. Class-specific image deblurring. In IEEE International Conference on Computer Vision (ICCV), pages 495–503, Dec. 2015.
[4] L. Azzari and A. Foi. Variance stabilization for noisy+estimate combination in iterative Poisson denoising. IEEE Signal Processing Letters, 23(8):1086–1090, 2016.
[5] S. Baker and T. Kanade. Limits on super-resolution and how to break them. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1167–1183, 2002.
[6] J. Boulanger, C. Kervrann, P. Bouthemy, P. Elbau, J.-B. Sibarita, and J. Salamero. Patch-based nonlocal functional for denoising fluorescence microscopy image sequences. IEEE Trans. on Med. Imag., 29(2):442–454, Feb. 2010.
[7] O. Bryt and M. Elad. Compression of facial images using the K-SVD algorithm. Journal of Visual Communication and Image Representation, 19(4):270–282, 2008.
[8] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2392–2399. IEEE, 2012.
[9] Y. Chen and T. Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
[10] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process., 16(8):2080–2095, 2007.
[11] A. Danielyan, V. Katkovnik, and K. Egiazarian. Deblurring of Poissonian images using BM3D frames. Proc. SPIE, 8138:813812–813812–7, 2011.
[12] C.-A. Deledalle, F. Tupin, and L. Denis. Poisson NL means: Unsupervised non local means for Poisson noise. In IEEE International Conference on Image Processing (ICIP), pages 801–804, Sept. 2010.
[13] W. Dong, G. Shi, Y. Ma, and X. Li. Image restoration via simultaneous sparse coding: Where structured sparsity meets Gaussian scale mixture. International Journal of Computer Vision (IJCV), 114(2):217–232, Sep. 2015.
[14] W. Dong, L. Zhang, G. Shi, and X. Li. Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process., 22(4):1620–1630, April 2013.
[15] F.-X. Dupe, J. Fadili, and J.-L. Starck. A proximal iteration for deconvolving Poisson noisy images using sparse representations. IEEE Trans. Img. Proc., 18(2):310–321, Feb. 2009.
[16] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
[17] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Results. http://www.pascal-network.org/challenges/VOC/voc2010/workshop/index.html.
[18] W. Feng and Y. Chen. Fast and accurate Poisson denoising with optimized nonlinear diffusion. arXiv, abs/1510.02930, 2015.
[19] M. A. T. Figueiredo and J. Bioucas-Dias. Restoration of Poissonian images using alternating direction optimization. IEEE Trans. Image Process., 19(12):3133–3145, Dec. 2010.
[20] M. Fisz. The limiting distribution of a function of two independent random variables and its statistical application. Colloquium Mathematicum, 3:138–146, 1955.
[21] E. Gil-Rodrigo, J. Portilla, D. Miraut, and R. Suarez-Mesa. Efficient joint Poisson-Gauss restoration using multi-frame L2-relaxed-L0 analysis-based sparsity. In IEEE International Conference on Image Processing (ICIP), pages 1385–1388, Sept. 2011.
[22] R. Giryes and M. Elad. Sparsity based Poisson denoising with dictionary learning. IEEE Trans. Imag. Proc., 23(12):5057–5069, 2014.
[23] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[24] S. Iizuka, E. Simo-Serra, and H. Ishikawa. Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. In SIGGRAPH, 2016.
[25] N. Joshi, W. Matusik, E. H. Adelson, and D. J. Kriegman. Personal photo enhancement using example images. ACM Trans. Graph., 29(2):12:1–12:15, Apr. 2010.
[26] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[27] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Non-local sparse models for image restoration. In ICCV, pages 2272–2279, 2009.
[28] M. Makitalo and A. Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Trans. on Image Proces., 20(1):99–109, Jan. 2011.
[29] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th Int'l Conf. Computer Vision, volume 2, pages 416–423, July 2001.
[30] T. Remez, O. Litany, and A. M. Bronstein. A picture is worth a billion bits: Real-time image reconstruction from dense binary pixels. In 18th International Conference on Computational Photography (ICCP), 2015.
[31] A. Rond, R. Giryes, and M. Elad. Poisson inverse problems by the plug-and-play scheme. Journal of Visual Communication and Image Representation, 2016.
[32] S. Roth and M. J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205–229, 2009.
[33] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. Int. Journal of Computer Vision, 115(3):211–252, 2015.
[34] J. Salmon, Z. Harmany, C.-A. Deledalle, and R. Willett. Poisson noise reduction with non-local PCA. Journal of Mathematical Imaging and Vision, pages 1–16, 2013.
[35] U. Schmidt, Q. Gao, and S. Roth. A generative perspective on MRFs in low-level vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[36] R. Vemulapalli, O. Tuzel, and M.-Y. Liu. Deep Gaussian conditional random field network: A model-based deep network for discriminative denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[37] N. Wang, D. Tao, X. Gao, X. Li, and J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, 106(1):9–30, 2014.
[38] R. Willett and R. Nowak. Platelets: a multiscale approach for recovering edges and surfaces in photon-limited medical imaging. IEEE Trans. on Med. Imag., 22(3):332–350, Mar. 2003.
[39] G. Yu, G. Sapiro, and S. Mallat. Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity. IEEE Trans. Image Process., 21(5):2481–2499, May 2012.
[40] B. Zhang, J. Fadili, and J. Starck. Wavelets, ridgelets, and curvelets for Poisson noise removal. IEEE Trans. Img. Proc., 17(7):1093–1108, July 2008.
[41] R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization. ECCV, 2016.
[42] X. Zhang, Y. Lu, and T. Chan. A novel sparsity reconstruction method from Poisson data for 3D bioluminescence tomography. J. Sci. Comput., 50(3):519–535, Mar. 2012.