Deep Class Aware Denoising

Tal Remez¹ ([email protected]), Or Litany¹ ([email protected]), Raja Giryes¹ ([email protected]), Alex M. Bronstein² ([email protected])

¹ School of Electrical Engineering, Tel-Aviv University, Israel
² Computer Science Department, Technion - IIT, Israel

arXiv:1701.01698v2 [cs.CV] 27 Feb 2017

Abstract

The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. At the same time, the images captured by these devices can be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. In this paper, we demonstrate how the reconstruction quality improves when a denoiser is aware of the type of content in the image. To this end, we first propose a new fully convolutional deep neural network architecture which is simple yet powerful, as it achieves state-of-the-art performance even without being class-aware. We further show that a significant boost in performance of up to 0.4 dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.

1. Introduction

The ubiquitous use of mobile phone cameras in the recent decade has set a very high demand on the image quality these devices are expected to produce. On the other hand, the never-ending pursuit of more pixels at smaller form factors puts stringent constraints on the amount of light each pixel is exposed to and results in noisier images. This puts an increasing weight on computational post-processing techniques, in particular on image denoising.

Many image acquisition artifacts such as low-light noise and camera shake [16] can be compensated by image enhancement techniques. Denoising in the presence of additive white Gaussian noise is one of the key problems studied in this context. While realistic low-light imaging is largely dominated by the Poisson-distributed shot noise, there exist various techniques that allow accurate treatment of non-Gaussian noise sources with a Gaussian denoiser [39, 40, 47, 57]. Moreover, it has been shown in [11, 15, 46, 57, 63] that having a good Gaussian denoising algorithm makes it possible to efficiently solve many other image processing problems such as deblurring, inpainting, compression postprocessing and more, without compromising the reconstruction quality or the need to design a new strategy adapted to a new setting. In view of these results, it is evident that a good Gaussian denoiser sets the foundation for solving a variety of image reconstruction and enhancement problems.

Numerous methods have been proposed for removing Gaussian noise from images, including k-SVD [2], non-local means [9], BM3D [14], non-local k-SVD [38], field of experts (FoE) [52], Gaussian mixture models (GMM) [65], non-local Bayes [33], nonlocally centralized sparse representation (NCSR) [19] and simultaneous sparse coding combined with Gaussian scale mixture (SSC-GSM) [18]. These techniques have been designed based on some properties of natural images, such as the recurrence of patches at different locations or their sparsity in a certain dictionary.

In the past few years, the state-of-the-art in image denoising has been achieved by techniques based on artificial neural networks [10, 13, 62]. Neural networks (NNs) are essentially concatenations of basic units (layers), each comprising a linear operation followed by a simple non-linearity, resulting in an intricate highly non-linear response. Currently, they are among the most popular and powerful tools in machine learning [6, 17, 23, 34, 51]. NN-based approaches have led to state-of-the-art results in numerous tasks in computer vision (e.g., image classification [25, 32], video classification [29], object detection [60], face recognition [53], and handwriting word recognition [44]), speech recognition [50] and natural language processing [5, 26, 56, 59], artificial intelligence (e.g., playing video games [42] or beating the world Go champion [55], which is considered to be a very prominent milestone in the AI community), medical imaging [24], image processing (e.g., image deconvolution [54], inpainting [43] and super-resolution [7, 35, 30]), and more [34].

[Figure 1 panels, left to right: Ground truth image; Noisy image; Denoised by TNRD [13]; Denoised by our method.]
Figure 1. Perceptual comparison of class-aware and standard denoising. Our proposed face-specific denoiser produces a visually pleasant result and avoids artifacts commonly introduced by general-purpose denoisers. The reader is encouraged to zoom in for a better view of the artifacts.

The first neural network to achieve state-of-the-art performance in image denoising was proposed in [10]. It is based on a fully connected architecture and therefore requires more training examples at training and much more memory and arithmetic complexity at inference compared to the more recent solution in [62], which proposes a neural network based on a deep Gaussian Conditional Random Field (DGCRF) model, or the model-based Trainable Nonlinear Reaction Diffusion (TNRD) network introduced in [13].

Contribution

One of the main elements we find to be missing in the current denoising techniques (and image enhancement strategies in general) is the awareness of the class of images being processed. Such an approach is much needed, as the objects typically photographed by phone camera users belong to a limited number of semantic classes. In this paper, we demonstrate that it is possible to do better image enhancement when the algorithm is class-aware.

We demonstrate this claim on the Gaussian denoising task, for which we propose a novel convolutional neural network (CNN)-based architecture that obtains performance higher than or comparable to the state-of-the-art. The advantage of our architecture is its simple design and the ease of adaptation to new data. We fine-tune a pretrained network on several popular image classes and demonstrate a further significant improvement in performance compared to the class-agnostic baseline.

In light of the high performance achieved by modern image classification schemes, the proposed technique may be used to improve the image quality in mobile phone cameras. To substantiate this claim, we show that the state-of-the-art image classification networks are resilient to the presence of even large amounts of noise.

2. Class Aware Denoising

The current theory of patch-based image denoising sets a bound on the achievable performance [12, 36, 37]. In fact, since existing methods have practically converged to that bound, one may be tempted to deem futile the on-going pursuit of better performance. As it turns out, two possibilities to break this barrier still exist. The first is to use larger patches. This proved useful in [10], where the use of 39×39 patches made it possible to outperform BM3D [14], which had held the record for many years. A second "loophole" which allows a further improvement in denoising performance is to use a better image prior, such as narrowing down the space of images to a more specific class. These two possibilities are not mutually exclusive, and indeed we exploit both. First, as detailed in the sequel, our network has a receptive field of size 41×41, which is bigger than the existing practice, while the convolutional architecture keeps the network from becoming prohibitively large. Second, we fine-tune our denoiser to best fit a particular class. The class information can be provided manually by the user, for example when choosing face denoising for cleaning a personal photo collection, or automatically, by applying one of the many existing powerful classification algorithms.

The idea of combining classification with reconstruction has been previously proposed by [4], which also dubbed it recogstruction. In their work, the authors set a bound on super-resolution performance and showed it can be broken when a face prior is used. Several other studies have shown that it is beneficial to design a strategy for a specific class. For example, in [8] it has been shown that the design of a compression algorithm dedicated to faces improves over generic techniques targeting general images. Specifically for the class of faces, several face hallucination methods have been developed [64], including face super-resolution and face sketch-photo synthesis techniques. In [28], the authors showed that given a collection of photos of the same person it is possible to obtain a more faithful reconstruction of the face from a blurry image. In [27, 66], class labeling at a pixel level is used for the colorization of gray-scale images. In [3], the subspaces attenuated by blur kernels for specific classes are learned, thus improving the deblurring performance.

Building on the success demonstrated in the aforementioned body of work, in this paper we propose to use semantic classes as a prior and build class-aware denoisers. Different from previous methods, our model is made class-aware via training and not by design, hence it may be automatically extended to any type and number of classes. While in this paper we focus on Gaussian denoising, our methodology can be easily extended to much broader class-aware image enhancement, rendering it applicable to many low-level computer vision tasks.

3. DenoiseNet

Our network performs additive Gaussian image denoising in a fully convolutional manner. It receives a noisy grayscale image as the input and produces an estimate of the original clean image. The network architecture is shown in Figure 2. The layers at the top row of the diagram calculate features using convolutions of size 3×3, stride 1, and ReLU non-linearities. The layers at the bottom of the diagram can be viewed as negative noise components, since their sum cancels out the noise; they are calculated using a single-channel convolution of size 3×3 with stride 1.

Figure 2. DenoiseNet fully convolutional architecture. All convolutions are of size 3×3 and stride 1. The feature sizes resulting from each convolution are listed as Width × Height × #Channels. The bottom row of outputs can be viewed as negative noise components, as their sum cancels out the noise.

In all experiments we used networks with 20 layers implemented in TensorFlow [1] and trained them for 160K mini-batches on a Titan-X GPU with a set of 8000 images from the PASCAL VOC dataset [20]. We used mini-batches of 64 patches of size 128×128. Images were converted to YCbCr and the Y channel was used as the input grayscale image after being scaled and shifted to the range of [−0.5, 0.5]. During training, image patches were randomly cropped and flipped about the vertical axis. To avoid convolution artifacts at the borders of the patches caused by the receptive field of pixels in the deepest layer, during training we computed an $\ell_2$ loss only on the central part of each patch, cropping the outer 21 pixels, and at test time we padded the image symmetrically by 21 pixels. Training was done using the ADAM optimizer [31] with a learning rate of $\alpha = 10^{-4}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$ and $\epsilon = 10^{-8}$. Code and pretrained models will be made available at https://github.com/TalRemez/deep_class_aware_denoising.

3.1. Simplicity vs capacity

The choice of network architecture was motivated by the tradeoff between simplicity and capacity. To best illustrate the concept that class awareness may improve image enhancement algorithms, it was important to incorporate the class via the data, instead of explicitly manipulating the network architecture. This requires an as-simple-as-possible design. A rather straightforward choice would have been the fully connected architecture proposed by Burger et al. [10]; however, the huge number of parameters this network uses renders it impractical for many applications. Alternatively, a very lightweight architecture was proposed by Chen and Pock [13]; however, their model was specifically tailored to their task and, thus, one should be extremely cautious about generalizing any concept demonstrated on it. These two somewhat conflicting paradigms led us to design a new architecture which is both relatively lightweight and extremely simple to understand and implement. In terms of capacity, we have two orders of magnitude fewer parameters than the NN proposed by Burger [10], but only one order of magnitude more than that introduced by Chen and Pock [13]. Note that the reduction in the number of parameters does not decrease the receptive field, as our model is much deeper.

4. Classification in the presence of noise

The tacit assumption of our class-aware approach is the ability to determine the class of the noisy input image. While the goal of this research is not to improve image classification, we argue that the performance of modern CNN-based classification algorithms such as Inception [60, 61] or ResNet [25] is relatively resilient to a moderate amount of noise. In addition, since we are interested in canonical semantic classes such as faces and pets, which are far coarser than the 1000 ImageNet classes [49], the task becomes even easier: confusing two breeds of cats is not considered an error.

The aforementioned networks can be further fine-tuned using noisy examples to increase their resilience to noise. Alternatively, one could simply run a class-agnostic denoiser on the image before plugging it into the classification network. To illustrate the noise resilience property, we ran the pre-trained Inception-v3 [61] network on a few tens of images from the pets class. We then gradually added noise to these images and counted the number of images on which the classifier changed its most confident class to a different class, as visualized in Figure 3. Observe that the network classification remains stable even in the presence of a large amount of noise.

[Figure 3: plot over noise level σ (5 to 55); plot label: Labeling error.]
Figure 3. Noise resilience of image classification. The percentage of images on which a pre-trained Inception-v3 classifier remains stable exceeds 85% even in the presence of a large amount of noise.

5. Experiments

In all experiments in this section our network was trained on 8000 images from the PASCAL VOC [20] dataset and was compared to BM3D [14], multilayer perceptrons (MLP) [10] and TNRD [13] on the following three test sets: (i) images from PASCAL VOC [21]; (ii) a denoising dataset with quantized images from [62]; and (iii) 68 test images chosen by [48] from the Berkeley segmentation dataset [41].

5.1. Class-agnostic denoising

PASCAL VOC. In this experiment we tested the denoising algorithms on 1000 test images from the PASCAL VOC dataset [21]. We believe this large and diverse set of images is representative enough to draw conclusions about the denoising performance. Table 1 summarizes performance in terms of average PSNR for all test images contaminated by white Gaussian noise with σ ranging from 10 to 75. It is evident that our method outperforms all other methods at all noise levels by at least 0.2 dB.

σ            10     15     25     35     50     65     75
BM3D         34.26  32.10  29.62  28.14  26.61  25.64  25.12
MLP [10]     34.29  -      29.95  28.49  26.98  26.07  25.54
TNRD [13]    -      32.35  29.90  -      26.91  -      -
DenoiseNet   34.87  32.79  30.36  28.88  27.32  26.30  25.74

Table 1. Performance on PASCAL VOC. Average PSNR values [dB] on a 1000-image test set. Our method outperforms all other methods for all noise levels.

To examine the statistical significance of the improvement our method achieves, in Figure 4 we compare the gain in performance with respect to BM3D achieved by our method, MLP and TNRD. Image indices are sorted in ascending order of performance gain. A smaller zero-crossing value affirms that our method outperforms BM3D on a larger portion of the dataset than the competitors. The plot visualizes the large and consistent improvement in PSNR achieved by DenoiseNet. A summary of the number of images on which each algorithm performed the best is presented in Figure 5.

[Figure 4: plot of improvement in PSNR over BM3D [dB] versus sorted image number (0 to 1000) for DenoiseNet, MLP and TNRD.]
Figure 4. Comparison of performance profile relative to BM3D. Image indices are sorted in ascending order of performance gain relative to BM3D. The improvement of our method over two competing algorithms is demonstrated by (i) a noticeable decrease of the zero-crossing point, and (ii) consistently higher values of gain over BM3D. The distribution reveals the statistical significance of the reported improvement. The comparison was made on images from PASCAL VOC.

Figure 5. Top performance distribution on PASCAL VOC test set. Percentage of images on which a denoising algorithm performed the best. Our method wins on 92.4% of the images, whereas MLP, BM3D, and TNRD win on 6.6%, 1% and 0% respectively.

Berkeley segmentation dataset. In this experiment we tested the performance of our method, trained on PASCAL VOC, on the set of 68 images selected by [48] from the Berkeley segmentation dataset [41]. Even though these test images belong to a different dataset, Table 2 shows that our method outperforms previous methods for all sigma values.

σ            10     15     25     35     50     65     75
BM3D         33.31  31.10  28.57  27.08  25.62  24.68  24.20
MLP [10]     33.50  -      28.97  27.48  26.02  25.10  24.58
TNRD [13]    -      31.41  28.91  -      25.95  -      -
DenoiseNet   33.58  31.44  29.04  27.56  26.06  25.12  24.61

Table 2. Performance on images from the Berkeley segmentation dataset. Average PSNR values [dB] on a test set of 68 images selected by [48]. Our method outperforms all others for all noise levels.

Quantized noise. Even though our network has not been explicitly trained to treat quantized noisy images, we evaluated its performance on 300 such images from [62]. Results are reported in Table 3. The set contains 100 test images from the Berkeley segmentation dataset and an additional 200 images from the PASCAL VOC 2012 [22] dataset. All images have been quantized to 8 bits in the range [0, 255]. For the noise level of σ = 25 our method outperforms previous methods, but it fails to do so for σ = 50, where MLP performs best.

             σ = 25   σ = 50
BM3D         28.21    24.43
MLP [10]     28.58    25.20
TNRD [13]    28.46    24.57
DenoiseNet   28.71    24.75

Table 3. Performance on quantized test images from [62]. Images have been clipped to a range of [0, 255] and quantized to 8 bits. PSNR values [dB] for two different noise levels are reported.

5.2. Class-aware denoising

This experiment evaluates the boost in performance resulting from fine-tuning a denoiser on a set of images belonging to a particular class. In order to do so, we collected images from ImageNet [49] of the following six classes: face, pet, flower, beach, living room, and street. The 1,500 images per class were split into train (60%), validation (20%) and test (20%) sets. We then trained a separate class-aware denoiser for each of the six classes. This was done by fine-tuning our class-agnostic model, which had been trained on PASCAL VOC, using the images from ImageNet. The performance of the class-aware denoisers was compared to their class-agnostic counterpart as well as to other denoising methods. Average PSNR values summarized in Figure 6 demonstrate that our class-aware models outperform our class-agnostic network, BM3D [14], multilayer perceptrons (MLP) [10] and TNRD [13]. Notice how class-awareness boosts performance by up to 0.4 dB.

[Figure 6: plot of PSNR [dB] per class (Face, Living room, Flower, Beach, Pet, Street) for Class-aware DenoiseNet, DenoiseNet, TNRD, MLP and BM3D.]
Figure 6. Class-aware denoising performance on ImageNet. Average PSNR values for different methods on images belonging to six different semantic classes. It is evident that the class-specific fine-tuned models outperform all other methods. In addition, being class-aware makes it possible to gain up to 0.4 dB PSNR compared to our class-agnostic network.

5.3. Cross-class denoising

To further demonstrate the effect of refining a denoiser to a particular class, we tested each class-specific denoiser on images belonging to other classes. The outcome of this mismatch is evident both qualitatively and quantitatively. The top row of Figure 7 presents a comparison of class-aware denoisers fine-tuned to the street and face image classes applied to a noisy image of a face. The denoiser tuned to the street class produces noticeable artifacts around the eye, cheek and hair areas. Moreover, the edges appear too sharp and seem to favor horizontal and vertical edges. This is not very surprising, as street images contain mainly man-made rectangle-shaped structures. In the second row, strong artifacts appear on the hamster's fur when the image is processed by the living room-specific denoiser. The pet-specific denoiser, on the other hand, produces a much more natural-looking result. Additional examples demonstrating artifacts caused by the mismatch on the canonical images House and Lena are presented in the bottom two rows. Notice how the street-specific denoiser reconstructs sharp boundaries of the building, whereas the face-specific counterpart smears them.

[Figure 7 columns, left to right: Ground truth; Noisy image; Correct denoiser; Wrong denoiser. Rows (correct / wrong denoiser): face-specific / street-specific; pet-specific / living room-specific; street-specific / face-specific; face-specific / street-specific.]
Figure 7. Cross-class denoising. Representative outputs of DenoiseNet denoisers fine-tuned to the class of the input image (third column from left), and to a mismatched class (rightmost column). The reader is encouraged to zoom in for a better view of the artifacts.

To quantify the effect of mismatching, we evaluated the percentage of wins of every fine-tuned denoiser on each type of image class. A win means that a particular denoiser produced the highest PSNR among all the others. A confusion matrix for all combinations of class-specific denoisers and image classes is presented in Figure 8. We conclude that applying a denoiser of the same class as the image results in the best performance.

Figure 8. Denoiser performance per semantic class. Each row represents a specific semantic class of images, while class-aware denoisers are represented as columns. The (i, j)-th element in the confusion matrix shows the probability of the j-th class-aware denoiser to outperform all other denoisers on the i-th class of images.

5.4. Network noise estimation

This section presents a few examples that we believe give insights about the noise estimation of our class-aware networks. The overall noise estimation of the network is the sum of the estimates produced by all individual layers. These are presented in the bottom row of Figure 9. Interestingly, they differ significantly from one another. The shallow layer estimations appear to handle local noise, while the deeper ones seem to focus on object contours. In the top row, we present the input image after it has been denoised by all layers up to a specific depth. To further examine what is happening "under the hood" of our class-aware denoisers, in Figure 10 we show the error after 5, 10, and 20 layers (rows 4-6). Surprisingly, even though it has not been explicitly enforced at training, the error monotonically decreases with the layer depth (see plots in row 7). This non-trivial behavior is consistently produced by the network on all test images. Lastly, to visualize which of the layers was the most dominant in the denoising process, we assign a different color to each layer and color each pixel according to the layer in which its value changed the most. The resulting image is shown in the bottom row of Figure 10. It can be observed that the first few layers govern the majority of the pixels, while the following ones mainly focus on recovering and enhancing the edges and textures that might have been degraded by the first layers.

[Figure 9 panels. Top row: Noisy input; Output. Second row: Ground truth. Columns: Layer 5; Layer 10; Layer 15; Layer 20.]
Figure 9. Gradual denoising process by flower-specific DenoiseNet. The top row presents the noisy image (left) and the intermediate result obtained by removing the noise estimated up to the respective layer depth. The second row presents the ground truth image (left) and the noise estimates produced by individual layers; the noise images have been scaled for display purposes. We encourage the reader to zoom in on the images to best view the fine details and noise.

6. Discussion

Given the state-of-the-art performance of our network, an important task is to interpret what it has learned and how the action of DenoiseNet relates to the principles governing the previous manually designed state-of-the-art denoising algorithms. One such principle that has been shown to improve denoising in recent years is gradual denoising, namely that iteratively removing small portions of the noise is preferable to removing it all at once [45, 58, 67]. Interestingly, as can be seen in Figure 9, our network exhibits such a behavior despite the fact that it has not been trained explicitly to have a monotonically decreasing error throughout the layers. Each layer in the network removes part of the noise in the image: the flat regions are denoised mainly in the first layers, and the edges in the last ones. This may be explained by the fact that the deeper layers correspond to a larger receptive field and therefore may better recover global patterns such as edges, which may be indistinguishable from noise if viewed just in the context of a small patch.

In a certain sense, the present research demonstrates that in some cases the whole is smaller than the sum of its parts. That is, splitting the input images into several categories and then building a fine-tuned filter for each is preferable over a universal filter. That said, the decision to split according to a semantic class was made due to the immediate availability of off-the-shelf classifiers and their resilience to noise. Yet, this splitting scheme may very well be sub-optimal.
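The split-then-filter scheme described in this discussion can be illustrated with a short sketch: classify the noisy image, then route it to the denoiser associated with the predicted class, falling back to a class-agnostic denoiser for unknown labels. This is only an illustration under stated assumptions and not the released implementation: the function names, the `classify` callable, and the use of simple mean filters as stand-ins for the fine-tuned networks are all hypothetical. The PSNR helper assumes a dynamic range of 1, matching images scaled to [−0.5, 0.5].

```python
import numpy as np


def psnr(clean, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with dynamic range `peak`."""
    mse = np.mean((np.asarray(clean) - np.asarray(estimate)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


def box_filter(image, radius):
    """Placeholder 'denoiser': mean filter with symmetric border padding."""
    height, width = image.shape
    padded = np.pad(image, radius, mode="symmetric")
    size = 2 * radius + 1
    total = np.zeros((height, width), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / size ** 2


def denoise_class_aware(noisy, classify, denoisers, fallback):
    """Route the image to the denoiser of its predicted class, else to `fallback`."""
    label = classify(noisy)
    denoiser = denoisers.get(label, fallback)
    return label, denoiser(noisy)


# Hypothetical stand-ins: in the paper each entry would be a fine-tuned network.
class_denoisers = {
    "face": lambda img: box_filter(img, 1),
    "street": lambda img: box_filter(img, 2),
}
class_agnostic = lambda img: box_filter(img, 1)

rng = np.random.default_rng(0)
clean = np.full((32, 32), 0.25)                       # flat patch in [-0.5, 0.5]
noisy = clean + rng.normal(0.0, 25.0 / 255.0, clean.shape)

# The classifier is a constant placeholder; a real one would inspect the image.
label, restored = denoise_class_aware(
    noisy, lambda img: "face", class_denoisers, class_agnostic)
print(label, round(psnr(clean, noisy), 2), round(psnr(clean, restored), 2))
```

Keeping the class-to-denoiser mapping in a plain dictionary mirrors the point made earlier in the paper that class awareness enters through the data rather than the architecture: supporting another class would only require adding another fine-tuned model to the mapping.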
Other choices for data partitioning could be made. In particular, a classifier could be learned automatically, e.g., by incorporating the splitting scheme into a network architecture and training it end-to-end. In such cases, the partitioning would lose its simple interpretation as semantic classes, and would instead yield some abstract classes. We defer this interesting direction to future research.

[Figure 10 rows, top to bottom: Ground truth; Noisy input; Denoised image (output); Error after 5 layers; Error after 10 layers; Error after 20 layers; RMSE at different layers; Layer contributing the most to each pixel.]
Figure 10. Gradual denoising process. Images are best viewed electronically; the reader is encouraged to zoom in for a better view. Please refer to Section 5.4 for more details.

References

[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[2] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process., 54(11):4311–4322, Nov. 2006.
[3] S. Anwar, C. P. Huynh, and F. Porikli. Class-specific image deblurring. In IEEE International Conference on Computer Vision (ICCV), pages 495–503, Dec. 2015.
[4] S. Baker and T. Kanade. Limits on super-resolution and how to break them. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1167–1183, 2002.
[5] J. Bellegarda and C. Monz. State of the art in statistical methods for language and speech processing. Computer Speech and Language, 35:163–184, Jan. 2016.
[6] Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009.
[7] J. Bruna, P. Sprechmann, and Y. LeCun. Super-resolution with deep convolutional sufficient statistics. In ICLR, 2016.
[8] O. Bryt and M. Elad. Compression of facial images using the K-SVD algorithm. Journal of Visual Communication and Image Representation, 19(4):270–282, 2008.
[9] A. Buades, B. Coll, and J. Morel. A non-local algorithm for image denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.
[10] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2392–2399. IEEE, 2012.
[11] S. Chan, X. Wang, and O. Elgendy. Plug-and-play ADMM for image restoration: Fixed point convergence and applications. ArXiv, abs/1605.01710, 2016.
[12] P. Chatterjee and P. Milanfar. Is denoising dead? IEEE Trans. Image Process., 19(4):895–911, 2010.
[13] Y. Chen and T. Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
[14] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process., 16(8):2080–2095, 2007.
[15] Y. Dar, A. M. Bruckstein, M. Elad, and R. Giryes. Postprocessing of compressed images via sequential denoising. IEEE Trans. Imag. Proc., 25(7):3044–3058, 2016.
[16] M. Delbracio and G. Sapiro. Burst deblurring: Removing camera shake through Fourier burst accumulation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[17] L. Deng and D. Yu. Deep learning: Methods and applications. Foundations and Trends in Signal Processing, 7(3-4):197–387, 2014.
[18] W. Dong, G. Shi, Y. Ma, and X. Li. Image restoration via simultaneous sparse coding: Where structured sparsity meets Gaussian scale mixture. International Journal of Computer Vision (IJCV), 114(2):217–232, Sep.
[19] W. Dong, L. Zhang, G. Shi, and X. Li. Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process., 22(4):1620–1630, April 2013.
[20] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
[21] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Results. http://www.pascal-network.org/challenges/VOC/voc2010/workshop/index.html.
[22] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.
[23] I. Goodfellow, Y. Bengio, and A. Courville. Deep learning. Book in preparation for MIT Press, 2016.
[24] H. Greenspan, B. van Ginneken, and R. M. Summers. Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging, 35(5):1153–1159, May 2016.
[25] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[26] J. Hirschberg and C. D. Manning. Advances in natural language processing. Science, 349(6245):261–266, 2015.
[27] S. Iizuka, E. Simo-Serra, and H. Ishikawa. Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification. In SIGGRAPH, 2016.
[28] N. Joshi, W. Matusik, E. H. Adelson, and D. J. Kriegman. Personal photo enhancement using example images. ACM Trans. Graph., 29(2):12:1–12:15, Apr. 2010.
[29] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In CVPR, 2014.
[30] J. Kim, J. K. Lee, and K. M. Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR, 2016.
[31] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[32] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NIPS), pages 1097–1105, 2012.
[33] M. Lebrun, A. Buades, and J. M. Morel. A nonlocal Bayesian image denoising algorithm. SIAM Journal on Imaging Sciences, 6(3):1665–1688, 2013.
[34] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521:436–444, May 2015.
[35] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi. Photo-realistic single image super-resolution using a generative adversarial network. arXiv, abs/1609.04802, 2016.
[36] A. Levin and B. Nadler. Natural image denoising: Optimality and inherent bounds. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2833–2840. IEEE, 2011.
[37] A. Levin, B. Nadler, F. Durand, and W. Freeman. Patch complexity, finite pixel correlations and optimal denoising. In ECCV, 2012.
[38] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Non-local sparse models for image restoration. In ICCV, pages 2272–2279, 2009.
[39] M. Makitalo and A. Foi. Optimal inversion of the Anscombe transformation in low-count Poisson image denoising. IEEE Trans. on Image Proces., 20(1):99–109, Jan. 2011.
[40] M. Makitalo and A. Foi. Noise parameter mismatch in variance stabilization, with an application to Poisson-Gaussian noise estimation. IEEE Trans. on Image Proces., 23(12):5348–5359, Jan. 2014.
[41] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. 8th Int'l Conf. Computer Vision, volume 2, pages 416–423, July 2001.
[42] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518:529–533, Feb. 2015.
[43] D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A. A. Efros. Context encoders: Feature learning by inpainting. In CVPR, 2016.
[44] A. Poznanski and L. Wolf. CNN-N-gram for handwriting word recognition. In CVPR, 2016.
[45] Y. Romano and M. Elad. Boosting of image denoising algorithms. SIAM Journal on Imaging Sciences, 8(2):1187–1219, 2015.
[46] Y. Romano, M. Elad, and P. Milanfar. The little engine that could: Regularization by denoising (RED). arXiv:1611.02862, 2016.
[47] A. Rond, R. Giryes, and M. Elad. Poisson inverse problems by the plug-and-play scheme. Journal of Visual Communication and Image Representation, 2016.
[48] S. Roth and M. J. Black. Fields of experts. International Journal of Computer Vision, 82(2):205–229, 2009.
[49] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. Int. Journal of Computer Vision, 115(3):211–252, 2015.
[50] H. Sak, A. Senior, K. Rao, and F. Beaufays. Fast and accurate recurrent neural network acoustic models for speech recognition. In INTERSPEECH, 2015.
[51] J. Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015.
[52] U. Schmidt, Q. Gao, and S. Roth. A generative perspective on MRFs in low-level vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[53] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In CVPR, 2015.
[54] C. J. Schuler, H. C. Burger, S. Harmeling, and B. Schölkopf. A machine learning approach for non-blind image deconvolution. In CVPR, 2013.
[55] D. Silver, A. Huang, C. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 529:484–489, 2016.
[56] R. Socher, A. Perelygin, J. Wu, J. Chuang, C. Manning, A. Ng, and C. Potts. Recursive deep models for semantic compositionality over a sentiment treebank. In EMNLP, 2013.
[57] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman. Plug-and-play priors for bright field electron tomography and sparse interpolation. IEEE Transactions on Computational Imaging, 2(4):408–423, Dec. 2016.
[58] J. Sulam and M. Elad. Expected patch log likelihood with a sparse prior. In Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR), Hong-Kong, 2015.
[59] I. Sutskever, O. Vinyals, and Q. Le. Sequence to sequence learning with neural networks. In NIPS, 2014.
[60] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
[61] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the Inception architecture for computer vision. arXiv, abs/1512.00567, 2015. http://arxiv.org/abs/1512.00567.
[62] R. Vemulapalli, O. Tuzel, and M.-Y. Liu. Deep Gaussian conditional random field network: A model-based deep network for discriminative denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[63] S. Venkatakrishnan, C. Bouman, and B. Wohlberg. Plug-and-play priors for model based reconstruction. In GlobalSIP, 2013.
[64] N. Wang, D. Tao, X. Gao, X. Li, and J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, 106(1):9–30, 2014.
[65] G. Yu, G. Sapiro, and S. Mallat. Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity. IEEE Trans. Image Process., 21(5):2481–2499, May 2012.
[66] R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization. In ECCV, 2016.
[67] D. Zoran and Y. Weiss. From learning models of natural image patches to whole image restoration. In ICCV, 2011.
