Language Independent Single Document Image Super-Resolution using CNN for improved recognition RamKrishnaPandey A.G.Ramakrishnan IndianInstituteofScience IndianInstituteofScience Bangalore,India Bangalore,India [email protected] [email protected] 7 1 0 2 Abstract than hard-coding these tasks as computer instructions, a n learning based approach is more sensible where comput- a Recognition of document images have important appli- ers train/learn from and adapt to the real-world examples, J cationsinrestoringoldandclassicaltexts. Theproblemin- inahierarchicalmanner,fromsimplertomorecomplexsit- 0 volves quality improvement before passing it to a properly uations. Thisapproachcloselyresemblesthewayaperson 3 trained OCR to get accurate recognition of the text. The acquires knowledge from the world to behave in an ”ex- ] imageenhancementandqualityimprovementconstituteim- pected” or ”sensible” manner. An ”intelligent” being per- V portant steps as subsequent recognition depends upon the ceives real world information through its senses; but it is C quality of the input image. There are scenarios when high difficult to formally provide such information to the com- . resolution images are not available and our experiments puters in its raw analog form. Hence data is digitized and s c showthattheOCRaccuracyreducessignificantlywithde- furtherprocessedtomoreconciseandefficient-to-compute [ crease in the spatial resolution of document images. Thus numeric-vector format (called as feature-vector or more 1 the only option is to improve the resolution of such docu- conciselyasfeatures)tobehandledbytraining/learningal- v mentimages. Thegoalistoconstructahighresolutionim- gorithms. 5 age,givenasinglelowresolutionbinaryimage,whichcon- The choice of representation of features has significant 3 stitutestheproblemofsingleimagesuper-resolution. Most effect on the performance of the machine-learning algo- 8 of the previous work in super-resolution deal with natural rithms. So a lot of effort needs to be put in designing 8 0 imageswhichhavemoreinformation-contentthanthedoc- and hand-coding the features. Lately, multi-layer neural- . ument images. Here, we use Convolution Neural Network networkbasedlearningtechniques,alsocalled”deeplearn- 1 to learn the mapping between low and the corresponding ingalgorithms”,aregainingpopularityasoneneednotma- 0 7 highresolutionimages. Weexperimentwithdifferentnum- nipulaterawdatatoagreatextent,andtrainingthenetwork 1 ber of layers, parameter settings and non-linear functions itselftakescareofgeneratingfeaturesatdifferentlayersin : to build a fast end-to-end framework for document image ahierarchyofcomplexityandsuitabletothetaskitisbeing v i super-resolution. Our proposed model shows a very good trainedfor. X PSNRimprovementofabout4dBon75dpiTamilimages, In this paper, we use such techniques to enhance the r resulting in a 3% improvement of word level accuracy by quality of low-resolution document-images for aiding the a the OCR. It takes less time than the recent sparse based subsequent recognition by OCR. We train a Convolution naturalimagesuper-resolutiontechnique, makingituseful NeuralNetwork(CNN)tolearnthemappingbetweenlow- forreal-timedocumentrecognitionapplications. and high- resolution example images and generate high- resolution images from the test dataset of low-resolution images.Afullyconnectedneuralnetworkarchitecturedoes Keywords not take into account the spatial structure of images. For instance,ittreatsinputpixelswhicharefarapartandclose Convolution Neural Network; Document Images; togetherinexactlythesameway[12]. Thereforetheusage Parametric-ReLU;PSNR;OpticalCharacterRecognizer. ofconvolutionneuralnetworkisnecessary,whichlearnslo- cal patterns and takes into account, the spatial structure of 1.Introduction feature maps [12]. It is also much easier to train a CNN ResearchinAItacklestheproblemswhichhumanscan thanfully-connecteddensearchitecturesduetolessernum- solveeasily,butaredifficultformachinestosolve. Rather ber of model parameters. In a broader sense, convolution 4321 neuralnetworkallowscomputationalmodelsthatarecom- high- resolution image patches. We have created a large posedofmultipleprocessinglayerstolearnrepresentations dataset contaning 5.1 million LR and HR patch pairs of ofdatawithmultiplelevelsofabstraction [9]. sizes16×16and10×10respectively,fortrainingthemodel. Convolutionneuralnetworkshavelayersof”3Dfilters”. The rest of the paper is organized as follows. Related Each filter operates on all the outputs obtained from the workisdiscussedinSection2.Section3describestheprob- filters of the previous layer corresponding to a particular lem formulation. Training model and dataset creation are neighborhood. Theareaoftheneighborhoodinvolvedwith explainedinSections4and5,respectively. Section6sum- reference to the input image increases with every subse- marizes the experiments performed and results obtained. quentlayer. Forexampleatfirstlayeronefiltermaydetect Finally,conclusionsaredrawnandconcludingremarksare a single vertical line, and another might detect horizontal drawninSection7. lines. The next layer now takes the features from the pre- vious layer and can detect (corner) more complex features 2.Relatedwork andsoon. ThisisabasicviewofCNN. Thetechniqueforimagesuper-resolution(SR)inthelit- Handcraftingfeaturesisdifficultinthecaseofdocument eraturecanbeclassifiedbroadlyintotwocategories:aclas- images and therefore convolution neural network is used, sical multiple image SR and single image SR. The former whichisadata-drivenapproachtolearntherightfilterbank approachrequiresmultipleimageswithsub-pixelalignment foraparticulardataset. Featuresarelearnedasthetraining toobtainahighresolutionimage[20]. Thelatterapproach progresesfromsimpletomorecomplexoneswithincreas- isdiscussedindetailin [11]and [19]. Thesparse-coding ing depth. Increasing depth doesn’t always guarantee im- based method and its several improvements [22], [21] provedperformance,butinourcasetheuseoffiltersofsize areamongthestate-of-the-artSRmethods. ChaoDonget. [1×1]atthreeconsecutivelayersafterthefirsthiddenlayer, al. [3] proposed deep learning based natural image super- eachfollowedbyReLUorPReLUnon-linearity,givesbet- resolution,whichshowedthattraditionalSRmethodcanbe ter performance than the same experiment performed over viewedasadeepconvolutionneuralnetworkwhichislight athreelayerarchitecture. Convolutionlayers’filtersofsize weightandachievesstate-of-the-artrestorationquality. Itis [1×1],describedindetailinlatersections,aimstofurther alsofastenoughforpracticalapplications. enhance the non-linearity in the model and decreases the dimensionalityofthefeaturespace. 3.Problemformulation Therearevariousfactors,whichaffecttheaccuracyofan OCR ondocument images the most importantof them be- Given a low resolution image, the objective is to in- ingtheimageresolutionorspatialresolution.Weoftenhave crease the resolution so that the OCR recogniton accuracy low-resolutionimagesatourdisposal,wherethehighreso- improves. The problem can be mathematically formulated lutioncounterpartisnotavailable,orthedocumentimages asfollows. whichwerescannedatverylowresolutionandtheoriginal Assume that the given low resolution images is y, and its image got destroyed or lost. When images are scanned at bicubicinterpolatedimageisY. Thetaskistolearnmap- a low resolution and the quality is low, it affects the OCR ping function J to recover a high resolution image from recognition and gives bad results in terms of character or Y i.e. J(Y)whichgivesbetterOCRrecognitionaccuracy wordlevelaccuracy. Thismotivatesustoworkonthecru- thanY andalsoimprovedPSNR. cial stage of document image enhancement before passing The training set is defined as I = {(Y ,X ) : 1 ≤ i ≤ i i itontoOCR. N}, where Y is the LR image and X ≈ J (Y ) is the i i λ i There are several techniques by which high resolution correspondingHRimagepatchofthetrainingset. imagescanbeobtained. Performancegainisobtainedeven Weuseaparametricmodeltolearnanon-linearmapping in bi-cubic interpolated images over plain low-resolution functionJλ(Y),λbeingtheparametersofthemodel,which images. But with this method we can’t recover high fre- minimize the reconstruction error between Jλ(Y) and the quency components [13]. Classical reconstruction based corresponding ground truth high resolution image X. The techniques require multiple low resolution images to re- learnedmodel/functionshouldbestfittrainingdataandgen- construct a higher-resolution image. Sparse representation eralizewelltothetestdata. dictionary learning based method, used for image super- Themodelisrepresentedas,λ = {W ,b }whereW = i i i resolutionisveryslow. Thistechniqueinvolvestheuseof (cid:8)Wj : 1 ≤ j ≤ n (cid:9), and Wj is the jth filter at the ith i i i an over-complete dictionary i.e. Dd∗K where d is the di- layerasmentionedinTable1andwelearnthesemodelpa- mension of the signal and K ≥ d [22] [21]. Thus, there rameters with a training algorithm in a supervised-manner is a need for an algorithm or architecture, which is useful asfollows. in such scenarios. We have designed a five layer convo- lution neural network, which learns mapping from low- to Let∗denotestheconvolutionoperationand0 < α < 1 4322 isthedatadependentparameter. thenthesignalshrinksasitpassesthrougheachlayeruntil InitializeJ =Y itistoosmalltobeuseful. However,ifthestartingweights 0 in a network are large, then the signal grows as it passes for i=1:4 do througheachlayeruntilitistoolargetobeuseful. Allthe initialization techniques mentioned above aims at keeping Zj =Wj ∗J +b theweightsinacontrolledrangeduringtraining. i i i−1 i Forimplementationreasons,itisdifficulttofindouthow Jj =max(Zj,0) (forReLU) many neurons in the next layer consume the output of the i i currentones. Inourwork, wehaveusedinitializationpre- or sentedin [6].ItstartedfromGlorotandBengio[4]andsug- (cid:26) αZj if Zj ≤0 (cid:27) gest using var(W)=2/n instead of 2/(n +n ) [4]. This Jj = i i (forPReLU) i o i i Zj else makessensebecausearectifyinglinearunitiszeroforhalf i endfor ofitsinputandhenceweneedtodoubletheweightvariance tokeepthesignal’svarianceconstant. J (Y)=W *J (Y)+b λ 5 4 5 4.2.Non-linearitylayers’activationfunction Thelossfunctionisgivenby, We use non-linear layers at several places in the CNN N 1 (cid:88) L(λ)= (cid:107)J (Y )−X (cid:107)2 model. Amongthenon-linearitiesproposed,themostpop- N λ i i F ular ones are sigmoid and rectified linear unit (ReLU) i=1 [5]. Recently, extensions of ReLU non-linearities were To minimize this loss function, back-propagation with proposed such as Leaky ReLU [16], parametric ReLU stochastic gradient descent algorithm (SGD) with momen- (PRELU) [7], SReLU [8] and randomized ReLU [17]. In tum is used. The momentum provides faster convergence our experiments, we have used ReLU and PReLU units and reduced oscillation about the minima. Stochastic gra- as activation functions. The advantage of using ReLU in- dient decent algorithm performs parameter update at each steadofsigmoidisbecauseweavoidthevanishinggradient iteration and is faster than batch gradient descent(BGD). problem in the former case. ReLU is represented mathe- BGD performs redundant computations for large datasets, matically as f(x) = max(0,x), i.e., it clips the -ve input since it recomputes gradients for similar examples before to zero. ReLU can be implemented simply by threshold- eachparameterupdate [14].SGDremovesthisredundancy ing a matrix of activations at zero and thus it is computa- byperformingoneupdateovereachmini-batch. tionally much more efficient than sigmoid or tanh. Con- vergenceofstochasticgradientdescentwithReLUisfaster 4.CNNmodel than for the sigmoid/tanh functions [14], due to its linear, 4.1.Initializingthefivelayernetwork non-saturatingform. ThedisadvantagewithReLUisthatit can be fragile during training and can ”die” in the middle Deep neural networks have a highly non-linear archi- oftraining. Forexample,alargegradientflowingthrougha tecture and the solution for the model parameters is non- ReLUneuronmayresultinupdateofweightssuchthatthe convex. Initializationisoneofthemostimportantstepsin neuron will never activate on any datapoint again. If this designingdeepneuralmodels.Ifnotperformedwithproper happens,thenthegradientflowingthroughtheunitwillbe care,networkmaynotperformwell,andcanevenbecome zero forever from that point on [10]. In other words, the deadorunresponsiveinthemiddleoftraining,resultingin ReLU units can irreversibly die during training since they poorlytrainedmodel.Initializationmeanssettingthemodel cangetknockedoffthedatamanifold [10]. Thisproblem weightswithinitialvalues. Severalinitializationstrategies canbesolvedifthelearningrateissetproperly. have been proposed in the literature, some of them are : Other activation functions that can be used are Leaky uniform initialization scaled by square root of the number ReLU, which reported better performance; but the results of inputs i.e. W ∼ U[√−1 ,√1 ] [1] ; Glorot uniform ini- ni ni arehowever,notalwaysconsistent. ItisthesameasReLU tialization i.e. W ∼ U[ −1 , 1 ], Glorot normal ini- ni+no ni+no buttheslopeofthenegativepartis0.01. ParametricReLu tialization i.e. W ∼ N[0, 1 ] [4] and He-uniform ini- (PReLU) is also the same as ReLU, but the slopes of neg- ni+no tializationi.e,W∼U[−1, 1 ] [6],wheren andn arethe ativeinputarelearnedfromthedataratherthanbeingpre- ni no i o fan-inandfan-outofthelayerWrespectively. defined.InrandomizedReLU(RReLU),theslopesforneg- Thepaper [4]givesinsightsontheinitializationofdeep ativeinputarerandomizedinagivenrangewhiletraining, neural networks. In their work, they have explained that and then fixed during testing. It has been reported that properinitializationhelpssignalstoreachdeepintothenet- RReLU can reduce over fitting due to its random nature work. If the starting weights in a network are very small, [18]. 4323 Table1.5-LayerCNNarchitecture Layer Featuremap Filter stride pad output conv1 16×16×1 5×5×1×64 1 0 12×12×64 Act.fn. 12×12×64 ReLU/PReLU 1 0 12×12×64 conv2 12×12×64 1×1×64×44 1 0 12×12×44 Act.fn. 12×12×44 ReLU/PReLU 1 0 12×12×44 conv3 12×12×44 1×1×44×24 1 0 12×12×24 Act.fn. 12×12×44 ReLU/PReLU 1 0 12×12×24 conv4 12×12×24 1×1×24×14 1 0 12×12×14 Act.fn. 12×12×14 ReLU/PReLU 1 0 12×12×14 conv5 12×12×14 3×3×14×1 1 0 10×10×1 Figure1givesthearchitectureofthe5-layerCNNused for obtaining the HR patches from the LR input patches. Thefirstlayerhas64filterseachofsize[5×5]producing 64featuremapsofsize[12×12],fora16×16imageinput Figure 2. Output at each convolution layer showing the feature patch. Tofurtherembednon-linearityintotheCNNmodel mapsobtainedateachlayerforasampleinput. andincreasedepthofthenetworktoreducethedimension- alityoffeaturespace,weuselayerswithfiltersize[1×1]. Theuseoffiltersofsize[1×1]in3consecutivelayerafter thefirsthiddenlayersfollowedbyReLU/PReLUactivation toprovidemorenon-linearity,helpstoobtaingoodfeatures. Thismodelgivesbetterperformancethanaplainthreelayer architecture. Figure3.Learnedfiltersateachlayerwhichareusedtoobtainthe featuremaps2 used64filtersofsize[5×5],whichresultinoutputvolume of size [12×12×64]. Similarly, the second layer has 44 filters of size [1×1] and generates output volume of size [12×12×44]. The third layer has 24 filters of size 1× 1 which generates output volume of size [12×12×24]; Figure1.The5-layerconvolutionneuralnetworkmodelwithac- andfourthlayerhas14filtersofsize1×1whichgenerates tivationinbetweenthelayers. outputvolumeofsize[12×12×14]. Thelastlayerhas1 filterofsize3×3toproducetheoutputvolume[10×10×1]. From experiments, we found that using 2/3 times the EachconvolutionlayerisfollowedbyaReLU/PReLUnon- number of filters as that of the previous layer gives better linearity layer. ReLU/PReLU layer applies element-wise results. activation function, max(0,x), and leaves the size of the Figure 2 shows that the learned filters 3 are perform- volumeunchanged. ingoperationsontheinputimagetoobtainsimpletomore If the input to a convolution layer is of size width1× complexfeaturesatprogressivelayers.Sincedocumentim- height1×depth1 and at each layer we have four hyper- ages contain less number of features than natural images, parameternamelynumberoffilters(nf),filterspatialextent we are reducing the number of features by using filters of (se),stride(s)andamountofzeropadding(zp);theoutput size[1×1]atthreeconsecutivelayers. sizewidth2×height2×depth2iscalculatedaccordingto TheinputtoCNNisalowresolutionimageofsize[16× thebelowmentionedformula[10]. 16×1]. Thefirstconvolutionlayercomputestheoutputsof neuronsthatareconnectedtothelocalregionsintheinput, width2=(width1−se+2×zp)/s+1 eachcomputingadotproductbetweentheirweightsanda small region they are connected to. In our case, we have height2=(height1−se+2×zp)/s+1 4324 depth2=nf namelyEnglish,TamilandKannadaat150dpiusingReLU andPReLUactivations. Figure 2 shows the obtained feature maps at each con- volutionlayerusingthelearnedfiltersinFigure 3 5.Thetrainingandtestdatasets Figure4.SomeLRandHRpairsfromtheTamiltrainingimage Wecreateatrainingdatasetof5.1millionHR-LRimage patch-pairs,byrandomlycroppingfrom135documentim- Figure5.Trainingcurveobtainedusinginputimagesfromthree ages.Weusedocumentimageshavingtextinthreedifferent languages namely English, Tamil and Kannada at 150 dpi using languages,viz.Tamil,KannadaandEnglish,scannedatdif- ReLUorPReLUactivations. ferentresolutionsof100, 200and300DPI.Thisenhances robustness of the trained model towards input of different Suppose R is the reconstructed image and X is the h languagesandresolutions. TheFigure 4showssometrain- groundtruthimage.WeusePSNR(peak-signal-to-noisera- ingsamplesofTamilimages. tio)asthemetricforimagequalitylistedinTable 2,which TheHRimagepatchisofsize10×10,andcroppeddirectly iscalcutaedasfollows: fromthedocumentimages. ThecorrespondingLRpatchis of size 16 × 16 and cropped from the image obtained by E =Xh−R down-samplingandup-samplingtheoriginaldocumentim- (cid:114) 1 age with a factor of 2. The image patch-pair sizes smaller RMSE = (cid:107)E (cid:107)2 N2 F than the selected ones do not result in any significant per- formanceimprovement. 255 PSNR=20log For the testing dataset, we use different set of 18 docu- 10RMSE mentimagesandgenerateHR-LRimagepatch-pairsinthe samemanner. Table2.ComparisonofaveragePSNRSRimageswiththatofthe bicubicinterpolatedimagesforeachlanguagefordifferentinput 6.Resultsofsuper-resolutionexperiments resolutions. Language Resolution BicubicPSNR(dB) CNNReLUPSNR(dB) CNNPReLU(dB) Weimplementthemodelusingtheano[15]andKeras[2] English 150 18.62 26.02 26.3 Tamil 150 20.34 28.29 28.48 libraries in python. We used NVIDIA TITAN GTX Kannada 150 18.41 24.92 26.12 English 100 16.44 21.56 21.64 GeForce (12GB memory) GPU for training. The training Tamil 100 16.94 22.42 22.72 imagepatchesareofsize16×16,whicharenormalizedby Kannada 100 16.24 20.55 21.00 Tamil 75 15.40 19.20 19.44 subtractingthemeanovertheentiretrainingdataandrange- English 50 13.37 14.81 15.08 Tamil 50 13.30 14.56 14.97 normalized to [0,1], as pre-processing step. The training Kannada 50 13.06 14.46 14.71 datasetisalsorandomlyshuffledtopreventthemodelfrom beingtrainedtosomearbitrarydatapattern. Wehaveused MSE loss function and SGD solver with standard back- Table 3 lists the mean character level accuracy (CLA) propagation.Wetrainedfor50epochswiththelearningrate and word level accuracy (WLA) obtained from the OCR of 0.0001 and batch-size of 32. Figure 5 shows the train- performance as performance metrics for the quality of the ingcurveobtainedusinginputimagesfromthreelanguages HRimage. 4325 Figure6.AsampleEnglishimagewithitsimprovementinPSNR Figure8.AsampleKannadaimagewithitsimprovementinPSNR overbicubicinterpolation overbicubicinterpolation Table3.MeancharacterandwordlevelaccuraciesofCNNderived SRimages(in%)for75dpiTamilinputimages. Metric LR Input Bicubic CNN PReLU CLA 56.3 91.1 94.0 WLA 12.5 59.6 62.9 7.Conclusion Afive-layerCNNhasbeendesignedthatcanobtainhigh resolutiondocumentimagesfromsinglelowresolutionim- ages. The network’s performance scales across the three languages that it has been tested for. Further, the same trainednetworkworksondifferentinputresolutionsof50, 100and150dotsperinch. TheincreaseinthePSNRvalue oftheoutputimageoverabicubicinterpolatedimagevaries from1.5dBto7.8dBforinputimagesof50and150dpi, respectively. TheOCRaccuracyhasimprovedby3%atthe word level for Tamil images with 75 dpi input resolution. Thus,withinthetestedinputresolutionsandthelanguages, theproposedtechniqueislanguageandresolutionindepen- dent. Figure 7. A sample Tamil image with its improvement in PSNR As future work, the performance of dedicated networks overbicubicinterpolation trained on a particular language and input resolution will betestedandcomparedwiththeresultsreportedhere. References Figures [ 6 7 8] show comparison of results obtained [1] E.BackProp. Lecunetal,1998. ontestimagesusingCNN PReLUandReLUoverbicubic [2] F.Chollet.Keras.https://github.com/fchollet/ interpolatedimage. keras,2015. 4326 [3] C. Dong, C. C. Loy, K. He, and X. Tang. Image super- resolutionusingdeepconvolutionalnetworks.arXivpreprint arXiv:1501.00092,2014. [4] X. Glorot and Y. Bengio. Understanding the difficulty of trainingdeepfeedforwardneuralnetworks. InAistats,vol- ume9,pages249–256,2010. [5] X.Glorot,A.Bordes,andY.Bengio. Deepsparserectifier neuralnetworks. InAistats,volume15,page275,2011. [6] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers:Surpassinghuman-levelperformanceonimagenet classification.InProceedingsoftheIEEEInternationalCon- ferenceonComputerVision,pages1026–1034,2015. [7] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers:Surpassinghuman-levelperformanceonimagenet classification.InProceedingsoftheIEEEInternationalCon- ferenceonComputerVision,pages1026–1034,2015. [8] X.Jin,C.Xu,J.Feng,Y.Wei,J.Xiong,andS.Yan. Deep learningwiths-shapedrectifiedlinearactivationunits.arXiv preprintarXiv:1512.07030,2015. [9] Y.LeCun,Y.Bengio,andG.Hinton.Deeplearning.Nature, 521(7553):436–444,2015. [10] F.-F.Li,A.Karpathy,andJ.Johnson.Cs231n:Convolutional neuralnetworksforvisualrecognition,2015. [11] K. Nasrollahi and T. B. Moeslund. Super-resolution: a comprehensive survey. Machine vision and applications, 25(6):1423–1468,2014. [12] M.A.Nielsen. Neuralnetworksanddeeplearning. URL: http://neuralnetworksanddeeplearning.com/.(visited:01.11. 2014),2015. [13] J.A.Parker,R.V.Kenyon,andD.E.Troxel.Comparisonof interpolatingmethodsforimageresampling. IEEETransac- tionsonmedicalimaging,2(1):31–39,1983. [14] S.Ruder. Anoverviewofgradientdescentoptimizational- gorithms. arXivpreprintarXiv:1609.04747,2016. [15] TheanoDevelopmentTeam. Theano: APythonframework forfastcomputationofmathematicalexpressions. arXive- prints,abs/1605.02688,May2016. [16] B.Xu,N.Wang,T.Chen,andM.Li.Empiricalevaluationof rectifiedactivationsinconvolutionalnetwork.arXivpreprint arXiv:1505.00853,2015. [17] B.Xu,N.Wang,T.Chen,andM.Li.Empiricalevaluationof rectifiedactivationsinconvolutionalnetwork.arXivpreprint arXiv:1505.00853,2015. [18] B.Xu,N.Wang,T.Chen,andM.Li.Empiricalevaluationof rectifiedactivationsinconvolutionalnetwork.arXivpreprint arXiv:1505.00853,2015. [19] C.-Y. Yang, C. Ma, and M.-H. Yang. Single-image super- resolution: abenchmark. InEuropeanConferenceonCom- puterVision,pages372–386.Springer,2014. [20] J. Yang and T. Huang. Image super-resolution: Historical overview and future challenges. Super-resolution imaging, pages1–34,2010. [21] J.Yang,Z.Wang,Z.Lin,S.Cohen,andT.Huang. Coupled dictionarytrainingforimagesuper-resolution. IEEETrans- actionsonImageProcessing,21(8):3467–3478,2012. [22] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super- resolution via sparse representation. IEEE transactions on imageprocessing,19(11):2861–2873,2010. 4327