Enhancement and Artifact Removal for Transform Coded Document Images

Tak Shing Wong
School of Electrical and Computer Engineering
Purdue University
April, 2010

Outline

- Overview
  - The problems with JPEG decoding.
  - Rationale and related works.
- A document image model and estimation algorithm for optimized JPEG decompression [1].
  - A Bayesian reconstruction approach.
  - A document image is partitioned into 3 classes: background, text, and picture.
  - Each class is characterized by a specific prior model for improved decoding quality.
- Optimized JPEG decoding of document images using soft classification.
  - A purely post-processing approach.
  - No segmentation is needed.
  - Introduced the Hypothesis Selection Filter (HSF) as a new approach for image enhancement.

[1] Tak-Shing Wong, Charles A. Bouman, Ilya Pollak, and Zhigang Fan, “A Document Image Model and Estimation Algorithm for Optimized JPEG Decompression,” IEEE Transactions on Image Processing, vol. 18, no. 11, pp. 2518-2535, Nov. 2009.

The Problems with JPEG Decoding

[Figure: two panels, "Original" and "JPEG encoded and decoded"]

- JPEG encoding is lossy and can lead to severe image artifacts.
- Loss of high frequency content causes ringing artifacts, as apparent in the text region.
- Loss of low frequency content causes blocking artifacts, as apparent in the sky region.
- These artifacts are inconsistent with human interpretation of the image, so they are easily noticeable.

Rationale and Related Works

- Rationale:
  - In practice, JPEG is also commonly used with document images.
  - JPEG remains simpler than the more advanced schemes, e.g. JPEG 2000, DjVu, MRC.
- Objective:
  - Assume document images are compressed by the traditional JPEG encoder.
  - Improve the decoding quality of any such JPEG-compressed document image.
- Important related approaches, which target natural images:
  - Projection onto convex sets [1,2]
  - Bayesian reconstruction [3]
  - Adaptive post-processing [4]
- Relatively few works focus on document images.
[1] A. Zakhor, “Iterative procedures for reduction of blocking effects in transform image coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 2, no. 2, pp. 91-95, Mar. 1992.
[2] Y. Yang, N. Galatsanos, and A. Katsaggelos, “Regularized reconstruction to reduce blocking artifacts of block discrete cosine transform compressed images,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 6, pp. 421-432, Dec. 1993.
[3] T. O’Rourke and R. Stevenson, “Improved image decompression for reduced transform coding artifacts,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 5, no. 6, pp. 490-499, Dec. 1995.
[4] A. Averbuch, A. Schclar, and D. Donoho, “Deblocking of block-transform compressed images using weighted sums of symmetrically aligned pixels,” IEEE Transactions on Image Processing, vol. 14, no. 2, pp. 200-212, Feb. 2005.

First Approach: A Document Image Model and Estimation Algorithm for Optimized JPEG Decompression

[Figure: block diagram. The JPEG encoded image is split into Y, Cb, and Cr channels. A color channel is selected for decoding, and its DCT blocks are passed to one of three decoders chosen per image block: background block decoding, text/graphic block decoding, or picture block decoding (conventional JPEG). The results are combined into the decoded color channel. Image block segmentation is performed on the Y channel and produces the segmentation map used to choose the decoder.]

- Segment document into regions with distinct statistical attributes
  - Background
  - Text/graphics
  - Picture
- For each region, design a decoding strategy optimized for that content

Classes of Content

[Figure: example document image with regions labeled "picture", "text", and "background"]

- Decompose image blocks into three classes:
  - Background
    - Background and smooth regions
    - Low AC energy
    - Vulnerable to blocking artifacts
  - Text/graphic
    - High density of sharp edges
    - Susceptible to ringing artifacts
  - Picture
    - Non-smooth part of natural images
    - Conventional JPEG decoding

JPEG Decoding as an Inverse Problem

- JPEG decoding as an inverse problem
  - Solution is non-unique ⇒ inverse is ill-posed.
    - JPEG quantization is a many-to-one mapping.
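    - Many image blocks would be quantized to the same set of DCT coefficients.
  - Prior model should be different for background, text, and picture regions.
  - Maximum a posteriori (MAP) estimate can be computed using optimization.
- Forward model for JPEG encoding

[Figure: JPEG encoder block diagram, $x_s$ → DCT → $y_s$ → Quantizer → $\tilde{y}_s$]

$$y_s = D x_s, \qquad \tilde{y}_s = Q(y_s)$$

  - Probability of data is either 1 or 0

$$p(\tilde{y}_s \mid x_s) = \begin{cases} 1, & \text{if } Q(D x_s) = \tilde{y}_s \\ 0, & \text{otherwise} \end{cases}$$

MAP Reconstruction

[Figure: the conventional JPEG decoded block shown in the spatial domain (axes $x_0$, $x_1$) and, after the DCT (basis rotation), in the frequency domain (axes $y_0$, $y_1$) with quantization step sizes $Q_0$ and $Q_1$. Legend:
  $\tilde{x}_{s,i}$: i-th pixel of conventional JPEG decoding
  $\tilde{y}_{s,i}$: i-th quantized DCT coefficient
  $Q_i$: i-th quantization step size
  marked points: possible estimates of $x_s$]

- The MAP reconstruction of the document is given by

$$\hat{x}_s = \arg\min_{x_s} \left\{ -\log p(\tilde{y}_s \mid x_s) - \log p(x_s) \right\} \tag{1}$$

$$\hat{x}_s = \arg\min_{x_s :\, Q(D x_s) = \tilde{y}_s} \left\{ -\log p(x_s) \right\} \tag{2}$$

  - The forward model is a binary constraint for the estimate of $x_s$. Because $p(\tilde{y}_s \mid x_s)$ is either 1 or 0, the term $-\log p(\tilde{y}_s \mid x_s)$ is 0 when $Q(D x_s) = \tilde{y}_s$ and $+\infty$ otherwise, so (1) reduces to the constrained problem (2).

Prior Model for Luminance Text Block

- Model for text
  - Each JPEG block consists of two predominant colors
    - $c_1$: background color
    - $c_2$: foreground color
  - Colors are mixed together for each pixel using an alpha channel $\alpha_i$

$$x_i = \alpha_i c_1 + (1 - \alpha_i) c_2 + \text{noise}, \qquad 0 \le \alpha_i \le 1$$

- Distribution of pixels

$$p(x \mid c_1, c_2, \alpha) = \frac{1}{\text{const}} \exp\left\{ -\frac{1}{2\sigma_W^2} \sum_{i \,\in\, \text{block}} \left| x_i - \alpha_i c_1 - (1 - \alpha_i) c_2 \right|^2 \right\}$$

- The foreground and background colors of image blocks form an MRF

$$p(c_1, c_2) = \frac{1}{\text{const}} \exp\left\{ -\frac{1}{2\sigma_C^2} \sum_{(r,s)\ \text{neighboring text blocks}} \left[ \rho(c_{1,r} - c_{1,s}) + \rho(c_{2,r} - c_{2,s}) \right] \right\}$$

  where $\rho(t) = \min(t^2, \tau^2)$

[Figure: plot of $p(\alpha_{s,i})$ against $\alpha_{s,i}$, horizontal axis from −30 to 30]

- The alpha values are i.i.d.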
  on the interval [0,1]

$$p(\alpha_i) = \begin{cases} \dfrac{1}{\text{const}} \exp\left\{ \nu \left| \alpha_i - 0.5 \right|^2 \right\}, & \text{if } \alpha_i \in [0,1] \\ 0, & \text{otherwise} \end{cases}$$

[Figure: plot of $p(\alpha_{s,i})$ against $\alpha_{s,i}$, horizontal axis from −0.5 to 1.5]

Prior Model for Luminance Background Blocks

- Model for background blocks
  - Each background block is characterized by its DC coefficient.
  - DC coefficients are modeled as a Gaussian MRF.
- Distribution of DC coefficients

$$p(x) = \frac{1}{\text{const}} \exp\left( -\frac{1}{2\sigma_B^2} \sum_{\{r,s\} \in K_{bb}} \left| y_{s,0} - y_{r,0} \right|^2 \right)$$

  $y_{s,0}$: DC coefficient for block $s$
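
As a rough illustration of two ingredients from these slides, the Python sketch below evaluates the binary forward-model likelihood and the negative log of the background prior. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the quantizer is assumed to round to the nearest multiple of the step size, the likelihood is checked directly on DCT coefficients y = Dx rather than on pixels, and the neighbor set K_bb is assumed to be all horizontally and vertically adjacent blocks of a grid in which every block is background.

```python
import math


def quantize(y, q):
    """Q(y): map each DCT coefficient to the nearest multiple of its step size.

    Round-half-up is an assumption made for this sketch.
    """
    return [qi * math.floor(yi / qi + 0.5) for yi, qi in zip(y, q)]


def likelihood(y_tilde, y, q):
    """p(y_tilde | x) with y = D x: 1 if y quantizes to y_tilde, else 0."""
    return 1.0 if quantize(y, q) == y_tilde else 0.0


def background_neg_log_prior(dc, sigma_b):
    """-log p(x), up to an additive constant, for a 2-D grid of DC coefficients.

    Sums squared differences over horizontally and vertically adjacent
    blocks (an assumed choice of K_bb) and divides by 2 * sigma_b^2.
    """
    rows, cols = len(dc), len(dc[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right-hand neighbor
                total += (dc[r][c] - dc[r][c + 1]) ** 2
            if r + 1 < rows:  # neighbor below
                total += (dc[r][c] - dc[r + 1][c]) ** 2
    return total / (2.0 * sigma_b ** 2)


# Two different coefficient vectors that fall in the same quantization cell.
steps = [4, 2]
y_a = [10.2, -3.9]
y_b = [13.0, -4.5]
y_tilde = quantize(y_a, steps)

print(y_tilde)
print(likelihood(y_tilde, y_b, steps))          # inside the cell
print(likelihood(y_tilde, [20.0, 0.0], steps))  # outside the cell
print(background_neg_log_prior([[1, 2], [3, 5]], 1.0))
```

Both `y_a` and `y_b` quantize to the same values, which is the many-to-one behavior that makes the inverse problem ill-posed; the prior is what selects among the candidates with likelihood 1. A smoother grid of DC coefficients yields a smaller negative log prior, which matches the goal of suppressing blocking artifacts in background regions.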