
Lectures on image processing PDF

92 Pages·2000·0.404 MB·English

Preview Lectures on image processing

Lecture 2: Image Processing Review, Neighbors, Connected Components, and Distance

© Bryan S. Morse, Brigham Young University, 1998–2000
Last modified on January 6, 2000 at 3:00 PM

Reading
SH&B, Chapter 2

2.1 Review of CS 450

2.1.1 Image Basics

Image Domains

An image (picture) can be thought of as being a function of two spatial dimensions:

$$f(x,y) \tag{2.1}$$

For monochromatic images, the value of the function is the amount of light at that point.

Sometimes, we can go to even higher dimensions with various imaging modalities. Medical CAT and MRI scanners produce images that are functions of three spatial dimensions:

$$f(x,y,z) \tag{2.2}$$

An image may also be of the form

$$f(x,y,t) \tag{2.3}$$

(x and y are spatial dimensions; t is time.) Since the image is of some quantity that varies over two spatial dimensions and also over time, this is a video signal, animation, or other time-varying picture sequence.

Be careful: although both volumes and time-varying sequences are three-parameter types of images, they are not the same! For this course, we'll generally stick to static, two-dimensional images.

The Varying Quantities

The values in an image can be of many types. Some of these quantities can be scalars:

• Monochromatic images have a single light intensity value at each point.

Sometimes, these scalars don't correspond to quantities such as light or sound:

• In X-ray imaging, the value at each point corresponds to the attenuation of the X-ray beam at that position (i.e., not the radiation that gets through but the amount that doesn't get through).
• In one form of MR imaging, the value at each point indicates the number of single-proton atoms (i.e., hydrogen) in that area.
• Range images encode at each point in an image the distance to the nearest object at that point, not its intensity. (Nearer objects are brighter, farther objects are darker, etc.)

Some signals don't have scalar quantities but vector quantities; in other words, they have multiple values at each point.

• Color images are usually stored as their red, green, and blue components at each point. These can be thought of as a 3-dimensional vector at each point of the image (a 2-dimensional space). Each color is sometimes called a channel.
• Satellite imaging often involves not only visible light but other forms as well (e.g., thermal imaging). LANDSAT images have seven distinct channels.

Sampling and Quantization

The spacing of discrete values in the domain of an image is called the sampling of that image. This is usually described in terms of some sampling rate: how many samples are taken per unit of each dimension. Examples include "dots per inch", etc.

The spacing of discrete values in the range of an image is called the quantization of that image. Quantization is usually thought of as the number of bits per pixel. Examples include "black and white images" (1 bit per pixel), "24-bit color images", etc.

Sampling and quantization are independent, and each plays a significant role in the resulting signal.

Resolution

Sampling and quantization alone, though, don't tell the whole story. Each discrete sample is usually the result of some averaging of the values around that sample. (We just can't make physical devices with infinitely small sampling areas.) The combination of the sampling and the averaging area for each sample determines the resolution of the digital signal.

For example, a digital monitor with 5000x5000 pixels and 24-bit color may sound great, but look before you buy. If the pixels are 0.1 mm apart but each pixel has a 10 mm spread, would you buy it?

Note: This differs from the definition of resolution given in your textbook, which defines resolution simply as the sampling rate. This is a common misconception. Resolution is the ability to discern fine detail in the image, and while the sampling rate is a factor, it is not the only factor.
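The independence of sampling and quantization can be illustrated with a small sketch. The function names `quantize` and `subsample`, the 0 to 255 intensity range, and the choice of the lower bin edge as the representative value are assumptions made for illustration; they do not come from the lecture or the textbook.

```python
def quantize(value, bits, max_value=255):
    """Map an intensity in [0, max_value] onto one of 2**bits evenly spaced levels.

    Assumption: each bin is represented by its lower edge.
    """
    levels = 2 ** bits
    step = (max_value + 1) / levels           # width of one quantization bin
    index = min(int(value / step), levels - 1)
    return int(index * step)


def subsample(image, factor):
    """Keep every factor-th pixel in each dimension (no averaging is done)."""
    return [row[::factor] for row in image[::factor]]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

coarser_domain = subsample(image, 2)          # fewer samples, same value range
coarser_range = [[quantize(v * 16 - 1, 2) for v in row] for row in image]
```

Here `subsample` changes only the spacing in the domain, while `quantize` changes only the spacing in the range. Note that `subsample` deliberately ignores the averaging area discussed under Resolution, so it models the sampling rate alone.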
2.1.2 The Delta Function

The Dirac delta function is defined as

$$\delta(x,y) = \begin{cases} \infty & \text{if } x = 0 \text{ and } y = 0 \\ 0 & \text{otherwise} \end{cases}$$

and

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \delta(x,y)\,dx\,dy = 1$$

For discrete images, we use a discrete version of this function known as the Kronecker delta function:

$$\delta(x,y) = \begin{cases} 1 & \text{if } x = 0 \text{ and } y = 0 \\ 0 & \text{otherwise} \end{cases}$$

and (as you'd expect)

$$\sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty} \delta(x,y) = 1$$

One important property of the delta function is the sifting property:

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,\delta(x-a,y-b)\,dx\,dy = f(a,b)$$

2.1.3 Convolution

One of the most useful operations in image processing is convolution:

$$g(x,y) = f(x,y) * h(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(a,b)\,h(x-a,y-b)\,da\,db$$

Remember that convolution was useful for describing the operation of a linear and shift-invariant system. If h(x,y) is the system's response to a delta function (impulse), the output of the system for any function f(x,y) is f(x,y) ∗ h(x,y).

Many useful systems can be designed by simply designing the desired convolution kernel.

2.1.4 The Fourier Transform

Another way of characterizing the operation of a linear, shift-invariant system is the Fourier Transform.

Remember that any image can be derived as the weighted sum of a number of sinusoidal images of different frequencies:

$$[a\cos(2\pi ux) + b\sin(2\pi ux)]\,[a\cos(2\pi vy) + b\sin(2\pi vy)]$$

where u is the frequency in the x direction and v is the frequency in the y direction.

For mathematical convenience and for more compact notation, we often write these using complex arithmetic by putting the cosine portion of these images as the real part of a complex number and the sine portion of these images as the imaginary part:

$$e^{i 2\pi ux} = \cos(2\pi ux) + i\sin(2\pi ux)$$

To get the weights we use the Fourier Transform, denoted as $\mathcal{F}$:

$$F(u,v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,e^{-i 2\pi(ux+vy)}\,dx\,dy$$

And to recombine the weighted sinusoids we use the Inverse Fourier Transform, denoted $\mathcal{F}^{-1}$:

$$f(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u,v)\,e^{i 2\pi(ux+vy)}\,du\,dv$$

What's unique about these sinusoidal images is that each goes through the system unchanged other than amplification. This amplification differs according to frequency.
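A minimal one-dimensional sketch of this property follows. It assumes a periodic (circular) signal, an integer frequency `u`, and a small symmetric blur kernel; all three are illustrative choices, as is the helper name `circular_convolve`. A symmetric kernel is chosen so that the gain is a real number; a general kernel would also shift the phase of the sinusoid.

```python
import math


def circular_convolve(f, taps):
    """Circular convolution of a periodic signal f with a kernel.

    taps maps each kernel offset k to its weight h(k).
    """
    n = len(f)
    return [sum(w * f[(i - k) % n] for k, w in taps.items()) for i in range(n)]


N = 32                                    # samples per period (assumed)
u = 3                                     # frequency in cycles per N samples (assumed)
kernel = {-1: 0.25, 0: 0.5, 1: 0.25}      # symmetric three-tap blur (assumed)

f = [math.cos(2 * math.pi * u * x / N) for x in range(N)]
g = circular_convolve(f, kernel)

# For this kernel the amplification at frequency u works out to
# 0.5 + 0.5*cos(2*pi*u/N), so the output is the same cosine, only scaled.
gain = 0.5 + 0.5 * math.cos(2 * math.pi * u / N)
max_error = max(abs(gi - gain * fi) for gi, fi in zip(g, f))
```

Changing `u` changes `gain` but never the shape of the output, which is the sense in which each frequency passes through the system unchanged apart from amplification.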
So, we can describe the operation of a system by measuring how each frequency goes through the system, a quantity H(u,v) called the system's transfer function. We can describe what happens to a particular image going through that system by decomposing the image into its weights for each frequency (F(u,v)), multiplying each component by its relative amplification (F(u,v)H(u,v)), and recombining the weighted, multiplied components.

So, the Fourier Transform of the output is the Fourier Transform of the input with each frequency component amplified differently. If F(u,v) is the Fourier Transform of the input f(x,y) and G(u,v) is the Fourier Transform of the output g(x,y),

$$G(u,v) = F(u,v)\,H(u,v)$$

2.1.5 The Convolution Theorem

The convolution theorem states that

$$g(x,y) = f(x,y) * h(x,y) \quad\text{implies}\quad G(u,v) = F(u,v)\,H(u,v)$$

and similarly (though often overlooked):

$$g(x,y) = f(x,y)\,h(x,y) \quad\text{implies}\quad G(u,v) = F(u,v) * H(u,v)$$

Notice that this implies that the transfer function H(u,v) is the Fourier transform of the impulse response h(x,y).

2.1.6 Linear Systems

To summarize, a linear shift-invariant system, its input f(x,y), the Fourier transform of its input F(u,v), its output g(x,y), and the transform of its output G(u,v) are related as follows:

1. The output g(x,y) is the convolution of the input f(x,y) and the impulse response h(x,y).
2. The transform G(u,v) of the output is the product of the transform F(u,v) of the input and the transfer function H(u,v).
3. The transfer function is the Fourier transform of the impulse response.
4. The Convolution Theorem states that convolution in one domain is multiplication in the other domain, and vice versa.
5. It doesn't matter in which domain you choose to model or implement the operation of the system; mathematically, it is the same.

2.1.7 Sampling

Remember that a digital image is made up of samples from a continuous field of light (or some other quantity). If the field is undersampled, artifacts can be produced. Shannon's sampling theorem states that if an image is sampled at less than twice the frequency of the highest frequency component in the continuous source image, aliasing results.

2.1.8 Color Images

In CS 450, you covered color spaces and color models.
One of the key ideas from this is that rather than describing color as simply (red, green, blue) components, we can also describe it in a variety of ways as an intensity component and two chromaticity components. This is useful for vision because for some applications we may wish to operate on (analyze) the intensity or hue components independently.

2.1.9 Histograms

As you learned in CS 450, a histogram of an image can be a useful tool in adjusting intensity levels. It can also be a useful tool in analyzing images, as we'll see later in Section 5.1 of your text.

2.1.10 Noise

Remember also from CS 450 that images usually have noise. We usually model this noise as

$$g(x,y) = f(x,y) + \tilde{n}(x,y)$$

where the noise $\tilde{n}(x,y)$ is added to the "real" input f(x,y). So, although we'd ideally like to be analyzing f(x,y), all we really have to work with is the noise-added g(x,y).

2.2 Playing on the Pixel Grid: Connectivity

Many of the simplest computer vision algorithms involve what I (and others) call "playing on the pixel grid". These are algorithms that essentially involve operations between neighboring pixels on a rectangular lattice. While these algorithms are usually simple, they are often very useful and can sometimes become more complex.

2.2.1 Neighborhoods and Connectivity

One simple relationship between pixels is connectivity: which pixels are "next to" which others? Can you "get to" one pixel from another? If so, how "far" is it?

Suppose that we consider as neighbors only the four pixels that share an edge (not a corner) with the pixel in question: (x+1,y), (x-1,y), (x,y+1), and (x,y-1). These are called "4-connected" neighbors for obvious reasons.

Figure 2.1: 4-connected neighbors.

Now consider the following:

Figure 2.2: Paradox of 4-connected neighbors.

The black pixels on the diagonal in Fig. 2.2 are not 4-connected. However, they serve as an effective insulator between the two sets of white pixels, which are also not 4-connected across the black pixels. This creates undesirable topological anomalies.

An alternative is to consider a pixel as connected not just to the adjacent pixels on the same row or column, but also to the diagonal pixels. The four 4-connected pixels plus the diagonal pixels are called "8-connected" neighbors, again for obvious reasons.
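The two neighborhood definitions above can be written down directly as offset lists. This is a minimal sketch; the names `N4`, `N8`, and `neighbors`, and the convention of discarding neighbors that fall outside the image, are assumptions for illustration.

```python
# Offsets (dx, dy) of the 4-connected neighbors: pixels sharing an edge.
N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]

# The 8-connected neighbors add the four diagonal pixels.
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def neighbors(x, y, width, height, offsets=N4):
    """Return the neighbors of (x, y) that lie inside a width-by-height image."""
    return [
        (x + dx, y + dy)
        for dx, dy in offsets
        if 0 <= x + dx < width and 0 <= y + dy < height
    ]


center_4 = neighbors(1, 1, 3, 3)          # interior pixel, 4-connectivity
center_8 = neighbors(1, 1, 3, 3, N8)      # interior pixel, 8-connectivity
corner_4 = neighbors(0, 0, 3, 3)          # corner pixel keeps only two neighbors
```

Passing the offset list as a parameter lets the same traversal code run under either connectivity.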
Figure 2.3: 8-connected neighbors.

But again, a topological anomaly occurs in the case shown in Figure 2.2. The black pixels on the diagonal are connected, but then again so are the white background pixels. Some pixels are connected across the links between other connected pixels!

The usual solution is to use 4-connectivity for the foreground with 8-connectivity for the background or to use 8-connectivity for the foreground with 4-connectivity for the background, as illustrated in Fig. 2.4.

Figure 2.4: Solution to paradoxes of 4-connected and 8-connected neighbors: use different connectivity for the foreground and background.

Another form of connectivity is "mixed-connectivity" (Fig. 2.5), a form of 8-connectivity that considers diagonally-adjacent pixels to be connected if no shared 4-connected neighbor exists. (In other words, use 4-connectivity where possible and 8-connectivity where not.)

Figure 2.5: Mixed connectivity.

2.2.2 Properties of Connectivity

For simplicity, we will consider a pixel to be connected to itself (trivial connectivity). In this way, connectivity is reflexive.

It is pretty easy to see that connectivity is also symmetric: a pixel and its neighbor are mutually connected.

4-connectivity and 8-connectivity are also transitive: if pixel A is connected to pixel B, and pixel B is connected to pixel C, then there exists a connected path between pixels A and C.

A relation (such as connectivity) is called an equivalence relation if it is reflexive, symmetric, and transitive.

2.2.3 Connected Component Labeling

If one finds all equivalence classes of connected pixels in a binary image, this is called connected component labeling. The result of connected component labeling is another image in which everything in one connected region is labeled "1" (for example), everything in another connected region is labeled "2", etc.

Can you think of ways to do connected component labeling? Here is one algorithm:

1. Scan through the image pixel by pixel across each row in order:
   • If the pixel has no connected neighbors with the same value that have already been labeled, create a new unique label and assign it to that pixel.
   • If the pixel has exactly one label among its connected neighbors with the same value that have already been labeled, give it that label.
   • If the pixel has two or more connected neighbors with the same value but different labels, choose one of the labels and remember that these labels are equivalent.
2. Resolve the equivalences by making another pass through the image and labeling each pixel with a unique label for its equivalence class.

Algorithm 2.1: One algorithm for connected component labeling

A variation of this algorithm does not keep track of equivalence classes during the labeling process but instead makes multiple passes through the labeled image resolving the labels. It does so by updating each label that has a neighbor with a lower-valued label. Since this process may require multiple passes through the label image to resolve the equivalence classes, these passes usually alternate top-to-bottom, left-to-right and bottom-to-top, right-to-left to speed label propagation.

You will implement this algorithm (or a similar one of your choosing) as part of your second programming assignment.

2.3 Distances Between Pixels

It is often useful to describe the distance between two pixels $(x_1,y_1)$ and $(x_2,y_2)$.

• One obvious measure is the Euclidean (as the crow flies) distance

$$\left[(x_1-x_2)^2 + (y_1-y_2)^2\right]^{1/2}$$

• Another measure is the 4-connected distance $D_4$ (sometimes called city-block distance)

$$|x_1-x_2| + |y_1-y_2|$$

• A third measure is the 8-connected distance $D_8$ (sometimes called chessboard distance)

$$\max(|x_1-x_2|,\,|y_1-y_2|)$$

For those familiar with vector norms, these correspond to the $L_2$, $L_1$, and $L_\infty$ norms.

2.3.1 Distance Maps

It is often useful to construct a distance map (sometimes called a chamfer) for a region of pixels. The idea is to label each pixel with the minimum distance from that pixel to the boundary of the region. Calculating this precisely for Euclidean distance can be computationally intensive, but doing so for the city-block, chessboard, or similar measures can be done iteratively. (See Algorithm 2.1 on page 28 of your text.)

2.4 Other Topological Properties

2.4.1 Convex Hull

The convex hull of a region is the minimal convex region that entirely encompasses it.
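Looking back at Algorithm 2.1 of Section 2.2.3, the two passes can be sketched as follows. This is one possible realization under stated assumptions, not the assignment solution: it uses 4-connectivity for every pixel value, records label equivalences in a union-find array (a technique the lecture does not name), and renumbers the final labels consecutively from 1. The name `label_components` is hypothetical.

```python
def label_components(image):
    """Two-pass connected component labeling with 4-connectivity.

    Every pixel is labeled; pixels are connected only to neighbors
    holding the same value.
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]                      # parent[i] links label i toward its class root

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # shorten the chain as we walk it
            a = parent[a]
        return a

    # Pass 1: scan row by row, looking only at already-labeled neighbors
    # (left and above) that hold the same value.
    for y in range(height):
        for x in range(width):
            seen = []
            if x > 0 and image[y][x - 1] == image[y][x]:
                seen.append(labels[y][x - 1])
            if y > 0 and image[y - 1][x] == image[y][x]:
                seen.append(labels[y - 1][x])
            if not seen:
                parent.append(len(parent))          # create a new unique label
                labels[y][x] = len(parent) - 1
            else:
                roots = [find(s) for s in seen]
                smallest = min(roots)
                labels[y][x] = smallest
                for r in roots:
                    parent[r] = smallest            # remember the equivalence

    # Pass 2: replace each label by a consecutive label for its class.
    final = {}
    for y in range(height):
        for x in range(width):
            root = find(labels[y][x])
            labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels


u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
diagonal = [[1, 0],
            [0, 1]]

u_labels = label_components(u_shape)
diag_labels = label_components(diagonal)
```

In `u_shape` the two arms receive different provisional labels and are merged when the scan reaches the bottom-right pixel. In `diagonal`, 4-connectivity leaves all four pixels in separate components, which is the anomaly discussed with Figure 2.2.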
2.4.2 Holes, Lakes, and Bays

One simple way of describing a shape is to consider the connected regions inside the convex hull but not in the shape of interest.

2.5 Edges and Boundaries

An edge is a pixel that has geometric properties indicative of a strong transition from one region to another. There are many ways to find edges, as we'll talk about later, but for now simply think of it as a strong transition.

Sometimes, we want to consider edges not as lying at pixels but as separating pixels. These are called crack edges.

Another concept is that of a border. Once a region is identified, its border is all pixels in the region that are adjacent to pixels outside the region. One would hope that the set of boundary pixels and edge pixels are the same, but this is rarely so simple.

New Vocabulary

• 4-connected neighbor
• 8-connected neighbor
• mixed-connected neighbor
• Connected Component Labeling
• Distance metrics (Euclidean, city block, chessboard)
• Distance map
• Convex hull
• Edge
• Crack edge
• Border

Lecture 3: Data Structures for Image Analysis

© Bryan S. Morse, Brigham Young University, 1998–2000
Last modified on Monday, January 10, 2000 at 9:30 PM.

Reading
SH&B, Chapter 3

3.1 Introduction

If you're going to analyze the content of an image, you first have to develop ways of representing image information.

3.2 Image Maps

The simplest way to store information about an image is on a per-pixel basis. The idea is to build a one-to-one mapping between information and pixels. Of course, the easiest way to do this is with other images of the same size. So, while you may have an original image that you're trying to analyze, you may also have any number of other images that store information about the corresponding pixel in the original image. Such information may include

• region labels (which object does this pixel belong to)
• local geometric information (derivatives, etc.)
• distances (the distance maps we saw in the last lecture)
• etc.

3.3 Chains

One step up from image maps is to store information on a per-region basis. The most basic thing to store about a region is where it is. One could store the location of each pixel in the region, but one can be more efficient by simply storing the border.
By traversing the border in a pre-defined direction around the region, one can build a chain. Instead of encoding the actual pixel locations, we really need to encode only their relative relationships: a chain code. For 4-connected borders, we can do this with only two bits per pixel. For 8-connected borders, we need three bits per pixel. An example of a chain code is shown in Figure 3.1 of your text.

Later on, we'll talk about other ways of representing chains and using these encodings to extract object shape information.

3.4 Run-Length Encoding

In CS 450, you probably talked about run-length encoding as a way of compressing image content. We can also use run-length encoding to store regions (or images, or image maps). An example run-length encoding is given in Figure 3.2 of your text.

3.5 Hierarchical Image Structures

Another way of storing image information is hierarchically. There are many ways to do this, but we'll talk about two methods here: pyramids and trees.

3.5.1 Pyramids

An image pyramid is a hierarchy of successively lower-resolution images based on the original image. Each stage in building the pyramid involves the following two steps:

1. Blur the image by some resolution-reducing kernel (low-pass filtering).
2. Because the image has now been low-pass filtered, one can reduce the sampling rate accordingly.

Notice that there are no constraints on the type of low-pass filtering done or the reduction in sampling rate other than the constraint imposed by the sampling theorem.

The most common way of building a pyramid is to do a 2-to-1 reduction in the sampling rate. This makes things convenient for storing the image, mapping lower-resolution pixels to higher-resolution pixels, etc. The simplest way to blur the image so as to reduce the resolution by a factor of 2 is with a 2×2 uniform kernel. Done this way, each pixel in the pyramid is simply the average of the corresponding 2×2 region in the next higher-resolution image.

The problem with this type of reduction is that the "footprints" of the lower-resolution pixels don't overlap in the higher-resolution image. This means that the upper levels of the pyramid can be extremely sensitive to single-pixel shifts in the original image.
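The 2×2 reduction and its sensitivity to shifts can be seen in a small sketch. It assumes a square image whose side is a power of two, stored as a list of rows; the names `reduce_2x2` and `build_pyramid` are hypothetical.

```python
def reduce_2x2(image):
    """One pyramid step: average each non-overlapping 2x2 block.

    Assumes the image has an even number of rows and columns.
    """
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, len(image[0]), 2)
        ]
        for y in range(0, len(image), 2)
    ]


def build_pyramid(image):
    """Return [original, half size, quarter size, ...] down to a single pixel.

    Assumes a square image whose side is a power of two.
    """
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(reduce_2x2(levels[-1]))
    return levels


# A bright 2x2 square aligned with the block grid ...
aligned = [[4, 4, 0, 0],
           [4, 4, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

# ... and the same square shifted right by a single pixel.
shifted = [[0, 4, 4, 0],
           [0, 4, 4, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]

aligned_half = reduce_2x2(aligned)   # [[4.0, 0.0], [0.0, 0.0]]
shifted_half = reduce_2x2(shifted)   # [[2.0, 2.0], [0.0, 0.0]]
```

The one-pixel shift turns a single full-strength pixel into two half-strength pixels at the next level, even though the underlying object is unchanged.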
Ideally, we want to build pyramidal representations that are invariant to such translation.

It's usually better, then, to build pyramids by using footprints that overlap spatially in the original image. One can use small triangles of finite extent, Gaussians, etc.

Pyramids can be used to greatly speed up analysis algorithms. They provide a "divide and conquer" approach to vision algorithms. We'll see various examples of such pyramidal algorithms as the semester progresses.

3.5.2 Trees

Another way of representing images is to store large, coherent regions using a single piece of data. One example of this is the quadtree. A quadtree is built by breaking the image into four equal-size pieces. If any one of these pieces is homogeneous (in whatever property you're using the tree to represent), don't subdivide it any further. If the region isn't homogeneous, split it into four equal-size subregions and repeat the process. The resulting representation is a tree of degree 4 where the root of the tree is the entire image, the four child nodes are the four initial subregions, their children (if any) are their subregions, etc. The leaves of the tree correspond to homogeneous regions. An example of a quadtree is shown in Figure 3.6 of your text.

Quadtrees have been used for compression, rapid searching, or other applications where it is useful to stop processing homogeneous regions.

Notice that although built differently (bottom-up vs. top-down), reduce-by-two pyramids and quadtrees are almost the same. One just uses images and the other uses trees. Pyramids have the advantage that it is easy to do spatial operations at any level of the hierarchy; trees have the advantage that they don't store redundant information for homogeneous regions.

3.6 Relation Graphs

Once you've segmented an image into regions that (you hope) correspond to objects, you may want to know the spatial relationships between the regions. By recording which regions are next to which other regions, you can build a graph that describes these relationships. Such a graph is called a region adjacency graph. An example of such a graph is in Figure 3.3 of your text.
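A region adjacency graph can be built from a label image in a single scan. The sketch below assumes 4-connected adjacency and represents the graph as a dictionary mapping each label to the set of labels it touches; the name `region_adjacency` is hypothetical, and no node is added for the area outside the image.

```python
def region_adjacency(labels):
    """Build {label: set of adjacent labels} from a label image.

    Two regions are adjacent when some pair of their pixels shares an edge.
    """
    height, width = len(labels), len(labels[0])
    graph = {}
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            graph.setdefault(a, set())
            # Checking only the right and lower neighbors visits each
            # edge-sharing pair of pixels exactly once.
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height:
                    b = labels[ny][nx]
                    if b != a:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


# Region 2 is completely surrounded by region 1.
enclosed = [[1, 1, 1],
            [1, 2, 1],
            [1, 1, 1]]

# Three regions that all touch one another.
mutual = [[1, 2],
          [3, 3]]

enclosed_graph = region_adjacency(enclosed)   # {1: {2}, 2: {1}}
mutual_graph = region_adjacency(mutual)
```

In `enclosed_graph`, region 2 has a single neighbor, so its node has degree 1.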
Notice that nodes of degree 1 are inside the region that they're adjacent to. (It's common to use a node to represent everything outside the image.)

3.7 Co-occurrence Matrices

Suppose that you want to record how often certain transitions occur as you go from one pixel to another. Define a spatial relationship r such as "to the left of", "above", etc. The co-occurrence matrix $C_r$ for this relationship r counts the number of times that a pixel with value i occurs with relationship r with a pixel with value j. Co-occurrence matrices are mainly used to describe region texture (and we'll come back to them then), but they can also be used on image maps to measure how often pixels with certain labels occur with certain relationships to other labels.

Vocabulary

• image map
• chain
• run-length encoding
• pyramid
• quadtree
• region adjacency graph
