AUTOMATIC SYSTEM FOR COUNTING CELLS WITH ELLIPTICAL SHAPE WesleyNunesGonc¸alves,OdemirMartinezBruno InstitutodeF´ısicadeSa˜oCarlos-UniversidadedeSa˜oPaulo Av. TrabalhadorSa˜o-carlense,400-CEP:13560-970 Cx. Postal369-Sa˜oCarlos-SP-Brasil [email protected],[email protected] Abstract – This paper presents a new method for automatic quantification of ellipse-like cells in images, an important and challengingproblemthathasbeenstudiedbythecomputervisioncommunity. Theproposedmethodcanbedescribedbytwo mainsteps. Initially,imagesegmentationbasedonthek-meansalgorithmisperformedtoseparatedifferenttypesofcellsfrom thebackground. Then, arobustand efficientstrategyis performedonthe blobcontourfor touchingcellssplitting. Due tothe contourprocessing,themethodachievesexcellentresultsofdetectioncomparedtomanualdetectionperformedbyspecialists. 2 Keywords–Touchingcellssplitting,computervision,patternrecognition. 1 0 2 1.Introduction n a Imageanalysismethodsforidentifyingandquantifyingobjects(e.g. bloodcell,bacteria,nanostructure)areanessentialtask J formanyresearchareas.Inmicrobiology,forinstance,examiningandquantifyingcellsbymicroscopyhasbeenacentralmethod 5 forstudyingcellularfunction,suchastheestimationofparasitemiafrommicroscopyimagesofblood[1]andthequantification 1 ofcelladhesionforunderstandingphysiologicalphenomena. Quantifyingcellsinimagesbecomesevenmoreimportantbecause ] inmostcases,thesequenceoftheresearchdependsontheresultsobtainedinthisstep. V Usually, cell counting is performed in a manual process, which takes hours or even days of work. Due to human factors C such as fatigue and distraction, the results obtained by manual counting are not completely reliable or reproducible. Thus, . automationofthisprocesshasattractedincreasingattentionfromcomputervisioncommunity. Besidesprovidingmorereliable s c andreproducibleresults,automaticcellcountingalsoprovidesstatisticsofthecellsthatahumanbeingisunabletoestimate,as [ area,perimeterandvolume. Severalmethodsforcountingcellsinimageshavebeenproposedintheliterature. Alargeportionofthemethodsisbased 1 v onthewatershedalgorithm[2–4],whosebasicideaistofloodanimageasatopographyrelief. Althoughlessexplored,active 9 contours[5,6]andregiongrowingmethods[7,8]havebeenusedinseveralothermethodsandobtainedinterestingresults. There 0 aremethodsthatcountingcellsinimagesbasedonmorphologicaloperations[9,10]andmethodsthatuseprioriinformationof 1 theobjectshape[11,12]. Althoughcellcountinghavebeenheavilystudiedbycomputervisioncommunity,mostmethodsdoes 3 notprovidesatisfactoryresultsforimageswithcomplextouchingcells[13]. . 1 Thispaperproposesanapproachforautomaticcountingofcellsinimagesthatcombinesk-meanssegmentationandellipse 0 fitting. Different types of cells in the same image (e.g. cells of different colors) are segmented from the background using k- 2 1 meansalgorithm. Aftersegmentation,ellipsefittingisperformedonthecontourofblobstoseparatetouchingcells. Twosetof : experimentswerecarriedoutusingthreetypesofcellimages.Thefirstexperimentaimstoevaluatetheproposedmethodbyusing v imagesmarkedbyspecialists.ThisexperimentwasperformedinimageswithhighdensityoftheLactobacillusparacaseibacteria. i X Thesebacteriaarefoundinhumanbeingmouthandareresponsibleforthemajorityofdiseasessuchascaries. Furthermore,the r secondexperimentwasperformedinimagescontainingalargenumberofthreetypesoftouchingcells. Inbothapplications,the a proposedapproachprovidedexcellentresultscomparedtothemanualannotationperformedbyaspecialist. The paper is described in four sections. In Section 2, the proposed approach for counting cells is described from the pre- processing and segmentation of images to the ellipse fitting that provides the separation of touching cells. Experiments and resultsforthreetypesofimages(imagesofbacteriaintwostagesandbloodcells)arepresentedinSection3. Finally,inSection 4,conclusionandfutureworksarediscussed. 2.ProposedApproach Proposed approach is summarized in Figure 1. Initially, a pre-processing is applied in order to enhance the image. Then, cellsaresegmentedfromthebackgroundbyusingk-meansalgorithmwithk =n+1groups(ntypesofcellsandbackground). Finally,blobscontainingmorethanonecellaredividedintosegments. Thesesegmentsaredeterminedbyconcavepointsonthe contourandthenanellipseisfittedforeachsegment. 2.1 Pre-processing In order to enhance image contrast, images are preprocessed using the decorrelation stretch method [14]. Decorrelation stretchingmethodisbasedonprincipalcomponentstransformationtoeliminatethecorrelationbetweenbands(e.g. RGBcolor Figure1: Summarizationoftheproposedapproach. First,pre-processingisappliedintheimage. Second,cellsaresegmented usingk-meanswithk =n+1. Then,contourofblobsaredividedatconcavepoints. Finally,anellipseisfittedforeachsegment ofthecontour. space). This process involves three main steps. First, principal component analysis is applied on the rows and columns of the image. Then, contrast equalization is applied by a Gaussian filter. Finally, coordination conversion is applied to the original bands. Moreinformationcanbefoundin[14]. 2.2 SegmentationusingK-Means After the pre-processing step, cells in the image are segmented from the background. Since the images contains relevant colorinformation(Figure2(a)),segmentationisdonebyusingthewell-knownk-meansalgorithm. Thisalgorithmisaclustering methodthataimstopartitionthedataintok groupssuchthatthedistancebetweenelementsofthesamegroupareminimized. In image segmentation, each pixel is considered an element x = [R ,G ,B ] that must be assigned to one of the k groups. i i i i Thealgorithmhastwoiterativesteps. Givenk centroids, eachpixelx isassignedtothenearestcentroid. Then, centroidsare i recalculated according to the pixels belonging to each group. The two steps above are repeated until the difference between centroidsoftwoiterationsislessthanathreshold. Figure2showsanexampleofimagesegmentationusingk-meansalgorithm withk =3. (a) Originalimage. (b) Segmentedimage. Figure2: Resultsofsegmentationofcellimageusingk-meansalgorithm. Insomecases,thesegmentedimagehasblobswithaholeinsideduetonoise,imagecapturingortypeofcell(Figure3(b)). Tosolvethisproblem,afterthek-meanssegmentation,weapplyafillholemethod[15]. Notethat,ifthealgorithmforcontour extractionisinvarianttoholes,thisstepisunnecessary. Figure3showsanexampleofthestepdescribedabove. (a) Originalimage. (b) Segmentedimageusingk-meanswithk=2. (c) Segmentedimagefollowedbyfillholemethod. Figure3: Resultsofsegmentationofbloodcellimageusingk-meansfollowedbythefillholemethod. 2.3 ContourProcessing Aftertheimagesegmentation,someblobscontaintwoormoretouchingcells.Inthiswork,thecontouroftheblobsisusedto splittouchingcells.ThecontourisrepresentedbyasetofpointsC ={p ,p ,...,p },wherep =(x,y)isthecontourpointand 1 2 n i nisthenumberofpoints. Figure5illustratesthecontourprocessingproblemandtheseparationusingtheproposedapproach. Themainideaofthecontourprocessingistosplitthecontourintosegmentsbelongingtodifferentcellsthroughconcavepoints. The original contour of the cells has many small-scale fluctuations and noises that can affect its analysis. To decrease the influence of noise and fluctuation, a polygon approximation [11] is applied to smooth the original contour C. The polygon approximationprovidesasetofpointsPAC = {p ,p ,...,p } | p ∈ C. Theapproximationmethodusedinthisworkstarts 1 2 m i with two points p and p ∈ C, where j = i+nStep and nStep > 1. Then, distances between the line p p and each point i j i j p |i<t<jarecalculatedandcomparedtoathresholddTh. Ifthedistanceofapointp isgreaterthandTh,thispointbelongs t t tothepolygonapproximation(p ∈PAC),p movestop andtheprocedureisrepeated. Otherwise,p movestothenextpoint t i t j andthedistancesarerecalculateduntilthereisapointp orp reachestheendofthecontour. Whenp coverallcontourpoints, t j i theprocedureisterminated. TheapproximatedcontourPAC isdividedatconcavepointstosplittouchingcells. Thesepointsareidentifiedbasedonthe angleofthreeconsecutivepoints. Giventhreepointsp ,p ,p ,pointp isaconcavepointiftheangleθ(i)(Equation1)is i−1 i i+1 i betweentheminimumangleθ andthemaximumangleθ (Equation2). Inaddition,toqualifyapointasaconcavepoint, min max thelinep p shouldnotcrossthecontour,asillustratedinFigure4(a). Thissecondruleisneededtodiscardfalseconcave i−1 i+1 points. (cid:26) d ,ifd <π θ(i)= i−1,i,i+1 i−1,i,i+1 (1) π−d ,otherwise i−1,i,i+1 whered =|a(p ,p )−a(p ,p )|anda(p ,p )=tan−1(yi−yj). i−1,i,i+1 i−1 i i+1 i i j (xi−xj) θ <θ(i)<θ min max (2) p p shouldnotcrossthecontourC. i−1 i+1 Insomecases,touchingbetweentwoormorecellshasonlyoneconcavepointp thatcanbeidentifiedbytherulesabove. In c thesecases,anewconcavepointisinsertedattheoppositesideoftheidentifiedconcavepoint.Followingtheassumptionthatthe cellsintheimagehavesimilarsize,thepositionofthenewconcavepointisthemiddleofthecontour,consideringtheposition cofthesingleconcavepointequalto0,asillustratedinFigure4(b). Anotherspecialcaseistheinsertionofconcavepointsat incompletecellswhosecontourreachedtheimageboundaries. Inthesecases,concavepointsareinsertedatthebeginningand theendofthecontour. AnexamplecanbeseeninFigure5(b). (a) Illustration of concave point. An point is(b) Insertionofconcavepoints. Foundconcave discartedbecausethelinepi−1pi+1crossesthepoint(green)andinsertedconcavepoint(red). contour. Figure4: Calculationofconcavepointsandtheinsertionofpointsinaspecialcase. Theconcavepointsdividethecontourintosegments. Thesesegmentsarerepresentedbyc = {p ,p ,...,p },wheresis i i1 i2 is thenumberofpointsofthesegmentc ,p andp areconcavepoints. Iftherearemconcavepoints,thecontourisdividedinto i i1 is msegmentssuchthatC =c ∪c ∪...∪c . Figure5(c)showsanexampleofconcavepointsandsegments. 1 2 m (a) Originalimage. (b) Concavepoints. (c) Contoursegments. (d) Fittedeliipsestoeachsegment. Figure5: Exampleofcontoursegmentsandellipses. 2.4 EllipseProcessing Mostcellscanbemodeledbyanellipseoracircle. Thus,thepurposeofthisstepistomodeleachcontoursegmentwithan ellipse.Theseellipsesareprocessedinseveralstepsthatcombineordividethemaccordingtorulesderivedfrompriorknowledge ofthecells. Foreachcontoursegmentc ,anellipsee isfittedbyanellipsefittingalgorithm. Following[13],directleastsquare i i method[16]wasusedbecauseitiscomputationallyefficientandprovidesrobustresultsevenwithnoiseandocclusions. After ellipsefittingforeachsegment,thestepsbelowareperformed. 2.4.1 EllipseSelection Theellipsesmustsatisfytwoconditionstobeselected. First,themeanalgebraicdistance[16],whichmeasuresthequalityofthe ellipsegiventhepoints,mustbesmallerthanathresholddisTh. Second,theratiooftheminoraxistomajoraxisoftheellipse shouldbegreaterthanathresholdeTh. Thissecondconditiondiscardstooslenderellipses. Theselectedellipsesareusedinthe ellipsecombinationstep,whiletheellipsesthatwerenotselectedareusedinthelaststep(ellipserefinement). 2.4.2 EllipseCombination Atthispoint,thecellsarebasicallyseparated. However,theremaybesegmentsbelongingtothesamecellerroneouslyseparated by concave points misidentified. As some cells do not have an ellipse shape or have a high mean algebraic distance error, the rulesarealsoderivedfromtheknowledgeofthecellsintheimages[13]. Theserulesaredescribedbelowintwocases. Case1:Thesimpletouchingoftwocellsiseasilyidentifiedbytherulesofthecase1.Theserulesdonotcombinetwoellipses whosetouchingisexplicit. Aswearenotinterestedincombiningtheellipses,thedistanceofthecenterofthenewellipseec new andthecenterofthetwopreviousellipsesec andec shouldbegreaterthanathresholddMinTh,accordingtoEquation3. i j dist(ec ,ec )>dMinTh i new (3) dist(ec ,ec )>dMinTh j new wheredist(p,q)istheEuclideandistanceofthepointspandq. ThresholddMinThiseasilydeterminedusingcellproperties, usuallyclosetothelengthoftheminoraxisofthesmallest cell[13]. Anotherruleusedinthecase1saysthattwocellsshouldbeseparatedifthedistanceofthetwopreviouslycellcenters isconsiderable,accordingtoEquation4. dist(ec ,ec )>[2.5,4.0]dMinTh (4) i j Case2: Considertwosegmentsc andc andtheirellipsese ande . Consideralso,asegmentc = c ∪c anditsellipse i j i j ij i j e . Ifthesegmentsc andc belongtothesamecell,themeanalgebraicdistanceofthenewellipsee isprobablysmallerthan ij i j ij thedistancesobtainedbythetwopreviousellipsese ande . Ifthisoccurs,thesegmentsshouldbecombined. i j ThealgorithmforcombiningellipsesisgiveninAlgorithm1. Input :Segmentsc andtheirellipsese i i fori=1toM do forj =i+1toM do c =c ∪c ; ij i j Fitanellipsee forc ; ij ij ifnotCase 1andCase 2then Replacee bye ; i ij Replacec byc ; i ij Deletee andc ; j j G=G−1andi=1; end end end Algorithm1:EllipseCombinationAlgorithm. 2.4.3 EllipseRefinement Atthisstep,segmentsthathavenotbeenprocessedareusedtorefinetheellipses(e.g. segmentswithasmallnumberofpoints andsegmentswhoseellipsewerenotselectedintheselectionellipsestep). Forthis,eachunprocessedsegmentisconcatenated with all existing segments and an ellipse is fitted for each combined segment. After, the unprocessed segment belongs to the ellipsethatprovidesthesmallermeanalgebraicdistanceandisstillacceptableunderthetermsoftheellipseselectionstep. 3.ExperimentsandResults We have conducted two sets of experiments to evaluate the proposed method. The first experiment aims to validate the proposedmethodusingimagesannotatedbyspecialists. Inthesecondseriesofexperiments,theproposedmethodwasapplied todifferenttypesofcells,rangingfrombacteriatobloodcells. First,experimentswereperformedonannotatedimagesofbiofilmsofLactobacillusparacasei,bacteriainthehumanmouth. Themotivationforusingtheseimagesisthenecessitytoquantifytheareaandthenumberofbacteriabeforeandaftertheuseof chemicalsolutions.Thechemicalsolutionsaimtoreducethenumberofbacteria,asthereisanunrestrictedformationofbiofilms onthetoothsurface,whichisassociatedwiththeoccurrenceofdiseaseslikedentalcaries. For both experiments, image segmentation was performed by k-means algorithm with k = 3 because the images contains, besidesthebackground,twotypesofbacteria. Theremainingparameterswereempiricallyadjustedasfollows. Inthecontour processingstep,thethresholddThwassetat3.5. Duetothesmallsizeofcellinrelationtotheimagesize,thethresholddTh wassettoalowvaluethatcorrespondstothemaximumpolygonapproximationerrorinpixels. Tocalculatetheconcavepoints, theminimumangleθ andthemaximumangleθ wereseton35◦and155◦,respectively. min min Thefittedellipseforeachsegmentmustsatisfytwoconstraints: meanalgebraicdistanceshouldbelessthandisThandratio between minor axis and major axis should be greater than eTh. The two parameters were: disTh = 0.03, to allow certain robustnessintheellipsefitting,andeTh = 0.2,torestrictveryelongatedellipses. Finally,intheellipsecombinationstep,the thresholddMinThwas4, whichcorrespondstotheminoraxis ofthesmallestcellintheimagesoftraining. Eachobjecthas different properties, so the parameters used in the proposed method should be adjusted according to a priori knowledge of the object. Inthefirstexperiment,theproposedmethodwasperformedon167imageswithhighdensityofbacteria.Below,experimental results in this application are presented and discussed. The proper detection of touching objects is one of the main difficulties ofthemethodsoftheliterature. However,thistaskisnecessaryforimageswithhighdensityofcells. Thecorrectidentification of cells provides estimates closer to reality and thus, more reliable results are obtained. In Figure 6, results for images with touchingbacteriaarepresented. Thefigures6(a)and6(c)correspondtotheresultsobtainedbytheproposedmethod,whilethe otherfiguresweremarkedbyaspecialisttovalidatethemethod. Despitethelargenumberoftouchingcells,proposedmethod achievessimilarresultstospecialistinbothimages. (a) Proposedmethodresults. (b) Imagemarkedbyaspecialist. (c) Proposedmethodresults. (d) Imagemarkedbyaspecialist. Figure6: Resultsforimageswithhighdensityoftouchingcells. Forthesameimages,thecountofcellswasperformedbytheproposedmethodandfacedwiththecountcarriedoutbythree specialists(Table1). Wenotethat,evenbetweenspecialists,therearedifferencesduetothebiasofeachspecialist. Nevertheless, resultsobtainedbytheproposedmethodweresimilartotheaverageamongspecialistsinbothimages. Table1: Countingofbacteriacarriedoutbytheproposedmethodandthreespecialists. Figure6(a) Figure6(c) DetectionMethod Bacteria1 Bacteria2 Bacteria1 Bacteria2 Specialist1 23 24 24 29 Specialist2 22 26 23 31 Specialist3 21 25 22 30 SpecialistAverage 22 25 23 30 ProposedMethod 23 25 24 31 Figure7presentsacomparisonofbacteriaareascalculatedbytheproposedmethodandmanualtracinginthetwoimages. Forbothimages,bacteriaareasweresortedtocreatetheplot. Ascanbeseen,themethodalsoobtainsgoodresultswithrespect tothearea,whichcanbecorroboratedbytheaverageerrorinpixelsof17.13and29.52forimages7(a)and7(b),respectively. In the second set of experiments, the proposed method was applied to three species of cells. Figure 8 shows results for a completeimageofbacteriacellsusedintheearlierexperiments. Despitethelargeamountofbacteria,theresultsareinteresting becausetheprocessisfullyautomated. InFigure9,histogramofareaforeachtypeofbacteriaispresented. Thesehistograms canbeusedforevaluatingchemicalsolutionthatcombatsmouthdiseases. Toevaluatetheproposedmethodinotherimages,experimentalresultsforbloodcellsarepresentedinFigure10. Thisimage containsonlyonetypeofcell,whichhistogramofareaispresentedinFigure11. Wehavefoundthatthismethodisaveryuseful techniqueforvarioustypeofcells,sinceithastheadvantageofpredicttheshapeofcellsoccludedduetothetouching,ascanbe seeninFigure10. Finally,theproposedmethodwasappliedtomouthbacteriainanotherstage. TheresultsofdetectionarepresentedinFigure 12. Althoughthebacteriainthisstagehavemoreelongatedshape,theproposedmethodachievesgoodresultsofdetectionwith respecttoareaandnumberofbacteriaintheimage. ThehistogramofareaispresentedinFigure13. Besidestheexcellentresultsindetectionofcells,theproposedmethodisalsoefficientinprocessingtime. Onaveragefor10 imageswith1024×1024pixelsandhighdensityofbacteria,themethodtook553millisecondsonacomputerIntelQuadCore 2.33GHzCPUand3GBRAM. (a) e=17.13. (b) e=29.52. Figure7: Sortedbacteriaareasfortwoimagescalculatedbytheproposedmethodandbyaspecialist. Figure8: Resultsforanimagewithhighconcentrationoftwotypesofbacterias. (a) Histogramofareaforthegreenbacterias. (b) Histogramofareafortheredbacterias. Figure9: Histogramofareaforbothtypesofbacterias. Figure10: Resultsofdetectionforbloodcellimage. Figure11: Histogramofareaforthebloodcells. Figure12: Resultsformouthbacteriainanotherstage. Figure13: Histogramofareaformouthbacteria. 4.Conclusion Thispaperproposedanewapproachforidentifyingandquantifyingcellsinimages. Theproposedmethodconsistsofimage segmentation based on k-means algorithm and an important step of contour processing to separate touching cells. Promising results have been obtained on three types of cells. Experimental results indicate that the proposed method achieves detection performancecomparabletodetectionperformedbyspecialists. Inaddition,ourmethodmakesthedetectionofcellsfeasibleand simple,whichresultsinanefficientandlowcostimplementation. The proposed method is able to successfully handle a wide range of types of cells. As part of the future work, we plan to focusoninvestigatingtheperformanceofthemethodonartificialimages. Anotherresearchissueistoevaluateotherstrategies tosegmenttheimagesbasedonthewatershedalgorithm. Acknowledgments The authors would like to thank Dr. Luis E. Cha´vez de Paz who provided images of cells. WNG was supported by CNPq grants142150/2010-0. OMBwassupportedbyCNPqgrants306628/2007-4and484474/2007-3. References [1] S. Halim, T. R. Bretschneider, Y. K. Li, P. R. Preiser and C. Kuss. “Estimating Malaria Parasitaemia from Blood Smear Images”. InICARCV06,pp.1–6,2006. [2] K.Z.Mao,P.ZhaoandP.Tan. “Supervisedlearning-basedcellimagesegmentationforp53immunohistochemistry.” IEEE TransBiomedEng,vol.53,no.6,pp.1153–63,2006. [3] Z.Peter,V.Bousson,C.BergotandF.Peyrin. “Aconstrainedregiongrowingapproachbasedonwatershedforthesegmen- tationoflowcontraststructuresinbonemicro-CTimages”. PatternRecognition,vol.41,no.7,pp.2358–2368,2008. [4] L.VincentandP.Soille. “WatershedsinDigitalSpaces: AnEfficientAlgorithmBasedonImmersionSimulations”. IEEE Trans.PatternAnal.Mach.Intell.,vol.13,no.6,pp.583–598,1991. [5] S. Eom, S. Kim, V. Shin and B. Ahn. “Leukocyte Segmentation in Blood Smear Images Using Region-Based Active Contours”. pp.867–876,2006. [6] P.BamfordandB.Lovell. “Unsupervisedcellnucleussegmentationwithactivecontours”. SignalProcessing,vol.71,no. 2,pp.203–213,1998. [7] J.Ning, L.Zhang, D.ZhangandC.Wu. “Interactiveimagesegmentationbymaximalsimilaritybasedregionmerging”. PatternRecognition,vol.43,no.2,pp.445–456,2010. InteractiveImagingandVision. [8] D.Anoraganingrum,S.KrˆnerandB.Gottfried. “CellSegmentationwithAdaptiveRegionGrowing”. InICIAPVenedig, Italy,pp.27–29,1999. [9] C.diRuberto,A.Dempster,S.KhanandB.Jarra. “Analysisofinfectedbloodcellimagesusingmorphologicaloperators”. vol.20,no.2,pp.133–146,February2002.