SpringerBriefs in Electrical and Computer Engineering
For further volumes: http://www.springer.com/series/10059

Creating New Medical Ontologies for Image Annotation
A Case Study

Liana Stanescu, Department of Software Engineering, University of Craiova, Craiova 200585, Romania
Dumitru Dan Burdescu, Department of Software Engineering, University of Craiova, Craiova 200585, Romania
Marius Brezovan, Department of Software Engineering, University of Craiova, Craiova 200585, Romania
Cristian Gabriel Mihai, Department of Software Engineering, University of Craiova, Craiova 200585, Romania

ISSN 2191-8112, e-ISSN 2191-8120
ISBN 978-1-4614-1908-2, e-ISBN 978-1-4614-1909-9
DOI 10.1007/978-1-4614-1909-9
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011943747
© Springer Science+Business Media, LLC 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Advances in medical technology generate huge amounts of nontextual information (e.g., images) along with the more familiar textual information. The image is one of the most important tools in medicine, since it provides a method for diagnosis, monitoring, and disease management of patients, with the advantage of being a very fast noninvasive procedure.
As new image acquisition devices are constantly being developed to increase efficiency and produce more accurate information, and as data storage capacity increases, a steady growth in the number of medical images produced can be easily inferred. With such an exponential increase of medical data in digital libraries, it is becoming more and more difficult to perform certain analyses in search and information retrieval tasks.

Such problems can be tackled by representing the available information using languages such as DAML+OIL and OWL, which benefit from their underlying description logic foundation. Recently, there has been a lot of discussion about semantically enriched information systems, especially about using ontologies for modeling data. Ontologies have the potential to model information in a way that captures the meaning of the content by using expressive knowledge representation formalisms, such as description logics, and therefore to achieve good information retrieval results.

In this book, we outline our experience of building a semantically enriched system by accommodating image annotation and retrieval services around an ontology for medical images used in digestive diseases. Our approach is based on the assumptions that, given images from digestive diseases, expressing all the desired features using domain knowledge is feasible, and that manually marking up and annotating the regions of interest is practical. In addition, by developing an automatic annotation system, representing and reasoning about medical images are performed with a reasonable complexity in a given querying context.

This book is mainly intended for scientists and engineers who are engaged in research and development of visual information retrieval techniques, especially for medical color images from the area of digestive diseases, and who want to move from content-based visual information retrieval to semantic-based visual information retrieval.
The objective of this book is to review and survey new research and development in intelligent visual information retrieval technologies, and to present a system for semantic-based visual information retrieval in collections of medical images that aims to reduce the “semantic gap” in image understanding.

The book includes seven chapters that cover several distinctive research fields in visual information retrieval, ranging from content-based to semantic-based visual retrieval and from low-level to high-level image features. They also present many state-of-the-art advancements and achievements in bridging the semantic gap.

The book is structured in four distinct sections. The first part gives an overview of content-based image retrieval and contains only Chap. 2. The second part deals with the problem of image segmentation, which is necessary for identifying visual objects in images, and contains Chap. 3. The third part looks at some aspects of image annotation using ontologies and contains two chapters: Chaps. 4 and 5. The last part deals with the problem of semantic-based image retrieval and presents an object-oriented system for visual information retrieval from medical images, which uses both content-based and semantic-based visual searching.

Overall, the book contains several pictures and several dozen tables that offer a comprehensive picture of the current advancements in semantic-based visual information retrieval.

Craiova, Romania
Liana Stanescu
Dumitru Dan Burdescu
Marius Brezovan
Cristian Gabriel Mihai

Contents

1 Introduction
2 Content-Based Image Retrieval in Medical Images Databases
2.1 Introduction
2.2 Content-Based Image Retrieval Systems
2.3 Content-Based Image Query on Color and Texture Features
2.4 Evaluation of the Content-Based Image Retrieval Task
2.5 Conclusions
References
3 Medical Images Segmentation
3.1 Introduction
3.2 Related Work
3.3 Graph-Based Image Segmentation Algorithm
3.4 The Color Set Back-Projection Algorithm
3.5 The Local Variation Algorithm
3.6 Segmentation Error Measures
3.7 Experiments and Results
3.8 Conclusions
References
4 Ontologies
4.1 Ontologies: A General Overview
4.2 Ontology Design and Development Tools
4.3 Medical Ontologies
4.4 Topic Maps
4.5 MeSH Description
4.6 Mapping MeSH Content to the Ontology and Graphical Representation
References
5 Medical Images Annotation
5.1 General Overview
5.2 Annotation Systems in the Medical Domain
5.3 Cross-Media Relevance Model Based on an Object-Oriented Approach
5.3.1 Cross-Media Relevance Model Description
5.3.2 The Database Model
5.3.3 The Annotation Process
5.3.4 Measures for the Evaluation of the Annotation Task
5.3.5 Experimental Results
5.4 Conclusions
References
6 Semantic-Based Image Retrieval
6.1 General Overview
6.2 Semantic-Based Image Retrieval Using the Cross-Media Relevance Model
6.3 Experimental Results
6.4 Conclusions
References
7 Object Oriented Medical Annotation System
7.1 Software System Architecture
7.2 Conclusions

Chapter 1
Introduction

Medical images play a central role in patient diagnosis, therapy, surgical planning, medical reference, and medical training. With the advent of digital imaging modalities, as well as images digitized from conventional devices, collections of medical images are increasingly being held in digital form. Due to the large number of images without text information, content-based medical image retrieval has received increased attention.

Content-based visual information retrieval (CBVIR) has attracted much interest from the image engineering, computer vision, and database communities. A large body of research has been developed and has achieved plentiful and substantial results. The content-based image retrieval task can be described as a process for efficiently retrieving images from a collection by similarity.
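Retrieval by similarity can be illustrated with a minimal sketch. It assumes a quantized RGB color histogram as the image descriptor and histogram intersection as the similarity measure; both are illustrative choices and are not prescribed by this chapter. The names `color_histogram`, `histogram_intersection`, and `retrieve`, as well as the toy pixel data, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Quantize RGB pixels into a normalized color histogram (illustrative descriptor)."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = sum(hist)
    return [count / total for count in hist] if total else hist


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query: Sequence[Pixel],
             collection: Dict[str, Sequence[Pixel]],
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank the images of a collection by descending similarity to the query image."""
    query_hist = color_histogram(query)
    scored = [(name, histogram_intersection(query_hist, color_histogram(pixels)))
              for name, pixels in collection.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy collection: each "image" is just a short list of pixels (made-up data).
collection = {
    "reddish": [(250, 10, 10)] * 4,
    "bluish": [(10, 10, 250)] * 4,
    "mixed": [(250, 10, 10), (250, 10, 10), (10, 10, 250), (10, 10, 250)],
}
query = [(240, 20, 20)] * 4
ranking = retrieve(query, collection)
print(ranking)
```

In this toy setting the mostly red query ranks the "reddish" image first, the half-red "mixed" image second, and the "bluish" image last. A real system would compute descriptors once, store them in the database, and compare only descriptors at query time.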
The retrieval relies on extracting the appropriate characteristic quantities describing the desired contents of images. Most CBVIR approaches rely on the low-level visual features of image and video, such as color, texture, and shape. Such techniques are called feature-based techniques in visual information retrieval. The second chapter of the book, “Content-Based Image Retrieval in Medical Images Databases,” deals with the CBVIR approaches, presenting a general overview and an evaluation method for the CBVIR task.

Unfortunately, current CBVIR methods focus only on appearance-based similarity, that is, the appearance of the retrieved images is similar to that of a query image. Little semantic information is exploited. Among the few efforts that claim to exploit semantic information, the semantic similarities are defined between different appearances of the same object. These kinds of semantic similarities represent the low-level semantic similarities, while the similarities between different objects represent the high-level semantic similarities. The similarities between two images are the similarities between the objects contained in the two images.

As a consequence, developing a semantic-based visual information retrieval (SBVIR) system consists of two steps: (1) extracting the visual objects from images and (2) associating semantic information with each visual object. The first step can be achieved by using segmentation methods applied to images, while the second step can be achieved by applying semantic annotation methods to the visual objects extracted from images.

L. Stanescu et al., Creating New Medical Ontologies for Image Annotation: A Case Study, SpringerBriefs in Electrical and Computer Engineering, DOI 10.1007/978-1-4614-1909-9_1, © Springer Science+Business Media, LLC 2012

Image segmentation techniques can be divided into two groups: region-based and contour-based approaches. Region-based segmentation methods can be broadly classified as either top-down (model-based) or bottom-up (visual feature-based) approaches.
An important group of visual feature-based methods is represented by the graph-based segmentation methods, which search for certain structures in the associated edge-weighted graph constructed on the image pixels, such as a minimum spanning tree or a minimum cut. Other approaches to image segmentation consist of splitting and merging regions according to how well each region fulfills some uniformity criterion. Such methods use a measure of uniformity of a region. In contrast, other methods use a pairwise region comparison rather than applying a uniformity criterion to each individual region. The third chapter of the book, “Medical Images Segmentation,” describes some graph-based color segmentation methods and an area-based evaluation framework for the performance of the segmentation algorithms.

The second step of the proposed SBVIR system involves an annotation process of the visual objects extracted from images. It becomes increasingly expensive to manually annotate medical images. Consequently, automatic medical image annotation becomes important. We consider image annotation as a special classification problem, that is, classifying a given image into one of the predefined labels.

Automatic image annotation is the process of assigning meaningful words to an image, taking into account its content. This process is of great interest as it allows indexing, retrieving, and understanding of large collections of image data. Two reasons make image annotation a difficult task: the semantic gap, that is, the difficulty of extracting semantically meaningful entities using just low-level image features, and the lack of correspondence between the keywords and image regions in the training data. Several interesting techniques have been proposed in the image
Several interesting techniques have been proposed in the image annotationresearchfield.Mostofthesetechniquesdefineaparametricornonpara- metricmodeltocapturetherelationshipbetweenimagefeaturesandkeywords.The problemofimageannotationispresentedinthefifthchapterofthebook,“Medical ImagesAnnotation.”Inthischapterispresentedanoverviewoftheexistingmethods fortheannotationtaskfromseveralperspectives:unsupervised/supervisedlearning, parametric/nonparametric unsupervised learning models, or text/image based. An extensionofthecross-mediarelevancemodelbasedonanobject-orientedapproach hasbeenchoosingformedicalimagesannotation,andanevaluationoftheannota- tionprocessandtheexperimentalresultsarepresentedinthefinalpartofthechapter. The concepts used for annotation of visual objects are generally structured in hierarchies of concepts that form different ontologies. The notion of ontology is definedasanexplicitspecificationofsomeconceptualization,whiletheconceptu- alization is defined as an intensional semantic structure that encodes the rules of constraining the structure of a part of reality. The goal of an ontology is to define some primitives and their associated semantics in some specified context. Ontologies have been established for knowledge sharing and are widely used as