ebook img

Data-driven knowledge acquisition, validation, and transformation into HL7 Arden Syntax PDF

20 Pages·2015·3.86 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Data-driven knowledge acquisition, validation, and transformation into HL7 Arden Syntax

G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx ContentslistsavailableatScienceDirect Artificial Intelligence in Medicine journal homepage: www.elsevier.com/locate/aiim Data-driven knowledge acquisition, validation, and transformation into HL7 Arden Syntax MaqboolHussaina,MuhammadAfzala,TaqdirAlia,RahmanAlia,WajahatAliKhana, ArifJams hedb,Sung youngLeea, ∗,Byeon gHoK angc ,KhalidL atifd aDepartmentofComputerEngineering,KyungHeeUniversity,Seocheon-dong,Giheung-gu,Yongin-si446-701,Gyeonggi-do,RepublicofKorea bDepartment of Radiation Oncology,Sh aukatK han umMemor ialCancerHospit alandResea rchCentre ,7ABlock R-3,M.A.Joh arTown, La hore54782, Pakistan cComputingandInformationSystems,UniversityofTasmania,Hobart7001,Tasmania,Australia dDepartmen tofC omputerSci ence,COM SATSInst itu teofInform ationT echno logy,Park Road,Islamabad45550,Pakistan a r t i c l e i n f o a b s t r a c t Articlehistory: Objective:Theobjectiveofthisstudyistohelpateamofphysiciansandknowledgeengineersacquire Receiv ed1January2015 clinicalkn owl edgefrom ex istin gpra cti ces data se tsfor tr eatmentof head andneck cancer,to validate 1R5ecSeeivpetedm inb ererv2i0se1d5 form the kno wledge aga inst p ublished guideline s, to crea te r efined rule s, a nd to inco rpora te thes e ru les into clinicalworkflowforclinicaldecisionsupport. Accepted15September2015 Methodsandmaterials:Ateamofphysicians(clinicaldomainexperts)andknowledgeengineersadaptan approachformodelingexistingtreatmentpracticesintofinalexecutableclinicalmodels.Forinitialwork, Keywords: theoralcavityisselectedasthecandidatetargetareaforthecreationofrulescoveringatreatmentplanfor Knowledgeacquisition cancer.ThefinalexecutablemodelispresentedinHL7ArdenSyntax,whichhelpstheclinicalknowledge Knowledgevalidation PCrliendiiccatliognu imd eoldineelss bofe rsehaalrpeadt aiemnotndga toarsgeatnsitzoatgieonnes.r aWtee ausper ead diacttiav-edrmivoedne kln(PoMw)le.dTghee aPcMquiisscitoionnv earptpedroianctho baarseefidn oend -acnlainlyicsaisl Clinical decisionsupportsystems kn owle dgemo del(R-C KM ),which fo llowsarig orous valida tion pro ce ss.Thevali datio n processusesa HL7ArdenSyntax clinicalknowledgemodel(CKM),whichprovidesthebasisfordefiningunderlyingvalidationcriteria.The R-CKMisconvertedintoasetofmedicallogicmodules(MLMs)andisevaluatedusingrealpatientdata fromahospitalinformationsystem. Results:Weselectedtheoralcavityastheintendedsiteforderivationofallrelatedclinicalrulesfor possibleassociatedtreatmentplans.AteamofphysiciansanalyzedtheNationalComprehensiveCancer Network(NCCN)guidelinesfortheoralcavityandcreatedacommonCKM.Amongthedecisiontree algorithms,chi-squaredautomaticinteractiondetection(CHAID)wasappliedtoarefineddatasetof 1229patientstogeneratethePM.ThePMwastestedonadisjointdatasetof739patients,whichgives 59.0%accuracy.Usingarigorousvalidationprocess,theR-CKMwascreatedfromthePMasthefinal model,afterconformingtotheCKM.TheR-CKMwasconvertedintofourcandidateMLMs,andwasused toevaluaterealdatafrom739patients,yieldingefficientperformancewith53.0%accuracy. Conclusion:Data-drivenknowledgeacquisitionandvalidationagainstpublishedguidelineswereusedto helpateamofphysiciansandknowledgeengineerscreateexecutableclinicalknowledge.Theadvantages oftheR-CKMaretwofold:itreflectsrealpracticesandconformstostandardguidelines,whileproviding optimalaccuracycomparabletothatofaPM.Theproposedapproachyieldsbetterinsightintothesteps ofknowledgeacquisitionandenhancescollaborationeffortsoftheteamofphysiciansandknowledge engineers. ©2015ElsevierB.V.Allrightsreserved. ∗ Correspondingauthor.Tel.:+82312012514. E-mailaddresses:[email protected](M.Hussain),[email protected](M.Afzal),[email protected](T.Ali), [email protected](R.Ali),[email protected](W.A.Khan),[email protected](A.Jamshed),[email protected](S.Lee),[email protected] (B.H.Kang),[email protected](K.Latif). http://dx.doi.org/10.1016/j.artmed.2015.09.008 0933-3657/©2015ElsevierB.V.Allrightsreserved. Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 2 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 1. Background Cognitive methods examine how people mentally represent informationtosolvesubsequentproblems.CPGtranslationtoCIGs 1.1. Motivations fromnarrativeguidelinesdiscussedin[8]isacognitivemethodin whichdomainexpertsareprovidedwithpredefinedalgorithmic Clinical decision support systems (CDSSs) play a pivotal role steps to develop the guidelines. Information extraction methods in improving patient care and enhancing practitioner perfor- use semi-automatic translation by extracting knowledge using mance[1].Nevertheless,adaptionofCDSSsinanactualhealthcare templates from narrative guidelines. Serban et al. [9] extracted workflowsetupischallenging.DespitealonghistoryofCDSSdevel- templates from narrative guidelines and used them as building opment,mostofthesystemsevaluatedinacademiahavenotbeen blocksforguidelinesbasedonbackgroundthesaurusknowledge. realizedinarealclinicalpracticeenvironment.Themostprominent CIGauthoringtoolsareusedtodirectlytransformCPGsintoexe- barriers have included heterogeneous healthcare workflow inte- cutableCIGs.CIGsfollowstandardformats,suchasXML,RDF,and gration,lackofstandardknowledgerepresentation,complexityof anyotherstandardformat.Examplesofsuchknowledgeacquisi- knowledgerepresentationlanguages,lackofframeworksforclin- tionincludeuseofanArdenSyntaxeditortoexplicitlytransform icalknowledgetransformationintoexecutableknowledgebases, the clinical knowledge into an executable module, an approach andphysicianfearsregardingvalidityoftheservicesrelatedtothe mentionedin[10]. knowledgebasesandqualityofpublishedguidelines[2].Wethere- The systematic review [7] covers the methods that are based foreinitiatedtheSmartCDSSprojectincollaborationwithShaukat on translating narrative CPGs into CIGs; however, alternative Khan umMem ori alCanc erHo spitala nd ResearchCen tre(S KMCH)1 app roachestha tuseclinic aldata setsf orkno wledgeext ractionalso toprovidedecisionsupportservicesforheadandneckcancer.We exist.Forexample,Pereraetal.[11]usedasemantic-drivenknowl- observedtheabove-mentionedbarrierswhileworkingwithclin- edge acquisition approach to establish missing relationships in icaldomainexpertsandhospitalITstaffingatheringSmartCDSS data.Theirproposedapproachislimitedtodeterminingmissing requirements. relationshipsindatawherebytheinitialknowledgebaseiscon- Notable problems in CDSS development that physicians face structedfromUMLSvocabulary.Similarly,Gomoietal.[12]used includetheCDSSutilityintheirdomainandthebuildingofamodel data mining techniques to generate rules from data and trans- toacquiretheknowledgeinstandardrepresentation,suchasArden form them into MLMs. This approach lacks a criterion definition Syntax.Tomaketheknowledgeacquisitionapplicable,wedevel- for selection of candidate MLMs and final validation from refer- opeddata-drivenknowledgeacquisitionandvalidationtechniques, enceguidelines.Ourpresentstudyiscloselyrelatedto[11,12]in whichusepatientdatasetsforcreationofthefinalclinicalmodel, terms of our use of clinical data as a common source of knowl- therefinedclinicalknowledgemodel(R-CKM).Data-drivenknowl- edge acquisition. In addition, we employ a cognitive method to edgeacquisitionhelpsinbuildingapredictivemodel(PM)toreflect aligntherefinedmodelinaccordancewithpublishedNCCNguide- actualtreatmentdecisionsforpatientsrecordedoverthelastten lines. years.Furthermore,thePMisvalidatedagainstaclinicalknowledge Thestudyin[13]resemblesourstudywithrespecttotheobjec- model(CKM)tofinalizetheR-CKMfortheexecutableknowledge tivesofusingadata-drivenapproachtocreaterules.Theauthors base.TheCKMrepresentsstandardguidelinesforcancertreatment deriverulesfrompatientdataandincorporatethemintoguidelines publishedbyNationalComprehensiveCancerNetwork(NCCN)[3] forcompletenessofmissingrecommendations.Themethodology http://www.nccn.org/ (accessed 24.04.15). The R-CKM is trans- proposedinthestudycomprisesarigorousinspectionofguidelines formedintomedicallogicmodules(MLMs)usingHL7ArdenSyntax. to find missing recommendations for all possible patient symp- ToavoidcomplexityofHL7ArdenSyntax,theMLMsarebuiltin toms. A decision tree learning algorithm is used to generate the basiccontrolartifacts,suchasbyusingIFconstructs.Moreover,to rules,alignedwiththeguidelinetreeformissingrecommendations. handletheintrinsiclimitationsofHL7ArdenSyntax,suchasthe Theproposedmethodologyisrobustfortheclinicaldomainwith curly brace problem [4] and binding issues with vocabulary [1], lesscomplexity.Forcomplexclinicaldomainswithcomprehensive we adapt HL7 virtual medical record (vMR) [5] and use system- guidelines,itmaynotbefeasibletoidentifythemissingdecisions. atizednomenclatureofmedicineclinicalterms(SNOMEDCT)[6] Furthermore, as complexity of the clinical domain increases, the http://www.ihtsdo.org/snomed-ct/(accessed26.12.14)bindingto numberofpatientconditionsisalsoincreasedtoverifythefinal represent all related concepts of head and neck cancer. The oral recommendation.Withanincreasednumberofpatientconditions, cavity is selected as the primary site for creation of R-CKM and itbecomesdifficulttoderiverelatedprofiles(asusedinthisstudy) thefinalcandidateMLMs.Theproposedapproachisdistinguished forwhichasetofrulesneedstobegenerated.Finally,thestudy from available knowledge acquisition and validation approaches definesnoformalvalidationcriteriaforthegeneratedrules,and throughitsuseofreferencestandardguidelinestovalidatethefinal dependsonthequalityandquantityofdata.Incontrast,weusea knowledgemodel. datadrivenapproachtoselectrulesfromthePMthatconformtothe guidelines.ThePMisevolvedtoR-CKMusingarigorousvalidation process. 1.2. Relatedapproaches Most existing validation approaches aim to enable developed CIGs to capture the requirements of corresponding CPGs. These Knowledge acquisition and validation are prerequisites for approachescanbeviewedintwobroadcategoriesoftechniques: effective decision support services. Various approaches are used inspection and testing [7]. In the inspection technique, domain totargetobjectivesintargetsystemdesign.Inamethodological expertsinvestigatetheCIGsforanypossibleerrorsinlogic.Collabo- review,Peleg[7]categorizedtheapproachesoftranslatingclinical rativedevelopmentofclinicalguidelinesisdiscussedin[14].Teams practiceguidelines(CPGs)intocomputer-interpretableguidelines ofexpertphysiciansandknowledgeengineerscreatemarkupsin (CIGs):cognitivemethods,acollaborativemodelingmethodology finalguidelines.Thesemarkupsareevaluatedtogetheragainstthe andtools,aninformationextractionmethodology,andspecialized gold-standardmarkupswithathoroughinspectiontodetermine CIGauthoringtools. whetherthefinalobjectiveisachieved. Testingtechniquesareusedinadditiontoinspectiontechniques tominimizechancesoferrorsinlogicthatmaynotbetraceddur- inginspection.Thesetechniquesuserealpatientcasestoevaluate 1 SKMCH:https://www.shaukatkhanum.org.pk/ allpossibledecisionpathsinCIGsforobtainingcorrectdecisions. Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 3 Miller[15]usedadomain-constraint-basedapproachwithaclin- Nevertheless, HL7 Arden Syntax has limitations in represent- ical condition set while generating the minimal set of test cases ingthe‘curlybraces’problem,thestandardmodelusedinlogic. requiredforevaluatingtheparticularguidelines.Wehereinemploy HL7vMRisastandarddatamodelproposedtoresolvethecurly both approaches: inspection and testing with different perspec- bracesproblemthatisassociatedwithintegrationofMLMswith tives.DomainexpertskeepthePMdecisionpathasacandidatepath healthcareworkflows[4,22].ThemainintentionofvMRistocre- inR-CKMaftervalidation(inspection)fromtheCKM.Knowledge ate a simplified representation of a sufficient amount of clinical engineersprovidetheevidenceofpatientcasesfromthePMthat records for capturing information relevant to clinical knowledge correctlyclassifyitintoacorrectdecision(testing).Thisvalidation and enabling understanding in knowledge engineers [5]. Most processallowsthedecisionpathtobeincludedinR-CKMifitcon- importantly, vMR is influenced by the HL7 V3 reference infor- formstothevalidationcriteria(whichisbasedontheCKM),which mation model (RIM), which makes it easier to integrate it with isprovidedwithoptimumaccuracyfromthepatientdata.Finally, HL7-complianthealthcaresystems.TheHL7vMRmodelwasdevel- R-CKM is validated against real patient data while transforming oped based on analysis of requirements from 22 US institutions itintoexecutableMLMsandintegratedintohospitalinformation [22].WecreateacancerclinicaldomainmodelderivedfromHL7 system(HIS)workflows. vMR and used as a data model for creating MLMs. To this end, PMsareconsideredprimarysourcesthatphysicianscanusein weemployHL7vMRversion1;however,readersmayconsultthe clinical setups for recommendations. Widespread computational recentversion2ofHL7vMRfordetailedspecifications[5]. methodsandtoolsareavailablefordataanalysisandcreatingof PMs. These methods and tools require a systematic method of selecting an appropriate PM that best fits the clinical prediction 2. Methods problem.Theauthorsin[16]providedasystematicreviewofthe most commonly used methods and simple guidelines for using Clinicalknowledgeacquisitionisthemainactivityinachieving thesemethodsinclinicalmedicine.Inourwork,weusedecision successful deployment of CDSSs. Unlike conventional require- trees(DTs)tocreatethePM.WedemonstrateCHAIDandaclassi- ments gathering and modeling of a system, knowledge creation ficationandregressiontree(CRT/CART)inourPMcreation.Details needs a detailed set of activities that cover the actual scenarios onusingbasicPMtechniquesarefoundin[16–19]. andfactsoccurringinarealenvironment.Mostimportantly,the Variousstandardsforsharingclinicalknowledgeareavailable. domainexpertsarenotrequiredtoknowtheexecutableknowledge HL7ArdenSyntaxisoneoftheopenstandardsofproceduralrep- paradigmusedasanintegralpartinarealhealthcareworkflow. resentationofmedicalknowledge.Theknowledgeisrepresented Moreover, the ultimate goal of knowledge acquisition is to rep- inamodularlogicunit–theso-calledMLM–whichissharable resent the knowledge that functions with an existing healthcare across an organization [20]. HL7 Arden Syntax is used to spec- workflowandtoenableapropervalidationprocessforthefinal ifyaknowledgerepresentationthatissharable,withthecontents knowledgemodel.Inthisregard,weadaptadata-drivenknowl- being readable by both humans and machines [1]. In this work, edgeacquisitionandvalidationapproachthatreflectsarealclinical R-CKM, which was created from a PM, is represented in MLMs setupderivingtheclinicalknowledgefromexistingclinicalprac- andevaluatedintermofMLMperformanceonrealpatientmed- tices. The proposed approach is an iterative model that includes ical data. We use HL7 Arden Syntax V2.7. However, readers can threephaseswithtenactivities.Itenablesadomainexperttocreate accessdetailedspecificationswiththecurrentversion(2.10)ofHL7 anexecutableknowledgebasewithcoordinationofaknowledge ArdenSyntaxfromtheHL7ArdenSyntaxworkinggrouprepository engineer.Anabstractviewofthephaseswithactivitydescriptions [21]. isdepictedinFig.1. Fig.1. Data-drivenclinicalknowledgeacquisitionandvalidationprocessmodel. Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 4 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 2.1. Phase-I:clinicalknowledgemodelling algorithmsonexistingmedicalrecordsinHISandapplyingarig- orousvalidationprocesssupportedbytheCKM.PhaseIIcomprises ACKMisthebaselinemodelforvalidationofultimateclinical activitiesthatarecategorizedintotwobroadperspectives:data- rulesinaknowledgebase.Itrepresentstheclinicalknowledgefor drivenknowledgeacquisitionandknowledgevalidation. aspecificclinicaldomain,whilereferencingstandardknowledge resources,suchasCPGs.CPGsarewidelyusedconsultingknowl- 2.2.1. Data-drivenknowledgeacquisition edgeresourcesthatareappliedinclinicalpracticesfordiagnoses Patient medical records in HIS are the primary resource for andtreatments.CPGsareavailableindifferentformats,suchastex- acquisitionofclinicalknowledge.Patientmedicalrecordsreflect tualnarrativesand/ordecisiontrees.ACKMisaformaldecisiontree patient encounters and detail histories of diagnoses and ongo- representationofCPGsalignedwithclinicalobjectivesinapartic- ing or completed treatment. Various computational methods ulardomain.InPhaseI,ateamofphysiciansisinvolvedtofinalize and tools are widely used for analyzing patient medical records theCKMandperformthefollowingactivities. and creating PMs for future recommendations. The authors in [16]discussedoverallissuesandprovidedstate-of-the-artguide- Definingasetofclinicalobjectives: Clinical objectives specify the lines to use a ppropr iate m etho ds for P M creation in c linical scope an di nt endedo utcomeso fknowl edgeacquis ition.In the medi cin e.Int hisstudy,pa tientdata ofhe ada ndneckc anc erswith contex tof aCDSS,cli nicalobjec tiv esaredefin edtospecify po ssi- tumors in th e pr imary oral ca vity sit e wer e im port ed into SPSS bleCDS Sin t ervent ions. [19]htt p:/ /ww w.sussex .ac.u k/its/p dfs/S PSSD ecisionTr ees2 1.pdf • We definetheclinicalobjectiveforcreatingclinicalknowl- (acce ssed24.12.14)andWeka[23]fromHIS tocreate aPM usinga edg ecover ing atreatm entplan rec ommend ationfo rhead decisiontr ee(DT)cl assifi cation met hod.T hem a inadva n tage ofDT s andn eckcanc er .Thetreatm ent planincludesasin gle pro- is the co mpr ehen sibility of th e classifi catio n str ucture, wh er eby cedu re or combi natio n of proc edure s from r a diother apy, th eyc aneasilydetermin ea ttrib utesforclass ifyingand verifying chemot he rapy, inductio n c hemotherap y, and surgery. The new data [18].I naddition, owingtop ow erfulheuris tics, thecom- clinicalknowle dgehelps physiciansdurin gtr eatment ofa puta tiona lcom pl exityoft heDT ind uctional gorithmis low [16]. patient orgivesrec omme ndationsof atreat mentplan du r- Finally,itp rovidestheo pp ortu nit ytogenera tereadilyc om pre hen- ingam ult idiscip linaryconference 2o f physicians. siblekn ow ledgeru les. • The r ecommendations are classified b ased on a particular In thecontex toftasksandguidelinesprovidedby[16]forthe tum orsite.Inthispape r,th eoralcav ityisco ns id eredapri- PM,t hed ata-driv en know ledg eacquisitio ntakesin to acco unt the marys itefo ra nin itialCD SSr ecom mend at ion. follo win grelatedac tivities.Fig. 2highlight sthe deta iledtask sin SelectionofCPGs: PhysiciansconsultCPGsfordiagnosesandtreat- eachactivityandpossiblesequencesforperformingtheseactivities. ment plans during medical practices. In this step, physicians intend to select appropriate CPGs that align with the clinical Clinicaldatadescriptionandpre-processing: The clinical data of objectives and use an appropriate knowledge representation oral cavity cancer patients for this study was imported from scheme. In current work, SKMCH physicians use the following HISwiththirteenconditionattributesandonedecisionattribute guidelines: • NCCNguidelinesareusedascandidateCPGstomodelclini- (treatment plan). Details of the condition attributes and decision attributesareshowninTable1.Forthefinalpredictionmodel, calknowledgefortumorsoftheoralcavityandothersites. • Tum or,Node,M et astases( TN M) stag inggui delin esare used 1229 patient records were used after applying pre-processing (removingandcalculatingmissingattributes)on2181original torepresenttheclinicalstagingoftumors. • A decision t ree is selec ted as a f ormal representation of patient records. Detailed steps of the pre-processing are shown inthe“Datapreprocessing”activityinFig.2. knowledge. Decision nodes are represented as rectangles Selectionofmachinelearningalgorithm: Themaingoalofthisactiv- anddiamonds,whiletheconclusionnodesarerepresented ityistodeterminetheappropriatebest-performingdecisiontree withovalcornerrectangles.Ovalcornerrectanglesalsoplay algorithmonagivendatasetforgeneratingthefinalPM.PMscan theroleofconditionnodesincaseitcomesinthemiddleof be evaluated based on their predictive performance and com- thetree. prehensibility. Predictive performance can be quantified using CreationofCKM: Ateamofphysiciansconvertstheselectedguide- classification accuracy, while comprehensibility is a subjective linesintoaformaldecisiontreerepresentation.Fororalcavity measurethatisassessedbyadomainexpert.Inourcontext,we CKMdevelopment,thefollowingstepsareused: • Tworesidentd octo rsareassi gnedt oin itiallycreatethedraft cooumrdbeinsier ebdocthri tmereiaa.suThreesc irnitteor iqaudaenfitinteadtivbey mtheeadsuomreas intoe axcpheiretvies decis iontree fromNC CN guidelin et reesand narrat ives . the generat ingofaP Mw ithhigh accurac ya nd providin gamin i- • Theiniti aldra ftof thede cisiontree isth orou ghlyinspected mal setofdecis io n path sby invol vingfewe rdo minantco n dition bya senio ronc olo gist andapp rove d asafinalCK Mwitha attri but es .Thiscrit eriais tra nslatedin toaq uantitative measure po ss ibleam endmentif nee ded. usingthew eigh tedsum m odel(WSM ).W SM rankingise xpressed • Inthefin alCKMdecis io ntree,asenioroncologistmayincor- inEq. (1), whichuse sclas sificat ionaccu racyP ,thenum b erofrules po rate som epro venpra ctices thatm aynotbei nclu dedin ge ner ated R,and the numberofat tributesA i nvo lvedint he con- CPGsb utbea reviden ceofitsva lidit yfrom oth er knowledg e. ditions.W eig hts w a reassign ed basedon th eimport an ceo fthe j attributes.ClassificationaccuracyPisthemostimportantinthe 2.2. Phase-II:knowledgeacquisitionandvalidation selectionofalgorithm,whichisassignedwp:+0.8.Thenumber ofrulesR an dnumbero fattrib u tesAarea ssig nedw r :− 0.1and Knowledgeacquisitionandvalidationareacoreaspectofthis wa :−0 .1 , res pectively . A ccording to our criteria, an a lgori thm work.Theyareachievedthroughapplicationofmachinelearning withaminimumrulesetandinvolvingfewerattributesispre- ferred.Inthisregard,wechooseanegativescaletodiscourage algorithmsthatgeneratemaximumrulesand/orthoseinvolving moreattributes. 2 Amultidisciplinaryconferencecomprisesapanelofdoctorsincludingoncologi- Accordingtoourcriteria,CHAIDisthemostsuitablealgorithm sts,radiologists,surgeonsandotherresidentphysicians.Theyconductaconference onr egularbasis toselectt hefi naltr eatment planforap atien t. amongCRT ,Q UES T,DFTre e,andJ4 8 foru sewi ththePM andrules Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 5 Fig.2. PMcreationprocess. generation,asshowninthe“MLalgorithmselection”activityin parametersoftheSPSStool,asshowninthe“predictionmodel Fig.2. creation”activityinFig.2. (cid:2)m A decision tree representation of the PM is shown in Fig. 3. The RankingWSM−score= ˛ wjaij, for i = 1, 2, 3, . . ., m (1) rcaocnyfuosfiothne mmaotrdiexl .sIhnoswunm imn aTrayb,loeu r2 fipnraelsemnotsd etlheac ohvieevreadll 7a1cc.0u%- j=1 accu rac yfo rclassifi ca tionoffina lcan cert reatme nt. ThePMofCHAIDclassifiedthedesignatedtreatmentplanwith Here ˛ : (0.8) is scaling constant and aij are attributes the accuracy of C CRT (85.5%), S RT (80.5%) and RT (46.2%). The accuracyofRTwascomparativelylow,butthemajorproportion with weight wj ofcases( 12 7;5 9.4% )ofremainingc ases (21 4)w erecla ssifiedasS RT.TheSRTtreatmentplancoveredradiotherapy(RT)follow- (cid:3) (cid:4) ing surgery as the main procedure; therefore, it compensated awijj A0.c8c uracy(P) N−u0m.1b erofRules(R) A−t0tr.1ibutes(A) tdheec isleiossne rc lap srse.c Tishioen re oafr eth me acnlays sriefiacsao tniosn f omr otdhee l dti or eocnt lRyT t htree aRtT- mentratherthanfollowingstandardtreatmentofSfollowedby CreationofaPM: CHAIDisanappropriatecandidatealgorithmthat RT.Theseincludepatientnotwillingforsurgeryorhavingsome providesaPMwithreasonableclassificationaccuracyongiven comorbiditiesassociatedwithtumorsite.TheCHAIDclassifica- datawhilegeneratingeasilyunderstandablerules. tionalgorithmhastheintrinsicpropertyofselectingdominant CHAID uses multiway splits to generate more than two nodes attributes from a set of condition attributes, which provided a from a current node. It chooses the independent (predictor or highersegmentationofdata.Itselected4/13attributes“Treat- conditionattribute)variablewiththestrongestinteractionwith mentIntentDescription”,“ClinicalStageT”,“ClinicalStageS”and the dependent variable (decision attribute). It has the capabil- “HistologyDescription”.Themodelgeneratedtennodesoverall, ity of merging the category of each predictor if they are not ofwhichsixnodeswereterminal(leaf)nodesinthetree. significantlydifferentwithrespecttothedependentvariable.A detaileddescriptionofCHAIDandothertreealgorithmsaregiven in[18,19]. 2.2.2. Knowledgevalidation CHAIDisappliedona1229-patientdatasetusingthe13condi- Knowledge validation is performed on the PM with confor- tionattributesmentionedinTable1.Thealgorithmhasdefault mancecriteriathatensurethatthefinalR-CKMmodelconforms Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 6 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx Table1 Datadescriptionoforalcavity. Datasummary: Totalattributes:14 Decisionattribute:treatmentplandescription Attributetype Attributetitle Attributedescription Decisionattribute Treatmentplandescription Treatmentplanforpatient. C:Inductionchemotherapyisdone. CRT:Chemoradiotherapyisdone. RT:Radiotherapyisdone. S:Surgeryisdone. CCRT:CisdonefollowedbyCRT. CRT:CisdonefollowedbyRT. SCRT:SisdonefollowedbyCRT. SRT:SisdonefollowedbyRT. CSCRT:CisdonefollowedbySandCRT. CSRT:CisdonefollowedbySandRT. UK:Treatmentunknown. NA:Treatmentnotapplicable. Conditionattributes Gradedescription Indicatepatienttreatmentstatussuchaspoorandmoderate. Treatmentintentdescription Patientstatusfortreatment,suchaspalliativeorradical. Treatmentstatusdescription Indicatepatienttreatmentstatussuchascompleted. ClinicalStageT TNMStagingTvalue. ClinicalStageN TNMStagingNvalue. ClinicalStageM TNMStagingMvalue. ClinicalStageS TNMStagingSvalue. Smoking Smokingstatus. Alcohol Alcoholstatus. Naswar Naswarstatus.Naswarisamoist,powderedtobaccosnuff. Pan Panstatus.Panistypeof,tobaccochewedandfinallyspatoutorswallowed. Patientstatus Patientcurrentstatussuchasaliveanddead. Histologydescription IndicatepatientdiseasesuchasCarcinoma. totheCKM(standardCPGs).Domainexpertsandknowledgeengi- allprimaryvalidationcriteriaandpassesatleastoneofthenon- neers worked in a collaborative environment to validate the PM primarycriteria.Forexample,anydecisionpathintheoralcavity against CKM using well-defined inspection and testing mecha- PMbecomespartofR-CKMifitsatisfiescriteria1and2andeither nismsanddevelopedtheR-CKMconformedandvalidatedmodel. 3or4,asmentionedinTable3.Theiterationisfinishedonceall Fig. 4 depicts the process of validation with detailed flows of thedecisionpathsinthePMareevaluatedagainstthevalidation steps. criteria. Activitiesperformedinthevalidationprocesscanbeclassified InspectionandrefinementofselectedPMdecisionpath: The deci- intothreemaincategories:settingthevalidationcriteria,validating sion path P in the PM that passes the validation criteria is i thePMagainstthevalidationcriteria,andfinallyevolvingtheR- inspected and refined to P prior to becoming part of R-CKM. j CKMbyinspectingandrefiningthePM. Inspection and refinement involves activities to identify con- ditional and decision values in a decision path for the same Settingvalidationcriteria: Thevalidationcriteriaareasetofasser- interpretation with existing CKM conditional and decision tions thatmay berequ ired topassthe decisio np at hs in PMto values. Theref ore, in the refi ned d ecision path , the concepts be el igible for incl usion in R-C KM. Dom ain exp erts sp ec ify t he arealig nedaccord ing tot heCKM. Forexam ple,th eco nditional val idationc rite ria,which is influenc edbythe irpractic esandc on- valu eclinic alstageS: 1 isin terpre ted asclinica lsta geT:1and forms to t he evide nce fr om standard gu ideli nes (CKM ). W hile clinica l stage N: 0 in t he CKM. Simi lar ly, durin g refi ne m ent, specify in gth ecriteria, each criterion isassigned apriori tyand the phy sician ma y add som e oth er treatme nt to a lready given itsprimary sta tus.The priori tydictate st heordero f executio nin cho icesbypro vidin gev idence forth einclusion . thevalidationprocess;theprimarystatusspecifiesthatthegiven criteriamustbesatisfiedbythedecisionpath.Table3provides 2.3. Phase-III:R-CKMtransformationintoexecutablerules thesetofcriteriadefinedbydomainexpertstovalidatetheoral cavityPMinthiswork. Mostoftheprojectsinvolvingdataminingandmachinelearning PMvalidationagainstcriteria: Validation is an iterative process techniquesintheclinicalsetuparestoppedafterthePMiscreated. thatselectsonedecisionpathP atatimefromthePMforvali- ThisisunfortunatebecausethePMshouldbedeployedinareal i dation.ThedecisionpathisselectedaspartofR-CKMifitsatisfies clinicalsetupforclinicaldecisionsupportasadailybaseservice. Table2 ClassificationusingtheCHAIDmodelfororalcavitytreatmentplanning. Observed Predicted CCRT RT SRT Totalcases Accuracy CCRT 343 12 46 401 85.5% RT 87 184 127 398 46.2% SRT 81 3 346 430 80.5% Total 511 199 519 1229 Overallpercentage 41.6% 16.2% 42.2% 71.0% Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 7 TreatmentPlanDesc Node 0 Category % n C CRT 32.6 401 RT 32.4 398 S RT 35.0 430 C CRT Total 100.0 1229 RT S RT TreatmentIntentDesc Radical Palliative Node 1 Node 2 Category % n Category % n C CRT 37.8 389 C CRT 6.0 12 RT 20.8 214 RT 92.5 184 S RT 41.5 427 S RT 1.5 3 Total 83.8 1030 Total 16.2 199 Clinical Stage T 1 2 3; 4 Node 3 Node 4 Node 5 Category % n Category % n Category % n C CRT 1.5 3 C CRT 16.0 42 C CRT 60.4 344 RT 28.8 57 RT 17.9 47 RT 19.3 110 S RT 69.7 138 S RT 66.0 173 S RT 20.4 116 Total 16.1 198 Total 21.3 262 Total 46.4 570 Clinical Stage S HistoDesc II (2) III (3); IV (4) Squamous cell carcinoma; Small cell Mucoepidermoid carcinoma; NA; carcinoma; Carcinoma NOS Adenocarcinoma; Adenoid cystic carcinoma; Basal cell carcinoma; Squamous cell carcinoma in situ; Verrucous carcinoma; Malignant melanoma; Pleomorphic adenoma; Spindle cell carcinoma; Ameloblastoma, malignant; Adenoid squamous cell carcinoma; nasopharyngeal carcinoma; Sebaceous adenocarcinoma; Sarcoma, not otherwise specified; Plasmacytoma, not otherwise specified Node 6 Node 7 Node 8 Node 9 Category % n Category % n Category % n Category % n C CRT 10.8 21 C CRT 30.9 21 C CRT 67.1 343 C CRT 1.7 1 RT 19.6 38 RT 13.2 9 RT 17.0 87 RT 39.0 23 S RT 69.6 135 S RT 55.9 38 S RT 15.9 81 S RT 59.3 35 Total 15.8 194 Total 5.5 68 Total 41.6 511 Total 4.8 59 Fig.3. CHAIDpredictionmodelfororalcavitytreatmentplans. Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 8 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx Fig.4. Knowledgevalidationprocess. Thiswouldhelpinsupportingthemedicalservices,andstakehol- analyzedforfinalexecutableknowledge.Theseareexplainedin derscouldeasilymonitortheefficiencyofthetechnologyinterms Section 3.3. ofimprovingqualityofcareanddecreasinghealthcarecosts[16]. DatarequirementsforMLMsusingtheHL7vMRmodel: Data speci- Maintainingthemotivationtoprovideend-to-endimplementation ficationsforeachMLMareimportantfortheformalcreationof ofthePM,R-CKMisrepresentedinsharableformatusingHL7Arden logic. Data specifications include enlisting clinical data that is Syntax.R-CKMisconvertedintoasetofcandidateMLMsandimple- requiredfortheMLM,representingclinicaldatainthestandard mentedinarealsetupforrecommendationofatreatmentplanfor data model (HL7 vMR), and mapping coded concepts into a oralcavitycancerpatients.ConversionofR-CKMintoexecutable standardvocabulary.Section3.3detailsthespecificationsofdata knowledgerepresentation(MLM)isperformedusingthefollowing requirementsforcandidateMLMs. activities. IdentificationofHL7ArdenSyntaxartifactsandMLMscreation: Arden Syntax is a comprehensive specification supporting a SelectionofcandidateMLMsfromtheR-CKM: R-CKMcanbetrans- largenumberofoperators,variouscontrolstructures,including formed in to differe nt set s of M LM s depe nding on th e d omain decis ionandl oo pingstruct ures,an dcomp rehensivem odelsfor expert intuit ions and logic al connec tions existi ng in t he deci- various d ata types. K nowledge engin eers summariz e the ba sic sionpa thofR-CK M.Fo rR-CK M,threecand idateap pro ach esare artifacts requ ired t o transform the R-CKM into corre spo nding Table3 Validationcriteriafororalcavity. C.no. Criteria Priority Primary Remarks 1. { ∀ Pi∈ PM : Accuracy(Pi) > N % } 1 Yes • The domain expert assigns N, which represents the accuracyofPMbasedonthetrainingdata. •Tradeoff: H ighe raccu rac yse ttingpro ducesanefficient model,butcoverageofinvolvingmorepatientfeaturesis limitedandviceversa. 2. { ∀ Pi∈ PM ∧ ∀ Pj∈ CKM : ! Conflict(Pi, Pj)} 1 Yes • Conflicts wit h gu idelines; conflicting treatments must notbeexist. •Exam p le:aftersurgery,chemo-inductionhasnomeaning. 3. {∀Pi ∈ PM ∧ ∃Pj ∈ CKM : Conform(Pi, Pj)y→ields Pi ∈ (cid:2)RCKM} 2 No • Decision path in PM conforming to any CKM path shall be partofR-CKM. {∃Pi ∈ PM ∧ ∀Pj ∈ CKM : 3 No • Decis io n path in PM not conforming to any path in CKM 4. !Conform(Pi, Pj) pro→vid es Evidence(Pi)y→ield s Pi ∈ (cid:2)RCKM} •Scuafnfi bciee nptaretv oidf eRn-cCeKeMx iosntslyfo irf:effectivenessofthetreatment. • Evidence canbeoth ersta nda rdclinicalkno w ledg e resourcesorlocalpracticeswithareasonablesuccess ratioforthepredictedtreatment. Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx 9 Fig.5. ClinicalknowledgemodelfortheoralcavityusingtheNCCNtreatmentplan. MLMs and provide physicians with training on using these CKMdecisiontreemodel,physiciansalsointegratelocalpractices artifacts. intodecisionpathswithsufficientevidenceforimprovingpatient IntegrationofMLMswithHISworkflow: Knowledge engineers care.Forexample,ChemoInductionwasaddedtothefinalCKMwith implement the MLMs and integrate them with HIS workflows. localpractices,suggestingthatinductionchemotherapyfollowed DetailsoftheimplementationandintegrationofMLMswithHIS bychemoradiotherapybeforesurgeryimprovesoverallsurvivalof areavailableinourpreviousresearch[24,25]. patients[26]. 3. Results 3.2. Phase-II:knowledgeacquisitionandvalidation 3.1. Phase-I:clinicalknowledgemodelling PMevaluationusingdisjointpatienttestdata: The experiment is performedona(disjoint)datasetof739patientswithcompleted The team of physicians establish the clinical objective of treatments.EachdecisionpathofthePMisevaluatedforasetof incorporatingtheCDSSinterventionforthetreatmentplanrecom- candidatepatientcases.Thepatientcasesaredistributedintosix mendationsforpatientswithanoralcavitytumor.NCCNguidelines decisionpaths,wheredistributionisbasedonqualificationfor areselectedtodeveloptheCKMforthedecisionoftreatmentplans theconditionpartinthepath. forthesepatients.Fig.5,showsthefinaloutcomeoftheCKMof TheoverallaccuracyofthePMiscalculatedastheweightedmean theclinicalknowledgemodelingphasefromtheNCCNguidelines. oftheaccuraciesofallofthedecisionpaths,asshowninEq.(2). Theseguidelinescoverthedomainasawholeandprovideageneral PMaccuracyis59.0%inthetestdata,whichisencouragingtocon- viewofthedecisionmodel.InordertoconvertNCCNintothefinal siderthemodelforconstructingthefinalR-CKM.Table4shows Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008 G Model ARTICLE IN PRESS ARTMED-1432; No.ofPages20 10 M.Hussainetal./ArtificialIntelligenceinMedicinexxx(2015)xxx–xxx Table4 PMevaluationondisjointpatienttestdata. Path# PMdecisionpath Candidatepatientcases PMpathaccuracy Path-1 Node0→RT Palliativepatients:69 40.58%(C:28W:41) Path-2 Node0 → No de1→Node3→SRT Patientw ithradica lan dCS:T1:139 95.68% (C:133 ,W:6) Path-3 Node0 → Node1 → Node4 → N ode6→SRT Patient swith radica land FC SII: 123 73.98%(C:91,W :32) Path-4 Node0 → Node1 → Node4 → Node7 → S RT Patients with radical and FCS III; IV:5 6 67.86%(C:38, W:18) Path-5 Node0 → Node1 → Node5 → Node8 → C CR T Patients with radical ,CS:T 3-4 andH ist oDescription1,2,3:324 38.58%(C:12 5,W:199) Path-6 Node0 → Node1 → Node5 → Node9 → S RT Patients with radical, CS:T3-4 and HistoDescription ot he rt han 1,2,3:28 85.71% (C:24,W :4) Overall PM Accuracy: PMacc 59.0% •CS:ClinicalStage. • FCS :FinalC linicalStage. • Histo Desc ription1 :Squamouscellcarcinoma. • HistoDescription 2: Smallcellc arci noma. • HistoDescription 3: Carcin oma NOS. • C:Correctlyclass ifie dpatientc ases. • W :Wrongly classified patient cases. the detailed distribution of patient test cases over six decision ValidatedR-CKMbasedonPM: R-CKMiscreatedfromthePMwhile path sandthe ircorrespon din gaccura cies .The“ PMDe cis ionPath” usingth evalid ation crit eriadefined b ydomai nexp ert s.U singa repres ents the decisionpaths coveredby PM, “Can didatePa tient set of fou r validatio n criteri a for the o ral cavit y, the va lidatio n Cases”are the recordn umber ofcase sth atq ualifythed ecision pro ces sisa ppliedont heoralc avit yPM ,wh ichresu lted inR-CKM, pathan d“ PM PathAc curacy” ist hepe rcent ageofp atie ntcases asshow n inFig.6 . corre ctly class ified bythePM de cisio npath. Ta ble 5 s ho ws t he details of the applicable validation criteria (cid:5) (AVC) fo r each dec ision pa th of t he PM. For validation criteria n (pat ×A ) 1, phy sici ans e stablish t he ac cur acy leve l of N = 50% (a ccuracy PMacc= i=(cid:5)1 n ci dpi (2) of pathsbase dontrain ing data).Fiv edec isio n p aths inthePM i=1patci ar econfo rmed to theCKM (i.e.,p assin gvalidat ioncr ite riao f1, 2,3),whileonedecisionpathrepresentthelocalpractices(pass- where pat and A represent the number of patient ing criteria 1, 2, 4). Moreover, the PM decision always existed ci dpi inl eafnode s, wh ere asinR-CK Mth ed ecisionn odecou ldhave cases assigned to path dp and accuracy of path dp i i occurred in the middle and would not work as a conditional respectively. nodeforthefollowingsub-tree.Forexample,inthePM,decision Fig.6. Refinedclinicalknowledgemodelfortheoralcavitytreatmentplan Pleasecitethisarticleinpressas:HussainM,etal.Data-drivenknowledgeacquisition,validation,andtransformationintoHL7Arden Syntax.ArtifIntellMed(2015),http://dx.doi.org/10.1016/j.artmed.2015.09.008

Description:
a Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si 446-701, Gyeonggi-do, Republic of Korea The final executable model is presented in HL7 Arden Syntax, which helps the clinical knowledge .. There are many reasons for the direct RT treat-.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.