SS symmetry Article An Internet of Things Approach for Extracting Featured Data Using AIS Database: An Application Based on the Viewpoint of Connected Ships WeiHe1,2,3,ZhixiongLi4,5,† ID,RezaMalekian6,* ID,XinglongLiu3,7,*andZhiheDuan8 1 CollegeofMarineSciences,MinjiangUniversity,Fuzhou350108,China;[email protected] 2 TheFujianCollege’sResearchBasedofHumanitiesandSocialScienceforInternetInnovationResearch Center(MinjiangUniversity),Fuzhou350108,China 3 FujianProvincialKeyLaboratoryofInformationProcessingandIntelligentControl(MinjiangUniversity), Fuzhou350121,China 4 SchoolofMechatronicEngineering&JiangsuKeyLaboratoryofMineMechanicalandElectricalEquipment, ChinaUniversityofMining&Technology,Xuzhou221116,China;[email protected] 5 DepartmentofMechanicalEngineering,IowaStateUniversity,Ames,IA50010,USA 6 DepartmentofElectrical,Electronic&ComputerEngineering,UniversityofPretoria,Pretoria0002, SouthAfrica 7 DepartmentofPhysicsandElectronicInformationEngineering,MinjiangUniversity,Fuzhou350108,China 8 SchoolofMechanicalEngineering,Xi’anJiaotongUniversity,Xi’an710001,China;[email protected] * Correspondence:[email protected](R.M.);[email protected](X.L.); Tel.:+27-12-420-4305(R.M.);+86-0591-8376-1175(X.L.) † Currentaddress:Fuzhou350108,China. Received:19July2017;Accepted:28August2017;Published:7September2017 Abstract: Automatic Identification System (AIS), as a major data source of navigational data, is widelyusedintheapplicationofconnectedshipsforthepurposeofimplementingmaritimesituation awarenessandevaluatingmaritimetransportation. EfficientlyextractingfeatureddatafromAIS databaseisalwaysachallengeandtime-consumingworkformaritimeadministratorsandresearchers. Inthispaper,anovelapproachwasproposedtoextractmassivefeatureddatafromtheAISdatabase. An Evidential Reasoning rule based methodology was proposed to simulate the procedure of extractingroutesfromAISdatabaseartificially. First,thefrequencydistributionsofshipdynamic attributes,suchasthemeanandvarianceofSpeedoverGround,CourseoverGround,areobtained, respectively, according to the verified AIS data samples. Subsequently, the correlations between the attributes and belief degrees of the categories are established based on likelihood modeling. Inthiscase,theattributeswerecharacterizedintoseveralpiecesofevidence,andtheevidencecan becombinedwiththeEvidentialReasoningrule. Inaddition,theweightcoefficientsweretrained in a nonlinear optimization model to extract the AIS data more accurately. A real life case study wasconductedatanintersectionwaterway,YangtzeRiver,Wuhan,China. Theresultsshowthatthe proposedmethodologyisabletoextractdataveryprecisely. Keywords: AutomaticIdentificationSystem(AIS);EvidentialReasoningrule;likelihoodmodeling; beliefdistribution;non-linearoptimization 1. Introduction TheAutomaticIdentificationSystem(AIS)isabroadcast-stylecommunicationsystemdeveloped for exchanging navigational data automatically and autonomously. It was originally designed for ship collision avoidance and maritime regulatory, and now it is gradually being used as an importantconnected-shiptechniquebyshipownerstomonitortheshiplocationandcargos. TheAIS Symmetry2017,9,186;doi:10.3390/sym9090186 www.mdpi.com/journal/symmetry Symmetry2017,9,186 2of15 communicationnetworkcanbeseparatedintotwoparts,theshipborneAISstationandtheshore station. TheshipborneAISstationsbroadcastshiprelatedstaticanddynamicinformationonVery HighFrequency(VHF)bandatregularfrequencyaccordingtotheInternationalTelecommunication Union (ITU) specification. The static information, which is manually entered or updated, mainly include identifier, call sign, ship name, dimensions, type, etc. The dynamic information, which is fromtheGlobalNavigationSatelliteSystem(GNSS),includestimestamp,SpeedoverGround(SOG), CourseoverGround(COG),position(presentbylongitudeandlatitude),etc. Theshorestationsare mainlysetupbyshoreauthorities. WhiletheshipborneAISstationissendingoutmessages,ships andshorestationsaroundcanreceivethesemessagesanddisplayingthemonElectricalNavigation Chart(ENC).Therefore,theAISenablesthecommunicationbetweenshipsandshoreauthorities,and helpthemaritimeadministrators,shipowner,andshippilottodetectthepositionandstateofvessels. AccordingtothemandatoryrequirementoftheConventionontheSafetyofLifeatSea(SOLAS) conventionin2002,allshipsof300grosstonnageandupwardsengagedoninternationalvoyages,cargo shipsof500grosstonnageandupwardsnotengagedoninternationalvoyages,aswellaspassenger shipsirrespectiveofsizeshallbefittedwithanAIS.Hence,theAISbecameoneofthemostwidelyused piecesofequipmentonboard. BecausetheAISprovidesabundantshipvoyagerelatedinformation, itisrecognizedastheprimarydatasourceinmaritimesituationawareness. TheAISdataiswidely usedinriskanalyses,accidentprevention,motionpatternrecognition,routeprediction,andanomaly detection,etc.Quetal.[1]introducedtheAISdatabaseintoquantitativelyevaluatingthecollisionrisks intheSingaporeStrait. Balmatetal.[2]proposedafuzzyapproachtoassessthemaritimeriskatsea basedonthedecision-makingsystemwiththerealAISdata. Wangetal.[3]exploitedthecontextual AISdatainacoupledspatial–temporalperspectivemicroscopicanalyticalschemeforatwo-vessel collisionaccidentinvestigation. Mouetal.[4]madeastatisticalanalysisforcollisioninvolvedshipsby usingAISdatainRotterdamPorttoinvestigatetheaccurateandactualbehaviorofcollision-involved ships. Risticetal.[5]extractedmotionpatternsbythestatisticalanalysisoftherealhistoricAISdata, inordertoconstructthecorrespondingmotionanomalydetectorsofshipsinportsandwaterways. Pallottaetal.[6]proposedanunsupervisedandincrementallearningapproachtoextractmaritime movementpatterns,forthepurposeofconvertingrawdatatoinformationsupportingdecisions,such astrafficrouteextractionandanomalydetection. Inthesepreviousstudies,theextractionofvaliddata fromtheAISdatabaseisalwaystheprimaryandfundamentalstep. However,theextractionofvalid dataisalsothemosttime-consumingandlaborioustaskbeforeconductingstudies. Many researchers gave their solutions on this problem. Demšar et al. [7] introduced a 3D space-timedensityoftrajectoriestovisualizingthetemporal-spatialdata,andpresentedanapplication tovessels’trajectoriesacquiredbyAIS.Scheepensetal.[8]proposedamethodthatcanexplorethe attributesalongtrajectoriesbycalculatingadensityfieldformultiplesubsetsofthedata. Inorder todiscoversimilargroupsofsub-trajectoriesinthespatialdatabase,Debnathetal.[9]proposeda frameworktoclustersub-trajectoriesbycombiningtechniquesfromgridbasedapproaches,spatial geometry, and string processing. Kim et al. [10] developed a maritime traffic gridded database byprojectingtrafficdataonageographiccoordinatesystem. However, ingeneral, thegrid-based methods for extracting AIS data need a prior definition on optimal grid size, which is difficult to get. To solve this problem, Arguedas et al. [11] proposed an algorithm to automatically produce hierarchical graph-based representations of maritime shipping lanes extrapolated from historical AISdata,butthegrid-basedmethodbecameinefficientasthescaleexpands. Theimprovementof grid-basedmethodsondataclassificationandextractionisthevector-basedmethods. Inthisaspect, Gerbenetal.[12]appliedapiecewiselinearsegmentationmethodtocompressthetrajectoriesobtained fromAISdata. Zhangetal.[13]usetheDouglas-Peucker(DP)algorithmtosimplifyAIStrajectoriesby extractingthecharacteristicpoints. Comparedwiththegrid-basedmethod,thecomputationalburden ofthevector-basedmethoddecreasedalotindeed. However,thesemethodsdidn’tfullyutilizethe contentsoftheAISdata,suchastheSOG,COG,etc. Xiaoetal.[14]proposedastatisticalapproach toextractingshiptrafficbehaviorswithAISdata,includingspatialdistribution,speeddistribution, Symmetry2017,9,186 3of15 coursedistribution,averagespeed,andtrafficdensity,theresearchcontributedalotfordataextraction fromtheAISdatabase. Liuetal.[15]providesasolutionforextractinganomalymovementinthe maritime traffic domain based on the clustering, in the study, the position, SOG, and COG were utilizedinextractingAISdataintandem,i.e.,theroleofposition,SOG,andCOGinAISdatapoint areunrelated. In daily administration, maritime administrators are capable of extracting featured AIS data accordingtotheirexperienceondynamicattributesofthevessels,suchasSOG,COG,andposition. However, it is impossible and inefficient to extract the massive AIS data timely by manual labor. Referringtotheprocedureofvesselclassificationofmaritimeadministrators,itisfeasibletosimulate theexperiencesofadministratorsbasedonartificialintelligencethatshouldbeabletomakeconjunctive inferencewiththeSOG,COG,andposition. Focusedontheextractionofvesseltargetsinradarimages, Maetal.[16]giventhesimilarcognitionofbuildinganartificialintelligencebaseonBayesianNetwork. Therefore,buildinganartificialintelligenceforextractingfeaturedAISdataisatypicalprobabilistic inferenceissueappliedinvessels’spatial-temporaldata. Basedontheaboveanalysis,aprobabilistic methodologywasproposedtoclassifyshipswithAISdynamicinformation. Theposition,SOG,and COGaretreatedaspiecesofevidencetomakethefinaldecisioninacooperativeway. However,in contrasttotheSOGandCOG,thepositionishardtobeconvertedintopiecesofevidence.Nevertheless, astheshipsailinginthewaterway,theSOGandCOGofthevesselswouldbefeatured.Inconsideration ofthetwofactors,thetrajectoryevidencewasdiscardedandtheAISdatawiththeSOGandCOG evidencewasextracted. Inthiscase,thecomputationalburdenoftheAISdatawillbedecreasegreatly. Thescopeofthisstudyistoproposeanartificialintelligenceforextractingfeatureddatafromthe AISdatabase. Thepaperisorganizedasfollows: Section2givestheproposedapproachforextracting featuredAISdatawithvalidatesamples. Section3deliversanapplicationoftheproposedapproach, whichlocatedintheWuhanwaterway,middlestreamoftheYangtzeRiver. Finally,Section4contains theconclusionsofthisproposal. 2. AProposedApproach Asmentionedabove,thispaperisdedicatedtointroducingtheartificialintelligenceofmaritime administratorsintoextractingfeaturedAISdata. Themaritimeadministratorsmaketheirinferences onfeatureddataaccordingtotheirexperience,whichcomesfromthedailyobservationofmassive vessels. Theartificialintelligenceistypicallyakindofprobabilisticinference[16]. BayesianInferenceisoneofthemostclassicalmethodologiesintheprobabilisticinferencedomain, anditiscapableofhandlingthenoisedataofmovingtargets[16]. However,thelimitationofBayesian Inference is that it requires the frame of discernment being mutually exclusive and independent. Dempster-Shafer (DS) theory is considered as an improvement of Bayes Inference, it is able to makerigorousprobabilisticinference,whichconstitutesaconjunctiveprobabilisticprocess[17,18]. ThereliabilitiesofevidencesinDStheoryareconsideredtobefullyreliable. Hence,DStheorycannot combinetwopiecesofevidenceincompleteconflict[19]. Toaddressthisproblem,Dezertetal.[20,21] developed the Dezert-Smarandache Theory (DSmT) by introducing coefficient of reliabilities and weights,andmakestheDSmTcapableofcombiningpiecesofevidenceincompleteconflictionby redistributingtheconflictsofevidences. However,DSmTneitherkeepstheconsistencieswiththe Bayesianrule,norisacknowledgedasrigorousreasoning[22]. OriginatedfromtheDempster’sand Bayesianrule,Yangetal.[22]developedtheEvidentialReasoning(ER)ruleinwhichthecoefficients ofweightandreliabilityareinheritedfromDSmT,andtheirdiscountingmethodaremodified,hence there is no more deficiency in the inference process, meanwhile keeping the consistency with the Bayesianrule. Moreover,comparedwiththeconventionalBayesianinference,theERruledoesnot need the prior probability of patterns or states, because it constitutes a likelihood modelling [23]. Therefore, the ER rule provides a rigorous way to build the artificial intelligence of experienced maritimeadministrators. Inthissection,thedetailsonintroducingtheERruleintoextractingfeatured AISdatawillbepresent. Theprocessingflowoftheapproachisdepictedwithfoursteps,including Symmetry2017,9,186 4of15 featureextracting,likelihoodmodeling,conjunctiveinference,andnonlinearoptimizationonweights, aSsymsmhoetwry n20i1n7, F9i, g18u6r e 1. 4 of 15 FFiigguurree 11.. TThhee flflooww cchhaarrtt ooff tthhee pprrooppoosseedd mmeetthhooddoollooggyy.. 22..11.. SStteepp 11:: FFeeaattuurree EExxttrraaccttiioonn IInn oorrddeerrt otob ubiuldildan aenf fiecfifeincitepnrto pbarobbilaisbtiilcisitnifce riennfecreemncoed eml,oadqelu, aan tqituyaonftiAtyIS odf aAtaISsa mdaptlae ssashmopulleds bsheogualtdh ebree dgafitrhsetrlye.dF friorsmtlyth. eFrvoemri fitehde vAeIrSifdieadta AsIaSm dpalteas ,sdamynpalmesi, cdaytntraimbuicte asttorfibvuetsesse losf, svuecshsealss, SsOucGh aans dSCOOGG a,ncadn CbeOqGu, acnatinfi ebde aqnudatnratinfisefdor manedd ttorafnresqfouremnceyd dtios trfirbeuqtuioennscyo fdmisetarinbuantidonvsa roiafn mceeoafnS OanGd, avnadriaCnOceG o,fr eSsOpeGc,t iavnedly C. OG, respectively. SSuuppppoossee tthhaatt tthheerree aarree mm ccaatteeggoorriieess ooff ddaattaa ssaammpplleess,, aanndd tthhee ddaattaa ssaammpplleess ccaann bbee ccllaassssiififieedd bbyy mmaakkiinngg ccoonnjujunnctcitviveein ifnefreernecnecoef tohfe tmhee amneaannd avnadri avnacreiaonfcSeO oGf, SCOOGG,. TChOeGn,. thTehemne, atnhoe fmSOeaGn, voafr iSaOncGe, ovfaSriOanGc,et hoef SmOeGan, tohfeC mOeGan,a onfd CvOaGria, nacnedo vfaCrOiaGncaer oeft CreOatGed aares ftoreuartepdie caess fooufer vpidieecnecse offo ervcildaessnicfye ifnogr dclaatsassifayminpgle sd.aTtaak seatmhepmlese.a nTaokfeS OthGe fomreeaxna mopf lSeO,iGn ofrodre retxoamobptalein, itnh eofrrdeqeur etnoc yobdtiastinri btuhteio fnreoqfumeenacny odfisStOriGbuotifotnh eofd amtaeasna mofp SleOs,Gt hoef mtheea ndaotfa SsOamGpslheosu, tldheb medeiasnc roeft iSzOedG. Ssuhopuplods ebteh deimscirneitmizuedm. Smuepapnoosef SthOeG misinvimminuman dmtehaenm ofa xSiOmGu mis mvmeina nanodf SthOeG misaxvimmaxu,mth mereaanng oefo SfOmGe aisn vomfaSx,O thGe irsaenqguea ollfy mdeivaind oedf SiOntGo Lisp eaqrutsa.lTlyh edniv,iydjedde ninottoe sLt hpeafrrtes.q uTehnecny, oyrjt hdeennuotmesb ethreo ffrtiemquesenthcya totrh ethme enaunmofbeSrO oGf btieminegs etqhuata lthtoe i i Vmaeluaen ioffo rSCOlGas sbej,inwgit heqiu=a1l ,t2o, V..a.lu,Le, ij =fo1r ,C2,la.s.s. ,j,m w. Tithhe in ,=t h1e,2t, o…ta,l nLu, mj =b e1r, o2f, d…at,a msa,.m Tphleens,f othrec ltaostsajl cnaunmbbeepr roefs ednattaa ssaQmjp=le∑s ifL=o1r yclija.sAss j schaonw bne pinreTsaebnlte a1s. Qj =L yj . As shown in Table 1. i=1 i Table1.Attributevaluesforverifiedsampleinthehypotheses. Table 1. Attribute values for verified sample in the hypotheses. VerifiedSampleAttributeDistributionValue ClassCifilcaastsiiofnication Verified Sample Attribute Distribution Value ToTtaolta l ValueV1 alue 1. .. … VaVluaeliue i ..…. VVaalluuee LL ClassC(1l)ass (1) y11 y1 ... … y1i y1 ..…. yy11L Q1Q=1=∑LiL=y11y 1i Class(2) y2 1 ... y2 i ... yL2 Q2 =∑i=1L i y2 1 i L i=1 i ... Class (2) ... y12 ... … ... yi2 ...… y...L2 Q2 =iL...=1yi2 Class(j)… yj … ... … yj… .….. …yj Q…j =∑L yj 1 i L i=1 i ... Class (j) ... y1j ... … ... yij ...… y...Lj Qj =iL...=1yij Class(m…) ym … ... … ym… .….. …ym Qm…=∑L ym 1 i L i=1 i Class (m) ym … ym … ym Qm =L ym 1 i L i=1 i Similarly,thefrequencydistributionofvarianceofSOG,meanofCOG,andvarianceofCOGfor thedatasamplescanalsobepresentedlikeTable1. Similarly, the frequency distribution of variance of SOG, mean of COG, and variance of COG for the data samples can also be presented like Table 1. 2.2. Step2: LikelihoodModelling 2.2. SAtefpte 2r: aLciqkeuliihrionogd Mthoedferleliqnuge ncydistributionsofshipmotionattributesdemonstratedpresentin Table1,thesecharacterizeddistributionsshouldthenbetransferredtosupportdegreesofthecategories. After acquiring the frequency distributions of ship motion attributes demonstrated present in Thatistosay,themappingfunctionsbetweensupportdegreesandstatisticaldistributionsshouldbe Table 1, these characterized distributions should then be transferred to support degrees of the foundout. categories. That is to say, the mapping functions between support degrees and statistical distributions should be found out. { } The concepts description of DS theory is introduced as follows. Suppose Θ= θ,θ,,θ is a 1 2 m set of mutually exclusive and collectively exhaustive propositions. Where θ,θ,,θ denote to 1 2 m Symmetry2017,9,186 5of15 TheconceptsdescriptionofDStheoryisintroducedasfollows. SupposeΘ = {θ ,θ ,··· ,θ }isa 1 2 m setofmutuallyexclusiveandcollectivelyexhaustivepropositions. Whereθ ,θ ,··· ,θ denotetothe 1 2 m categories,respectively. Letφrepresenttheemptyset. Then,Θisreferredtoasaframeofdiscernment. ThepowersetofΘconsistsof2m subsetsofΘ,isdenotedbyP(Θ). Differentfromtheconventional probabilisticinferencemethods,abeliefdegreeoraprobabilitymightbealsoassignedtothepower setP(Θ)intheERrulewhenthereisareliabilityprobleminevidence[22]. ABasicProbabilityAssignment(bpa)isafunctionp: 2Θ → [0,1] thatsatisfies, ∑ p(φ) =0, θ⊆Θp(θ) =1 (1) wherethebasicprobability p(θ)isassignedexactlytoapropositionφandnottoanysmallersubset of φ. p(θ) is generated from the frequency distributions. Referring to the research conducted by Yangetal.[23],thelikelihoodsbasedonthefrequencydistributionscanbepresentedasfollows. Thecoreoflikelihoodmodelingistofindtheprobabilisticrelationshipsbetweenthevaluesof afrequencydistributionandtheclassification(i.e.,Class1,Class2,... ,Classm)ofaAISdatacell. ThelikelihoodtransformationapproachestablishedbyYangetal.[23]underpinsanewlikelihood modelingmethodforclassifyingvessels,whichisdescribedasfollows. BasedonthefrequencydistributionsgiveninTable1,thelikelihoodthatanattributeisequalto valueiforaclassjiscalculatedinEquation(2). cj = yj/Qj i =1,2,··· ,L,j =0,1,··· ,m (2) i i j wherec denotesthelikelihoodtowhichattributedistributionisexpectedtobeequaltovalueiin i classj. j Letp denotetheprobabilitythatanattributewithvalueipointstoclassj,whichisindependentof i j thepriordistributionoftheclassification. p isthenacquiredasnormalizedlikelihoodasfollows[23]. i pj = cj/∑m ck i =1,2,··· ,L,j =1,2,··· ,m (3) i i k=1 i Beliefdistributions,givenbyEquation(3)representtheprobabilisticrelationshipsbetweenthe motion attributes and its classification. Note that a belief distribution reduces to a conventional probability distribution when p0 is equal to zero, or there is no ambiguity about the classification. i Followingtheaboveprocedure,anattributevaluecanbemappedtoabeliefdistribution,whichis regardedasapieceofevidence. 2.3. Step3: ConjunctiveInference Subsequently,theERruleisusedtoprocessthesepiecesofevidence,anditalsotakesthereliability andweightofevidenceintoconsiderations. Apieceofevidencee isrepresentedasarandomsetand i profiledbyaBeliefDistribution(BD)asfollows: ei = (cid:8)(θ,pθ,i),∀θ ⊆ Θ,∑θ⊆Θpθ,i =1(cid:9) (4) where (θ,p ) is an element of evidence e, representing that the evidence points to proposition θ, θ,i i whichcanbeanysubsetofΘoranyelementofP(Θ)exceptfortheemptyset,tothedegreeof p , θ,i referredtoasprobabilityordegreeofbeliefingeneral. (θ,p )isreferredtoasafocalelementofe,if θ,i i p > 0. Inthisoccasion, p isexactlycomingfromtheprobabilitiesobtainedfromthequantified θ,i θ,i characterizeddistributionsofdifferentclassification,givenbyEquations(2)and(3). In addition, the reliability is associated with evidence e, denoted by r, which represents the i i abilityoftheevidence,wheree isgenerated,toprovideacorrectassessmentorsolutionforagiven i problem[21]. Thereliabilityofapieceofevidenceistheinherentpropertyoftheevidence,andinthe ERframeworkitmeasuresthedegreeofsupportfor,oroppositionto,apropositiongiventhatthe Symmetry2017,9,186 6of15 evidencepointstotheproposition. Inotherwords,theunreliabilityofapieceofevidencesetsabound withinwhichanotherpieceofevidencecanplayaroleinsupportfor,andoppositionagainst,different propositions. On the other hand, evidence e can also be associated with a weight, denoted by w. i i Theweightofapieceofevidencesharesthesamedefinitionasthatofitsreliability[22].Whendifferent pieces of evidence are acquired from different sources, or measured in different ways, the weight of evidence can be used to reflect its relative importance in comparison with other evidence and determinedaccordingtowhousestheevidence. Tocombineapieceofevidencewithanotherpieceofevidence,itisnecessarytotakeintoaccount threeelementsoftheevidence;itsbeliefdistribution(probability),reliability,andweight. IntheER rule,thisisachievedbydefiningaso-calledweightbeliefdistributionwithreliabilityasfollows: (cid:110) (cid:16) (cid:17)(cid:111) mi = (θ,m(cid:101)θ,i),∀θ ⊆ Θ; P(Θ),m(cid:101)P(Θ),i (5) wherem measuresthedegreeofsupportforθfrome withboththeweightandreliabilityofe taken (cid:101)θ,i i i intoaccount,definedasfollows: 0 θ = φ m(cid:101)θ,i = crw,imθ,i θ ⊆ Θ,θ (cid:54)= φ (6) c (1−r ) θ = P(Θ) rw,i i c =1/(1+w −r ) (7) rw,i i i wherec denotesanormalizationfactor,w denotesforweight,andr denotesreliability. m isthe rw,i i i θ,i degreeofsupportforpropositionθfromevidencei,whichisgivenbym = w p ,with p beingthe θ,i i θ,i θ,i degreeofbeliefthatevidenceipointstoθ. Asdescribedpreviously, p canbeobtainedusingTable1, θ,i Equations(2)and(3)P(Θ)isthepowersetoftheframeofdiscernmentΘthatcontainsallmutually exclusive hypotheses in question. It is worth mentioning that P(Θ) is treated as an independent elementintheERrule[22]. Ifeverypieceofevidenceisfullyreliable,e.g.,r =1foranyi,theERrulereducestoDempster’s i rule. EvidencesgivenbyAISarenotfullyreliable,sor <1. Thecombinationoftwopiecesofevidence i e ande (definedinEquation(4))willbeconductedasfollows: 1 2 0 θ ⊆ φ pθ,e(2) = ∑D⊆mˆΘθ,meˆ(2D),e(2) θ ⊆ Θ (8) mˆθ,e(2) = [(1−r2)mθ,1+(1−r1)mθ,2]+∑B∩C=θmB,1mC,2∀θ ⊆ Θ (9) wherem ,m ,m andm aregivenbyEquations(5)–(7);B,CandDdenoteanyelementsinthe θ,1 θ,2 B,1 C,2 powersetP(Θ)exceptfortheemptyset;the p isthesyntheticbeliefdistributiontoproposition θ,e(2) θ whentakingthetwopiecesofevidencee ande intoconsideration. Yangetal.[23]provedthat 1 2 the belief distribution here is equivalent to the probability in Bayesian rule if belief is assigned to singletonstatesonlyand pθ iscalculatedbyEquation(4). Therefore,theERrulecanbeusedtoobtain i theprobabilityofanAISdatacellindicatestowhichclassification,byusingthemeanandvarianceof SOG,COGextractedfromtheAISdata,respectively. 2.4. Step4: NonlinearOptimization Weightcoefficientsweretrainedthroughanon-linearoptimization,withverifiedsamplestogeta higheraccuracyofclassification. Asdiscussed,inEquations(6)and(7),thereliabilityandweightof evidenceareparametersusedtomeasureevidencequality. InStep2,theevidencefromnormalized likelihoodswasgeneratedindependentlyofthepriordistributionofhypothesesbyassumingthat eachsetofsampledataisfullyreliable. Symmetry2017,9,186 7of15 In practice, the data provided by AIS is not completely reliable. Hence the evidences are not completely reliable either. As mentioned above, a weight of a piece of evidence is a relative importance which determined by whom is making the inference. In practice, such importance is exactlyrelatedtoaverifiedsampleandacertainoptimizationobjective[23]. Toagroupofverified samples,appropriateweightsofevidenceshouldmakethefinalresultmorelikelytoachieveasetting objective. Hence, appropriateweightcoefficientscanbeobtainedonlyifanobjectiveandverified samplesaredetermined. A typical objective is set to minimize a global deviation. All of the classification results were comparedwiththereality,theaimofoptimizingtheweightcoefficientisthat,foracertainnumberof validatedsamples,thenumberofincorrectclassificationdividethenumberofvalidatesamples,i.e.,the errorrate,isminimized. Infact,theglobaldeviationinquestiongenerallyincludestheclassification onobject. Thus,asignfunctionisoftenusedtodetermineadiscretestate(usuallytrueoffalse)of objectbasedontheirbeliefdistributions(probability)tocorrespondinghypotheses. However,thesign functionisverydifficulttobemodeledinrelatedalgorithms[24]. Hence,inthisresearch,theweight coefficientcanbesolvedinacompromisedwayasfollows. Let S betheverifiedsamplesindicatetovesselsbelongtoclassj, wherej=1,2, ... , m. And j N isthenumberofsamplesforclassj. LetSk denotesthesamplekinclassj,wherek=1,2,... ,N. j j j (cid:16) (cid:17) (cid:16) (cid:17) p Sk,wT denotestheprobabilitytopropositionθ ,whereθ indicatestotheclassj. p Sk,wT θj,e j j j θj,e j isobtainedbytheconjunctivereasoningprocessusingtheERrule. Thus,foreachjudgmentonthe (cid:16) (cid:17) (cid:16) (cid:17) sampleSk,thedeviationcanbepresentsas1− p Sk,wT . Alloftheprobability p Sk,wT share j θj,e j θj,e j thesameweightvectorwT={w ,w ,w ,w },whichdenotestheweightcoefficientsofmeanofSOG, 1 2 3 4 varianceofSOG,meanofCOG,varianceofCOG,respectively. Hence,theglobalaccuracyorsumof inferredprobabilitiesthathavebeenassignedtothecorrectpropositionsispresentedas, φ(cid:16)wT(cid:17) = ∑m ∑Nj (cid:104)1−p (cid:16)Sk,wT(cid:17)(cid:105) (10) j=1 k=1 θj,e j Tominimizethedeviation,wTshouldmakeEquation(10)minimum. Therefore,theoptimization formulationcanbepresentedas, (cid:16) (cid:17) wT =argmin φ wT (11) wT:feasible Asdiscussed,withoutasignfunction,thisoptimizationcanonlyprovideacompromisedsolution toweights. SinceEquation(10)iscontinuousandderivable, Equation(11)canbesolvedwiththe ‘fmincon’functionofMATLAB.Withthisprocedure,moreappropriateweightsofpiecesofevidence canbesolved. Actually,theweightsofpiecesofevidencecanalsobesetthroughoptimizingother objectives, depending on the requirements, for example, to achieve a higher accuracy of class 1. Otheroptimizationobjectiveswillbediscussedinthefollowingcasestudy. 3. ACaseStudy Tovalidatetheproposedmethodology,acasestudywasconductedatWuhanwaterway,located atmiddlestreamofYangtzeRiver,China. Forthepurposeoffollowingthenavigationalrulesand fuelprudent,captainstendtomaneuvertheirvesselswithafixedpatternwhichmaketheSOGand COGofvesselsindifferentroutescharacterized. ThecharacteristicSOGandCOGcouldbequantified bystatisticalanalysisofdynamicAISinformation. Takethemarinetrafficsceneattheintersection waterwayinWuhan,China,forexample. AsshowninFigure2,therearesixdirectionsofthetraffic flowattheintersectionwaterway,whicharelabeledbynumber1to6. Forvesselsindirections4and 6,thevesselswillkeepthespeedanddirectionsteadywhentheygothroughthechannelofYangtze River,hencethevarianceofSOGandCOGwouldbesmall,andthemeanoftheCOGwouldbesimilar tothedirectionofchannel,whatismore,fortherushofcurrent,themeanofSOGindirection4willbe Symmetry 2017, 9, 186 8 of 15 of the traffic flow at the intersection waterway, which are labeled by number 1 to 6. For vessels in Symmetry2017,9,186 8of15 directions 4 and 6, the vessels will keep the speed and direction steady when they go through the channel of Yangtze River, hence the variance of SOG and COG would be small, and the mean of the COG would be similar to the direction of channel, what is more, for the rush of current, the mean of lowerthanthatindirection6. Forvesselsindirection1,i.e.,voyagingdownstreamfromHanRiverto SOG in direction 4 will be lower than that in direction 6. For vessels in direction 1, i.e., voyaging YangtzeRiver,fortherushofcurrentandturningdirection,themeanofSOGshouldbehigherthan downstream from Han River to Yangtze River, for the rush of current and turning direction, the vesselsindirection3,andthevarianceofCOGshouldbelargerthanvesselsindirections4and6. mean of SOG should be higher than vessels in direction 3, and the variance of COG should be larger Forvesselsindirection5,i.e.,voyagingfromupstreamofYangtzeRivertotheHanRiver,duetothe than vessels in directions 4 and 6. For vessels in direction 5, i.e., voyaging from upstream of Yangtze impaRctivoefr ctou rthreen Ht,atnh Reisvpere,e dduwe toou thlde idmepcraecat soef csuorrthenatt, tthhee svpaereida nwcoeuoldf SdOecGreagseet ssol athrgaet rt,hae nvdartiahnecve aorfi ance of COSOGGa glseots blaercgoemr, eanldar tgheer v.aSriiamniclea rolfy C,OotGh earlsfoe baetucormese claarngebre. Sfiomuinladrlyfr, oomthetrh feeamtuereasn caann dbev faoruinadn ce of SOG,frComO Gth,er mesepaenc atinvde vlya.riIanncceo onfs SeOquGe, nCcOeG,b, ryessptaectitsivtieclya.l Iann caolnysseiqsuoefntche,e bfyo sutrataistttirciabl uatneaslyosfist ohfe thvee ssels indifffoeurre natttrriobuutteess ,otfh theeh videsdseelns imn odtiifofenrepnat trtoeurtnesw, tihlel bheidedxepno mseodti,onan pdatctearnn bweiltl rbaen esxfopromseedd, atnodp ciaenc esof be transformed to pieces of evidence to make the inference. evidencetomaketheinference. (a) (b) (c) (d) (e) (f) FigurFeig2u.rTe h2e. Tdhiere dcitrieocntisonosf ovfe vsseesslselisn int htheei nintteerrsseeccttiioonn wwaateterwrwaya,y W,Wuhuahna, nC,hCinhai.n Tah.eT bhlaecbk llainckesl winieths with arrows in (a–f) present the directions for vessels in different routes. (a) The direction 1 of vessels go arrowsin(a–f)presentthedirectionsforvesselsindifferentroutes. (a)Thedirection1ofvessels downstream from Han River to lower reaches of Yangtze River; (b) The direction 2 of vessels go go downstream from Han River to lower reaches of Yangtze River; (b) The direction 2 of vessels downstream from Han River to upper reaches of Yangtze River; (c) The direction 3 of vessels go godownstreamfromHanRivertoupperreachesofYangtzeRiver;(c)Thedirection3ofvesselsgo upstream from Yangtze River to Han River; (d) The direction 4 of vessels go upstream of Yangtze upstreamfromYangtzeRivertoHanRiver;(d)Thedirection4ofvesselsgoupstreamofYangtzeRiver, River, (e) The direction 5 of vessels go downstream from Yangtze River to Han River; (f) The (e)Thedirection5ofvesselsgodownstreamfromYangtzeRivertoHanRiver;(f)Thedirection6of direction 6 of vessels go downstream of Yangtze River. vesselsgodownstreamofYangtzeRiver. The verified samples were acquired from Wuhan Maritime Administrator from 1 January 2014 to 31 August 2016. The samples from 1 January 2014 to 31 May 2014 (including 5486 verified TheverifiedsampleswereacquiredfromWuhanMaritimeAdministratorfrom1January2014to samples) were applied to obtain the frequency distributions for the four attributes. The samples from 31August2016. Thesamplesfrom1January2014to31May2014(including5486verifiedsamples) 1 June 2014 to 31 October 2014 (including 3015 verified samples) were used to train the weight wereappliedtoobtainthefrequencydistributionsforthefourattributes. Thesamplesfrom1June coefficient of evidence. The samples from 1 November 2015 to 31 August 2016 (including 17,723 2014to31October2014(including3015verifiedsamples)wereusedtotraintheweightcoefficientof samples in total) were invited to validate the accuracy of methodology proposed in Section 2. evidence. Thesamplesfrom1November2015to31August2016(including17,723samplesintotal) were3i.n1v. Sitteepd 1t:o Svtaatlisidticaatle Athnaelyascisc uofr tahcey Aottfrmibuettehs ofrdooml oVgeryifipedro ApIoSs Dedatain Section2. In order to extract the frequency distributions from historical AIS data, 220 samples for 3.1. Step1: StatisticalAnalysisoftheAttributesfromVerifiedAISData direction 1, 117 samples for direction 2, 244 samples for direction 3, 2260 samples for direction 4, 108 Isnamorpdleesr ftoor edxirteractcitotnh 5e, farnedq u25e3n7c ysadmisptlreisb fuotri odnirsefcrtoiomn 6h, iwsteorrei csaelleActIeSdd aantda ,a2n2a0lyszaemd.p lesfordirection1, 117sa mplesfordirection2,244samplesfordirection3,2260samplesfordirection4,108samplesfor direction5,and2537samplesfordirection6,wereselectedandanalyzed. • Attribute1: MeanofSOG WhentheaccuracyofmeanofSOGwassetto0.1,thefrequencydistributionsofmeanofSOGfor sixdirectionsareshowninFigure3. Symmetry 2017, 9, 186 9 of 15 Symmetry 2017, 9, 186 9 of 15 •• AAttttrriibbuuttee 11:: MMeeaann ooff SSOOGG SymmeWWtryhh2ee0n1n7 t,th9h,ee1 8aa6ccccuurraaccyy ooff mmeeaann ooff SSOOGG wwaass sseett ttoo 00..11,, tthhee ffrreeqquueennccyy ddiissttrriibbuuttiioonnss ooff mmeeaann ooff 9 SSoOOf1GG5 for six directions are shown in Figure 3. for six directions are shown in Figure 3. ((aa)) ((bb)) ((cc)) ((dd)) ((ee)) ((ff)) FFFiiiggguuurrreee 33.3..F rFFerrqeeuqqeuuneecnnyccydy isddtirisisbttrruiibtbiuuotntiiosonnosfs moofef ammneoeaafnnS OooGff fSSoOOrGGve sffsooerrl svvienesssdseeifllsfse riiennn tdddiififfrfeeercreteinnottn sdd.iir(raee)cctTtiiohonenssf.r. e((qaau)) eTnThchyee dfrisetqruibeuntciyo ndisotfrivbeustsioenls oifn vedsisreelcst iionn di1r;ec(tbio)nT 1h; e(bf)r eTqhuee fnrecyqudenisctyri bduisttiroinbuotifonv eosfs velesssienls dinir edcitrieocntio2n; frequency distribution of vessels in direction 1; (b) The frequency distribution of vessels in direction (2c;) (Tc)h Tehfere fqreuqeunecnycdy idstirsitbriubtuiotinono fofv vesesseselslsi nin ddiirreeccttiioonn 33;; ((dd)) TThhee frfreeqquueennccyy ddisitsrtirbiubutitoino nofo vfevsesesslse lisn 2; (c) The frequency distribution of vessels in direction 3; (d) The frequency distribution of vessels in idnirdeicreticotnio n4; 4(;e()e T)hTeh efrferqeuqeunecnyc yddisitsrtirbibuutitoionn oof fvveesssseelsls inin ddiirreeccttiioonn 55;; ((ff)) TThhee ffrreeqquueennccyy ddiissttrriibbuuttiioonn ooff direction 4; (e) The frequency distribution of vessels in direction 5; (f) The frequency distribution of vveesssseellss iinn ddiirreeccttiioonn6 6.. vessels in direction 6. ••• AAAtttrttitrbriiubbtuuette2e : 22V:: VaVraairariniaacnnecceoe f ooSff O SSOGOGG The frequency distribution of variance of SOG is shown in Figure 4, where the accuracy of TThhee ffrreeqquueennccyyd idsitsrtirbiubtuiotinono fovfa rviaanricaenocfe SoOf GSOisGsh oisw snhionwFnig iunr eF4ig,wurhee r4e, twhehaecrceu trhaec yaocfcuvraarciayn coef variance of SOG was set to 0.2. ovfarSiOanGcew oafs SsOetGto w0a.2s .set to 0.2. ((aa)) ((bb)) ((cc)) Figure4.Cont. Symmetry2017,9,186 10of15 Symmetry 2017, 9, 186 10 of 15 Symmetry 2017, 9, 186 10 of 15 (d) (e) (f) (d) (e) (f) Figure 4. Frequency distribution of variance of SOG for vessels in different directions. (a) The FFiigguurree 44..F Frerqeuqeunecnycyd isdtirsitbruibtiuotnioonf voafr ivaanrcieanofceS OoGf fSoOrGve sfsoerl sviensdseiflfse rienn tddififreercetinotn sd.ir(ae)ctTiohnesf.r e(qau) eTnhcye frequency distribution of vessels in direction 1; (b) The frequency distribution of vessels in direction dfriesqtruibenuctiyo ndisotfribvuetsisoenls ofi nvedsisreelcst iino ndi1re;c(tibo)n T1h; e(bf) rTeqhue efnrecqyuednisctyr idbiusttiroibnutoifonv eosf sveelssseinls dinir decirteiocntio2n; (22c;;) ((ccT))h TTehhfeer effrrqeeuqqeuuneecnnyccyyd iddsiitssrttirrbiiubbutuittoiinoonno oofffv vveeesssssseeellslss iiinnn dddiiirrreeeccctttiiiooonnn 333;;; (((ddd))) TTThhheee ffrrfeerqqequuueennecncyyc yddiidssttisrriitbbriuubttuiiootinno nooff ovvfeessvsseeesllssse iilnns idnirdeicrteicotnio 4n; 4(e;)( eT)hTeh ferefrqeuqeunecnyc ydidsitsrtibriubtuiotino nofo fvevsesseslesl sinin ddirierecctitoionn 55; ;((ff)) TThhee ffrreeqquueennccyy ddiissttrriibbuuttiioonn ooff direction 4; (e) The frequency distribution of vessels in direction 5; (f) The frequency distribution of vveesssseellss iinn ddiirreeccttiioonn 66.. vessels in direction 6. • Attribute 3: Mean of COG •• AAtttrtirbiubtuete3 :3M: Meaenano foCf COOGG The frequency distribution of mean of COG is shown in Figure 5, where the accuracy of the TThhee ffrreeqquueennccyyd disitsrtirbibuutitoionno fomf meaenanof oCf OCGOiGs sihs oswhnowinnF iing uFrieg5u,rwe h5,e rwehtheerea cthcue raacccyuorfatchye omf etahne mean of COG was set to 6. omfeCaOn Gof wCaOsGse wttaos 6s.et to 6. (a) (b) (c) (a) (b) (c) (d) (e) (f) (d) (e) (f) Figure 5. Frequency distribution of mean of COG for vessels in different directions. (a) The frequency Figure 5. Frequency distribution of mean of COG for vessels in different directions. (a) The frequency Fdiigsturribeu5t.ioFnre oqfu venescsyedlsi sintr idbiurteicotnioonf 1m; e(ban) TohfeC fOreGqufoernvcyes dseislstriinbudtiifofenr eonf tvdeisrseecltsi oinn sd.i(rae)cTtihoen f2re; q(cu)e Tnhcye distribution of vessels in direction 1; (b) The frequency distribution of vessels in direction 2; (c) The dfriesqtruibenuctiyo dnisotfribvuetsisoenls ofi nvedsisreelcst iino ndi1re;c(tibo)n T3h; e(df) rTeqhue efnrecqyuednisctyr idbiusttiroibnutoifonv eosf sveelssseinls dinir decirteiocntio2n; frequency distribution of vessels in direction 3; (d) The frequency distribution of vessels in direction (c) The frequency distribution of vessels in direction 3; (d) The frequency distribution of vessels 4; (e) The frequency distribution of vessels in direction 5; (f) The frequency distribution of vessels in 4; (e) The frequency distribution of vessels in direction 5; (f) The frequency distribution of vessels in indirection4;(e)Thefrequencydistributionofvesselsindirection5;(f)Thefrequencydistributionof direction 6. direction 6. vesselsindirection6. • Attribute 4: Variance of COG • Attribute 4: Variance of COG • Attribute4: VarianceofCOG The frequency distribution of variance of COG is shown in Figure 6, where the accuracy was The frequency distribution of variance of COG is shown in Figure 6, where the accuracy was set toT 1h0e0f. requencydistributionofvarianceofCOGisshowninFigure6,wheretheaccuracywasset set to 100. to100.