Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image

Xiaomei Zhong(1), Jianping Li(2,3)*, Huacheng Dou(2,3), Shijun Deng(2,3), Guofei Wang(2,3), Yu Jiang(2,3), Yongjie Wang(2,3), Zebing Zhou(2,3), Li Wang(2,3), Fei Yan(4)

1 Tianjin Chengjian University, Tianjin, China; 2 Tianjin Institute of Geotechnical Investigation and Surveying, Tianjin, China; 3 Tianjin StarGIS Information Engineering Company Limited, Tianjin, China; 4 Beijing Forestry University, Beijing, China

Abstract

Remote sensing technologies are currently widely employed in the dynamic monitoring of land. This paper presents an algorithm named the fuzzy nonlinear proximal support vector machine (FNPSVM), based on ETM+ remote sensing imagery. The algorithm is applied to extract various types of land in the city of Da'an in northern China. Two multi-category strategies for this algorithm, namely "one-against-one" and "one-against-rest", are described in detail and then compared. A fuzzy membership function is presented to reduce the effects of noises or outliers in the data samples. The approaches to feature extraction, feature selection, and several key parameter settings are also given. Numerous experiments were carried out to evaluate the algorithm's performance, including various accuracies (overall accuracy and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to three other classifiers, the maximum likelihood classifier (MLC), the back propagation neural network (BPN), and the proximal support vector machine (PSVM), under different training conditions. The impacts of the selection of training samples, testing samples, and features on the four classifiers were also evaluated in these experiments.

Citation: Zhong X, Li J, Dou H, Deng S, Wang G, et al. (2013) Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image. PLoS ONE 8(7): e69434. doi:10.1371/journal.pone.0069434

Editor: Guy J.-P. Schumann, NASA Jet Propulsion Laboratory, United States of America

Received December 9, 2012; Accepted June 7, 2013; Published July 24, 2013

Copyright: © 2013 Zhong et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Financial support for this study was provided by Xiaomei Zhong, Jianping Li, and Huacheng Dou, who had a key role in the algorithm study, data collection and analysis, and accuracy assessment.

Competing Interests: The authors declare that they have no conflict of interest; they have no financial or personal relationships with other people or organizations that could inappropriately influence their work; and there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image". Of the authors, Li Jianping, Dou Huacheng, Deng Shijun, Wang Guofei, Jiang Yu, Wang Yongjie, Zhou Zebing, and Wang Li are currently employed by Tianjin StarGIS Information Engineering Co., Ltd. The authors hereby declare that this affiliation does not cause any competing interests, and that it does not alter their adherence to all the PLOS ONE policies on sharing data and materials.

* E-mail: [email protected]

Introduction

Remote sensing (RS) plays a key role in the dynamic monitoring of lands [1-3].
Approaches to land extraction based on remote sensing imagery fall into two categories: manual visual interpretation and computerized auto-classification. Because of the many drawbacks of manual visual interpretation, numerous classification algorithms for computerized auto-classification have been developed; among the most popular are the maximum likelihood classifier, neural network classifiers, and decision tree classifiers [4]. The maximum likelihood classifier is a popular classifier built on the assumption that the classes in the input data follow a Gaussian distribution. However, errors will appear in the results if the sample size is insufficient, if the input data do not follow the Gaussian distribution, and/or if the classes overlap heavily in their distributions, resulting in poor separability. The back propagation neural network model is widely applied because of its simplicity and its power to extract useful information from samples [5,6]. It is a hierarchical design consisting of fully interconnected layers or rows of processing units (each unit comprising several individual processing elements, as explained below). Back propagation belongs to the class of mapping neural network architectures, and the information processing function that it carries out is therefore the approximation of a bounded mapping [7]. Furthermore, the approach can effectively avoid some of the problems associated with the MLC by simulating the processing patterns of the human brain, although it also has disadvantages, including slow learning convergence and a tendency to converge to local minima [8]. Lastly, the basic idea of the decision tree classifier is to break a complex decision-making process down into a collection of simpler decisions, thus providing a solution that is often easier to interpret.

The support vector machine (SVM) is based on statistical learning theory, and aims to determine the location of the decision boundaries that produce the optimal separation of classes [9]. This approach, a comparatively new classification technique in the field of remote sensing next to the three methods above, has quickly gained ground in the past ten years. The SVM classifier can achieve higher accuracies than both the ML (maximum likelihood) and ANN (artificial neural network) classifiers [10], so it has recently been applied to classify remote sensing images [11]. Although good performance and high classification accuracy can be achieved with the SVM approach, there are still some shortcomings. One such shortcoming is that the SVM is mainly aimed at the classification of a small number of training samples, and its computational cost increases rapidly with larger data sizes, especially so for remote sensing data. To resolve this issue of high computational cost, Fung and Mangasarian [12] proposed the proximal support vector machine (PSVM), which can also be interpreted as regularized least squares and considered in the much more general context of regularized networks; it classifies points by assigning them to the closer of two parallel planes that are pushed apart as far as possible.
In addition, the method is much more efficient than the traditional SVM in terms of running speed, because it merely requires the solution of a single system of linear equations. Both the accuracy and the speed of classification are deemed significant in classification based on remote sensing images, and a variety of factors affect them: training data size, feature selection, and algorithm parameter settings, to name a few. Often, real data sets contain noises, and the noisy samples may not be representative of a class, as if there were an uncertainty about the class to which they belong. The noises tend to corrupt the data samples, and the optimal hyperplane obtained by the PSVM may be sensitive to noises or outliers in the training sets. As a result, a classifier may not be able to correctly classify some of the data samples in noisy data, so the fuzzy support vector machines [13,14] and fuzzy linear proximal support vector machines [15,16] were proposed to address this problem.

Normally, however, a real data set is not linearly separable. In this paper, we propose the fuzzy nonlinear proximal support vector machine (FNPSVM) to extract different types of land; this technique is a fuzzy nonlinear extension of the existing PSVM methods. In addition, we define a fuzzy membership function that assigns a fuzzy membership to each data point, so that different data points can have different effects on the learning of the separating hyperplane. Additionally, to improve algorithm performance, we present approaches to setting some key parameters of this algorithm, as well as approaches to feature extraction and feature selection. Lastly, we compare our algorithm with three other classifiers (MLC, BPN, and PSVM).

The paper is organized as follows. Section 2 discusses in detail the architectures of the PSVM and FNPSVM. The training algorithm of the FNPSVM is given in Section 3. Experimental results of the algorithm and discussion are presented in Section 4. Section 5 contains the concluding remarks.

Architectures of PSVM and FNPSVM

Architecture of PSVM

To deduce our FNPSVM algorithm, we first briefly introduce the binary category proximal support vector machine. Let the data set, consisting of m points in the n-dimensional real space R^n, be represented by the m x n matrix A, and let each point be represented by an n-dimensional row eigenvector A_i (i = 1, 2, ..., m). In the case of binary classification, each data point A_i in the class A+ or A- is specified by a given m x m diagonal matrix D with +1 or -1 elements along its diagonal. The target is to separate the m data points into A+ and A-, as depicted in Figure 1.

[Figure 1. The proximal support vector machine classifier: the planes x'v - c = +/-1, around which the points of the sets A+ and A- cluster, and which are pushed apart by the optimization problem (1). doi:10.1371/journal.pone.0069434.g001]

For this problem, the proximal support vector machine with a linear kernel [12] is given by the following quadratic program with parameter c > 0 (which controls the tradeoff between the margin and the error; note that this penalty constant c is distinct from the plane offset c below) and a linear equality constraint:

    min_{(v,c,y) in R^{n+1+m}}  (c/2)||y||^2 + (1/2)(v'v + c^2)
    s.t.  D(Av - ec) + y = e,                                        (1)

where e is an m-dimensional vector of ones and y is an error vector. When the two classes are strictly linearly separable, y = 0 in (1) (which is not the case shown in Figure 1). As depicted in Figure 1, the variables (v,c) determine the orientation and location of the proximal planes:

    x'v - c = +1
    x'v - c = -1,                                                    (2)

around which the points of each class cluster and which are pushed apart as far as possible by the term (v'v + c^2) in the objective function. Consequently, the plane

    x'v - c = 0,                                                     (3)

midway between and parallel to the proximal planes (2), is a separating plane that approximately separates A+ from A-, as depicted in Figure 1. The distance 2 / ||(v; c)|| is called the "margin" (see Figure 1), and maximizing the margin enhances the generalization capability of a support vector machine [9,17].
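To make problem (1) concrete, the following minimal sketch (our own illustration in Python with NumPy, not the authors' code) trains a linear PSVM. Substituting y = e - D(Av - ec) into the objective turns (1) into an unconstrained least-squares problem whose normal equations form a single (n+1) x (n+1) linear system, which is exactly why the PSVM is fast. All function and variable names here are ours.

    import numpy as np

    def train_linear_psvm(A, d, c_param):
        # A: (m, n) data matrix; d: (m,) labels in {+1, -1} (the diagonal of D);
        # c_param: the tradeoff parameter c > 0 from problem (1).
        m, n = A.shape
        E = np.hstack([A, -np.ones((m, 1))])        # E = [A, -e]
        # Setting the gradient of (c/2)||Ez - De||^2 + (1/2)||z||^2 to zero
        # gives (I/c_param + E'E) z = E'De for z = [v; c].
        lhs = np.eye(n + 1) / c_param + E.T @ E
        rhs = E.T @ d                               # E'De, since De = d
        z = np.linalg.solve(lhs, rhs)
        return z[:n], z[n]                          # v and the offset c

    def psvm_predict(X, v, offset):
        # Assign each row x of X by the sign of x'v - c, as in (4) below.
        return np.sign(X @ v - offset)

Because the only expensive step is one solve of an (n+1)-dimensional system, training cost depends on the number of features rather than the number of points, matching the speed claims made above.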
The approximate separating plane (3) shown in Figure 1 acts as a decision function, as follows:

    x'v - c  > 0, then x in A+
             < 0, then x in A-
             = 0, then x in A+ or x in A-                            (4)

Architecture of FNPSVM

In this paper, we will employ the following norms of a vector x in R^n [17]:

    L1 norm of x:        ||x||_1 = sum_{i=1}^{n} |x_i|               (5)

    L2 norm of x:        ||x||_2 = ( sum_{i=1}^{n} (x_i)^2 )^(1/2)   (6)

    L-infinity norm of x:  ||x||_inf = max_{1<=i<=n} |x_i|           (7)

The fuzzy nonlinear binary category proximal support vector machine. Generally, real data sets are corrupted with noises. As a result, a classifier obtained by training with noisy data will not always classify all of the data samples correctly. Since the optimal hyperplane depends on only a small part of the data points, it may become sensitive to noises or outliers in the training set [18,19]. We can associate each data point with a fuzzy membership that reflects its relative degree of meaningfulness as data and accounts for the uncertainty in the class to which it belongs. The noises or outliers are treated as less important and receive lower fuzzy memberships. This equips the classifier with the ability to train on data containing noises or outliers: lower fuzzy memberships are set for the data points that are considered, with higher probability, to be noises or outliers. A classifier able to use this fuzzy degree information can improve its performance and reduce the effects of noises or outliers. Thus we propose the following optimization problem for determining the classifier:

    min_{(v,c,y) in R^{n+1+m}}  (c/2)||Sy||^2 + (1/2)(v'v + c^2)
    s.t.  D(Av - ec) + y = e,                                        (8)

where S denotes a diagonal matrix, i.e. S = diag(s_1, s_2, ..., s_m), whose diagonal elements correspond to the membership values of the data samples belonging to A+ or A-, with 0 < s_i <= 1 (i = 1, 2, ..., m); and e is the vector of plus ones.
According to the objective function of (8), y can be replaced by an expression in v and c, so we arrive at the following unconstrained minimization problem:

    min_{(v,c) in R^{n+1}}  (c/2)||S(D(Av - ec) - e)||^2 + (1/2)(v'v + c^2)    (9)

To obtain the fuzzy nonlinear proximal classifier, we modify formula (9) as in [12,20], first by substituting the variable v with its dual equivalent v = A'Du, and then by modifying the last term of the objective function to be the norm of the new dual variable u and of c. We now obtain the following problem:

    min_{(u,c) in R^{m+1}}  (c/2)||S(D(AA'Du - ec) - e)||^2 + (1/2)||(u; c)||^2    (10)

If we now replace the linear kernel AA' by a nonlinear kernel K(A,A'), we obtain:

    min_{(u,c) in R^{m+1}}  (c/2)||S(D(K(A,A')Du - ec) - e)||^2 + (1/2)||(u; c)||^2    (11)

Let F(u,c) = (c/2)||S(D(K(A,A')Du - ec) - e)||^2 + (1/2)||(u; c)||^2. Setting the first-order derivatives of F(u,c) with respect to u and c to zero, i.e. dF/du = 0 and dF/dc = 0, we arrive at the following formulas:

    c(SDK(A,A')D)'S(D(K(A,A')Du - ec) - e) + u = 0
    c(SDe)'S(D(-K(A,A')Du + ec) + e) + c = 0                         (12)

where both D and S are diagonal matrices, so that D = D', S = S', and D^2 = I. Rearranging formula (12) further, we obtain the equations with respect to u and c:

    (c(SDK(A,A')D)'SDK(A,A')D + I)u - c(SDK(A,A')D)'SDec - c(SDK(A,A')D)'Se = 0
    -c(SDe)'SDK(A,A')Du + (c(SDe)'SDe + 1)c + c(SDe)'Se = 0          (13)

Now let

    M_1 = c(SDK(A,A')D)'SDK(A,A')D + I
    L_1 = -c(SDK(A,A')D)'SDe
    C_1 = -c(SDK(A,A')D)'Se
    M_2 = -c(SDe)'SDK(A,A')D
    L_2 = c(SDe)'SDe + 1
    C_2 = c(SDe)'Se

Formula (13) can thus be expressed by the following formula:

    [ M_1  L_1 ] [ u ]   [ -C_1 ]
    [ M_2  L_2 ] [ c ] = [ -C_2 ]                                    (14)

We can work out u and c by solving formula (14), and hence the binary category nonlinear classifier can be written as follows:

    K(x',A')Du - c  > 0, then x in A+
                    < 0, then x in A-
                    = 0, then x in A+ or x in A-                     (15)

The fuzzy nonlinear proximal support vector machine. There are roughly four types of support vector machines that handle multi-class problems [21]. Two strategies have been proposed to adapt the SVM to N-class problems [22], namely the "one-against-one" strategy and the "one-against-rest" strategy. The "one-against-one" strategy constructs a machine for each pair of classes, resulting in N(N-1)/2 machines; when applied to a test pixel, each machine gives one vote to the winning class, and the pixel is labeled with the class having the most votes. The "one-against-rest" strategy breaks the N-class case into N two-class cases, in each of which a machine is trained to classify one class against all others [4]. In this paper, we employed both of the above strategies.

"One-against-one" strategy:

    A = [A^1 ... A^k],  A+ = A^r,  A- = A^j,
    r in {1, ..., k-1},  j in {2, ..., k},  r < j.

Here, k is the class number, while A^r in R^{m_r x n} and A^j in R^{m_j x n} represent the m_r and m_j points in class r and class j, respectively. Let m = m_r + m_j; D is then an m x m diagonal matrix as follows:

    D_ii = 1 for A_i in A^r,  D_ii = -1 for A_i in A^j.

In this method, classifiers for all possible pairs of classes are created, so for k classes there are k(k-1)/2 binary classifiers. From formula (14), the k(k-1)/2 unique pairs (u,c) can be obtained, and thus k(k-1)/2 proximal surfaces are generated:

    K(x',A')Du^s - c^s = 0,  s = 1, ..., k(k-1)/2.

For a new given point x in R^n, the output of each classifier is obtained in the form of a class label: each of the k(k-1)/2 surfaces casts one vote, the vote totals T_i (i = 1, ..., k) are accumulated, and x is assigned to the class label that occurs most, i.e. the class t with T_t = max T_i, i = 1, ..., k. In case of a tie, a tie-breaking strategy may be adopted; a common tie-breaking strategy is to randomly select one of the class labels that are tied [23].

"One-against-rest" strategy:

    A = [A^1 ... A^k],  A+ = A^r,  A- = [A^1 ... A^{r-1} A^{r+1} ... A^k],
    r in {1, ..., k},

where k is the class number and A^r in R^{m_r x n} represents the m_r points in class r. Letting m = m_1 + m_2 + ... + m_k, D is an m x m diagonal matrix as follows:

    D_ii = 1 for A_i in A^r,  D_ii = -1 for A_i not in A^r,  r in {1, ..., k}.

Supposing the data set is to be classified into M classes, M binary classifiers are created, where each classifier is trained to distinguish one class from the remaining M-1 classes. For example, the class-one binary classifier is designed to discriminate between class-one data vectors and the data vectors of the remaining classes; the other classifiers are constructed in the same manner. From formula (14), the k unique pairs (u,c) can be obtained, and thus k proximal surfaces are generated:

    K(x',A')Du^r - c^r = 0,  r = 1, ..., k.

A new given point x in R^n is assigned class t depending on which of the k nonlinear halfspaces generated by the k surfaces it lies deepest in, namely:

    K(x',A')Du^t - c^t = max K(x',A')Du^r - c^r,  r = 1, ..., k.

During the testing or application phase, data vectors are thus classified by their margins from the separating surfaces, and the final output is the class corresponding to the classifier with the largest margin [23].
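The following sketch (Python with NumPy; our own reading of formulas (11)-(15), not the authors' code) assembles M_1, L_1, C_1, M_2, L_2, and C_2, solves the linear system (14), and evaluates the decision function (15). It uses the Gaussian kernel adopted later in the paper; the names (train_fnpsvm, fuzz, and so on) are ours, and fuzz holds the fuzzy memberships s_i forming the diagonal of S.

    import numpy as np

    def rbf_kernel(X, Y, s):
        # K_ij = exp(-s * ||X_i - Y_j||^2), the paper's Gaussian kernel.
        sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-s * sq)

    def train_fnpsvm(A, d, fuzz, c_param, s_param):
        # A: (m, n) training points; d: (m,) labels in {+1, -1};
        # fuzz: (m,) fuzzy memberships 0 < s_i <= 1 (diagonal of S).
        m = A.shape[0]
        K = rbf_kernel(A, A, s_param)                  # K(A, A')
        G = (fuzz * d)[:, None] * K * d[None, :]       # S D K(A,A') D
        h = fuzz * d                                   # S D e
        M1 = c_param * G.T @ G + np.eye(m)
        L1 = -c_param * G.T @ h
        C1 = -c_param * G.T @ fuzz                     # -c (SDK(A,A')D)' S e
        M2 = -c_param * h @ G
        L2 = c_param * h @ h + 1.0
        C2 = c_param * h @ fuzz
        lhs = np.block([[M1, L1[:, None]],
                        [M2[None, :], np.array([[L2]])]])
        rhs = np.concatenate([-C1, [-C2]])
        sol = np.linalg.solve(lhs, rhs)                # system (14)
        return sol[:m], sol[m]                         # u and the offset c

    def fnpsvm_decision(x, A, d, u, offset, s_param):
        # Formula (15): K(x',A')Du - c; positive -> A+, negative -> A-.
        return rbf_kernel(np.atleast_2d(x), A, s_param)[0] @ (d * u) - offset

For "one-against-one", this binary trainer is run once per pair of classes and the predictions are combined by majority vote; for "one-against-rest", it is run once per class and the point is assigned by the argmax rule above.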
Training Algorithm of FNPSVM

Fuzzy membership model

In order to improve classification performance and to reduce the corruption of data samples by noises, we define a fuzzy membership function for a given class, whereby a membership is assigned to each data point. It is written as:

    f(x) = 1,                            0 <= x <= t_1
           e^{-(x - t_1)/(t_2 - t_1)},   t_1 <= x <= t_2
           0,                            t_2 <= x <= 1

where x denotes the distance between the data sample and the center of the class that it belongs to. In addition, t_1 and t_2, which tune the fuzzy membership of each data point in the training, are two user-defined constants; they determine the range in which the data sample absolutely does or does not belong to a given class. They also control the shape of the curve (see Figure 2).

[Figure 2. The fuzzy membership function: t_1 and t_2, which tune the fuzzy membership of each data point in the training, are two user-defined constants; they determine the range in which a data sample absolutely does or does not belong to a given class. doi:10.1371/journal.pone.0069434.g002]

A decreasing value of x indicates that the distance between the data sample point and the center of the given class is smaller, and that the probability of this sample belonging to this class is higher. When x is between 0 and t_1, the data sample point belongs to the given class with absolute certainty; when x is between t_2 and 1, the data sample point does not belong to the given class. For a given value of x, the values of t_1 and t_2 influence the fuzzy memberships, and thus also influence the ultimate classification result.

The distance x is the key to each training sample's fuzzy membership, and it can be obtained as follows:

    M_t = (1/n) sum_{i=1}^{n} VF_ti,                  t = 1, ..., p

    DM_t = max_i |VF_ti - M_t|,                       i = 1, ..., n,  t = 1, ..., p

    x_i = (1/p) sum_{t=1}^{p} |VF_ti - M_t| / DM_t,   i = 1, ..., n

where n is the number of training samples of a given class and p is the number of features selected, with VF_ti representing the t-th feature value of the i-th sample. M_t is the mean value of the t-th feature over the n samples of a given class; DM_t is the maximum of the distances between all sample points and the center (M_t) of the t-th feature for the given class; and x_i denotes the average distance between the i-th sample and the centers of all features.
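A compact sketch of this model follows (Python with NumPy; our own illustration). The boundary cases (f = 1 below t_1 and f = 0 above t_2) follow the text directly, while the exponential middle branch reflects our reading of the typeset formula above; the layout VF_ti (features in rows, samples in columns) matches the paper's indexing.

    import numpy as np

    def fuzzy_memberships(VF, t1=0.1, t2=0.8):
        # VF: (p, n) feature values of one class: p features, n samples (VF_ti).
        M = VF.mean(axis=1, keepdims=True)               # M_t: per-feature center
        DM = np.abs(VF - M).max(axis=1, keepdims=True)   # DM_t: max distance to center
        x = (np.abs(VF - M) / DM).mean(axis=0)           # x_i: average normalized distance
        return np.where(x <= t1, 1.0,
               np.where(x >= t2, 0.0, np.exp(-(x - t1) / (t2 - t1))))

The returned vector supplies the diagonal of S (the fuzz argument of the training sketch shown earlier), so samples far from their class center contribute less to the solution of (14).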
Sample Selection

The choices of sample size and sampling design affect the performance and reliability of a classifier. Sufficient samples are necessary: a previous study indicated that this factor alone can be more important than the selection of the classification algorithm in obtaining accurate classifications [24].

Sample selection includes two parts, namely the sample data size and the selection method. Increases in sample data size generally lead to improved performance, though at the cost of a higher computational burden. The sample size must be sufficient to provide a representative and meaningful basis for training a classifier and for accuracy assessment. Basic sampling designs, such as simple random sampling, can be appropriate if the sample size is large enough [25]. The adoption of a simple sampling design is also valuable in helping to meet the requirements of a broad range of users [26]. In this paper, we apply a simple random sampling design to collect training samples and testing samples.

Kernel Function Strategy

The concept of the kernel is introduced to extend the SVM's ability to deal with nonlinear classification. It can transform nonlinear boundaries in a low-dimensional space into linear ones in a high-dimensional space by mapping the feature vector into the high-dimensional space, so that the training data can be classified in the high-dimensional space without knowing the specific form of the mapping function. A kernel function is a generalization of the distance metric that measures the distance between two data points as they are mapped into a high-dimensional space in which the data are more clearly separable [27,28].

Three kernel functions for nonlinear SVMs are widely used: the radial basis function (RBF), the polynomial, and the sigmoid. In this paper, we adopted the Gaussian RBF kernel as the default kernel function model for the following reasons: (1) the RBF kernel can handle the case where the relation between class labels and attributes is nonlinear [29]; (2) the polynomial kernel spends a longer time in the training stage of the SVM, and some previous studies [30-32] have reported that the RBF kernel provides better performance than the polynomial kernel; in addition, the polynomial kernel has more hyperparameters than the RBF kernel, and may approach infinity or zero when the degree is large [29]; (3) the sigmoid kernel behaves like the RBF kernel under certain parameters, but is not valid under others [9]; (4) when the sample size is quite large, the convergent ability of the RBF kernel is stronger than those of the other kernels above.

The Gaussian kernel function is expressed as:

    K(A,B)_ij = e^{-s ||A_i' - B_.j||^2},  i = 1, ..., m,  j = 1, ..., k

Here, the matrix A is in R^{m x n} and B is in R^{n x k}; A_i is the i-th row of A, which is a row vector in R^n, while B_.j is the j-th column of B; the kernel K(A,B) maps R^{m x n} x R^{n x k} into R^{m x k}. In particular, if x and y are column vectors in R^n, then K(x',y) is a real number, K(x',A') is a row vector in R^m, and K(A,A') is an m x m matrix. The parameter s of the RBF kernel is a user-defined positive constant regulating the width of the Gaussian kernel, and it has an important impact on kernel performance. There is, however, little guidance in the literature on criteria for selecting kernel-specific parameters [33], hence we carried out many trials to acquire the optimal parameter s.
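The rectangular form of the kernel above can be written directly from the definition; the sketch below (ours, Python with NumPy) follows the paper's A in R^{m x n}, B in R^{n x k} convention, generalizing the square K(A,A') used in the earlier training sketch.

    import numpy as np

    def gaussian_kernel(A, B, s):
        # A: (m, n); B: (n, k). Returns the (m, k) matrix with
        # K_ij = exp(-s * ||A_i' - B_.j||^2), comparing rows of A
        # against columns of B as in the paper's definition.
        diff = A[:, :, None] - B[None, :, :]      # (m, n, k) pairwise differences
        return np.exp(-s * (diff ** 2).sum(axis=1))

Note that larger s makes the kernel narrower and the decision boundary more flexible, which is why s is tuned jointly with the penalty constant c in the next section.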
Parameter Selection Method

Regardless of whether a simple or a more complex classifier is used, the learning parameters have to be chosen carefully in order to yield good classification performance. The FNPSVM algorithm proposed in this paper requires four given parameters, specifically c, s, t_1, and t_2. Vapnik [9] discovered that varying the kernel function only slightly affects the classification results of an SVM, while the parameters of the kernel function and the penalty constant c have a strong effect on its performance.

The parameter c > 0 is an important quantity in determining the trade-off between the empirical error (the number of wrongly classified inputs) and the complexity of the found solution. Normally, large values of c lead to fewer training errors (and a narrower margin), at the cost of more training time, whereas small values generate a larger margin, with more errors and more training points situated inside the margin. Since the number of training errors cannot be interpreted as an estimate of the true risk, this knowledge does not really help in choosing a suitable value for the parameter. The parameter s of the Gaussian kernel affects the complexity of the decision boundary. Improper selection of these two parameters can cause over-fitting or under-fitting problems [29,34]. Nevertheless, there is little explicit guidance on the problem of choosing parameters for the SVM. Recently, Hsu [35] suggested a method for determining parameters, namely grid search and cross validation. For the multi-category case, however, the cross validation method is not feasible. In this paper, we advanced his method and propose an approach named multi-layer grid search with random-validation.

The basic idea of random-validation is that we randomly divide the sample set into a training set and a test set of different sizes for each category. The test set is then tested using the classifier trained on the training set, and the classification accuracy is derived. This procedure is executed iteratively n times during each cycle, yielding n accuracies; the random-validation accuracy is the mean of the n accuracies.

We recommend the "multi-layer grid search" method on c and s using n-time random-validation, in order to accurately find the optimal parameters while lowering the computational cost. We first choose bounds for the parameters c and s, and a coarse two-dimensional grid of pairs (c_i, s_j) is constructed. Here, i = 1, 2, ..., m and j = 1, 2, ..., n, so an m x n grid-plane and m x n pairs (c_i, s_j) are obtained. The FNPSVM algorithm learns with each pair (c_i, s_j) using n-time random-validation and obtains the classification accuracy. The pair (c_i, s_j)_high corresponding to the best accuracy is the optimal pair. If the best accuracy does not satisfy the requirement of the classification, a new two-dimensional grid-plane centered on the pair (c_i, s_j)_high is constructed, and the learning with the new pairs (c, s) in the new grid-plane is executed to acquire higher accuracy. This procedure is performed iteratively to find the optimal parameters c and s.

Although the multi-layer grid search with random-validation seems simple, it is actually practical, because: (1) for each parameter, a finite number of possible values is prescribed, and then all possible combinations of (c, s) are considered to find the one that yields the best result; (2) the computational time for finding good parameters through this approach is not much greater than that of more advanced methods, since there are only two parameters (generally, the complexity of grid search grows exponentially with the number of parameters); (3) the grid search can be easily parallelized because each (c, s) pair is independent, unlike some other advanced methods that require iterative processes.
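A skeleton of one layer of this search is sketched below (Python with NumPy; our own simplified illustration, not the authors' Matlab code). For brevity it draws a single random split per round rather than per-category splits of differing sizes, and train_fn/score_fn are placeholder callables; the first-layer grid values come from the experiments reported later.

    import numpy as np
    from itertools import product

    def random_validation(train_fn, score_fn, X, y, n_rounds=5, test_frac=0.3, seed=None):
        # Randomly split the sample set n_rounds times; return the mean accuracy.
        rng = np.random.default_rng(seed)
        accs = []
        for _ in range(n_rounds):
            idx = rng.permutation(len(y))
            cut = int(len(y) * (1 - test_frac))
            tr, te = idx[:cut], idx[cut:]
            model = train_fn(X[tr], y[tr])
            accs.append(score_fn(model, X[te], y[te]))
        return float(np.mean(accs))

    def grid_search_layer(make_train_fn, score_fn, X, y, c_grid, s_grid):
        # One layer of the multi-layer grid search: evaluate every (c, s) pair.
        best_pair, best_acc = None, -np.inf
        for c, s in product(c_grid, s_grid):
            acc = random_validation(make_train_fn(c, s), score_fn, X, y)
            if acc > best_acc:
                best_pair, best_acc = (c, s), acc
        return best_pair, best_acc

    # First-layer grid used in the experiments: 2^-14 to 2^14 in steps of x2^4.
    c_grid = [2.0 ** e for e in range(-14, 15, 4)]
    s_grid = [2.0 ** e for e in range(-14, 15, 4)]

A second layer would call grid_search_layer again with a finer grid centered on the returned best pair, mirroring the iterative refinement described above.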
Experiments and Discussion

All experiments were run on an 1800 MHz AMD Sempron(tm) 3000+ processor under Windows XP, using the Matlab 7.0 compiler. We adopted the classification criterion of Chen [36]: saline-alkalized lands are classified into heavy saline-alkalized land, moderate saline-alkalized land, and light saline-alkalized land.

Classification Experiments Using ETM+ Image

Experiment summary. We selected Da'an, a city in northern China with a total area of 4,879 km^2, as our test area. Multi-spectral (Landsat-7 ETM+) remote sensing data (30 m spatial resolution, UTM projection) acquired on August 30th, 2000 were used to classify the image data into nine land cover types (heavy saline-alkalized land, moderate saline-alkalized land, light saline-alkalized land, water area, cropland, grassland, rural residential area, urban residential area, and sand land).

To demonstrate the effectiveness of the proposed method, both the "one-against-one" and "one-against-rest" strategies based on the Gaussian RBF kernel were used to deal with the n-class case, and the results (various accuracies, training speed, and classification speed) obtained using the FNPSVM algorithm were compared with those derived from four conventional classification methods, namely the maximum likelihood classifier (MLC), back propagation neural network (BPN), support vector machine (SVM), and proximal support vector machine (PSVM), under different training conditions (shown in Table 1).

Table 1. Training data conditions under which the classification algorithms were tested.

    Training sample number   Testing sample number   Number of features   Training case no.
    60                       210                     4 / 7 / 10 / 14      A / B / C / D
    90                       180                     4 / 7 / 10 / 14      E / F / G / H
    120                      150                     4 / 7 / 10 / 14      I / J / K / L

doi:10.1371/journal.pone.0069434.t001
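For convenience, Table 1 can be restated as a small lookup table; the case letters and sample sizes come from the table above, while the data structure itself is only our rendering.

    # Training case -> (training samples per class, testing samples per class,
    #                   number of features), per Table 1.
    TRAINING_CASES = {
        "A": (60, 210, 4),   "B": (60, 210, 7),   "C": (60, 210, 10),  "D": (60, 210, 14),
        "E": (90, 180, 4),   "F": (90, 180, 7),   "G": (90, 180, 10),  "H": (90, 180, 14),
        "I": (120, 150, 4),  "J": (120, 150, 7),  "K": (120, 150, 10), "L": (120, 150, 14),
    }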
According to the topographic maps of Da'an city (1:100,000 scale), we implemented precise geometric correction and resampling of the image. Geometric correction of the image was accomplished through a second-order polynomial, while resampling was achieved through cubic convolution, with a matching error of less than one pixel. We selected 270 samples (90 for training and 180 for testing) for each class using a random sampling procedure from the image, for a total of 810 training samples and 1,620 test samples over the nine classes. For each sample set, the test set was independent of the training set.

Feature extraction and feature selection. (1) Feature extraction. Feature extraction has a strong impact on classification accuracy. In this paper, we extracted 14 features, including the six bands of the ETM+ image, the first principal components of the K-L transform and the K-T transform, the soil index, NDVI (normalized difference vegetation index), the composition index, and the H (hue), S (saturation), and I (intensity) color components of the HSI color space. Some of the features can be obtained as follows:

    Soil index:         SI = (B_5 - (255 - B_4)) / (B_5 + (255 - B_4))   [37]

    NDVI:               VI = (B_4 - B_3) / (B_4 + B_3)

    Composition index:  CI = (B_5 - B_1) / (B_5 + B_1)                   [37]

Here B_1, B_3, B_4, and B_5 represent the first, third, fourth, and fifth bands of the ETM+ image, respectively.

In the field of digital image processing, a number of color models have been proposed, such as RGB, HSI, and CIE, but selecting the most suitable color space is still a problem in color image segmentation [20]. The RGB color model is suitable for color display, but less so for color analysis, because of the high correlation among the R, G, and B color components [38]. In color image processing and analysis, we know that: (1) the H and S components are closely correlated to the color sense of the eye; (2) hue information and intensity information are distinctly differentiated in the HSI model; and (3) with the HSI model, a computer program can easily process color information after the color sense of the eye has been transformed into specific values. We therefore extracted the H, S, and I color components of the HSI color space as three classification features. A false color composite of bands 5, 4, and 2 was performed, after which the image was exported as an RGB image. Finally, the RGB model was transformed into the HSI model according to the following formulas [39]:

    H = arccos{ ([(R - G) + (R - B)] / 2) / [(R - G)^2 + (R - B)(G - B)]^(1/2) }

    S = 1 - 3 min(R, G, B) / (R + G + B),    I = (R + G + B) / 3
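The band indices and the RGB-to-HSI conversion above translate directly into code; the sketch below (ours, Python with NumPy) assumes the bands are float arrays scaled 0-255, and the small epsilon guards and the clip before arccos are our additions to avoid division-by-zero and rounding artifacts, not part of the paper's formulas.

    import numpy as np

    def spectral_features(B1, B3, B4, B5):
        SI = (B5 - (255 - B4)) / (B5 + (255 - B4))   # soil index [37]
        VI = (B4 - B3) / (B4 + B3)                   # NDVI
        CI = (B5 - B1) / (B5 + B1)                   # composition index [37]
        return SI, VI, CI

    def rgb_to_hsi(R, G, B):
        # RGB -> HSI conversion following the formulas above.
        num = ((R - G) + (R - B)) / 2.0
        den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + 1e-12
        H = np.arccos(np.clip(num / den, -1.0, 1.0))
        S = 1.0 - 3.0 * np.minimum(np.minimum(R, G), B) / (R + G + B + 1e-12)
        I = (R + G + B) / 3.0
        return H, S, I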
(2) Feature selection. Normally, the size of a real data set is so large that, before unwanted features are removed, learning might not work and the running time of a learning algorithm might be drastically increased. We must therefore select features that are neither irrelevant nor redundant to the target concept. Feature selection for classification is a well-researched problem, striving to improve the classifier's generalization ability and to reduce the dimensionality and the computational complexity. It directly reduces the number of original features by selecting a subset of them that still retains sufficient information for classification [40]. Feature selection attempts to select a minimally sized subset of features according to the following criteria [41]:

1) The classification accuracy does not significantly decrease; and

2) The resulting class distribution, given only the values of the selected features, is as close as possible to the original class distribution given all features.

For this paper, in light of the above criteria, the data types, and the characteristics of remote sensing images, we adopted the traditional DB index rules, which use between-class scatter and within-class scatter to select classification features. The DB index rules are as follows [42]:

1)
    S_i = (1/N_i) sum_{x in class i} ||x - X_i||,

where N_i denotes the number of samples of the i-th class and X_i represents the center of the i-th class.

2)
    d_ij = ||X_i - X_j||,

where d_ij is the distance between the centers of the two classes.

3) DB index:
    DB_k = (1/k) sum_{i=1}^{k} R_i,   R_i = max_{j=1,...,k, j!=i} (S_i + S_j) / d_ij,

where k is the number of classes.

The smaller the value of DB_k, the better the performance of the classification. Based on the above rules and the 270 sample points of each category, we obtained the DB indices of the fourteen features and their ranks (see Table 2).

Table 2. DB indices of the fourteen features and their ranks.

    Rank   Feature                                          DB index
    1      the 6th band of the ETM image                    2.0408
    2      the 5th band of the ETM image                    4.2657
    3      the 4th band of the ETM image                    5.0092
    4      CI (composition index)                           6.3319
    5      the 1st component of the K-L transform           7.0428
    6      H component of the HSI color space               7.6135
    7      SI (soil index)                                  8.5819
    8      NDVI (normalized difference vegetation index)    9.8511
    9      the 1st component of the K-T transform           10.8020
    10     the 1st band of the ETM image                    14.8599
    11     the 3rd band of the ETM image                    25.2807
    12     the 2nd band of the ETM image                    26.7408
    13     I component of the HSI color space               29.8844
    14     S component of the HSI color space               153.2745

doi:10.1371/journal.pone.0069434.t002
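The three DB rules compose into a single score; the sketch below (ours, Python with NumPy) computes DB_k for a given feature space. To rank a single feature as in Table 2, it can be called with each class's values of that one feature (arrays of shape (N_i, 1)).

    import numpy as np

    def db_index(samples_by_class):
        # samples_by_class: list of (N_i, p) arrays, one array per class.
        centers = [X.mean(axis=0) for X in samples_by_class]
        # S_i: mean distance of class-i samples to their center X_i.
        S = [np.linalg.norm(X - c, axis=1).mean()
             for X, c in zip(samples_by_class, centers)]
        k = len(samples_by_class)
        R = []
        for i in range(k):
            # R_i = max over j != i of (S_i + S_j) / d_ij, with d_ij = ||X_i - X_j||.
            ratios = [(S[i] + S[j]) / np.linalg.norm(centers[i] - centers[j])
                      for j in range(k) if j != i]
            R.append(max(ratios))
        return sum(R) / k    # DB_k: smaller means better separability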
Parameter setting. Because the impacts that algorithm parameters have differ across algorithms, it is impossible to account for such differences when evaluating the comparative performances of the algorithms [4]. To avoid this problem, the parameters corresponding to the best performance of each algorithm were chosen for the purpose of comparison.

(1) Parameter setting of PSVM and FNPSVM. The performance of a classification algorithm is affected by its parameter settings. As described in section 3.4, we searched for the optimal parameters t_1, t_2, c, and s for the FNPSVM classifier. In this procedure, we used two steps to find the best parameters. In the first step, we set the parameters t_1 = 0.1 and t_2 = 0.8 and searched for the kernel parameter s and penalty constant c as described in section 3.5. In the second step, we set the parameters s and c as found in the first step and searched for the parameters t_1 and t_2 of the fuzzy membership function.

In the first step, we constructed the two-dimensional grid for the first layer. The values of c and s were prescribed from 2^-14 to 2^14, multiplied by 2^4. The grid search using 5-time random-validation was executed, and we found that the optimal parameter pair (c, s) was (2^10, 2^-10), having the highest overall classification accuracy (93.31%) and kappa value (0.9248). Table 3 summarizes the results of the first-layer grid search.

Table 3. Overall accuracies (%) and kappa coefficients (accuracy/kappa) of the first-layer grid search using 5-time random-validation based on the ETM+ image.

    c \ s    2^-14          2^-10          2^-6           2^-2           2^2            2^6            2^10           2^14
    2^-14    40.87/0.3348   69.15/0.6530   17.92/0.0766   11.11/0        11.11/0        11.08/0        11.11/0        11.11/0
    2^-10    59.65/0.5461   74.25/0.7103   59.18/0.5408   12.45/0.0151   11.09/0        11.11/0        11.11/0        11.11/0
    2^-6     64.00/0.5950   81.48/0.7916   75.83/0.7281   25.00/0.1563   11.43/0.0036   11.52/0.0046   11.34/0.0026   11.34/0.0026
    2^-2     76.93/0.7405   88.02/0.8653   85.30/0.8346   42.13/0.3490   12.53/0.0160   11.57/0.0051   11.62/0.0058   11.49/0.0043
    2^2      85.60/0.8380   90.71/0.8955   90.32/0.8911   47.95/0.4145   13.15/0.0230   11.55/0.0050   11.60/0.0055   11.54/0.0048
    2^6      89.33/0.8800   92.78/0.9188   89.91/0.8865   46.20/0.3948   13.80/0.0303   11.42/0.0035   11.43/0.0036   11.61/0.0056
    2^10     91.86/0.9085   93.31/0.9248   89.36/0.8803   48.60/0.4218   13.49/0.0268   11.57/0.0051   11.55/0.0050   11.70/0.0066
    2^14     92.44/0.9150   93.08/0.9221   87.70/0.8616   45.14/0.3828   13.79/0.0301   11.71/0.0068   11.61/0.0056   11.67/0.0063

doi:10.1371/journal.pone.0069434.t003

Subsequently, we constructed the second-layer grid based on the center (2^10, 2^-10); the values of c and s were chosen from 2^7 to 2^13 and from 2^-7 to 2^-13, multiplied by 2, respectively, and the grid search using 5-time random-validation was implemented. As shown in Table 4, c = 2^13 and s = 2^-13 gave the best overall classification accuracy (93.56%) and kappa coefficient (0.9275).

Table 4. Overall accuracies (%) and kappa coefficients (accuracy/kappa) of the second-layer grid search using 5-time random-validation based on the ETM+ image.

    c \ s    2^-7           2^-8           2^-9           2^-10          2^-11          2^-12          2^-13
    2^7      91.20/0.9010   92.82/0.9193   92.34/0.9138   92.91/0.9203   92.71/0.9180   92.11/0.9113   91.00/0.8988
    2^8      91.14/0.9003   92.45/0.9151   92.74/0.9183   92.71/0.9180   92.42/0.9148   92.07/0.9108   91.71/0.9068
    2^9      91.34/0.9026   91.95/0.9095   92.94/0.9206   92.68/0.9176   92.75/0.9185   92.57/0.9165   92.10/0.9111
    2^10     89.70/0.8841   92.45/0.9151   93.36/0.9253   92.48/0.9155   93.17/0.9231   92.91/0.9203   92.08/0.9110
    2^11     90.99/0.8986   91.37/0.9030   92.37/0.9141   92.75/0.9185   92.99/0.9211   92.29/0.9133   92.57/0.9165
    2^12     90.19/0.8896   91.49/0.9043   92.23/0.9126   93.08/0.9221   92.63/0.9171   92.96/0.9208   93.06/0.9220
    2^13     90.16/0.8893   91.57/0.9051   92.42/0.9148   92.82/0.9193   93.05/0.9218   92.80/0.9190   93.56/0.9275

doi:10.1371/journal.pone.0069434.t004

As these accuracies could fundamentally satisfy our classification demands, we began the next step, where we set the parameters c = 2^13 and s = 2^-13 and searched for the parameters t_1 and t_2. Unfortunately, we could not find changes of the parameters t_1 (0.05-0.2) and t_2 (0.7-0.9) able to significantly improve the performance of the FNPSVM, hence we set t_1 = 0.1 and t_2 = 0.8.

(2) Parameter setting of the BP neural network. There are many parameters associated with a BP neural network, including the number of neurons, the transfer function, the learning rate, the number of iterations, and so on. It is not easy to know beforehand which values of these parameters are best for a problem. Consequently, in this paper, in order to yield the optimal classification performance, the settings of some key parameters of the BP neural network were obtained by repeated trials and by experience from previous studies.

A BP neural network with one hidden layer can approximate, with arbitrary precision, an arbitrary nonlinear function defined on a compact set of R^n [43,44]. We employed a three-layer BP neural network comprising an input layer, a hidden layer, and an output layer. The number of neurons in the hidden layer is one of the primary parameters of the BPN algorithm; currently, however, there is no authoritative rule to determine it. A larger number of hidden units leads to poor generalization and increased training time, but too few neurons would cause the network to underfit the training set and prevent the correct mapping of inputs to outputs. In this paper, the number of neurons in the hidden layer was determined by the empirical formula [44] to be 20, so the network structure became n-20-9 (n denotes the number of features).

We chose the log-sigmoid function as the transfer function from the input layer, while setting the limit on the neural network's iterations to 1,000 for each desired output. The Levenberg-Marquardt optimization algorithm (the trainlm function in Matlab) was utilized as the training function, because it can greatly increase the training speed of the network by utilizing a large amount of memory.
Gradient descent with momentum weight and bias learning was employed to calculate a given neuron's weight change from the neuron's input and error, the weight, the learning rate, and the momentum constant, according to gradient descent with momentum. The other parameters of the network were chosen as follows: learning rate eta = 0.5, momentum factor alpha = 0.8, minimum gradient delta = 10^-20, and minimum mean square error epsilon = 10^-6. Figure 3 shows the classification maps produced using the MLC, BPN, PSVM, and FNPSVM, all based on the above parameter settings of the various classifiers.

[Figure 3. Classification maps for the test area in northern China using the various classifiers under the same training case (90 training samples for each class, 10 features). (a) MLC algorithm. (b) BPN algorithm, eta = 0.5, alpha = 0.8, delta = 10^-20, epsilon = 10^-6. (c) PSVM algorithm, c = 2^13, s = 2^-13. (d) FNPSVM, t_1 = 0.1, t_2 = 0.8, c = 2^13, s = 2^-13. doi:10.1371/journal.pone.0069434.g003]

Performance assessments. Normally, the settings of the various parameters of different algorithms affect the classification results, so it is difficult to evaluate the comparative performances of the algorithms under changing parameters. To address this problem, the best performance of each algorithm on each training case is listed in the following tables. The criteria for evaluating the performance of classification algorithms include accuracy, speed, stability, and comprehensibility, among others [4]. In this paper, we chose one group of criteria, consisting of classification accuracy, speed, and stability, to assess the performances of the different algorithms. Table 5 gives the overall accuracies and kappa coefficients obtained using the various multi-class strategies and classifiers with the ETM+ data on the different cases. Table 6 gives the training speed and the classification speed over the entire data set for the different classifiers under the different training conditions. Means and standard deviations of the overall classification accuracies over different training samples, testing samples, and features are shown in Table 7. Figure 4 shows the boxplots of the overall classification accuracies, developed by randomly selecting training samples and testing samples from the 270 samples of each class six times.

(1) Classification accuracy. Classification accuracy, one of the most important criteria for evaluating the performance of a classifier, was measured in this paper using overall accuracies and kappa coefficients computed from the confusion (or error) matrix. The most widely used way to represent the classification accuracy of remote sensing data is the error matrix, which is applicable to a variety of site-specific accuracy assessments. Numerous researchers have recommended using the error matrix to represent accuracy, and doing so has now become one of the standard conventions. The effectiveness of the error matrix in representing accuracy can be seen from the fact that the accuracies of each category are fully described, along with both the errors of inclusion and the errors of exclusion present in the classification [25,45]. To accommodate the effects of chance agreement, some researchers suggest using the kappa coefficient and adopting it as a standard measure of classification accuracy [46]. Foody [47] also pointed out that since many remote sensing data sets are dominated by mixed pixels, standard accuracy assessment measures such as the kappa coefficient are often not suitable for accuracy assessment in remote sensing. Although its sensitivity to the density or frequency of dynamic change in the real world has had some researchers arguing about its effect, the fact remains that the kappa coefficient has many attractive features as an index of classification accuracy. More specifically, it offers some compensation for chance agreement, and a variance term can be calculated for it, enabling statistical testing of the significance of the difference between two coefficients [25,48]. We also need to emphasize that the various measures of accuracy evaluate different components of accuracy and make different assumptions about the data [49]. In fact, the measurement and meaning of classification accuracy depend substantially on individual perspective and demands [49,50]. An accuracy assessment can be conducted for a variety of reasons, and many researchers have recommended that measures such as the kappa coefficient of agreement be adopted as a standard [25,46].

    Overall accuracy = ( Σ_{k=1}^{q} n_kk / n ) x 100%

    Kappa coefficient = ( n Σ_{k=1}^{q} n_kk - Σ_{k=1}^{q} n_k+ n_+k ) / ( n^2 - Σ_{k=1}^{q} n_k+ n_+k )

where q is the number of classes, n is the total number of samples, n_kk is the number of correctly classified samples of class k, and n_k+ and n_+k are the row and column totals of the error matrix for class k.

Using the above parameters selected for the different algorithms, and based on the 270 samples of each category obtained through the simple random sampling design, we obtained the overall classification accuracies and kappa coefficients of the various multi-class strategies and classifiers on the 12 training cases with the ETM+ data set consisting of 4,037,099 points (see Table 5). Unfortunately, when confronted with such a large data set, the SVM failed on this problem, because it requires the more costly solution of a linear or quadratic program.
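Both measures follow directly from the error matrix; the sketch below (ours, Python with NumPy) computes them exactly as in the two formulas above.

    import numpy as np

    def overall_accuracy_and_kappa(conf):
        # conf: (q, q) error matrix; rows = reference classes, columns = predictions.
        n = conf.sum()
        diag = np.trace(conf)                    # sum of n_kk
        oa = diag / n * 100.0                    # overall accuracy, percent
        chance = (conf.sum(axis=1) * conf.sum(axis=0)).sum()   # sum of n_k+ * n_+k
        kappa = (n * diag - chance) / (n ** 2 - chance)
        return oa, kappa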
Several patterns can be observed from Table 5 and Table 7, explained as follows:

1) As far as the multi-class classification strategies of the PSVM and FNPSVM are concerned, the accuracies of the "one-against-one" strategy in all training cases were about 1-2% higher than those of the "one-against-rest" strategy. Through experiments, we also found that the classification speed of the "one-against-one" strategy was at least two times that of the "one-against-rest" strategy, for both the PSVM and FNPSVM (not listed in the following tables). Therefore, in this paper, we employed the "one-against-one" multi-class classification strategy of the PSVM and FNPSVM for comparison with the other two classifiers.

2) The level of classification accuracy achieved by the PSVM and FNPSVM was significantly higher than that produced by either the MLC or the BPN classifier; they yielded significantly better results than the MLC or BPN classifier in all 12 training cases (Table 5). The accuracy differences between the PSVM and FNPSVM were rather small, much like those between the MLC and BPN (Table 5). The mean overall accuracies of the PSVM and FNPSVM were remarkably higher than those of the MLC and BPN, while the differences between the MLC and BPN, or between the PSVM and FNPSVM, were only slight (Table 7). This is expected, because the PSVM and FNPSVM are designed to locate an optimal separating hyperplane, while the other two algorithms may not be able to locate this separating hyperplane. Statistically, the optimal separating hyperplanes located by the PSVM and FNPSVM should generalize to unseen samples with the fewest errors among all separating hyperplanes. Generally, as the number of available features increases, the overall accuracies and kappa coefficients of the PSVM and FNPSVM grow gradually. Unexpectedly, however, an increase in the number of available features did not always lead to an improvement in the accuracies of the MLC and BP: on the contrary, the MLC and BP showed better comparative performance on the training cases with ten features than on those with fourteen features, which might be explained by a large number of irrelevant features hurting the classification performance. This again demonstrates the importance of feature selection. From Table 5, it can also be seen that the accuracies and kappa coefficients of the four algorithms improved with increasing training data size, though not significantly.

3) The overall accuracy differences between the MLC and BPN on the data set used in this study were generally small, and those between the PSVM and FNPSVM were also not obvious. However, many of them were statistically significant.

(2) Training speed and classification speed. Training speed and classification speed are two important criteria for evaluating the performance of classification algorithms. As shown in Table 6, the training and classification speeds of the four classifiers were substantially different. Generally, the training time and classification time rise with an increase in available features. The training speed of the BPN was significantly lower than those of the other three classifiers because of its complex network structure.
As far as classification speed is concerned, in all training cases, the speeds of the PSVM and FNPSVM were remarkably lower than those of the MLC and BPN. The classification by the MLC and BPN in all training cases took from a few minutes to less than an hour, while the PSVM and FNPSVM took more than several hours and more than ten hours, respectively. This was due to the fact that the PSVM and FNPSVM involve large matrix calculations and inverse matrix operations during the process of classification. In addition, it should be noted that we spent much time searching for the optimal key parameters, including the kernel parameter s and the constant c, in the training process, thereby yielding a better performance. Compared with the PSVM, the training and classification times of the FNPSVM were more than twice those of its counterpart. The reason was that in terms of the comparison
