N91-15622

A Method for Classification of Multisource Data using Interval-Valued Probabilities and Its Application to HIRIS Data

H. Kim and P.H. Swain
School of Electrical Engineering and Laboratory for Applications of Remote Sensing
Purdue University, West Lafayette, IN 47907, U.S.A.
Tel: (317) 494-0212, FAX: (317) 494-0776
E-mail address: [email protected]

ABSTRACT

A method of classifying multisource data in remote sensing is presented. The proposed method considers each data source as an information source providing a body of evidence, represents statistical evidence by interval-valued probabilities, and uses Dempster's rule to integrate information based on multiple data sources.

The method is applied to the problem of ground-cover classification of multispectral data combined with digital terrain data such as elevation, slope, and aspect. The method is then applied to simulated 201-band High Resolution Imaging Spectrometer (HIRIS) data by dividing the dimensionally huge data source into smaller, more manageable pieces based on global statistical correlation information. It produces higher classification accuracy than the Maximum Likelihood (ML) classification method when the Hughes phenomenon is apparent.

1 INTRODUCTION

The importance of utilizing multisource data in ground-cover classification lies in the fact that improvements in classification accuracy can generally be achieved by exploiting additional independent features provided by separate sensors. However, it should be recognized that information and knowledge from most available data sources in the real world are neither certain nor complete. We refer to such a body of uncertain, incomplete, and sometimes inconsistent information as "evidential information."

The objective of the current research is to develop a mathematical framework within which various applications can be made with multisource data in remote sensing and geographic information systems. The methodology described in this paper has evolved from "evidential reasoning," where each data source is considered as providing a body of evidence with a certain degree of belief. The degrees of belief based on the body of evidence are represented by "interval-valued (IV) probabilities" rather than by conventional point-valued probabilities, so that uncertainty can be embedded in the measures.

There are three fundamental problems in multisource data analysis based on IV probabilities: (1) how to represent bodies of evidence by IV probabilities, (2) how to combine IV probabilities to give an overall assessment of the combined body of evidence, and (3) how to make decisions based on IV probabilities.

The paper describes a formal method of representing statistical evidence by IV probabilities based on the Likelihood Principle. In order to integrate information obtained from individual data sources, the method presented in the paper uses Dempster's rule for combining multiple bodies of evidence. Although IV probabilities together with Dempster's rule provide an innovative means for the representation and combination of evidential information, they make the decision process rather complicated, and more intelligent strategies for making decisions are needed. This paper therefore also addresses the development of decision rules over IV probabilities.

2 AXIOMATIC DEFINITION OF IV PROBABILITY

Interval-valued probabilities can be thought of as a generalization of ordinary point-valued probabilities. The endpoints of an IV probability are called the "upper probability" and the "lower probability."

There have been various works introducing the concepts of IV probabilities in the areas of philosophy of science and statistics [1][2][3][4]. Although the mathematical rationales behind those approaches are different, there are some properties of IV probabilities which are commonly required. The axiomatic approach to IV probabilities is based on those common properties, so that it can avoid conceptual ambiguities.

DEFINITION [5] Suppose \Theta is a finite set of exhaustive and mutually exclusive events, and let \mathcal{B} denote a Boolean algebra of the subsets of \Theta. The IV probability [L, U] is defined by the set-theoretic functions

    L: \mathcal{B} \to [0, 1]                                             (2.1)
    U: \mathcal{B} \to [0, 1]                                             (2.2)

satisfying the following properties:

    I)   U(A) \geq L(A) \geq 0 for any A \in \mathcal{B}                  (2.3)
    II)  U(\Theta) = L(\Theta) = 1                                        (2.4)
    III) U(A) + L(\bar{A}) = 1 for any A \in \mathcal{B}                  (2.5)
    IV)  For any A, B \in \mathcal{B} with A \cap B = \emptyset,
         L(A) + L(B) \leq L(A \cup B) \leq L(A) + U(B)
                     \leq U(A \cup B) \leq U(A) + U(B)                    (2.6)

Given a system of IV probabilities over \mathcal{B}, the actual probability measure P(A) of any subset A of \Theta is assumed to lie in the interval [L, U] such that

    L(A) \leq P(A) \leq U(A)                                              (2.7)

The degree of uncertainty about the actual probability of A is represented by the width U(A) - L(A) of the interval. In particular, U(A) = L(A) = P(A) when there is complete knowledge of the probability of A; in this case the IV probability becomes an ordinary additive probability. Conversely, L(A) = 0 and U(A) = 1 when there is absolutely no knowledge of the probability of A.

The basic probability assignment m defined in Shafer's mathematical theory of evidence [6] has the following relations with the IV probabilities:

    L(A) = \sum_{B \subseteq A} m(B)                                      (2.8)

    m(A) = \sum_{B \subseteq A} (-1)^{|A - B|} L(B)  for all A \subseteq \Theta      (2.9)

    U(A) = \sum_{B \cap A \neq \emptyset} m(B)                            (2.10)
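The relations (2.8) and (2.10) and the duality in axiom (2.5) are easy to check computationally. The following Python sketch is not from the paper; the three class labels and the mass values are invented for illustration. It builds a small basic probability assignment over a finite frame and verifies that U(A) + L(complement of A) = 1 for every subset A.

```python
from itertools import chain, combinations

# Frame of discernment: a small finite set of mutually exclusive classes.
THETA = frozenset({"wheat", "fallow", "unknown"})

# A basic probability assignment m: masses on subsets of THETA summing to 1
# (illustrative numbers only, not taken from the paper).
m = {
    frozenset({"wheat"}): 0.5,
    frozenset({"wheat", "fallow"}): 0.3,
    THETA: 0.2,
}

def lower(A, m):
    """L(A) = sum of m(B) over focal elements B contained in A (Eq. 2.8)."""
    return sum(mass for B, mass in m.items() if B <= A)

def upper(A, m):
    """U(A) = sum of m(B) over focal elements B intersecting A (Eq. 2.10)."""
    return sum(mass for B, mass in m.items() if B & A)

def subsets(s):
    """All subsets of s, as frozensets."""
    return (frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1)))

# Duality of axiom (2.5): U(A) + L(complement of A) = 1 for every A.
for A in subsets(THETA):
    assert abs(upper(A, m) + lower(THETA - A, m) - 1.0) < 1e-9
```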
3 REPRESENTATION OF STATISTICAL EVIDENCE BY IV PROBABILITY

When a body of evidence is based on the outcomes of statistical experiments known to be governed by some probability model, it is called "statistical evidence." One of the basic problems for any theory of IV probabilities is how to represent a given body of statistical evidence by IV probabilities.

DEFINITION [6] An upper probability function U is said to be "consonant" if its focal elements are nested, i.e., if for A_i \subseteq \Theta (i = 1, ..., r) such that m(A_i) > 0 for all i and \sum_{i=1}^{r} m(A_i) = 1, we have A_i \subseteq A_j for any i < j, where m is the basic probability assignment of U.

Suppose the observed data in a statistical experiment are governed by a probability model {P_\theta : \theta \in \Theta}, where P_\theta is a conditional probability density function on a sample space X given \theta. Our intuition is that an observation x \in X seems more likely to belong to those elements of \Theta which assign a greater chance to x.

Based on this intuition, together with the consonance assumption on the upper probability function, Shafer [6] proposed the linear plausibility function defined as

    U(A | x) = \frac{\max_{\theta' \in A} p_{\theta'}(x)}{\max_{\theta \in \Theta} p_{\theta}(x)}   for all A \in \mathcal{B}, A \neq \emptyset      (3.1)

The corresponding lower probability function is given as

    L(A | x) = 1 - \frac{\max_{\theta' \in \bar{A}} p_{\theta'}(x)}{\max_{\theta \in \Theta} p_{\theta}(x)}   for all A \in \mathcal{B}      (3.2)

In particular, when the set A is a singleton, say {\theta'}, the function in Eq. (3.1) gives the relative likelihood of \theta' with respect to the most likely element in \Theta.
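As an illustration of Eqs. (3.1) and (3.2), the following Python sketch computes the consonant interval [L(A|x), U(A|x)] from class-conditional likelihoods. The Gaussian class models, their parameters, and the observation are invented for the example and are not taken from the paper.

```python
import numpy as np

def consonant_iv(likelihoods, A):
    """Linear plausibility representation of statistical evidence (Eqs. 3.1, 3.2).

    likelihoods: dict mapping each class label in Theta to p_theta(x) at the
    observed x.  A: a set of class labels.  Returns (L(A|x), U(A|x)).
    """
    theta = set(likelihoods)
    top = max(likelihoods.values())                      # max over all of Theta
    upper = max(likelihoods[t] for t in A) / top if A else 0.0
    complement = theta - set(A)
    lower = 1.0 - (max(likelihoods[t] for t in complement) / top if complement else 0.0)
    return lower, upper

# Toy example: Gaussian likelihoods of one scalar observation under three classes.
x = 0.8
classes = {"wheat": (1.0, 0.3), "fallow": (0.0, 0.5), "unknown": (0.5, 1.0)}  # (mean, std)
p = {c: float(np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi)))
     for c, (mu, s) in classes.items()}

L, U = consonant_iv(p, {"wheat"})
print(f"[L, U] for {{wheat}} given x={x}: [{L:.3f}, {U:.3f}]")
```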
4 DEMPSTER'S RULE FOR COMBINING IV PROBABILITIES

Dempster's rule is a generalized scheme of Bayesian inference for aggregating bodies of evidence provided by multiple information sources. Let m_1 and m_2 be the basic probability assignments associated respectively with the belief functions Bel_1 and Bel_2, which are inferred from two entirely distinct bodies of evidence E_1 and E_2. For all A_i, B_j, and X_k \subseteq \Theta, Dempster's rule (or Dempster's orthogonal sum) gives a new belief function denoted by

    Bel = Bel_1 \oplus Bel_2                                              (4.1)

The basic probability assignment associated with the new belief function is defined as

    m(X_k) = (1 - k)^{-1} \sum_{A_i \cap B_j = X_k} m_1(A_i) \, m_2(B_j)   for any nonempty X_k \subseteq \Theta      (4.2)

where k is the measure of conflict between Bel_1 and Bel_2, defined as

    k = \sum_{A_i \cap B_j = \emptyset} m_1(A_i) \, m_2(B_j)              (4.3)

Dempster's rule computes the basic probability of X_k, m(X_k), from the products of m_1(A_i) and m_2(B_j), considering all A_i and B_j whose intersection is X_k. Once m has been computed for every X_k \subseteq \Theta, the belief function is obtained as the sum of the masses committed to X_k and its subsets. The denominator (1 - k) normalizes the result to compensate for the measure committed to the empty set, so that the total probability mass is one. Consequently, Dempster's rule discards the conflict between E_1 and E_2 and carries their consensus into the new belief function. Dempster's rule is both commutative and associative; therefore the order or grouping of evidence in combination does not affect the result, and a sequence of information sources can be combined either sequentially or pairwise.
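A direct transcription of Eqs. (4.2) and (4.3) is sketched below. The frame, the focal elements, and the masses are illustrative only, and the dictionary-of-frozensets representation is our own choice rather than anything prescribed by the paper.

```python
def dempster_combine(m1, m2):
    """Dempster's orthogonal sum of two basic probability assignments (Eqs. 4.2, 4.3).

    m1, m2: dicts mapping frozenset focal elements to masses.
    Returns the combined assignment; raises if the evidence is totally conflicting.
    """
    combined = {}
    conflict = 0.0
    for A, ma in m1.items():
        for B, mb in m2.items():
            inter = A & B
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb              # mass committed to the empty set (k)
    if conflict >= 1.0:
        raise ValueError("completely conflicting bodies of evidence (k = 1)")
    norm = 1.0 - conflict                        # the (1 - k) normalizer
    return {A: mass / norm for A, mass in combined.items()}

# Two bodies of evidence over a three-class frame (illustrative masses only).
THETA = frozenset({"df", "lp", "hc"})
m1 = {frozenset({"df"}): 0.6, THETA: 0.4}
m2 = {frozenset({"df", "lp"}): 0.7, frozenset({"hc"}): 0.3}
m12 = dempster_combine(m1, m2)

# Commutativity: combining in the other order gives the same result.
m21 = dempster_combine(m2, m1)
assert all(abs(m12[A] - m21[A]) < 1e-12 for A in m12)
```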
5 DECISION RULES FOR IV PROBABILITIES [7]

Consider a classification problem in which an arbitrary pattern x \in X from an unknown class is assigned to one of n classes in \Theta. Let \lambda(\theta_i | \theta_j) be a measure of the "loss" incurred when the decision \theta_i is made and the true pattern class is in fact \theta_j, where i, j = 1, ..., n. Also, let \delta(x) denote a decision rule that tells which class to choose for every pattern x. We define the "upper expected loss" \bar{u}_i(x) and the "lower expected loss" \underline{u}_i(x) of making a decision \delta(x) = \theta_i as

    \bar{u}_i(x) = \sum_{j=1}^{n} \lambda(\theta_i | \theta_j) \, U_x(\theta_j)            (5.1)

    \underline{u}_i(x) = \sum_{j=1}^{n} \lambda(\theta_i | \theta_j) \, L_x(\theta_j)      (5.2)

where U_x(\theta_j) and L_x(\theta_j) are respectively the upper and the lower probabilities that x actually comes from \theta_j. The "Bayes-like rule" is the one which minimizes both the upper and the lower expected losses, i.e.,

    \delta(x) = \theta_i  if  \bar{u}_i(x) \leq \bar{u}_j(x)  and  \underline{u}_i(x) \leq \underline{u}_j(x)  for j = 1, ..., n      (5.3)

A problem with this decision rule is that there does not always exist a \theta_i satisfying the condition in Eq. (5.3), which can lead to ambiguity. In such an ambiguous situation one may withhold the decision and wait for a new piece of information. Otherwise, the ambiguity may be resolved by resorting to the following rule, the so-called "minimum average expected loss rule":

    \delta(x) = \theta_i  if  (\bar{u}_i(x) + \underline{u}_i(x))/2 \leq (\bar{u}_j(x) + \underline{u}_j(x))/2  for j = 1, ..., n      (5.4)

As an alternative to the Bayes-like rule, there are two other rules by which a decision is made according to a single measure of the interval, either the upper expected loss or the lower expected loss:

(A) minimum upper expected loss rule:

    \delta(x) = \theta_i  if  \bar{u}_i(x) \leq \bar{u}_j(x)  for j = 1, ..., n            (5.5)

(B) minimum lower expected loss rule:

    \delta(x) = \theta_i  if  \underline{u}_i(x) \leq \underline{u}_j(x)  for j = 1, ..., n      (5.6)

Although these two rules always produce a decision and never leave an ambiguous situation, they do not utilize all of the information represented by the IV probabilities. The performance of these rules is compared with that of the minimum average expected loss rule in the experiments, by applying them to problems of ground-cover classification based on remotely sensed and geographic data.
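The decision rules (5.1) through (5.6) can be sketched as follows. The 0/1 loss matrix and the interval probabilities for the single "pixel" are invented so that the Bayes-like rule is ambiguous while the three single-measure rules still decide; none of these numbers come from the paper.

```python
import numpy as np

def expected_losses(loss, L, U):
    """Upper and lower expected losses of each decision (Eqs. 5.1, 5.2).

    loss[i, j] is the loss of deciding class i when the true class is j;
    L[j], U[j] are the lower/upper probabilities that x belongs to class j.
    """
    upper_loss = loss @ U        # u_bar_i(x)   = sum_j loss(i|j) * U_x(j)
    lower_loss = loss @ L        # u_under_i(x) = sum_j loss(i|j) * L_x(j)
    return upper_loss, lower_loss

def decide(loss, L, U, rule="MUEL"):
    u, l = expected_losses(loss, L, U)
    if rule == "MUEL":                       # minimum upper expected loss (Eq. 5.5)
        return int(np.argmin(u))
    if rule == "MLEL":                       # minimum lower expected loss (Eq. 5.6)
        return int(np.argmin(l))
    if rule == "MAEL":                       # minimum average expected loss (Eq. 5.4)
        return int(np.argmin((u + l) / 2.0))
    # Bayes-like rule (Eq. 5.3): a class must minimize both losses, else withhold.
    i = int(np.argmin(u))
    return i if i == int(np.argmin(l)) else None

# 0/1 loss over three classes and illustrative interval probabilities for one pixel.
loss = 1.0 - np.eye(3)
L = np.array([0.2, 0.4, 0.1])
U = np.array([0.9, 0.5, 0.6])
print({r: decide(loss, L, U, r) for r in ("Bayes", "MUEL", "MLEL", "MAEL")})
```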
6 EXPERIMENTAL RESULTS

The experiments were performed on two different image data sets. In each experiment, the classification accuracy of the multisource data (MSD) classification based on the proposed method was compared with that of Maximum Likelihood (ML) classification based on the stacked-vector approach.

Table 1 describes the set of data sources for the first experiment. The image in this data set consists of 256 lines by 256 columns and covers a forestry site around the Anderson River area of British Columbia, Canada. Source 1 is 11-band Airborne Multispectral Scanner (A/B MSS) data. Sources 2 and 3 are Synthetic Aperture Radar (SAR) imagery in Shallow mode and Steep mode, respectively. Sources 4 through 6 provide digital terrain data.

Table 1. Anderson River Data Set.

    Source  Data Type     Spectral Region  Input Channel         Spectral Band (μm)
    1       A/B MSS       Visible          1                     0.38 - 0.42
                                           2                     0.42 - 0.45
                                           3                     0.45 - 0.50
                                           4                     0.50 - 0.55
                                           5                     0.55 - 0.60
                                           6                     0.60 - 0.65
                                           7                     0.65 - 0.69
                          Near IR          8                     0.70 - 0.79
                                           9                     0.80 - 0.89
                                           10                    0.92 - 1.10
                          Thermal          11                    8 - 14
    2       SAR Shallow                    XHV, XHH, LHV, LHH
    3       SAR Steep                      XHV, XHH, LHV, LHH
    4       Topographic: Elevation
    5       Topographic: Aspect
    6       Topographic: Slope

In this experiment, 6 classes were defined as listed in Table 2, and 100 pixels per class were used as training data, which is between 4% and 8% of the total pixels of the classes in the test fields. The training samples are uniformly distributed over the test fields so that they may be considered good representatives of the total samples.

Table 2. Information Classes for Test of Anderson River Data Set.

    Class  Cover Type                       Tree Sizes  No. of Pixels  % of Total
    1      Douglas Fir 2 (df2)              31 - 40 m   2246           21.72
    2      Douglas Fir 3 (df3)              21 - 30 m   1501           14.52
    3      DF + Other Species 2 (df+os2)    31 - 40 m   1352           13.08
    4      DF + Lodgepole Pine 2 (df+lp2)   21 - 30 m   1589           15.37
    5      Hemlock + Cedar (hc)             31 - 40 m   1587           15.35
    6      Forest Clearings (fc)                        2064           19.96
    Total                                               10339          100.0

We have observed that some of the classes defined in Table 2 cannot be assumed to be normally distributed in the topographic data. It was therefore decided to adopt a nonparametric approach, the "Nearest Neighbor" (NN) method [8], in computing probability measures for the topographic sources, while the optical and radar data sources were assumed to have Gaussian probability density functions.

First, ML classification based on the stacked-vector approach was carried out for various sets of the data sources, adding one source at a time to the A/B MSS data in the order Elevation, SAR-Shallow, SAR-Steep, Aspect, and Slope. Then MSD classification based on the proposed method was performed using the different decision rules. Tables 3 and 4 compare the results for the training samples and the test samples, respectively. Even though the compounded data in the ML classification were treated as having Gaussian distributions, the ML and MSD methods produced similar results for the training samples. This is not surprising because the ML method uses conventional additive probabilities under the assumption that knowledge of the actual unknown probabilities is complete, which is reasonable as far as the training samples are concerned.

Table 3. Results of Classifications over Training Samples of Anderson River Data (percent correct).

    Decision Rule   Sources: 1   1, 4    1, 2, 4   1 - 4   1 - 5   1 - 6
    ML                   82.51   88.67   91.67     92.00   92.83   93.5
    MSD, MUEL                -   89.83   92.00     92.50   93.17   94.32
    MSD, MLEL                -   88.67   91.17     91.33   92.33   93.65
    MSD, MAEL                -   88.50   91.00     91.67   91.67   93.50

Table 4. Results of Classifications over Test Samples of Anderson River Data (percent correct).

    Decision Rule   Sources: 1   1, 4    1, 2, 4   1 - 4   1 - 5   1 - 6
    ML                   74.16   77.77   79.13     78.93   79.80   81.0
    MSD, MUEL                -   80.60   82.39     82.69   83.02   84.54
    MSD, MLEL                -   78.45   81.42     81.67   82.24   83.65
    MSD, MAEL                -   78.21   80.95     82.05   81.88   83.1

Comparing the performance of the three decision rules, the minimum upper expected loss (MUEL) rule was superior to the other two, the minimum lower expected loss (MLEL) rule and the minimum average expected loss (MAEL) rule. It is not known in general which rule is best; further research is needed to determine whether guidelines can be devised for selecting the decision rule.

In the second experiment, the proposed method was applied to the classification of HIRIS data by decomposing the data into smaller pieces, i.e., subsets of spectral bands. The data set used in this experiment is simulated HIRIS data produced by RSSIM [9]. RSSIM is a simulation tool for the study of multispectral remotely sensed images and associated system parameters; it creates realistic multispectral images based on detailed models of the ground surface, the atmosphere, and the sensor. Table 5 describes the simulated HIRIS data set.

Table 5. Description of Simulated HIRIS Data Set.

    Name                 Finney County Data Set
    Data Type            201-band HIRIS data simulated by RSSIM
    Spectral Region      0.4 - 2.4 μm
    Spectral Resolution  0.01 μm
    Image Size           45 lines x 45 columns (2025 samples)
    Information Classes  Winter Wheat, Summer Fallow, Unknown

Figure 1 is a visual representation of the global statistical correlation coefficient matrix of the data; the image is produced by converting the absolute values of the coefficients to gray values between 0 and 255. Based on the correlation image, the 201 bands were divided into 3 groups in such a way that intra-group correlation is maximized and inter-group correlation is minimized. Table 6 describes the multisource data set after division. Note that the spectral regions of the input channels in Source 3 coincide with the water absorption bands.

[Figure 1. Global Statistical Correlation Coefficient Image of Simulated HIRIS Data; both axes are indexed by spectral band number, 1 through 201.]

Table 6. Divided Sources of HIRIS Data Set.

    Source    Input Channels                                       No. of Features
    Source 1  1 - 35, 107 - 141, 157 - 201                         115
    Source 2  36 - 95                                              60
    Source 3  96 - 106 (1.35 - 1.45 μm), 142 - 156 (1.81 - 1.95 μm) 26
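The paper divides the 201 bands by inspecting the correlation image rather than by a stated algorithm, so the following sketch only illustrates how a candidate grouping might be scored by mean within-group versus between-group absolute correlation. The random data array is a stand-in for the simulated HIRIS image; only the group index ranges mirror Table 6.

```python
import numpy as np

def grouping_score(data, groups):
    """Score a candidate band grouping by intra- vs. inter-group |correlation|.

    data: (n_pixels, n_bands) array; groups: list of lists of band indices.
    Returns (mean within-group |r|, mean between-group |r|); a good grouping
    has the first value high and the second low.
    """
    corr = np.abs(np.corrcoef(data, rowvar=False))   # |correlation| between bands
    n = corr.shape[0]
    label = np.empty(n, dtype=int)
    for g, idx in enumerate(groups):
        label[idx] = g
    same = label[:, None] == label[None, :]
    off_diag = ~np.eye(n, dtype=bool)
    within = corr[same & off_diag].mean()
    between = corr[~same].mean()
    return within, between

# Illustrative use: random data in place of the 2025-pixel, 201-band image,
# grouped into the contiguous blocks of Table 6 (0-based index ranges).
rng = np.random.default_rng(0)
data = rng.normal(size=(2025, 201))
groups = [list(range(0, 35)) + list(range(106, 141)) + list(range(156, 201)),
          list(range(35, 95)),
          list(range(95, 106)) + list(range(141, 156))]
print(grouping_score(data, groups))
```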
With 225 training samples (a third of the total samples) for each class, ML classification and MSD classification using the minimum upper expected loss rule were performed over the total samples for various sets of the sources; the results are listed in Table 7.

Table 7. Results of Classifications over Test Samples of Simulated HIRIS Data Set (percent correct).

    Method   S1      S2      S3      S1, S2   All
    ML       75.75   75.60   45.83   74.56    65.14
    MSD      -       -       -       77.83    77.63

The results of the ML method clearly show the effects of the Hughes phenomenon: the accuracy goes down as the dimensionality of the source increases while the number of training samples is fixed. In particular, the accuracy decreases by a considerable amount when all features are used. The presence of the Hughes phenomenon makes the ML method particularly sensitive to a bad source, Source 3 in this case. Meanwhile, the proposed MSD classification method shows consistently robust performance.

7 CONCLUSIONS

In this paper we have investigated how interval-valued probabilities can be used to represent and aggregate evidential information obtained from various data sources. The concepts of interval-valued probabilities have been employed to develop a new method of classifying multisource data in remote sensing and geographic information systems. The experiments demonstrate the ability of the method to capture uncertain information based on inexact and incomplete multiple bodies of evidence. The basic strategy of the method is to decompose a relatively large body of evidence into smaller, more manageable pieces, to assess plausibilities and supports based on each piece, and to combine the assessments by Dempster's rule. In this scheme we are able to overcome the difficulty of precisely estimating statistical parameters and to integrate as much statistical information as possible.

ACKNOWLEDGEMENTS

This research was partially supported by the National Aeronautics and Space Administration under Contract No. NAGW-925. The SAR/MSS Anderson River data set was acquired, processed, and loaned to Purdue University by the Canadian Center for Remote Sensing, Department of Energy, Mines and Resources, of the Government of Canada.

REFERENCES

[1] B.O. Koopman, "The axioms and algebra of intuitive probability", The Annals of Mathematics, vol. 41, pp. 269-292, 1940.
[2] I.J. Good, "Subjective probability as the measure of a non-measurable set", in E. Nagel, P. Suppes, and A. Tarski (eds.), Logic, Methodology and Philosophy of Science, pp. 319-329, Stanford University Press, 1962.
[3] C.A.B. Smith, "Consistency in statistical inference and decision", J. Roy. Stat. Soc., Ser. B, vol. 23, pp. 1-25, 1961.
[4] A.P. Dempster, "Upper and lower probabilities induced by a multivalued mapping", Ann. Math. Stat., vol. 38, pp. 325-339, 1967.
[5] P. Suppes, "The measurement of belief", J. Roy. Stat. Soc., Ser. B, vol. 36, pp. 160-191, 1974.
[6] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976.
[7] H. Kim, "A Method of Classification for Multisource Data in Remote Sensing based on Interval-Valued Probabilities", Ph.D. Thesis, School of Electrical Engineering, Purdue University, West Lafayette, IN 47907, U.S.A., August 1990.
[8] K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, 1972.
[9] J.P. Kerekes and D.A. Landgrebe, "Modeling, Simulation, and Analysis of Optical Remote Sensing Systems", TR-EE 89-49, School of Electrical Engineering, Purdue University, West Lafayette, IN, U.S.A., 1989.
