ebook img

selecting the number of knots in fitting cardinal b-splines for density estimation using aic PDF

12 Pages·2008·0.46 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview selecting the number of knots in fitting cardinal b-splines for density estimation using aic

J. Japan Statist. Soc. Vol. 20 No.2 1990 179-190 SELECTING THE NUMBER OF KNOTS IN FITTING CARDINAL B-SPLINES FOR DENSITY ESTIMATION USING AIC Taskin Atilgan* and Hamparsum Bozdogan** B-splines with equally spaced knots (cardinal B-splines) provide a simple, locally flexible and computationally efficient basis for approximation problems in regression and density estimation. The fidelity to the data (small bias, large variance) and smoothness (large bias, small variance) of a B-spline fit depends on the number of knots employed. This paper studies the problem of selecting the number of knots in cardinal B- splines in density estimation. _4kaike's Information Criterion (AIC) is proposed and discussed as a data-based criterion for selecting the number of knots in density estima tion. The EM algorithm is developed for a "mixture" of B-splines to obtain the maxi mum likelihood estimators of the parameters and to score AIC. Numerical examples are provided to illustrate the versatility and efficiency of the proposed approach by automating the curve fitting process. 1. Introduction A sρline function is a function consisting of polynomial pieces j0ined together with certain smoothness conditions. For example, a polygonal function which is a spline f unc tion of degree l is a simple function whose pieces are linear polynomials joined together to achieve continuity. The pointsξ1,ξ2,…,ξk at which the function changes its character are called knots in the theory of splines. In this paper our emphasis will be primarily on B-splines, especially cubic B-splines. These are special spline functions that are well adapted to numerical tasks and more and more are becoming popular for approximating the underlying function of real data sets in many practical problems across the spectrum of diverse scientific fields, such as tho se of biology, chemistry, economics, geology, medicine, meteorology, nuclear physics, etc . B-splines with equally spaced knots which are called cardinal B-splines or basis splines provide a simple, locally flexible and computationally efficient basis for approximatio n problems in regression and density estimation, and for other complicated pheno mena which are either difficult or impossible to model precisely. The fidelity to the data(smal l bias, large variance)and the smoothness(large bias, small variance)of a B-spl ine fit de pends on the number of knots employed. Hence, our objective in this paper is to study the problem of selecting the number of knots in cardinal B-splines in density estimation. We achieve this by using A kaike's Information Criterion (AIC) defined by Received February,1989. Revised September,1989 and May,1990. Accepted June,1990. * AT&TBell Labs, Murray Hi11,N J 07974 USA. ** Department of Statistic,s University of Tennessee, Knoxville, TN 37996-0532 U.S.A. This research for the second author was partiallys upported as a Research Fellow by the Institute of StatisticalM athematics, Tokyo, Japan during January-August 1988. 180 J. JAPAN STATIST. SOC. Vol. 20 No.2 1990 (1.1) AIC (m)=-2 log (maximized likelihood) +2M, where(cid:129)@ m is the number of independently adjusted parameters in the model. The minimizer of AIC (m), over m=1, 2, ..., M, is called the minimum AIC estimate (MAICE). More on AIC, see, e.g., Akaike (1973, 1974, 1981), and Bozdogan (1981, 1986, 1987), Atilgan (1983), Sakamoto, et al. (1986), and others. The format of the paper is as follows. In Section 2, we introduce cardinal B-splines. In Section 3, we consider density estimation in terms of cardinal B-splines and briefly develop the EM algorithm for cardinal B-splines for density estimation to obtain the maximum likelihood estimates of the parameters. Then, we show the analytical expres sion for AIC for this problem to select the dimension of the approximation. In Section 4, we give numerical examples and illustrate the application of cardinal B-splines to density estimation on two real data sets. In Section 5, we present our conclusions and discussion. 2. Cardinal B-splines B-splines are special spline functions that are well adapted to numerical tasks and for approximating the underlying function of real data sets in practice. They are a lso called"basis splines"since they form a"basis"for the set of all splines, or"bell splines, " because of their bell shape shown in Figure 2.1. In more technical terms, a spline of degree d with k knots,α=ξo<ξ1<ξ2<…<ξk<ξk+1 =bis a function S(x)eCd-1(a , b)such that S(x)ePd. That is, S(x)is a function whose (d-1)st derivative is absolutely continuous on(a, b), and where Pd is the class of poly nomials of degree at most d, in each of the intervals(α1,ξ1),(ξ1,ξ2),…,(ξk, b). We will denote this class of splines by Sd,k(ξ1,ξ2,…,ξk). Let S*(x)denote the"best linear approximation"to S(x). Here,"best approximation" is used in the sense that S*(x)safisfies the sup-norm(or Lm norm)inequality (2.1) If we have a function f∈C(a, b), then by the general theory of linear approximations, we have the following theorem due to Schumaker(1968). THEOREM 2.1. Let a<ξ ・<ξ2<…<ξk<b be fixed and supp0se f(x)∈c(a, b). Then there exists a best approximation S*(x)∈5d,北(ξ1,ξ2,…,ξk)for every 5∈5d,κ(ξ1, SEi…,ξk). Curry and Schoenberg (1966) showed that every spline of degree d(d=1,2,) with kknots(k=1,2,…)has a unique expansion (2.2) In (2.2), βi(x), i=1,2,…, d十k=m are B-splines of degree d, m=d十k is the dimension of the approximation, and where θi's are unknown parameters which need to be estimated. SELECTING THE NUMBER OF KNOTS IN FITTING CARDINAL B-SPLINES 181 Figure 2.2. Cubic Cardinal B-Splines With k Knots. Note that and BOZ0, EBi=1 a is special requirement. This is not true for general splines. The case d=1 corresponds to piecewise linear approximation which is attractively simple but produces a visible roughness, unless the knots are to close each other. For d=2 and 3, we get quadratic and cubic B-splines, respectively. In what follows, we shall concentrate on cubic B-splines. We will define the cubic B-splines over the range [-1, that 1], is, a=-1, b=1. When the interval [-1, 1] is di vided into equal k subintervals, the cubic B-spline centered at zero will like look Figure 2.1. It symmetric is around zero; therefore, we only need to define it for values to the right of zero. (2.3) We number the cubic B-splines as shown in Figure 2.2 by rescaling the original observations x1, x2,(cid:129)c, xn to lie in [-1, 1] using the transformation (2.4) In (2.4) XE[a, b], zE[-1, 1], and the center of itn cubic B-spline is (-1-(4-2i)/k), i=1, 2, •(cid:129)c, k+3, k=1,2, •(cid:129)c . Note that (2.5) B-splines have the following important properties: (2.6) 182 J. JAPAN STATIST. SOC. Vol. 20 No.2 1990 We next briefly discuss density estimation with cardinal B-splines. 3. Density estimation with cardinal B-splines:The EM algorithm and the AIC In conventional statistical inference we often assume that the functional form of the probability density function(p.d.f.)of a random variable is known, and we are inter ested in estimating and/or testing the unknown parameter of this assumed p.d.f. when sample data is given. When we are not sure about the exact functional form of a p.d.f., we can either con sider selecting among several possible alternatives(where the functional form of each p.d.f. is assumed to be known), or consider nonparametric methods of estimating and selecting densities. As an alternative, here we shall propose and consider the estimation of an unknown density f(x)using cardinal B-splines. Let x1, x2,…, xn be an i.i.d. random sample from an unknown p.d.f. f(x). Then the linear approximation to f(x)is given by the following"mixture"of B-splines: (3.1) and Bc(x)'s are cubic cardinal B-splines, and where m=k+3. The log-likelihood is given by (3.2) To obtain the maximum likelihood estimates we apply the EM algorithm of Dempster, et al.(1977)to the"mixture"of B-splines in(3.1). Let x1 x2,…,xn be the observed data from some complete data-set (3.3) {(x,,y1),(x2, y2),…,(xn, yn)}, where y1, y2,…, yxi are missing identifiers. We interpret the data as coming from a mix- ture of distributions, and use fm(x)in(3.1)as an approximation to f(x)to specialize the steps of the EM algorithm to our problem. We have: E-Step: (3.4) where θi(p)is the current estimate of θi, and πi (xf) is the conditional probabilty that the observation vector xf belongs to the ith distribution (with density Bi (x)), which is also called the belonging probability. Note that because of the finite support property of B-splines, we only need to calculate πi(xf)for a given xf at most for 4i's. SELECTING THE NUMBER OF KNOTS IN FITTING CARDINAL B-SPLINES 183 M-Step: (3.5) which is the average of the probabilities that x1, x2,…, xn belongs to ith distribution and estimates the mixing proportion θi. The calculation cycles from one step to the other until we achieve convergence. As the starting values for the EM algorithm, we take the initial values of Bp's to be equal to 1/m. Thus, in this manner we compute the maximum likelihood estimates of the parameters. Now, let θ be the maximum likelihood estimate of θ, and let (3.6) be the corresponding maximized log likelihood function. Since (3.7) then at the convergence of the EM algorithm, AIC for the cardinal B-splines for density estimation becomes (3.8) which is minimized over m to choose the dimension of the approximation. A global convergence result for sequences generated by the EM algorithm is given by Wu and (1983), others in the literature. For the convergence properties of the EM al gorithm for the cardinal B-splines, we refer the reader to Atilgan and Bozdogan (1990). 4. Numerical examples In this section we shall give numerical examples to illustrate the application of car dinal B-splines to density estimation on two real data sets, and show the versatility and efficiency of our proposed new approach by automating the curve fitting process without any subjectivity. All our computations are carried out by using our EMBSPLINE. AIC algorithm for density estimation. The first data set we shall look at is the Chondrite Data (Ahrens (1965)). This is a small data set which consists of the percentages of silica in 22 chondrites, and they are continuous. The chondrite data is widely analyzed in the literature. For this, see, for example, Leonard Good (1978), and Gaskins (1980), Silverman (1981), Ishiguro and Sakamoto (1984). The question of interest here is to estimate the underlying density 184 J. JAPAN STATIST. SOC. Vol. 20 No.2 1990 of this data set, and decide whether or not the underlying probability density has three modes. Using the AIC with cubic cardinal B-splines, and the EM algorithm to obtain the MLE's of θi, our results are shown in Table 1. Looking at Figure 4.1, we see that the minimum of AIC occurs at the dimension of approximation m=11, which gives k=8 as the number of knots. The chi-square ap proach also gives m=11. The density estimate for m=11 is shown in Figure 4.2. We note that this is a trimodal estimate. The estimates of Good and Gaskins(1980)and Silverman(1981)for this data set are also trimodal. They are not surprised about tri modality"…because there are several kinds of chondrites." On the other hand, Leonard (1978)uses a Bayesian density estimation method, and by assuming a normal pr ior for density, obtains a bimodal density estimate for the chondrite data. The second data set we shall consider and analyze is the well-known.Buffalo Snowfall Data. Buffalo snowfall data consists of the 63 yearly values of snow precipitati on in Buffalo, New York(recorded to the nearest tenth of an inch), from 1910-1972. This Table 1. The AIC's for the Chondrite Data when Cardinal B-Splines Are Used for Density Estimation. Note: *Minimum AIC at m=11, where m=k十3=8十3. Figure 4.1. The Plot of AIC(m)vs. m of B-Spline Density Estimate for the Chondrite Data. SELECTING THE NUMBER OF KNOTS IN FITTING CARDINAL B-SPLINES 185 x Figure 4.2. B-Spline Density Estimate for the Chondrite Data, m=11, Table 2. The AIC's for the Buffalo Snowfall Data (1910- 1972) when Cardinal B-Splines Are Used for Density Estimation. Note: *Minimum AIC at m=10, where m=k十3=7十3. data has been analyzed by Parzen(1979), Tapia and Thompson(1978)and others. For further references, we shall refer the reader to the important paper of Parzen(1979). Pr e vious probability density estimates of Buffalo snowfall indicate that it has either unimodal or lrimodal densities. Using the EM algorithm to obtain the MLE of the Bc's of cubic B-splines, our AIC values for m=4,5,…,20 are given in Table 2. Looking at Figure 4.3, we see that the minimum of AIC is attained at m=10. Since we are fitting cubic B-splines, then k, the number of knots in this case is equal to 7. With AIC we chose m=10 as the best dimension of the approximation to the true underlyi ng 186 J. JAPAN STATIST. SOC. Vol. 20 No.2 1990 Figure 4.3. The Plot of AIC (m) vs. m of B-Splines Estimate for the Buffalo Snowfall Data from 1910-1972. Figure 4.4. B-Spline Density Estimate for the Buffalo Snowfall Data from 1910-1972, m=10. Range of the data before transformatio n to (-1,1)is: 25 to 126.4. probability density estimate of the Buffalo snowfall data from 1910-1972. The B-splines estimate for m=10 is shown in Figure 4.4, which is a trimodal estimate. Figure 4.5 shows the B-spline density estimates for the Buffalo snowfall data when the dimension of the approximations are taken to be m=6,10,12, and 17, respectively. As our last example, suppose now we add 7 more extra observations to the Buffalo snowfa皿data and reanalyze this data set again. These seven extra snowfall data for Buffalo, New York were obtained for the years(1973-1980)from the Climatological Data, National Summary, and added to the original(1910-1972)data set. The results of B- spline estimates are presented for this augmented data set in Table 3, since the original data set(1910-1972)gave the controversial trimodal estimate. Looking at Figure 4.6, we see that AIC attains its minimum value at m=9. The corresponding B-spline estimate for this dimension is shown in Figure 4.7. We see that this time the estimate is unimodal. This seems to suggest that a change of regime has occurred with this enlarged data set. SELECTING THE NUMBER OF KNOTS IN FITTING CARDINAL B-SPLINES 187 (a) (b) (c) (a) Figure 4.5. B-Spline Estimates for the Buffalo Snowfall Data from 1910- 1972,(a)m=6,(b)m=10,(c)m=12,and(d)m=17, respectively Table 3. The AIC's for the Buffalo Snowfall Data(1910- 1980)when Cardinal B-Splines Are Used for Density Estimation. Note: *Maximum AIC at m=9, where m=k十3=6十3. The  above density estimates we obtained from the snowfall data assumes that the observations are time independent. In fact, all the previous analyses of this data was done with this assumption. However, when we take the observations as a time series and fit a Box-Jenkins(1970)ARIMA model, based on the parsimony considerations we choose the model(100)x(001)16. That is, we choose the model (4.1) (1-326B)Zt=(1十 …787BIB)ac十56.055 s.d.(.1223) (.1268) (3.592) 188 J. JAPAN STATIST. SOC. Vol. 20 No.2 1990 Figure 4.6. The Plot of AIC(m)vs. m of B-Spline Estimate for the Buffalo Snowfall Data from 1910-1980. Figure 4.7. B-Spline Density Estimate for the Buffalo Snowfall Data from 1910-1980,m=9. Range of the data before the transforma tion to (-1,1)is:25 to 198.2. given in Atilgan(1983, p.187). In(4.1), at's are uncorrelated random variables with mean zero and constant variance. This result is suggesting that the tri-modality we ob served above in our density estimate for the original snowfall data from 1910-1972 is due to seasonality. Indeed, when we fit the model(1,0,0)to the snowfall data and estimat e adensity f0r the residual, we obtain again a trimodal density. For the enlarged data from 1910-1980, the time series model is: (4.2) (1-.351B)Zt=(1十.334B16)ac十54.493. s.d.(.1143) (.1601) (3.965) This shows that we still have a model with periodicity of 16 years, but now the significance of the parameter corresponding to 16 years periodicity is weaker than before. This may be due to the fact that all the additional observations fall to the intermediate to high le vels of snowfall(68-190 inches);hence missing low level snowfall interval. These results suggest that it would be interesting to analyze with the extra new ob-

Description:
B-splines with equally spaced knots (cardinal B-splines) provide a simple, The EM algorithm is developed for a "mixture" of B-splines to obtain the maxi .. reported elsewhere. Acknowledgements. We•@ thank. Professor. Emanual.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.