A Class Of Separate-Type Estimators For Population Mean In Stratified Sampling Using Known Parameters Under Non-Response 1Manoj K. Chaudhary, 2Rajesh Singh , and 3Mukesh Kumar, 4Rakesh K. Shukla and 5Florentin Smarandache 1,2,3,4Department of Statistics, Banaras Hindu University Varanasi-221005, INDIA 5Chair of Department of Mathematics, University of New Mexico, Gallup, USA [email protected],[email protected],[email protected], [email protected] [email protected] ABSTRACT The objective of the present paper is to propose a family of separate-type estimators of population mean in stratified random sampling in presence of non- response based on the family of estimators proposed by Khoshnevisan et al. (2007). Under simple random sampling without replacement (SRSWOR) the expressions of bias and mean square error (MSE) up to the first order of approximation are derived. The comparative study of the family with respect to usual estimator has been discussed. The expressions for optimum sample sizes of the strata in respect to cost of the survey have also been derived. An empirical study is carried out to shoe the properties of the estimators. 2000 Mathematics Subject Classification: 62D05 Key words and phrases : Stratified sampling, separate-type estimator, non- response, auxiliary information, mean square error. 1. INTRODUCTION Upadhyaya and Singh (1999) have suggested the class of estimators in simple random sampling using some known population parameter(s) of an auxiliary variable. These estimators have been extended by Kadilar and Cingi (2003) for stratified random sampling. In an attempt to improve the estimators, Kadilar and Cingi (2005), Shabbir and Gupta (2005, 2006) and Singh and Vishwakarma (2008) have suggested new ratio estimators in stratified random sampling. Using power transformation Singh et al. (2008) have suggested a class of estimators adapting the estimators developed by Kadilar and Cingi (2003). Koyuncu and Kadilar (2008, 2009) have proposed a family of combined-type estimators in stratified random sampling based on the family of estimators proposed by Khoshnevisan et al. (2007). Singh et al. (2008) suggested some exponential ratio type estimators in stratified random sampling. Recently Koyuncu and Kadilar (2010) have suggested a family of estimators in stratified random sampling following Diana (1993) and Kadilar and Cingi (2003). Let Y andX be the study and auxiliary variables respectively, with respective population means Y andX . Khoshnevisan et al. (2007) have proposed a family of estimators for population mean using known values of some population parameters in simple random sampling (SRS) given by aX +b g t = y (1.1) α(ax+b)+(1−α)(aX +b) where a ≠ 0, b are either real numbers or functions of known parameters of the auxiliary variable X . In this paper, we have proposed a family of separate-type estimators of population mean in stratified random sampling in presence of non-response on study variable adapting the above family of estimators. The properties of the proposed family of estimators in comparison with usual estimators have been discussed. The expressions for optimum sample sizes of the strata with respect to cost of the survey have been obtained. 2. SAMPLING STRATEGY AND ESTIMATION PROCEDURES Let us consider a finite population of size,N , is divided into kstrata. Let N be the size of ithstratum(i =1,2,...,k) and a sample of size n is drawn from the i i k ithstratum using SRSWOR scheme such that∑n = n. It is assumed that the i i=1 non-response is detected on study variable Y only and auxiliary variable X is free from non-response. * Let y and x are the unbiased estimators of population means Y and i i i X respectively, for the ithstratum, given as i n y +n y * y = i1 ni1 i2 ui2 (2.1) i n i where y and y are the means based on n response units and u non- ni1 ui2 i1 i2 response units of sub sample selected from n non-response units respectively. i2 x be the sample mean based on n units. i i Therefore an unbiased estimator of population mean Y is given by k * ∑ * y = p y (2.2) st i i i=1 and variance of the estimator is expressed as V(y* ) = ∑k 1 − 1 p2S2 +∑k (ki −1)W p2S2 (2.3) st n N i yi n i2 i yi2 i=1 i=1 i where S2 and S2 are respectively the mean-squares of entire group and non- yi yi2 n response group of study variable in the population for the ithstratum. k = i2 , i u i2 N N p = i and W =non-response rate of the ithstratum in the population= i2 . i N i2 N i 2.1 SUGGESTED FAMILY OF ESTIMATORS Adapting the idea of Khoshnevisan et al. (2007), we propose a family of separate-type estimators of population meanY , given by k T = ∑ pT* (2.4) S i i i=1 aX +b g where T* = y* i (2.5) i i α(axi +b)+(1−α)(aXi +b) Obviously, T is biased forY . Therefore, bias and MSE of T can be S S obtained on using large sample approximations. Let y* =Y (1+e ) ; x = X (1+e ) i i 0 i i 1 such that E(e )= E(e )= 0 and 0 1 ( ) * ( ) V y (k −1)W S2 E e2 = i = f C2 + i i2 Yi2 , 0 2 i Yi 2 Y n Y i i i ( ) E(e2)= V xi = f C2 , 1 2 i Xi X i ( ) * Cov y ,x E(e e )= i i = f ρC C , 0 1 2 i i Yi Xi Y X i i N −n S2 S2 where f = i i , C2 = Yi , C2 = Xi , S2 be the mean-square of entire i N n Yi 2 Xi 2 Xi Y X i i i i group of auxiliary variable in the population for the ithstratum and ρis the i correlation coefficient between Y and X in the ithstratum. Expressing the estimator T in terms of e ande , we get S 0 1 k T = ∑ p Y (1+e )[1+αλe ]−g (2.6) S i i 0 i 1 i=1 aX where λ = i . i aX +b i Suppose αλe < 1 so that [1+αλe ]−g is expandable. Expanding the right i 1 i 1 hand side of the equation (2.6) up to the first order of approximation, we obtain (T −Y)= ∑k Y e −gαλe + g(g +1)α2λ2e2 − gαλe e (2.7) S i 0 i 1 2 i 1 i 0 1 i=1 Taking expectations of both sides of (2.7), we get the bias of T up to the S first order of approximation, as k g(g +1) B(T )= ∑ p f Y α2λ2C2 −αλgρC C (2.8) S i i i 2 i Xi i i Yi Xi i=1 Squaring both side of equation (2.7) and taking expectations on both sides of this equation, we get the MSE(T ) to the first order of approximation as given below: S MSE(TS)= ∑k pi2fiYi2(CY2i +α2λi2g2CX2i −2αλigρiCYiCXi)+ (kin−1)Wi2SY2i2 (2.9) i=1 i 2.2 SOME SPECIAL CASES Case 1: If we putα=1, a =1, b = 0 and g =1 in equation (2.4), we get k X T = ∑p y* i (2.10) S i i x i=1 i which is separate ratio estimator of population mean Y under non-response. Case 2: If α=1, a =1, b = 0 and g = −1, the equation (2.4) gives k x T = ∑p y* i (2.11) S i i X i=1 i which is separate product estimator of population mean Y under non-response. Case 3: If we takeα= 0, a = 0,b = 0 andg = 0, the equation (2.4) provides k ∑ * T = p y (2.12) S i i i=1 which is the usual estimator of population meanY under non-response. Similarly, we can obtain the various existing estimators of the family under non-response on different choices ofα, a, b and g[ See Khoshnevisan et al. (2007)]. 2.3 OPTIMUM CHOICE OF α In order to obtain the optimum α we minimizeMSE(T ) with respect to S α . Differentiating MSE(T ) with respect to α and equating the derivative to S zero, we get the normal equation ∂MSE(TS) = ∑k p2 f Y2[2αλ2C2 −2αλgρC C ]=0 (2.13) ∂α i i i i Xi i i Yi Xi i=1 k ∑ p2 f Y2ρC C i i i i Yi Xi ⇒ α = i=1 (2.14) (opt) k λg∑ p2 f Y2C2 i i i i Xi i=1 Thus the equation (2.14) provides the value of α at which MSE(T )would S be minimum. 2.4 OPTIMUM n WITH RESPECT TO COST OF THE SURVEY i Let C be the cost per unit of selecting n units, C be the cost per unit in i0 i i1 enumerating n units and C be the cost per unit of enumerating u units. Then i1 i2 i2 the total cost for the ithstratum is given by C =C n +C n +C u ∀ i =1,2,...,k (2.15) i i0 i i1 i1 i2 i2 Now, we consider the average cost per stratum W E(C )= n C +C W +C i2 (2.16) i i i0 i1 i1 i2 k i Thus the total cost over all the strata is given by k C = ∑E(C ) 0 i i=1 k W = ∑n C +C W +C i2 (2.17) i i0 i1 i1 i2 k i=1 i Let us consider the function φ= MSE(T )+µC (2.18) S 0 where µ is Lagrangian multiplier. Differentiating the equation (2.18) with respect to n and k respectively and equating the derivatives to zero, we get the i i following normal equations: ∂φ = − pi2 [Y2(C2 +α2λ2g2C2 −2αλgρC C )+(k −1)W S2 ]+ ∂n n2 i Yi i Xi i i Yi Xi i i2 Yi2 i i W µC +C W +C i2 =0 (2.19) i0 i1 i1 i2 k i ∂φ p2W S2 W = i i2 Yi2 −µnC i2 =0 (2.20) ∂k n i i2 k2 i i i From the equations (2.19) and (2.20), we have p Y2(C2 +α2λ2g2C2 −2αλgρC C )+(k −1)W S2 n = i i Yi i Xi i i Yi Xi i i2 Yi2 (2.21) i W µ C +C W +C i2 i0 i1 i1 i2 k i k p S and µ= i i Yi2 (2.22) n C i i2 Putting the value of the µ from equation (2.22) into the equation (2.21), we get C B k = i2 i (2.23) i(opt) S A Yi2 i where A = C +C W i i0 i1 i1 2( ) and B = Y C2 +α2λ2g2C2 −2αλgρC C −W S2 i i Yi i Xi i i Yi Xi i2 Yi2 On substituting k into equation (2.21), n can be expressed as i(opt) i ( ) C BW S p B2 + i2 i i2 Yi2 i i A n = i (2.24) i C AW S µ A2 + i2 i i2 Yi2 i B i The µin terms of total cost C can be obtained by putting the values of 0 k and n from equations (2.23) and (2.24) respectively into equation (2.17) as i(opt) i 1 k ( ) ∑ µ= p AB + C W S (2.25) C i i i i2 i2 Yi2 0 i=1 Thus the n can be expressed in terms of the total cost C as i 0 ( ) C BW S p B2 + i2 i i2 Yi2 C i i A n = 0 i (2.26) i(opt) k ( ) ∑ C AW S p AB + C W S i i i i2 i2 Yi2 A2 + i2 i i2 Yi2 i=1 i B i The optimum values of n and k can be obtained by the expressions i i (2.26) and (2.23) respectively. 3. EMPIRICAL STUDY In this section, we use the data set in Koyuncu and Kadilar (2009). The data concerning the number of teachers as study variable and the number of students as auxiliary variable in both primary and secondary school for 923 districts at 6 regions (as 1: Marmara, 2: Agean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, 6: East and Southeast Anatolia) in Turkey in 2007 (Source: Ministry of Education Republic of Turkey). Table 1: Stratum means, Mean Squares and Correlation Coefficients Stratu n S N i Y X S S S ρ Yi2 i i i Yi Xi XYi i m No. 12 3 703.7 20804.5 30486.75 25237153.5 .93 1 883.835 440 7 1 4 9 1 2 6 11 2 413.0 15180.76 .99 2 9211.79 644.922 9747942.85 200 7 1 0 9 6 10 2 573.1 14309.3 1033.46 27549.69 28294397.0 .99 3 400 3 9 7 0 7 7 4 4 17 3 424.6 18218.93 14523885.5 .98 4 9478.85 810.585 405 0 8 6 1 3 3 20 2 527.0 .98 5 5569.95 403.654 8497.776 3393591.75 180 5 2 3 9 20 3 393.8 12997.5 23094.14 15864573.9 .96 6 711.723 300 1 9 4 9 1 7 5 * Table 2: Percent Relative Efficiency (P.R.E.) of T with respect to y at S st α = 0.9317, a =1and b=1 (opt) W k P.R.E.(T ) i2 i S 2.0 1319.17 2.5 1153.15 0.1 3.0 1026.92 3.5 927.72 2.0 1026.92 2.5 847.70 0.2 3.0 726.55 3.5 639.18 2.0 847.70 2.5 679.59 0.3 3.0 573.20 3.5 499.81 2.0 726.55 2.5 573.20 0.4 3.0 480.16 3.5 417.69 4. CONCLUSION In this paper, a class of separate-type estimators for estimating the population mean in stratified random sampling under non-response has been proposed and method of finding the optimum estimator of the family has also been discussed. We have derived the expressions for optimum sample sizes in respect to cost of the survey. From the Table 2, it is easily observed that the optimum estimator of the proposed class T provides better estimate than usual S * estimator y under non-response. It is also observed that the relative efficiency st of T decreases with increase in the non-response rate W andk . S i2 i REFERENCES Diana, G. (1993) : A class of estimators of the population mean in stratified random sampling, Statistica, 53 (1), 59–66. Hansen, M. H. and Hurvitz, W. N. (1946) : The problem of non-response in sample surveys, Journal of American Statistical Association, 41, 517-529. Kadilar, C. and Cingi, H. (2003) : Ratio estimator in stratified sampling, Biometrical Journal, 45,218-225. Kadilar, C. and Cingi, H. (2005) : A new estimator in stratified random sampling, Communication in Statistics Theory and Methods, 34, 597-602. Khoshnevisan, M., Singh, R., Chauhan, P., Sawan, N. and Smarandache, F. (2007) : A general family of estimators for estimating population mean using known value of some population parameter(s), Far East Journal of Theoretical Statistics, 22, 181-191. Koyuncu, N. and Kadilar, C. (2008) : Ratio and product estimators in stratified random sampling, Journal of Statistical Planning and Inference, 139, 8, 2552- 2558. Koyuncu, N., Kadilar, C. (2009) : Family of estimators of population mean using two auxiliary variables in stratified random sampling, Communication in Statistics Theory and Methods, 38:14, 2398-2417.