Table Of ContentA Class Of Separate-Type Estimators For Population
Mean In Stratified Sampling Using Known Parameters
Under Non-Response
1Manoj K. Chaudhary, 2Rajesh Singh , and 3Mukesh Kumar, 4Rakesh K. Shukla
and 5Florentin Smarandache
1,2,3,4Department of Statistics, Banaras Hindu University
Varanasi-221005, INDIA
5Chair of Department of Mathematics, University of New Mexico,
Gallup, USA
1ritamanoj15@gmail.com,2rsinghstat@gmail.com,3Mukesh.stat@gamil.com,
4rimgyanu@gamil.com 5smarand@unm.edu
ABSTRACT
The objective of the present paper is to propose a family of separate-type
estimators of population mean in stratified random sampling in presence of non-
response based on the family of estimators proposed by Khoshnevisan et al.
(2007). Under simple random sampling without replacement (SRSWOR) the
expressions of bias and mean square error (MSE) up to the first order of
approximation are derived. The comparative study of the family with respect to
usual estimator has been discussed. The expressions for optimum sample sizes
of the strata in respect to cost of the survey have also been derived. An
empirical study is carried out to shoe the properties of the estimators.
2000 Mathematics Subject Classification: 62D05
Key words and phrases : Stratified sampling, separate-type estimator, non-
response, auxiliary information, mean square error.
1. INTRODUCTION
Upadhyaya and Singh (1999) have suggested the class of estimators in
simple random sampling using some known population parameter(s) of an
auxiliary variable. These estimators have been extended by Kadilar and Cingi
(2003) for stratified random sampling. In an attempt to improve the estimators,
Kadilar and Cingi (2005), Shabbir and Gupta (2005, 2006) and Singh and
Vishwakarma (2008) have suggested new ratio estimators in stratified random
sampling. Using power transformation Singh et al. (2008) have suggested a class
of estimators adapting the estimators developed by Kadilar and Cingi (2003).
Koyuncu and Kadilar (2008, 2009) have proposed a family of combined-type
estimators in stratified random sampling based on the family of estimators
proposed by Khoshnevisan et al. (2007). Singh et al. (2008) suggested some
exponential ratio type estimators in stratified random sampling. Recently
Koyuncu and Kadilar (2010) have suggested a family of estimators in stratified
random sampling following Diana (1993) and Kadilar and Cingi (2003).
Let Y andX be the study and auxiliary variables respectively, with
respective population means Y andX . Khoshnevisan et al. (2007) have
proposed a family of estimators for population mean using known values of some
population parameters in simple random sampling (SRS) given by
aX +b g
t = y (1.1)
α(ax+b)+(1−α)(aX +b)
where a ≠ 0, b are either real numbers or functions of known parameters of the
auxiliary variable X .
In this paper, we have proposed a family of separate-type estimators of
population mean in stratified random sampling in presence of non-response on
study variable adapting the above family of estimators. The properties of the
proposed family of estimators in comparison with usual estimators have been
discussed. The expressions for optimum sample sizes of the strata with respect
to cost of the survey have been obtained.
2. SAMPLING STRATEGY AND ESTIMATION PROCEDURES
Let us consider a finite population of size,N , is divided into kstrata. Let
N be the size of ithstratum(i =1,2,...,k) and a sample of size n is drawn from the
i i
k
ithstratum using SRSWOR scheme such that∑n = n. It is assumed that the
i
i=1
non-response is detected on study variable Y only and auxiliary variable X is free
from non-response.
*
Let y and x are the unbiased estimators of population means Y and
i i i
X respectively, for the ithstratum, given as
i
n y +n y
*
y = i1 ni1 i2 ui2 (2.1)
i n
i
where y and y are the means based on n response units and u non-
ni1 ui2 i1 i2
response units of sub sample selected from n non-response units respectively.
i2
x be the sample mean based on n units.
i i
Therefore an unbiased estimator of population mean Y is given by
k
* ∑ *
y = p y (2.2)
st i i
i=1
and variance of the estimator is expressed as
V(y* ) = ∑k 1 − 1 p2S2 +∑k (ki −1)W p2S2 (2.3)
st n N i yi n i2 i yi2
i=1 i=1 i
where S2 and S2 are respectively the mean-squares of entire group and non-
yi yi2
n
response group of study variable in the population for the ithstratum. k = i2 ,
i u
i2
N N
p = i and W =non-response rate of the ithstratum in the population= i2 .
i N i2 N
i
2.1 SUGGESTED FAMILY OF ESTIMATORS
Adapting the idea of Khoshnevisan et al. (2007), we propose a family of
separate-type estimators of population meanY , given by
k
T = ∑ pT* (2.4)
S i i
i=1
aX +b g
where T* = y* i (2.5)
i i
α(axi +b)+(1−α)(aXi +b)
Obviously, T is biased forY . Therefore, bias and MSE of T can be
S S
obtained on using large sample approximations. Let
y* =Y (1+e ) ; x = X (1+e )
i i 0 i i 1
such that E(e )= E(e )= 0 and
0 1
( )
*
( ) V y (k −1)W S2
E e2 = i = f C2 + i i2 Yi2 ,
0 2 i Yi 2
Y n Y
i i i
( )
E(e2)= V xi = f C2 ,
1 2 i Xi
X
i
( )
*
Cov y ,x
E(e e )= i i = f ρC C ,
0 1 2 i i Yi Xi
Y X
i i
N −n S2 S2
where f = i i , C2 = Yi , C2 = Xi , S2 be the mean-square of entire
i N n Yi 2 Xi 2 Xi
Y X
i i i i
group of auxiliary variable in the population for the ithstratum and ρis the
i
correlation coefficient between Y and X in the ithstratum.
Expressing the estimator T in terms of e ande , we get
S 0 1
k
T = ∑ p Y (1+e )[1+αλe ]−g (2.6)
S i i 0 i 1
i=1
aX
where λ = i .
i aX +b
i
Suppose αλe < 1 so that [1+αλe ]−g is expandable. Expanding the right
i 1 i 1
hand side of the equation (2.6) up to the first order of approximation, we obtain
(T −Y)= ∑k Y e −gαλe + g(g +1)α2λ2e2 − gαλe e (2.7)
S i 0 i 1 2 i 1 i 0 1
i=1
Taking expectations of both sides of (2.7), we get the bias of T up to the
S
first order of approximation, as
k g(g +1)
B(T )= ∑ p f Y α2λ2C2 −αλgρC C (2.8)
S i i i 2 i Xi i i Yi Xi
i=1
Squaring both side of equation (2.7) and taking expectations on both sides of this
equation, we get the MSE(T ) to the first order of approximation as given below:
S
MSE(TS)= ∑k pi2fiYi2(CY2i +α2λi2g2CX2i −2αλigρiCYiCXi)+ (kin−1)Wi2SY2i2 (2.9)
i=1 i
2.2 SOME SPECIAL CASES
Case 1: If we putα=1, a =1, b = 0 and g =1 in equation (2.4), we get
k X
T = ∑p y* i (2.10)
S i i
x
i=1 i
which is separate ratio estimator of population mean Y under non-response.
Case 2: If α=1, a =1, b = 0 and g = −1, the equation (2.4) gives
k x
T = ∑p y* i (2.11)
S i i
X
i=1 i
which is separate product estimator of population mean Y under non-response.
Case 3: If we takeα= 0, a = 0,b = 0 andg = 0, the equation (2.4) provides
k
∑ *
T = p y (2.12)
S i i
i=1
which is the usual estimator of population meanY under non-response.
Similarly, we can obtain the various existing estimators of the family
under non-response on different choices ofα, a, b and g[ See Khoshnevisan et
al. (2007)].
2.3 OPTIMUM CHOICE OF α
In order to obtain the optimum α we minimizeMSE(T ) with respect to
S
α . Differentiating MSE(T ) with respect to α and equating the derivative to
S
zero, we get the normal equation
∂MSE(TS) = ∑k p2 f Y2[2αλ2C2 −2αλgρC C ]=0 (2.13)
∂α i i i i Xi i i Yi Xi
i=1
k
∑ p2 f Y2ρC C
i i i i Yi Xi
⇒ α = i=1 (2.14)
(opt) k
λg∑ p2 f Y2C2
i i i i Xi
i=1
Thus the equation (2.14) provides the value of α at which MSE(T )would
S
be minimum.
2.4 OPTIMUM n WITH RESPECT TO COST OF THE SURVEY
i
Let C be the cost per unit of selecting n units, C be the cost per unit in
i0 i i1
enumerating n units and C be the cost per unit of enumerating u units. Then
i1 i2 i2
the total cost for the ithstratum is given by
C =C n +C n +C u ∀ i =1,2,...,k (2.15)
i i0 i i1 i1 i2 i2
Now, we consider the average cost per stratum
W
E(C )= n C +C W +C i2 (2.16)
i i i0 i1 i1 i2 k
i
Thus the total cost over all the strata is given by
k
C = ∑E(C )
0 i
i=1
k W
= ∑n C +C W +C i2 (2.17)
i i0 i1 i1 i2 k
i=1 i
Let us consider the function
φ= MSE(T )+µC (2.18)
S 0
where µ is Lagrangian multiplier. Differentiating the equation (2.18) with respect
to n and k respectively and equating the derivatives to zero, we get the
i i
following normal equations:
∂φ = − pi2 [Y2(C2 +α2λ2g2C2 −2αλgρC C )+(k −1)W S2 ]+
∂n n2 i Yi i Xi i i Yi Xi i i2 Yi2
i i
W
µC +C W +C i2 =0 (2.19)
i0 i1 i1 i2 k
i
∂φ p2W S2 W
= i i2 Yi2 −µnC i2 =0 (2.20)
∂k n i i2 k2
i i i
From the equations (2.19) and (2.20), we have
p Y2(C2 +α2λ2g2C2 −2αλgρC C )+(k −1)W S2
n = i i Yi i Xi i i Yi Xi i i2 Yi2 (2.21)
i
W
µ C +C W +C i2
i0 i1 i1 i2 k
i
k p S
and µ= i i Yi2 (2.22)
n C
i i2
Putting the value of the µ from equation (2.22) into the equation (2.21),
we get
C B
k = i2 i (2.23)
i(opt) S A
Yi2 i
where A = C +C W
i i0 i1 i1
2( )
and B = Y C2 +α2λ2g2C2 −2αλgρC C −W S2
i i Yi i Xi i i Yi Xi i2 Yi2
On substituting k into equation (2.21), n can be expressed as
i(opt) i
( )
C BW S
p B2 + i2 i i2 Yi2
i i A
n = i (2.24)
i
C AW S
µ A2 + i2 i i2 Yi2
i B
i
The µin terms of total cost C can be obtained by putting the values of
0
k and n from equations (2.23) and (2.24) respectively into equation (2.17) as
i(opt) i
1 k ( )
∑
µ= p AB + C W S (2.25)
C i i i i2 i2 Yi2
0 i=1
Thus the n can be expressed in terms of the total cost C as
i 0
( )
C BW S
p B2 + i2 i i2 Yi2
C i i A
n = 0 i (2.26)
i(opt) k ( )
∑ C AW S
p AB + C W S
i i i i2 i2 Yi2 A2 + i2 i i2 Yi2
i=1 i B
i
The optimum values of n and k can be obtained by the expressions
i i
(2.26) and (2.23) respectively.
3. EMPIRICAL STUDY
In this section, we use the data set in Koyuncu and Kadilar (2009). The
data concerning the number of teachers as study variable and the number of
students as auxiliary variable in both primary and secondary school for 923
districts at 6 regions (as 1: Marmara, 2: Agean, 3: Mediterranean, 4: Central
Anatolia, 5: Black Sea, 6: East and Southeast Anatolia) in Turkey in 2007
(Source: Ministry of Education Republic of Turkey).
Table 1: Stratum means, Mean Squares and Correlation
Coefficients
Stratu n S
N i Y X S S S ρ Yi2
i i i Yi Xi XYi i
m No.
12 3 703.7 20804.5 30486.75 25237153.5 .93
1 883.835 440
7 1 4 9 1 2 6
11 2 413.0 15180.76 .99
2 9211.79 644.922 9747942.85 200
7 1 0 9 6
10 2 573.1 14309.3 1033.46 27549.69 28294397.0 .99
3 400
3 9 7 0 7 7 4 4
17 3 424.6 18218.93 14523885.5 .98
4 9478.85 810.585 405
0 8 6 1 3 3
20 2 527.0 .98
5 5569.95 403.654 8497.776 3393591.75 180
5 2 3 9
20 3 393.8 12997.5 23094.14 15864573.9 .96
6 711.723 300
1 9 4 9 1 7 5
*
Table 2: Percent Relative Efficiency (P.R.E.) of T with respect to y at
S st
α = 0.9317, a =1and b=1
(opt)
W k P.R.E.(T )
i2 i S
2.0 1319.17
2.5 1153.15
0.1 3.0 1026.92
3.5 927.72
2.0 1026.92
2.5 847.70
0.2 3.0 726.55
3.5 639.18
2.0 847.70
2.5 679.59
0.3 3.0 573.20
3.5 499.81
2.0 726.55
2.5 573.20
0.4 3.0 480.16
3.5 417.69
4. CONCLUSION
In this paper, a class of separate-type estimators for estimating the
population mean in stratified random sampling under non-response has been
proposed and method of finding the optimum estimator of the family has also
been discussed. We have derived the expressions for optimum sample sizes in
respect to cost of the survey. From the Table 2, it is easily observed that the
optimum estimator of the proposed class T provides better estimate than usual
S
*
estimator y under non-response. It is also observed that the relative efficiency
st
of T decreases with increase in the non-response rate W andk .
S i2 i
REFERENCES
Diana, G. (1993) : A class of estimators of the population mean in stratified
random sampling, Statistica, 53 (1), 59–66.
Hansen, M. H. and Hurvitz, W. N. (1946) : The problem of non-response in
sample surveys, Journal of American Statistical Association, 41, 517-529.
Kadilar, C. and Cingi, H. (2003) : Ratio estimator in stratified sampling,
Biometrical Journal, 45,218-225.
Kadilar, C. and Cingi, H. (2005) : A new estimator in stratified random
sampling, Communication in Statistics Theory and Methods, 34, 597-602.
Khoshnevisan, M., Singh, R., Chauhan, P., Sawan, N. and Smarandache, F.
(2007) : A general family of estimators for estimating population mean using
known value of some population parameter(s), Far East Journal of
Theoretical Statistics, 22, 181-191.
Koyuncu, N. and Kadilar, C. (2008) : Ratio and product estimators in stratified
random sampling, Journal of Statistical Planning and Inference, 139, 8, 2552-
2558.
Koyuncu, N., Kadilar, C. (2009) : Family of estimators of population mean
using two auxiliary variables in stratified random sampling, Communication in
Statistics Theory and Methods, 38:14, 2398-2417.