MATHEMATICS OF SAMPLING

by WALTER A. HENDRICKS

Visiting lecturer to the Statistical Summer Session held at the Virginia Polytechnic Institute, August 5 to September 5, 1947. Dr. Boyd Harshbarger, Statistician, was in charge of the Statistical Summer Session.

Lithoprinted in U.S.A.
EDWARDS BROTHERS, INC.
ANN ARBOR, MICHIGAN
1949

MATHEMATICS OF SAMPLING*
Walter A. Hendricks

The theory of sampling is essentially the theory of errors of measurement originally developed for the physical sciences. The mathematical treatment of samples in modern times represents an adaptation of classical error theory, with some few modifications and additions, to a variety of other problems. It is clear that these problems have much in common. A physical measurement, or the average of several such measurements, is subject to errors of observation; an individual unit in a sample, or the average for several such units, deviates in like manner from a corresponding true average in the universe of inquiry. If sampling is random, that is, if every unit in the universe has an equal chance of being included in the sample and every possible combination of units has an equal chance of occurring, such deviations behave in the same way as random errors of observation in a series of physical measurements.

When a series of measurements is made on a constant physical quantity, those measurements are subject to errors of observation that are random in character so long as there is no consistent bias on the part of the observer or the instrument with which the measurements are made.
Classical Error Theory

The fraction of times an error of a size between e and e + de will occur can be represented by the equation

\[ dF_e = f(e)\,de \qquad (1) \]

In this equation f(e) represents the height of the ordinate of the frequency curve of errors at any specified value of e. The quantity f(e)de thus represents the area of a rectangular element under the curve of width de at that value of e. It is clear that the relative frequencies with which errors of different sizes occur are then represented by specified areas under the frequency curve. As we are speaking in terms of fractions of the total frequency, we can say that equation (1) represents the probability of obtaining an error between e and e + de. Throughout the present discussion probability will be defined in terms of the relative frequency with which a specified event may be expected to occur.

The probability with which a specified set of two independent errors will occur is equal to the product of their separate probabilities of occurring. For example, if the probability of occurrence of an error between e₁ and e₁ + de₁ is

\[ dF_{e_1} = f(e_1)\,de_1 \qquad (2) \]

and the probability of occurrence of an error between e₂ and e₂ + de₂ is

\[ dF_{e_2} = f(e_2)\,de_2 \qquad (3) \]

then the probability that in a set of two measurements this particular combination will occur is given by

\[ dF_{e_1,e_2} = f(e_1)f(e_2)\,de_1\,de_2 \qquad (4) \]

In equation (4) the quantity f(e₁)f(e₂) represents the height of an ordinate of a frequency surface in 3-dimensional space, erected at the point (e₁,e₂) on the base.

*These notes summarize a course of lectures given during the Statistical Summer Session at Virginia Polytechnic Institute, August 5-September 5, 1947.
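The multiplication rule behind equation (4) can be checked numerically. The following sketch (not part of the original text; NumPy is assumed, and all names are illustrative) estimates the probability that each of two independent errors falls in a small interval, and compares the joint frequency with the product of the separate frequencies:

```python
import numpy as np

# Two independent series of random errors (standard normal here,
# though equations (1)-(4) hold for any error distribution).
rng = np.random.default_rng(0)
trials = 1_000_000
e1 = rng.standard_normal(trials)   # first error of each pair
e2 = rng.standard_normal(trials)   # second error of each pair

# Small intervals (e, e + de) for each error.
a1, a2, de = 0.5, -1.0, 0.1
in1 = (e1 >= a1) & (e1 < a1 + de)
in2 = (e2 >= a2) & (e2 < a2 + de)

p1 = in1.mean()                    # ~ f(a1) de, as in equation (2)
p2 = in2.mean()                    # ~ f(a2) de, as in equation (3)
p_joint = (in1 & in2).mean()       # ~ f(a1) f(a2) de1 de2, equation (4)

print(p_joint, p1 * p2)            # agree to within sampling error
```

The empirical joint frequency and the product of the marginal frequencies agree to within the sampling error of the simulation, which is exactly what equation (4) asserts for independent errors.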
Equation (4) thus represents an element of volume under the surface obtained by multiplying the height f(e₁)f(e₂) by the area de₁de₂ at the base.

So far nothing has been said about the mathematical form of the distribution of errors. Classical error theory assumes a normal distribution, that is, equation (1) is assumed to be of the form

\[ dF_e = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{e^2}{2\sigma^2}}\,de \qquad (5) \]

In many cases this equation describes the distributions actually found in practice reasonably well, but it is by no means universally applicable. It will be discussed in detail here because of its historical importance and its undisputed utility in a large number of practical applications.

If the normal law is assumed to hold, equation (4) may be written in the form

\[ dF_{e_1,e_2} = \frac{1}{2\pi\sigma^2}\, e^{-\frac{e_1^2+e_2^2}{2\sigma^2}}\,de_1\,de_2 \qquad (6) \]

As stated previously, this represents an element of volume under the frequency surface obtained by multiplying the height of an ordinate erected at the point (e₁,e₂) by the area de₁de₂. However, it is more convenient from the standpoint of mathematical manipulation to work with an element of volume defined in a different way. If we let

\[ \frac{e_1^2+e_2^2}{\sigma^2} = \chi^2 \qquad (7) \]

equation (6) can be written in terms of χ², which is called chi square. The ordinates of the frequency surface obviously have the same height for all values of e₁ and e₂ for which chi square has a constant value. Consequently combinations of 2 errors which yield the same value of chi square have equal probabilities of occurring. Instead of discussing the probability of occurrence of a specified set of errors, it therefore is generally more useful to discuss the probability of occurrence of a specified value of chi square.

To transform equation (6) into an equation giving the probability of occurrence of a value of chi square between χ² and χ² + dχ², it is necessary only to note that equation (7) is the equation of a circle in the e₁,e₂ plane with a radius equal to σχ.
All ordinates of the frequency surface erected along the circumference of the circle have the same height. The area of the circle is equal to πσ²χ². As χ² is increased by an amount equal to dχ², the area of the circle is increased by πσ²dχ².

The element of volume under the frequency surface consisting of a cylindrical shell bounded by all possible ordinates erected along the circumferences of the two circles is equal to

\[ \frac{1}{2\pi\sigma^2}\, e^{-\frac{\chi^2}{2}}\,\pi\sigma^2\,d\chi^2 \quad\text{or}\quad \frac{1}{2}\, e^{-\frac{\chi^2}{2}}\,d\chi^2 \]

This relationship follows at once from the theorems of elementary geometry which show that the volume of a solid like the cylindrical shell under discussion is equal to the product of the altitude and the area of the base. The interesting feature of this relationship is that the volume of the cylindrical shell represents the fraction of times that a set of two errors of measurement will yield a value of chi square between χ² and χ² + dχ². We have, therefore, for sets of two measurements,

\[ dF_{\chi^2} = \frac{1}{2}\, e^{-\frac{\chi^2}{2}}\,d\chi^2 \qquad (8) \]

This same type of reasoning can be followed for sets of any specified number of errors, although the geometrical configurations become more difficult to visualize. In general, if the probability of occurrence of a particular set of n measurements is

\[ dF = \frac{1}{\sigma^n (2\pi)^{\frac{n}{2}}}\, e^{-\frac{e_1^2+e_2^2+\cdots+e_n^2}{2\sigma^2}}\,de_1\,de_2 \cdots de_n \qquad (9) \]

and

\[ \chi^2 = \frac{e_1^2+e_2^2+\cdots+e_n^2}{\sigma^2} \qquad (10) \]

then the probability of occurrence of a particular value of chi square is given by

\[ dF_{\chi^2} = \frac{1}{2^{\frac{n}{2}}\,\Gamma\!\left(\frac{n}{2}\right)}\, e^{-\frac{\chi^2}{2}}\,(\chi^2)^{\frac{n}{2}-1}\,d\chi^2 \qquad (11) \]

When n = 2, equation (11) reduces to equation (8) as a special case. The quantity n in equation (11) is called the number of degrees of freedom.
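Equations (10) and (11) can be illustrated by simulation. The sketch below (an illustration added here, not from the original text; the seed and sample sizes are arbitrary) builds chi square for n = 5 errors with σ = 1 as in equation (10), and checks the resulting sample against the known moments of the distribution in equation (11), which are a mean of n and a variance of 2n:

```python
import numpy as np

# Chi square for sets of n_df independent normal errors with sigma = 1:
# each row of `errors` is one set e_1 .. e_n, and equation (10) reduces
# to the sum of the squared errors.
rng = np.random.default_rng(1)
n_df, trials = 5, 200_000
errors = rng.standard_normal((trials, n_df))
chi2 = (errors**2).sum(axis=1)          # equation (10) with sigma = 1

# The distribution in equation (11) has mean n and variance 2n.
print(chi2.mean(), chi2.var())          # close to 5 and 10
```

The empirical mean and variance land close to 5 and 10, consistent with the density in equation (11) for five degrees of freedom.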
In this particular case it happens to be exactly equal to the total number of measurements because we are discussing n independent errors.

Equation (11) is one of the most basic formulas of error theory and has extensive applications in the theory of sampling. It is the additive property of chi square that makes equation (11) so useful. If we have a value of chi square computed from equation (10) for n₁ degrees of freedom, another for n₂ degrees of freedom, and so on up to a value for n_p degrees of freedom, each of these values of chi square is distributed separately according to equation (11) with n taking the particular values n₁, n₂, ..., n_p. It is at once apparent that a value of chi square computed from the relation

\[ \chi^2 = \chi_1^2 + \chi_2^2 + \cdots + \chi_p^2 \]

is distributed according to equation (11) with n = n₁ + n₂ + ... + n_p. Furthermore any value of chi square computed for n degrees of freedom can later be broken down into a number of components, each distributed separately according to equation (11), with n taking appropriate values.

This relationship will now be used to derive the distributions of arithmetic means and standard errors estimated from random samples drawn from a normal universe. By definition the population mean μ and squared standard error of a variable X are given by

\[ \mu = E(X) \qquad \sigma^2 = E(X-\mu)^2 \qquad (12) \]

where E is a symbol meaning "expected value of" or "average value of" the quantity in parentheses for the universe. The numerical values of μ and σ are seldom known in advance of taking a sample; generally the sample is drawn for the purpose of obtaining estimates of one or both of these population parameters.
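The additive property of chi square described above can also be checked numerically. This sketch (added for illustration; the parameter choices are arbitrary) sums an independent chi square with 3 degrees of freedom and one with 4, and verifies that the sum behaves like a chi square with 7 degrees of freedom by comparing its sample moments with the theoretical mean 7 and variance 14:

```python
import numpy as np

# Independent chi squares with n1 and n2 degrees of freedom, built
# directly from squared normal errors as in equation (10).
rng = np.random.default_rng(2)
n1, n2, trials = 3, 4, 200_000
x1 = (rng.standard_normal((trials, n1))**2).sum(axis=1)
x2 = (rng.standard_normal((trials, n2))**2).sum(axis=1)
total = x1 + x2

# By the additive property the sum is chi square with n1 + n2 = 7
# degrees of freedom: mean 7, variance 14.
print(total.mean(), total.var())
```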
If we have a random sample of n values of X, these estimates are

\[ \bar{x} = \frac{S(X)}{n} \qquad s^2 = \frac{S(X-\bar{x})^2}{n-1} \qquad (13) \]

We know that n independent errors are associated with the n values of X, any one of these errors being given by e_j = X_j − μ, and that S(e²)/σ² is distributed as chi square with n degrees of freedom. This is the starting point in the analysis. The next step is to write each error in terms of v_j = X_j − x̄ and u = x̄ − μ as follows:

\[ e_1 = v_1 + u, \quad e_2 = v_2 + u, \quad \dots, \quad e_n = v_n + u \qquad (14) \]

By virtue of the fact that S(v) = 0, it is evident that S(e²) = S(v²) + nu². The quantity S(e²)/σ², which is distributed as chi square with n degrees of freedom, may thus be resolved into a component S(v²)/σ² and a second component nu²/σ². The first component is distributed as chi square with n − 1 degrees of freedom while the second is distributed as chi square with one degree of freedom. From equations (13) we have the relation s² = S(v²)/(n−1), or S(v²) = (n−1)s². Consequently the quantity (n−1)s²/σ² is distributed as chi square with n − 1 degrees of freedom. The distribution functions of estimated arithmetic means and standard errors may now be obtained directly. To get the distribution function of u we substitute the quantity nu²/σ² for χ² in the chi square distribution for one degree of freedom, obtaining the equation

\[ dF = \frac{1}{2^{\frac{1}{2}}\,\Gamma\!\left(\frac{1}{2}\right)}\, e^{-\frac{nu^2}{2\sigma^2}} \left(\frac{nu^2}{\sigma^2}\right)^{-\frac{1}{2}} d\!\left(\frac{nu^2}{\sigma^2}\right) \qquad (15) \]

By expressing the differential element in terms of u we obtain

\[ dF_u = \frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\, e^{-\frac{nu^2}{2\sigma^2}}\,du \qquad (16) \]

which shows that u is distributed normally about zero with a standard error equal to σ/√n. In other words, x̄ is distributed normally about the population mean μ with a standard error of σ/√n.
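Two facts from this derivation lend themselves to a direct numerical check (the sketch below is an added illustration; μ, σ, n and the seed are arbitrary choices): the identity S(e²) = S(v²) + nu² holds exactly for any single sample, and over repeated samples the mean x̄ scatters about μ with standard error σ/√n:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 10.0, 2.0, 25

# One sample: verify the identity S(e^2) = S(v^2) + n u^2 exactly.
X = mu + sigma * rng.standard_normal(n)
e = X - mu                 # errors about the population mean
v = X - X.mean()           # deviations about the sample mean
u = X.mean() - mu
lhs = (e**2).sum()
rhs = (v**2).sum() + n * u**2
print(lhs, rhs)            # identical up to rounding

# Many samples: the standard error of the mean is sigma / sqrt(n).
means = (mu + sigma * rng.standard_normal((100_000, n))).mean(axis=1)
print(means.std(), sigma / np.sqrt(n))   # both near 0.4
```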
To obtain the distribution function of s we first substitute (n−1)s²/σ² for χ² in the chi square distribution for n − 1 degrees of freedom, obtaining

\[ dF = \frac{1}{2^{\frac{n-1}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)}\, e^{-\frac{(n-1)s^2}{2\sigma^2}} \left[\frac{(n-1)s^2}{\sigma^2}\right]^{\frac{n-3}{2}} d\!\left[\frac{(n-1)s^2}{\sigma^2}\right] \qquad (17) \]

from which we then obtain

\[ dF_s = \frac{(n-1)^{\frac{n-1}{2}}}{2^{\frac{n-3}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)\sigma^{n-1}}\, e^{-\frac{(n-1)s^2}{2\sigma^2}}\, s^{n-2}\,ds \qquad (18) \]

Equations (16) and (18) will now be used to derive another basic distribution of the theory of errors: the t distribution, where t is defined by the relation

\[ t = \frac{\sqrt{n}\,u}{s} \qquad (19) \]

As demonstrated above, the distribution of u is normal with a standard error of σ/√n. Consequently the quantity √n u/σ is normally distributed with unit standard error. Equation (19), however, involves s rather than σ. When we are dealing with large samples, the distribution of t approaches the normal form, but t deviates considerably from a normal distribution when s is computed from small samples. The exact derivation of the distribution of t is one of the milestones of the theory of errors. The derivation starts from the joint distribution of u and s, which is not difficult to obtain: since u and s are independently distributed, their joint distribution is the product of equations (16) and (18),

\[ dF_{u,s} = \frac{\sqrt{n}\,(n-1)^{\frac{n-1}{2}}}{\sqrt{2\pi}\; 2^{\frac{n-3}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)\sigma^{n}}\, e^{-\frac{nu^2+(n-1)s^2}{2\sigma^2}}\, s^{n-2}\,du\,ds \qquad (20) \]

From this equation we obtain the joint distribution of t and s by letting u = ts/√n, as follows:

\[ dF_{t,s} = \frac{(n-1)^{\frac{n-1}{2}}}{\sqrt{2\pi}\; 2^{\frac{n-3}{2}}\,\Gamma\!\left(\frac{n-1}{2}\right)\sigma^{n}}\, e^{-\frac{\left[(n-1)+t^2\right]s^2}{2\sigma^2}}\, s^{n-1}\,dt\,ds \qquad (21) \]

This equation gives the distribution of t for any specified value of s, and the distribution of t for all possible values of s is obtained by integrating with respect to s from zero to infinity. The quantity in square brackets and dt are treated as constants while the integration is performed and the integral can be readily evaluated in terms of Gamma functions.
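The statistic defined in equation (19) can be simulated directly. The sketch below (an added illustration; sample size and seed are arbitrary) computes t = √n u/s for many samples of n = 6 from a normal universe. The resulting t has n − 1 = 5 degrees of freedom, and its variance is (n−1)/(n−3) = 5/3 rather than 1, showing the heavier tails that appear when s is computed from small samples:

```python
import numpy as np

# t = sqrt(n) * u / s from repeated samples of a normal universe
# with mu = 0 and sigma = 1.
rng = np.random.default_rng(4)
n, trials = 6, 400_000
X = rng.standard_normal((trials, n))
u = X.mean(axis=1)                    # u = xbar - mu
s = X.std(axis=1, ddof=1)             # s from equation (13)
t = np.sqrt(n) * u / s

# A t variable with nu = n - 1 = 5 degrees of freedom has
# variance nu / (nu - 2) = 5/3.
print(t.var())   # near 5/3, noticeably larger than 1
```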
The final result may be written in the form

\[ dF_t = \frac{\Gamma\!\left(\frac{n}{2}\right)}{\sqrt{(n-1)\pi}\;\Gamma\!\left(\frac{n-1}{2}\right)} \left(1+\frac{t^2}{n-1}\right)^{-\frac{n}{2}} dt \qquad (22) \]

Random Sampling in Practice

The formulas given above are of the utmost importance in modern sampling theory. Many sampling fluctuations actually are described quite well by the normal law of errors; that is particularly true of the sampling fluctuations of averages for large samples. In practice we are usually more concerned about the frequency distributions of averages than about the frequency distributions of the individual observations, and it is fortunate that the frequency distributions of averages for large samples are approximated quite well by the normal law even when the sampling fluctuations of the individual observations deviate considerably from the normal law. Furthermore the formula for computing the standard error of an average from the standard error of the individual observations,

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad (23) \]

does not depend upon a normal distribution of errors at all. This can be demonstrated quite readily. Let the average of n independent observations drawn from the same universe be

\[ \bar{x} = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) \qquad (24) \]

then

\[ \Delta\bar{x} = \frac{1}{n}\left(\Delta X_1 + \Delta X_2 + \cdots + \Delta X_n\right) \qquad (25) \]

and

\[ (\Delta\bar{x})^2 = \frac{1}{n^2}\,SS(\Delta X_i)(\Delta X_j) \qquad (26) \]

The average or expected value of (Δx̄)² is the squared standard error or variance of x̄. We have

\[ E(\Delta\bar{x})^2 = \sigma_{\bar{x}}^2 \qquad (27) \]

Similarly the expected value of a term of the kind (ΔX_i)(ΔX_j) is given by

\[ E(\Delta X_i)(\Delta X_j) = \begin{cases} \sigma^2 & \text{when } i = j \\ r\sigma^2 & \text{when } i \neq j \end{cases} \qquad (28) \]

in which r is the coefficient of correlation between the errors in any two individual observations and σ² is the squared standard error or variance of an individual observation. When the errors in the n observations are independent, as they are specified to be in this case, r = 0. Therefore
\[ E(\Delta\bar{x})^2 = \frac{1}{n^2}\left[E(\Delta X_1)^2 + E(\Delta X_2)^2 + \cdots + E(\Delta X_n)^2\right] \qquad (29) \]

or

\[ \sigma_{\bar{x}}^2 = \frac{1}{n^2}\,(n\sigma^2) = \frac{\sigma^2}{n} \qquad (30) \]

These relationships obviously do not depend upon a normal distribution of errors. They do depend upon the condition that the errors be independent; this, however, is assured by the specification that the sampling be random. Furthermore it is understood that the sample of n observations is a sample from an infinite population, that is, from an unlimited supply of possible observations.

At this point it may be well to discuss the particular situation that arises when samples are drawn from a finite population. First of all we may note that it is theoretically possible to compute the true average of such a population simply by including all possible observations in the sample. The standard error of that average would be zero, for repeated samples taken in the same way would clearly include the same observations and yield exactly the same average. That fact itself suggests that averages for samples drawn from a finite population will have smaller sampling errors than averages for samples of the same size from an unlimited or infinite population.

The formula for computing the variance of an average for random samples of n observations can be derived in different ways and written in different forms. The procedure followed here is to regard the finite population of N observations as being itself a random sample from an infinite parent population and to define σ in terms of the variability of individual observations in that infinite population. This may seem somewhat artificial and it may appear to introduce some unnecessary complications, but it is in fact a mathematical model that simplifies the analysis of finite populations considerably. The advantages of such a viewpoint will become clear in subsequent discussions.
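The claim that equation (30) holds for a decidedly non-normal universe is easy to test by simulation. In the sketch below (added for illustration; the exponential universe and the seed are arbitrary choices) the individual observations follow an exponential distribution with σ² = 1, yet the variance of the average is still very nearly σ²/n:

```python
import numpy as np

# Averages of n independent exponential observations. The exponential
# with scale 1 has variance sigma^2 = 1 but is strongly skewed, so any
# agreement with sigma^2 / n cannot be due to normality.
rng = np.random.default_rng(5)
n, trials = 10, 400_000
X = rng.exponential(scale=1.0, size=(trials, n))
means = X.mean(axis=1)

print(means.var(), 1.0 / n)   # both near 0.1
```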
For the problem now at hand it will be shown how the formula for the variance of sample averages for samples of n observations from a finite population of N observations may be derived without difficulty.

Let σ represent the variability of individual observations in the infinite population of which the finite population of N is itself a sample. Let x̄ be the average of a sample of n observations from the finite population, m the average of all N observations in the finite population, and μ the average of the hypothetical infinite parent population. We have

\[ \bar{x} - \mu = (\bar{x} - m) + (m - \mu) \qquad (31) \]

\[ (\bar{x} - \mu)^2 = (\bar{x} - m)^2 + 2(\bar{x} - m)(m - \mu) + (m - \mu)^2 \qquad (32) \]

The expected value of (x̄ − μ)² is simply the variance of averages for random samples of n observations from an infinite population and is equal to σ²/n. The expected value of (x̄ − m)² is the variance of averages for random samples of n observations from the finite population of N. Its value is as yet unknown and may be represented by σ_x̄². The expected value of (m − μ)² is simply the variance of averages for random samples of N observations from the infinite population and is equal to σ²/N. The expected value of 2(x̄ − m)(m − μ) is zero because under conditions of random sampling there would be no correlation between (x̄ − m) and (m − μ). Stated in terms of an equation we have

\[ E(\bar{x} - \mu)^2 = E(\bar{x} - m)^2 + E(m - \mu)^2 \qquad (33) \]
or

\[ \frac{\sigma^2}{n} = \sigma_{\bar{x}}^2 + \frac{\sigma^2}{N} \qquad (34) \]

Solving equation (34) for σ_x̄² gives the required result

\[ \sigma_{\bar{x}}^2 = \sigma^2\left(\frac{1}{n} - \frac{1}{N}\right) \qquad (35) \]

This equation is often written in the alternative form

\[ \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}\cdot\frac{N-n}{N} \qquad (36) \]

When using equations (35) or (36) it is important to remember that σ_x̄² measures the variability of averages for repeated random samples of n observations drawn from a finite population of N observations, but that σ measures the variability of individual observations in the infinite hypothetical parent population of which the finite population of N is itself a sample. In practice the numerical value of σ is generally not known but must be estimated from sample data. This estimate, denoted by s, is computed from a sample of n observations by the formula

\[ s^2 = \frac{S(X-\bar{x})^2}{n-1} \qquad (37) \]

It is computed in this way regardless of whether the sample was drawn from an infinite population or from a finite population because s refers to an infinite population in either case. Even when every possible observation in a finite population is included in the sample, we compute

\[ s^2 = \frac{S(X-\bar{x})^2}{N-1} \qquad (38) \]

This emphasizes the fact that s is an estimate of σ for an infinite parent population and is not intended to measure the variability of individual observations in the finite population. If we require information about the variability of individual observations in the finite population we set n = 1 in the equation

\[ s_{\bar{x}}^2 = \frac{s^2}{n}\cdot\frac{N-n}{N} \qquad (39) \]

because a single observation may be regarded as an average derived from one observation. If s is computed from equation (38), this process yields

\[ s^2\,\frac{N-1}{N} = \frac{S(X-\bar{x})^2}{N} \qquad (40) \]

which gives precisely the variability of individual observations in the finite population. The use of s as an estimate of σ for a hypothetical infinite parent population thus leads to no inconsistency.
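The finite-population formula can be verified exactly by enumeration. In the sketch below (added for illustration; the six population values are an arbitrary toy example) every possible sample of n = 2 from a finite population of N = 6 is listed, and the variance of the sample averages about the population mean is compared with equation (36), using the divisor N − 1 of equation (38) for s²:

```python
import numpy as np
from itertools import combinations

# A small finite population, N = 6, sampled without replacement.
pop = np.array([2.0, 5.0, 7.0, 8.0, 12.0, 14.0])
N, n = len(pop), 2
s2 = pop.var(ddof=1)            # equation (38): divisor N - 1

# Enumerate every possible sample of n observations and compute the
# variance of the sample averages about the true population mean.
means = [np.mean(c) for c in combinations(pop, n)]
var_xbar = np.mean([(m - pop.mean())**2 for m in means])

# Equation (36) with s^2 in place of sigma^2.
print(var_xbar, (s2 / n) * (N - n) / N)   # agree exactly
```

Because every sample is enumerated rather than simulated, the two quantities agree to machine precision, not merely to within sampling error.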
It is a useful device that will be used again in future discussions of more complicated problems. The properties of s as an estimate of σ deserve some detailed attention. In much statistical literature, particularly that of older vintage, we find ...
