SOME INVARIANCE PRINCIPLES RELATING TO JACKKNIFING
AND THEIR ROLE IN SEQUENTIAL ANALYSIS*

by

Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 1055
February 1976

ABSTRACT

For a broad class of jackknife statistics, it is shown that the Tukey estimator of the variance converges almost surely to its population counterpart. Moreover, the usual invariance principles (relating to the Wiener process approximations) usually filter through jackknifing under no extra regularity conditions. These results are then incorporated in providing a bounded-length (sequential) confidence interval and a preassigned-strength sequential test for a suitable parameter based on jackknife estimators.

AMS 1970 Classification Nos: 60B10, 62E20, 62G35, 62L10

Key Words & Phrases: Almost sure convergence, Brownian motions, bounded-length confidence intervals, invariance principles, preassigned-strength sequential tests, jackknife, stopping time, tightness, Tukey estimator of the variance, U-statistics, von Mises' functionals.

* Work supported by the Air Force Office of Scientific Research, A.F.S.C., U.S.A.F., Grant No. AFOSR-74-2736.

1. Introduction

The jackknife estimator, originally introduced for bias reduction by Quenouille and extended by Tukey for robust interval estimation, has been studied thoroughly by a host of workers during the past twenty years; along with some extensive bibliography, detailed studies are made in the recent papers of Arvesen (1969), Schucany, Gray and Owen (1971), Gray, Watkins and Adams (1972) and Miller (1974). One of the major concerns is the asymptotic normality of the studentized form of the jackknife statistics. The purpose of the present investigation is to focus on some deeper asymptotic properties of jackknife estimators and to stress their role in the asymptotic theory of sequential procedures based on jackknifing. Specifically, the almost sure convergence of the Tukey estimator of the variance is established here for a broad class of jackknife statistics, and their asymptotic normality results are strengthened to appropriate (weak as well as strong) invariance principles yielding Wiener process approximations for the tail-sequence of jackknife estimators. These results are then incorporated in providing (i) a bounded-length (sequential) confidence interval and (ii) a prescribed-strength sequential test for a suitable parameter based on jackknife estimators.

Section 2 deals with the preliminary notions along with some new interpretations of the jackknife estimator and the Tukey estimator of the variance. For convenience of presentation, in Section 3, we adopt the framework of Arvesen (1969) and present the invariance principles for jackknifing U-statistics. Section 4 displays parallel results for general estimators. The two sequential problems of estimation and testing are treated in the last two sections of the paper.

2. Preliminary Notions

Let $\{X_i, i \ge 1\}$ be a sequence of independent and identically distributed random vectors (i.i.d.r.v.) with a distribution function $F$, defined on the $p(\ge 1)$-dimensional Euclidean space $R^p$. Let

(2.1)    $\hat\theta_n = T_n(X_1,\ldots,X_n)$,  $n \ge 1$,

be a sequence of estimators of a parameter $\theta$, such that

(2.2)    $E\hat\theta_n = \theta + n^{-1}\beta_1 + n^{-2}\beta_2 + \cdots$  $(\Rightarrow E(\hat\theta_n - \theta) = O(n^{-1}))$,

where the $\beta_j$ are unknown constants. Let us denote by

(2.3)    $\hat\theta_{n-1}^i = T_{n-1}(X_1,\ldots,X_{i-1},X_{i+1},\ldots,X_n)$,  $1 \le i \le n$,

(2.4)    $\hat\theta_{n,i} = n\hat\theta_n - (n-1)\hat\theta_{n-1}^i$,  $1 \le i \le n$,

(2.5)    $\hat\theta_n^* = n^{-1}\sum_{i=1}^n \hat\theta_{n,i} = n\hat\theta_n - (n-1)\bigl\{n^{-1}\sum_{i=1}^n \hat\theta_{n-1}^i\bigr\}$.

Then, $\hat\theta_n^*$ is termed the jackknife estimator of $\theta$. Clearly, by (2.2), (2.3) and (2.5),

(2.6)    $E\hat\theta_n^* = \theta - \beta_2/n(n-1) + \cdots$  $(\Rightarrow E(\hat\theta_n^* - \theta) = O(n^{-2}))$.

Further, let

(2.7)    $V_n^* = \frac{1}{n-1}\sum_{i=1}^n [\hat\theta_{n,i} - \hat\theta_n^*]^2 = (n-1)\sum_{i=1}^n \Bigl(\hat\theta_{n-1}^i - \frac{1}{n}\sum_{j=1}^n \hat\theta_{n-1}^j\Bigr)^2$.

In support of the Tukey conjecture, various authors have shown that, under suitable regularity conditions,

(2.8)    $n^{1/2}(\hat\theta_n^* - \theta)/[V_n^*]^{1/2} \to N(0,1)$ in distribution, as $n \to \infty$.
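As a quick numerical illustration of (2.3)-(2.8) (an editorial sketch added here, not part of the original development; the choice of $T_n$ and of the data below is an arbitrary stand-in), the pseudovalues, the jackknife estimator and the Tukey variance estimator may be computed as follows:

```python
import numpy as np

def jackknife(x, T):
    """Pseudovalues (2.4), jackknife estimator (2.5) and
    Tukey variance estimator (2.7) for a symmetric estimator T."""
    n = len(x)
    theta_n = T(x)                                           # \hat\theta_n of (2.1)
    loo = np.array([T(np.delete(x, i)) for i in range(n)])   # \hat\theta_{n-1}^i of (2.3)
    pseudo = n * theta_n - (n - 1) * loo                     # (2.4)
    theta_star = pseudo.mean()                               # (2.5)
    v_star = np.sum((pseudo - theta_star) ** 2) / (n - 1)    # (2.7)
    return theta_star, v_star

# T_n = sample variance with divisor n: E T_n = sigma^2 - sigma^2/n,
# so the leading bias term in (2.2) is beta_1 = -sigma^2
rng = np.random.default_rng(1976)
x = rng.normal(size=200)                                     # theta = sigma^2 = 1
theta_star, v_star = jackknife(x, lambda y: np.mean((y - y.mean()) ** 2))
z = np.sqrt(len(x)) * (theta_star - 1.0) / np.sqrt(v_star)   # approx N(0,1) by (2.8)
```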
Our contention is to obtain stronger results concerning (i) the almost sure (a.s.) convergence of $V_n^*$ and (ii) Wiener process approximations for the tail-sequence $\{\hat\theta_k^* - \theta;\ k \ge n\}$.

For simplicity, we assume that $p = 1$, i.e., the $X_i$ are real-valued and $R = (-\infty, \infty)$. For every $n(\ge 1)$, the order statistics corresponding to $X_1,\ldots,X_n$ are denoted by $X_{n,1} \le \cdots \le X_{n,n}$. Let $C_n = C(X_{n,1},\ldots,X_{n,n};\ X_{n+1}, X_{n+2},\ldots)$ be the $\sigma$-field generated by the order statistics of $(X_1,\ldots,X_n)$ and by $X_{n+j}$, $j \ge 1$. Then, $C_n$ is non-increasing in $n(\ge 1)$. Note that, given $C_n$, the $X_{n+j}$, $j \ge 1$, are all held fixed, while $(X_1,\ldots,X_n)$ are interchangeable and assume all possible permutations of $(X_{n,1},\ldots,X_{n,n})$ with equal conditional probability $(n!)^{-1}$. Hence,

(2.9)    $E(\hat\theta_{n-1} \mid C_n) = n^{-1}\sum_{i=1}^n \hat\theta_{n-1}^i$  a.e.,

and, therefore, by (2.5) and (2.9),

(2.10)    $\hat\theta_n^* = n\hat\theta_n - (n-1)E(\hat\theta_{n-1} \mid C_n) = \hat\theta_n + (n-1)E\{\hat\theta_n - \hat\theta_{n-1} \mid C_n\}$  a.e.

Clearly, if $\{\hat\theta_n, C_n\}$ is a reverse martingale, then $\hat\theta_n^* = \hat\theta_n$; otherwise, the jackknifing consists in adding up the correction factor

(2.11)    $\hat\theta_n^* - \hat\theta_n = (n-1)E\{(\hat\theta_n - \hat\theta_{n-1}) \mid C_n\}$.

It follows by similar arguments that

(2.12)    $V_n^* = n(n-1)\,\mathrm{Var}\{(\hat\theta_n - \hat\theta_{n-1}) \mid C_n\}$
          $= n(n-1)\bigl\{E[(\hat\theta_n - \hat\theta_{n-1})^2 \mid C_n] - (E[(\hat\theta_n - \hat\theta_{n-1}) \mid C_n])^2\bigr\}$.

These interpretations and representations of jackknifing are quite useful for our subsequent results.
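The representations (2.10)-(2.12) are exact, not merely asymptotic, and are easy to verify numerically: given $C_n$, the conditional moments of $\hat\theta_n - \hat\theta_{n-1}$ reduce to averages over the $n$ equally likely leave-one-out configurations. The following sketch (again an editorial illustration; the estimator and the data are arbitrary) checks that (2.10)-(2.12) reproduce the definitions (2.5) and (2.7):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(size=40)
n = len(x)

T = lambda y: np.log(np.mean(y))          # a smooth, symmetric (biased) estimator
theta_n = T(x)
loo = np.array([T(np.delete(x, i)) for i in range(n)])   # \hat\theta_{n-1}^i

# definitions (2.4), (2.5), (2.7)
pseudo = n * theta_n - (n - 1) * loo
theta_star = pseudo.mean()
v_star = np.sum((pseudo - theta_star) ** 2) / (n - 1)

# conditional-moment representations: given C_n, E(. | C_n) is the
# average over the n leave-one-out terms, by the permutation argument
d = theta_n - loo                          # realizations of theta_n - theta_{n-1}
theta_star_rep = theta_n + (n - 1) * d.mean()   # (2.10)/(2.11)
v_star_rep = n * (n - 1) * d.var()              # (2.12); var with divisor n

assert np.isclose(theta_star, theta_star_rep)
assert np.isclose(v_star, v_star_rep)
```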
For further reduction of bias, higher-order jackknife estimators have been proposed by various workers (see [6, 12]). The second-order jackknife estimator (see (4.20) of [12]) can be written in our notation as

(2.13)    $\hat\theta_n^{**} = \tfrac{1}{2}\bigl\{n^2\hat\theta_n - 2(n-1)^2 E(\hat\theta_{n-1} \mid C_n) + (n-2)^2 E(\hat\theta_{n-2} \mid C_n)\bigr\}$,

and a similar expression holds for the higher-order jackknifing. In fact, we have also a second interpretation for $\hat\theta_n^*$, $\hat\theta_n^{**}$, etc., from the weighted least squares point of view. In most of the cases to follow, we shall observe that, for some $\nu_1 > 0$ and real $\nu_2$,

(2.14)    $\mathrm{Var}\{n^{1/2}(\hat\theta_n - \theta)\} = \nu_1 + n^{-1}\nu_2 + O(n^{-2})$,

(2.15)    $\mathrm{Cov}\{n^{1/2}(\hat\theta_n - \theta),\ (n-1)^{1/2}(\hat\theta_{n-1} - \theta)\} = [(n-1)/n]^{1/2}\,[\nu_1 + n^{-1}\nu_2 + O(n^{-2})]$.

Also, by (2.2), $E(\hat\theta_n - \theta) = n^{-1}\beta_1 + n^{-2}\beta_2 + \cdots$. Thus, neglecting terms of $O(n^{-2})$, the weighted least squares method of estimating $(\theta, \beta_1)$ consists in minimizing

(2.16)    $n^2(\hat\theta_n - \theta - n^{-1}\beta_1)^2 - 2n(n-1)(\hat\theta_n - \theta - n^{-1}\beta_1)(\hat\theta_{n-1} - \theta - (n-1)^{-1}\beta_1) + n(n-1)(\hat\theta_{n-1} - \theta - (n-1)^{-1}\beta_1)^2$

with respect to $\theta$ and $\beta_1$; the simultaneous equations yield

(2.17)    $\hat\theta_w = n\hat\theta_n - (n-1)\hat\theta_{n-1}$,

and our $\hat\theta_n^* = E(\hat\theta_w \mid C_n)$. Similarly, by (2.14)-(2.15), on writing

(2.18)    $Z_k = k\hat\theta_k - (k-1)\hat\theta_{k-1}$,  $k \ge 2$,

we have

(2.19)    $\mathrm{Var}(Z_n) = \nu_1 + O(n^{-1})$,  $\mathrm{Cov}(Z_n, Z_{n-1}) = O(n^{-1})$.

Thus, $Z_n$, $Z_{n-1}$ are asymptotically uncorrelated and, by (2.2), $EZ_k = \theta - \beta_2/k(k-1) + O(k^{-3})$. Hence, considering

(2.20)    $\sum_{k=n-1}^{n} \bigl(Z_k - \theta + [k(k-1)]^{-1}\beta_2\bigr)^2$

and minimizing with respect to $(\theta, \beta_2)$, we obtain the weighted least squares solution

(2.21)    $\hat\theta_w = \tfrac{1}{2}\{nZ_n - (n-2)Z_{n-1}\}$.

In fact, we obtain the same solution by working with $(\hat\theta_n, \hat\theta_{n-1}, \hat\theta_{n-2})$ and applying the weighted least squares method directly on it. Again, $\hat\theta_n^{**} = E(\hat\theta_w \mid C_n)$.

In general, if we want to reduce the bias to $O(n^{-k-1})$, for some $k \ge 1$, then we need to work with $(\hat\theta_n,\ldots,\hat\theta_{n-k})$, and the $k$-th order jackknife estimator is the conditional expectation (given $C_n$) of the weighted least squares estimator of $\theta$ obtained on neglecting the $\beta_{k+j}$, $j \ge 1$, in (2.2) and the remainder terms in (2.14)-(2.15). Thus, we have the following.

Theorem 2.1. Under assumptions (2.14)-(2.15), the jackknife estimators (of different orders) are the conditional expectations (given $C_n$) of the weighted least squares estimators obtained from the original (biased) estimators for the successive sample sizes.

Since, in Section 3, we shall be concerned with jackknifing functions of U-statistics, we find it convenient to introduce the following notations at this stage. Let $\phi(X_1,\ldots,X_m)$, symmetric in its $m$ arguments, be a Borel-measurable kernel of degree $m(\ge 1)$, and consider the regular functional (estimable parameter)

(2.22)    $\xi = \xi(F) = \int\cdots\int \phi(x_1,\ldots,x_m)\,dF(x_1)\cdots dF(x_m)$,  $F \in \mathcal{F}$,

where $\mathcal{F} = \{F : |\xi(F)| < \infty\}$. Then, for $n \ge m$, the U-statistic corresponding to $\xi$ is defined by

(2.23)    $U_n = \binom{n}{m}^{-1}\sum_{C_{n,m}} \phi(X_{i_1},\ldots,X_{i_m})$,  $C_{n,m} = \{1 \le i_1 < \cdots < i_m \le n\}$.

Note that $EU_n = \xi(F)$, $\forall n \ge m$. Further, let

(2.24)    $\phi_h(x_1,\ldots,x_h) = E\phi(x_1,\ldots,x_h, X_{h+1},\ldots,X_m)$,  $\zeta_h = \mathrm{Var}[\phi_h(X_1,\ldots,X_h)]$,

for $h = 0,\ldots,m$, where $\zeta_0 = 0$ and $\phi_0 = \xi$. We assume that

(2.25)    $0 < \zeta_1,\ \zeta_m < \infty$  (where $\zeta_1 \le \cdots \le \zeta_m$).

3. Invariance Principles Relating to Jackknifing U-Statistics

We shall be concerned here mainly with the following two types of estimators:

(i) Let $g$, defined on $R$, have a bounded second derivative in some neighborhood of $\xi$, and let

(3.1)    $\hat\theta_n = g(U_n)$,  $n \ge m$,  so that $\theta = g(\xi)$.

(ii) For some positive integer $q$, we have

(3.2)    $\hat\theta_n = \sum_{s=0}^{q} \alpha_{n,s} U_n^{(s)}$,  $n \ge m$,

where $U_n^{(0)} = U_n$ is an unbiased estimator of $\theta = \xi(F)$,

(3.3)    $\alpha_{n,0} = 1 + n^{-1}c_{0,1} + n^{-2}c_{0,2} + O(n^{-3})$,

$U_n^{(1)},\ldots,U_n^{(q)}$ are appropriate U-statistics with expectations $\theta_1,\ldots,\theta_q$ (unknown but finite), and

(3.4)    $\alpha_{n,h} = n^{-h}c_{h,0} + O(n^{-h-1})$,  $h \ge 1$;

the $c_{s,j}$ are real constants, possibly some being equal to 0. The classical von Mises' (1947) differentiable statistical function (corresponding to $\xi(F)$) is a special case of (3.2) with $q = m$ and $c_{0,1} = -\binom{m}{2}$.
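By way of a concrete instance of case (i) (our example, not drawn from the paper), take the kernel $\phi(x_1,x_2) = (x_1-x_2)^2/2$ of degree $m = 2$, so that $U_n$ is the sample variance with $\xi(F) = \sigma^2$, and take $g(u) = \log u$, which has a bounded second derivative near any $\xi > 0$. The sketch below computes $U_n$ as in (2.23) and jackknifes $g(U_n)$:

```python
import numpy as np
from itertools import combinations

def u_statistic(x, phi, m):
    """U_n of (2.23): average of the kernel over all m-subsets."""
    return np.mean([phi(*c) for c in combinations(x, m)])

phi = lambda a, b: 0.5 * (a - b) ** 2   # degree m = 2, xi(F) = sigma^2
g = np.log                               # smooth g with bounded g'' near xi > 0

rng = np.random.default_rng(7)
x = rng.normal(scale=2.0, size=30)
n = len(x)

theta_n = g(u_statistic(x, phi, 2))      # \hat\theta_n = g(U_n), case (3.1)
loo = np.array([g(u_statistic(np.delete(x, i), phi, 2)) for i in range(n)])
pseudo = n * theta_n - (n - 1) * loo
theta_star = pseudo.mean()
v_star = np.sum((pseudo - theta_star) ** 2) / (n - 1)
# By Theorem 3.1 below, v_star should approach m^2 * zeta_1 * g'(xi)^2
# as n grows, where zeta_1 = Var(phi_1(X)) and phi_1(a) = E phi(a, X).
```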
First, we consider the following.

Theorem 3.1. For $\{\hat\theta_n\}$ defined by (3.1) or (3.2)-(3.4),

(3.5)    $V_n^* \to \gamma^2$  a.s., as $n \to \infty$,

where

(3.6)    $\gamma^2 = m^2\zeta_1[g'(\xi)]^2$ or $m^2\zeta_1$, according as $\hat\theta_n$ is defined by (3.1) or by (3.2)-(3.4).

Proof. In the context of weak convergence of the Rao-Blackwell estimator of distribution functions, Bhattacharyya and Sen (1974) have shown that, under (2.25),

(3.7)    $n(n-1)E[(U_{n-1} - U_n)^2 \mid C_n] \to m^2\zeta_1$  a.s., as $n \to \infty$.

On the other hand, as in Section 2,

(3.8)    $n(n-1)E[(U_{n-1} - U_n)^2 \mid C_n] = (n-1)\sum_{i=1}^n [U_{n-1}^i - U_n]^2$,

where the $U_{n-1}^i$ are defined as in (2.3) with $T_{n-1}$ being replaced by $U_{n-1}$. Hence, from (3.7) and (3.8), we obtain that

(3.9)    $(n-1)\sum_{i=1}^n [U_{n-1}^i - U_n]^2 \to m^2\zeta_1$  a.s., as $n \to \infty$.

Further, $\{U_n, C_n, n \ge m\}$ is a reverse martingale, so that $U_n \to \xi(F)$ a.s., as $n \to \infty$, and hence, by (3.9),

(3.10)    $\max_{1 \le i \le n} |U_{n-1}^i - \xi(F)| \to 0$  a.s., as $n \to \infty$.

First, consider the case of (3.1). Then, we have

(3.11)    $\hat\theta_{n-1} - \hat\theta_n = g(U_{n-1}) - g(U_n) = g'(U_n)[U_{n-1} - U_n] + \tfrac{1}{2}g''(hU_n + (1-h)U_{n-1})[U_{n-1} - U_n]^2$,  $0 < h < 1$.

Note that $E[U_{n-1} \mid C_n] = U_n$ a.e., and, further, by (3.7), (3.8), (3.10) and the boundedness of $g''$ (in a neighborhood of $\xi$), we have

(3.12)    $|E\{g''(hU_n + (1-h)U_{n-1})[U_{n-1} - U_n]^2 \mid C_n\}| \le \max_{1 \le i \le n} |g''(hU_n + (1-h)U_{n-1}^i)|\,\bigl\{n^{-1}\sum_{i=1}^n [U_{n-1}^i - U_n]^2\bigr\} = O(n^{-2})$  a.s., as $n \to \infty$.

Hence, we obtain from (3.11) and (3.12) that

(3.13)    $E(\hat\theta_{n-1} - \hat\theta_n \mid C_n) = O(n^{-2})$  a.s., as $n \to \infty$.

Similarly,

(3.14)    $\mathrm{Var}\{(\hat\theta_{n-1} - \hat\theta_n) \mid C_n\} = E\{(\hat\theta_{n-1} - \hat\theta_n)^2 \mid C_n\} + O(n^{-4})$  a.s.
          $= n^{-1}\sum_{i=1}^n [g(U_{n-1}^i) - g(U_n)]^2 + O(n^{-4})$  a.s., as $n \to \infty$.

Again, as in (3.12), for some $0 < h < 1$,