EXACT L^2-DISTANCE FROM THE LIMIT FOR QUICKSORT KEY COMPARISONS (EXTENDED ABSTRACT)

PATRICK BINDJEME    JAMES ALLEN FILL

arXiv:1201.6445v1 [math.PR] 31 Jan 2012

Abstract. Using a recursive approach, we obtain a simple exact expression for the L^2-distance from the limit in Régnier's [5] classical limit theorem for the number of key comparisons required by QuickSort. A previous study by Fill and Janson [1] using a similar approach found that the d_2-distance is of order between n^{-1} log n and n^{-1/2}, and another by Neininger and Rüschendorf [4] found that the Zolotarev ζ_3-distance is of exact order n^{-1} log n. Our expression reveals that the L^2-distance is asymptotically equivalent to (2 n^{-1} ln n)^{1/2}.

Date: January 26, 2012.
Research supported by the Acheson J. Duncan Fund for the Advancement of Research in Statistics.

1. Introduction, review of related literature, and summary

We consider Hoare's [3] QuickSort sorting algorithm applied to an infinite stream of iid (independent and identically distributed) uniform random variables U_1, U_2, .... QuickSort chooses the first key U_1 as the "pivot", compares each of the other keys to it, and then proceeds recursively to sort both the keys smaller than the pivot and those larger than it. If, for example, the initial round of comparisons finds U_2 < U_1, then U_2 is used as the pivot in the recursive call to the algorithm that sorts the keys smaller than U_1, because it is the first element in the sequence U_1, U_2, ... which is smaller than U_1. In a natural and obvious way, a realization (requiring infinite time) of the algorithm produces an infinite rooted binary search tree which with probability one has the completeness property that each node has two child-nodes.

Essentially the same algorithm can of course be applied to the truncated sequence U_1, U_2, ..., U_n for any finite n, where the recursion ends by declaring that a list of size 0 or 1 is already sorted. Let K_n denote the number of key comparisons required by QuickSort to sort U_1, U_2, ..., U_n. Then, with the way we have set things up, all the random variables K_n are defined on a common probability space, and K_n is nondecreasing in n. Indeed, K_n − K_{n−1} is simply the cost of inserting U_n into the usual (finite) binary search tree formed from U_1, ..., U_{n−1}.

In this framework, Régnier [5] used martingale techniques to establish the following L^p-limit theorem; she also proved almost sure convergence. We let μ_n := E K_n.

Theorem 1.1 (Régnier [5]). There exists a random variable T satisfying

    Y_n := (K_n − μ_n)/(n + 1)  →  T  in L^p

for every finite p.
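Note (illustration only; not part of the argument): The normalization in Theorem 1.1 is easy to explore by simulation. The following Python sketch, whose function names are ours, counts QuickSort key comparisons on iid uniform keys and centers and scales them as in the theorem; for the centering it uses the well-known closed form μ_n = 2(n + 1)H_n − 4n for E K_n, which is quoted later in the proof of Lemma 4.3.

    import random

    def quicksort_comparisons(keys):
        """Number of key comparisons when QuickSort uses the first element as pivot."""
        if len(keys) <= 1:
            return 0
        pivot, rest = keys[0], keys[1:]
        smaller = [k for k in rest if k < pivot]
        larger = [k for k in rest if k > pivot]
        # len(rest) comparisons against the pivot, then recurse on the two sublists
        return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)

    def mu(n):
        """E[K_n] = 2(n+1)H_n - 4n (well-known closed form; see the proof of Lemma 4.3)."""
        return 2 * (n + 1) * sum(1.0 / j for j in range(1, n + 1)) - 4 * n

    random.seed(1)
    n = 1000
    Y = [(quicksort_comparisons([random.random() for _ in range(n)]) - mu(n)) / (n + 1)
         for _ in range(100)]
    print(min(Y), max(Y), sum(Y) / len(Y))  # values centered near 0, with spread of order 1

The sampled values of Y_n fluctuate around 0 on a scale of order 1, consistent with convergence to a nondegenerate limit T.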
Rösler [6] characterized the distribution of Régnier's limiting T as the unique fixed point of a certain distributional transformation, but he also described explicitly how to construct a random variable having the same distribution as T. We will describe his explicit construction in equivalent terms, but first we need two paragraphs of notation.

The nodes of the complete infinite binary search tree are labeled in the natural binary way: the root gets an empty label written ε here, the left (respectively, right) child is labeled 0 (resp., 1), the left child of node 0 is labeled 00, etc. We write Θ := ∪_{0 ≤ k < ∞} {0,1}^k for the set of all such labels. If V_θ denotes the key inserted at node θ ∈ Θ, let L_θ (resp., R_θ) denote the largest key smaller than V_θ (resp., smallest key larger than V_θ) inserted at any ancestor of θ, with the exceptions L_θ := 0 and R_θ := 1 if the specified ancestor keys do not exist.

Further, for each node θ, define

    φ_θ := R_θ − L_θ,   U_θ := φ_{θ0}/φ_θ,
(1.1)    G_θ := φ_θ C(U_θ) = φ_θ − 2 φ_θ ln φ_θ + 2 φ_{θ0} ln φ_{θ0} + 2 φ_{θ1} ln φ_{θ1},

where for 0 < x < 1 we define

(1.2)    C(x) := 1 + 2x ln x + 2(1 − x) ln(1 − x).

Let 1 ≤ p < ∞. The d_p-metric is the metric on the space of all probability distributions with finite pth absolute moment defined by

    d_p(F_1, F_2) := inf ‖X_1 − X_2‖_p,

where we take the infimum of L^p-distances over all pairs of random variables X_1 and X_2 (defined on the same probability space) with respective marginal distributions F_1 and F_2. By the d_p-distance between two random variables we mean the d_p-distance between their distributions.

We are now prepared to state Rösler's main result. Note: Here and later, results have been adjusted slightly as necessary to utilize the same denominator n + 1 (rather than n) that Régnier used.

Theorem 1.2 (Rösler [6]). For any finite p, the infinite series Y = Σ_{j=0}^∞ Σ_{|θ|=j} G_θ converges in L^p, and the sequence Y_n = (K_n − μ_n)/(n + 1) converges in the d_p-metric to Y.

Of course it follows from Theorems 1.1–1.2 that T and Y have the same distribution. The purpose of the present extended abstract is to show that in fact T = Y and to provide a simple explicit expression for the L^2-distance between Y_n and Y valid for every n; this is done in Theorem 1.4 below.

We are aware of only two previous studies of the rate of convergence of Y_n to Y, and both of those concern certain distances between distributions rather than between random variables. The first study, by Fill and Janson [1], provides upper and lower bounds on d_p(Y_n, Y) for general p; we choose to focus here on d_2.

Theorem 1.3 (Fill and Janson [1]). There is a constant c > 0 such that for any n ≥ 1 we have

    c n^{-1} ln n ≤ d_2(Y_n, Y) < 2 n^{-1/2}.

To our knowledge, the gap between the rates (log n)/n and n^{-1/2} has not been narrowed. Neininger and Rüschendorf [4] used the Zolotarev ζ_3-metric and found that the correct rate in that metric is n^{-1} log n, but their techniques are not sufficiently sharp to obtain ζ_3(Y_n, Y) ~ c̃ n^{-1} ln n for some constant c̃.

In our main Theorem 1.4, proved using the same recursive approach as in Fill and Janson [1], we find not only the lead-order asymptotics for the L^2-distance ‖Y_n − Y‖_2, but in fact an exact expression for general n. It is interesting to note that the rate n^{-1/2}(log n)^{1/2} for L^2-convergence is larger even than the upper-bound rate of n^{-1/2} for d_2-convergence from Theorem 1.3.

Theorem 1.4 (main theorem). For n ≥ 0 we have

    ‖Y_n − Y‖_2^2 = (n + 1)^{-1} [ 2 H_n + 1 + 6/(n + 1) − 4 Σ_{k=n+1}^∞ k^{-2} ] = 2 (ln n)/n + O(1/n),

where H_n := Σ_{j=1}^n j^{-1} is the nth harmonic number and the asymptotic expression holds as n → ∞.

The remainder of this extended abstract is devoted to a proof of Theorem 1.4, which is completed in Section 5.
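Note (numerical check, not part of the proof): The exact expression in Theorem 1.4 can be evaluated directly, writing the tail sum as Σ_{k=n+1}^∞ k^{-2} = π^2/6 − H_n^{(2)}, and compared with the leading term 2(ln n)/n; the following Python sketch (function name ours) does this.

    import math

    def exact_sq_distance(n):
        """||Y_n - Y||_2^2 from Theorem 1.4, with the tail sum written as pi^2/6 - H_n^(2)."""
        H1 = sum(1.0 / j for j in range(1, n + 1))        # H_n
        H2 = sum(1.0 / j ** 2 for j in range(1, n + 1))   # H_n^(2)
        return (2 * H1 + 1 + 6.0 / (n + 1) - 4 * (math.pi ** 2 / 6 - H2)) / (n + 1)

    for n in (10, 100, 1000, 10000):
        print(n, exact_sq_distance(n), 2 * math.log(n) / n)

The two columns agree to leading order, with the O(1/n) correction still clearly visible at moderate n.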
2. Preliminaries

In this section we provide recursive representations of Y_n (for general n) and Y that will be useful in proving Theorem 1.4. Our first proposition concerns the limit Y and gives a sample-pointwise extension of the very well known [6] distributional identity satisfied by Y. Recall the notation (1.1) and the definition of Y in Theorem 1.2 as the infinite series Σ_{j=0}^∞ Σ_{|θ|=j} G_θ in L^2.

Proposition 2.1. There exist random variables F_θ and H_θ for θ ∈ Θ such that

(i) the joint distributions of (G_θ : θ ∈ Θ), of (F_θ : θ ∈ Θ), and of (H_θ : θ ∈ Θ) agree;

(ii) (F_θ : θ ∈ Θ) and (H_θ : θ ∈ Θ) are independent;

(iii) the series

(2.1)    Y^(0) := Σ_{j=0}^∞ Σ_{|θ|=j} F_θ   and   Y^(1) := Σ_{j=0}^∞ Σ_{|θ|=j} H_θ

converge in L^2;

(iv) the random variables Y^(0) and Y^(1) are independent, each with the same distribution as Y, and

(2.2)    Y = C(U) + U Y^(0) + Ū Y^(1).

Here U := U_1, with Ū := 1 − U, and C is defined at (1.2).

Proof. Recall from (1.1) that

    G_θ = φ_θ − 2 φ_θ ln φ_θ + 2 φ_{θ0} ln φ_{θ0} + 2 φ_{θ1} ln φ_{θ1}.

For θ ∈ Θ, define the random variable ϕ_θ (respectively, ψ_θ) by

    ϕ_θ := φ_{0θ}/U   (resp., ψ_θ := φ_{1θ}/Ū).

Then U and ϕ_θ are independent (resp., U and ψ_θ are independent), ϕ_θ and ψ_θ each have the same distribution as φ_θ, and

    G_{0θ} = U F_θ   and   G_{1θ} = Ū H_θ,

where

    F_θ := ϕ_θ − 2 ϕ_θ ln ϕ_θ + 2 ϕ_{θ0} ln ϕ_{θ0} + 2 ϕ_{θ1} ln ϕ_{θ1},
    H_θ := ψ_θ − 2 ψ_θ ln ψ_θ + 2 ψ_{θ0} ln ψ_{θ0} + 2 ψ_{θ1} ln ψ_{θ1}.

The proposition follows easily from the clear equality

    L(F_θ : θ ∈ Θ) = L(G_θ : θ ∈ Θ) = L(H_θ : θ ∈ Θ)

of joint laws and the fact that

    Y = Σ_{j=0}^∞ Σ_{|θ|=j} G_θ = G_ε + Σ_{j=0}^∞ Σ_{|θ|=j} G_{0θ} + Σ_{j=0}^∞ Σ_{|θ|=j} G_{1θ}
      = C(U) + U Σ_{j=0}^∞ Σ_{|θ|=j} F_θ + Ū Σ_{j=0}^∞ Σ_{|θ|=j} H_θ
      = C(U) + U Y^(0) + Ū Y^(1).  □

We next proceed to provide an analogue [namely, (2.4)] of (2.2) for each Y_n, rather than Y, but first we need a little more notation.

Given 0 ≤ x < y ≤ 1, let (U_n^{xy})_{n≥1} be the subsequence of (U_n)_{n≥1} that falls in (x, y). The random variable K_n(x, y) is defined to be the (random) number of key comparisons used to sort U_1^{xy}, ..., U_n^{xy} using QuickSort. The distribution of K_n(x, y) of course does not depend on (x, y).

We now define the random variable

(2.3)    Y_{n,θ} := [K_{ν_θ(n)}(L_θ, R_θ) − μ_{ν_θ(n)}] / [ν_θ(n) + 1],

where ν_θ(n) denotes the number of keys among U_1, ..., U_n that fall in the interval (L_θ, R_θ); the centering here is motivated by the fact that μ_{ν_θ(n)} is the conditional expectation of K_{ν_θ(n)}(L_θ, R_θ) given (ν_θ(n), L_θ, R_θ). Then for n ≥ 1 we have

(2.4)    Y_n = [n/(n + 1)] C_n(ν_0(n) + 1) + [(ν_0(n) + 1)/(n + 1)] Y_{n,0} + [(ν_1(n) + 1)/(n + 1)] Y_{n,1},

where, as in [2], for 1 ≤ i ≤ n we define

    C_n(i) := (1/n) (n − 1 + μ_{i−1} + μ_{n−i} − μ_n).

We note for future reference that the classical divide-and-conquer recurrence for μ_n asserts precisely that

(2.5)    Σ_{i=1}^n C_n(i) = 0

for n ≥ 1.

It follows from (2.2) and (2.4) that for n ≥ 1 we have

    Y_n − Y = [ (ν_0(n) + 1)/(n + 1) · Y_{n,0} − U Y^(0) ] + [ (ν_1(n) + 1)/(n + 1) · Y_{n,1} − Ū Y^(1) ]
              + [ n/(n + 1) · C_n(ν_0(n) + 1) − C(U) ]
(2.6)          =: W_1 + W_2 + W_3.

Conditionally given U and ν_0(n), the random variables W_1 and W_2 are independent, each with vanishing mean, and W_3 is constant. Hence

    E[(Y_n − Y)^2 | U, ν_0(n)] = E[W_1^2 | U, ν_0(n)] + E[W_2^2 | U, ν_0(n)] + W_3^2,

and thus, taking expectations and using symmetry, for n ≥ 1 we have

(2.7)    a_n^2 := E (Y_n − Y)^2 = E W_1^2 + E W_2^2 + E W_3^2 = 2 E W_1^2 + E W_3^2.

Note that

(2.8)    a_0^2 = E Y^2 = σ^2 := 7 − 2π^2/3

(for example, [2]).
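Note (numerical check, not part of the proof): The identity (2.5), and the closed form μ_n = 2(n + 1)H_n − 4n used in Section 4, are easy to confirm from the classical recurrence μ_n = n − 1 + (2/n) Σ_{k=0}^{n−1} μ_k; the following Python sketch (function names ours) does so.

    def quicksort_means(N):
        """mu_n = E[K_n], n = 0..N, from the classical divide-and-conquer recurrence."""
        mu = [0.0] * (N + 1)
        partial_sum = 0.0                      # mu_0 + ... + mu_{n-1}, starting from mu_0 = 0
        for n in range(1, N + 1):
            mu[n] = n - 1 + 2.0 * partial_sum / n
            partial_sum += mu[n]
        return mu

    def harmonic(n):
        return sum(1.0 / j for j in range(1, n + 1))

    N = 60
    mu = quicksort_means(N)
    print(max(abs(mu[n] - (2 * (n + 1) * harmonic(n) - 4 * n)) for n in range(N + 1)))  # ~0
    for n in (1, 10, 60):
        C = [(n - 1 + mu[i - 1] + mu[n - i] - mu[n]) / n for i in range(1, n + 1)]
        print(n, sum(C))                       # identity (2.5): each sum vanishes up to rounding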
3. Analysis of E W_1^2

In this section we analyze E W_1^2, producing the following result. Recall the definition of σ^2 at (2.8).

Proposition 3.1. Let n ≥ 1. For W_1 defined as at (2.6), we have

    E W_1^2 = [1/(n(n + 1)^2)] Σ_{k=0}^{n−1} (k + 1)^2 a_k^2 + σ^2/[6(n + 1)].

For that, we first prove the following two lemmas.

Lemma 3.2. For any n ≥ 1, we have

    E[ ((ν_0(n) + 1)/(n + 1))^2 (Y_{n,0} − Y^(0))^2 ] = (1/n) Σ_{k=0}^{n−1} ((k + 1)/(n + 1))^2 a_k^2.

Lemma 3.3. For any n ≥ 1, we have

    E[ ((ν_0(n) + 1)/(n + 1) − U)^2 (Y^(0))^2 ] = σ^2/[6(n + 1)].

Proof of Lemma 3.2. There is a probabilistic copy Y* = (Y*_n) of the stochastic process (Y_n) such that

    Y_{n,0} ≡ Y*_{ν_0(n)}

and Y* and Y^(0) are independent of (U, ν_0(n)). This implies

    E[ ((ν_0(n) + 1)/(n + 1))^2 (Y_{n,0} − Y^(0))^2 ] = E[ ((ν_0(n) + 1)/(n + 1))^2 (Y*_{ν_0(n)} − Y^(0))^2 ].

By conditioning on ν_0(n), which is uniformly distributed on {0, ..., n − 1}, we get

    E[ ((ν_0(n) + 1)/(n + 1))^2 (Y_{n,0} − Y^(0))^2 ] = E[ ((ν_0(n) + 1)/(n + 1))^2 a_{ν_0(n)}^2 ]
        = (1/n) Σ_{k=0}^{n−1} ((k + 1)/(n + 1))^2 a_k^2.  □

Proof of Lemma 3.3. Conditionally given ν_0(n) and Y^(0), we have that U is distributed as the order statistic of rank ν_0(n) + 1 from a sample of size n from the uniform(0, 1) distribution, namely, Beta(ν_0(n) + 1, n − ν_0(n)), with expectation [ν_0(n) + 1]/(n + 1) and variance [(ν_0(n) + 1)(n − ν_0(n))]/[(n + 1)^2 (n + 2)]. So, using also the independence of ν_0(n) and Y^(0), we find

    E[ ((ν_0(n) + 1)/(n + 1) − U)^2 (Y^(0))^2 ]
        = E[ (ν_0(n) + 1)(n − ν_0(n))/[(n + 1)^2 (n + 2)] · (Y^(0))^2 ]
        = σ^2 E[ (ν_0(n) + 1)(n − ν_0(n))/[(n + 1)^2 (n + 2)] ]
        = σ^2/[(n + 1)^2 (n + 2)] × (1/n) Σ_{k=0}^{n−1} (k + 1)(n − k)
        = σ^2/[n(n + 1)^2 (n + 2)] × n(n + 1)(n + 2)/6
        = σ^2/[6(n + 1)].  □

Proof of Proposition 3.1. We have

    E W_1^2 = E[ ( (ν_0(n) + 1)/(n + 1) · Y_{n,0} − U Y^(0) )^2 ]
            = E[ ( (ν_0(n) + 1)/(n + 1) · (Y_{n,0} − Y^(0)) + ((ν_0(n) + 1)/(n + 1) − U) Y^(0) )^2 ]
            = E[ ((ν_0(n) + 1)/(n + 1))^2 (Y_{n,0} − Y^(0))^2 ]
              + E[ ((ν_0(n) + 1)/(n + 1) − U)^2 (Y^(0))^2 ]
              + 2 E[ (ν_0(n) + 1)/(n + 1) · (Y_{n,0} − Y^(0)) · ((ν_0(n) + 1)/(n + 1) − U) Y^(0) ].

The result follows from Lemmas 3.2–3.3 and the fact that, conditionally given (ν_0(n), Y_{n,0}, Y^(0)), the random variable U is distributed Beta(ν_0(n) + 1, n − ν_0(n)), so that the last expectation in the preceding equation vanishes.  □

4. Analysis of E W_3^2

In this section we analyze E W_3^2, producing the following result.

Proposition 4.1. For any n ≥ 1 we have

    b_n^2 := E W_3^2 = (σ^2 − 7)/3 + (4/3) [(n + 2)/(n + 1)] H_n^{(2)} + [4/(3n(n + 1)^2)] H_n,

where H_n = Σ_{j=1}^n j^{−1} is the nth harmonic number and H_n^{(2)} := Σ_{j=1}^n j^{−2} is the nth harmonic number of the second order.

For that, we first prove the following two lemmas.

Lemma 4.2. For any 1 ≤ k ≤ n we have

    D(n, k) := [1/B(k, n − k + 1)] ∫_0^1 t^{k−1} (1 − t)^{n−k} (ln t) dt = H_{k−1} − H_n,

where B is the beta function.

Lemma 4.3. For any n ≥ 1 we have

    E[C_n(ν_0(n) + 1) C(U)] = [n/(n + 1)] E[C_n(ν_0(n) + 1)^2].

Proof of Lemma 4.2. The result can be proved for each fixed n ≥ 1 by backwards induction on k and integration by parts, but we give a simpler proof. Recall the defining expression

    B(α, β) = ∫_0^1 t^{α−1} (1 − t)^{β−1} dt

for the beta function when α, β > 0. Differentiating with respect to α gives

    ∫_0^1 t^{α−1} (1 − t)^{β−1} (ln t) dt = B(α, β) [ψ(α) − ψ(α + β)],

where ψ is the classical digamma function, i.e., the logarithmic derivative of the gamma function. But it is well known that ψ(j) = H_{j−1} − γ for positive integers j (with γ denoting Euler's constant, which cancels in the difference), so the lemma follows by setting α = k and β = n − k + 1.  □
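Note (numerical check, not part of the proof): Lemma 4.2 is easy to verify numerically. Expanding (1 − t)^{n−k} by the binomial theorem and using ∫_0^1 t^m ln t dt = −1/(m + 1)^2 gives D(n, k) in closed form, which can then be compared with H_{k−1} − H_n; the following Python sketch (function names ours) does this for a few pairs (n, k).

    from math import comb, factorial

    def harmonic(m):
        return sum(1.0 / j for j in range(1, m + 1))

    def D(n, k):
        """(1/B(k, n-k+1)) * integral_0^1 t^(k-1) (1-t)^(n-k) ln t dt, via the binomial expansion."""
        integral = -sum(comb(n - k, i) * (-1) ** i / (k + i) ** 2 for i in range(n - k + 1))
        beta = factorial(k - 1) * factorial(n - k) / factorial(n)
        return integral / beta

    for n, k in [(5, 1), (10, 4), (20, 20)]:
        print(D(n, k), harmonic(k - 1) - harmonic(n))   # the two numbers agree in each case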
Proof of Lemma 4.3. We know that ν_0(n) + 1 ~ unif{1, 2, ..., n} and that, conditionally given ν_0(n), the random variable U has the Beta(ν_0(n) + 1, n − ν_0(n)) distribution. So from Lemma 4.2, repeated use of (2.5), and the very well known and easily derived explicit expression

    μ_n = 2(n + 1) H_n − 4n,   n ≥ 0,

we have

    E[C_n(ν_0(n) + 1) C(U)]
      = (1/n) Σ_{j=1}^n C_n(j) · [1/B(j, n − j + 1)] ∫_0^1 t^{j−1} (1 − t)^{n−j} C(t) dt
      = (1/n) Σ_{j=1}^n C_n(j) [ 1 + 2 (j/(n + 1)) (H_j − H_{n+1}) + 2 ((n − j + 1)/(n + 1)) (H_{n−j+1} − H_{n+1}) ]
      = [1/(n(n + 1))] Σ_{j=1}^n C_n(j) [ 2j H_j + 2(n − j + 1) H_{n−j+1} ]
      = [1/(n(n + 1))] Σ_{j=1}^n C_n(j) [ 2j H_{j−1} − 4(j − 1) + 2(n − j + 1) H_{n−j} − 4(n − j) ]
      = [1/(n(n + 1))] Σ_{j=1}^n C_n(j) [ μ_{j−1} + μ_{n−j} ] = [1/(n(n + 1))] Σ_{j=1}^n C_n(j) [ μ_{j−1} + μ_{n−j} − μ_n ]
      = [n/(n + 1)] × (1/n) Σ_{j=1}^n C_n(j)^2 = [n/(n + 1)] E[C_n(ν_0(n) + 1)^2],

as desired.  □

Proof of Proposition 4.1. It follows from Lemma 4.3 that

    b_n^2 = E[ ( n/(n + 1) · C_n(ν_0(n) + 1) − C(U) )^2 ]
          = (n/(n + 1))^2 E[C_n(ν_0(n) + 1)^2] − 2 (n/(n + 1)) E[C_n(ν_0(n) + 1) C(U)] + E C(U)^2
          = E C(U)^2 − (n/(n + 1))^2 E[C_n(ν_0(n) + 1)^2].

Knowing that E C(U)^2 = σ^2/3, and from the proof of Lemma A.5 in [2] that

    E[C_n(ν_0(n) + 1)^2] = (7/3)(1 + 1/n)^2 − (4/3)(1 + 2/n)(1 + 1/n) H_n^{(2)} − [4/(3n^3)] H_n,

we have

    b_n^2 = (σ^2 − 7)/3 + (4/3) [(n + 2)/(n + 1)] H_n^{(2)} + [4/(3n(n + 1)^2)] H_n,

as claimed.  □

5. A closed form for a_n^2

In this final section we complete the proof of Theorem 1.4, for which we need one more lemma.

Lemma 5.1. For H_n^{(2)} = Σ_{j=1}^n j^{−2}, the nth harmonic number of the second order, we have

    Σ_{j=1}^n H_j^{(2)} = (n + 1) H_n^{(2)} − H_n

for any nonnegative integer n.

The lemma is well known and easily proved.

Proof of main Theorem 1.4. For n ≥ 1 we have from the decomposition (2.7) and Propositions 3.1 and 4.1 that

    a_n^2 = [2/(n(n + 1)^2)] Σ_{k=0}^{n−1} (k + 1)^2 a_k^2
            + (σ^2/3) [(n + 2)/(n + 1)] − 7/3 + (4/3) [(n + 2)/(n + 1)] H_n^{(2)} + [4/(3n(n + 1)^2)] H_n,

and we recall from (2.8) that a_0^2 = σ^2. Setting x_n := (n + 1)^2 a_n^2, we have x_0 = σ^2 and

    x_n = (2/n) Σ_{k=0}^{n−1} x_k + c_n   for n ≥ 1,

with

    c_n := (σ^2/3)(n + 2)(n + 1) − (7/3)(n + 1)^2 + (4/3)(n + 2)(n + 1) H_n^{(2)} + [4/(3n)] H_n.

This is a standard divide-and-conquer recurrence relation for x_n, with solution

    x_n = (n + 1) [ σ^2 + Σ_{k=1}^n (k c_k − (k − 1) c_{k−1}) / (k(k + 1)) ],   n ≥ 0.

After straightforward computation involving the identity in Lemma 5.1, one finds

    a_n^2 = (n + 1)^{-1} [ 2 H_n + 1 + 6/(n + 1) + σ^2 − 7 + 4 H_n^{(2)} ]
          = (n + 1)^{-1} [ 2 H_n + 1 + 6/(n + 1) − 4 Σ_{k=n+1}^∞ k^{-2} ] = 2 (ln n)/n + O(1/n),

as claimed.  □

References

[1] James Allen Fill and Svante Janson. Quicksort asymptotics. J. Algorithms, 44(1):4–28, 2002. Analysis of algorithms.
[2] James Allen Fill and Svante Janson. Quicksort asymptotics: appendix. Unpublished, available from http://www.ams.jhu.edu/~fill/, 2004.
[3] C. A. R. Hoare. Quicksort. Comput. J., 5:10–15, 1962.
[4] Ralph Neininger and Ludger Rüschendorf. Rates of convergence for Quicksort. J. Algorithms, 44(1):51–62, 2002. Analysis of algorithms.
[5] M. Régnier. A limiting distribution of Quicksort. RAIRO Informatique Théorique et Applications, 23:335–343, 1989.
[6] U. Rösler. A limit theorem for Quicksort. RAIRO Informatique Théorique et Applications, 25:85–100, 1991.

Department of Applied Mathematics and Statistics, The Johns Hopkins University, 34th and Charles Streets, Baltimore, MD 21218-2682 USA
E-mail address: [email protected] and [email protected]
