Probabilistic lower bounds on maximal determinants of binary matrices

arXiv:1501.06235v7 [math.CO] 25 Oct 2016

Richard P. Brent
Australian National University
Canberra, ACT 2600, Australia

Judy-anne H. Osborn
The University of Newcastle
Callaghan, NSW 2308, Australia

Warren D. Smith
Center for Range Voting
21 Shore Oaks Drive
Stony Brook, NY 11790, USA

In memory of Mirka Miller 1949–2016

Abstract

Let $\mathcal{D}(n)$ be the maximal determinant for $n \times n$ $\{\pm 1\}$-matrices, and $\mathcal{R}(n) = \mathcal{D}(n)/n^{n/2}$ be the ratio of $\mathcal{D}(n)$ to the Hadamard upper bound. Using the probabilistic method, we prove new lower bounds on $\mathcal{D}(n)$ and $\mathcal{R}(n)$ in terms of $d = n-h$, where $h$ is the order of a Hadamard matrix and $h$ is maximal subject to $h \le n$. For example,
$$\mathcal{R}(n) > \left(\frac{2}{\pi e}\right)^{d/2} \quad \text{if } 1 \le d \le 3,$$
and
$$\mathcal{R}(n) > \left(\frac{2}{\pi e}\right)^{d/2}\left(1 - d^2\left(\frac{\pi}{2h}\right)^{1/2}\right) \quad \text{if } d > 3.$$
By a recent result of Livinskyi, $d^2/h^{1/2} \to 0$ as $n \to \infty$, so the second bound is close to $(\pi e/2)^{-d/2}$ for large $n$. Previous lower bounds tended to zero as $n \to \infty$ with $d$ fixed, except in the cases $d \in \{0,1\}$. For $d \ge 2$, our bounds are better for all sufficiently large $n$. If the Hadamard conjecture is true, then $d \le 3$, so the first bound above shows that $\mathcal{R}(n)$ is bounded below by a positive constant $(\pi e/2)^{-3/2} > 0.1133$.

1 Introduction

Let $\mathcal{D}(n)$ be the maximal determinant possible for an $n \times n$ matrix with elements in $\{\pm 1\}$. Hadamard [15] proved that $\mathcal{D}(n) \le n^{n/2}$, and the Hadamard conjecture is that a matrix achieving this upper bound exists for each positive integer $n$ divisible by four. The function $\mathcal{R}(n) := \mathcal{D}(n)/n^{n/2}$ is a measure of the sharpness of the Hadamard bound. Clearly $\mathcal{R}(n) = 1$ if a Hadamard matrix of order $n$ exists; otherwise $\mathcal{R}(n) < 1$. In this paper we give lower bounds on $\mathcal{D}(n)$ and $\mathcal{R}(n)$.

Let $\mathcal{H}$ be the set of orders of Hadamard matrices, and let $h \in \mathcal{H}$ be maximal subject to $h \le n$. Then $d = n-h$ can be regarded as the "gap" between $n$ and the nearest (lower) Hadamard order.
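The constants appearing in the abstract are easy to evaluate numerically. The following is a minimal sketch (the function name `abstract_bound` is ours, not the paper's) that evaluates the two stated bounds and checks the claimed constant $(\pi e/2)^{-3/2} > 0.1133$:

```python
import math

def abstract_bound(d, h=None):
    """Lower bound on R(n) from the abstract: (2/(pi*e))**(d/2) for d <= 3,
    multiplied by (1 - d**2 * sqrt(pi/(2*h))) when d > 3."""
    base = (2 / (math.pi * math.e)) ** (d / 2)
    if d <= 3:
        return base
    return base * (1 - d * d * math.sqrt(math.pi / (2 * h)))

# The constant claimed under the Hadamard conjecture (d <= 3):
c = (math.pi * math.e / 2) ** (-1.5)
print(f"(pi*e/2)^(-3/2) = {c:.4f}")   # ~0.1133
print(abstract_bound(3))              # the d = 3 case gives the same value
```

For $d > 3$ the correction factor improves as $h$ grows, matching the remark that the second bound approaches $(\pi e/2)^{-d/2}$ for large $n$.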
We are interested in the case that $n$ is not a Hadamard order, i.e. $d > 0$ and $\mathcal{R}(n) < 1$.

Except in the cases $d \in \{0,1\}$, previous lower bounds on $\mathcal{R}(n)$ tended to zero as $n \to \infty$. For example, the well-known bound of Clements and Lindström [11, Corollary to Thm. 2] shows that $\mathcal{R}(n) > (3/4)^{n/2}$, and [5, Thm. 9] shows that $\mathcal{R}(n) \ge (ne/4)^{-d/2}$. In contrast, our results imply that, for fixed $d$, $\mathcal{R}(n)$ is bounded below by a positive constant (depending only on $d$).

Our lower bound proof uses the probabilistic method pioneered by Erdős (see for example [1, 13]). This method does not appear to have been applied previously to the Hadamard maximal determinant problem, except in the case $d = 1$ (so $n \equiv 1 \bmod 4$); in this case the concept of excess has been used [14], and lower bounds on the maximal excess were obtained by the probabilistic method [2, 9, 13, 14].

§2 describes our probabilistic construction and determines the mean $\mu$ and variance $\sigma^2$ of elements in the Schur complement generated by the construction (see Lemmas 2.6 and 2.9). Informally, we adjoin $d$ extra columns to an $h \times h$ Hadamard matrix $A$, and fill their $h \times d$ entries with random (uniformly and independently distributed) $\pm 1$ values. Then we adjoin $d$ extra rows, and fill their $d \times (h+d)$ entries with values chosen deterministically in a way intended to approximately maximise the determinant of the final matrix $\widetilde{A}$. To do so, we use the fact that this determinant can be expressed in terms of the $d \times d$ Schur complement of $A$ in $\widetilde{A}$.

In the case $d = 1$, this method is essentially the same as the known method involving the excess of matrices Hadamard-equivalent to $A$, and leads to the same bounds that can be obtained by bounding the excess in a probabilistic manner.

In §3 we give lower bound results on both $\mathcal{D}(n)$ and $\mathcal{R}(n)$. Of course, a lower bound on $\mathcal{D}(n)$ immediately gives an equivalent lower bound on $\mathcal{R}(n)$. However, we use some elementary inequalities to obtain simpler (though slightly weaker) bounds on $\mathcal{R}(n)$.
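To see how quickly the earlier bounds decay compared with a constant in $n$, one can tabulate them directly. A small sketch (the value $n = 668$ is an arbitrary illustration, and the "new" constant is the paper's $d \le 3$ bound):

```python
import math

# Earlier bounds on R(n) cited in the text, for n = h + d:
clements_lindstrom = lambda n: (3 / 4) ** (n / 2)              # [11]
prev_bound = lambda n, d: (n * math.e / 4) ** (-d / 2)         # [5, Thm. 9]
new_constant = lambda d: (2 / (math.pi * math.e)) ** (d / 2)   # this paper, d <= 3

d, n = 2, 668   # illustrative large n with gap d = 2
print(clements_lindstrom(n))   # astronomically small
print(prev_bound(n, d))        # tends to 0 as n grows
print(new_constant(d))         # independent of n
```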
For example, if $d \le 3$ then Theorem 3.6 states that $\mathcal{D}(n) \ge h^{h/2}(\mu^d - \eta)$, where $\mu$ and $\eta$ are certain functions of $h$ and $d$. Theorem 3.6 also states the (weaker) result that $\mathcal{R}(n) > (\pi e/2)^{-d/2}$. The lower bound on $\mathcal{R}(n)$ clearly shows that the ratio of our bound to the Hadamard bound is at least $(\pi e/2)^{-3/2} > 0.1133$, whereas this conclusion is not immediately obvious from the lower bound on $\mathcal{D}(n)$.

We outline the bounds on $\mathcal{R}(n)$ here. Theorem 3.4 gives a lower bound
$$\mathcal{R}(n) > \left(\frac{2}{\pi e}\right)^{d/2}\left(1 - d^2\left(\frac{\pi}{2h}\right)^{1/2}\right), \tag{1}$$
which is nontrivial whenever $h > \pi d^4/2$. By the results of Livinskyi [20], $d = O(h^{1/6})$ as $h \to \infty$ (see [7, §6] for details), so the condition $h > \pi d^4/2$ holds for all sufficiently large $n$. Also, as $n \to \infty$, $d^2/h^{1/2} = O(n^{-1/6}) \to 0$, so the lower bound (1) is close to $(\pi e/2)^{-d/2}$. For fixed $d > 1$ and large $n$, our lower bounds on $\mathcal{R}(n)$ are better than previous bounds (see Table 1 in §4).

Theorem 3.6 applies only for $d \le 3$, but whenever it is applicable it gives sharper results than Theorem 3.4. In fact, Theorem 3.6 shows that the factor $1 - O(d^2/h^{1/2})$ in (1) can be omitted when $d \le 3$, giving $\mathcal{R}(n) > (\pi e/2)^{-d/2}$. Theorem 3.6 is always applicable if the Hadamard conjecture is true, since this conjecture implies that $d \le 3$.

In §4, we give some numerical examples to illustrate Theorems 3.4 and 3.6, and to compare our results with previous bounds on $\mathcal{D}(n)$ and/or $\mathcal{R}(n)$. Rokicki et al. [23] showed, by extensive computation, that $\mathcal{R}(n) \ge 1/2$ for $n \le 120$, and conjectured that this inequality always holds. It seems difficult to bridge the gap between the constants $1/2$ and $(\pi e/2)^{-3/2}$ by the probabilistic method. The best that we can do is to improve the term of order $d^2/h^{1/2}$ in the bound (1) at the expense of a more complicated proof; for details see [7].

2 The probabilistic construction

We now describe our probabilistic construction and prove some of its properties. In the case $d = 1$ our construction reduces to that of Best [2]. Let $A$ be a Hadamard matrix of order $h \ge 4$.
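The bound (1) and its nontriviality threshold $h > \pi d^4/2$ can be explored numerically. A small sketch (the choice of sample $h$ values is ours; we simply pick a multiple of 4 above the threshold and assume a Hadamard matrix of that order exists):

```python
import math

def thm34_bound(h, d):
    """Bound (1): R(n) > (2/(pi*e))**(d/2) * (1 - d**2 * sqrt(pi/(2*h))).
    The bound is nontrivial (positive) iff h > pi * d**4 / 2."""
    return (2 / (math.pi * math.e)) ** (d / 2) * \
           (1 - d * d * math.sqrt(math.pi / (2 * h)))

for d in (1, 2, 3):
    h_min = math.pi * d**4 / 2           # threshold for a positive bound
    h = 4 * math.ceil(h_min / 4) + 4     # a multiple of 4 above it (assumed Hadamard)
    print(d, h, thm34_bound(h, d))       # positive in each case
```

As $h \to \infty$ with $d$ fixed, the bound tends to $(2/(\pi e))^{d/2}$, consistent with the discussion following (1).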
We add a border of $d$ rows and columns to give a larger (square) matrix $\widetilde{A}$ of order $n$. The border is defined by matrices $B$, $C$ and $D$ as shown:
$$\widetilde{A} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}. \tag{2}$$
The $d \times d$ matrix $D - CA^{-1}B$ is known as the Schur complement of $A$ in $\widetilde{A}$, after Schur [24]. The Schur complement lemma (see for example [12]) gives
$$\det(\widetilde{A}) = \det(A)\det(D - CA^{-1}B). \tag{3}$$
In our construction the matrices $A$, $B$, and $C$ have entries in $\{\pm 1\}$. We allow the matrix $D$ to have entries in $\{0, \pm 1\}$, but each zero entry can be replaced by one of $+1$ or $-1$ without decreasing $|\det(\widetilde{A})|$, so any lower bounds that we obtain on $\max(|\det(\widetilde{A})|)$ are valid lower bounds on maximal determinants of $n \times n$ $\{\pm 1\}$-matrices. Note that the Schur complement is not in general a $\{\pm 1\}$-matrix.

In the proof of Lemma 3.2 we show that our choice of $B$, $C$ and $D$ gives a Schur complement $D - CA^{-1}B$ that, with positive probability, has sufficiently large determinant. From equation (3) and the fact that $A$ is a Hadamard matrix, a large value of $\det(D - CA^{-1}B)$ implies a large value of $\det(\widetilde{A})$.

2.1 Details of the probabilistic construction

Let $A$ be any Hadamard matrix of order $h$. $B$ is allowed to range over the set of all $h \times d$ $\{\pm 1\}$-matrices, chosen uniformly and independently from the $2^{hd}$ possibilities. The $d \times h$ matrix $C = (c_{ij})$ is a function of $B$. We choose
$$c_{ij} = \operatorname{sgn}\big((A^T B)_{ji}\big),$$
where
$$\operatorname{sgn}(x) := \begin{cases} +1 & \text{if } x \ge 0, \\ -1 & \text{if } x < 0. \end{cases}$$
To complete the construction, we choose $D = -I$. As mentioned above, it is inconsequential that $D$ is not a $\{\pm 1\}$-matrix.

2.2 Properties of the construction

Define $F = CA^{-1}B$ and $G = F - D = F + I$ (so $-G$ is the Schur complement defined above). Note that, since $A$ is a Hadamard matrix, $A^T = hA^{-1}$, so $hF = CA^TB$. Since $B$ is random, we expect the elements of $A^TB$ to be usually of order $h^{1/2}$. The definition of $C$ ensures that there is no cancellation in the inner products defining the diagonal entries of $hF = C\,(A^TB)$.
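To make the construction concrete, here is a small self-contained sketch (not part of the paper) for $h = 4$, $d = 1$, using the Sylvester Hadamard matrix and one fixed choice of the random column $B$; it verifies the Schur identity (3) with exact rational arithmetic:

```python
from fractions import Fraction

# Sylvester Hadamard matrix of order 4, and a naive cofactor determinant.
H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

sgn = lambda x: 1 if x >= 0 else -1
h, d = 4, 1
b = [1, 1, 1, -1]                                   # one +-1 column B
v = [sum(H4[i][k] * b[i] for i in range(h)) for k in range(h)]  # A^T b
c = [sgn(x) for x in v]                             # row C = sgn(A^T B)^T
f = Fraction(sum(ci * vi for ci, vi in zip(c, v)), h)  # F = C A^{-1} B = C A^T B / h

# Bordered matrix (2) with D = -I, and the identity (3):
At = [H4[i] + [b[i]] for i in range(h)] + [c + [-1]]
assert det(At) == det(H4) * (-1 - f)   # det(A~) = det(A) det(D - C A^{-1} B)
print(det(H4), f, det(At))
```

The same check passes for every one of the $2^h$ choices of $b$, since (3) holds identically whenever $A$ is invertible.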
Thus, we expect · the diagonal entries f of F to be nonnegative and of order h1/2, but the ii off-diagonal entries f (i = j) to be of order unity with high probability. ij 6 4 Similarly for the elements of G. This intuition is justified by Lemmas 2.6 and 2.9. In the following we denote the expectation of a random variable X by E[X], and the variance by V[X] = E[X2] E[X]2. − 1 Lemmas 2.1–2.2 are essentially due to Best [2] and Lindsey. Lemma 2.1. If h 2 and F = (f ) is chosen as above, then ij ≥ h 2−hh if i= j, E[f ] = h/2 ij  (cid:18) (cid:19) 0 if i = j.  6 Proof. The case i = j followsas in Best [2, proof of Theorem 3]. The case i = j is easy, since B is chosen randomly. 6 Lemma2.2. IfF = (f )ischosenasabove, then f h1/2 for1 i,j d. ij ij | |≤ ≤ ≤ Proof. The matrix Q := h−1/2AT is orthogonal with rows and columns of unit length (in the Euclidean norm). Thus Qb = b = h1/2 for each 2 2 || || || || column b of B. Since h1/2F = C.QB, each element h1/2f of h1/2F is the ij inner product of a row of C (having length h1/2) and a column of QB (also having length h1/2). It follows from the Cauchy-Schwartz inequality that h1/2f h1/2 h1/2 = h, so f h1/2. ij ij | | ≤ · | | ≤ Lemma 2.3. If F is chosen as above and i,j k,ℓ = , then f and ij { }∩{ } ∅ f are independent. kℓ Proof. This follows from the fact that f depends only on the fixed matrix ij A and on columns i and j of B. Lemma 2.4. Let A 1 h×h be a Hadamard matrix, C 1 d×h, and U = CA−1. Then, fo∈r e{a±ch}i with 1 i d, ∈ {± } ≤ ≤ h u2 =1. ij j=1 X Proof. Since A is Hadamard, UUT = h−1CCT. Also, since c = 1, ij ± diag(CCT)= hI. Thus diag(UUT) = I. Lemma 2.5. If F = (f ) is chosen as above, then ij E[f2]= 1 for i = j. (4) ij 6 1See[13, footnote on pg. 88]. 5 Proof. We can assume, without loss of generality, that i = 1, j > 1. Write F = UB, where U = CA−1 = h−1CAT. Now f = u b , (5) 1j 1k kj k X where 1 u = c a , c = sgn b a . 1k 1ℓ kℓ 1ℓ m1 mℓ h ! ℓ m X X Observe that c and u depend only on the first column of B. 
Thus, $f_{1j}$ depends only on the first and $j$-th columns of $B$. If we fix the first column of $B$ and take expectations over all choices of the other columns, we obtain
$$E[f_{1j}^2] = E\Big[\sum_k \sum_\ell u_{1k} u_{1\ell}\, b_{kj} b_{\ell j}\Big].$$
The expectation of the terms with $k \ne \ell$ vanishes, and the expectation of the terms with $k = \ell$ is $\sum_k u_{1k}^2$. Thus, (4) follows from Lemma 2.4.

Lemma 2.6. Let $A$ be a Hadamard matrix of order $h \ge 4$ and $B$, $C$ be $\{\pm 1\}$-matrices chosen as above. Let $G = F + I$ where $F = CA^{-1}B$. Then
$$E[g_{ii}] = 1 + \frac{h}{2^h}\binom{h}{h/2}, \tag{6}$$
$$E[g_{ij}] = 0 \quad \text{for } 1 \le i,j \le d,\ i \ne j, \tag{7}$$
$$V[g_{ii}] = 1 + \frac{h(h-1)}{2^{h+1}}\binom{h/2}{h/4}^2 - \frac{h^2}{2^{2h}}\binom{h}{h/2}^2, \tag{8}$$
$$V[g_{ij}] = 1 \quad \text{for } 1 \le i,j \le d,\ i \ne j. \tag{9}$$

Proof. Since $G = F + I$, the results (6), (7) and (9) follow from Lemma 2.1 and Lemma 2.5 above. Thus, we only need to prove (8). Since $g_{ii} = f_{ii} + 1$, it is sufficient to compute $V[f_{ii}]$.

Since $A$ is a Hadamard matrix, $hF = CA^TB$. We compute the second moment about the origin of the diagonal elements $hf_{ii}$ of $hF$. Since $h$ is a Hadamard order and $h \ge 4$, we can write $h = 4k$ where $k \in \mathbb{Z}$. Consider $h$ independent random variables $X_j \in \{\pm 1\}$, $1 \le j \le h$, where $X_j = +1$ with probability $1/2$. Define random variables $S_1$, $S_2$ by
$$S_1 = \sum_{j=1}^{4k} X_j, \qquad S_2 = \sum_{j=1}^{2k} X_j - \sum_{j=2k+1}^{4k} X_j.$$
Consider a particular choice of $X_1, \ldots, X_h$ and suppose that $k+p$ of $X_1, \ldots, X_{2k}$ are $+1$, and that $k+q$ of $X_{2k+1}, \ldots, X_{4k}$ are $+1$. Then we have $S_1 = 2(p+q)$ and $S_2 = 2(p-q)$. Thus, taking expectations over all $2^{4k}$ possible (equally likely) choices, we see that
$$E[|S_1 S_2|] = 4E[|p^2 - q^2|] = \frac{4}{2^{4k}}\sum_p \sum_q \binom{2k}{k+p}\binom{2k}{k+q}\,|p^2 - q^2| = \frac{4}{2^{4k}} \cdot 2k^2\binom{2k}{k}^2 = \frac{h^2}{2^{h+1}}\binom{2k}{k}^2.$$
Here the closed form for the double sum is a special case of [4, Prop. 1.1].
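The exact values in (8) and Lemma 2.5 can also be verified by enumeration for $h = 4$. A sketch (ours) that computes $V[g_{11}] = V[f_{11}]$ over all $2^h$ first columns, and $E[f_{12}^2]$ over all $2^{2h}$ pairs of columns:

```python
from fractions import Fraction
from itertools import product
from math import comb

# Exhaustive check of (8) (V[g_ii]) and of Lemma 2.5 (E[f_ij^2] = 1, i != j)
# for h = 4, enumerating the random columns of B.
H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
h = 4
sgn = lambda x: 1 if x >= 0 else -1
col = lambda b: [sum(H4[i][k] * b[i] for i in range(h)) for k in range(h)]  # A^T b

# V[g_11] = V[f_11] over all 2^h columns b1:
f11 = [Fraction(sum(sgn(x) * x for x in col(b)), h)
       for b in product((1, -1), repeat=h)]
m = sum(f11) / len(f11)
var = sum(f * f for f in f11) / len(f11) - m * m
formula = 1 + Fraction(h * (h - 1), 2 ** (h + 1)) * comb(h // 2, h // 4) ** 2 \
            - Fraction(h * h, 2 ** (2 * h)) * comb(h, h // 2) ** 2
print(var, formula)     # both equal 1/4

# E[f_12^2] over all pairs (b1, b2):
tot, cnt = Fraction(0), 0
for b1 in product((1, -1), repeat=h):
    c1 = [sgn(x) for x in col(b1)]
    for b2 in product((1, -1), repeat=h):
        f12 = Fraction(sum(c * x for c, x in zip(c1, col(b2))), h)
        tot += f12 * f12
        cnt += 1
print(tot / cnt)        # 1, as in Lemma 2.5
```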
By the definitions of $B$, $C$ and $F$, we see that $hf_{ii}$ is a sum of the form $Y_1 + Y_2 + \cdots + Y_h$, where each $Y_j$ is a random variable with the same distribution as $|S_1|$, and each product $Y_j Y_\ell$ (for $j \ne \ell$) has the same distribution as $|S_1 S_2|$. Also, $Y_j^2$ has the same distribution as $|S_1|^2 = S_1^2$. The random variables $Y_j$ are not independent, but by linearity of expectation we obtain
$$h^2 E[f_{ii}^2] = hE[S_1^2] + h(h-1)E[|S_1 S_2|] = h^2 + h(h-1)\cdot\frac{h^2}{2^{h+1}}\binom{2k}{k}^2.$$
This gives
$$E[f_{ii}^2] = 1 + \frac{h(h-1)}{2^{h+1}}\binom{2k}{k}^2.$$
The result for $V[g_{ii}]$ now follows from $V[g_{ii}] = V[f_{ii}] = E[f_{ii}^2] - E[f_{ii}]^2$.

For convenience we write $\mu(h) := E[g_{ii}] = E[f_{ii}] + 1$ and $\sigma(h)^2 := V[g_{ii}]$. If $h$ is understood from the context we write simply $\mu$ and $\sigma^2$ respectively.

To estimate $\mu$ and $\sigma^2$ from Lemma 2.6, we need a sufficiently accurate estimate for a central binomial coefficient $\binom{2m}{m}$ (where $m = h/2$ or $h/4$). An asymptotic expansion for $\ln\binom{2m}{m}$ may be deduced from Stirling's asymptotic expansion of $\ln\Gamma(z)$, as in [16]. However, [16] does not give an error bound. We state such a bound in the following lemma.

Lemma 2.7. If $k$ and $m$ are positive integers, then
$$\ln\binom{2m}{m} = m\ln 4 - \frac{\ln(\pi m)}{2} - \sum_{j=1}^{k-1}\frac{B_{2j}(1 - 4^{-j})}{j(2j-1)}\, m^{1-2j} + e_k(m), \tag{10}$$
where
$$|e_k(m)| < \frac{|B_{2k}|}{k(2k-1)}\, m^{1-2k}. \tag{11}$$

Proof. Using the facts that $m$ is real and positive, and that the sign of the Bernoulli number $B_{2k}$ is $(-1)^{k-1}$, we obtain from Olver [21, (4.03) and (4.05) of Ch. 8] that
$$\ln\Gamma(m) = \Big(m - \frac{1}{2}\Big)\ln m - m + \frac{\ln(2\pi)}{2} + \sum_{j=1}^{k-1}\frac{B_{2j}}{2j(2j-1)}\, m^{1-2j} - (-1)^k r_k(m), \tag{12}$$
where
$$0 < r_k(m) < \frac{|B_{2k}|}{2k(2k-1)}\, m^{1-2k}. \tag{13}$$
Now
$$\binom{2m}{m} = \frac{(2m)!}{m!\,m!} = \frac{2\,\Gamma(2m)}{m\,\Gamma(m)^2},$$
so from (12) and the same equation with $m \mapsto 2m$ we obtain (10) with
$$e_k(m) = (-1)^k\big(2r_k(m) - r_k(2m)\big).$$
Using the bound (13), this gives
$$e_k(m) = \frac{(-1)^k |B_{2k}|}{k(2k-1)}\, m^{1-2k}\,\theta,$$
where $-2^{-2k} < \theta < 1$.
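Lemma 2.7 with $k = 2$ is easy to test numerically: the sum in (10) then has the single term $B_2(1 - 1/4)\,m^{-1} = 1/(8m)$, and the error bound (11) is $|B_4|\,m^{-3}/6 = m^{-3}/180$. A sketch (ours, with hard-coded $B_2 = 1/6$, $B_4 = -1/30$):

```python
from math import comb, log, pi

# Numerical check of Lemma 2.7 with k = 2: the error e_2(m) in (10)
# should satisfy |e_2(m)| < |B_4| * m**-3 / (2*3) = m**-3 / 180.
B2, B4 = 1 / 6, -1 / 30

def expansion(m):
    """Right-hand side of (10) for k = 2, without the error term e_2(m)."""
    # j = 1 term of the sum: B_2 * (1 - 1/4) / (1*1) * m**-1 = 1/(8m)
    return m * log(4) - log(pi * m) / 2 - B2 * (1 - 0.25) / m

for m in (1, 2, 5, 10, 50):
    err = log(comb(2 * m, m)) - expansion(m)
    bound = abs(B4) / 6 * m ** -3
    print(m, err, bound)
    assert 0 < err < bound    # positive sign agrees with Remark 2.8 below
```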
In particular, $|\theta| < 1$, so we obtain the desired bound (11).

Remark 2.8. Lemma 2.7 can be sharpened. In fact, $e_k(m)$ has the same sign as the first omitted term (corresponding to $j = k$) and has smaller magnitude. This is proved in [3, Corollary 2].

We now show that $\mu(h)$ is of order $h^{1/2}$, and that $\sigma(h)$ is bounded.

Lemma 2.9. For $h \in 4\mathbb{Z}$, $h \ge 4$, we have
$$\sigma(h)^2 < 1 \tag{14}$$
and
$$\sqrt{\frac{2h}{\pi}} + 0.9 < \mu(h) < \sqrt{\frac{2h}{\pi}} + 1. \tag{15}$$

Proof. From Lemma 2.7 with $k = 2$ and $m$ a positive integer, we have
$$\binom{2m}{m} = \frac{4^m}{\sqrt{\pi m}}\exp\left[-\frac{1}{8m} + \frac{\theta_m}{180 m^3}\right], \tag{16}$$
where $|\theta_m| < 1$.

First consider the bounds (15) on $\mu(h)$. Taking $m = h/2$ and using the expression (6) for $\mu(h)$, the inequality (15) is equivalent to
$$\sqrt{\frac{m}{\pi}} - \frac{1}{20} < \frac{m}{4^m}\binom{2m}{m} < \sqrt{\frac{m}{\pi}}.$$
The upper bound is immediate from (16), since $-\frac{1}{8m} + \frac{1}{180m^3} < 0$. For the lower bound, a computation verifies the inequality for $m = 2$, since $\sqrt{2/\pi} - \frac{1}{20} < \frac{3}{4} = \frac{m}{4^m}\binom{2m}{m}$. Hence, we can assume that $m \ge 4$. The lower bound now follows from (16), since
$$\frac{m}{4^m}\binom{2m}{m} > \sqrt{\frac{m}{\pi}}\exp\left[-\frac{1}{8m} - \frac{1}{180m^3}\right] > \sqrt{\frac{m}{\pi}}\left[1 - \frac{1}{8m} - \frac{1}{180m^3}\right]$$
and
$$\sqrt{\frac{m}{\pi}}\left[\frac{1}{8m} + \frac{1}{180m^3}\right] < \frac{1}{20}.$$

Now consider the upper bound (14) on $\sigma(h)^2$. From (16) we have
$$\binom{h/2}{h/4}^2 < \frac{2^{h+2}}{\pi h}\exp\left[-\frac{1}{h} + \frac{32}{45h^3}\right]$$
and
$$\binom{h}{h/2}^2 > \frac{2^{2h+1}}{\pi h}\exp\left[-\frac{1}{2h} - \frac{4}{45h^3}\right].$$
Using these inequalities in (8) and simplifying gives
$$\sigma(h)^2 < 1 + \frac{2h}{\pi}\left[\exp\left(-\frac{1}{h} + \frac{32}{45h^3}\right) - \exp\left(-\frac{1}{2h} - \frac{4}{45h^3}\right)\right] - \frac{2}{\pi}\exp\left(-\frac{1}{h} + \frac{32}{45h^3}\right). \tag{17}$$
It is easy to see that the term in square brackets is negative for $h \ge 4$, so (17) implies (14).

Remark 2.10. We can show from (17) and a corresponding lower bound on $\sigma(h)^2$ that $\sigma(h+4)^2 < \sigma(h)^2$, so $\sigma(h)^2$ is monotonic decreasing and bounded above by $\sigma(4)^2 = 1/4$. Also, for large $h$ we have $\sigma(h)^2 = (1 - 3/\pi) + O(1/h)$. Since these results are not needed below, we omit the details.
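Lemma 2.9 can be checked directly against the exact expressions (6) and (8), since both involve only binomial coefficients. A sketch (ours) testing (14) and (15) for $h = 4, 8, \ldots, 100$:

```python
from math import comb, sqrt, pi

# Check of Lemma 2.9: sqrt(2h/pi) + 0.9 < mu(h) < sqrt(2h/pi) + 1,
# and sigma(h)^2 < 1, using the exact expressions (6) and (8).
def mu(h):
    return 1 + h * comb(h, h // 2) / 2 ** h

def sigma2(h):
    return 1 + h * (h - 1) * comb(h // 2, h // 4) ** 2 / 2 ** (h + 1) \
             - h * h * comb(h, h // 2) ** 2 / 2 ** (2 * h)

for h in range(4, 101, 4):
    assert sqrt(2 * h / pi) + 0.9 < mu(h) < sqrt(2 * h / pi) + 1   # (15)
    assert sigma2(h) < 1                                           # (14)
print(sigma2(4))   # 0.25, cf. Remark 2.10
```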
3 A probabilistic lower bound

We now prove lower bounds on $\mathcal{D}(n)$ and $\mathcal{R}(n)$ where, as usual, $n = h+d$ and $h$ is the order of a Hadamard matrix. The key result is Lemma 3.2. Theorem 3.4 simply converts the result of Lemma 3.2 into lower bounds on $\mathcal{D}(n)$ and $\mathcal{R}(n)$, giving away a little for the sake of simplicity in the latter case.

For the proof of Lemma 3.2 we need the following bound on the determinant of a matrix which is "close" to the identity matrix. It is due to Ostrowski [22, eqn. (5,5)]; see also [8, Corollary 1].

Lemma 3.1 (Ostrowski). If $M = I - E \in \mathbb{R}^{d \times d}$, $|e_{ij}| \le \varepsilon$ for $1 \le i,j \le d$, and $d\varepsilon \le 1$, then
$$\det(M) \ge 1 - d\varepsilon.$$

The idea of Lemma 3.2 is that we can, with positive probability, apply Lemma 3.1 to the matrix $M = \mu^{-1}G$, thus obtaining a lower bound on the maximum value attained by $\det(G)$.

Lemma 3.2. Suppose $d \ge 1$, $4 \le h \in \mathcal{H}$, $n = h+d$, and $G$ is as in §2.2. Then, with positive probability,
$$\frac{\det G}{\mu^d} \ge 1 - \frac{d^2}{\mu}. \tag{18}$$

Proof. Let $\lambda$ be a positive parameter to be chosen later, and $\mu = \mu(h)$. We say that $G$ is good if the conditions of Lemma 3.1 apply with $M = \mu^{-1}G$ and $\varepsilon = \lambda/\mu$. Otherwise $G$ is bad.

Assume $1 \le i,j \le d$. From Lemma 2.6, $V[g_{ij}] = 1$ for $i \ne j$; from Lemma 2.9, $V[g_{ii}] = \sigma^2 < 1$. It follows from Chebyshev's inequality [10] that
$$P[|g_{ij}| \ge \lambda] \le \frac{1}{\lambda^2} \quad \text{for } i \ne j,$$
and
$$P[|g_{ii} - \mu| \ge \lambda] \le \frac{\sigma^2}{\lambda^2}.$$
Thus,
$$P[G \text{ is bad}] \le \frac{d(d-1)}{\lambda^2} + \frac{d\sigma^2}{\lambda^2} < \frac{d^2}{\lambda^2}.$$
Taking $\lambda = d$ gives $P[G \text{ is bad}] < 1$, so $P[G \text{ is good}]$ is positive. Whenever $G$ is good we can apply Lemma 3.1 to $\mu^{-1}G$, obtaining
$$\mu^{-d}\det(G) = \det(\mu^{-1}G) \ge 1 - d\varepsilon = 1 - d\lambda/\mu = 1 - d^2/\mu.$$
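The "positive probability" in Lemma 3.2 can be exhibited by brute force for a small case. In the sketch below (ours, not the paper's), we take $h = 8$, $d = 1$ with the Sylvester Hadamard matrix; then $G = (f_{11} + 1)$ is $1 \times 1$, and we count how many of the $2^h$ choices of the random column satisfy (18):

```python
from fractions import Fraction
from itertools import product
from math import comb

# Demonstration of Lemma 3.2 for h = 8, d = 1 (Sylvester Hadamard matrix).
H2 = [[1, 1], [1, -1]]
def kron(A, B):
    # Kronecker product of two square +-1 matrices.
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]
H8 = kron(H2, kron(H2, H2))
h, d = 8, 1
sgn = lambda x: 1 if x >= 0 else -1

mu = 1 + Fraction(h * comb(h, h // 2), 2 ** h)   # mu(8) = 51/16
good = 0
for b in product((1, -1), repeat=h):
    v = [sum(H8[i][k] * b[i] for i in range(h)) for k in range(h)]  # A^T b
    g = Fraction(sum(sgn(x) * x for x in v), h) + 1                 # det(G)
    if g / mu >= 1 - Fraction(d * d) / mu:       # inequality (18)
        good += 1
print(mu, good, 2 ** h)   # a positive fraction of the choices is good
assert good > 0
```

Note that not every choice is good (e.g. $b$ equal to a column of the Hadamard matrix gives $g = 2 < \mu - 1$), which is why the lemma claims only positive probability.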
