FAST INITIAL CONDITIONS FOR GLAUBER DYNAMICS

EYAL LUBETZKY AND ALLAN SLY

arXiv:1701.06042v1 [math.PR] 13 Jan 2017

Abstract. In the study of Markov chain mixing times, analysis has centered on the performance from a worst-case starting state. Here, in the context of Glauber dynamics for the one-dimensional Ising model, we show how new ideas from information percolation can be used to establish mixing times from other starting states. At high temperatures we show that the alternating initial condition is asymptotically the fastest one, and, surprisingly, its mixing time is faster than at infinite temperature, accelerating as the inverse-temperature β ranges from 0 to β₀ = (1/2) arctanh(1/3). Moreover, the dominant test function depends on the temperature: at β < β₀ it is autocorrelation, whereas at β > β₀ it is the Hamiltonian.

1. Introduction

In the study of mixing time of Markov chains, most of the focus has been on determining the asymptotics of the worst-case mixing time, while relatively little is known about the relative effect of different initial conditions. The latter is quite natural from an algorithmic perspective on sampling, since one would ideally initiate the dynamics from the fastest initial condition. However, until recently, the tools available for analyzing Markov chains on complex systems, such as the Ising model, were insufficient for the purpose of comparing the effect of different starting states; indeed, already pinpointing the asymptotics of the worst-case state for Glauber dynamics for the Ising model can be highly nontrivial.

In this paper we compare different initial conditions for the Ising model on the cycle. In earlier work [11], we analyzed three different initial conditions. The all-plus state is provably the worst initial condition up to an additive constant. Another is a quenched random condition chosen from ν, the uniform distribution on configurations, which with high probability has a mixing time which is asymptotically as slow.
A third initial condition is an annealed random condition chosen from ν, i.e., to start at time 0 from the uniform distribution, which is asymptotically twice as fast as all-plus.

Here we consider two natural deterministic initial configurations. The first is the alternating sequence

    x_alt(i) = { 1 if i ≡ 0 (mod 2); −1 if i ≡ 1 (mod 2) },

which we will show is asymptotically the fastest deterministic initial condition—yet strictly slower than starting from the annealed random condition—for all β < β₀ := (1/2) arctanh(1/3) (at β = β₀ they match). The second is the bi-alternating sequence

    x_blt(i) = { 1 if i ≡ 0, 3 (mod 4); −1 if i ≡ 1, 2 (mod 4) }.

For convenience we will assume that n is a multiple of 4, which ensures that the configurations are semi-translation invariant and turns both sequences into eigenvectors of the transition matrix of simple random walk on the cycle. (This is not necessary for the main result but leads to cleaner analysis.)

Figure 1. Asymptotic mixing time from the alternating and bi-alternating initial conditions as per Theorem 1, compared to the known behavior of worst-case (all-plus) and random initial conditions. [Figure: t_mix as a function of β for the all-plus/quenched, bi-alternating, alternating, and annealed conditions, with marked values (1/2) log n, (3/8) log n, (1/4) log n and marked points β = (1/2) arctanh(1/3) and β = (1/2) arctanh(1/2).]

In what follows, set θ = θ_β = 1 − tanh(2β), and let t_mix^{x₀}(ε) denote the time it takes the dynamics to reach total variation distance at most ε from stationarity, starting from the initial condition x₀.

Theorem 1. For every β > 0 and 0 < ε < 1 there exist C(β) and N(β, ε) such that the following hold for Glauber dynamics for the Ising model on the cycle Z/nZ at inverse-temperature β for all n > N.

(i) Alternating initial condition:

    |t_mix^{x_alt}(ε) − max{1/(4−2θ), 1/(4θ)} log n| ≤ C log log n.

(ii) Bi-alternating initial condition:

    |t_mix^{x_blt}(ε) − max{1/2, 1/(4θ)} log n| ≤ C log log n.

Surprisingly, the mixing time for the alternating initial condition begins as actually faster than the infinite-temperature model: it decreases as a function of β before increasing when β > (1/2) arctanh(1/3).

The following theorem summarizes the bounds we proved in [8,11] for the all-plus and random initial conditions. See Figure 1 for the relative performance of all these different initial conditions.

Theorem 2 ([8,11]). In the same setting of Theorem 1, the following hold.

(i) All-plus initial condition x⁺ ≡ 1:

    |t_mix^{x⁺}(ε) − (1/(2θ)) log n| ≤ C log log n.

(ii) Quenched random initial condition:

    ν({x₀ : |t_mix^{x₀}(ε) − (1/(2θ)) log n| ≤ C log log n}) → 1 as n → ∞.

(iii) Annealed random initial condition:

    |t_mix^{ν}(ε) − (1/(4θ)) log n| ≤ C log log n.

(Note that, in the case of the all-plus initial condition, the mixing time t_mix^{x⁺}(ε) is known in higher precision: it was shown [8,11] to be within an additive constant (depending on ε and β) of (1/(2θ)) log n.)

The upper bounds on the mixing times in Theorem 1 rely on the information percolation framework introduced by the authors in [11]. The asymptotically matching lower bounds in that theorem are derived from two test functions: the autocorrelation function, which for instance matches our upper bound on the alternating initial condition for β > β₀; and the Hamiltonian test function, which gives rise to the following lower bound on every deterministic initial condition.

Proposition 3. Let X_t be Glauber dynamics for the Ising model on Z/nZ at inverse-temperature β. For every sequence of deterministic initial conditions x₀, the dynamics at time

    t⋆ = (1/(4−2θ)) log n − 8 log log n

is at total variation distance 1 − o(1) from equilibrium; that is,

    lim_{n→∞} inf_{x₀} ‖P_{x₀}(X_{t⋆} ∈ ·) − π‖_tv = 1.

As a consequence of this result and Theorem 1, Part (i), we see that the initial condition x_alt is indeed the optimal deterministic one in the range β < β₀, and that β₀ marks the smallest β where a deterministic initial condition can first match the performance of the annealed random condition.

The mixing time estimates in Theorem 1 (as well as those in Theorem 2) imply, in particular, that Glauber dynamics for the Ising model on the cycle, from the respective starting configurations, exhibits the cutoff phenomenon—a sharp transition in its distance from stationarity, which drops along a negligible time period known as the cutoff window (here, O(log log n), vs. t_mix which is of order log n) from near its maximum to near 0. Until recently, only relatively few occurrences of this phenomenon, which was discovered by Aldous and Diaconis in the early 1980's (see [1,2,4,5]), were rigorously verified, even though it is believed to be widespread (e.g., Peres conjectured [6, Conjecture 1], [7, §23.2] cutoff for the Ising model on any sequence of transitive graphs when the mixing time is of order log n); see [7, §18].

For the Ising model on the cycle, the longstanding lower and upper bounds on t_mix from a worst-case initial condition differed by a factor of 2—in our notation, ((1−o(1))/(2θ)) log n and ((1+o(1))/θ) log n—while cutoff was conjectured to occur (see, e.g., [7, Theorem 15.4], as well as [7, pp. 214, 248 and Question 8 in p. 300]). This was confirmed in [8], where the above lower bound was shown to be tight, via a proof that relied on log-Sobolev inequalities and applied to Z^d, for any dimension d ≥ 1, so long as the system features a certain decay-of-correlation property known as strong spatial mixing. This result was reproduced in [11] (with a finer estimate for the cutoff window) via the new information percolation method.
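The temperature dependence in Theorems 1–2 is easy to probe numerically. Below is a minimal sketch (plain Python; the function names are our own) that evaluates θ_β = 1 − tanh(2β) and the coefficients of log n from the two theorems, and checks that the alternating rate max{1/(4−2θ), 1/(4θ)} is minimized at β₀ = (1/2) arctanh(1/3), where θ = 2/3 and the rate equals the annealed coefficient 3/8:

```python
import math

def theta(beta):
    # theta_beta = 1 - tanh(2*beta), as defined before Theorem 1
    return 1.0 - math.tanh(2.0 * beta)

def rate_alt(beta):
    # coefficient of log n in Theorem 1(i): max{1/(4-2θ), 1/(4θ)}
    th = theta(beta)
    return max(1.0 / (4.0 - 2.0 * th), 1.0 / (4.0 * th))

def rate_blt(beta):
    # coefficient of log n in Theorem 1(ii): max{1/2, 1/(4θ)}
    th = theta(beta)
    return max(0.5, 1.0 / (4.0 * th))

def rate_plus(beta):
    # worst-case (all-plus) coefficient from Theorem 2(i): 1/(2θ)
    return 1.0 / (2.0 * theta(beta))

# the critical point beta_0 = (1/2) arctanh(1/3), approximately 0.1733
beta0 = 0.5 * math.atanh(1.0 / 3.0)
```

For small β > 0 the alternating coefficient drops below its β = 0 value of 1/2 (the "faster than infinite temperature" phenomenon), while the bi-alternating coefficient stays flat at 1/2 until β exceeds (1/2) arctanh(1/2).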
Soon after, a remarkably short proof of cutoff for the cycle—crucially hinging on the correspondence between the one-dimensional Ising model and the "noisy voter" model—was obtained by Cox, Peres and Steif [3]. It is worthwhile noting that the arguments both in [3] and in [8] are tailored to worst-case analysis, and do not seem to be able to treat specific initial conditions as examined here. In contrast, the information percolation approach does allow one to control the subtle effect of various initial conditions on mixing.

To conclude this section, we conjecture that Proposition 3 also holds for t⋆ = max{(1−o(1))/(4−2θ), (1−o(1))/(4θ)} log n, i.e., that x_alt is asymptotically fastest among all the deterministic initial conditions at all β > 0. We further conjecture that the obvious generalization of x_alt to (Z/nZ)^d for d ≥ 2 (a checkerboard for d = 2) is the analogous fastest deterministic initial condition throughout the high-temperature regime.

2. Update support and information percolation

In this section we define the update support and use the framework of information percolation (see the papers [9,12] as well as the survey paper [10] for an exposition of this method) to upper bound the total variation distance with alternating and bi-alternating initial conditions.

2.1. Basic notation. The Ising model on a finite graph G with vertex-set V and edge-set E is a distribution over the set of configurations Ω = {±1}^V; each σ ∈ Ω is an assignment of plus/minus spins to the sites in V, and the probability of σ ∈ Ω is given by the Gibbs distribution

    π(σ) = Z⁻¹ e^{β Σ_{uv∈E} σ(u)σ(v)},    (2.1)

where Z is a normalizer (the partition function) and β is the inverse-temperature, here taken to be non-negative (ferromagnetic). The (continuous-time) heat-bath Glauber dynamics for the Ising model is the Markov chain—reversible w.r.t. the Ising measure π—where each site is associated with a rate-1 Poisson clock, and as the clock at some site u rings, the spin of u is replaced by a sample from the marginal of π given all other spins. See [13] for an extensive account of this dynamics. In this paper we focus on the graph G = Z/nZ and will let X_t denote the Glauber dynamics Markov chain on G.

An important notion of measuring the convergence of a Markov chain (X_t) to its stationary measure π is its total-variation mixing time, denoted t_mix(ε) for a precision parameter 0 < ε < 1. From initial condition x₀ we denote

    t_mix^{x₀}(ε) = inf{t : ‖P_{x₀}(X_t ∈ ·) − π‖_tv ≤ ε},

and the overall mixing time as measured from a worst-case initial condition is

    t_mix(ε) = max_{x₀∈Ω} t_mix^{x₀}(ε),

where here and in what follows P_{x₀} denotes the probability given X₀ = x₀, and the total-variation distance ‖μ₁ − μ₂‖_tv is defined as max_{A⊂Ω} |μ₁(A) − μ₂(A)| = (1/2) Σ_{σ∈Ω} |μ₁(σ) − μ₂(σ)|.

2.2. Information percolation clusters. The dynamics can be viewed as a deterministic function of X₀ and a random "update sequence" of the form (J₁,U₁,t₁), (J₂,U₂,t₂), ..., where 0 < t₁ < t₂ < ... are the update times (the ringing of the Poisson clocks), the J_i's are i.i.d. uniformly chosen sites (which clocks ring), and the U_i's are i.i.d. uniform variables on [0,1] (to generate coin tosses). There are a variety of ways to encode such updates, but in the case of the one-dimensional model there is a particularly useful one. We add an extra variable S_i, a randomly selected neighbor of J_i. Then, given the sequence of (J_i, S_i, U_i, t_i), the updates are processed sequentially as follows: set t₀ = 0; the configuration X_t for all t ∈ [t_i, t_{i+1}) (i ≥ 1) is obtained by updating the site J_i via the unit variable as follows: if U_i ≤ θ = 1 − tanh(2β), update the spin at J_i to a uniformly random value; otherwise (with probability 1 − θ) set it to the spin of S_i.
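That this (J_i, S_i, U_i) encoding reproduces the heat-bath rule can be checked directly: given its two neighbors, a freshly updated spin equals +1 with Gibbs probability e^{βs}/(e^{βs}+e^{−βs}), where s is the sum of the neighboring spins. A minimal sketch (plain Python; the function names are our own) comparing the two conditional laws, together with a one-update simulator using the encoding:

```python
import math
import random

def gibbs_prob_plus(beta, left, right):
    # heat-bath conditional: P(new spin = +1 | two neighbouring spins)
    h = beta * (left + right)
    return math.exp(h) / (math.exp(h) + math.exp(-h))

def encoded_prob_plus(beta, left, right):
    # under the encoding: w.p. theta a fair coin, w.p. 1-theta copy a uniform neighbour
    theta = 1.0 - math.tanh(2.0 * beta)
    p_neighbour_plus = ((left == 1) + (right == 1)) / 2.0
    return theta / 2.0 + (1.0 - theta) * p_neighbour_plus

def update(sigma, beta, rng):
    # one ring of a Poisson clock on the cycle Z/nZ, processed via (J, S, U)
    n = len(sigma)
    theta = 1.0 - math.tanh(2.0 * beta)
    J = rng.randrange(n)
    S = (J + rng.choice((-1, 1))) % n  # uniformly chosen neighbour of J
    sigma[J] = rng.choice((-1, 1)) if rng.random() <= theta else sigma[S]
```

When both neighbors agree, both formulas give (1 ± tanh 2β)/2, and when they disagree both give 1/2, which is why the "noisy voter" representation is exact in one dimension.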
With this description of the dynamics, we can work backwards to describe how the configuration at time t⋆ (or at any intermediate time) depends on the initial condition. The update support function, denoted Fs(A, s₁, s₂), as introduced in [8], is the random set whose value is the minimal subset S ⊂ Λ which determines the spins of A given the update sequence along the interval (s₁, s₂].

We now describe the support of a vertex v ∈ V as it evolves backwards in time from s₂ to s₁. Initially, Fs(v, s₂, s₂) = {v}; then, updates in reverse chronological order alter the support: given the next update (J_i, S_i, U_i, t_i), if J_i = Fs(v, t_{i+1}, s₂) and U_i ≤ θ then Fs(v, t_i, s₂) is set to ∅, and if U_i > θ then it is set to S_i. Thus, backwards in time, Fs(v, t, s₂) performs a continuous-time simple random walk with jump rate 1 − θ which is killed at rate θ. We refer to the full trajectory of the update support of a vertex as the history of the vertex. The survival time for a walk is exponential, and so for t₁ ≤ t₂,

    P(Fs(v, t₁, t₂) ≠ ∅) = e^{−(t₂−t₁)θ}.    (2.2)

For general sets A we have that Fs(A, s₁, s₂) = ∪_{v∈A} Fs(v, s₁, s₂), and taken together the collection of the update supports of the vertices is a set of coalescing killed continuous-time random walks. A key use of these histories is to effectively bound the spread of information, as achieved by the following lemma.

Lemma 2.1. For any t we have that

    P( max_{v∈Z/nZ} max_{0≤s≤t : Fs(v,s,t)≠∅} |v − Fs(v, s, t)| ≥ (1/10) log² n ) ≤ O(n⁻¹⁰).

Proof. By equation (2.2) we have that P[Fs(Z/nZ, t − log^{3/2} n, t) ≠ ∅] = O(n⁻¹⁰), so it is sufficient to show that

    P( max_{t−log^{3/2}n ≤ s ≤ t : Fs(v,s,t)≠∅} |v − Fs(v, s, t)| ≥ (1/10) log² n ) ≤ O(n⁻¹¹).

This probability is bounded above by the probability of a rate-(1−θ) continuous-time random walk making at least (1/10) log² n jumps by time log^{3/2} n. This is exactly the probability that a Poisson with mean (1−θ) log^{3/2} n is at least (1/10) log² n, which satisfies the required bound by standard tail bounds. ∎

3. Upper bounds

We will consider the dynamics run up to time t⋆ and derive an upper bound on its mixing time. We will first estimate the total variation distance, not of the full dynamics, but simply at a single vertex from initial conditions x_alt and x_blt.

Lemma 3.1. For v ∈ Z/nZ we have that

    ‖P_{x_alt}(X_{t⋆}(v) ∈ ·) − π|_v‖_tv = (1/2) e^{−(2−θ)t⋆},
    ‖P_{x_blt}(X_{t⋆}(v) ∈ ·) − π|_v‖_tv = (1/2) e^{−t⋆}.

Proof. We will begin with the case of initial condition x_alt. Of course π|_v is the uniform measure on {±1}. The history Fs(v, t, t⋆) is killed before time 0 with probability 1 − e^{−θt⋆}, and on this event X_{t⋆}(v) is uniform on {±1}. Condition that it survives to time 0 and let Y(s) = x_alt(Fs(v, t⋆ − s, t⋆)). This is simply a continuous-time random walk on {±1} which switches state at rate 1 − θ. Thus,

    P(Y(s) = a) = { 1/2 + (1/2) e^{−2(1−θ)s} if a = x_alt(v); 1/2 − (1/2) e^{−2(1−θ)s} otherwise }.

It therefore follows that ‖P(Y(t⋆) ∈ ·) − π|_v‖_tv = (1/2) e^{−2(1−θ)t⋆}, and altogether,

    ‖P_{x_alt}(X_{t⋆}(v) ∈ ·) − π|_v‖_tv = (1/2) e^{−2(1−θ)t⋆} e^{−θt⋆} = (1/2) e^{−(2−θ)t⋆}.

The case of x_blt follows similarly, with the exception that Y(s) has jump rate (1/2)(1−θ), since it only switches sign with probability 1/2 at each step. ∎

3.1. Update support. In this subsection we analyse the geometry of the update support, similarly to [8], in order to approximate the Markov chain as a product measure. Let κ = 4/(1−θ) and define the support time as

    t₋ = t⋆ − κ log log n.
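Equation (2.2) and Lemma 3.1 lend themselves to a quick Monte Carlo sanity check. The sketch below (plain Python; a simplification of our own that tracks a single history on Z rather than on the cycle, which is harmless for short times) runs the support of one vertex backwards: at each rate-1 update it dies with probability θ, and otherwise jumps to a uniform neighbor. The survival frequency should approach e^{−θt}, and the signed average of x_alt at the surviving endpoint should approach e^{−(2−θ)t}, the exponential appearing in Lemma 3.1.

```python
import math
import random

def backward_history(t, theta, rng):
    # trace one vertex's support back for time t: a rate-(1-theta) simple random
    # walk killed at rate theta; returns the endpoint if it survives, else None
    pos, remaining = 0, t
    while True:
        remaining -= rng.expovariate(1.0)  # waiting time to the next update
        if remaining <= 0:
            return pos                     # survived back to time 0
        if rng.random() <= theta:
            return None                    # killed by an oblivious update
        pos += rng.choice((-1, 1))

def estimate_bias(t, beta, samples, seed=0):
    # returns (survival frequency, E[x_alt(endpoint); survival]) with x_alt(i) = (-1)^i
    theta = 1.0 - math.tanh(2.0 * beta)
    rng = random.Random(seed)
    surv = bias = 0
    for _ in range(samples):
        p = backward_history(t, theta, rng)
        if p is not None:
            surv += 1
            bias += (-1) ** p
    return surv / samples, bias / samples
```

With, say, t = 1 and β = 0.5 the two estimates land close to e^{−θ} and e^{−(2−θ)} respectively, matching the decomposition "kill probability × two-state flip decay" in the proof of Lemma 3.1.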
By Lemma 2.1 we expect the histories to not travel "too far" along the time-interval t₋ to t⋆; precisely, if we define B as the event

    B = { max_{v∈Z/nZ} max_{t₋≤s≤t⋆ : Fs(v,s,t⋆)≠∅} |v − Fs(v, s, t⋆)| ≤ (1/10) log² n },

then by Lemma 2.1,

    P(B) ≥ 1 − n⁻¹⁰.    (3.1)

The following event says that the support at time t₋ clusters into small, well-separated components. Let A be the event that there exists a set of intervals W₁, ..., W_m ⊂ Z/nZ that (i) cover the support:

    {x : Fs(x, t₋, t⋆) ≠ ∅} ⊂ ∪_i W_i,    (3.2)

(ii) have logarithmic size:

    max_i |W_i| ≤ log³ n,    (3.3)

and (iii) are well-separated:

    min_{i,i'} d(W_i, W_{i'}) ≥ log² n.    (3.4)

Lemma 3.2. We have that P(A) ≥ 1 − O(n⁻⁹).

Proof. Define the following intervals on Z/nZ:

    M_i = {2i log² n, ..., (2i+1) log² n}    (1 ≤ i ≤ n/(2 log² n)).

Restricting B to ∪_i M_i, we let

    B′ = { max_{v∈∪_i M_i} max_{t₋≤s≤t⋆ : Fs(v,s,t⋆)≠∅} |v − Fs(v, s, t⋆)| ≤ (1/10) log² n }.

Since B′ ⊃ B we have that P(B′) ≥ 1 − n⁻¹⁰ by (3.1). Next, let D_i be the event

    D_i = {Fs(M_i, t₋, t⋆) = ∅}.

By a union bound and equation (2.2), we have that

    P(D_i) ≥ 1 − |M_i| e^{−(1−θ)κ log log n} ≥ 1 − 1/log n,

and so

    P(D_i^c | B′) ≤ P(D_i^c)/P(B′) ≤ 2/log n.

Moreover, conditional on B′ the events D_i are conditionally independent, since the history of M_i is determined by the updates within the set {v : d(v, M_i) ≤ (1/10) log² n}, and these sets are disjoint. Hence, for all i,

    P(D_i^c, D_{i+1}^c, ..., D_{i+(1/10) log n}^c | B′) ≤ (2/log n)^{(1/10) log n} ≤ n⁻¹⁰;

hence,

    P(D_i^c, D_{i+1}^c, ..., D_{i+(1/10) log n}^c) ≤ P(D_i^c, D_{i+1}^c, ..., D_{i+(1/10) log n}^c | B′) + P(B′^c) ≤ 2n⁻¹⁰.

Taking a union bound over all i we have that

    P(∃i : D_i^c, D_{i+1}^c, ..., D_{i+(1/10) log n}^c) ≤ n⁻⁹.

We have thus arrived at the following: with probability at least 1 − n⁻⁹, for every v ∈ Z/nZ there exists a block of log² n consecutive vertices whose histories are killed before t₋ within distance (1/5) log³ n on both the right and the left, implying the existence of the decomposition and completing the lemma. ∎

When the event A holds we will assume that there is some canonical choice of the W_i's. We set

    V_i = Fs(W_i, t₋, t⋆).    (3.5)

On the event that both A and B hold, the sets V_i are disjoint, and satisfy

    min_{i,i'} d(V_i, V_{i'}) ≥ (1/2) log² n  and  max_i diam(V_i) ≤ 2 log³ n.    (3.6)

We will make use of Lemma 3.3 from [9], a special case of which is the following.

Lemma 3.3 ([9]). For any 0 ≤ s ≤ t and any set of vertices W we have that

    ‖P_{x₀}(X_t(W) ∈ ·) − π|_W‖_tv ≤ E[ ‖P_{x₀}(X_s(Fs(W, s, t)) ∈ ·) − π|_{Fs(W,s,t)}‖_tv ].

Using this result, we have that

    ‖P_{x₀}(X_{t⋆} ∈ ·) − π‖_tv ≤ E[ ‖P_{x₀}(X_{t₋}(∪_i V_i) ∈ ·) − π|_{∪_i V_i}‖_tv ].

3.2. Coupling with product measures. On the event A ∩ B we couple X_{t₋}(∪_i V_i) and π|_{∪_i V_i} with product measures. Since the V_i's depend only on the updates along the interval [t₋, t⋆] and are independent of the dynamics up to time t₋, we will treat the V_i as fixed deterministic sets satisfying (3.6). Let (π⁽¹⁾, ..., π⁽ᵐ⁾) be a product measure of m copies of π. Then, by the exponential decay of correlation of the one-dimensional Ising model,

    ‖(π⁽¹⁾|_{V₁}, ..., π⁽ᵐ⁾|_{V_m}) − π|_{∪_i V_i}‖_tv ≤ n⁻¹⁰.    (3.7)

Next, let X_t⁽¹⁾, ..., X_t⁽ᵐ⁾ be m independent copies of the dynamics up to time t₋.
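The exponential decay of correlations invoked for (3.7) is explicit in one dimension: for the zero-field Ising model on the cycle Z/nZ, the transfer-matrix computation gives E[σ(0)σ(d)] = (t^d + t^{n−d})/(1 + t^n) with t = tanh β, so across gaps of order (1/2) log² n (as guaranteed by (3.6)) correlations are of size n^{−c log n}, far below the n⁻¹⁰ in (3.7). A brute-force cross-check on a small cycle (plain Python; an illustration of our own, not taken from the paper):

```python
import math
from itertools import product

def corr_bruteforce(n, beta, d):
    # E[sigma(0) sigma(d)] on the cycle Z/nZ by enumerating all 2^n configurations
    Z = corr = 0.0
    for sigma in product((-1, 1), repeat=n):
        w = math.exp(beta * sum(sigma[i] * sigma[(i + 1) % n] for i in range(n)))
        Z += w
        corr += sigma[0] * sigma[d] * w
    return corr / Z

def corr_transfer(n, beta, d):
    # transfer-matrix closed form: (t^d + t^(n-d)) / (1 + t^n) with t = tanh(beta)
    t = math.tanh(beta)
    return (t ** d + t ** (n - d)) / (1.0 + t ** n)
```

Since tanh β < 1 for all finite β, the closed form makes the exponential decay (and hence the accuracy of the product-measure approximation over well-separated clusters) quantitative.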
Define the event

    E = { max_{v∈∪_i V_i} max_{0≤s≤t₋ : Fs(v,s,t₋)≠∅} |v − Fs(v, s, t₋)| ≤ (1/10) log² n },

and for each 1 ≤ j ≤ m define the analogous event

    E⁽ʲ⁾ = { max_{v∈∪_i V_i} max_{0≤s≤t₋ : Fs⁽ʲ⁾(v,s,t₋)≠∅} |v − Fs⁽ʲ⁾(v, s, t₋)| ≤ (1/10) log² n },

where Fs⁽ʲ⁾ is the support function for the dynamics X_t⁽ʲ⁾. From Lemma 2.1, together with a union bound, we infer that

    P(E) = P(E⁽ʲ⁾) ≥ 1 − O(n⁻¹⁰).    (3.8)

Let X̃_t denote X_t conditioned on E and, similarly, let X̃_t⁽ʲ⁾ denote X_t⁽ʲ⁾ conditioned on E⁽ʲ⁾. Then

    ‖P(X̃_{t₋}⁽ʲ⁾(V_j) ∈ ·) − P(X_{t₋}⁽ʲ⁾(V_j) ∈ ·)‖_tv ≤ P(E⁽ʲ⁾ᶜ) ≤ n⁻¹⁰,

and so

    ‖P((X̃_{t₋}⁽¹⁾(V₁), ..., X̃_{t₋}⁽ᵐ⁾(V_m)) ∈ ·) − P((X_{t₋}⁽¹⁾(V₁), ..., X_{t₋}⁽ᵐ⁾(V_m)) ∈ ·)‖_tv ≤ n⁻⁹.

Now, since the laws of the X̃_{t₋}(V_i) for distinct i depend on disjoint sets of updates, they are independent and equal in distribution to X̃_{t₋}⁽ⁱ⁾(V_i); hence

    (X̃_{t₋}⁽¹⁾(V₁), ..., X̃_{t₋}⁽ᵐ⁾(V_m)) =_d (X̃_{t₋}(V₁), ..., X̃_{t₋}(V_m)).

Since X̃ is X conditioned on E,

    ‖P((X̃_{t₋}(V₁), ..., X̃_{t₋}(V_m)) ∈ ·) − P((X_{t₋}(V₁), ..., X_{t₋}(V_m)) ∈ ·)‖_tv ≤ P(Eᶜ) ≤ n⁻¹⁰.

Combining the previous three equations we find that

    ‖P((X_{t₋}⁽¹⁾(V₁), ..., X_{t₋}⁽ᵐ⁾(V_m)) ∈ ·) − P((X_{t₋}(V₁), ..., X_{t₋}(V_m)) ∈ ·)‖_tv ≤ 2n⁻⁹.    (3.9)

Thus, to show that ‖P_{x₀}(X_{t⋆} ∈ ·) − π‖_tv → 0 it is sufficient to prove that

    ‖P((X_{t₋}⁽¹⁾(V₁), ..., X_{t₋}⁽ᵐ⁾(V_m)) ∈ ·) − (π⁽¹⁾|_{V₁}, ..., π⁽ᵐ⁾|_{V_m})‖_tv → 0.    (3.10)

3.3. Local L² distance. Let L = 10, and for each i set

    S_i = inf{s : |Fs(V_i, t₋ − s, t₋)| ≤ L},

with S_i = 0 if |V_i| ≤ L. First we bound the right tail of the distribution of S_i. If |Fs(V_i, t₋ − s, t₋)| > L, then at least L + 1 histories from V_i have survived to time t₋ − s and not intersected. Hence, by equation (2.2),

    P(|Fs(V_i, t₋ − s, t₋)| > 10) ≤ C(|V_i|, L+1) e^{−(L+1)sθ} ≤ e^{−(L+1)sθ} log^{3(L+1)} n,

where C(·,·) denotes the binomial coefficient. Therefore, for 0 < s < t₋ we see that

    P(S_i ≥ s) ≤ e^{−s(L+1)θ} log^{3(L+1)} n.    (3.11)

Let I denote the event that S_i < t₋ for all i. By (3.11),

    P(Iᶜ) ≤ e^{−(L+1)θt₋} n log^{3(L+1)} n,    (3.12)

and so t⋆ ≥ (2/(Lθ)) log n implies that P(I) → 1. On the event I, we define

    U_i = Fs(V_i, t₋ − S_i, t₋).

Applying Lemma 3.3 we have that

    ‖P((X_{t₋}⁽¹⁾(V₁), ..., X_{t₋}⁽ᵐ⁾(V_m)) ∈ ·) − (π⁽¹⁾|_{V₁}, ..., π⁽ᵐ⁾|_{V_m})‖_tv
        ≤ ‖P((X_{t₋−S₁}⁽¹⁾(U₁), ..., X_{t₋−S_m}⁽ᵐ⁾(U_m)) ∈ ·) − (π⁽¹⁾|_{U₁}, ..., π⁽ᵐ⁾|_{U_m})‖_tv.    (3.13)

Lemma 3.4. There exists C = C(β) > 0 such that, for every |U_i| ≤ L and 0 ≤ S_i < t₋,

    ‖P_{x₀}(X_{t₋−S_i}⁽ⁱ⁾(U_i) ∈ · | U_i, S_i) − π⁽ⁱ⁾|_{U_i}‖_tv ≤ { C t₋ exp[−(t₋ − S_i) min{2θ, 2−θ}] if x₀ = x_alt;
                                                                  C t₋ exp[−(t₋ − S_i) min{2θ, 1}] if x₀ = x_blt }.

Proof. We will consider the case of x_alt; the proof for x_blt follows similarly. Let R_i denote the first time the history coalesces to a single point:

    R_i = inf{r : |Fs(U_i, t₋ − S_i − r, t₋ − S_i)| ≤ 1},

with the convention R_i = t₋ − S_i if |Fs(U_i, 0, t₋ − S_i)| ≥ 2. By equation (2.2),

    P(R_i > r | U_i, S_i) ≤ C(L, 2) e^{−2rθ}.

Denote the vertex a_i = Fs(U_i, t₋ − S_i − R_i, t₋ − S_i). By Lemmas 3.1 and 3.3 we have that

    ‖P(X_{t₋−S_i}⁽ⁱ⁾(U_i) ∈ · | U_i, S_i) − π⁽ⁱ⁾|_{U_i}‖_tv
        ≤ E[ ‖P(X_{t₋−S_i−R_i}⁽ⁱ⁾(a_i) ∈ · | U_i, S_i) − π⁽ⁱ⁾|_{a_i}‖_tv | U_i, S_i ]
        ≤ E[ e^{−(2−θ)(t₋−S_i−R_i)} | U_i, S_i ].    (3.14)
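The conditional expectation on the right-hand side of (3.14) is controlled by partitioning according to the integer part of R_i, producing a sum that is dominated (up to the factor t₋) by its largest term. That step can be sanity-checked numerically; in the sketch below (plain Python; the constant 200 is an empirical choice of our own, standing in for the C(β) of the lemma) we evaluate Σ_k C(L,2) e^{−2(k−1)θ} e^{−(2−θ)(T−k)} with T = t₋ − S_i and compare it with T·e^{−T·min{2θ, 2−θ}}:

```python
import math

L = 10  # the cluster-size cutoff fixed in Section 3.3

def lhs(T, theta):
    # the partitioned sum bounding E[exp(-(2-theta)(T - R_i))]
    b = math.comb(L, 2)
    return sum(b * math.exp(-2.0 * (k - 1) * theta)
                 * math.exp(-(2.0 - theta) * (T - k))
               for k in range(1, math.ceil(T) + 1))

def rhs(T, theta, C=200.0):
    # the claimed envelope: C * T * exp(-T * min{2 theta, 2 - theta})
    return C * T * math.exp(-T * min(2.0 * theta, 2.0 - theta))
```

The factor T in the envelope is genuinely needed near θ = 2/3, where 2θ = 2 − θ and all terms of the sum are of comparable size.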
We estimate the right hand side of (3.14) as follows:

    E[ e^{−(2−θ)(t₋−S_i−R_i)} | U_i, S_i ] ≤ Σ_{k=1}^{⌈t₋−S_i⌉} P(R_i ∈ (k−1, k)) e^{−(2−θ)(t₋−S_i−k)}
        ≤ Σ_{k=1}^{⌈t₋−S_i⌉} C(L, 2) e^{−2(k−1)θ} e^{−(2−θ)(t₋−S_i−k)}
        ≤ C t₋ e^{−(t₋−S_i) min{2θ, 2−θ}},    (3.15)

where the final inequality follows by taking the maximal term in the sum. This, together with (3.14), completes the proof of the lemma. ∎

We now appeal to the L¹-to-L² reduction developed in [8,9]. Recall that the L²-distance on measures is defined as

    ‖μ − π‖_{L²(π)} = ( Σ_x |μ(x)/π(x) − 1|² π(x) )^{1/2},

and set

    M_t = Σ_{i=1}^m ‖P_{x₀}(X_{t₋−S_i}⁽ⁱ⁾(U_i) ∈ · | U_i, S_i) − π⁽ⁱ⁾|_{U_i}‖²_{L²(π⁽ⁱ⁾|_{U_i})}.    (3.16)

By [9, Proposition 7],

    ‖P((X_{t₋−S₁}⁽¹⁾(U₁), ..., X_{t₋−S_m}⁽ᵐ⁾(U_m)) ∈ ·) − (π⁽¹⁾|_{U₁}, ..., π⁽ᵐ⁾|_{U_m})‖_tv ≤ √M_t.    (3.17)

We are now ready to prove the upper bound for the main theorem.

Proof of Theorem 1, upper bound. Again we focus on the case of x_alt. Set

    t⋆ = (1/((4−2θ)∧4θ)) log n + (κ + (3L+6)/((4−2θ)∧4θ)) log log n.

With this choice of t⋆ we have that P(Iᶜ) → 0 and so, by equations (3.10), (3.13) and (3.17), it is sufficient to show that

    E[M_t 1_I] → 0.    (3.18)

Since each vertex is either plus or minus with probability that is uniformly bounded below by e^{−2β}/(e^{−2β}+e^{2β}), given any choice of conditioning on the other vertices, we have that

    min_i min_{x∈{±1}^{U_i}} π|_{U_i}(x) ≥ (e^{−2β}/(e^{−2β}+e^{2β}))^L.

Comparing the L¹ and L² bounds we have that, for any measure μ and set U_i,

    ‖μ|_{U_i} − π|_{U_i}‖²_{L²(π|_{U_i})} = Σ_x (1/π|_{U_i}(x)) |μ|_{U_i}(x) − π|_{U_i}(x)|²
        ≤ 2^L ((e^{−2β}+e^{2β})/e^{−2β})^L max_{x∈{±1}^{U_i}} |μ|_{U_i}(x) − π|_{U_i}(x)|²
        ≤ 2^L ((e^{−2β}+e^{2β})/e^{−2β})^L ‖μ|_{U_i} − π|_{U_i}‖²_tv.

Thus, by Lemma 3.4,

    E[M_t 1_I] ≤ E[ 2^L ((e^{−2β}+e^{2β})/e^{−2β})^L Σ_{i=1}^m ‖P_{x₀}(X_{t₋−S_i}⁽ⁱ⁾(U_i) ∈ · | U_i, S_i) − π⁽ⁱ⁾|_{U_i}‖²_tv ]
        ≤ 2^L ((e^{−2β}+e^{2β})/e^{−2β})^L n E[ (C t₋ e^{−(t₋−S_i) min{2θ, 2−θ}})² ]
        ≤ C′(β) e^{−t₋ min{4θ, 4−2θ}} n log² n E[ e^{min{4θ, 4−2θ} S_i} ]
        = C′(β) (log n)^{−(3L+4)} E[ e^{min{4θ, 4−2θ} S_i} ],
