Probability Theory and Stochastic Processes with Applications

Oliver Knill

Overseas Press (India) Pvt. Ltd.

Copyright © Oliver Knill

Regd. Office: Overseas Press India Private Limited, 7/28, Ansari Road, Daryaganj, New Delhi-110 002. Email: [email protected], Website: www.overseaspub.com

Sales Office: Overseas Press India Private Limited, 2/15, Ansari Road, Daryaganj, New Delhi-110 002. Email: [email protected], Website: www.overseaspub.com

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher/authors.

Edition: 2009
10-digit ISBN: 81-89938-40-1
13-digit ISBN: 978-81-89938-40-6

Published by Narinder Kumar Lijhara for Overseas Press India Private Limited, 7/28, Ansari Road, Daryaganj, New Delhi-110002, and printed in India.

Contents

Preface

1 Introduction
1.1 What is probability theory?
1.2 Some paradoxes in probability theory
1.3 Some applications of probability theory

2 Limit theorems
2.1 Probability spaces, random variables, independence
2.2 Kolmogorov's 0-1 law, Borel-Cantelli lemma
2.3 Integration, Expectation, Variance
2.4 Results from real analysis
2.5 Some inequalities
2.6 The weak law of large numbers
2.7 The probability distribution function
2.8 Convergence of random variables
2.9 The strong law of large numbers
2.10 Birkhoff's ergodic theorem
2.11 More convergence results
2.12 Classes of random variables
2.13 Weak convergence
2.14 The central limit theorem
2.15 Entropy of distributions
2.16 Markov operators
2.17 Characteristic functions
2.18 The law of the iterated logarithm

3 Discrete Stochastic Processes
3.1 Conditional Expectation
3.2 Martingales
3.3 Doob's convergence theorem
3.4 Levy's upward and downward theorems
3.5 Doob's decomposition of a stochastic process
3.6 Doob's submartingale inequality
3.7 Doob's L^p inequality
3.8 Random walks
3.9 The arc-sin law for the 1D random walk
3.10 The random walk on the free group
3.11 The free Laplacian on a discrete group
3.12 A discrete Feynman-Kac formula
3.13 Discrete Dirichlet problem
3.14 Markov processes

4 Continuous Stochastic Processes
4.1 Brownian motion
4.2 Some properties of Brownian motion
4.3 The Wiener measure
4.4 Levy's modulus of continuity
4.5 Stopping times
4.6 Continuous time martingales
4.7 Doob inequalities
4.8 Khintchine's law of the iterated logarithm
4.9 The theorem of Dynkin-Hunt
4.10 Self-intersection of Brownian motion
4.11 Recurrence of Brownian motion
4.12 Feynman-Kac formula
4.13 The quantum mechanical oscillator
4.14 Feynman-Kac for the oscillator
4.15 Neighborhood of Brownian motion
4.16 The Ito integral for Brownian motion
4.17 Processes of bounded quadratic variation
4.18 The Ito integral for martingales
4.19 Stochastic differential equations

5 Selected Topics
5.1 Percolation
5.2 Random Jacobi matrices
5.3 Estimation theory
5.4 Vlasov dynamics
5.5 Multidimensional distributions
5.6 Poisson processes
5.7 Random maps
5.8 Circular random variables
5.9 Lattice points near Brownian paths
5.10 Arithmetic random variables
5.11 Symmetric Diophantine Equations
5.12 Continuity of random variables

Preface

These notes grew from an introduction to probability theory taught during the first and second term of 1994 at Caltech. There was a mixed audience of undergraduates and graduate students in the first half of the course, which covered Chapters 2 and 3, and mostly graduate students in the second part, which covered Chapter 4 and two sections of Chapter 5. Having been online for many years on my personal web sites, the text was reviewed, corrected and indexed in the summer of 2006.
It obtained some enhancements which benefited from other teaching notes and research I wrote while teaching probability theory at the University of Arizona in Tucson, and from incorporating probability into calculus courses at Caltech and Harvard University.

Most of Chapter 2 is standard material and the subject of virtually any course on probability theory. Chapters 3 and 4 are also well covered by the literature, but not in this combination.

The last chapter, "Selected Topics", was considerably extended in the summer of 2006. While the original course included only localization and percolation problems, I added other topics like estimation theory, Vlasov dynamics, multi-dimensional moment problems, random maps, circle-valued random variables, the geometry of numbers, Diophantine equations and harmonic analysis. Some of this material is related to research I got interested in over time.

While the text assumes no prerequisites in probability, a basic exposure to calculus and linear algebra is necessary. Some real analysis as well as some background in topology and functional analysis can be helpful.

I would like to get feedback from readers. I plan to keep this text alive and update it in the future. You can email feedback to [email protected] and also indicate in the email if you do not want your feedback to be acknowledged in an eventual future edition of these notes.

To get a more detailed and analytic exposure to probability, the students of the original course consulted the book [105], which contains much more material than covered in class. Since my course was taught, many other books have appeared. Examples are [21, 34]. For a less analytic approach, see [40, 91, 97] or the still excellent classic [26]. For an introduction to martingales, we recommend [108] and [47], from both of which these notes have benefited a lot and to which the students of the original course had access too.
For Brownian motion, we refer to [73, 66]; for stochastic processes to [17]; for stochastic differential equations to [2, 55, 76, 66, 46]; for random walks to [100]; for Markov chains to [27, 87]; for entropy and Markov operators to [61]. For applications in physics and chemistry, see [106]. For the selected topics, we followed [32] in the percolation section. The books [101, 30] contain introductions to Vlasov dynamics. The book [1] gives an introduction to the moment problem, and [75, 64] to circle-valued random variables; for Poisson processes, see [49, 9]. For the geometry of numbers and Fourier series on fractals, see [45]. The book [109] contains examples which challenge the theory with counterexamples. [33, 92, 70] are sources for problems with solutions. Probability theory can be developed using nonstandard analysis on finite probability spaces [74]. The book [42] breaks some of the material of the first chapter into attractive stories. Also, texts like [89, 78] are not only for mathematical tourists.

We live in a time in which more and more content is available online. Knowledge diffuses from papers and books to online websites and databases, which also ease the digging for knowledge in the fascinating field of probability theory.

Oliver Knill

Chapter 1

Introduction

1.1 What is probability theory?

Probability theory is a fundamental pillar of modern mathematics with relations to other mathematical areas like algebra, topology, analysis, geometry or dynamical systems. As with any fundamental mathematical construction, the theory starts by adding more structure to a set Ω. In a similar way as introducing algebraic operations, a topology, or a time evolution on a set, probability theory adds a measure-theoretical structure to Ω which generalizes "counting" on finite sets: in order to measure the probability of a subset A ⊂ Ω, one singles out a class of subsets 𝒜 on which one can hope to do so. This leads to the notion of a σ-algebra 𝒜.
It is a set of subsets of Ω in which one can perform finitely or countably many operations like taking unions, complements or intersections. The elements in 𝒜 are called events. If a point ω in the "laboratory" Ω denotes an "experiment", an "event" A ∈ 𝒜 is a subset of Ω for which one can assign a probability P[A] ∈ [0,1]. For example, if P[A] = 1/3, the event happens with probability 1/3. If P[A] = 1, the event takes place almost certainly. The probability measure P has to satisfy obvious properties like that the union A ∪ B of two disjoint events A, B satisfies P[A ∪ B] = P[A] + P[B], or that the complement Aᶜ of an event A has the probability P[Aᶜ] = 1 − P[A].

With a probability space (Ω, 𝒜, P) alone, there is already some interesting mathematics: one has for example the combinatorial problem to find the probabilities of events like the event to get a "royal flush" in poker. If Ω is a subset of a Euclidean space like the plane and P[A] = ∫_A f(x,y) dx dy for a suitable nonnegative function f, we are led to integration problems in calculus. Actually, in many applications, the probability space is part of Euclidean space and the σ-algebra is the smallest one which contains all open sets. It is called the Borel σ-algebra. An important example is the Borel σ-algebra on the real line.

Given a probability space (Ω, 𝒜, P), one can define random variables X. A random variable is a function X from Ω to the real line R which is measurable in the sense that the inverse image of any Borel set B in R is in 𝒜. The interpretation is that if ω is an experiment, then X(ω) measures an observable quantity of the experiment. The technical condition of measurability resembles the notion of continuity for a function f from a topological space (Ω, 𝒪) to the topological space (R, 𝒰). A function is continuous if f⁻¹(U) ∈ 𝒪 for all open sets U ∈ 𝒰.
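To make the counting picture concrete, here is a small sketch (not from the text; the helper name `prob` is ours) of a finite probability space where 𝒜 is the full power set and P is the uniform counting measure. It checks the additivity and complement rules stated above and counts the royal-flush probability directly:

```python
from fractions import Fraction
from math import comb

# Uniform counting measure on a finite sample space: P[A] = |A| / |Omega|.
def prob(event, omega):
    return Fraction(len(event), len(omega))

omega = frozenset(range(1, 7))   # one roll of a fair die
A = frozenset({1, 2})            # event "roll a 1 or a 2"
B = frozenset({5, 6})            # event "roll a 5 or a 6" (disjoint from A)

# Additivity for disjoint events: P[A u B] = P[A] + P[B].
assert prob(A | B, omega) == prob(A, omega) + prob(B, omega)
# Complement rule: P[A^c] = 1 - P[A].
assert prob(omega - A, omega) == 1 - prob(A, omega)

# Royal flush: exactly one per suit, among all 5-card hands from 52 cards.
p_royal = Fraction(4, comb(52, 5))
print(p_royal)   # 1/649740
```

Using `Fraction` keeps every probability exact, which matches the counting definition on finite sets.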
In probability theory, where functions are often denoted with capital letters like X, Y, ..., a random variable X is measurable if X⁻¹(B) ∈ 𝒜 for all Borel sets B ∈ ℬ. Any continuous function is measurable for the Borel σ-algebra. As in calculus, where one does not have to worry about continuity most of the time, also in probability theory one often does not have to sweat about measurability issues. Indeed, one could suspect that notions like σ-algebras or measurability were introduced by mathematicians to scare normal folks away from their realms. This is not the case. Serious issues are avoided with those constructions. Mathematics is eternal: a once established result will be true also in thousands of years. A theory in which one could prove a theorem as well as its negation would be worthless: it would formally allow one to prove any other result, whether true or false. So, these notions are not only introduced to keep the theory "clean", they are essential for the "survival" of the theory. We give some examples of "paradoxes" to illustrate the need for building a careful theory.

Back to the fundamental notion of random variables: because they are just functions, one can add and multiply them by defining (X + Y)(ω) = X(ω) + Y(ω) and (XY)(ω) = X(ω)Y(ω). Random variables so form an algebra ℒ. The expectation of a random variable X is denoted by E[X] if it exists. It is a real number which indicates the "mean" or "average" of the observation X. It is the value one would expect to measure in the experiment. If X = 1_B is the random variable which has the value 1 if ω is in the event B and 0 if ω is not in the event B, then the expectation of X is just the probability of B. The constant random variable X(ω) = a has the expectation E[X] = a.
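On a finite probability space these statements can be verified by brute force, since a random variable is literally a function on Ω. The following sketch (our own illustration; the names `E`, `ind_B`, `const_a` are not from the book) checks the two basic expectations and the algebra structure on a fair die:

```python
from fractions import Fraction

omega = range(1, 7)                       # a fair die
P = {w: Fraction(1, 6) for w in omega}    # uniform probability measure

def E(X):
    """Expectation E[X] = sum over omega of X(w) * P[{w}]."""
    return sum(X(w) * P[w] for w in omega)

X = lambda w: w                           # the face value
B = {5, 6}
ind_B = lambda w: 1 if w in B else 0      # indicator random variable 1_B
const_a = lambda w: Fraction(7, 2)        # a constant random variable

# E[1_B] is the probability of B, and E[a] = a, as stated in the text.
assert E(ind_B) == Fraction(2, 6)
assert E(const_a) == Fraction(7, 2)

# Random variables form an algebra: (X + Y)(w) = X(w) + Y(w), and
# expectation is linear on it.
X_plus_indB = lambda w: X(w) + ind_B(w)
assert E(X_plus_indB) == E(X) + E(ind_B)
```

The same pattern works for products, (XY)(ω) = X(ω)Y(ω), though of course E[XY] need not equal E[X]E[Y] unless X and Y are independent.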
These two basic examples, as well as the linearity requirement E[aX + bY] = aE[X] + bE[Y], determine the expectation for all random variables in the algebra ℒ: first one defines expectation for finite sums ∑_{i=1}^n a_i 1_{B_i}, called elementary random variables, which approximate general measurable functions. Extending the expectation to a subset ℒ¹ of the entire algebra is part of integration theory. While in calculus one can live with the Riemann integral on the real line, which defines the integral by Riemann sums ∫_a^b f(x) dx ∼ (1/n) ∑_{i/n ∈ [a,b]} f(i/n), the integral defined in measure theory is the Lebesgue integral. The latter is more fundamental, and probability theory is a major motivator for using it. It allows one to make statements like that the set of real numbers with periodic decimal expansion has probability 0. In general, the probability of a set A is the expectation of the random variable X(x) = f(x) = 1_A(x). In calculus, the integral ∫_0^1 f(x) dx would not be defined, because a Riemann integral can give 1 or 0 depending on how the Riemann approximation is done. Probability theory allows one to introduce the Lebesgue integral by defining ∫_a^b f(x) dx as the limit of (1/n) ∑_{i=1}^n f(x_i) for n → ∞, where x_i are random uniformly distributed points in the interval [a, b]. This Monte Carlo definition of the Lebesgue integral is based on the law of large numbers and is as intuitive