Probabilistic Arguments in Mathematics

A Ph.D. Thesis by Don Berry
University College London

I, Don Berry, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Abstract

This thesis addresses a question that emerges naturally from some observations about contemporary mathematical practice. Firstly, mathematicians always demand proof for the acceptance of new results. Secondly, the ability of mathematicians to tell if a discourse gives expression to a proof is less than perfect, and the computers they use are subject to a variety of hardware and software failures. So false results are sometimes accepted, despite insistence on proof. Thirdly, over the past few decades, researchers have also developed a variety of methods that are probabilistic in nature. Even if carried out perfectly, these procedures only yield a conclusion that is very likely to be true. In some cases, these chances of error are precisely specifiable and can be made as small as desired. The likelihood of an error arising from the inherently uncertain nature of these probabilistic algorithms can therefore be made vanishingly small in comparison to the chances of an error arising when implementing an equivalent deductive algorithm. Moreover, the structure of probabilistic algorithms tends to minimise these Implementation Errors too. So overall, probabilistic methods are sometimes more reliable than deductive ones. This invites the question: ‘Are mathematicians rational in continuing to reject these probabilistic methods as a means of establishing mathematical claims?’

Table of Contents

Chapter 1: Proof and Practice in Mathematics
1.i. Proofs and Proof Presentations
1.ii. Examples of Proof Presentations
1.iii. Acceptance and Publication Requirements
1.iv. Acceptability Conditions for Journal Articles
1.v. Permanence, Reliability, Consensus and Autonomy
1.vi. Recent Developments in Mathematics
1.vii. Conclusion

Chapter 2: Non-deductive Arguments
2.i. Non-deductive Methods
2.ii. Non-deductive Techniques in the Context of Discovery
2.iii. Computers and Non-Deductive Methods as Warrant
2.iv. The Influence of Background Knowledge
2.v. The Goldbach Conjecture
2.vi. Acceptance Without Proof
2.vii. Conclusion

Chapter 3: Probabilistic Methods
3.i. Las Vegas and Monte Carlo Algorithms
3.ii. Algorithmic Identity Theory and Polynomial Comparison
3.iii. Hypothesis Testing and Statistical Inference
3.iv. The Rabin-Miller Algorithm
3.v. Rabin-Miller in Action
3.vi. Intuition and Cognitive Bias
3.vii. Conclusion

Chapter 4: Two Kinds of Error
4.i. Computer Errors
4.ii. Knowledge and Epistemic Externalism
4.iii. A Pragmatic Approach to Epistemic Concepts
4.iv. Human Errors
4.v. Public Acceptance and Autonomy
4.vi. Autonomy, Permanence, Reliability and Consensus Revisited
4.vii. Conclusion

Chapter 5: Normative Standards Within Mathematics
5.i. Means-Ends Reasoning and the Epistemic Objectives of Mathematicians
5.ii. The Rationality of Public Acceptance
5.iii. Mathematics as Concerning Abstracta
5.iv. The Decline of Visual Intuition
5.v. Conceptual Clarity in Contemporary Mathematics
5.vi. Formalizing Mathematics
5.vii. Conclusion

Chapter 6: Mathematics and Probability
6.i. Public Acceptance and Non-Mathematical Arguments
6.ii. Probabilistic Arguments Reconsidered
6.iii. Interpreting Probability Statements
6.iv. Hardware-Based Approaches
6.v. Software-Based Approaches
6.vi. Probabilistic Inference
6.vii. Conclusion

Chapter 7: Should Mathematicians Play Dice?
7.i. Concluding Summary
7.ii. Epilogue

1. Proof and Practice in Mathematics

This chapter is about a rule that characterises contemporary mathematical practice: that mathematicians require proof for the acceptance of mathematical claims. The opening four sections clarify what this entails.
In the fifth section I will discuss four desirable features of mathematical practice that this insistence upon proof helps to secure. These are that mathematics has a literature that is both permanent and highly reliable; that there is consensus amongst mathematicians about which results have been definitively established; and that these researchers can in principle find intellectually autonomous reasons for their mathematical beliefs. Lastly, in the final section we see how mathematical practice has changed in recent years due to new developments and technologies.

1.i. Proofs and Proof Presentations

We begin with a rough characterisation of proof, which will suffice to fix a subject matter for discussion. A proof is a finite set of propositions with a particular kind of cumulative inferential structure. It must be possible to arrange the propositions into a sequence such that every proposition is either an axiom or else follows from one or more propositions appearing earlier in the sequence via some accepted mathematical rule of inference (apart from temporary assumptions that are discharged later in the sequence, as in reductio arguments). To use the language of graph theory, we can thus arrange the propositions into a finite, rooted, directed tree, where each proposition is represented by a vertex and edges represent implications. This is illustrated by the following diagram of the Cohen Structure Theorem:

[Figure: diagram of the proof structure of the Cohen Structure Theorem]

It is clear that if we regard a proof as simply a finite sequence of propositions then some entire sections of argument can be reordered whilst keeping the cumulative inferential nature of the argument intact. Yet intuitively we would in many cases still want to regard this as the same proof, despite our now having a different sequence of propositions. The same is true if proofs are identified with directed graphs. However, this question of individuation need not concern us here, and so we will move on to clarifying the other terms in the definition.
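The ordering condition in the characterisation above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the proposition labels and the `premises` table are hypothetical, temporary assumptions that are later discharged are ignored, and the check does not examine whether each inference is itself an accepted rule. It only tests whether every proposition in a sequence is an axiom or depends solely on propositions that appear earlier.

```python
from typing import Dict, List, Sequence


def is_valid_ordering(sequence: Sequence[str],
                      premises: Dict[str, List[str]]) -> bool:
    """Return True if each proposition is an axiom (no premises) or
    depends only on propositions appearing earlier in the sequence."""
    seen = set()
    for prop in sequence:
        # Every premise of this proposition must already have appeared.
        if any(p not in seen for p in premises.get(prop, [])):
            return False
        seen.add(prop)
    return True


# Hypothetical toy proof: two axioms, two lemmas, one theorem.
premises = {
    "A1": [],
    "A2": [],
    "L1": ["A1"],
    "L2": ["A2"],
    "T": ["L1", "L2"],
}

order_one = ["A1", "L1", "A2", "L2", "T"]
order_two = ["A2", "L2", "A1", "L1", "T"]   # the two lemma sections swapped
bad_order = ["A1", "T", "L1", "A2", "L2"]   # theorem stated before its lemmas

print(is_valid_ordering(order_one, premises))
print(is_valid_ordering(order_two, premises))
print(is_valid_ordering(bad_order, premises))
```

Here `order_one` and `order_two` are different sequences that both satisfy the condition, which is the sense in which whole sections of an argument can be reordered without disturbing its cumulative structure.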
The question ‘What is a proposition?’ is of course a difficult one to answer in itself. We can give at least a partial characterisation: a proposition is the content expressed by a declarative sentence, such as ‘There are infinitely many primes’. Propositions are bearers of truth or falsity and are abstract, non-linguistic entities that are the shared objects of propositional attitudes such as belief.[1]

By ‘accepted mathematical inference’ we shall for now mean the collection of rules that mathematicians acknowledge as acceptable inferences and standardly make use of in the proofs they construct. The question of how this sociological description might be replaced with a more satisfying mathematical formulation will be discussed in Section 5.vi. We shall then see that these rules of inference are truth-preserving, so that it will be impossible for the conclusion to be false if the premisses are true. Knowing the existence of a proof, we are thereby provided with a special kind of a priori warrant for believing its conclusion. Once a proof has been found for a result it is known as a ‘theorem’ thereafter: a significant label that confers upon it the status of having been conclusively established.

In this thesis, proofs thus characterised will be carefully distinguished from what I shall herein call ‘proof presentations’ (though mathematicians may use the term ‘proof’ for this concept too). These are the written discourses actually published by mathematicians in journal articles, textbooks and lecture notes. Proof presentations give expression to proofs, and together with lectures are the chief means by which they are communicated. A number of metaphors come to mind for how this is achieved: perceptual, semiotic, cartographic. A good proof presentation enables any competent reader to know that a proof exists and thus can provide them with strong reasons for believing that its conclusion is a deductive consequence of the axioms or premisses employed.
In order to achieve this, proof presentations must make the inferential relations of the propositions they express transparent. It is not enough that they do in fact stand in these relations if this is not clear to a reader. However, the sentences of a proof presentation need not be in one-to-one correspondence with the propositions of the proof it presents. Indeed, proof presentations generally contain gaps, and often make use of subsidiary results that are merely quoted. Where this is the case, we must look at both this proof presentation and at presentations of proofs of the cited results to know a full proof of the claim in its entirety.

[1] In this context, we also extend the term ‘proposition’ to include what is expressed by a sentence containing one or more free variables, such as ‘let $G$ be a group’.

Proof presentations may comprise a mixture of natural language, mathematical notation and diagrams. The sentences employed can be declarative in character (‘triangles $ABC$ and $DEF$ are similar’), but they may also be in the imperative mood (‘construct a tangent to the circle $\gamma$ meeting the line segment $XY$ at right angles’). Definitions may be idiosyncratic or delineate entirely new concepts (‘we say a group is Klein-free if it contains no subgroup isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$’). Proofs of theorems are often achieved by first securing auxiliary results or lemmas which are then drawn together to complete the argument. As mentioned, many of these lemmas are not proved explicitly but quoted from other sources, or are well-known enough that they can simply be stated (‘we now invoke the Bolzano-Weierstrass Theorem, showing the existence of a convergent subsequence $x_{n(k)}$’). Corollaries or straightforward consequences of the main theorem are often deduced after it has been proved.
Conjectures may be put forward with various degrees of assurance, and informal comments such as suggestions for further applications of derived techniques are often made – though both are sharply distinguished from the central mathematical content of the argument.

In general, then, the claim that mathematicians require proof means that in order for a new result to become accepted (in a sense to be clarified below) a proof presentation must be supplied along with it. In the next section, we become better acquainted with both concepts by looking at some concrete examples of arguments that mathematicians would agree do constitute presentations of proofs.

1.ii. Examples of Proof Presentations

In what follows, the word ‘argument’ is used to mean either a proof itself or a written discourse that expresses or purports to express a proof. Individual proof presentations are demarcated using boxes.

Theorem 1.2. $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$ for all $n \geq 2$ and $1 \leq k \leq n-1$.

The binomial coefficients $\binom{n}{k}$ – read ‘n-choose-k’ – can be defined as the number of ways of choosing a $k$-sized subset from an $n$-set for $n \geq 0$ and $0 \leq k \leq n$. We may also define them using the formula $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, which is easily shown to be extensionally equivalent. Using the latter definition, we can easily prove Theorem 1.2 as follows, using algebra.

$$
\begin{aligned}
\binom{n-1}{k} + \binom{n-1}{k-1}
&= \frac{(n-1)!}{k!\,(n-k-1)!} + \frac{(n-1)!}{(k-1)!\,(n-k)!} \\
&= \frac{(n-k)\,(n-1)!}{k!\,(n-k)!} + \frac{k\,(n-1)!}{k!\,(n-k)!} \\
&= \frac{n\,(n-1)!}{k!\,(n-k)!} = \frac{n!}{k!\,(n-k)!} = \binom{n}{k}
\end{aligned}
$$

Although this argument makes use of a number of properties of fractions that are not explicitly mentioned, it is clear that the result in question is unequivocally established, and that it adequately gives expression to a proof. The presentation of a different proof of the same result might run as follows, this time using the set-theoretic definition:

Consider all the $k$-subsets of a given $n$-set. Let $x$ denote one specific member of the $n$-set.
We organise the $k$-subsets into two kinds according to whether or not they contain $x$. The number of $k$-subsets that do not contain $x$ is equal to $\binom{n-1}{k}$, as we must select all $k$ elements from the remaining $n-1$ elements not identical with $x$. The number of $k$-subsets that do contain $x$ is equal to $\binom{n-1}{k-1}$, because this time we need only choose another $(k-1)$ elements to adjoin to $\{x\}$ to make a subset of size $k$. Hence, the total number of $k$-subsets of an $n$-set is $\binom{n-1}{k} + \binom{n-1}{k-1}$, which proves the identity.

The conclusion has again been conclusively established, but this time the character of the proof is very different, and the reliance on natural language in the presentation is far more substantial than for the previous proof. We have also made use of a very different definition for $\binom{n}{k}$, although one that is known to be extensionally equivalent. Thus the same relation between mathematical objects is established, even though reference to them is achieved in a different way. In the latter proof the reader is not merely following mechanical manipulations of symbols but is invited to use their imagination to picture a collection of objects that are then manipulated in space and time. These kinds of combinatorial arguments are not foolproof; individual steps may be quite demanding to follow, and one needs to have developed certain skills in order to construct or check them. In this case, the combinatorial argument was also more explanatory: it enabled us to see why the two given expressions turn out to be equal.

Sometimes a combinatorial argument may ask us to go even further in conducting a thought-experiment about a practical situation. We illustrate this by looking at a second theorem about binomial coefficients.

Theorem 1.3. $k\binom{n}{k} = n\binom{n-1}{k-1}$ for all $n \geq 1$ and $1 \leq k \leq n$.

Consider a university faculty with $n$ members, $n \geq 1$. We count the number of ways of choosing a committee of $k$ members with one of these members appointed as a chairperson ($1 \leq k \leq n$).
We can either choose the whole committee first and then select one of its members to be the chair – which can be done in $k\binom{n}{k}$ ways – or we can first select the chairperson, and then choose the other $(k-1)$ members of the committee from the remaining $(n-1)$ members of the faculty – which can be done in $n\binom{n-1}{k-1}$ ways. These two expressions must therefore be equal, which is precisely the result to be proved.

It will surely be conceded that once again we have found a proof, although it is hard to say exactly how we arrive at this decision. Is the language of the presentation precise enough? Have we paid enough attention to boundary cases, where mistakes often occur because steps are not valid for very small values? For instance, if $k = 0$ the argument does not make sense, because $x$ cannot be a member of the $k$-subset. More generally, we must always ensure that any values the symbols can take do correspond to a possible instance of the combinatorial interpretation provided. It is also clear that the details of the particular model I have chosen to base this argument on are irrelevant to the logical structure of the argument itself. The choice of this practical scenario was merely for psychological convenience, and the individual faculty members that the argument refers to could easily be replaced with mathematical objects if necessary. Nevertheless, I intend all of the arguments given so far to count as proof presentations as they stand. Let us now turn to a third example.

Problem 1.4. Consider a knockout tournament with $2^n$ players, where $n \geq 1$. How many matches will be needed to determine the winner?

One obvious approach is to proceed as follows:

There are $2^{n-1}$ matches in the first round, and half as many in each subsequent round down to the final. Hence, the total number of matches is given by $2^{n-1} + 2^{n-2} + \cdots + 2 + 1 = \frac{2^n - 1}{2 - 1} = 2^n - 1$.
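As a quick numerical check on this round-by-round count, the following sketch adds up the matches played in each round and compares the total with $2^n - 1$. The function name `matches_by_rounds` is illustrative only, and the check covers just the handful of values of $n$ it is run on; it is no substitute for the summation of the geometric series.

```python
def matches_by_rounds(n: int) -> int:
    """Count the matches in a knockout tournament with 2**n players
    by summing the number of matches played in each round."""
    players = 2 ** n
    total = 0
    while players > 1:
        matches = players // 2   # players are paired off, one match per pair
        total += matches
        players = matches        # only the winners go through to the next round
    return total


for n in range(1, 6):
    print(n, matches_by_rounds(n), 2 ** n - 1)
```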
However, consider the following elegant argument:

Each player other than the winner must lose exactly once, and exactly one such loss occurs in every match. So the number of matches is $2^n - 1$.

This way of reaching the conclusion might seem to the uninitiated to be less satisfactory than the first because it is purely conceptual, and the argument involves no algebra. Yet it is arguably superior in every way. It doesn’t rely on a subsidiary result about the sum of a geometric series, nor an implicit invocation of the principle of mathematical induction. And clearly it can be generalised to other cases far more easily. This shows that a good proof presentation may be very different from how mathematics sometimes appears in the popular imagination, namely as numerical or symbolic calculation, or the rigid application of formal rules.

Theorem 1.5. (Fermat’s Little Theorem[2]) Let $p$ be prime and $a \in \mathbb{Z}^+$. Then $a^p \equiv a \pmod{p}$.

Consider the following argument, taken verbatim from Arthur Engel’s excellent problem-solving manual.[3]

We have pearls with $a$ colors. From these we make necklaces with exactly $p$ pearls. First, we make a string of pearls. There are $a^p$ different strings. If we throw away the $a$ one-colored strings $a^p - a$ strings will remain. We connect the ends of each string to get necklaces. We find that two strings that differ only by a cyclic permutation of its pearls result in indistinguishable necklaces. But there are $p$ cyclic permutations of $p$ pearls on a string. Hence the number of distinct necklaces is $(a^p - a)/p$. Because of its interpretation this is an integer. So $p \mid a^p - a$.

Some details have been skipped over here. For instance, it is not entirely trivial that a cyclic permutation of a polychromatic string of length $p$ always yields a different string, and this is only true when $p$ is prime. The argument also makes less sense for the case $a = 1$, as then there will be no polychromatic strings.
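The skipped detail about cyclic permutations can be explored by brute force for small cases. The sketch below is an illustration only, not part of Engel's argument: the function name `rotation_classes` and the choice of test values are assumptions made here. It enumerates all polychromatic strings of length $p$ over $a$ colours and groups together those that differ by a cyclic rotation.

```python
from itertools import product


def rotation_classes(a: int, p: int):
    """Group the polychromatic strings of length p over a colours
    into classes of strings that are cyclic rotations of one another."""
    seen = set()
    classes = []
    for s in product(range(a), repeat=p):
        if len(set(s)) == 1 or s in seen:
            continue  # skip one-coloured strings and strings already grouped
        orbit = {s[i:] + s[:i] for i in range(p)}  # all rotations of s
        seen |= orbit
        classes.append(orbit)
    return classes


# Prime length: every class has exactly p members.
prime_case = rotation_classes(3, 5)
print(len(prime_case), {len(c) for c in prime_case})   # 48 classes, all of size 5

# Composite length: some classes are smaller than p.
composite_case = rotation_classes(2, 4)
print(sorted(len(c) for c in composite_case))          # [2, 4, 4, 4]
```

With $a = 3$ and $p = 5$ each class contains exactly 5 strings, so the number of classes is $(3^5 - 3)/5 = 48$. With $a = 2$ and the composite length 4, the string 0101 has only two distinct rotations, and $(2^4 - 2)/4$ is not an integer, which is why primality matters for the counting step.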
These small omissions are easily fixed; however, there is a deeper reason why the argument as it stands will still be unsatisfactory, and hence not a proof presentation at all. Although the guiding intuition is correct, there is a fundamental

[2] Not to be confused with Fermat’s Last Theorem, the proof of which this footnote is sadly too narrow to contain.

[3] Arthur Engel, Problem-Solving Strategies (New York: Springer, 1998), 120.