COMPLEXITY OF ALGORITHMS

Series of Lecture Notes and Workbooks for Teaching Undergraduate Mathematics

Algoritmuselmélet
Algoritmusok bonyolultsága
Analitikus módszerek a pénzügyben és a közgazdaságtanban
Analízis feladatgyűjtemény I
Analízis feladatgyűjtemény II
Bevezetés az analízisbe
Complexity of Algorithms
Differential Geometry
Diszkrét matematikai feladatok
Diszkrét optimalizálás
Geometria
Igazságos elosztások
Introductory Course in Analysis
Mathematical Analysis – Exercises I
Mathematical Analysis – Problems and Exercises II
Mértékelmélet és dinamikus programozás
Numerikus funkcionálanalízis
Operációkutatás
Operációkutatási példatár
Parciális differenciálegyenletek
Példatár az analízishez
Pénzügyi matematika
Szimmetrikus struktúrák
Többváltozós adatelemzés
Variációszámítás és optimális irányítás

László Lovász

Eötvös Loránd University
Faculty of Science

Typotex
2014

©2014–2019, László Lovász, Eötvös Loránd University, Mathematical Institute

Reader: Katalin Friedl

Edited by Zoltán Király and Dömötör Pálvölgyi

The first version of these lecture notes was translated and supplemented by Péter Gács (Boston University).

Creative Commons NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0)
This work can be reproduced, circulated, published and performed for non-commercial purposes without restriction by indicating the author's name, but it cannot be modified.

ISBN 978 963 279 244 6

Prepared under the editorship of Typotex Publishing House (http://www.typotex.hu)
Responsible manager: Zsuzsa Votisky
Technical editor: József Gerner

Made within the framework of the project Nr. TÁMOP-4.1.2-08/2/A/KMR-2009-0045, entitled “Jegyzetek és példatárak a matematika egyetemi oktatásához” (Lecture Notes and Workbooks for Teaching Undergraduate Mathematics).

KEY WORDS: Complexity, Turing machine, Boolean circuit, algorithmic decidability, polynomial time, NP-completeness, randomized algorithms, information and communication complexity, pseudorandom numbers, decision trees, parallel algorithms, cryptography, interactive proofs.

SUMMARY: The study of the complexity of algorithms started in the 1930s, principally with the development of the concepts of Turing machine and algorithmic decidability. Through the spread of computers and the increase of their power, this discipline achieved higher and higher significance.
In these lecture notes we discuss the classical foundations of complexity theory, like Turing machines and the halting problem, as well as some leading new developments: information and communication complexity, generation of pseudorandom numbers, parallel algorithms, foundations of cryptography and interactive proofs.

Contents

Introduction
    Some notation and definitions
1 Models of Computation
    1.1 Finite automata
    1.2 The Turing machine
    1.3 The Random Access Machine
    1.4 Boolean functions and Boolean circuits
2 Algorithmic decidability
    2.1 Recursive and recursively enumerable languages
    2.2 Other undecidable problems
    2.3 Computability in logic
        2.3.1 Gödel's incompleteness theorem
        2.3.2 First-order logic
3 Computation with resource bounds
    3.1 Polynomial time
    3.2 Other complexity classes
    3.3 General theorems on space and time complexity
4 Non-deterministic algorithms
    4.1 Non-deterministic Turing machines
    4.2 Witnesses and the complexity of non-deterministic algorithms
    4.3 Examples of languages in NP
    4.4 NP-completeness
    4.5 Further NP-complete problems
5 Randomized algorithms
    5.1 Verifying a polynomial identity
    5.2 Primality testing
    5.3 Randomized complexity classes
6 Information complexity
    6.1 Information complexity
    6.2 Self-delimiting information complexity
    6.3 The notion of a random sequence
    6.4 Kolmogorov complexity, entropy and coding
7 Pseudorandom numbers
    7.1 Classical methods
    7.2 The notion of a pseudorandom number generator
    7.3 One-way functions
    7.4 Candidates for one-way functions
        7.4.1 Discrete square roots
8 Decision trees
    8.1 Algorithms using decision trees
    8.2 Non-deterministic decision trees
    8.3 Lower bounds on the depth of decision trees
9 Algebraic computations
    9.1 Models of algebraic computation
    9.2 Multiplication
        9.2.1 Arithmetic operations on large numbers
        9.2.2 Matrix multiplication
        9.2.3 Inverting matrices
        9.2.4 Multiplication of polynomials
        9.2.5 Discrete Fourier transform
    9.3 Algebraic complexity theory
        9.3.1 The complexity of computing square-sums
        9.3.2 Evaluation of polynomials
        9.3.3 Formula complexity and circuit complexity
10 Parallel algorithms
    10.1 Parallel random access machines
    10.2 The class NC
11 Communication complexity
    11.1 Communication matrix and protocol-tree
    11.2 Examples
    11.3 Non-deterministic communication complexity
    11.4 Randomized protocols
12 An application of complexity: cryptography
    12.1 A classical problem
    12.2 A simple complexity-theoretic model
    12.3 Public-key cryptography
    12.4 The Rivest–Shamir–Adleman code (RSA code)
13 Circuit complexity
    13.1 Lower bound for the Majority Function
    13.2 Monotone circuits
14 Interactive proofs
    14.1 How to save the last move in chess?
    14.2 How to check a password – without knowing it?
    14.3 How to use your password – without telling it?
    14.4 How to prove non-existence?
    14.5 How to verify proofs that keep the main result secret?
    14.6 How to referee exponentially long papers?
    14.7 Approximability

Introduction

The need to be able to measure the complexity of a problem, algorithm or structure, and to obtain bounds and quantitative relations for complexity, arises in more and more sciences: besides computer science, the traditional branches of mathematics, statistical physics, biology, medicine, social sciences and engineering are also confronted more and more frequently with this problem. In the approach taken by computer science, complexity is measured by the quantity of computational resources (time, storage, program, communication) used up by a particular task. These notes deal with the foundations of this theory.

Computation theory can basically be divided into three parts of different character. First, the exact notions of algorithm, time, storage capacity, etc. must be introduced. For this, different mathematical machine models must be defined, and the time and storage needs of the computations performed on these need to be clarified (this is generally measured as a function of the size of the input). By limiting the available resources, the range of solvable problems gets narrower; this is how we arrive at different complexity classes. The most fundamental complexity classes provide an important classification not only of problems arising in practice but (perhaps more surprisingly) also of those arising in classical areas of mathematics; this classification reflects the practical and theoretical difficulty of problems quite well. The relationship between different machine models also belongs to this first part of computation theory.

Second, one must determine the resource need of the most important algorithms in various areas of mathematics, and give efficient algorithms to prove that certain important problems belong to certain complexity classes. In these notes, we do not strive for completeness in the investigation of concrete algorithms and problems; this is the task of the corresponding fields of mathematics (combinatorics, operations research, numerical analysis, number theory).
Nevertheless, a large number of algorithms will be described and analyzed to illustrate certain notions and methods, and to establish the complexity of certain problems.

Third, one must find methods to prove “negative results”, i.e., to show that some problems are actually unsolvable under certain resource restrictions. Often, these questions can be formulated by asking whether certain complexity classes are different or empty. This problem area includes the question whether a problem is algorithmically solvable at all; this question can today be considered classical, and there are many important results concerning it; in particular, the decidability or undecidability of most problems of interest is known.

For the majority of algorithmic problems occurring in practice, however, algorithmic solvability itself is not in question; the question is only what resources must be used for the solution. Such investigations, addressed to lower bounds, are very difficult and are still in their infancy. In these notes, we can only give a taste of this sort of result. In particular, we discuss complexity notions like communication complexity or decision tree complexity, where, by focusing on only one type of rather special resource, we can give a more complete analysis of basic complexity classes.

It is, finally, worth noting that if a problem turns out to be “difficult” to solve, this is not necessarily a negative result. More and more areas (random number generation, communication protocols, cryptography, data protection) need problems and structures that are guaranteed to be complex. These are important areas for the application of complexity theory; from among them, we will deal with random number generation and cryptography, the theory of secret communication.

We use basic notions of number theory, linear algebra, graph theory and (to a small extent) probability theory.
However, these mainly appear in examples; the theoretical results, with a few exceptions, are understandable without these notions as well.

I would like to thank László Babai, György Elekes, András Frank, Gyula Katona, Zoltán Király and Miklós Simonovits for their advice regarding these notes, and Dezső Miklós for his help in using MATEX, in which the Hungarian original was written. The notes were later translated into English by Péter Gács, who also extended and corrected them.

László Lovász

Some notation and definitions

A finite set of symbols will sometimes be called an alphabet. A finite sequence formed from some elements of an alphabet Σ is called a word. The empty word will also be considered a word, and will be denoted by ∅. The set of words of length n over Σ is denoted by Σ^n, and the set of all words (including the empty word) over Σ is denoted by Σ*. A subset of Σ*, i.e., an arbitrary set of words, is called a language.
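
The definitions of alphabet, word and language can be made concrete with a minimal Python sketch. The function names `words_of_length` and `words_up_to`, the binary alphabet and the palindrome language are illustrative assumptions, not notation from these notes. Symbols are assumed to be single characters, so that a word can be represented as a string and the empty word as the empty string.

```python
from itertools import product


def words_of_length(sigma, n):
    """Return Sigma^n: all words of length n over the alphabet sigma."""
    # product(..., repeat=0) yields one empty tuple, so Sigma^0 == [""].
    return ["".join(letters) for letters in product(sigma, repeat=n)]


def words_up_to(sigma, max_len):
    """Return the finite part of Sigma^* made of words of length <= max_len."""
    result = []
    for n in range(max_len + 1):
        result.extend(words_of_length(sigma, n))
    return result


# Hypothetical example alphabet: the binary alphabet {0, 1}.
sigma = ["0", "1"]

# A language is any set of words; here (as an illustration only) the
# binary palindromes of length at most 3, a finite subset of Sigma^*.
palindromes = {w for w in words_up_to(sigma, 3) if w == w[::-1]}
```

Since Σ* is infinite whenever Σ is non-empty, the sketch only enumerates words up to a chosen length; a language in general need not be finite and would then be described by a membership test rather than by listing its words.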