
Theory of automata PDF

270 Pages·1969·13.492 MB·English


THEORY OF AUTOMATA

ARTO SALOMAA
Professor of Mathematics, University of Turku, Finland

PERGAMON PRESS
OXFORD · LONDON · EDINBURGH · NEW YORK · TORONTO · SYDNEY · PARIS · BRAUNSCHWEIG

Pergamon Press Ltd., Headington Hill Hall, Oxford
4 & 5 Fitzroy Square, London W.1
Pergamon Press (Scotland) Ltd., 2 & 3 Teviot Place, Edinburgh 1
Pergamon Press Inc., Maxwell House, Fairview Park, Elmsford, New York 10523
Pergamon of Canada Ltd., 207 Queen's Quay West, Toronto 1
Pergamon Press (Aust.) Pty. Ltd., 19a Boundary Street, Rushcutters Bay, N.S.W. 2011, Australia
Pergamon Press S.A.R.L., 24 rue des Écoles, Paris 5e
Vieweg & Sohn GmbH, Burgplatz 1, Braunschweig

© 1969 A. Salomaa

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of Pergamon Press Ltd.

First edition 1969
Library of Congress Catalog Card No. 71-76796
Printed in Hungary
08 013376 2

To my wife and children Kai and Kirsti

PREFACE

The past ten years have witnessed the vigorous growth of mathematical disciplines concerning models for information-processing. In addition to the classical theory of mechanically executable processes (or Turing machines, algorithms or recursive functions), attention has been focused on more restricted models such as finite deterministic automata, which were investigated originally in connection with sequential switching circuits. The results are so numerous and diverse that, although one would limit the choice to material independent of technological developments, one could still write several books entitled "Theory of Automata" with no or very little common material. Even for the same material, the approach can vary from an engineering-oriented one to a highly abstracted mathematical treatment.

In selecting material for this book, I have had two principles in mind.
In the first place, I have considered the finite deterministic automaton as the basic model. All other models, such as finite non-deterministic and probabilistic automata as well as pushdown and linear bounded automata, are treated as generalizations of this basic model. Secondly, the formalism chosen to describe finite deterministic automata is that of regular expressions. To give a detailed exposition regarding this formalism, I have included a separate chapter on the algebra of regular expressions.

This book deals with mathematical aspects of automata theory, rather than applications. However, I have tried to avoid unnecessary abstractions because no advanced mathematical background is assumed on the part of the reader. The reader is supposed to be familiar with the very basic notions in set theory, algebra and probability theory. One of the reasons for the choice of the formalism of regular expressions (rather than predicate calculus or more algebraic formalisms) was that it can be developed very briefly ab ovo.

This book is self-contained. All results stated as theorems are proven (except for minor details and proofs analogous to earlier proofs which are sometimes left as exercises). Some results are mentioned without proofs in Sections 3, 4 and 7 of Chapter IV. The level of presentation corresponds to that of beginning graduate or advanced undergraduate work. An attempt has been made to cover also the most recent developments.

Exercises form an important part of the book. They are theoretical rather than "numerical". Many of them contain material which could equally well be treated in the text. However, the text is independent of the exercises (except for some minor points in the proofs). Because some exercises are very difficult, the reader is encouraged to consult the given reference works. In addition to the exercises, there are also "problems" at the end of some sections. The problems are topics suggested for research.
Both Chapters III and IV can also be studied independently, provided they are supplemented with a few definitions. A reader who wants only the basic facts concerning various types of automata may read only Sections 1, 2 and 4 of Chapter I, 1 and 2 of Chapter II and 1-8 of Chapter IV. These sections form a self-contained "short course". Thereby, some of the more difficult proofs in Chapter IV may be omitted.

ACKNOWLEDGEMENTS

This book has been written during my stay with the Computer Science Department of the University of Western Ontario. I want to thank the Head of the Department, Dr. J. F. Hart, for providing me the opportunity to work within such a stimulating atmosphere. Special thanks are due to Dr. Neil Jones, whose sophistication in formal languages has been of invaluable help in the preparation of many sections in the book. Of the other members of the Department, I want to thank especially Mr. Andrew Szilard for many fruitful discussions and critical suggestions. Of my colleagues at the University of Turku, I express my gratitude to Professor K. Inkeri for much encouragement and help during many years. I want to thank also Professor Lauri Pimiä for planning and drawing the figures and Messrs. Magnus Steinby and Paavo Turakainen for reading the manuscript and making many useful comments.

ARTO SALOMAA
London, Ontario

NOTE TO THE READER

References to theorems, equations, etc., without a roman numeral mean the item in the chapter where the reference is made. References to other chapters are indicated by roman numerals. Thus, (I.2.1) means the equation (2.1) in Chapter I and Problem III.6.1 means Problem 6.1 in Chapter III.

We use customary set theoretic notations. In particular, $A \times B$ denotes the Cartesian product of the sets $A$ and $B$, and $\{x \mid P_1, \ldots, P_k\}$ denotes the set of all elements $x$ which possess each of the properties $P_1, \ldots, P_k$. The symbol $\Box$ is used to mark the end of the proof of a theorem.
Throughout the book, the expression "if and only if" is abbreviated as "iff".

CHAPTER I

FINITE DETERMINISTIC AUTOMATA

Automata are mathematical models of devices which process information by giving responses to inputs. Models for discrete deterministic computing devices possessing a finite memory will be our concern in this chapter. Their behavior at a particular time instant depends not only on the latest input but on the entire sequence of past inputs. They are capable of only a finite number of inputs, internal states and outputs and are, furthermore, deterministic in the sense that an input sequence uniquely determines the output behavior.

§ 1. Regular languages

By an alphabet we mean a non-empty set. The elements of an alphabet $I$ are called letters. In most cases, we shall consider only finite alphabets. Sometimes a subscript is used to indicate the number of letters. Thus, $I_r$ is an alphabet with $r$ letters.

A word over an alphabet $I$ is a finite string consisting of zero or more letters of $I$, whereby the same letter may occur several times. The string consisting of zero letters is called the empty word, written $\lambda$. Thus

$$\lambda,\quad x_1,\quad x_1x_2x_2,\quad x_1x_1x_1x_1x_1,\quad x_2x_1x_1x_2$$

are words over the alphabet $I_2 = \{x_1, x_2\}$. The set of all words over an alphabet $I$ is denoted by $W(I)$. Clearly, $W(I)$ is infinite.

If $P$ and $Q$ are words over an alphabet $I$, then their catenation $PQ$ is also a word over $I$. Clearly, catenation is an associative operation and the empty word $\lambda$ is an identity with respect to catenation: $P\lambda = \lambda P = P$, for any word $P$. For a word $P$ and a natural number $i$, the notation $P^i$ means the word obtained by catenating $i$ copies of the word $P$. $P^0$ denotes the empty word $\lambda$.

By the length of a word $P$, in symbols $\lg(P)$, is meant the number of letters in $P$ when each letter is counted as many times as it occurs. By definition, $\lg(\lambda) = 0$.
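The operations on words introduced so far can be tried out in a few lines of Python. This is only an illustrative sketch and not part of the original text: it assumes that a word over $I_2 = \{x_1, x_2\}$ is encoded as a string over the characters "1" and "2", and the names LAMBDA, catenate, power and lg are chosen here for illustration.

```python
# Assumption: a word over I_2 = {x_1, x_2} is encoded as a Python string
# over the characters "1" and "2"; the empty string stands for the empty word.
LAMBDA = ""


def catenate(p: str, q: str) -> str:
    """Catenation PQ of two words."""
    return p + q


def power(p: str, i: int) -> str:
    """P^i: i copies of P catenated; P^0 is the empty word."""
    if i < 0:
        raise ValueError("the exponent must be zero or a natural number")
    result = LAMBDA
    for _ in range(i):
        result = catenate(result, p)
    return result


def lg(p: str) -> int:
    """Length of a word, each letter counted as often as it occurs."""
    return len(p)


p, q, r = "122", "21", "1"

# Catenation is associative and the empty word is an identity.
assert catenate(catenate(p, q), r) == catenate(p, catenate(q, r))
assert catenate(p, LAMBDA) == catenate(LAMBDA, p) == p

# P^0 is the empty word, whose length is 0.
assert power(p, 0) == LAMBDA and lg(LAMBDA) == 0

print(power(q, 3), lg(power(q, 3)))  # prints: 212121 6
```

Encoding words as strings keeps the sketch short, since string concatenation already behaves like catenation; any other sequence type would serve equally well.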
The length function possesses some of the formal properties of logarithm:

$$\lg(PQ) = \lg(P) + \lg(Q), \qquad \lg(P^i) = i\,\lg(P), \quad \text{for } i \ge 0.$$

A word $P$ is a subword of a word $Q$ iff there are words $P_1$ and $P_2$, possibly empty, such that $Q = P_1 P P_2$. If $P_1 = \lambda$ ($P_2 = \lambda$), then $P$ is termed an initial (a final) subword of $Q$. If $P$ is a subword of $Q$, then clearly $\lg(P) \le \lg(Q)$.

Subsets of $W(I)$ are referred to as languages over the alphabet $I$. For instance, the following sets are languages over the alphabet $I_2 = \{x_1, x_2\}$:

$L_1 = \{x_1x_2x_2,\ x_1x_2x_1,\ x_1x_2x_1x_2\}$,
$L_2 = \{\lambda,\ x_1,\ x_2,\ x_1x_1,\ x_1x_2,\ x_2x_1,\ x_2x_2\}$,
$L_3 = \{x_1^p \mid p \text{ prime}\}$,
$L_4$ = { [illegible] | $i$ natural number },
$L_5$ = { [illegible] | $i$ natural number }.

The languages $L_1$ and $L_2$ are finite, i.e. contain only a finite number of words, whereas the languages $L_3$-$L_5$ are infinite. The language $L_2$ consists of all words over the alphabet $\{x_1, x_2\}$ with length less than or equal to 2.

We consider also the empty language $L_\phi$, i.e. the language containing no words. Note that $L_\phi$ is not the same as the language $L_\lambda$ or $\{\lambda\}$ consisting of the empty word $\lambda$. Sometimes we identify an element and its unit set to simplify notation: we may denote simply by $P$ the language $\{P\}$ consisting of the word $P$. In such cases, the meaning of the notation will be clear from the context.

We shall now introduce some operations for the family of languages over a fixed alphabet $I$. Because languages are sets, we may consider the Boolean operations. The sum or union of two languages $L_1$ and $L_2$ is denoted by $L_1 + L_2$, their intersection by $L_1 \cap L_2$ and the complement of a language $L$ with respect to the set $W(I)$ by $\sim L$. We use also the notation

$$L_1 - L_2 = L_1 \cap (\sim L_2).$$

The catenation (or product) of two languages $L_1$ and $L_2$, in symbols $L_1 L_2$, is defined to be the language

$$L_1 L_2 = \{PQ \mid P \in L_1,\ Q \in L_2\}.$$

Catenation of languages is associative because catenation of words is associative. Thus, the notation $L^i$, $i \ge 1$, is meaningful for a language $L$.
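For finite languages, the Boolean operations and catenation can be illustrated directly with Python sets. The following is a minimal sketch, not part of the original text. It again assumes that words over $I_2$ are encoded as strings over "1" and "2". Because $W(I)$ is infinite, the complement is taken here only relative to the words of length at most $n$, which is a simplification introduced for this example; all function names are hypothetical.

```python
from itertools import product

# Assumption: the characters "1" and "2" stand for the letters x_1 and x_2.
ALPHABET = ("1", "2")


def words_up_to(n):
    """All words over ALPHABET of length <= n (a finite slice of W(I))."""
    return {"".join(t) for k in range(n + 1) for t in product(ALPHABET, repeat=k)}


def complement(a, n):
    """Complement relative to words_up_to(n), not to the infinite set W(I)."""
    return words_up_to(n) - a


def catenation(a, b):
    """L1 L2 = {PQ | P in L1, Q in L2}."""
    return {p + q for p in a for q in b}


def language_power(a, i):
    """L^i for i >= 1, obtained by repeated catenation."""
    if i < 1:
        raise ValueError("this sketch covers only i >= 1")
    result = set(a)
    for _ in range(i - 1):
        result = catenation(result, a)
    return result


L1 = {"122", "121", "1212"}
L2 = words_up_to(2)  # all words of length at most 2

assert len(L2) == 7
# Sum (union) and intersection are the ordinary set operations.
assert (L1 | L2) == L1.union(L2) and (L1 & L2) == set()
# L1 - L2 = L1 intersected with the complement of L2, inside the slice n = 4.
assert L1 - L2 == L1 & complement(L2, 4)
# Catenation of languages is associative.
A, B, C = {"1", "12"}, {"2", ""}, {"21"}
assert catenation(catenation(A, B), C) == catenation(A, catenation(B, C))
# The second power of the alphabet consists of the words of length 2.
assert language_power(set(ALPHABET), 2) == {"11", "12", "21", "22"}
```

Infinite languages such as $L_3$ cannot be stored as sets in this way; a membership predicate would be the natural representation for them.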
Furthermore, we define $L^0$ to be the language $L_\lambda$ consisting of the empty word. Note that the languages $L_\phi$ and $L_\lambda$ are zero and identity elements with respect to catenation of languages: for any language $L$,

$$L L_\phi = L_\phi L = L_\phi \quad \text{and} \quad L L_\lambda = L_\lambda L = L.$$

The catenation closure (or iteration) of a language $L$, in symbols $L^*$, is defined to be the sum of all powers of $L$:

$$L^* = \sum_{i=0}^{\infty} L^i.$$

Later on, we shall define some further operations for the family of languages.

We shall now introduce the notion of a regular expression which will be one of the basic notions in our subsequent discussions. Consider the auxiliary alphabet $I' = \{+,\ *,\ \phi,\ (,\ )\}$ and any alphabet $I$ such that $I$ and $I'$ are disjoint. A regular expression over the alphabet $I$ is any word over the union of the alphabets $I$ and $I'$ which satisfies the condition formulated in the following

DEFINITION.
(i) Each letter belonging to the alphabet $I$ is a regular expression over $I$, and so is the letter $\phi$.
(ii) If $\alpha$ and $\beta$ are regular expressions over $I$, then so are also $(\alpha + \beta)$, $(\alpha\beta)$ and $\alpha^*$.
(iii) Nothing is a regular expression over $I$, unless its being so follows from a finite number of applications of (i) and (ii).

Thus, for instance,

$$\phi,\quad \phi^*,\quad x_1,\quad (x_1 + x_2),\quad ((x_1 + x_1)(x_1 + x_2))^* \tag{1.1}$$

are regular expressions over the alphabet $I_2 = \{x_1, x_2\}$.

DEFINITION. Each regular expression $\gamma$ over an alphabet $I$ denotes a language $|\gamma|$ over $I$ according to the following conventions:
