Formal Syntax and Semantics of Programming Languages
A Laboratory Based Approach

Kenneth Slonneger
University of Iowa

Barry L. Kurtz
Louisiana Tech University

Addison-Wesley Publishing Company

Library of Congress Cataloging-in-Publication Data

Slonneger, Kenneth.
Formal syntax and semantics of programming languages: a laboratory based approach / Kenneth Slonneger, Barry L. Kurtz.
p. cm.
Includes bibliographical references and index.
ISBN 0-201-65697-3
1. Programming languages (Electronic computers)--Syntax. 2. Programming languages (Electronic computers)--Semantics. I. Kurtz, Barry L. II. Title.
QA76.7.S59 1995
005.13'1--dc20 94-4203 CIP

Copyright © 1995 by Addison-Wesley Publishing Company, Inc.

1 2 3 4 5 6 7 8 9 10-MA-979695

Preface

This text developed out of our experiences teaching courses covering the formal semantics of programming languages. Independently we both developed laboratory exercises implementing small programming languages in Prolog following denotational definitions. Prolog proved to be an excellent tool for illustrating the formal semantics of programming languages. We found that these laboratory exercises were highly successful in motivating students since the hands-on experience helped demystify the study of formal semantics. At a professional meeting we became aware of each other’s experiences with a laboratory approach to semantics, and this book evolved from that conference.

Although this text has been carefully written so that the laboratory activities can be omitted without loss of continuity, we hope that most readers will try the laboratory approach and experience the same success that we have observed in our classes.

Overall Goals

We have pursued a broad spectrum of definitional techniques, illustrated with numerous examples. Although the specification methods are formal, the presentation is “gentle”, providing just enough in the way of mathematical underpinnings to produce an understanding of the metalanguages.
We hope to supply enough coverage of mathematics and formal methods to justify the definitional techniques, but the text is accessible to students with a basic grounding in discrete mathematics as presented to undergraduate computer science students.

There has been a tendency in the area of formal semantics to create cryptic, overly concise semantic definitions that intimidate students new to the study of programming languages. The emphasis in this text is on clear notational conventions with the goals of readability and understandability foremost in our minds.

As with other textbooks in this field, we introduce the basic concepts using mini-languages that are rich enough to illustrate the fundamental concepts, yet sparse enough to avoid being overwhelming. We have named our mini-languages after birds.

Wren is a simple imperative language with two types, integer and Boolean, thus allowing for context-sensitive type and declaration checking. It has assignment, if, while, and input/output commands.

Pelican, a block-structured, imperative language, is an extension of Wren containing the declaration of constants, anonymous blocks, procedures, and recursive definitions.

The description of continuations in denotational semantics requires a modified version of Wren with goto statements, which we call Gull. This mini-language can be skipped without loss of continuity if continuations are not covered.

Organization of the Text

The primary target readership of our text is first-year graduate students, although by careful selection of materials it is also accessible to advanced undergraduate students. The text contains more material than can be covered in a one-semester course. We have provided a wide variety of techniques so that instructors may choose materials to suit the particular needs of their students. Dependencies between chapters are indicated in the graph below.
We have purposely attempted to minimize mutual interdependencies and to make our presentation as broad as possible.

[Figure: chapter dependency graph for Chapters 1 through 13]

Only sections 2 and 3 of Chapter 8 depend on Chapter 5. The text contains a laboratory component that we describe in more detail in a moment. However, materials have been carefully organized so that no components of the non-laboratory sections of the text are dependent on any laboratory activities. All of the laboratory activities except those in Chapter 6 depend on Chapter 2.

Overview

The first four chapters deal primarily with the syntax of programming languages. Chapter 1 treats context-free syntax in the guise of BNF grammars and their variants. Since most methods of semantic specification use abstract syntax trees, the abstract syntax of languages is presented and contrasted with concrete syntax.

Language processing with Prolog is introduced in Chapter 2 by describing a scanner for Wren and a parser defined in terms of Prolog logic grammars. These utilities act as the front end for the prototype context checkers, interpreters, and translators developed later in the text. Extensions of BNF grammars that provide methods of verifying the context-sensitive aspects of programming languages, namely attribute grammars and two-level grammars, are described in Chapters 3 and 4.

Chapters 5 through 8 are devoted to semantic formalisms that can be classified as operational semantics. Chapter 5 introduces the lambda calculus by describing its syntax and the evaluation of lambda expressions by reduction rules. Metacircular interpreters are considered in Chapter 6, which introduces the self-definition of programming languages. Chapter 7 describes the translation of Wren into assembly language using an attribute grammar that constructs the target code as a program is parsed.
Two well-known operational formalisms are treated in Chapter 8: the SECD machine (an abstract machine for evaluating the lambda calculus) and structural operational semantics (an operational methodology for describing the semantics of programming languages in terms of logical rules of inference). We use this technique to specify the semantics of Wren formally.

The last five chapters present three traditional methods of defining the semantics of programming languages formally and one recently proposed technique. Denotational semantics, one of the most complete and successful methods of specifying a programming language, is covered in Chapter 9. Specifications of several languages are provided, including a calculator language, Wren, Pelican, and Gull, a language whose semantics requires continuation semantics. Denotational semantics is also used to check the context constraints for Wren. Chapter 10 deals with the mathematical foundations of denotational semantics in domain theory by describing the data structures employed by denotational definitions. Chapter 10 also includes a justification for recursive definitions via fixed-point semantics, which is then applied in lambda calculus evaluation.

Axiomatic semantics, dealt with in Chapter 11, has become an important component of software development by means of proofs of correctness for algorithms. The approach here presents axiomatic specifications of Wren and Pelican, but the primary examples involve proofs of partial correctness and termination. The chapter concludes with a brief introduction to using assertions as program specifications and deriving program code based on these assertions. Chapter 12 investigates the algebraic specification of abstract data types and uses these formalisms to specify the context constraints and the semantics of Wren. Algebraic semantics also provides an explanation of abstract syntax.
Chapter 13 introduces a specification method, action semantics, that has been proposed recently in response to criticisms arising from the difficulty of using formal methods. Action semantics resembles denotational semantics but can be viewed in terms of operational behavior without sacrificing mathematical rigor. We use it to specify the semantics of the calculator language, Wren, and Pelican.

The text concludes with two short appendices introducing the basics of programming in Prolog and Scheme, which is used in Chapter 6.

The Laboratory Component

A unique feature of this text is the laboratory component. Running throughout the text is a series of exercises and examples that involve implementing syntactic and semantic specifications on real systems. We have chosen Prolog as the primary vehicle for these implementations for several reasons:

1. Prolog provides high-level programming enabling the construction of derivation trees and abstract syntax trees as structures without using pointer programming as needed in most imperative languages.

2. Most Prolog systems provide a programming environment that is easy to use, especially in the context of rapid prototyping; large systems can be developed one predicate at a time and can be tested during their construction.

3. Logic programming creates a framework for drawing out the logical properties of abstract specifications that encourages students to approach problems in a disciplined and logical manner. Furthermore, the specifications described in logic become executable specifications with Prolog.

4. Prolog’s logic grammars provide a simple-to-use parser that can serve as a front end to language processors. It also serves as a direct implementation of attribute grammars and provides an immediate application of BNF specifications of the context-free part of a language’s grammar.

An appendix covering the basics of Prolog is provided for students unfamiliar with logic programming.
Our experience has shown that the laboratory practice greatly enhances the learning experience. The only way to master formal methods of language definition is to practice writing and reading language specifications. We involve students in the implementation of general tools that can be applied to a variety of examples and that provide increased motivation and feedback to the students. Submitting specifications to a prototyping system can uncover oversights and subtleties that are not apparent to a casual reader. As authors, we have frequently used these laboratory approaches to help “debug” our formal specifications!

Laboratory materials found in this textbook are available on the Internet via anonymous ftp from ftp.cs.uiowa.edu in the subdirectory pub/slonnegr.

Laboratory Activities

Chapter 2: Scanning and parsing Wren
Chapter 3: Context checking Wren using an attribute grammar
Chapter 4: Context checking Hollerith literals using a two-level grammar
Chapter 5: Evaluating the lambda calculus using its reduction rules
Chapter 6: Self-definition of Scheme (Lisp)
           Self-definition of Prolog
Chapter 7: Translating (compiling) Wren programs following an attribute grammar
Chapter 8: Interpreting the lambda calculus using the SECD machine
           Interpreting Wren according to a definition using structural operational semantics
Chapter 9: Interpreting Wren following a denotational specification
Chapter 10: Evaluating a lambda calculus that includes recursive definitions
Chapter 12: Interpreting Wren according to an algebraic specification of the language
Chapter 13: Translating Pelican programs into action notation following a specification in action semantics
Contents

Chapter 1 SPECIFYING SYNTAX
1.1 GRAMMARS AND BNF
    Context-Free Grammars
    Context-Sensitive Grammars
    Exercises
1.2 THE PROGRAMMING LANGUAGE WREN
    Ambiguity
    Context Constraints in Wren
    Semantic Errors in Wren
    Exercises
1.3 VARIANTS OF BNF
    Exercises
1.4 ABSTRACT SYNTAX
    Abstract Syntax Trees
    Abstract Syntax of a Programming Language
    Exercises
1.5 FURTHER READING

Chapter 2 INTRODUCTION TO LABORATORY ACTIVITIES
2.1 SCANNING
    Exercises
2.2 LOGIC GRAMMARS
    Motivating Logic Grammars
    Improving the Parser
    Prolog Grammar Rules
    Parameters in Grammars
    Executing Goals in a Logic Grammar
    Exercises
2.3 PARSING WREN
    Handling Left Recursion
    Left Factoring
    Exercises
2.4 FURTHER READING

Chapter 3 ATTRIBUTE GRAMMARS
3.1 CONCEPTS AND EXAMPLES
    Examples of Attribute Grammars
    Formal Definitions
    Semantics via Attribute Grammars
    Exercises
3.2 AN ATTRIBUTE GRAMMAR FOR WREN
    The Symbol Table
    Commands
    Expressions
    Exercises
3.3 LABORATORY: CONTEXT CHECKING WREN
    Declarations
    Commands
    Expressions
    Exercises
3.4 FURTHER READING

Chapter 4 TWO-LEVEL GRAMMARS
4.1 CONCEPTS AND EXAMPLES
    Fortran String Literals
    Derivation Trees
    Exercises
4.2 A TWO-LEVEL GRAMMAR FOR WREN
    Declarations
    Commands and Expressions
    Exercises
4.3 TWO-LEVEL GRAMMARS AND PROLOG
    Implementing Two-Level Grammars in Prolog
    Two-Level Grammars and Logic Programming
    Exercises
4.4 FURTHER READING

Chapter 5 THE LAMBDA CALCULUS
5.1 CONCEPTS AND EXAMPLES
    Syntax of the Lambda Calculus
    Curried Functions
    Semantics of Lambda Expressions
    Exercises
5.2 LAMBDA REDUCTION
    Reduction Strategies
    Correlation with Parameter Passing
    Constants in the Pure Lambda Calculus
    Functional Programming Languages
    Exercises
5.3 LABORATORY: A LAMBDA CALCULUS EVALUATOR
    Scanner and Parser
    The Lambda Calculus Evaluator
    Exercises
5.4 FURTHER READING

Chapter 6 SELF-DEFINITION OF PROGRAMMING LANGUAGES
6.1 SELF-DEFINITION OF LISP
    Metacircular Interpreter
    Running the Interpreter
    Exercises
6.2 SELF-DEFINITION OF PROLOG
    Displaying Failure
    Exercises
6.3 FURTHER READING

Chapter 7 TRANSLATIONAL SEMANTICS
7.1 CONCEPTS AND EXAMPLES
    A Program Translation
    Exercises
7.2 ATTRIBUTE GRAMMAR CODE GENERATION
    Expressions
    Commands
    Exercises
7.3 LABORATORY: IMPLEMENTING CODE GENERATION
    Commands
    Expressions
    Exercises
7.4 FURTHER READING

Chapter 8 TRADITIONAL OPERATIONAL SEMANTICS
8.1 CONCEPTS AND EXAMPLES
    VDL
    Exercises
8.2 SECD: AN ABSTRACT MACHINE
    Example
    Parameter Passing
    Static Scoping
    Exercises
8.3 LABORATORY: IMPLEMENTING THE SECD MACHINE
    Exercises
8.4 STRUCTURAL OPERATIONAL SEMANTICS: INTRODUCTION
    Specifying Syntax
    Inference Systems and Structural Induction
    Exercises
8.5 STRUCTURAL OPERATIONAL SEMANTICS: EXPRESSIONS
    Semantics of Expressions in Wren