
Lightweight Compiler Techniques PDF

262 Pages·2006·0.668 MB·English

Preview Lightweight Compiler Techniques

Nils M Holm
Lightweight Compiler Techniques

Copyright © 1996, 2002, 2006 Nils M Holm <[email protected]>
Print and Distribution: Lulu Press, Inc

This book was typeset using TROFF(1) and friends on a FreeBSD system.

"Lightweight Compiler Techniques" was printed in small quantities and distributed by myself in the 1990s. I always wanted to create a second edition, but never managed to do so. However, a few people still seem to be interested in this work, so I decided to launch TROFF one last time and try to get the original manuscript into a printable form, and here it is.

Note: this is the original, unrevised text from 1996 with a few minor corrections from 2002. The layout may look a bit odd here and there. This is because the original book had a different size. I am sorry, but I currently do not have the time to fix this.

Nils M Holm, June 2006

Table of Contents

1. An Introduction to T3X
  1.1 Programs
    1.1.1 The Input Alphabet
    1.1.2 Comments
    1.1.3 Naming Conventions
    1.1.4 Data Declarations
    1.1.5 Factors
    1.1.6 Expressions
    1.1.7 Constant Expressions
    1.1.8 Statements
    1.1.9 Local Storage
    1.1.10 Procedures
    1.1.11 The Main Program
  1.2 The T3X Object Model
    1.2.1 Object Oriented Programming
    1.2.2 Classes
    1.2.3 Objects
    1.2.4 Instances
    1.2.5 Class Dependencies
    1.2.6 Methods and Messages
  1.3 The Complete Scoping Model
    1.3.1 Class Conflicts
  1.4 Meta Commands
  1.5 Runtime Support Classes
    1.5.1 The T3X Core Class
    1.5.2 The Char Class
    1.5.3 The IOStream Class
    1.5.4 The Memory Class
    1.5.5 The String Class
    1.5.6 The Tcode Class
    1.5.7 The System Class
    1.5.8 The TTYCtl Class
    1.5.9 The XMem Class
2. How a Compiler Works
  2.1 Scanning
  2.2 Parsing
  2.3 Semantic Analysis
  2.4 Optimizing
  2.5 Glue Generation
  2.6 A Simplified Model
3. The Tcode Machine
  3.1 The Tcode Architecture
    3.1.1 The Registers of the Tcode Machine
  3.2 Fundamental Definitions
  3.3 Notation
  3.4 Declarations
  3.5 Arithmetic
  3.6 Memory
  3.7 Procedures
  3.8 Branches
  3.9 EXEC and HALT
  3.10 External Linkage
  3.11 Programming Conventions
  3.12 Startup Conditions
4. The T3X Translator
  4.1 The Commented TXTRN Listing
  4.2 Techniques and Foundations
    4.2.1 Input Buffering
    4.2.2 Delayed Code Generation
    4.2.3 Generating Flow Control Statements
    4.2.4 Falling Precedence Parsing
    4.2.5 Syntactic Ambiguities
    4.2.6 Local Contexts
    4.2.7 Class Level Scopes
5. Tcode Optimization
  5.1 The Commented TXOPT Listing
  5.2 Optimization Algorithms
    5.2.1 Constant Expression Folding
    5.2.2 Constant Condition Folding
    5.2.3 Jump to Jump Redirection
    5.2.4 Eliminating Dead Procedures
    5.2.5 The Order of Optimizations
6. Native Code Generation
  6.1 The Target Language
  6.2 The Task of Code Transformation
  6.3 VSM Instruction Inlining
    6.3.1 PUSH/POP Elimination
  6.4 Code Synthesis
  6.5 Cyclic Register Allocation
    6.5.1 Working Around Specialized Registers
  6.6 Procedure Calls
  6.7 Relational Operations and Branches
7. Appendices
  7.1 The T3X Grammar
  7.2 Summaries
    7.2.1 Statements
    7.2.2 Operators
    7.2.3 Runtime Support Procedures
    7.2.4 Escape Sequences
    7.2.5 Optimization Templates
  7.3 References
    7.3.1 Examples
    7.3.2 Pictures
    7.3.3 Tables
    7.3.4 Index

1 An Introduction to T3X

T3X is a small, portable, procedural, block-structured, recursive, almost typeless, and to some degree object oriented language. Its syntax is derived from Pascal and BCPL, and its object oriented model is similar to that of Java, but much simpler. The structured approach to programming is well understood, provides a sufficient degree of abstraction, and can at the same time be translated easily into native machine code. The object model eases the development of general and reusable code.

T3X is an imperative language. This means that a program consists of a set of instructions which tell the computer in what way to manipulate the data defined by the program. An instruction is also called a statement.
In structured programming languages, there are four fundamental ways of formulating statements:

• Assignments
• Sequences
• Branches
• Iterations

In a sequence, which is basically a list of statements, the statements are processed from the top towards the bottom of the list. Each statement is guaranteed to be completely processed by the time the next one is interpreted. A branch is a statement which is executed only if an associated condition applies. Iteration is the repetition of a statement depending on a condition.

In a block-structured language, statements may be grouped in statement blocks or compound statements. Each block may have its own local data which cannot be affected by statements contained in other blocks.

An additional layer of abstraction is added to an imperative, block-structured language by providing user-defined procedures or functions (in this book, these terms will be used synonymously). A procedure is a statement or a set of statements which is bound to a symbolic name. A procedure can be executed by coding a call to that procedure. Most languages provide a mechanism to transport data to a procedure and return a value to the calling program. Some languages (like BCPL and Pascal) make a distinction between procedures and functions, others (like K&R C) do not. In languages which make a distinction between procedures and functions, only functions may return values. In T3X, all procedures return values, but the caller is free to ignore them. Therefore, procedures and functions are basically the same.

Another level of abstraction is provided by adding an object model to the language. The object model of T3X consists solely of

• Classes
• Objects
• Messages

Classes are used to encapsulate code and data of a program. A class may contain any number of data objects and procedures. Only public procedures (so-called methods) may be called by other procedures (or methods). Objects are used to instantiate classes.
Each instance of a class has its own private data area. Hence the same class may be used as a template to create independent objects. Messages are used to activate methods of specific objects. T3X does not provide inheritance, since it is the source of a whole load of semantic problems. It does not support different protection levels (like public data or ‘friend’ relationships) either, because these concepts undermine the object oriented model.

T3X is an almost typeless language. There exist two different types: so-called atomic variables, which may hold small data objects like characters, numbers, and references to other data objects, and vectors, which are used to store logically connected groups of small data objects. Additionally, there are constants, templates for defining structured data objects and classes, and different types of procedure declarations.

The T3X compiler does not allow some combinations of operators which do not make sense (like assigning a value to a procedure or sending a message to a vector). Consequently, T3X’s type checking is much more strict than, for example, BCPL’s, but much less restrictive than Pascal’s. Weakly typed and typeless languages have been exposed to a lot of criticism in the past, because they are considered ‘insecure’, but the degree of simplicity and flexibility which is bought by ‘sacrificing’ this bit of security is immense. The type checking mechanisms of the T3X language are limited to the detection of

• wrong argument counts in procedure calls
• assignments to non-variables
• calls of non-procedures
• instantiations of non-classes
• sending messages to non-objects
• sending non-messages to objects
• dependencies on non-classes

BTW: during the development of an early version of T3X, a severe bug occurred in the compiler. After tracking it down, it turned out that it was limited to the (type-safe) ANSI C version of the translator and did not affect the T3X version.
Of course, this was a coincidence, but to some degree it weakens the proposition that typeless languages are per se insecure and dangerous.

The remainder of this chapter describes the structure of T3X programs and the individual components a program may consist of.
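
The checks listed in this introduction look only at what kind of thing a name refers to, never at the type of the value it holds. The following is a minimal sketch of that idea in Python rather than T3X. It is not taken from the T3X translator: the names `Kind`, `Symbol`, `KindError`, the `check_*` functions and `rejected` are all hypothetical, and the rule that only plain variables can be assigned to is a simplifying assumption of the sketch.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    """Hypothetical symbol kinds (illustrative names, not T3X internals)."""
    VARIABLE = auto()
    VECTOR = auto()
    PROCEDURE = auto()
    CLASS = auto()
    OBJECT = auto()


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: Kind
    arity: int = 0  # parameter count; only meaningful for procedures


class KindError(Exception):
    """Raised when a symbol is used in a way its kind does not permit."""


def check_assignment(target: Symbol) -> None:
    # Simplifying assumption of this sketch: only plain variables are assignable.
    if target.kind is not Kind.VARIABLE:
        raise KindError(f"assignment to non-variable: {target.name}")


def check_call(proc: Symbol, argc: int) -> None:
    # Detects calls of non-procedures and wrong argument counts.
    if proc.kind is not Kind.PROCEDURE:
        raise KindError(f"call of non-procedure: {proc.name}")
    if argc != proc.arity:
        raise KindError(
            f"{proc.name}: expected {proc.arity} arguments, got {argc}")


def check_instantiation(cls: Symbol) -> None:
    # Detects instantiations of non-classes.
    if cls.kind is not Kind.CLASS:
        raise KindError(f"instantiation of non-class: {cls.name}")


def check_send(receiver: Symbol) -> None:
    # Detects messages sent to non-objects, for example to a vector.
    if receiver.kind is not Kind.OBJECT:
        raise KindError(f"message sent to non-object: {receiver.name}")


def rejected(check, *args) -> bool:
    """Return True if the given check raises KindError for these arguments."""
    try:
        check(*args)
    except KindError:
        return True
    return False
```

With these definitions, `rejected(check_send, Symbol("v", Kind.VECTOR))` is true, while `rejected(check_call, Symbol("p", Kind.PROCEDURE, 2), 2)` is false. Nothing in the sketch inspects what a variable contains, which is the sense in which such a language remains almost typeless.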
