Joe Armstrong Thesis PDF

295 Pages·2003·0.82 MB·English
by Joe Armstrong
Preview Joe Armstrong Thesis

Making reliable distributed systems in the presence of software errors

Final version (with corrections), last update 20 November 2003

Joe Armstrong

A Dissertation submitted to the Royal Institute of Technology in partial fulfilment of the requirements for the degree of Doctor of Technology

The Royal Institute of Technology
Stockholm, Sweden
December 2003
Department of Microelectronics and Information Technology

TRITA–IMIT–LECS AVH 03:09
ISSN 1651–4076
ISRN KTH/IMIT/LECS/AVH-03/09–SE
and
SICS Dissertation Series 34
ISSN 1101–1335
ISRN SICS–D–34–SE

© Joe Armstrong, 2003
Printed by Universitetsservice US-AB 2003

To Helen, Thomas and Claire

Abstract

The work described in this thesis is the result of a research program started in 1981 to find better ways of programming Telecom applications. These applications are large programs which despite careful testing will probably contain many errors when the program is put into service. We assume that such programs do contain errors, and investigate methods for building reliable systems despite such errors.

The research has resulted in the development of a new programming language (called Erlang), together with a design methodology, and set of libraries for building robust systems (called OTP). At the time of writing the technology described here is used in a number of major Ericsson, and Nortel products. A number of small companies have also been formed which exploit the technology.

The central problem addressed by this thesis is the problem of constructing reliable systems from programs which may themselves contain errors. Constructing such systems imposes a number of requirements on any programming language that is to be used for the construction. I discuss these language requirements, and show how they are satisfied by Erlang.

Problems can be solved in a programming language, or in the standard libraries which accompany the language. I argue how certain of the requirements necessary to build a fault-tolerant system are solved in the language, and others are solved in the standard libraries. Together these form a basis for building fault-tolerant software systems.

No theory is complete without proof that the ideas work in practice. To demonstrate that these ideas work in practice I present a number of case studies of large commercially successful products which use this technology. At the time of writing the largest of these projects is a major Ericsson product, having over a million lines of Erlang code. This product (the AXD301) is thought to be one of the most reliable products ever made by Ericsson.

Finally, I ask if the goal of finding better ways to program Telecom applications was fulfilled; I also point to areas where I think the system could be improved.

Contents

Abstract

1 Introduction
  1.1 Background
      Ericsson background
      Chronology
  1.2 Thesis outline
      Chapter by chapter summary

2 The Architectural Model
  2.1 Definition of an architecture
  2.2 Problem domain
  2.3 Philosophy
  2.4 Concurrency oriented programming
    2.4.1 Programming by observing the real world
    2.4.2 Characteristics of a COPL
    2.4.3 Process isolation
    2.4.4 Names of processes
    2.4.5 Message passing
    2.4.6 Protocols
    2.4.7 COP and programmer teams
  2.5 System requirements
  2.6 Language requirements
  2.7 Library requirements
  2.8 Application libraries
  2.9 Construction guidelines
  2.10 Related work

3 Erlang
  3.1 Overview
  3.2 Example
  3.3 Sequential Erlang
    3.3.1 Data structures
    3.3.2 Variables
    3.3.3 Terms and patterns
    3.3.4 Guards
    3.3.5 Extended pattern matching
    3.3.6 Functions
    3.3.7 Function bodies
    3.3.8 Tail recursion
    3.3.9 Special forms
    3.3.10 case
    3.3.11 if
    3.3.12 Higher order functions
    3.3.13 List comprehensions
    3.3.14 Binaries
    3.3.15 The bit syntax
    3.3.16 Records
    3.3.17 epp
    3.3.18 Macros
    3.3.19 Include files
  3.4 Concurrent programming
    3.4.1 register
  3.5 Error handling
    3.5.1 Exceptions
    3.5.2 catch
    3.5.3 exit
    3.5.4 throw
    3.5.5 Corrected and uncorrected errors
    3.5.6 Process links and monitors
  3.6 Distributed programming
  3.7 Ports
  3.8 Dynamic code change
  3.9 A type notation
  3.10 Discussion

4 Programming Techniques
  4.1 Abstracting out concurrency
    4.1.1 A fault-tolerant client-server
  4.2 Maintaining the Erlang view of the world
  4.3 Error handling philosophy
    4.3.1 Let some other process fix the error
    4.3.2 Workers and supervisors
  4.4 Let it crash
  4.5 Intentional programming
  4.6 Discussion

5 Programming Fault-tolerant Systems
  5.1 Programming fault-tolerance
  5.2 Supervision hierarchies
    5.2.1 Diagrammatic representation
    5.2.2 Linear supervision
    5.2.3 And/or supervision hierarchies
  5.3 What is an error?
    5.3.1 Well-behaved functions

6 Building an Application
  6.1 Behaviours
    6.1.1 How behaviours are written
  6.2 Generic server principles
    6.2.1 The generic server API
    6.2.2 Generic server example
  6.3 Event manager principles
    6.3.1 The event manager API
    6.3.2 Event manager example
  6.4 Finite state machine principles
    6.4.1 Finite state machine API
    6.4.2 Finite state machine example
  6.5 Supervisor principles
    6.5.1 Supervisor API
    6.5.2 Supervisor example
  6.6 Application principles
    6.6.1 Applications API
    6.6.2 Application example
  6.7 Systems and releases
  6.8 Discussion

7 OTP
  7.1 Libraries

8 Case Studies
  8.1 Methodology
  8.2 AXD301
  8.3 Quantitative properties of the software
    8.3.1 System Structure
    8.3.2 Evidence for fault recovery
    8.3.3 Trouble report HD90439
    8.3.4 Trouble report HD29758
    8.3.5 Deficiencies in OTP structure
  8.4 Smaller products
    8.4.1 Bluetail Mail Robustifier
    8.4.2 Alteon SSL accelerator
    8.4.3 Quantitative properties of the code
  8.5 Discussion

9 APIs and Protocols
  9.1 Protocols
  9.2 APIs or protocols?
  9.3 Communicating components
  9.4 Discussion

10 Conclusions
  10.1 What has been achieved so far?

Description:
software errors. Final version (with corrections), last update 20 November 2003. Joe Armstrong. A Dissertation submitted to the Royal Institute of Technology in partial .. and sell Erlang to external customers and to provide training and consulting data stored in dets should survive the crash.