
Introduction to Computer Data Representation PDF

268 Pages·5.322 MB·English

Preview Introduction to Computer Data Representation

Introduction to Computer Data Representation

Peter Fenwick
The University of Auckland (Retired), New Zealand

Bentham Science Publishers, Executive Suite Y - 2, PO Box 7917, Saif Zone, Sharjah, U.A.E. [email protected]
Bentham Science Publishers, P.O. Box 446, Oak Park, IL 60301-0446, USA [email protected]
Bentham Science Publishers, P.O. Box 294, 1400 AG Bussum, THE NETHERLANDS [email protected]

Please read this license agreement carefully before using this eBook. Your use of this eBook/chapter constitutes your agreement to the terms and conditions set forth in this License Agreement. This work is protected under copyright by Bentham Science Publishers to grant the user of this eBook/chapter, a non-exclusive, nontransferable license to download and use this eBook/chapter under the following terms and conditions:

1. This eBook/chapter may be downloaded and used by one user on one computer. The user may make one back-up copy of this publication to avoid losing it. The user may not give copies of this publication to others, or make it available for others to copy or download. For a multi-user license contact [email protected]

2. All rights reserved: All content in this publication is copyrighted and Bentham Science Publishers own the copyright. You may not copy, reproduce, modify, remove, delete, augment, add to, publish, transmit, sell, resell, create derivative works from, or in any way exploit any of this publication’s content, in any form by any means, in whole or in part, without the prior written permission from Bentham Science Publishers.

3. The user may print one or more copies/pages of this eBook/chapter for their personal use. The user may not print pages from this eBook/chapter or the entire printed eBook/chapter for general distribution, for promotion, for creating new works, or for resale. Specific permission must be obtained from the publisher for such requirements. Requests must be sent to the permissions department at E-mail: [email protected]

4. The unauthorized use or distribution of copyrighted or other proprietary content is illegal and could subject the purchaser to substantial money damages. The purchaser will be liable for any damage resulting from misuse of this publication or any violation of this License Agreement, including any infringement of copyrights or proprietary rights.

5. The following DRM (Digital Rights Management) policy is applicable on this eBook for the non-library / personal / single-user. Library / institutional / multi-users will get a DRM free copy and they may implement their own institutional DRM policy.
• 25 ‘Copy’ commands can be executed every 7 days. The text selected for copying cannot extend to more than one single page.
• 25 pages can be printed every 7 days.
• eBook files are not transferable to multiple computer/devices. If you wish to use the eBook on another device, you must send a request to [email protected] along with the original order number that you received when the order was placed.

Warranty Disclaimer: The publisher does not guarantee that the information in this publication is error-free, or warrants that it will meet the users’ requirements or that the operation of the publication will be uninterrupted or error-free. This publication is provided "as is" without warranty of any kind, either express or implied or statutory, including, without limitation, implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the results and performance of this publication is assumed by the user. In no event will the publisher be liable for any damages, including, without limitation, incidental and consequential damages and damages for lost data or profits arising out of the use or inability to use the publication. The entire liability of the publisher shall be limited to the amount actually paid by the user for the eBook or eBook license agreement.
Limitation of Liability: Under no circumstances shall Bentham Science Publishers, its staff, editors and authors, be liable for any special or consequential damages that result from the use of, or the inability to use, the materials in this site.

eBook Product Disclaimer: No responsibility is assumed by Bentham Science Publishers, its staff or members of the editorial board for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products instruction, advertisements or ideas contained in the publication purchased or read by the user(s). Any dispute will be governed exclusively by the laws of the U.A.E. and will be settled exclusively by the competent Court at the city of Dubai, U.A.E.

You (the user) acknowledge that you have read this Agreement, and agree to be bound by its terms and conditions.

Permission for Use of Material and Reproduction

Permission Information for Users Outside the USA: Bentham Science Publishers grants authorization for individuals to photocopy copyright material for private research use, on the sole basis that requests for such use are referred directly to the requestor's local Reproduction Rights Organization (RRO). The copyright fee is US $25.00 per copy per article exclusive of any charge or fee levied. In order to contact your local RRO, please contact the International Federation of Reproduction Rights Organisations (IFRRO), Rue Joseph II, 9-13, 1000 Brussels, Belgium; Tel: +32 2 234 62 60; Fax: +32 2 234 62 69; E-mail: [email protected]; url: www.ifrro.org. This authorization does not extend to any other kind of copying by any means, in any form, and for any purpose other than private research use.
Permission Information for Users in the USA: Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by Bentham Science Publishers for libraries and other users registered with the Copyright Clearance Center (CCC) Transactional Reporting Services, provided that the appropriate fee of US $25.00 per copy per chapter is paid directly to Copyright Clearance Center, 222 Rosewood Drive, Danvers MA 01923, USA. Refer also to www.copyright.com.

To Brenda

CONTENTS

Foreword
Preface

CHAPTERS

Introduction and Overview
1. Numbers and Computers
2. Binary and Other Representations
3. Signed, and Other, Representations
4. Basic Arithmetic and Logic
5. Computer Arithmetic
6. Floating-Point Representations
7. Logarithmic Representations
8. Characters and Text
9. Universal (Variable Length) Codes
10. Checksums and Error Control
11. Miscellaneous Topics
12. Concluding Comments
Bibliography
Index

Foreword

It is my great pleasure to recommend this excellent book written by my friend and colleague, Professor Peter Fenwick. During the eleven years I have known him, we have had many a discussion, often touching on topics covered here. Though this is the closest we have come to a collaboration I have little doubt that had we met earlier in our careers we would have collaborated extensively.

A major contribution of this book is to bring a historical perspective to many topics that are so widely accepted that it might not be obvious there were choices to be made. The binary representation of numbers was so obvious even in the 1940s that Burks, Goldstine and von Neumann are said to have “adopted it seemingly without discussion”. But Burks et al. considered floating point representation, then argued against supporting it. Long ago I heard it claimed that von Neumann believed any mathematician “worth his salt” should be able to specify floating point computations using only integers.
In any case, floating point only came into its own in the 1980s, with the broad acceptance of the IEEE standards. Professor Fenwick shows great insight into why it took decades to get right something as basic as the representation of numbers.

A second important contribution is discussion of the introduction of redundancy to increase reliability in the presence of errors: check sums and variable-length (universal) codes. While simple check sums are frequently discussed, I know of no comparable source for a general discussion of Universal codes, an important but somewhat obscure subject.

I agree with Professor Fenwick’s quote, that “everybody thinks they know” about these topics, but there are big holes, even today. Surely most of us have superficial knowledge that fails us when we really need to work through the details.

This book covers a huge range of material, thoroughly and concisely. I have taught a good bit of the material, but I learned much, even in areas where I claim some expertise. The book displays a deep understanding of the many and varied requirements for digital representation of information, from the obvious integers and floating point, to Zeckendorf representations and Gray codes; from 2’s complement to logarithmic arithmetic; from Elias and Levenstein codes to Rice and Golomb codes and on to Ternary Comma and Fibonacci codes.

In addition to the plethora of ways to represent numbers, it also covers representation of characters and strings. While the book will serve very well as a reference, it is also fascinating reading. Many pages are devoted to obscure topics, interesting largely because of their place in history, but outside the domain of a classic textbook on computer organization or architecture. These are perhaps the most important sections, precisely because they had to be understood and discarded to get us where we are now.

This book definitely does not qualify for the subtitle, “Data Representation for Dummies”.
While it quickly surveys common forms of representation, the pace and breadth will bewilder the true novice. On occasions, it uses terms unfamiliar (at least to an American), requiring another source. Appropriately, Professor Fenwick acknowledges the role of Wikipedia, which covers rather more topics than his book, but certainly not as coherently.

The author has a wry, if somewhat subtle, sense of humour which often surfaces unexpectedly: it’s a bit of a stretch, but of course the description and figure regarding Gray codes include a “grey area”!

Discussion of the roles and interaction of precision, accuracy and range is superb. Floating point representation is highly precise, so why is it dangerous for use in financial calculations? Professor Fenwick points out something that had not occurred to me: a “quite ordinary calculator” is capable of more precise arithmetic than a 32-bit [IEEE single-precision] floating point computation. That explains why the calculator “app” on my iPad has both less range, and less precision, than the HP calculator I bought 35 years ago!

A topic rarely covered so clearly is “unwarranted precision”, the process of using a precise mathematical operation to apparently increase accuracy (significant digits) of a number. Professor Fenwick points out confusion over precision created by the fact that the speed of light is so close to 300,000,000 metres per second, and the fact that scientific notation provides information about the accuracy of a value (pp. 106-107). I especially liked his discussion of the sins of the popular press, for example, by apparently increasing precision in the process of converting units: an altitude “10,000 feet” (accurate to, say, ±100 metres) becomes the apparently more precise, but inaccurate, “3048 metres”. It is unfortunate that the general level of this book is beyond comprehension for most journalists!

In short, this is a fascinating book that will appeal to many because of its authoritative exploration of how we represent information.
But it will also serve as a reference for those requiring, or simply enjoying, the ability to choose efficient representations that lead to accurate results. It’s a good read, and a great book to keep handy.

James R. Goodman, United States of America
Fellow IEEE, Fellow ACM, 2013 Eckert-Mauchly Award

PREFACE

This book arose from lectures on data representation given to first year Computer Science students at the University of Auckland. But then it grew as I realised that ever-more material seemed relevant, useful, or just interesting. To a large extent it reflects my own journey through computing from about 1964–2004, starting from logic design, through computer hardware, computer arithmetic and data communications into, finally, data compression. Thus the computers that I reference are largely those with which I have at least passing experience. (There are of course many others that I have not encountered, but few of these are mentioned.) And the footnotes and asides often come from personal experience; many are distant recollections which I cannot now attribute.

A comment made by one person who read this book was “This is an area that everybody thinks they know, but really nobody really knows very well”. While most elementary Computer Science books certainly describe some data representation (usually restricted to current “best practice”), and other books give great detail of specialised topics such as floating point, there seems to be a great gap in the middle. It is this gap, giving reasonable coverage of most data types from first principles, that I hope this book supplies.

It deals mostly with data at the architectural level, with no mention of the trees, lists etc as normally covered in Data Structures courses. The main exception here is the description of text strings: characters are of little interest in isolation; strings are the usual entity to be manipulated and are often regarded as a data primitive.
It also includes a comprehensive coverage of variable-length integer representations and of checksums, both topics which seem to have little overall coverage in the general literature.

Peter Fenwick
The University of Auckland, New Zealand (retired)
email : [email protected]

Acknowledgements

The book was started while I was employed at the University of Auckland, but with no explicit support. I acknowledge the assistance from Brian Hicks and Murray Johns who, many years ago, introduced me to computers, and some of whose insights are still present in this book. Bob Doran, Amos Omondi and Brian Carpenter read early drafts and suggested valuable extra topics. Assistance was also received from Prof F.P. Brooks, Dr R.F. Rice and Jørgen Ibsen. Special thanks go to Jim Goodman who provided many useful comments while preparing the Foreword. And last but not least Brenda, who has endured many years (probably far too many!) of “The Book”.

Conflicts of Interest

There are no conflicts of interest.

Send Orders for Reprints to [email protected]
Introduction to Computer Data Representation, 2014, 1-4

Introduction and Overview

The Background

“My first computer”, in 1964, was an IBM 1620 with all of 20000 decimal digits (10k characters) of 20 µs memory and a Floating Multiply time of 10 ms (yes, 10 milliseconds; a division took 50 ms and you could see it on the panel lights). By contrast, a modest current desktop computer might be larger/faster by perhaps 1 million times. (The 1620 had no external storage, but a computer which replaced it in 1967 had a 1 Mbyte disk cartridge, which has perhaps a similar relation to modern storage capacities.) While in those early computers time was important (after all you could, usually, just wait longer), a very real problem was memory. All too often the algorithm or data structure was decided more by memory efficiency than by computational speed. (And memory was expensive, say tens of cents per byte in 1960s currency, or several dollars in 2013, so you seldom had much available.)
Thus it was often essential to know just how data was actually held, especially if there was a lot of it [1]. “Good” programmers were keenly aware of the detailed structure of records and other data structures. These matters really “hit home” to me in the late 1990s, when I was teaching an introductory Computer Science course and had to change from Pascal to Java. The details of data storage and representation just vanished into the mysteries of Classes and such, buried under layers of abstraction; I suddenly realised that much of my years (decades?) of hard-won knowledge and skills were largely obsolete.

At about the same time, I also realised that too many books presented data representation as “This is the way it is, and nothing else is important”. Clearly this is wrong and much can be learnt from why things once-popular

[1] The word “data”, while strictly the plural of “datum”, will be treated here as a singular “noun of multitude”, following a widely-accepted usage. The plural form “data are” may be used where the individual components are identifiable and important.

Peter Fenwick
All rights reserved - © 2014 Bentham Science Publishers
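
As an illustration of what knowing “just how data was actually held” can mean in practice, here is a minimal Python sketch that is not taken from the book. It packs a small record into raw bytes with the standard `struct` module and reads it back. The record layout, the field names (`rec_id`, `price`, `code`) and the sample values are assumptions invented for this example. Storing the price as a 32-bit IEEE single also hints at the Foreword's point about the limited precision of that format.

```python
import struct

# Hypothetical record layout, chosen only for illustration:
#   rec_id : unsigned 16-bit integer   (format code "H")
#   price  : 32-bit IEEE 754 single    (format code "f")
#   code   : 4 bytes of ASCII text     (format code "4s")
# The leading "<" asks for little-endian byte order with no padding.
RECORD_FORMAT = "<Hf4s"


def pack_record(rec_id, price, code):
    """Return the raw bytes that hold one record."""
    return struct.pack(RECORD_FORMAT, rec_id, price, code.encode("ascii"))


def unpack_record(raw):
    """Recover the three fields from the raw bytes."""
    rec_id, price, code = struct.unpack(RECORD_FORMAT, raw)
    return rec_id, price, code.decode("ascii")


raw = pack_record(258, 0.1, "ABCD")
print(len(raw))   # 10 bytes: 2 + 4 + 4
print(raw.hex())  # 0201cdcccc3d41424344

rec_id, price, code = unpack_record(raw)
print(rec_id, code)  # 258 ABCD
print(repr(price))   # 0.10000000149011612, not 0.1
```

The hex dump shows the integer 258 (0x0102) stored low byte first, the single-precision bit pattern nearest to 0.1, and the four character codes. Reading the price back shows that only about seven significant decimal digits survive a 32-bit float, which is one reason such a format is a poor fit for money.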
