ebook img

Exploratory Data Analysis PDF

711 Pages·1977·18.22 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Exploratory Data Analysis

62 exhibit 2/3: Easy re-expression exhibit 2 of chapter 3: reference table Break table for two-decimal logs A) MAIN BREAK TABLE Break log Break log Break log Break log Break log 9886 1567 2483 3936 6237 1012 .00 1603 .20 2541 .40 4027 .60 6383 .80 1035 .01 1641 .21 2600 .41 4121 .61 6531 .81 1059 .02 1679 .22 2661 .42 4217 .62 6683 .82 1084 .03 1718 .23 2723 .43 4315 .63 6839 .83 1109 .04 1758 .24 2786 .44 4416 .64 6998 .84 1135 05 1799 25 2851 .45 4519 .65 7161 .85 1161 .06 1841 .26 2917 .46 4624 .66 7328 .86 1189 .07 1884 .27 2985 .47 4732 .67 7499 .87 1216 .08 1928 .28 3055 .48 4842 .68 7674 .88 1245 .09 1972 .29 3126 .49 4955 .69 7852 .89 1274 .10 2018 .30 3199 .50 5070 .70 8035 .90 1303 .11 2065 .31 3273 .51 5188 .71 8222 .91 1334 .12 2113 .32 3350 .52 5309 .72 8414 .92 1365 .13 2163 .33 3428 .53 5433 .73 8610 .93 1396 .14 2213 .34 3508 .54 5559 .74 8810 .94 1429 .15 2265 .35 3589 .55 5689 .75 9016 .95 1462 .16 2317 .36 3673 .56 5821 .76 9226 .96 1496 .17 2371 .37 3758 .57 5957 .77 9441 .97 1531 .18 2427 .38 3846 .58 6095 .78 9661 .98 1567 .19 2483 .39 3936 .59 6237 .79 9886 .99 When in doubt use an even answer; thus, 1462 gives .16 and 1496 gives .18. B) SETTING DECIMAL POINTS 1 1 +0 -1 10 0.1 +1 -2 100 0.01 +2 -3 1000 0.001 +3 -4 10,000 0.0001 +4 -5 100,000 0.00001 +5 -6 1,000,000 0.000001 C) EXAMPLES Number B . A log number log 137.2 2+.14= 2.14 log 0.03694 -2 + .57 = -1.43 log 0.896 -1 + .95 = -0.05 log 174,321 +5 + .24 = 5.24 See text, page 63, for example of use. 3D: quick roots and quick reciprocals/exhibit 7 71 exhibit 7 of chapter 3: reference table Break table for (square) roots A) EXAMPLES Start by dividing number into periods of two digits, decimal point falrrngb!3tween periods. Thus 124.2 is 1 24 2, but 1242 is 12 42. Like wise 0.00654 is 0065 4 or 654. Number Periods from B from C Number 124.2 1 24 2 abo 112 11.2 1242 12 42 abo 35 35 . . 00654 00654 .Ox 80 .080 B) BREAK TABLES to SET DECIMAL POINT --enter between bold figures, leave with light figures 1 a .x .01 abo .Ox .00 01 abc. .00x . 00 00 01 abcd. .000x . 00 00 00 01 C) MAIN BREAK TABLE--in and out as in panel B I Break I I Root I I Break I I Root I I Break I I Root I I Break I I Root I I Breakl I Root I 98 01 24964 5 66 1560 3540 100 160 240 40 60 1 02 01 26244 5 86 1640 3740 102 164 246 41 62 1 06 09 2 75 56 6 20 1722 39 69 104 168 252 42 64 1 10 25 28900 650 1806 42 25 106 172 258 43 66 1 1449 3 02 76 6 81 1892 44 69 108 176 264 44 68 1 18 81 3 1684 7 13 19 80 47 61 110 180 270 45 70 1 23 21 331 24 7 45 20 70 50 41 112 184 276 46 72 1 27 69 3 45 96 7 78 21 62 53 29 114 188 282 47 74 1 32 25 3 61 00 8 12 22 56 56 25 116 192 288 48 76 1 36 89 3 76 36 8 47 23 52 59 79 118 196 294 49 78 1 41 61 39204 882 24 50 62 41 120 200 30 50 80 1 48 84 40804 9 30 25 50 65 61 124 204 31 51 82 1 58 76 42436 9 92 26 52 68 89 128 208 32 52 84 1 69 00 4 41 00 10 56 27 56 72 25 132 212 33 53 86 1 79 56 4 57 96 11 22 28 62 7569 136 216 34 54 88 1 90 44 47524 11 90 29 70 79 21 140 220 35 55 90 201 64 49284 12 60 30 80 82 81 144 224 36 56 92 2 13 16 5 10 76 13 32- 31 92 8649 148 228 37 57 94 2 25 00 5 29 00 1406 3306 90 25 152 232 38 58 96 2 37 16 5 47 56 1482 34 22 9409 156 236 39 59 98 24964 56644 1560 3540 98 01 ~~ff '~' See page 75, f~ample of use. 72 exhibit 8/3: Easy re-expression exhibit 8 of chapter 3: reference table Break table for (negative) reciprocals (using -1000/number) A) BREAK TABLES for SETTING DECIMAL POINT Break Start Start Break 1000 1000 .x a. 10,000 100 .Ox abo 100,000 10 .0 Ox abc . 1,000,000 1. .0 00x abed . 10,000,000 O. 1 .0 000x abcde . 100,000,000 0.01 Examples Number A B -1000/number 124.2 a . -80 -8.0 .0 4739 abcde. -212 -212**. 1242. .x -80 -.80 B) MAIN BREAK TABLE--digits of negative reciprocal I Break I I Value I I Break I I Value I I Break I I Value I I Break I \Value\ \ Break \ \ Value \ 990 1639 2469 4115 617 -100 -60 -40 -240 -160 1010 1681 2532 4202 633 -98 -59 -39 -236 -156 1030 1709 2597 4274 649 -96 -58 -38 -232 -152 1053 1739 2667 4347 666 -94 -57 -37 -228 -148 1075 1770 2740 4425 685 -92 -56 -36 -224 -144 1099 1802 2816 4504 704 -90 -55 -35 -220 -140 1124 1835 2899 4587 725 -88 -54 -34 -216 -136 1149 1869 2985 4672 746 -86 -53 -33 -212 -132 1176 1905 3077 4762 769 -84 -52 -32 -208 -128 1205 1942 3175 4854 793 -82 , -51 -31 -204 -124 1235 1980 3287 4950 820 -80 -50 -30 -200 -120 1266 2020 3367 505 840 -78 -49 -294 -196 -118 1299 2062 3448 515 855 -76 -48 -288 -192 -116 1333 2105 3509 526 870 -74 -47 -282 -188 -114 1370 2151 3584 538 885 -72 -46 -276 -184 -112 1408 2198 3663 549 901 -70 -45 -270 -180 -110 1449 2247 3745 562 917 -68 -44 -264 -176 -108 1493 2299 3831 575 935 -66 -43 -258 -172 -106 1538 2353 3922 588 952 -64 -42 -252 -168 -104 1587 2410 4016 602 9'11 -62 -41 -246 -164 -102 1639 2469 4115 617 990 See text, page 78, for example of use. JOHN W. TUKEY Princeton University and Bell Telephone Laboratories Exploratory Data Analysis ADDISON-WESLEY PUBLISHING COMPANY Rcadinf.{. Massachllsctts • Mcnlo Park. Cal(f'ornia London· Amsterdam· DOll Mills. Ontario· S,vdncy This book is in the ADDISONcWESLEY SERIES IN BEHAVIORAL SCIENCE: QUANTITATIVE METHODS Consulting Editor: FREDERICK MOSTELLER The writing of this work. together with some of the research reported herein. was supported in part by the United States Government. Copyright [) 1977 by Addison-Wesley Publishing Company. Inc. Philippines copyright 1977 by Addison-Wesley Publish ing Company. Inc. All rights reserved. No part of this publication may be repro duced. stored in a retrieval system. or transmitted. in any form or by any means. electronic. mechanical. photocopying. recording. or otherwise. without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada. Library of Congress Cptalog Card No. 76-50g0. C;(SBN 0-201-07616-0 CDEFGHIJ-HA-798 Dedicated to the memory of CHARLIE WINSOR, biometrician, and EDGAR ANDERSON, botanist, data analysts both, from whom the author learned much that could not have been learned elsewhere v Preface This book is based on an important principle: It is important to understand what you CAN DO before you learn to measure how WELL you seem to have DONE it. Learning first what you can do will help you to work more easily and effec tively. This book is about exploratory data analysis, about looking at data to see what it seems to say. It concentrates on simple arithmetic and easy-to-draw pictures. It regards whatever appearances we have recognized as partial de scriptions, and tries to look beneath them for new insights. Its concern is with appearance, not with confirmation. Examples, NOT case histories The book does not exist to make the case that exploratory data analysis is useful. Rather it exists to expose its readers and users to a considerable variety of techniques for looking more effectively at one's data. The examples are not intended to be complete case histories. Rather they show isolated techniques in action on real data. The emphasis is on general techniques, rather than specific problems. A basic problem about any body of data is to make it more easily and effectively handleable by minds--our minds, her mind, his mind. To this general end: <> anything that makes a simpler description possible makes the description more easily handleable. <> anything that looks below the previously described surface makes the description more effective. So we shall always be glad (a) to simplify description and (b) to describe one layer deeper. In particular: <> to be able to say that we looked one layer deeper, and found nothing, is a definite step forward--though not as far as to be able to say that we looked deeper and found thus-and-such. <> to be able to say that "if we change our point of view in the following way ... things are simpler" is always a gain--though not quite as much as to be able to say "if we don't bother to change our point of view (some other) things are equally simple". vi Preface Thus, for example, we regard learning that log pressure is almost a straight line in the negative reciprocal of absolute temperature as a real gain, as compared to saying that pressure increases with temperature at an evergrowing rate. Equally, we regard being able to say that a batch of values is roughly symmetrically distributed on a log scale as much better than to say that the raw values have a very skew distribution. In rating ease of description, after almost any reasonable change of point of view, as very important, we are essentially asserting a belief in quantitative knowledge--a belief that most of the key questions in our world sooner or later demand answers to "by how much?" rather than merely to "in which direc tion?" . Consistent with this view, we believe, is a clear demand that pictures based on exploration of data should force their messages upon us. Pictures that emphasize what we already know--" security blankets" to reassure us--are frequently not worth the space they take. Pictures that have to be gone over with a reading glass to see the main point are wasteful of time and inadequate of effect. The greatest value of a picture is when it forces us to notice what we never expected to see. We shall not try to say why specific techniques are the ones to use. Besides pressures of space and time, there are specific reasons for this. Many of the techniques are less than ten years old in their present form--some will improve noticeably. And where a technique is very good, it is not at all certain that we yet know why it is. We have tried to use consistent techniques wherever this seemed reason able, and have not worried where it didn't. Apparent consistency speeds learning and remembering, but ought not to be allowed to outweigh noticeable differences in performance. In summary, then, we: <> leave most interpretations of results to those who are experts III the subject-matter field involved. <> present techniques, not case histories. <> regard simple descriptions as good in themselves. <> feel free to ask for changes in point of view in order to gain such simplicity. <> demand impact from our pictures. <> regard every description (always incomplete!) as something to be lifted off and looked under (mainly by using residuals). <> regard consistency from one technique to another as desirable, not essen tial. Confirmation The principles and procedures of what we call confirmatory data analysis are both widely used and one of the great intellectual products of our century.

Description:
The approach in this introductory book is that of informal study of the data. Methods range from plotting picture-drawing techniques to rather elaborate numerical summaries. Several of the methods are the original creations of the author, and all can be carried out either with pencil or aided by han
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.