
Applied Statistics: Principles and Examples PDF

193 Pages·1981·5.522 MB·English

Preview Applied Statistics: Principles and Examples

Applied Statistics
PRINCIPLES AND EXAMPLES

D.R. Cox
E.J. Snell
Department of Mathematics, Imperial College, University of London

LONDON NEW YORK
CHAPMAN AND HALL

First published 1981 by Chapman and Hall Ltd, 11 New Fetter Lane, London EC4P 4EE
Published in the USA by Chapman and Hall in association with Methuen, Inc., 733 Third Avenue, New York NY 10017

© 1981 D.R. Cox and E.J. Snell
Softcover reprint of the hardcover 1st edition 1981 at the University Press, Cambridge

ISBN-13: 978-94-009-5840-1
e-ISBN-13: 978-94-009-5838-8
DOI: 10.1007/978-94-009-5838-8

This title is available in both hardbound and paperback editions. The paperback edition is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

All rights reserved. No part of this book may be reprinted, or reproduced or utilized in any form or by any electronic, mechanical or other means, now known or hereafter invented, including photocopying and recording, or in any information storage and retrieval system, without permission in writing from the Publisher.

British Library Cataloguing in Publication Data
Cox, D.R.
Applied statistics.
1. Mathematical statistics
I. Title II. Snell, E.J.
519.5 QA276

Contents

Preface

PART I PRINCIPLES

1. Nature and objectives of statistical analysis
   1.1 Introduction
   1.2 Data quality
   1.3 Data structure and quantity
   1.4 Phases of analysis
   1.5 Styles of analysis
   1.6 Computational and numerical analytical aspects
   1.7 Response and explanatory variables
   1.8 Types of investigation
   1.9 Purposes of investigation
2. Some general concepts
   2.1 Types of observation
   2.2 Descriptive and probabilistic methods
   2.3 Some aspects of probability models
3. Some strategical aspects
   3.1 Introduction
   3.2 Incorporation of related data and external information
   3.3 Role of special stochastic models
   3.4 Achievement of economical and consistent description
   3.5 Attitudes to assumptions
   3.6 Depth and complexity of analysis appropriate
   3.7 Analysis in the light of the data
4. Some types of statistical procedure
   4.1 Introduction
   4.2 Formulation of models: generalities
   4.3 Formulation of models: systematic component
   4.4 Formulation of models: random component
   4.5 Calculation of summarizing quantities
   4.6 Graphical analysis
   4.7 Significance tests
   4.8 Interval estimation
   4.9 Decision procedures
   4.10 Examination of the adequacy of models
   4.11 Parameters and parameterization
   4.12 Transformations
   4.13 Interaction

PART II EXAMPLES

A Admissions to intensive care unit
B Intervals between adjacent births
C Statistical aspects of literary style
D Temperature distribution in a chemical reactor
E A 'before and after' study of blood pressure
F Comparison of industrial processes in the presence of trend
G Cost of construction of nuclear power plants
H Effect of process and purity index on fault occurrence
I Growth of bones from chick embryos
J Factorial experiment on cycles to failure of worsted yarn
K Factorial experiment on diets for chickens
L Binary preference data for detergent use
M Fertilizer experiment on growth of cauliflowers
N Subjective preference data on soap pads
O Atomic weight of iodine
P Multifactor experiment on a nutritive medium
Q Strength of cotton yarn
R Biochemical experiment on the blood of mice
S Voltage regulator performance
T Intervals between the failure of air-conditioning equipment in aircraft
U Survival times of leukemia patients
V A retrospective study with binary data
W Housing and associated factors
X Educational plans of Wisconsin schoolboys

Summary of examples
Further sets of data
References
Author index
Subject index

Preface

There are many books which set out the more commonly used statistical methods in a form suitable for applications. There are also widely available computer packages for implementing these techniques in a relatively painless way. We have in the present book concentrated not so much on the techniques themselves but rather on the general issues involved in their fruitful application.

The book is in two parts, the first dealing with general ideas and principles and the second with a range of examples, all, however, involving fairly small sets of data and fairly standard techniques. Readers who have experience of the application of statistical methods may want to concentrate on the first part, using the second part, and better still their own experience, to illuminate and criticize the general ideas. If the book is used by students with little or no experience of applications, a selection of examples from the second part of the book should be studied first, any general principles being introduced at a later stage when at least some background for their understanding is available.

After some hesitation we have decided to say virtually nothing about detailed computation. This is partly because the procedures readily available will be different in different institutions. Those having access to GLIM will find that most of the examples can be very conveniently handled; however the parameterization in GLIM, while appropriate for the great generality achieved, is not always suitable for interpretation and presentation of conclusions. Most, although not all, of the examples are in fact small enough to be analysed on a good pocket calculator. Students will find it instructive themselves to carry out the detailed analysis.
We do not put forward our analyses of the examples as definitive. If the examples are used in teaching statistical methods, students should be encouraged to try out their own ideas and to compare thoughtfully the conclusions from alternative analyses. Further sets of data are included for use by students.

Many of the examples depend in some way on application of the method of least squares or analysis of variance or maximum likelihood. Some familiarity with these is assumed, references being given for specific points.

The examples all illustrate real applications of statistical methods to some branch of science or technology, although in a few cases fictitious data have been supplied. The main general limitation on the examples is, as noted above, that inevitably they all involve quite small amounts of data, and important aspects of statistical analysis specific to large amounts of data are therefore not well covered. There is the further point that in practice over-elaboration of analysis is to be avoided. With very small sets of data, simple graphs and summary statistics may tell all, yet we have regarded it as legitimate for illustration in some cases to apply rather more elaborate analyses than in practice would be justified.

We are grateful to Dr C. Chatfield, University of Bath, for constructive comments on a preliminary version of the book.

D.R. Cox
E.J. Snell
London, September 1980

Part I Principles

Chapter 1 Nature and objectives of statistical analysis

1.1 Introduction

Statistical analysis deals with those aspects of the analysis of data that are not highly specific to particular fields of study. That is, the object is to provide concepts and methods that will, with suitable modification, be applicable in many different fields of application; indeed one of the attractions of the subject is precisely this breadth of potential application.

This book is divided into two parts.
In the first we try to outline, without going into much specific detail, some of the general ideas involved in applying statistical methods. In the second part, we discuss some special problems, aiming to illustrate both the general principles discussed earlier and also particular techniques. References to these problems are given in Part I where appropriate.

While all the examples are real, discussion of them is inhibited by two fairly obvious constraints. Firstly, it is difficult in a book to convey the interplay between subject-matter considerations and statistical analysis that is essential for fruitful work. Secondly, for obvious reasons, the sets of data analysed are quite small. In addition to the extra computation involved in the analysis of large sets of data, there are further difficulties connected, for example, with its being hard in large sets of data to detect initially unanticipated complications. To the extent that many modern applications involve large sets of data, this book thus paints an oversimplified picture of applied statistical work.

We deal very largely with methods for the careful analysis and interpretation of bodies of scientific and technological data. Many of the ideas are in fact very relevant also to procedures for decision making, as in industrial acceptance sampling and automatic process control, but there are special issues in such applications, arising partly from the relatively mechanical nature of the final procedures. In many applications, however, careful consideration of objectives will indicate a specific feature of central interest. Special considerations enter also into the standardized analysis of routine test data, for example in a medical or industrial context.
The need here may be for clearly specified procedures that can be applied, again in a quite mechanical fashion, giving sensible answers in a wide range of circumstances, and allowing possibly for individual 'management by exception' in extremely peculiar situations; quite commonly, simple statistical procedures are built in to measuring equipment. Ideally, largely automatic rejection of 'outliers' and routine quality control of measurement techniques are incorporated. In the present book, however, we are primarily concerned with the individual analysis of unique sets of data.

1.2 Data quality

We shall not in this book deal other than incidentally with the planning of data collection, e.g. the design of experiments, although it is clear in a general way that careful attention to design can simplify analysis and strengthen interpretation. We begin the discussion here, however, by supposing that data become available for analysis. The first concerns are then with the quality of the data and with what can be broadly called its structure. In this section we discuss briefly data quality.

Checks of data quality typically include:

(i) visual or automatic inspection of the data for values that are logically inconsistent or in conflict with prior information about the ranges likely to arise for the various variables. For instances of possibly extreme observations, see Examples E and S. Inspection of the minimum and maximum of each variable is a minimal check;

(ii) examination of frequency distributions of the main variables to look for small groups of discrepant observations;

(iii) examination of scatter plots of pairs of variables likely to be highly related, this detecting discrepant observations more sensitively than (ii);

(iv) a check of the methods of data collection to discover the sources, if any, of biases in measurement (e.g. differences between observers) which it may be necessary to allow for in analysis, and to assess the approximate measurement and recording errors for the main variables;

(v) a search for missing observations, including observations that have been omitted because of their highly suspicious character. Often missing observations are denoted in some conventional way, such as 0 or 99, and it will be important not to enter these as real values in any analysis.

Concern that data quality should be high without extensive effort being spent on achieving unrealistically high precision is of great importance. In particular, recording of data to a large number of digits can be wasteful; on the other hand, excessive rounding sacrifices information.

The extent to which poor data quality can be set right by more elaborate analysis is very limited, particularly when appreciable systematic errors are likely to be present and cannot be investigated and removed. By and large such poor-quality data will not merit very detailed analysis.
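
As an illustration only, checks (i) and (v) might be sketched in Python roughly as follows. This is not taken from the book, which deliberately says almost nothing about computation. The function names, the adult-height readings, the plausible range of 120 to 220 cm, and the use of 0 and 99 as missing-value codes are all assumptions made for this example.

```python
# Hypothetical codes used to mark a missing observation, as in check (v).
MISSING_CODES = {0, 99}


def recode_missing(values, missing_codes=MISSING_CODES):
    """Replace missing-value codes with None so they are never analysed as real values."""
    return [None if v in missing_codes else v for v in values]


def range_check(values, low, high):
    """Check (i): report the minimum, the maximum, and values outside [low, high]."""
    observed = [v for v in values if v is not None]
    outside = [v for v in observed if not (low <= v <= high)]
    return {"min": min(observed), "max": max(observed), "outside": outside}


# Invented adult heights in cm; 0 and 99 stand for "missing",
# and 1710 looks like a value recorded in mm by mistake.
heights = [172, 165, 99, 180, 0, 1710, 168]

cleaned = recode_missing(heights)
report = range_check(cleaned, low=120, high=220)

print("missing observations:", cleaned.count(None))
print("min / max:", report["min"], report["max"])
print("values to query:", report["outside"])
```

A fixed code is only safe when it cannot occur as a genuine measurement, which is why the codes here are checked against a variable for which 0 and 99 are implausible. Values flagged as outside the range are reported for inspection rather than deleted, since deciding whether they are errors depends on subject-matter knowledge.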
