Statistics in Research and Development

OTHER STATISTICS TEXTS FROM CHAPMAN & HALL

Practical Statistics for Medical Research
    Douglas Altman
The Analysis of Time Series - An Introduction
    C. Chatfield
Problem Solving: A Statistician's Guide
    C. Chatfield
Statistics for Technology
    C. Chatfield
Introduction to Multivariate Analysis
    C. Chatfield and A. J. Collins
Applied Statistics: Principles and Examples
    D. R. Cox and E. J. Snell
An Introduction to Statistical Modelling
    A. J. Dobson
Introduction to Optimization Methods and their Applications in Statistics
    B. S. Everitt
Multivariate Statistics - A Practical Approach
    B. Flury and H. Riedwyl
Readings in Decision Analysis
    S. French
Multivariate Analysis of Variance and Repeated Measures
    D. J. Hand and C. C. Taylor
Multivariate Statistical Methods - a primer
    Bryan F. Manley
Statistical Methods in Agriculture and Experimental Biology
    R. Mead and R. N. Curnow
Elements of Simulation
    B. J. T. Morgan
Probability Methods and Measurement
    Anthony O'Hagan
Essential Statistics
    D. G. Rees
Foundations of Statistics
    D. G. Rees
Decision Analysis: A Bayesian Approach
    J. Q. Smith
Applied Statistics: A Handbook of BMDP Analyses
    E. J. Snell
Applied Statistics: A Handbook of GENSTAT Analyses
    E. J. Snell and H. R. Simpson
Elementary Applications of Probability Theory
    H. C. Tuckwell
Intermediate Statistical Methods
    G. B. Wetherill
Statistical Process Control: Theory and practice
    G. B. Wetherill and D. W. Brown

Further information on the complete range of Chapman & Hall statistics books is available from the publishers.

Statistics in Research and Development
Second edition

Roland Caulcutt
BP Chemicals Lecturer
University of Bradford

Springer-Science+Business Media, B.V.

First edition 1983
Second edition 1991

© 1983, 1991 Roland Caulcutt
Originally published by Chapman and Hall in 1991.
Softcover reprint of the hardcover 2nd edition 1991

Typeset in 10/12 Times by Excel Typesetters Co. Hong Kong.
ISBN 978-0-412-35890-6
ISBN 978-1-4899-2943-3 (eBook)
DOI 10.1007/978-1-4899-2943-3

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the UK Copyright Designs and Patents Act, 1988, this publication may not be reproduced, stored, or transmitted, in any form or by any means, without the prior permission in writing of the publishers, or in the case of reprographic reproduction only in accordance with the terms of the licences issued by the Copyright Licensing Agency in the UK, or in accordance with the terms of licences issued by the appropriate Reproduction Rights Organization outside the UK. Enquiries concerning reproduction outside the terms stated here should be sent to the publishers at the UK address printed on this page.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

British Library Cataloguing in Publication Data
Caulcutt, Roland
Statistics in research and development. - 2nd ed.
1. Industrial research. Applications of statistical methods
I. Title
607.2

Library of Congress Cataloging-in-Publication Data Available

Contents

Preface

1 What is statistics?

2 Describing the sample
    2.1 Introduction
    2.2 Variability in plant performance
    2.3 Frequency distributions
    2.4 Measures of location and spread
    2.5 Cumulative frequency distributions
    2.6 The seven basic tools
    2.7 Summary
    Problems

3 Describing the population
    3.1 Introduction
    3.2 Probability distributions
    3.3 The Poisson distribution
    3.4 The normal distribution
    3.5 Normal probability graph paper
    3.6 Process capability
    3.7 Summary
    Problems

4 Testing and estimation: one sample
    4.1 Introduction
    4.2 Has the yield increased?
    4.3 The significance testing procedure
    4.4 What is the new mean yield?
    4.5 Is such a large sample really necessary?
    4.6 Statistical significance and practical importance
    4.7 Summary
    Problems

5 Testing and estimation: two samples
    5.1 Introduction
    5.2 Comparing the precision of two test methods
    5.3 Comparing the bias of two test methods
    5.4 Does the product deteriorate in storage?
    5.5 A much better experiment
    5.6 Summary
    Problems

6 Testing and estimation: qualitative data
    6.1 Introduction
    6.2 Does the additive affect the taste?
    6.3 Where do statistical tables come from?
    6.4 Estimating percentage rejects
    6.5 Has the additive been effective?
    6.6 Significance tests with percentages
    6.7 The binomial distribution
    6.8 Summary
    Problems

7 Testing and estimation: assumptions
    7.1 Introduction
    7.2 Estimating variability
    7.3 Estimating variability for a t-test
    7.4 Assumptions underlying significance tests - randomness
    7.5 Assumptions underlying significance tests - the normal distribution
    7.6 Assumptions underlying significance tests - sample size
    7.7 Outliers
    7.8 Summary
    Problems

8 Statistical process control
    8.1 Introduction
    8.2 A control chart for the mean yield
    8.3 How quickly will change be detected?
    8.4 Increasing the sensitivity of the control chart
    8.5 Improving the control chart by averaging
    8.6 Moving mean charts
    8.7 Assumptions underlying control charts
    8.8 Summary
    Problems

9 Detecting process changes
    9.1 Introduction
    9.2 Examining the previous 50 batches
    9.3 Significance testing
    9.4 Interpretation
    9.5 Has the process variability increased?
    9.6 The use of cusums in statistical process control
    9.7 Scientific method
    9.8 Some definitions
    9.9 Summary
    Problem

10 Investigating the process - an experiment
    10.1 Introduction
    10.2 The plant manager's experiment
    10.3 Selecting the first independent variable
    10.4 Is the correlation due to chance?
    10.5 Fitting the best straight line
    10.6 Goodness of fit
    10.7 The 'true' regression equation
    10.8 Accuracy of prediction
    10.9 An alternative equation
    10.10 Summary
    Problems

11 Why was the experiment not successful?
    11.1 Introduction
    11.2 An equation with two independent variables
    11.3 Multiple regression analysis on a computer
    11.4 An alternative multiple regression equation
    11.5 Graphical representation of a multiple regression equation
    11.6 Non-linear relationships
    11.7 Interactions between independent variables
    11.8 Summary
    Problems

12 Some simple but effective experiments
    12.1 Introduction
    12.2 The classical experiment (one variable at a time)
    12.3 Factorial experiments
    12.4 Estimation of main effects and interaction from the results of a 2² experiment
    12.5 Distinguishing between real and chance effects
    12.6 The use of multiple regression with a 2² experiment
    12.7 A 2³ factorial experiment
    12.8 The use of multiple regression with a 2³ experiment
    12.9 Two replicates of a 2³ factorial experiment
    12.10 More regression analysis
    12.11 Factorial experiments made simple
    12.12 Summary
    Problems

13 Adapting the simple experiments
    13.1 Introduction
    13.2 The design matrix
    13.3 Half replicates of a 2ⁿ factorial design
    13.4 Quarter replicates of a 2ⁿ factorial design
    13.5 A useful method for selecting a fraction of a 2ⁿ factorial experiment
    13.6 Finding optimum conditions
    13.7 Central composite designs
    13.8 Screening experiments
    13.9 Blocking and confounding
    13.10 Summary
    Problems

14 Improving a bad experiment
    14.1 Introduction
    14.2 An alternative first experiment
    14.3 How good is the alternative experiment?
    14.4 Extending the original experiment
    14.5 Final analysis
    14.6 Inspection of residuals
    14.7 Summary
    Problems

15 Analysis of variance
    15.1 Introduction
    15.2 Testing error and sampling variation
    15.3 Testing error and process capability
    15.4 Analysis of variance and simple regression
    15.5 Analysis of variance with multiple regression
    15.6 Analysis of variance and factorial experiments
    15.7 Summary
    Problems

16 An introduction to Taguchi techniques
    16.1 Introduction
    16.2 Orthogonal arrays
    16.3 Linear graphs
    16.4 Design, quality and noise
    16.5 Experiments for robust design
    16.6 Quality, variability and cost
    16.7 Summary
    Problems

Appendix A The sigma (Σ) notation
Appendix B Notation and formulae
Appendix C Sampling distributions
Appendix D Copy of computer print-out from a multiple regression program
Appendix E Partial correlation
Appendix F Significance tests on effect estimates from a p2ⁿ factorial experiment

Solutions to problems
References and further reading
Statistical tables
Index