Statistical Inference

S. D. Silvey
University of Glasgow

Chapman and Hall
London

First published 1970 by Penguin Books Ltd
Reprinted with corrections 1975 by Chapman and Hall Ltd
11 New Fetter Lane, London EC4P 4EE
Printed in Great Britain by Lowe & Brydone (Printers) Ltd, Thetford, Norfolk

ISBN 0 412 13820 4

© 1975 S. D. Silvey

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form of binding or cover other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

Distributed in the U.S.A. by Halsted Press, A Division of John Wiley & Sons Inc., New York

Library of Congress Catalog Card No 75-2254

Contents

Preface

1 Introduction
  1.1 Preliminaries
  1.2 The general inference problem
  1.3 Estimation
  1.4 Hypothesis testing
  1.5 Decision theory
  Examples 1

2 Minimum-Variance Unbiased Estimation
  2.1 The point estimation problem
  2.2 ‘Good’ estimates
  2.3 Sufficiency
  2.4 The Rao-Blackwell theorem
  2.5 Completeness
  2.6 Completeness and M.V.U.E.s
  2.7 Discussion
  2.8 Efficiency of unbiased estimators
  2.9 Fisher’s information
  2.10 The Cramér-Rao lower bound
  2.11 Efficiency
  2.12 Generalization of the Cramér-Rao inequality
  2.13 Concluding remarks
  Examples 2

3 The Method of Least Squares
  3.1 Examples
  3.2 Normal equations
  3.3 Geometric interpretation
  3.4 Identifiability
  3.5 The Gauss-Markov theorem
  3.6 Weighted least squares
  3.7 Estimation of σ²
  3.8 Variance of least-squares estimators
  3.9 Normal theory
  3.10 Least squares with side conditions
  3.11 Discussion
  Examples 3

4 The Method of Maximum Likelihood
  4.1 The likelihood function
  4.2 Calculation of maximum-likelihood estimates
  4.3 Optimal properties of maximum-likelihood estimators
  4.4 Large-sample properties
  4.5 Consistency
  4.6 Large-sample efficiency
  4.7 Restricted maximum-likelihood estimates
  Examples 4

5 Confidence Sets
  5.1 Confidence interval
  5.2 General definition of a confidence set
  5.3 Construction of confidence sets
  5.4 Optimal confidence sets
  Examples 5

6 Hypothesis Testing
  6.1 The Neyman-Pearson theory
  6.2 Simple hypotheses
  6.3 Composite hypotheses
  6.4 Unbiased and invariant tests
  Examples 6

7 The Likelihood-Ratio Test and Alternative ‘Large-Sample’ Equivalents of it
  7.1 The likelihood-ratio test
  7.2 The large-sample distribution of λ
  7.3 The W-test
  7.4 The χ² test
  Examples 7

8 Sequential Tests
  8.1 Definition of a sequential probability ratio test
  8.2 Error probabilities and the constants A and B
  8.3 Graphical procedure for an s.p.r. test
  8.4 Composite hypotheses
  8.5 Monotone likelihood ratio and the s.p.r. test
  Examples 8

9 Non-Parametric Methods
  9.1 The Kolmogorov-Smirnov test
  9.2 The χ² goodness-of-fit test
  9.3 The Wilcoxon test
  9.4 Permutation tests
  9.5 The use of a sufficient statistic for test construction
  9.6 Randomization
  Examples 9

10 The Bayesian Approach
  10.1 Prior distributions
  10.2 Posterior distributions
  10.3 Bayesian confidence intervals
  10.4 Bayesian inference regarding hypotheses
  10.5 Choosing a prior distribution
  10.6 Improper prior distributions
  Examples 10

11 An Introduction to Decision Theory
  11.1 The two-decision problem
  11.2 Decision functions
  11.3 The risk function
  11.4 Minimax decision functions
  11.5 Admissible decision functions
  11.6 Bayes’s solutions
  11.7 A Bayes’s sequential decision problem
  Examples 11

Appendix A Some Matrix Results
Appendix B The Linear Hypothesis
References
Index

Preface

Statistics is a subject with a vast field of application, involving problems which vary widely in their character and complexity. However, in tackling these, we use a relatively small core of central ideas and methods. In this book I have attempted to concentrate attention on these ideas, to place them in a general setting and to illustrate them by relatively simple examples, avoiding wherever possible the extraneous difficulties of complicated mathematical manipulation.

In order to compress the central body of ideas into a small volume, it is necessary to assume a fair degree of mathematical sophistication on the part of the reader, and the book is intended for students of mathematics who are already accustomed to thinking in rather general terms about spaces, functions and so on. Primarily I had in mind final-year and postgraduate mathematics students.
Certain specific mathematical knowledge is assumed in addition to this general sophistication, in particular: a thorough grounding in probability theory and in the methods of probability calculus; a nodding acquaintance with measure theory; considerable knowledge of linear algebra, in terms of both matrices and linear transformations in finite-dimensional vector spaces; and a good working knowledge of calculus of several variables. Probability theory is absolutely essential throughout the book. However only parts of it require the other specific bits of knowledge referred to, and most of the ideas can be grasped without them.

There is a continuing controversy among statisticians about the foundations of statistical inference, between protagonists of the so-called frequentist and Bayesian schools of thought. While a single all-embracing theory has obvious attractions (and Bayesian theory is closer to this than frequentist theory), it remains true that ideas from both sides are useful in thinking about practical problems. So in this book I have adopted the attitude that I should include those ideas and methods which I have actually used in practice. At the same time, I have tried to present them in a way which encourages the reader to think critically about them and to form his own view of their relative strengths and weaknesses.

It is not my view that all that is required to make a statistician is an understanding of the ideas presented in this book. A necessary preliminary to their use in practice is the setting up of an appropriate probabilistic model for the situation under investigation, and this calls for considerable experience and judgement. I have made no attempt to discuss this aspect of the subject and the book contains no real data whatsoever.
Moreover, the examples, some of which are not easy, are intended to provide the reader with an opportunity for testing his understanding of the ideas, and not to develop experience in establishing mathematical models. I consider it easier to grasp the basic ideas when one is not harassed by the necessity to exercise judgement regarding the model.

It is impossible for me to acknowledge individually my indebtedness to all those who have influenced my thinking about statistics, including present and past colleagues, but I must express my gratitude to three in particular: to Dr R. A. Robb who first introduced me to the subject and who supported me strongly yet unobtrusively during my early fumbling steps in applied statistics; to Professor D. V. Lindley whose lectures in Cambridge were most inspiring and whose strong advocacy of the Bayesian approach has forced many besides myself to think seriously about the foundations of the subject; and to Professor E. L. Lehmann whose book on testing statistical hypotheses clarified for me so many of the ideas of the frequentist school. I also wish to thank an anonymous referee for several suggestions which resulted in an improvement to an original version of the book.

Most of the examples for the reader are drawn from examination papers, and I am obliged in particular to the University of Cambridge for permission to reproduce a number of questions from papers for the Diploma in Mathematical Statistics. These are indicated by (Camb. Dip.). Since the original sources of questions are difficult to trace, I apologize to any colleague who recognizes an unacknowledged question of his own.

Finally I am extremely grateful to Miss Mary Nisbet who typed the manuscript with admirable accuracy and who sustained extraordinary good humour in the face of numerous alterations.