Theory of Statistical Inference

CHAPMAN & HALL/CRC Texts in Statistical Science Series

Joseph K. Blitzstein, Harvard University, USA
Julian J. Faraway, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada

Recently Published Titles

Modern Data Science with R, Second Edition
Benjamin S. Baumer, Daniel T. Kaplan, and Nicholas J. Horton

Probability and Statistical Inference: From Basic Principles to Advanced Models
Miltiadis Mavrakakis and Jeremy Penzer

Bayesian Networks: With Examples in R, Second Edition
Marco Scutari and Jean-Baptiste Denis

Time Series: Modeling, Computation, and Inference, Second Edition
Raquel Prado, Marco A. R. Ferreira and Mike West

A First Course in Linear Model Theory, Second Edition
Nalini Ravishanker, Zhiyi Chi, Dipak K. Dey

Foundations of Statistics for Data Scientists: With R and Python
Alan Agresti and Maria Kateri

Fundamentals of Causal Inference: With R
Babette A. Brumback

Sampling: Design and Analysis, Third Edition
Sharon L. Lohr

Theory of Statistical Inference
Anthony Almudevar

For more information about this series, please visit: https://www.crcpress.com/Chapman--Hall/CRC-Texts-in-Statistical-Science/book-series/CHTEXSTASCI

Theory of Statistical Inference

Anthony Almudevar

First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2022 Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained.
If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact [email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Names: Almudevar, Anthony, author.
Title: Theory of statistical inference / Anthony Almudevar.
Description: Boca Raton : CRC Press, 2021. | Series: Chapman & Hall/CRC texts in statistical science | Includes bibliographical references and index.
Identifiers: LCCN 2021032159 (print) | LCCN 2021032160 (ebook) | ISBN 9780367488758 (hardback) | ISBN 9780367502805 (paperback) | ISBN 9781003049340 (ebook)
Subjects: LCSH: Mathematical statistics.
Classification: LCC QA276 .A46 2021 (print) | LCC QA276 (ebook) | DDC 519.5/4--dc23
LC record available at https://lccn.loc.gov/2021032159
LC ebook record available at https://lccn.loc.gov/2021032160

ISBN: 9780367488758 (hbk)
ISBN: 9780367502805 (pbk)
ISBN: 9781003049340 (ebk)

DOI: 10.1201/9781003049340

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.

Contents

Preface

1 Distribution Theory
  1.1 Introduction
  1.2 Probability Measures
  1.3 Some Important Theorems of Probability
  1.4 Commonly Used Distributions
  1.5 Stochastic Order Relations
  1.6 Quantiles
  1.7 Inversion of the CDF
  1.8 Transformations of Random Variables
  1.9 Moment Generating Functions
  1.10 Moments and Cumulants
  1.11 Problems

2 Multivariate Distributions
  2.1 Introduction
  2.2 Parametric Classes of Multivariate Distributions
  2.3 Multivariate Transformations
  2.4 Order Statistics
  2.5 Quadratic Forms, Idempotent Matrices and Cochran’s Theorem
  2.6 MGF and CGF of Independent Sums
  2.7 Multivariate Extensions of the MGF
  2.8 Problems

3 Statistical Models
  3.1 Introduction
  3.2 Parametric Families for Statistical Inference
  3.3 Location-Scale Parameter Models
  3.4 Regular Families
  3.5 Fisher Information
  3.6 Exponential Families
  3.7 Sufficiency
  3.8 Complete and Ancillary Statistics
  3.9 Conditional Models and Contingency Tables
  3.10 Bayesian Models
  3.11 Indifference, Invariance and Bayesian Prior Distributions
  3.12 Nuisance Parameters
  3.13 Principles of Inference
  3.14 Problems

4 Methods of Estimation
  4.1 Introduction
  4.2 Unbiased Estimators
  4.3 Method of Moments Estimators
  4.4 Sample Quantiles and Percentiles
  4.5 Maximum Likelihood Estimation
  4.6 Confidence Sets
  4.7 Equivariant Versus Shrinkage Estimation
  4.8 Bayesian Estimation
  4.9 Problems

5 Hypothesis Testing
  5.1 Introduction
  5.2 Basic Definitions
  5.3 Principles of Hypothesis Tests
  5.4 The Observed Level of Significance (P-Values)
  5.5 One- and Two-Sided Tests
  5.6 Unbiasedness and Stochastic Ordering
  5.7 Hypothesis Tests and Pivots
  5.8 Likelihood Ratio Tests
  5.9 Similar Tests
  5.10 Problems

6 Linear Models
  6.1 Introduction
  6.2 Linear Models – Definition
  6.3 Best Linear Unbiased Estimators (BLUE)
  6.4 Least Squares Estimators, BLUEs and Projection Matrices
  6.5 Ordinary and Generalized Least Squares Estimators
  6.6 ANOVA Decomposition and the F Test for Linear Models
  6.7 One- and Two-Way ANOVA
  6.8 Multiple Linear Regression
  6.9 Constrained Least Squares Estimation
  6.10 Simultaneous Confidence Intervals
  6.11 Problems

7 Decision Theory
  7.1 Introduction
  7.2 Ranking Estimators by MSE
  7.3 Prediction
  7.4 The Structure of Decision Theoretic Inference
  7.5 Loss and Risk
  7.6 Uniformly Minimum Risk Estimators (The Location-Scale Model)
  7.7 Some Principles of Admissibility
  7.8 Admissibility for Exponential Families (Karlin’s Theorem)
  7.9 Bayes Decision Rules
  7.10 Admissibility and Optimality
  7.11 Problems

8 Uniformly Minimum Variance Unbiased (UMVU) Estimation
  8.1 Introduction
  8.2 Definition of UMVUE’s
  8.3 UMVUE’s and Sufficiency
  8.4 Methods of Deriving UMVUEs
  8.5 Nonparametric Estimation and U-statistics
  8.6 Rank Based Measures of Correlation
  8.7 Problems

9 Group Structure and Invariant Inference
  9.1 Introduction
  9.2 MRE Estimators for Location Parameters
  9.3 MRE Estimators for Scale Parameters
  9.4 Invariant Density Families
  9.5 Some Applications of Invariance
  9.6 Invariant Hypothesis Tests
  9.7 Problems

10 The Neyman-Pearson Lemma
  10.1 Introduction
  10.2 Hypothesis Tests as Decision Rules
  10.3 Neyman-Pearson (NP) Tests
  10.4 Monotone Likelihood Ratios (MLR)
  10.5 The Generalized Neyman-Pearson Lemma
  10.6 Invariant Hypothesis Tests
  10.7 Permutation Invariant Tests
  10.8 Problems

11 Limit Theorems
  11.1 Introduction
  11.2 Limits of Sequences of Random Variables
  11.3 Limits of Expected Values
  11.4 Uniform Integrability
  11.5 The Law of Large Numbers
  11.6 Weak Convergence
  11.7 Multivariate Extensions of Limit Theorems
  11.8 The Continuous Mapping Theorem
  11.9 MGFs, CGFs and Weak Convergence
  11.10 The Central Limit Theorem for Triangular Arrays
  11.11 Weak Convergence of Random Vectors
  11.12 Problems

12 Large Sample Estimation – Basic Principles
  12.1 Introduction
  12.2 The δ-Method
  12.3 Variance Stabilizing Transformations
  12.4 The δ-Method and Higher-Order Approximations
  12.5 The Multivariate δ-Method
  12.6 Approximating the Distributions of Sample Quantiles: The Bahadur Representation Theorem
  12.7 A Central Limit Theorem for U-statistics
  12.8 The Information Inequality
  12.9 Asymptotic Efficiency
  12.10 Problems

13 Asymptotic Theory for Estimating Equations
  13.1 Introduction
  13.2 Consistency and Asymptotic Normality of M-Estimators
  13.3 Asymptotic Theory of MLEs
  13.4 A General Form for Regression Models
  13.5 Nonlinear Regression
  13.6 Generalized Linear Models (GLM)
  13.7 Generalized Estimating Equations (GEE)
  13.8 Existence and Consistency of M-Estimators
  13.9 Asymptotic Distribution of θ̂_n
  13.10 Regularity Conditions for Estimating Equations
  13.11 Problems

14 Large Sample Hypothesis Testing
  14.1 Introduction
  14.2 Model Assumptions
  14.3 Large Sample Tests for Simple Null Hypotheses
  14.4 Nuisance Parameters and Composite Null Hypotheses
  14.5 Pearson’s χ² Test for Independence in Contingency Tables
  14.6 A Comparison of the LR, Wald and Score Tests
  14.7 Confidence Sets
  14.8 Estimating Power for Approximate χ² Tests
  14.9 Problems

A Parametric Classes of Densities

B Topics in Linear Algebra
  B.1 Numbers
  B.2 Equivalence Relations
  B.3 Vector Spaces
  B.4 Matrices
  B.5 Dimension of a Subset of R^d

C Topics in Real Analysis and Measure Theory
  C.1 Metric Spaces
  C.2 Measure Theory
  C.3 Integration
  C.4 Exchange of Integration and Differentiation
  C.5 The Gamma and Beta Functions
  C.6 Stirling’s Approximation of the Factorial
  C.7 The Gradient Vector and the Hessian Matrix
  C.8 Normed Vector Spaces
  C.9 Taylor’s Remainder Theorem

D Group Theory
  D.1 Definition of a Group
  D.2 Subgroups
  D.3 Group Homomorphisms
  D.4 Transformation Groups
  D.5 Orbits and Maximal Invariants

Bibliography

Index