Statistics for Industry and Technology
Series Editor
N. Balakrishnan
McMaster University
Department of Mathematics and Statistics
1280 Main Street West
Hamilton, Ontario L8S 4K1
Canada
Editorial Advisory Board
Max Engelhardt
EG&G Idaho, Inc.
Idaho Falls, ID 83415
Harry F. Martz
Group A-1 MS F600
Los Alamos National Laboratory
Los Alamos, NM 87545
Gary C. McDonald
NAO Research & Development Center
30500 Mound Road
Box 9055
Warren, MI 48090-9055
Peter R. Nelson
Department of Mathematical Sciences
Clemson University
Martin Hall
Box 341907
Clemson, SC 29634-1907
Kazuyuki Suzuki
Communication & Systems Engineering Department
University of Electro-Communications
1-5-1 Chofugaoka
Chofu-shi
Tokyo 182
Japan
Goodness-of-Fit Tests and
Model Validity
C. Huber-Carol
N. Balakrishnan
M.S. Nikulin
M. Mesbah
Editors
Springer Science+Business Media, LLC
C. Huber-Carol
Laboratoire de Statistique Medicale
Universite Rene Descartes - Paris 5
75006 Paris
France
N. Balakrishnan
Department of Mathematics and Statistics
McMaster University
Hamilton, Ontario L8S 4K1
Canada
M. S. Nikulin
Laboratoire Statistique Mathematique
Universite Bordeaux 2
33076 Bordeaux Cedex
France
and
Laboratory of Statistical Methods
V. Steklov Mathematical Institute
191011 St. Petersburg
Russia
M. Mesbah
Laboratoire de Statistique Appliquee
Universite de Bretagne Sud
56 000 Vannes
France
Library of Congress Cataloging-in-Publication Data
A CIP catalogue record for this book is available from the Library of Congress,
Washington D.C., USA.
AMS Subject Classifications: 62-06, 62F03
Printed on acid-free paper.
©2002 Springer Science+Business Media New York
Originally published by Birkhäuser Boston in 2002
Softcover reprint of the hardcover 1st edition 2002
All rights reserved. This work may not be translated or copied in whole or in part without the written permission
of the publisher Springer Science+Business Media, LLC,
except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form
of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not
especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise
Marks Act, may accordingly be used freely by anyone.
ISBN 978-1-4612-6613-6 ISBN 978-1-4612-0103-8 (eBook)
DOI 10.1007/978-1-4612-0103-8
Typeset by the editors in LaTeX.
9 8 7 6 5 4 3 2 1
Contents
Preface xvii
Contributors xix
List of Tables xxvii
List of Figures xxxiii
PART I: HISTORY AND FUNDAMENTALS
1 Karl Pearson and the Chi-Squared Test 3
D. R. Cox
1.1 Karl Pearson 1857-1937: Background to the
Chi-Squared Paper 3
1.2 K. P.: After Chi-Squared 5
1.3 The 1900 Paper 5
1.4 Importance of the Chi-Squared Test 6
References 8
2 Karl Pearson Chi-Square Test-The Dawn of
Statistical Inference 9
C. R. Rao
2.1 Introduction 9
2.2 Large Sample Criteria: The Holy Trinity 11
2.2.1 Likelihood ratio criterion 11
2.2.2 Wald test 12
2.2.3 Rao's score test 12
2.3 Specification Tests for a
Multinomial Distribution 13
2.3.1 Test of a simple hypothesis 13
2.3.2 Tests of a composite hypothesis 14
2.3.3 Test for goodness-of-fit in a subset of cells 15
2.3.4 Analysis of chi-square 17
2.3.5 Some applications of the chi-square test 18
2.4 Other Tests of Goodness-of-Fit 18
2.5 Specification Tests for Continuous Distributions 20
References 22
3 Approximate Models 25
Peter J. Huber
3.1 Models 25
3.2 Bayesian Modeling 27
3.3 Mathematical Statistics and Approximate Models 29
3.4 Statistical Significance and Relevance 31
3.5 Composite Models 32
3.6 The Role of Simulation 38
3.7 Summary Conclusions 40
References 40
PART II: CHI-SQUARED TEST
4 Partitioning the Pearson-Fisher Chi-Squared
Goodness-of-Fit Statistic 45
G. D. Rayner
4.1 Introduction 45
4.2 Neyman Smooth Goodness-of-Fit Tests 46
4.2.1 Smooth goodness-of-fit tests for
categorized data 47
4.2.2 Partitioning the Pearson-Fisher
chi-squared statistic 48
4.3 Constructing the Pearson-Fisher Decomposition 49
4.4 Simulation Study 50
4.5 Results and Discussion 51
References 55
5 Statistical Tests for Normal Family in Presence of
Outlying Observations 57
Aïcha Zerbet
5.1 The Chi-Squared Test of Normality in the
Univariate Case 57
5.1.1 Example: Analysis of the data of Milliken 59
5.2 Bol'shev Test for Outliers 59
5.2.1 Stages of applications of the test of Bol'shev 60
5.2.2 Example 2: Analysis of the data of Daniel (1959) 60
5.3 Power of the Chi-Squared Test 61
References 63
6 Chi-Squared Test for the Law of Annual Death Rates:
Case with Censure for Life Insurance Files 65
Leo Gerville-Reache
6.1 Introduction 65
6.2 Chi-Squared Goodness-of-Fit Test 66
6.2.1 Statistics with censure 66
6.2.2 Goodness-of-fit test for a composite hypothesis 67
6.3 Demonstration 68
References 69
PART III: GOODNESS-OF-FIT TESTS FOR
PARAMETRIC DISTRIBUTIONS
7 Shapiro-Wilk Type Goodness-of-Fit Tests for
Normality: Asymptotics Revisited 73
Pranab Kumar Sen
7.1 Introduction 73
7.2 Preliminary Notion 74
7.3 SOADR Results for BLUE and LSE 77
7.4 Asymptotics for W~ 81
7.5 Asymptotics Under Alternatives 85
References 87
8 A Test for Exponentiality Based on Spacings for
Progressively Type-II Censored Data 89
N. Balakrishnan, H. K. T. Ng, and N. Kannan
8.1 Introduction 89
8.2 Progressive Censoring 91
8.3 Test for Exponentiality 92
8.3.1 Null distribution of T 93
8.4 Power Function Approximation and Simulation
Results 95
8.4.1 Approximation of power function 95
8.4.2 Monte Carlo power comparison 97
8.5 Modified EDF and Shapiro-Wilk Statistics 98
8.6 Two-Parameter Exponential Case 99
8.7 Illustrative Examples 100
8.7.1 Example 1: One-parameter exponential case 100
8.7.2 Example 2: Two-parameter exponential case 101
8.8 Multi-Sample Extension 102
8.9 Conclusions 103
References 103
9 Goodness-of-Fit Statistics for the Exponential
Distribution When the Data are Grouped 113
Sneh Gulati and Jordan Neus
9.1 Introduction 113
9.2 The Model and the Test Statistics 115
9.3 Asymptotic Distribution 116
9.4 Power Studies 119
References 122
10 Characterization Theorems and Goodness-of-Fit Tests 125
Carol E. Marchetti and Govind S. Mudholkar
10.1 Introduction and Summary 126
10.2 Characterization Theorems 127
10.2.1 Entropy characterizations 127
10.2.2 Statistical independence 128
10.3 Maximum Entropy Tests 130
10.4 Four Z Tests 131
10.5 Byproducts: The G-IG Analogies 134
References 137
11 Goodness-of-Fit Tests Based on Record Data and
Generalized Ranked Set Data 143
Barry C. Arnold, Robert J. Beaver, Enrique Castillo,
and Jose Maria Sarabia
11.1 Introduction 143
11.2 Record Data 144
11.3 Generalized Ranked Set Data 144
11.4 Power 150
11.5 Composite Null Hypotheses 154
11.6 Remarks 156
References 156
PART IV: REGRESSION AND GOODNESS-OF-FIT TESTS
12 Gibbs Regression and a Test of Goodness-of-Fit 161
Lynne Seymour
12.1 Introduction 161
12.2 The Motivation and the Model 162
12.3 Application and Evaluation of the Model 165
12.4 Discussion 169
References 170
13 A CLT for the L_2 Norm of the Regression Estimators
Under α-Mixing: Application to G-O-F Tests 173
Cheikh A. T. Diack
13.1 Introduction 173
13.2 Estimators 174
13.3 A Limit Theorem 175
13.4 Inference 177
13.5 Proofs 178
References 183
14 Testing the Goodness-of-Fit of a Linear Model in
Nonparametric Regression 185
Zaher Mohdeb and Abdelkader Mokkadem
14.1 Introduction 185
14.2 The Test Statistic 186
14.3 Simulations 189
References 193
15 A New Test of Linear Hypothesis in Regression 195
Y. Baraud, S. Huet, and B. Laurent
15.1 Introduction 195
15.2 The Testing Procedure 196
15.2.1 Description of the procedure 197
15.2.2 Behavior of the test under the null
hypothesis 198
15.2.3 A toy framework: The case of a known
variance 198
15.3 The Power of the Test 198
15.3.1 The main result 198
15.3.2 Rates of testing 199
15.4 Simulations 201
15.4.1 The simulation experiment 201
15.4.2 The testing procedure 202
15.4.3 The test proposed by Horowitz and
Spokoiny (2000) 202
15.4.4 Results of the simulation study 203
15.5 Proofs 203
15.5.1 Proof of Theorem 15.3.1 203
15.5.2 Proof of Corollary 15.3.1 204
References 206
PART V: GOODNESS-OF-FIT TESTS IN SURVIVAL ANALYSIS
AND RELIABILITY
16 Inference in Extensions of the Cox Model for
Heterogeneous Populations 211
Odile Pons
16.1 Introduction 211
16.2 Non-Stationary Cox Model 212
16.3 Varying-Coefficient Cox Model 219
References 224
17 Assumptions of a Latent Survival Model 227
Mei-Ling Ting Lee and G. A. Whitmore
17.1 Introduction 227
17.2 Latent Survival Model 228
17.3 Data and Parameter Estimation 229
17.4 Model Validation Methods 230
17.5 Remedies to Achieve a Better Model Fit 233
References 235
18 Goodness-of-Fit Testing for the Cox Proportional
Hazards Model 237
Karthik Devarajan and Nader Ebrahimi
18.1 Introduction 237
18.2 Goodness-of-Fit Testing for the Cox PH Model 240
18.3 Comparison of the Proposed Goodness-of-Fit Test
with Existing Methods 242
18.4 Illustration of the Goodness-of-Fit Test using
Real-Life Data 249
18.5 Concluding Remarks 250
References 251
19 A New Family of Multivariate Distributions for
Survival Data 255
Shulamith T. Gross and Catherine Huber-Carol
19.1 Introduction 255
19.2 Frailty Models: An Overview 255
19.3 The Model 257
19.4 An Application to Skin Grafts Rejection 261
19.4.1 Description of the data 261
References 264