Springer Series in Statistics
Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Atkinson/Riani: Robust Diagnostic Regression Analysis.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Bolfarine/Zacks: Prediction Theory for Finite Populations.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
David/Edwards: Annotated Readings in the History of Statistics.
Devroye/Lugosi: Combinatorial Methods in Density Estimation.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Eggermont/LaRiccia: Maximum Penalized Likelihood Estimation, Volume I:
Density Estimation.
Fahrmeir/Tutz: Multivariate Statistical Modelling Based on Generalized Linear
Models, 2nd edition.
Farebrother: Fitting Linear Relationships: A History of the Calculus of Observations
1750-1900.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume I:
Two Crops.
Federer: Statistical Design and Analysis for Intercropping Experiments, Volume II:
Three or More Crops.
Fienberg/Hoaglin/Kruskal/Tanur (Eds.): A Statistical Model: Frederick Mosteller's
Contributions to Statistics, Science and Public Policy.
Fisher/Sen: The Collected Works of Wassily Hoeffding.
Glaz/Naus/Wallenstein: Scan Statistics.
Good: Permutation Tests: A Practical Guide to Resampling Methods for Testing
Hypotheses, 2nd edition.
Gouriéroux: ARCH Models and Financial Applications.
Grandell: Aspects of Risk Theory.
Haberman: Advanced Statistics, Volume I: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Harrell: Regression Modeling Strategies: With Applications to Linear Models,
Logistic Regression, and Survival Analysis.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hartigan: Bayes Theory.
Hastie et al.: The Elements of Statistical Learning: Data Mining, Inference and Prediction.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal
Parameter Estimation.
Huet/Bouvier/Gruet/Jolivet: Statistical Tools for Nonlinear Regression: A Practical
Guide with S-PLUS Examples.
Ibrahim/Chen/Sinha: Bayesian Survival Analysis.
Kolen/Brennan: Test Equating: Methods and Practices.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume I.
Kotz/Johnson (Eds.): Breakthroughs in Statistics Volume II.
(continued after index)
Springer Series in Statistics
Advisors:
P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg,
I. Olkin, N. Wermuth, S. Zeger
Springer Science+Business Media, LLC
Paul W. Mielke, Jr.
Kenneth J. Berry
Permutation Methods
A Distance Function Approach
Springer
Paul W. Mielke, Jr.
Department of Statistics
Colorado State University
Fort Collins, Colorado 80523
E-mail: mielke@stat.colostate.edu

Kenneth J. Berry
Department of Sociology
Colorado State University
Fort Collins, Colorado 80523
E-mail: berry@lamar.colostate.edu
Library of Congress Cataloging-in-Publication Data
Mielke, Paul W.
Permutation methods: a distance function approach / Paul W. Mielke, Jr., Kenneth J. Berry.
p. cm. - (Springer series in statistics)
Includes bibliographical references and index.
ISBN 978-1-4757-3451-5 ISBN 978-1-4757-3449-2 (eBook)
DOI 10.1007/978-1-4757-3449-2
1. Statistical hypothesis testing. 2. Resampling (Statistics) I. Berry, Kenneth J.
II. Title. III. Series.
QA277 .M53 2001
519.5'6--dc21 00-067920
Printed on acid-free paper.
© 2001 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 2001.
Softcover reprint of the hardcover 1st edition 2001
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher, Springer Science+Business Media, LLC,
except for brief excerpts in connection with reviews or scholarly analysis. Use
in connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the
former are not especially identified, is not to be taken as a sign that such names, as understood by
the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
Production managed by A. Orrantia; manufacturing supervised by Jacqui Ashri.
Photocomposed pages prepared from the authors' LaTeX files.
9 8 7 6 5 4 3 2 1
SPIN 10731001
To our families.
Preface
The introduction of permutation tests by R. A. Fisher relaxed the parametric
structure requirement of a test statistic. For example, the structure
of the test statistic is no longer required if the assumption of normality
is removed. The between-object distance function of classical test statistics
based on the assumption of normality is squared Euclidean distance.
Because squared Euclidean distance is not a metric (i.e., the triangle
inequality is not satisfied), it is not at all surprising that classical tests are
severely affected by an extreme measurement of a single object. A major
purpose of this book is to take advantage of the relaxation of the structure
of a statistic allowed by permutation tests. While a variety of distance
functions are valid for permutation tests, a natural choice possessing many
desirable properties is ordinary (i.e., non-squared) Euclidean distance.
Simulation studies show that permutation tests based on ordinary Euclidean
distance are exceedingly robust in detecting location shifts of heavy-tailed
distributions. These tests depend on a metric distance function and are
reasonably powerful for a broad spectrum of univariate and multivariate
distributions.
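The remark that squared Euclidean distance fails the triangle inequality can be checked on three collinear points. The sketch below is only an illustration: the points and the helper names (euclid, sq_euclid, triangle_ok) are assumptions chosen for the example and do not come from the book.

```python
import math

def euclid(a, b):
    """Ordinary (non-squared) Euclidean distance between two points."""
    return math.dist(a, b)

def sq_euclid(a, b):
    """Squared Euclidean distance between two points."""
    return math.dist(a, b) ** 2

def triangle_ok(d, a, b, c):
    """True if d(a, c) <= d(a, b) + d(b, c) for this particular triple."""
    return d(a, c) <= d(a, b) + d(b, c)

# Three hypothetical points on a line: 0, 1, and 2.
x, y, z = (0.0,), (1.0,), (2.0,)

# Ordinary distance: 2 <= 1 + 1 holds.
print(triangle_ok(euclid, x, y, z))
# Squared distance: 4 <= 1 + 1 fails, so it is not a metric.
print(triangle_ok(sq_euclid, x, y, z))
```

Squaring inflates the long leg (4 instead of 2) far more than the two short legs, which is also why a single extreme measurement weighs so heavily under squared distance.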
Least sum of absolute deviations (LAD) regression linked with a permutation
test based on ordinary Euclidean distance yields a linear model
analysis which controls for type I error. These Euclidean distance-based
regression methods offer robust alternatives to the classical method of linear
model analyses involving the assumption of normality and ordinary
sum of least square deviations (OLS) regression linked with tests based on
squared Euclidean distance. In addition, consideration is given to a number
of permutation tests for (1) discrete and continuous goodness-of-fit,
(2) independence in multidimensional contingency tables, and (3) discrete
and continuous multisample homogeneity. Examples indicate some favorable
characteristics of seldom used tests.
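The contrast between LAD and OLS fitting can be seen in the simplest possible case, an intercept-only model, where the LAD fit is the sample median and the OLS fit is the sample mean. The sketch below assumes that simplified setting and uses made-up data; it is not the regression procedure developed later in the book.

```python
from statistics import mean, median

def sum_abs_dev(values, center):
    """Sum of absolute deviations from a candidate center (LAD criterion)."""
    return sum(abs(v - center) for v in values)

def sum_sq_dev(values, center):
    """Sum of squared deviations from a candidate center (OLS criterion)."""
    return sum((v - center) ** 2 for v in values)

# Hypothetical measurements; the second sample replaces 5 with an extreme value.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 100.0]

# For an intercept-only model the LAD estimate is the median
# and the OLS estimate is the mean.
lad_clean, ols_clean = median(clean), mean(clean)
lad_bad, ols_bad = median(contaminated), mean(contaminated)

print("clean:        LAD =", lad_clean, " OLS =", ols_clean)
print("contaminated: LAD =", lad_bad, " OLS =", ols_bad)
```

With these data the median stays at 3 while the mean moves from 3 to 22, so one extreme observation shifts the squared-distance fit but leaves the absolute-distance fit unchanged.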
Following a brief introduction in Chapter 1, Chapters 2, 3, and 4 provide
the motivation and description of univariate and multivariate permutation
tests based on distance functions for completely randomized and randomized
block designs. Applications are provided. Chapter 5 describes the linear
model methods based on the linkage between regression and permutation
tests, along with recently developed linear and nonlinear model prediction
techniques. Chapters 6, 7, and 8 include the goodness-of-fit, contingency table,
and multisample homogeneity tests, respectively. Appendix A contains
an annotated listing of the computer programs used in the book, organized
by chapter.
Paul Mielke is indebted to the following former University of Minnesota
faculty members: his advisor Richard B. McHugh for introducing him to
permutation tests, Jacob E. Bearman and Eugene A. Johnson for motivating
the examination of various problems from differing points of view,
and also to Constance van Eeden and I. Richard Savage for motivating his
interest in nonparametric methods. He wishes to thank two of his Colorado
State University students, Benjamin S. Duran and Earl S. Johnson, for
stimulating his long-term interest in alternative permutation methods. Finally,
he wishes to thank his Colorado State University colleagues Franklin
A. Graybill, Lewis O. Grant, William M. Gray, Hariharan K. Iyer, David
C. Bowden, Peter J. Brockwell, Yi-Ching Yao, Mohammed M. Siddiqui,
Jagdish N. Srivastava, and James S. Williams, who have provided him
with motivation and various suggestions pertaining to this topic over the
years. Kenneth Berry is indebted to the former University of Oregon faculty
members Walter T. Martin, mentor and advisor, and William S. Robinson
who first introduced him to nonparametric statistical methods. Colorado
State University colleagues Jeffrey L. Eighmy, R. Brooke Jacobsen, Michael
G. Lacy, and Thomas W. Martin were always there to listen, advise, and
encourage.
Acknowledgments. The authors thank the American Meteorological Society
for permission to reproduce excerpts from Weather and Forecasting
and the Journal of Applied Meteorology, Sage Publications, Inc. to reproduce
excerpts from Educational and Psychological Measurement, the American
Psychological Association for permission to reproduce excerpts from
Psychological Bulletin, the American Educational Research Association for
permission to reproduce excerpts from the Journal of Educational and Behavioral
Statistics, and the editors and publishers to reproduce excerpts
from Psychological Reports and Perceptual and Motor Skills.
The authors also wish to thank the following reviewers for their helpful
comments: Mayer Alvo, University of Ottawa; Bradley J. Biggerstaff,
Centers for Disease Control and Prevention; Brian S. Cade, U.S. Geological
Survey; Hariharan K. Iyer, Colorado State University; Bryan F. J.
Manly, WEST, Inc.; and Raymond K. W. Wong, Alberta Environment.
At Springer-Verlag New York, Inc., we thank our editor, John Kimmel,
for guiding the project throughout. We are grateful for the efforts of the
production editor, Antonio D. Orrantia, and the copy editor, Hal Henglein.
We wish to thank Roberta Mielke for reading the entire manuscript and
correcting our errors. Finally, we alone are responsible for any shortcomings
or inaccuracies.
Paul W. Mielke, Jr.
Kenneth J. Berry
Contents
Preface
1 Introduction
2 Description of MRPP
2.1 General Formulation of MRPP
2.1.1 Examples of MRPP
2.2 Choice of Weights and Distance Functions
2.3 Probability of an Observed δ
2.3.1 Resampling Approximation
2.3.2 Pearson Type III Approximation
2.3.3 Approximation Comparisons
2.3.4 Group Weights
2.3.5 Within-Group Agreement Measure
2.4 Exact and Approximate P-Values
2.5 MRPP with an Excess Group
2.6 Detection of Multiple Clumping
2.7 Detection of Evenly Spaced Location Patterns
2.8 Dependence of MRPP on v
2.9 Permutation Version of One-Way ANOVA
2.10 Euclidean and Hotelling Commensuration
2.11 Power Comparisons
2.11.1 The Normal Probability Distribution
2.11.2 The Cauchy Probability Distribution