Statistical Inference Based on
Divergence Measures
© 2006 by Taylor & Francis Group, LLC
STATISTICS: Textbooks and Monographs
D. B. Owen
Founding Editor, 1972–1991
Associate Editors
Statistical Computing/Nonparametric Statistics
Professor William R. Schucany
Southern Methodist University

Multivariate Analysis
Professor Anant M. Kshirsagar
University of Michigan

Probability
Professor Marcel F. Neuts
University of Arizona

Quality Control/Reliability
Professor Edward G. Schilling
Rochester Institute of Technology
Editorial Board
Applied Probability
Dr. Paul R. Garvey
The MITRE Corporation

Economic Statistics
Professor David E. A. Giles
University of Victoria

Experimental Designs
Mr. Thomas B. Barker
Rochester Institute of Technology

Multivariate Analysis
Professor Subir Ghosh
University of California–Riverside

Statistical Process Improvement
Professor G. Geoffrey Vining
Virginia Polytechnic Institute

Stochastic Processes
Professor V. Lakshmikantham
Florida Institute of Technology

Survey Sampling
Professor Lynne Stokes
Southern Methodist University

Time Series
Sastry G. Pantula
North Carolina State University
Statistical Distributions
Professor N. Balakrishnan
McMaster University
STATISTICS: Textbooks and Monographs
Recent Titles
Asymptotics, Nonparametrics, and Time Series, edited by Subir Ghosh
Multivariate Analysis, Design of Experiments, and Survey Sampling, edited by Subir Ghosh
Statistical Process Monitoring and Control, edited by Sung H. Park and G. Geoffrey Vining
Statistics for the 21st Century: Methodologies for Applications of the Future, edited by C. R.
Rao and Gábor J. Székely
Probability and Statistical Inference, Nitis Mukhopadhyay
Handbook of Stochastic Analysis and Applications, edited by D. Kannan and
V. Lakshmikantham
Testing for Normality, Henry C. Thode, Jr.
Handbook of Applied Econometrics and Statistical Inference, edited by Aman Ullah, Alan
T. K. Wan, and Anoop Chaturvedi
Visualizing Statistical Models and Concepts, R. W. Farebrother and Michael Schyns
Financial and Actuarial Statistics, Dale Borowiak
Nonparametric Statistical Inference, Fourth Edition, Revised and Expanded, edited by Jean
Dickinson Gibbons and Subhabrata Chakraborti
Computer-Aided Econometrics, edited by David E. A. Giles
The EM Algorithm and Related Statistical Models, edited by Michiko Watanabe and Kazunori
Yamaguchi
Multivariate Statistical Analysis, Second Edition, Revised and Expanded, Narayan C. Giri
Computational Methods in Statistics and Econometrics, Hisashi Tanizaki
Applied Sequential Methodologies: Real-World Examples with Data Analysis, edited by Nitis
Mukhopadhyay, Sujay Datta, and Saibal Chattopadhyay
Handbook of Beta Distribution and Its Applications, edited by Richard Guarino and Saralees
Nadarajah
Item Response Theory: Parameter Estimation Techniques, Second Edition, edited by Frank
B. Baker and Seock-Ho Kim
Statistical Methods in Computer Security, William W. S. Chen
Elementary Statistical Quality Control, Second Edition, John T. Burr
Data Analysis of Asymmetric Structures, edited by Takayuki Saito and Hiroshi Yadohisa
Mathematical Statistics with Applications, Asha Seth Kapadia, Wenyaw Chan, and Lemuel Moyé
Advances on Models, Characterizations and Applications, N. Balakrishnan, I. G. Bairamov,
and O. L. Gebizlioglu
Survey Sampling: Theory and Methods, Second Edition, Arijit Chaudhuri and Horst Stenger
Statistical Design of Experiments with Engineering Applications, Kamel Rekab and
Muzaffar Shaikh
Quality By Experimental Design, Third Edition, Thomas B. Barker
Handbook of Parallel Computing and Statistics, Erricos John Kontoghiorghes
Statistical Inference Based on Divergence Measures, Leandro Pardo
Statistical Inference Based on
Divergence Measures
Leandro Pardo
Complutense University of Madrid
Spain
Boca Raton London New York
Published in 2006 by
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2006 by Taylor & Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-10: 1-58488-600-5 (Hardcover)
International Standard Book Number-13: 978-1-58488-600-6 (Hardcover)
Library of Congress Card Number 2005049685
This book contains information obtained from authentic and highly regarded sources. Reprinted material is
quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts
have been made to publish reliable data and information, but the author and the publisher cannot assume
responsibility for the validity of all materials or for the consequences of their use.
No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic,
mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and
recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration
for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate
system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only
for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Pardo, Leandro.
Statistical inference based on divergence measures / Leandro Pardo
p. cm. -- (Statistics, textbooks and monographs ; v. 185)
Includes bibliographical references and index.
ISBN 1-58488-600-5
1. Divergent series. 2. Entropy (Information theory) 3. Multivariate analysis. 4. Statistical
hypothesis testing. 5. Asymptotic expansions. I. Title. II. Series.
QA295.P28 2005
519.5'4--dc22 2005049685
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Taylor & Francis Group is the Academic Division of Informa plc.
This book is dedicated to my wife, Marisa
Preface
The main purpose of this book is to present in a systematic way the solution to
some classical problems of statistical inference, basically problems of estimation
and hypothesis testing, on the basis of measures of entropy and divergence, with
applications to multinomial (statistical analysis of categorical data) and general
populations. The idea of using functionals of Information Theory, such as
entropies or divergences, in statistical inference is not new. In fact, the so-called
Statistical Information Theory has been the subject of much statistical research
over the last forty years. Minimum divergence estimators or minimum distance
estimators (see Parr, 1981) have been used successfully in models for continuous
and discrete data. Divergence statistics, i.e., those obtained by replacing
either one or both arguments in the measures of divergence by suitable
estimators, have become a very good alternative to the classical likelihood ratio test
in both continuous and discrete models, as well as to the classical Pearson-type
statistic in discrete models. The book is written as a textbook, although many of
the methods and results are quite recent.
Information Theory was born in 1948, when Shannon published his famous
paper “A mathematical theory of communication.” Motivated by the problem of
efficiently transmitting information over a noisy communication channel, he in-
troduced a revolutionary new probabilistic way of thinking about communication
and simultaneously created the first truly mathematical theory of entropy. In the
cited paper, two new concepts were proposed and studied: the entropy, a measure
of uncertainty of a random variable, and the mutual information. Verdú (1998),
in his review paper, describes Information Theory as follows: “A unifying theory
with profound intersections with Probability, Statistics, Computer Science, and
other fields. Information Theory continues to set the stage for the development
of communications, data storage and processing, and other information technolo-
gies.” Many books have been written in relation to the subjects mentioned by
Verdú, but the use of tools arising from Information Theory in problems of
estimation and testing has only been described in the book by Read and Cressie
(1988), in the context of categorical data analysis. However, the interesting
possibility of introducing alternative test statistics to the classical ones (such as
the Wald, Rao, or likelihood ratio statistics) in general populations has not yet
appeared in any book, as far as I am aware. This is an important contribution
of this book to the field of Information Theory.
But the following interesting question arises: where exactly does the link
between Information Theory and Statistics originate? Lindley (1956)
tries to answer this question with the following words, with reference to the paper
of Shannon (1948): “The first idea is that information is a statistical concept”
and “The second idea springs from the first and implies that on the basis of the
frequency distribution, there is an essentially unique function of the distribution
which measures the amount of the information.” This fact gave Kullback and
Leibler (1951) the opportunity to introduce a measure of divergence, as a gener-
alization of Shannon’s entropy, called the Kullback-Leibler divergence. Kullback,
later in 1959, wrote the essential book “Information Theory and Statistics.” This
book can be considered the beginning of Statistical Information Theory, although
it was necessary to wait a few more years for statisticians to return to
the problem.
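The Kullback-Leibler divergence just mentioned has a simple closed form for discrete distributions. As a minimal illustrative sketch (the function names below are mine, not taken from the book), the following shows the sense in which it generalizes Shannon's entropy:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete
    distributions given as sequences of probabilities (in nats).
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def shannon_entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# The divergence from a distribution p to the uniform distribution on k
# categories recovers Shannon's entropy: D(p || uniform) = log(k) - H(p).
p = [0.5, 0.3, 0.2]
uniform = [1.0 / 3.0] * 3
d = kl_divergence(p, uniform)
```

Note that the divergence is zero only when the two distributions coincide, which is what makes it usable as a measure of discrepancy in estimation and testing.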
The contents of the present book can be roughly separated into two parts. The
first part is dedicated to giving, from a statistical perspective, an overview of the
most important measures of entropy and divergence introduced so far in the
literature of Information Theory, as well as to studying their properties, in order
to justify their application in statistical inference. Special attention is paid to
the families of φ-entropies as well as to the φ-divergence measures. This is the
main target of Chapter 1. Chapter 2 is devoted to the study of the asymptotic
behavior of measures of entropy, and the use of their asymptotic distributions
to solve different statistical problems. An important fact studied in this chapter
is the behavior of the entropy measures as diversity indices. The second part
of the book is dedicated to two important topics: statistical analysis of discrete
multivariate data in Chapters 3, 4, 5, 6, 7 and 8, and testing in general populations
in Chapter 9.
The statistical analysis of discrete multivariate data, arising from experiments
where the outcome variables are the number of individuals classified into unique
nonoverlapping categories, has received a great deal of attention in the statistical
literature in the last forty years. The development of appropriate models for
those kinds of data is the common subject of hundreds of references. In these
references, papers and books, the model is tested with the traditional Pearson
goodness-of-fit test statistic or with the traditional loglikelihood ratio test sta-
tistic, and the unknown parameters are estimated using the maximum likelihood
method. However, it is well known that this can give a poor approximation in
many circumstances, see Read and Cressie (1988), and it is possible to get better
results by considering general families of test statistics, as well as general families
of estimators. We use the word “general” in the sense that these families contain
as particular cases the Pearson and loglikelihood ratio test statistics, for testing,
as well as the maximum likelihood estimator, for estimating. In Chapter 3, the
problem of testing goodness-of-fit with simple null hypothesis is studied on the
basis of the φ-divergence test statistics under different situations: Fixed number
of classes, number of classes increasing to infinity, quantile characterization, de-
pendent observations, and misclassified data. The results obtained in this chapter
are asymptotic and consequently valid only for large sample sizes. In Chapter
4, some methods to improve the accuracy of test statistics, in those situations
where the sample size cannot be assumed large, are presented. Chapter 5 is
addressed to the study of a wide class of estimators suitable for discrete data,
either when the underlying distribution is discrete, or when it is continuous, but
the observations are classified into groups: Minimum φ-divergence estimators.
Their asymptotic properties are studied as well as their behavior under the set
up of a mixture of normal populations. A new problem of estimation appears
if we have some functions that constrain the unknown parameters. To solve
this problem, the restricted minimum φ-divergence estimator is also introduced
and studied in Chapter 5. These results will be used in Chapter 8, where the
behavior of φ-divergence test statistics in contingency tables is discussed. Chap-
ter 6 deals with the problem of goodness-of-fit with composite null hypothesis.
For this problem, we consider φ-divergence test statistics in which the unknown
parameters are estimated by minimum φ-divergence estimators. In addition to
the classical problem, with fixed number of classes, the following nonstandard
cases are also treated: φ-divergence test statistics when the unknown parameters
are estimated by the maximum likelihood estimator, φ-divergence test statistics with
quantile characterizations, φ-divergence test statistics when parameters are esti-
mated from an independent sample, φ-divergence test statistics with dependent
observations and φ-divergence test statistics when there are some constraints on
the parameters. Chapter 7 covers the important problem of testing in loglinear
models by using φ-divergence test statistics. In this chapter, some of the most
important results that appeared in Cressie and Pardo (2000, 2002b) and Cressie et
al. (2003) are presented. The properties of the minimum φ-divergence estima-
tors in loglinear models are studied and a new family of test statistics based on
them is introduced for the problems of testing goodness-of-fit and for testing a
nested sequence of loglinear models. Pearson’s and likelihood ratio test statistics
are members of the new family of test statistics. This chapter finishes with a
simulation study, in which a new test statistic, placed “between” Pearson’s chi-
square and likelihood ratio test statistics, emerged as a good choice, considering
its valuable properties.
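The general families of test statistics referred to above, containing Pearson's chi-square and the loglikelihood ratio as particular cases, can be illustrated with the power-divergence family of Read and Cressie (1988), a well-known subfamily of the φ-divergence test statistics. The sketch below is illustrative only (the function name and data are mine, not from the book): λ = 1 reproduces Pearson's X², and the limit λ → 0 gives the loglikelihood ratio statistic G².

```python
import math

def power_divergence(observed, expected, lam):
    """Read-Cressie power-divergence statistic 2*n*I^lam.
    lam = 1 gives Pearson's chi-square X^2; the limit lam -> 0 gives
    the loglikelihood ratio statistic G^2; lam = 2/3 is the compromise
    statistic recommended by Read and Cressie (1988)."""
    if abs(lam) < 1e-12:  # limiting case lam -> 0: loglikelihood ratio
        return 2.0 * sum(o * math.log(o / e)
                         for o, e in zip(observed, expected) if o > 0)
    return (2.0 / (lam * (lam + 1.0))) * sum(
        o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))

observed = [30, 14, 34, 45, 27]   # multinomial cell counts, n = 150
n = sum(observed)
expected = [n / 5.0] * 5          # simple null hypothesis: equiprobable cells

pearson = power_divergence(observed, expected, 1.0)   # X^2
loglik = power_divergence(observed, expected, 0.0)    # G^2
```

Under the null hypothesis, every member of the family has the same limiting chi-square distribution, which is what makes a statistic "between" X² and G² a legitimate alternative with its own finite-sample advantages.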
Chapter 8 presents a unified study of some classical problems in contingency
tables using the φ-divergence test statistic as well as the minimum φ-divergence
estimator. We consider the problems of independence, symmetry, marginal homo-
geneity and quasi-symmetry in a two-way contingency table, and also the classical
problem of homogeneity.
The domain of application of φ-divergence test statistics goes far beyond
that of multinomial hypothesis testing. The extension of φ-divergence statistics
to testing hypotheses in problems where random samples (one or several) obey
distributional laws from parametric families has also given nice and interesting
results in relation to the classical test statistics: likelihood ratio test, Wald test
statistic or Rao statistic. This topic is considered and studied in Chapter 9.
The exercises and their solutions included in each chapter form an important
part of the book. They provide not only practice problems
for students, but also some additional results as complementary materials to the
main text.
I would like to express my gratitude to all the professors who revised parts
of the manuscript and made some contributions. In particular, I would like to
thank Professors Arjun Gupta, Nirian Martín, Isabel Molina, Domingo Morales,
Truc Nguyen, Julio Ángel Pardo, María del Carmen Pardo and Kostas Zografos.
My gratitude, also, to Professor Juan Francisco Padial for his support in the
technical development of the book.
Special thanks to Professor Arjun Gupta for his invitation to visit the De-