Statistical Inference Based on Divergence Measures

STATISTICS: Textbooks and Monographs
D. B. Owen, Founding Editor, 1972–1991

Associate Editors

Statistical Computing/Nonparametric Statistics: Professor William R. Schucany, Southern Methodist University
Multivariate Analysis: Professor Anant M. Kshirsagar, University of Michigan
Quality Control/Reliability: Professor Edward G. Schilling, Rochester Institute of Technology
Probability: Professor Marcel F. Neuts, University of Arizona

Editorial Board

Applied Probability: Dr. Paul R. Garvey, The MITRE Corporation
Statistical Process Improvement: Professor G. Geoffrey Vining, Virginia Polytechnic Institute
Economic Statistics: Professor David E. A. Giles, University of Victoria
Stochastic Processes: Professor V. Lakshmikantham, Florida Institute of Technology
Experimental Designs: Mr. Thomas B. Barker, Rochester Institute of Technology
Survey Sampling: Professor Lynne Stokes, Southern Methodist University
Multivariate Analysis: Professor Subir Ghosh, University of California–Riverside
Time Series: Sastry G. Pantula, North Carolina State University
Statistical Distributions: Professor N. Balakrishnan, McMaster University

STATISTICS: Textbooks and Monographs
Recent Titles

Asymptotics, Nonparametrics, and Time Series, edited by Subir Ghosh
Multivariate Analysis, Design of Experiments, and Survey Sampling, edited by Subir Ghosh
Statistical Process Monitoring and Control, edited by Sung H. Park and G. Geoffrey Vining
Statistics for the 21st Century: Methodologies for Applications of the Future, edited by C. R. Rao and Gábor J. Székely
Probability and Statistical Inference, Nitis Mukhopadhyay
Handbook of Stochastic Analysis and Applications, edited by D. Kannan and V. Lakshmikantham
Testing for Normality, Henry C. Thode, Jr.
Handbook of Applied Econometrics and Statistical Inference, edited by Aman Ullah, Alan T. K. Wan, and Anoop Chaturvedi
Visualizing Statistical Models and Concepts, R. W. Farebrother and Michael Schyns
Financial and Actuarial Statistics, Dale Borowiak
Nonparametric Statistical Inference, Fourth Edition, Revised and Expanded, edited by Jean Dickinson Gibbons and Subhabrata Chakraborti
Computer-Aided Econometrics, edited by David E. A. Giles
The EM Algorithm and Related Statistical Models, edited by Michiko Watanabe and Kazunori Yamaguchi
Multivariate Statistical Analysis, Second Edition, Revised and Expanded, Narayan C. Giri
Computational Methods in Statistics and Econometrics, Hisashi Tanizaki
Applied Sequential Methodologies: Real-World Examples with Data Analysis, edited by Nitis Mukhopadhyay, Sujay Datta, and Saibal Chattopadhyay
Handbook of Beta Distribution and Its Applications, edited by Arjun K. Gupta and Saralees Nadarajah
Item Response Theory: Parameter Estimation Techniques, Second Edition, edited by Frank B. Baker and Seock-Ho Kim
Statistical Methods in Computer Security, William W. S. Chen
Elementary Statistical Quality Control, Second Edition, John T. Burr
Data Analysis of Asymmetric Structures, edited by Takayuki Saito and Hiroshi Yadohisa
Mathematical Statistics with Applications, Asha Seth Kapadia, Wenyaw Chan, and Lemuel Moyé
Advances on Models, Characterizations and Applications, N. Balakrishnan, I. G. Bairamov, and O. L. Gebizlioglu
Survey Sampling: Theory and Methods, Second Edition, Arijit Chaudhuri and Horst Stenger
Statistical Design of Experiments with Engineering Applications, Kamel Rekab and Muzaffar Shaikh
Quality by Experimental Design, Third Edition, Thomas B. Barker
Handbook of Parallel Computing and Statistics, Erricos John Kontoghiorghes
Statistical Inference Based on Divergence Measures, Leandro Pardo

Statistical Inference Based on Divergence Measures

Leandro Pardo
Complutense University of Madrid
Spain

Boca Raton   London   New York

Published in 2006 by Chapman & Hall/CRC, Taylor & Francis Group, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742. © 2006 by Taylor & Francis Group, LLC. Chapman & Hall/CRC is an imprint of Taylor & Francis Group. No claim to original U.S. Government works. Printed in the United States of America on acid-free paper. 10 9 8 7 6 5 4 3 2 1

International Standard Book Number-10: 1-58488-600-5 (Hardcover)
International Standard Book Number-13: 978-1-58488-600-6 (Hardcover)
Library of Congress Card Number 2005049685

This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts have been made to publish reliable data and information, but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.

No part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Pardo, Leandro.
Statistical inference based on divergence measures / Leandro Pardo.
p. cm. -- (Statistics, textbooks and monographs ; v. 185)
Includes bibliographical references and index.
ISBN 1-58488-600-5
1. Divergent series. 2. Entropy (Information theory) 3. Multivariate analysis. 4. Statistical hypothesis testing. 5. Asymptotic expansions. I. Title. II. Series.
QA295.P28 2005
519.5'4--dc22
2005049685

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com. Taylor & Francis Group is the Academic Division of Informa plc.
This book is dedicated to my wife, Marisa

Preface

The main purpose of this book is to present in a systematic way the solution to some classical problems of statistical inference, basically problems of estimation and hypothesis testing, on the basis of measures of entropy and divergence, with applications to multinomial (statistical analysis of categorical data) and general populations. The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. In fact, the so-called Statistical Information Theory has been the subject of much statistical research over the last forty years. Minimum divergence estimators or minimum distance estimators (see Parr, 1981) have been used successfully in models for continuous and discrete data. Divergence statistics, i.e., those obtained by replacing either one or both arguments in the measures of divergence by suitable estimators, have become a very good alternative to the classical likelihood ratio test in both continuous and discrete models, as well as to the classical Pearson-type statistic in discrete models. The book is written as a textbook, although many methods and results are quite recent.

Information Theory was born in 1948, when Shannon published his famous paper "A mathematical theory of communication." Motivated by the problem of efficiently transmitting information over a noisy communication channel, he introduced a revolutionary new probabilistic way of thinking about communication and simultaneously created the first truly mathematical theory of entropy. In the cited paper, two new concepts were proposed and studied: the entropy, a measure of the uncertainty of a random variable, and the mutual information. Verdú (1998), in his review paper, describes Information Theory as follows: "A unifying theory with profound intersections with Probability, Statistics, Computer Science, and other fields. Information Theory continues to set the stage for the development of communications, data storage and processing, and other information technologies." Many books have been written on the subjects mentioned by Verdú, but the use of tools arising from Information Theory in problems of estimation and testing has been described only in the book of Read and Cressie (1988), in the context of categorical data analysis. The interesting possibility of introducing alternative test statistics to the classical ones (like the Wald, Rao or likelihood ratio statistics) in general populations has, as far as I know, not yet been treated in any book. This is an important contribution of this book to the field of Information Theory.

But an interesting question arises: where exactly can the origin of the link between Information Theory and Statistics be placed? Lindley (1956) tries to answer this question with the following words, with reference to the paper of Shannon (1948): "The first idea is that information is a statistical concept" and "The second idea springs from the first and implies that on the basis of the frequency distribution, there is an essentially unique function of the distribution which measures the amount of the information." This fact provided Kullback and Leibler (1951) the opportunity to introduce a measure of divergence, as a generalization of Shannon's entropy, called the Kullback-Leibler divergence.
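For reference, the two quantities just mentioned take the following standard forms for a discrete distribution p = (p_1, ..., p_M) and a second distribution q on the same support (the notation here is the conventional one, not necessarily the book's):

```latex
% Shannon entropy: the uncertainty of a discrete distribution p
H(p) = -\sum_{i=1}^{M} p_i \log p_i ,
\qquad
% Kullback-Leibler divergence: the discrepancy of p from q
D_{\mathrm{KL}}(p \,\|\, q) = \sum_{i=1}^{M} p_i \log \frac{p_i}{q_i} .
```

The Kullback-Leibler divergence is nonnegative and vanishes exactly when p = q, which is what makes it a natural building block for test statistics. The φ-divergences studied in Chapter 1 generalize it: D_φ(p, q) = Σ_i q_i φ(p_i/q_i) for a convex φ with φ(1) = 0, and the choice φ(x) = x log x recovers the Kullback-Leibler divergence.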
Kullback, later in 1959, wrote the essential book "Information Theory and Statistics." This book can be considered the beginning of Statistical Information Theory, although it was necessary to wait a few more years for statisticians to return to the problem.

The contents of the present book can be roughly separated into two parts. The first part is dedicated to giving, from a statistical perspective, an overview of the most important measures of entropy and divergence introduced so far in the literature of Information Theory, as well as to studying their properties, in order to justify their application in statistical inference. Special attention is paid to the families of φ-entropies as well as to the φ-divergence measures. This is the main target of Chapter 1. Chapter 2 is devoted to the study of the asymptotic behavior of measures of entropy, and to the use of their asymptotic distributions to solve different statistical problems. An important fact studied in this chapter is the behavior of the entropy measures as diversity indexes. The second part of the book is dedicated to two important topics: statistical analysis of discrete multivariate data in Chapters 3, 4, 5, 6, 7 and 8, and testing in general populations in Chapter 9.

The statistical analysis of discrete multivariate data, arising from experiments where the outcome variables are the number of individuals classified into unique nonoverlapping categories, has received a great deal of attention in the statistical literature in the last forty years. The development of appropriate models for those kinds of data is the common subject of hundreds of references. In these references, papers and books, the model is tested with the traditional Pearson goodness-of-fit test statistic or with the traditional loglikelihood ratio test statistic, and the unknown parameters are estimated using the maximum likelihood method. However, it is well known that this can give a poor approximation in many circumstances, see Read and Cressie (1988), and it is possible to get better results by considering general families of test statistics, as well as general families of estimators. We use the word "general" in the sense that these families contain as particular cases the Pearson and loglikelihood ratio test statistics, for testing, as well as the maximum likelihood estimator, for estimating.

In Chapter 3, the problem of testing goodness-of-fit with a simple null hypothesis is studied on the basis of the φ-divergence test statistics under different situations: fixed number of classes, number of classes increasing to infinity, quantile characterization, dependent observations and misclassified data. The results obtained in this chapter are asymptotic and consequently valid only for large sample sizes. In Chapter 4, some methods to improve the accuracy of test statistics, in those situations where the sample size cannot be assumed large, are presented. Chapter 5 addresses the study of a wide class of estimators suitable for discrete data, either when the underlying distribution is discrete, or when it is continuous but the observations are classified into groups: minimum φ-divergence estimators. Their asymptotic properties are studied, as well as their behavior under the setup of a mixture of normal populations. A new problem of estimation appears if we have some functions that constrain the unknown parameters. To solve this problem, the restricted minimum φ-divergence estimator is also introduced and studied in Chapter 5.
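To make the notion of a "general family" concrete, here is a minimal sketch (mine, not the book's code; the observed counts and null probabilities below are made-up illustrations) of the Cressie-Read power-divergence family of goodness-of-fit statistics, a family of exactly this kind: λ = 1 recovers Pearson's X² and the limit λ → 0 recovers the loglikelihood ratio statistic G².

```python
import numpy as np

def power_divergence(obs_counts, probs0, lam):
    """Cressie-Read power-divergence statistic 2n I^lam(p_hat : p0).

    lam = 1 gives Pearson's X^2; lam -> 0 gives the loglikelihood
    ratio G^2 (handled below as an explicit special case; lam = -1
    is another continuity limit, not treated here). Under a simple
    null with M cells, every member is asymptotically chi-square
    with M - 1 degrees of freedom.
    """
    obs = np.asarray(obs_counts, dtype=float)
    n = obs.sum()
    expected = n * np.asarray(probs0, dtype=float)  # expected counts
    if abs(lam) < 1e-12:                            # lam -> 0 limit: G^2
        mask = obs > 0                              # 0 * log(0/e) = 0
        return 2.0 * np.sum(obs[mask] * np.log(obs[mask] / expected[mask]))
    return (2.0 / (lam * (lam + 1.0))) * np.sum(obs * ((obs / expected) ** lam - 1.0))

# Illustrative data: a 4-cell multinomial, simple null of equal cells.
counts = np.array([18, 29, 32, 21])
p0 = np.array([0.25, 0.25, 0.25, 0.25])

# Pearson (1), loglikelihood ratio (0), Cressie-Read (2/3), Freeman-Tukey (-1/2).
# Compare each against the chi-square(3) 95% point, about 7.81.
for lam in (1.0, 0.0, 2.0 / 3.0, -0.5):
    print(f"lambda = {lam:6.3f}  statistic = {power_divergence(counts, p0, lam):.3f}")
```

Since all members share the same asymptotic chi-square distribution with M - 1 degrees of freedom under the simple null, they can all be referred to the same critical value. The minimum φ-divergence estimators of Chapter 5 arise from the same ingredients: the fixed null probabilities are replaced by a parametric model p(θ), and a divergence of this type is minimized over θ.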
These results will be used in Chapter 8, where the behavior of φ-divergence test statistics in contingency tables is discussed. Chapter 6 deals with the problem of goodness-of-fit with composite null hypothesis. For this problem, we consider φ-divergence test statistics in which the unknown parameters are estimated by minimum φ-divergence estimators. In addition to the classical problem, with a fixed number of classes, the following nonstandard cases are also treated: φ-divergence test statistics when the unknown parameters are estimated by the maximum likelihood estimator, φ-divergence test statistics with quantile characterizations, φ-divergence test statistics when parameters are estimated from an independent sample, φ-divergence test statistics with dependent observations, and φ-divergence test statistics when there are some constraints on the parameters.

Chapter 7 covers the important problem of testing in loglinear models by using φ-divergence test statistics. In this chapter, some of the most important results that appeared in Cressie and Pardo (2000, 2002b) and Cressie et al. (2003) are presented. The properties of the minimum φ-divergence estimators in loglinear models are studied, and a new family of test statistics based on them is introduced for the problems of testing goodness-of-fit and of testing a nested sequence of loglinear models. Pearson's and likelihood ratio test statistics are members of the new family. This chapter finishes with a simulation study, in which a new test statistic, placed "between" Pearson's chi-square and likelihood ratio test statistics, emerged as a good choice, considering its valuable properties.

Chapter 8 presents a unified study of some classical problems in contingency tables using the φ-divergence test statistic as well as the minimum φ-divergence estimator. We consider the problems of independence, symmetry, marginal homogeneity and quasi-symmetry in a two-way contingency table, and also the classical problem of homogeneity.

The domain of application of φ-divergence test statistics goes far beyond that of multinomial hypothesis testing. The extension of φ-divergence statistics to testing hypotheses in problems where random samples (one or several) obey distributional laws from parametric families has also given nice and interesting results in relation to the classical test statistics: the likelihood ratio test, the Wald test statistic and the Rao statistic. This topic is considered and studied in Chapter 9.

The exercises and their solutions included in each chapter form an important part of the book. They provide not only practice problems for students, but also some additional results as complementary material to the main text.

I would like to express my gratitude to all the professors who revised parts of the manuscript and made some contributions. In particular, I would like to thank Professors Arjun Gupta, Nirian Martín, Isabel Molina, Domingo Morales, Truc Nguyen, Julio Angel Pardo, Maria del Carmen Pardo and Kostas Zografos. My gratitude, also, to Professor Juan Francisco Padial for his support in the technical development of the book. Special thanks to Professor Arjun Gupta for his invitation to visit the De-
