
Information Criteria and Statistical Modeling


Springer Series in Statistics
Advisors: P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

Alho/Spencer: Statistical Demography and Forecasting.
Andersen/Borgan/Gill/Keiding: Statistical Models Based on Counting Processes.
Atkinson/Riani: Robust Diagnostic Regression Analysis.
Atkinson/Riani/Cerioli: Exploring Multivariate Data with the Forward Search.
Berger: Statistical Decision Theory and Bayesian Analysis, 2nd edition.
Borg/Groenen: Modern Multidimensional Scaling: Theory and Applications, 2nd edition.
Brockwell/Davis: Time Series: Theory and Methods, 2nd edition.
Bucklew: Introduction to Rare Event Simulation.
Cappé/Moulines/Rydén: Inference in Hidden Markov Models.
Chan/Tong: Chaos: A Statistical Perspective.
Chen/Shao/Ibrahim: Monte Carlo Methods in Bayesian Computation.
Coles: An Introduction to Statistical Modeling of Extreme Values.
Devroye/Lugosi: Combinatorial Methods in Density Estimation.
Diggle/Ribeiro: Model-based Geostatistics.
Dudoit/Van der Laan: Multiple Testing Procedures with Applications to Genomics.
Efromovich: Nonparametric Curve Estimation: Methods, Theory, and Applications.
Eggermont/LaRiccia: Maximum Penalized Likelihood Estimation, Volume I: Density Estimation.
Fahrmeir/Tutz: Multivariate Statistical Modeling Based on Generalized Linear Models, 2nd edition.
Fan/Yao: Nonlinear Time Series: Nonparametric and Parametric Methods.
Ferraty/Vieu: Nonparametric Functional Data Analysis: Theory and Practice.
Ferreira/Lee: Multiscale Modeling: A Bayesian Perspective.
Fienberg/Hoaglin: Selected Papers of Frederick Mosteller.
Frühwirth-Schnatter: Finite Mixture and Markov Switching Models.
Ghosh/Ramamoorthi: Bayesian Nonparametrics.
Glaz/Naus/Wallenstein: Scan Statistics.
Good: Permutation Tests: Parametric and Bootstrap Tests of Hypotheses, 3rd edition.
Gouriéroux: ARCH Models and Financial Applications.
Gu: Smoothing Spline ANOVA Models.
Györfi/Kohler/Krzyżak/Walk: A Distribution-Free Theory of Nonparametric Regression.
Haberman: Advanced Statistics, Volume I: Description of Populations.
Hall: The Bootstrap and Edgeworth Expansion.
Härdle: Smoothing Techniques: With Implementation in S.
Harrell: Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis.
Hart: Nonparametric Smoothing and Lack-of-Fit Tests.
Hastie/Tibshirani/Friedman: The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Hedayat/Sloane/Stufken: Orthogonal Arrays: Theory and Applications.
Heyde: Quasi-Likelihood and its Application: A General Approach to Optimal Parameter Estimation.
Huet/Bouvier/Poursat/Jolivet: Statistical Tools for Nonlinear Regression: A Practical Guide with S-PLUS and R Examples, 2nd edition.
Ibrahim/Chen/Sinha: Bayesian Survival Analysis.
Jiang: Linear and Generalized Linear Mixed Models and Their Applications.
Jolliffe: Principal Component Analysis, 2nd edition.
Knottnerus: Sample Survey Theory: Some Pythagorean Perspectives.
Konishi/Kitagawa: Information Criteria and Statistical Modeling.
(continued after index)

Sadanori Konishi · Genshiro Kitagawa

Information Criteria and Statistical Modeling

Sadanori Konishi
Faculty of Mathematics
Kyushu University
6-10-1 Hakozaki, Higashi-ku
Fukuoka 812-8581
Japan
[email protected]

Genshiro Kitagawa
The Institute of Statistical Mathematics
4-6-7 Minami-Azabu, Minato-ku
Tokyo 106-8569
Japan
[email protected]

ISBN: 978-0-387-71886-6
e-ISBN: 978-0-387-71887-3
Library of Congress Control Number: 2007925718

© 2008 Springer Science+Business Media, LLC

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper.
9 8 7 6 5 4 3 2 1
springer.com

Preface

Statistical modeling is a critical tool in scientific research. Statistical models are used to understand phenomena with uncertainty, to determine the structure of complex systems, and to control such systems, as well as to make reliable predictions in various natural and social science fields. The objective of statistical analysis is to express the information contained in the data on the phenomenon and system under consideration. This information can be expressed in an understandable form using a statistical model. A model also allows inferences to be made about unknown aspects of stochastic phenomena and helps reveal causal relationships. In practice, model selection and evaluation are central issues, and a crucial aspect is selecting the most appropriate model from a set of candidate models.

In the information-theoretic approach advocated by Akaike (1973, 1974), the Kullback–Leibler (1951) information discrepancy is taken as the basic criterion for evaluating the goodness of a model as an approximation to the true distribution that generates the data. The Akaike information criterion (AIC) was derived as an asymptotic approximate estimate of the Kullback–Leibler information discrepancy and provides a useful tool for evaluating models estimated by the maximum likelihood method. Numerous successful applications of the AIC in the statistical sciences have been reported [see, e.g., Akaike and Kitagawa (1998) and Bozdogan (1994)]. In practice, the Bayesian information criterion (BIC) proposed by Schwarz (1978) is also widely used as a model selection criterion. The BIC is based on Bayesian probability and can likewise be applied to models estimated by the maximum likelihood method.
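For quick reference while reading this preview (the notation here is ours, not part of the original front matter): writing $g$ for the true density, $f(x\,|\,\theta)$ for a candidate model with $k$ free parameters, and $\hat{\theta}$ for its maximum likelihood estimate from observations $x_1,\ldots,x_n$, the Kullback–Leibler discrepancy and the two criteria take their standard forms

\[
  K(g\,;f_\theta) \;=\; \int g(x)\,\log\frac{g(x)}{f(x\,|\,\theta)}\,dx,
\]
\[
  \mathrm{AIC} \;=\; -2\sum_{i=1}^{n}\log f(x_i\,|\,\hat{\theta}) + 2k,
  \qquad
  \mathrm{BIC} \;=\; -2\sum_{i=1}^{n}\log f(x_i\,|\,\hat{\theta}) + k\log n.
\]

Smaller values indicate a better model: the $2k$ term corrects the optimistic bias of the maximized log-likelihood as an estimate of the expected log-likelihood, while $k\log n$ arises from a Laplace approximation to the Bayesian marginal likelihood.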
The wide availability of fast and inexpensive computers enables the construction of various types of nonlinear models for analyzing data with complex structure. Nonlinear statistical modeling has received considerable attention in various fields of research, such as statistical science, information science, computer science, engineering, and artificial intelligence. Considerable effort has been made in establishing practical methods of modeling complex structures of stochastic phenomena. Realistic models for complex nonlinear phenomena are generally characterized by a large number of parameters. Since the maximum likelihood method then yields meaningless or unstable parameter estimates and leads to overfitting, such models are usually estimated by methods such as the maximum penalized likelihood method [Good and Gaskins (1971), Green and Silverman (1994)] or the Bayes approach. With the development of these flexible modeling techniques, it has become necessary to develop model selection and evaluation criteria for models estimated by methods other than the maximum likelihood method, relaxing the assumptions imposed on the AIC and BIC.

One of the main objectives of this book is to provide comprehensive explanations of the concepts and derivations of the AIC, BIC, and related criteria, together with a wide range of practical examples of model selection and evaluation. A secondary objective is to provide a theoretical basis for the analysis and extension of information criteria via a statistical functional approach. A generalized information criterion (GIC) and a bootstrap information criterion are presented, which provide unified tools for modeling and model evaluation for a diverse range of models, including various types of nonlinear models and model estimation procedures such as robust estimation, the maximum penalized likelihood method, and a Bayesian approach. A general framework for constructing the BIC is also described.

In Chapter 1, the basic concepts of statistical modeling are discussed. In Chapter 2, models are presented that express the mechanism of the occurrence of stochastic phenomena. Chapter 3, the central part of this book, explains the basic ideas of model evaluation and presents the definition and derivation of the AIC, in both its theoretical and practical aspects, together with a wide range of practical applications. Chapter 4 presents various examples of statistical modeling based on the AIC. Chapter 5 presents a unified information-theoretic approach to statistical model selection and evaluation problems in terms of a statistical functional and introduces the GIC [Konishi and Kitagawa (1996)] for the evaluation of a broad class of models, including models estimated by robust procedures, maximum penalized likelihood methods, and the Bayes approach. In Chapter 6, the GIC is illustrated through nonlinear statistical modeling in regression and discriminant analyses. Chapter 7 presents the derivation of the GIC and investigates its asymptotic properties, along with some theoretical and numerical improvements. Chapter 8 is devoted to the bootstrap version of information criteria, including the variance reduction technique that substantially reduces the variance associated with a Monte Carlo simulation. In Chapter 9, Bayesian approaches to model evaluation, such as the BIC, the ABIC [Akaike (1980b)], and the predictive information criterion [Kitagawa (1997)], are discussed. The BIC is also extended such that it can be applied to the evaluation of models estimated by the method of regularization. Finally, in Chapter 10, several model selection and evaluation criteria, such as cross-validation, generalized cross-validation, the final prediction error (FPE), Mallows' C_p, the Hannan–Quinn criterion, and ICOMP, are introduced as related topics.
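To make the preceding overview concrete, here is a minimal Python sketch (an editorial illustration, not code from the book; it assumes NumPy, a Gaussian error model, and hypothetical simulated data) of the kind of AIC-based order selection developed in Chapters 3 and 4:

import numpy as np

def gaussian_aic(y, y_hat, n_params):
    """AIC for a Gaussian regression model with the error variance profiled out.

    With ML variance sigma^2 = RSS/n, the maximized log-likelihood is
    -(n/2) * (log(2*pi*sigma^2) + 1), so up to an additive constant shared
    by all candidate models, AIC = n*log(RSS/n) + 2*n_params.
    """
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + 2 * n_params

# Simulated data: a quadratic trend plus Gaussian noise (true order 2).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.3, size=x.size)

# Fit polynomial models of order 0..5 and score each with AIC.
aic = {}
for order in range(6):
    coef = np.polyfit(x, y, order)
    y_hat = np.polyval(coef, x)
    aic[order] = gaussian_aic(y, y_hat, order + 2)  # coefficients + variance

best = min(aic, key=aic.get)
print("AIC by order:", {k: round(v, 1) for k, v in aic.items()})
print("selected order:", best)

With this setup the quadratic model should usually attain the smallest AIC, although the known tendency of the AIC to occasionally select an order slightly higher than the true one (see Section 3.5.2) can also appear.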
We would like to acknowledge the many people who contributed to the preparation and completion of this book. In particular, we would like to acknowledge with our sincere thanks Hirotugu Akaike, from whom we have learned so much about the seminal ideas of statistical modeling. We have been greatly influenced through discussions with Z. D. Bai, H. Bozdogan, D. F. Findley, Y. Fujikoshi, W. Gersch, A. K. Gupta, T. Higuchi, M. Ichikawa, S. Imoto, M. Ishiguro, N. Matsumoto, Y. Maesono, N. Nakamura, R. Nishii, Y. Ogata, K. Ohtsu, C. R. Rao, Y. Sakamoto, R. Shibata, M. S. Srivastava, T. Takanami, K. Tanabe, M. Uchida, N. Yoshida, T. Yanagawa, and Y. Wu.

We are grateful to three anonymous reviewers for comments and suggestions that allowed us to improve the original manuscript. Y. Araki, T. Fujii, S. Kawano, M. Kayano, H. Masuda, H. Matsui, Y. Ninomiya, Y. Nonaka, and Y. Tanokura read parts of the manuscript and offered helpful suggestions. We would especially like to express our gratitude to D. F. Findley for his previous reading of this manuscript and his constructive comments. We are also deeply thankful to S. Ono for her help in preparing the manuscript in LaTeX. John Kimmel patiently encouraged and supported us throughout the final preparation of this book. We express our sincere thanks to all of these people.

Sadanori Konishi
Genshiro Kitagawa
Fukuoka and Tokyo, Japan
February 2007

Contents

1 Concept of Statistical Modeling
  1.1 Role of Statistical Models
    1.1.1 Description of Stochastic Structures by Statistical Models
    1.1.2 Predictions by Statistical Models
    1.1.3 Extraction of Information by Statistical Models
  1.2 Constructing Statistical Models
    1.2.1 Evaluation of Statistical Models – Road to the Information Criterion
    1.2.2 Modeling Methodology
  1.3 Organization of This Book

2 Statistical Models
  2.1 Modeling of Probabilistic Events and Statistical Models
  2.2 Probability Distribution Models
  2.3 Conditional Distribution Models
    2.3.1 Regression Models
    2.3.2 Time Series Model
    2.3.3 Spatial Models

3 Information Criterion
  3.1 Kullback–Leibler Information
    3.1.1 Definition and Properties
    3.1.2 Examples of K-L Information
    3.1.3 Topics on K-L Information
  3.2 Expected Log-Likelihood and Corresponding Estimator
  3.3 Maximum Likelihood Method and Maximum Likelihood Estimators
    3.3.1 Log-Likelihood Function and Maximum Likelihood Estimators
    3.3.2 Implementation of the Maximum Likelihood Method by Means of Likelihood Equations
    3.3.3 Implementation of the Maximum Likelihood Method by Numerical Optimization
    3.3.4 Fluctuations of the Maximum Likelihood Estimators
    3.3.5 Asymptotic Properties of the Maximum Likelihood Estimators
  3.4 Information Criterion AIC
    3.4.1 Log-Likelihood and Expected Log-Likelihood
    3.4.2 Necessity of Bias Correction for the Log-Likelihood
    3.4.3 Derivation of Bias of the Log-Likelihood
    3.4.4 Akaike Information Criterion (AIC)
  3.5 Properties of MAICE
    3.5.1 Finite Correction of the Information Criterion
    3.5.2 Distribution of Orders Selected by AIC
    3.5.3 Discussion

4 Statistical Modeling by AIC
  4.1 Checking the Equality of Two Discrete Distributions
  4.2 Determining the Bin Size of a Histogram
  4.3 Equality of the Means and/or the Variances of Normal Distributions
  4.4 Variable Selection for Regression Model
  4.5 Generalized Linear Models
  4.6 Selection of Order of Autoregressive Model
  4.7 Detection of Structural Changes
    4.7.1 Detection of Level Shift
    4.7.2 Arrival Time of a Signal
  4.8 Comparison of Shapes of Distributions
  4.9 Selection of Box–Cox Transformations

5 Generalized Information Criterion (GIC)
  5.1 Approach Based on Statistical Functionals
    5.1.1 Estimators Defined in Terms of Statistical Functionals
    5.1.2 Derivatives of the Functional and the Influence Function
    5.1.3 Extension of the Information Criteria AIC and TIC
  5.2 Generalized Information Criterion (GIC)
    5.2.1 Definition of the GIC
    5.2.2 Maximum Likelihood Method: Relationship Among AIC, TIC, and GIC
    5.2.3 Robust Estimation
    5.2.4 Maximum Penalized Likelihood Methods

Description:
Winner of the 2009 Japan Statistical Association Publication Prize. The Akaike information criterion (AIC), derived as an estimator of the Kullback–Leibler information discrepancy, provides a useful tool for evaluating statistical models, and numerous successful applications of the AIC have been reported.
