
Maximum Likelihood Estimation of Functional Relationships

117 Pages·1992·2.125 MB·English
Lecture Notes in Statistics
Edited by J. Berger, S. Fienberg, J. Gani, K. Krickeberg, I. Olkin, and B. Singer

69

Nico J. D. Nagelkerke
Maximum Likelihood Estimation of Functional Relationships

Springer-Verlag
Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest

Author
Nico J. D. Nagelkerke
International Statistical Institute
Prinses Beatrixlaan, P.O. Box 950
2270 AZ Voorburg, The Netherlands

Mathematical Subject Classification: 62A10, 62B05, 62F03, 62F05, 62F10, 62F12, 62H25, 62J99

ISBN-13: 978-0-387-97721-8
e-ISBN-13: 978-1-4612-2858-5
DOI: 10.1007/978-1-4612-2858-5

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1992
Softcover reprint of the hardcover 1st edition 1992

Typesetting: Camera ready by author
47/3140-543210 - Printed on acid-free paper

PREFACE

The theory of functional relationships concerns itself with inference from models with a more complex error structure than that found in regression models. We are familiar with the bivariate linear relationship having measurement errors in both variables, and with the fact that the standard regression estimator of the slope underestimates the true slope. One complication with inference about parameters in functional relationships is that many of the standard properties of likelihood theory do not apply, at least not in the form in which they apply to, e.g., regression models.
This is probably one of the reasons why these models are not adequately discussed in most general books on statistics, despite their wide applicability. In this monograph we will explore the properties of likelihood methods in the context of functional relationship models. Both full and conditional likelihood methods are considered, and possible modifications to these methods are proposed where necessary. Apart from exploring the theory itself, emphasis is placed upon the derivation of useful estimators and their second moment properties. No attempt is made to be mathematically rigorous. Proofs are usually outlined with extensive use of the Landau O(.) and o(.) notations. It is hoped that this will provide more insight than the inevitably lengthy proofs meeting strict standards of mathematical rigour. Although many authors consider structural and functional relationships to be the same topic, the likelihood theory of these two models is essentially different. In addition, the "randomness" assumptions made in structural relationships are highly unrealistic in many, especially natural, sciences. Therefore, in this monograph structural relationships are introduced only when strictly necessary. Topics such as identification of (structural) relationships by means of higher moments have consequently been omitted. This monograph is not intended to be a survey of what is known about functional relationships. Neither is it meant to be a cookbook for problem solving in this field, although (multivariate) linear models have received a great deal of attention because of their practical usefulness. Several numerical examples are presented to illustrate the theory. This monograph has two purposes. The first is to explore the potential of likelihood theory in the field of functional relationships. The second is to summarize some classical results in this field.
These are results which, I feel, partly in view of the frequency with which they are encountered in practice, but also because they constitute an essential element of the science of statistics, should be part of the knowledge of every statistician.

N.J.D.N.

Table of Contents

Preface
1: Introduction
  I. Introduction
  II. Inference
  III. Controlled variables
  IV. Outline of the following chapters
2: Maximum likelihood estimation of functional relationships
  I. Introduction
  II. Maximization of the likelihood under constraints
    A. Direct elimination
    B. The Lagrange multiplier method
  III. The conditional likelihood
  IV. Maximum likelihood estimation for multivariate normal distributions with known covariance matrix
    A. Derivation of the normal equations
    B. The simple linear functional relationship
    C. Estimation using Sprent's generalized residuals
    D. Non-linear models
    E. Inconsistency of non-linear ML estimators
    F. Linearization of the normal equations
  V. Maximum likelihood estimation for multivariate normal distributions with unknown covariance matrix
    A. Estimation with replicated observations
    B. Estimation without replicated observations
    C. A saddlepoint solution to the normal equations
  VI. Covariance matrix of estimators
    A. The asymptotic method
    B. The bootstrap
    C. The jackknife
  VII. Error distributions depending on the true variables
  VIII. Proportion of explained variation
3: The multivariate linear functional relationship
  I. Introduction
  II. Identifiability
  III. Heteroscedastic errors
    A. Known error covariance matrix
    B. Unknown error covariance matrix
  IV. Homoscedastic errors
    A. Known error covariance matrix
    B. Misspecification
    C. The eigenvalue method
    D. Unknown error covariance matrix
  V. Factor space
  VI. The asymptotic distribution of the parameter estimators
    A. Asymptotic covariance matrix
    B. Consistency and asymptotic normality
    C. Hypothesis tests
  VII. Replicated observations
  VIII. Instrumental variables
References
Subject index

1. INTRODUCTION

I. Introduction

In many sciences the use of mathematical models to represent relationships between variables is well established. A classical example taken from physics is Boyle's law, which functionally relates the pressure (P) and volume (V) of a fixed amount of gas at a constant temperature,

P·V = constant    (1.1)

Another well known example stemming from physics is Newton's law of gravitation,

F = g·m_1·m_2 / d²    (1.2)

which relates the force of attraction (F) due to gravitation between two bodies with masses m_1 and m_2 to the distance (d) between them. In economics, mathematical models are used to study the relationship between variables like taxation levels, interest rates, GNP, growth rates and so on. For instance, an economist might model the relationship between the output (P) of steel plants and the cost in capital (C) and labour (L) using a so-called Cobb-Douglas production function,

P = γ·C^α·L^β    (1.3)

In psychology, mathematical methods are used to "explain" the scores of individuals on many different test items (x_1,...,x_p) in terms of (linear) combinations of fewer underlying common factors, i.e. "traits" (u), e.g. intelligence, speed, spatial orientation, and "factors" (e) that pertain to individual test items only,

x = Lu + e    (1.4)

where L is the matrix which relates the vector u to the vector x. In physiology the behaviour of skeletal muscles may be modelled by a hyperbolic relationship between force of contraction (F) and velocity (V),

F·V + a·F + b·V + c = 0    (1.5)

and so forth. Often, models contain unknown parameters. This is usually due to the fact that the theory which has led to the formulation of the model is only able to predict the functional form of the model. This is the case with Newton's gravitational law, where g, the gravitational constant (g.c.)
is a parameter whose value cannot be deduced from Newton's theory. Experiments or observational data are then required to estimate the unknown parameters. When the variables related by the model are measured imprecisely, i.e. subject to measurement error, one must take into account the structure of the measurement errors; that is, one must make a model of the errors themselves, thereby converting a deterministic model relating the underlying (true) values of the variables into a stochastic, probabilistic or statistical model relating the observations of the variables. Such stochastic models are an extension of deterministic models in the sense that, in addition to the deterministic relationships relating underlying unobserved quantities (the true ("latent") values of the variables, e.g. true masses, true forces) to each other, they also specify in probabilistic terms the relationship between these underlying quantities and their measurements. For instance, Newton's model (1.2) could be extended with a stochastic model for the measurement of masses. Assuming that the measurement errors are (at least approximately) normally (Gaussian) distributed with zero mean (an assumption that will be made throughout the book), such a model could be written

m_1 = μ_1 + e_1    (1.6.1)
m_2 = μ_2 + e_2    (1.6.2)

where, following the tradition of denoting unobservables and parameters (which are also unobservable) by Greek symbols, the true mass is denoted by μ, which translates the normally distributed random error variable e into the random variable m which is thus observed. In statistics, models like (1.6) are called "functional relationships". When the true values of the variables related by the functional relationship are random variables (e.g. the factors u in model (1.4)), it is common to speak of "structural relationships". In sciences such as psychology or sociology, it may often be reasonable to assume that the underlying true variables are random.
It is, however, hardly ever necessary to make such an assumption in order to study the relationships among variables. Within the natural sciences, the assumption of random true variables (mass, force) is usually absurd. In this book we will, therefore, be primarily concerned with functional relationships.

II. Inference

How can one go about making inference about the unknown parameters, i.e. either estimating them or formulating statements of (un)certainty about them? A method which is particularly appropriate for parameter estimation in complex situations is the maximum likelihood (ML) method. This method takes as parameter estimates those values of the parameters which maximize the likelihood, the likelihood being defined as the differential (element) of the simultaneous probability distribution function of all observations, viewed as a function of the unknown parameters. Let z denote all (stochastic) observations and π all parameters in the model; then the likelihood lik(π|z) is

lik(π|z) = dPr(z|π) = p(z|π)dz    (1.7)

where p(z|π) is the derivative of Pr(z|π) with respect to z, i.e. p(z|π) is a density. It is common to ignore the differential element dz in the likelihood since it does not carry any information about the parameter(s) π. However, this differential element dz cannot be ignored in transformations of variables when the transformation rules ("laws") depend upon unknown parameters (Kalbfleisch and Sprott (1970)). In such cases the determinant of the Jacobian (functional determinant) is a function of the parameters π and should be taken into consideration. Let t = T(z,π) be a (one to one) transformation of the observations z. Then

p(z|π)dz = p(t|π)dt    (1.8)

Hence,

p(z|π) = p(t|π)·|∂t/∂z|    (1.9)

where |∂t/∂z| denotes the Jacobian of the transformation. Let us take as an example the bivariate linear functional relationship.
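Before turning to that example, rule (1.9) can be checked numerically for a parameter-dependent transformation. The particular model used here (z ~ N(0, π²) with t = T(z, π) = z/π, so that t is standard normal and |∂t/∂z| = 1/π) is an illustrative assumption, not a model taken from the text:

```python
# Numeric check of (1.9): p(z | pi) = p(t | pi) * |dt/dz|.
# Illustrative model (not from the text): z ~ N(0, pi^2), t = z / pi,
# so t ~ N(0, 1) and the Jacobian |dt/dz| equals 1 / pi.
import math

def norm_pdf(x, sd=1.0):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

pi_par = 2.5              # the unknown parameter (illustrative value)
z = 1.7                   # an arbitrary observation
t = z / pi_par            # parameter-dependent transformation
jacobian = 1.0 / pi_par   # |dt/dz|

lhs = norm_pdf(z, sd=pi_par)          # p(z | pi)
rhs = norm_pdf(t, sd=1.0) * jacobian  # p(t | pi) * |dt/dz|
assert abs(lhs - rhs) < 1e-12
```

Dropping the Jacobian factor here would multiply the likelihood by π, which clearly does carry information about the parameter; this is exactly the situation in which dz cannot be ignored.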
Consider a simple linear functional relationship between variables ξ_i and η_i,

η_i = β·ξ_i    (i=1,...,n)    (1.10)

Let y_i = η_i + e_i and x_i = ξ_i + ε_i be observed, where the e_i and ε_i are mutually independent and identically distributed (i.i.d.) variates. The likelihood is, ignoring the differential elements dx_i and dy_i,

∏_i φ(y_i − β·ξ_i)·φ(x_i − ξ_i)    (1.11)

where φ denotes the normal N(0,σ²) density, or

(2πσ²)^(−n) exp{ −(1/(2σ²)) ∑_i [ (y_i − β·ξ_i)² + (x_i − ξ_i)² ] }    (1.12)

Consider the orthogonal (independence preserving) transformation,

u_i = x_i + β·y_i    (1.13.1)
v_i = y_i − β·x_i    (1.13.2)

The determinant of the Jacobian of the transformation is 1+β², and the likelihood expressed in terms of v_i and u_i is

∏_i (2πσ²(1+β²))^(−1) exp{ −[ (u_i − (1+β²)ξ_i)² + v_i² ] / (2σ²(1+β²)) }    (1.14)

Because of our ignorance about the ξ_i we cannot use the part of the likelihood containing u. However, since the u_i and v_i are mutually independent, we can maximize the second part of the likelihood, pertaining to v,

∏_i (2πσ²(1+β²))^(−1/2) exp{ −v_i² / (2σ²(1+β²)) }    (1.15)

Taking logarithms and differentiating with respect to β yields

(1−β²)S_xy + β(S_yy − S_xx) = 0    (1.16)

where S_xy = ∑_i x_i·y_i, S_xx = ∑_i x_i², S_yy = ∑_i y_i². After solving for β and taking the root which maximizes (1.15) we find

β̂ = { S_yy − S_xx + [ (S_yy − S_xx)² + 4·S_xy² ]^(1/2) } / (2·S_xy)    (1.17)

This is the well known solution to the problem of orthogonal least squares (Kendall and Stuart (1967)). Note that the regression (of y on x) estimator of β is S_xy/S_xx, with expected value β(S_xx − nσ²)/S_xx. This regression estimator underestimates β; it is thus said to be "attenuated". If σ² were known, we could have estimated β by the "deattenuated" regression estimator β̂ = S_xx(S_xx − nσ²)^(−1)·β̂_regr (in fact, cf. chapter 3, this is a modified maximum likelihood estimator for known σ² = var(ε) and unknown var(ξ)). Although the use of transformations which depend upon unknown parameters may give very simple answers to some problems, this approach can only be attractive in the presence of a theory that guides us in finding appropriate transformations.
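The attenuation of the regression estimator, and the behaviour of the orthogonal least-squares estimator (1.17), can be seen in a small simulation. This is a sketch only; the values of β, σ, the grid of true values ξ_i, and the sample size are illustrative choices, not taken from the text:

```python
# Simulation sketch of the bivariate functional relationship (1.10):
# eta_i = beta * xi_i, observed through x_i = xi_i + error and
# y_i = eta_i + error with equal error variances.  Compares the
# attenuated regression estimator S_xy / S_xx with the orthogonal
# least-squares estimator (1.17).  All numbers are illustrative.
import math
import random

random.seed(1)
beta, sigma, n = 2.0, 0.5, 5000
xi = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]   # fixed true values
x = [v + random.gauss(0.0, sigma) for v in xi]
y = [beta * v + random.gauss(0.0, sigma) for v in xi]

S_xx = sum(a * a for a in x)
S_yy = sum(b * b for b in y)
S_xy = sum(a * b for a, b in zip(x, y))

beta_regr = S_xy / S_xx   # attenuated: biased towards zero
beta_ols = (S_yy - S_xx
            + math.sqrt((S_yy - S_xx) ** 2 + 4.0 * S_xy ** 2)) / (2.0 * S_xy)

print(beta_regr, beta_ols)   # beta_regr well below 2, beta_ols close to 2
```

With the chosen settings the true slope is 2, while the regression estimator converges to roughly β·var(ξ)/(var(ξ)+σ²) ≈ 1.14, illustrating the attenuation discussed above.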
For some models, as we shall see further on, such a theory is available in the form of the theory of conditional likelihoods. In the absence of such a theory, however, we shall use the likelihood in its natural (full) form, in which the differential element dz can be ignored. For the simple bivariate functional relationship (1.10) this "natural" likelihood is

∏_{i=1}^n φ(y_i − β·ξ_i)·φ(x_i − ξ_i)    (1.18)

Unfortunately, the likelihood is a function of additional parameters {ξ_i} which do not interest us; that is, they are "nuisance" parameters, and they are associated with single observations only. Such parameters are called "incidental". Taking logarithms we find as the loglikelihood,

−n·log(2πσ²) − (1/(2σ²)) ∑_{i=1}^n [ (y_i − β·ξ_i)² + (x_i − ξ_i)² ]    (1.19)
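The trouble these incidental parameters cause can be previewed numerically. For fixed β, the loglikelihood (1.19) is maximized over each ξ_i by ξ̂_i = (x_i + β·y_i)/(1+β²), and substituting these values gives an ML estimator of σ² that converges to σ²/2 rather than σ², the classical inconsistency associated with incidental parameters. The sketch below demonstrates this at the true β; all simulation settings (β, σ, the distribution of the ξ_i, the sample size) are illustrative assumptions:

```python
# Sketch of the incidental-parameter problem for likelihood (1.18)/(1.19):
# profiling out each xi_i at fixed beta gives
#   xi_hat_i = (x_i + beta * y_i) / (1 + beta^2),
# and the resulting ML estimator of sigma^2 tends to sigma^2 / 2.
# All simulation settings are illustrative choices, not from the text.
import random

random.seed(7)
beta, sigma, n = 1.5, 1.0, 20000
xi = [random.uniform(-2.0, 2.0) for _ in range(n)]  # fixed unknown true values
x = [v + random.gauss(0.0, sigma) for v in xi]
y = [beta * v + random.gauss(0.0, sigma) for v in xi]

# Profile out the incidental parameters at the true beta.
xi_hat = [(a + beta * b) / (1.0 + beta ** 2) for a, b in zip(x, y)]
rss = sum((a - u) ** 2 + (b - beta * u) ** 2
          for a, b, u in zip(x, y, xi_hat))
sigma2_ml = rss / (2.0 * n)   # ML estimator of sigma^2 (2n observations)

print(sigma2_ml)   # close to sigma^2 / 2 = 0.5, not sigma^2 = 1.0
```

The halving arises because each pair (x_i, y_i) contributes two observations but also one freshly estimated ξ_i, so the number of parameters grows with the sample size.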
