Aman Ullah (Editor)
Semiparametric
and Nonparametric
Econometrics
With 12 Figures
Physica-Verlag Heidelberg
Editorial Board
Wolfgang Franz, University Stuttgart, FRG
Baldev Raj, Wilfrid Laurier University, Waterloo, Canada
Andreas Wörgötter, Institute for Advanced Studies, Vienna, Austria
Editor
Aman Ullah, Department of Economics,
University of Western Ontario, London, Ontario, N6A 5C2, Canada
First published in "Empirical Economics"
Vol. 13, No. 3 and 4, 1988
CIP-Kurztitelaufnahme der Deutschen Bibliothek
Semiparametric and nonparametric econometrics / Aman Ullah (ed.) - Heidelberg: Physica-Verl.;
New York: Springer, 1989
(Studies in empirical economics)
NE: Ullah, Aman [Hrsg.]
ISBN 978-3-642-51850-8 ISBN 978-3-642-51848-5 (eBook)
DOI 10.1007/978-3-642-51848-5
This work is subject to copyright. All rights are reserved, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks.
Duplication of this publication or parts thereof is only permitted under the provisions of the
German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a copyright fee
must always be paid. Violations fall under the prosecution act of the German Copyright Law.
© Physica-Verlag Heidelberg 1989
Softcover reprint of the hardcover 1st edition 1989
The use of registered names, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Printing: Kiliandruck, Grünstadt
Bookbinding: T. Gansert GmbH, Weinheim-Sulzbach
Introduction
Over the last three decades, much research in empirical and theoretical economics has
been carried out under various parametric assumptions. For example, a parametric functional
form, usually linear, is always assumed for the regression model, the heteroskedasticity,
and the autocorrelation. The errors are likewise assumed to follow a particular parametric
distribution, often the normal. A disadvantage of parametric econometrics based on these
assumptions is that it may not be robust to even slight inconsistencies between the data and
the chosen parametric specification. Indeed, any misspecification of the functional form
may lead to erroneous conclusions. In view of these problems, there has recently been
significant interest in semiparametric and nonparametric approaches to econometrics.
The semiparametric approach considers econometric models in which one component
has a parametric specification and the other, which is unknown, a nonparametric one
(Manski 1984 and Horowitz and Neumann 1987, among others). The purely nonparametric
approach, on the other hand, does not specify any component of the model a priori. The
main ingredient of this approach is the data-based estimation of the unknown joint density,
which is due to Rosenblatt (1956). Since then, and especially in the last decade, a vast
literature on nonparametric estimation has appeared in statistics journals. However, this
literature is mostly highly technical, which may partly explain why very little is known
about it in econometrics, although see Bierens (1987) and Ullah (1988).
The focus of this volume is to develop ways of making semiparametric and nonparametric
techniques accessible to applied economists. With this in view, the paper by Hartog and
Bierens explores a nonparametric technique for estimating and testing an earnings function
with discrete explanatory variables. Raj and Siklos analyse the role of fiscal policy in the
St. Louis model using parametric and nonparametric techniques. Then there are papers on
nonparametric kernel estimators and their applications. For example, Hong and Pagan
examine the performance of nonparametric kernel estimators of regression coefficients and
heteroskedasticity. They also compare the behaviour of nonparametric estimators with that
of Fourier series estimators. Another interesting application, by Moschini et al., is the
forecasting of U.S. hog supply.
A systematic development of nonparametric procedures for estimation and testing
is given in Ullah's paper. An important issue in applications of nonparametric
techniques is the selection of the window width. The survey by Marron on this topic is
extremely useful for practitioners as well as theoretical researchers. Scott also discusses
this issue in his paper, which deals with the analysis of income distributions by the
histogram method.
Finally, there are two papers on semiparametric econometrics. The paper by
Horowitz studies various semiparametric estimators for censored regression models,
and the paper by Tiwari et al. brings a Bayesian flavour to semiparametric
prediction problems.
The work on this volume was initiated after the first conference on semiparametric
and nonparametric econometrics was held at the University of Western Ontario in
May 1987. Most of the contributors to this volume participated in that conference,
though the papers contributed here are not necessarily those presented at
the conference. I take this opportunity to thank all the contributors, discussants and
reviewers, without whose help this volume would not have taken its present form.
I am also thankful to M. Parkin for his enthusiastic support of the conference and
other activities related to nonparametric econometrics. It was also a pleasure to coordinate
the work on this volume with B. Raj, co-editor of Empirical Economics.
Aman Ullah
University of Western Ontario
References
Bierens H (1987) Kernel estimation of regression function. In: Bewley TF (ed) Advances in
econometrics. Cambridge University Press, New York, pp 99-144
Horowitz J, Neumann GR (1987) Semiparametric estimation of employment duration models.
Econometric Reviews 5-40
Manski CF (1984) Adaptive estimation of nonlinear regression models. Econometric Reviews
3(2):149-194
Rosenblatt M (1956) Remarks on some nonparametric estimates of density function. Annals
of Mathematical Statistics 27:832-837
Ullah A (1988) Nonparametric estimation of econometric functions. Canadian Journal of
Economics 21:625-658
Contents
The Asymptotic Efficiency of Semiparametric Estimators for Censored Linear
Regression Models
J. L. Horowitz
Nonparametric Kernel Estimation Applied to Forecasting:
An Evaluation Based on the Bootstrap
G. Moschini, D. M. Prescott and T. Stengos
Calibrating Histograms with Application to Economic Data
D. W. Scott and H.-P. Schmitz
The Role of Fiscal Policy in the St. Louis Model:
Nonparametric Estimates for a Small Open Economy
B. Raj and P. L. Siklos
Automatic Smoothing Parameter Selection: A Survey
J. S. Marron
Bayes Prediction Density and Regression Estimation
- A Semiparametric Approach
R. C. Tiwari, S. R. Jammalamadaka and S. Chib
Nonparametric Estimation and Hypothesis Testing in Econometric Models
A. Ullah
Some Simulation Studies of Nonparametric Estimators
Y. Hong and A. Pagan
Estimating a Hedonic Earnings Function with a Nonparametric Method
J. Hartog and H. J. Bierens
The Asymptotic Efficiency of Semiparametric Estimators
for Censored Linear Regression Models 1
By J. L. Horowitz 2
Abstract: This paper presents numerical comparisons of the asymptotic mean square estimation
errors of semiparametric generalized least squares (SGLS), quantile, symmetrically censored least
squares (SCLS), and tobit maximum likelihood estimators of the slope parameters of censored
linear regression models with one explanatory variable. The results indicate that the SCLS
estimator is less efficient than the other two semiparametric estimators. The SGLS estimator is
more efficient than quantile estimators when the tails of the distribution of the random component
of the model are not too thick and the probability of censoring is not too large. The most efficient
semiparametric estimators usually have smaller mean square estimation errors than does the tobit
estimator when the random component of the model is not normally distributed and the sample
size is 500-1,000 or more.
1 Introduction
There are a variety of economic models in which data on the dependent variable are
censored. For example, observations of the durations of spells of employment or the
lifetimes of capital goods may be censored by the termination of data acquisition, in
which case the dependent variables of models aimed at explaining these durations are
censored. In models of labor supply, the quantity of labor supplied by an individual
may be continuously distributed when positive but may have positive probability of
being zero owing to the existence of corner-point solutions to the problem of choosing
the quantity of labor that maximizes an individual's utility. Labor supply then follows
a censored probability distribution.
1 I thank Herman J. Bierens for comments on an earlier draft of this paper.
2 Joel L. Horowitz, Department of Economics, University of Iowa, Iowa City, IA 52242, USA.
A typical model of the relation between a censored dependent variable $y$ and a
vector of explanatory variables $x$ is
$$y = \max(0, \alpha + \beta x + u), \tag{1}$$
where $\alpha$ is a scalar constant, $\beta$ is a vector of constant parameters, and $u$ is random. The
standard methods for estimating $\alpha$ and $\beta$, including maximum likelihood (Amemiya
1973) and two-stage methods (Heckman 1976, 1977), require the distribution of $u$ to
be specified a priori up to a finite set of constant parameters. Misspecification of this
distribution causes the parameter estimates to be inconsistent. However, economic
theory rarely gives guidance concerning the distribution of $u$, and the usual estimation
techniques do not provide convenient methods for identifying the distribution from
the data.
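As a rough illustration of model (1), the following Python sketch simulates censored data and applies ordinary least squares to the censored outcome. The parameter values, the standard normal distributions for $x$ and $u$, and all variable names are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameter values, not the paper's experimental design.
alpha, beta, n_obs = 0.0, 1.0, 20_000

x = rng.normal(size=n_obs)            # scalar explanatory variable
u = rng.normal(size=n_obs)            # random component, independent of x
y_latent = alpha + beta * x + u       # uncensored (unobserved) outcome
y = np.maximum(0.0, y_latent)         # observed outcome, censored at zero as in (1)

share_censored = np.mean(y == 0.0)

# Least-squares slope of the censored outcome on x (with an intercept).
ols_slope = np.polyfit(x, y, 1)[0]

print(f"share censored: {share_censored:.3f}")
print(f"OLS slope on censored y: {ols_slope:.3f} (true beta = {beta})")
```

Under these assumed settings about half of the observations are censored, and the least-squares slope falls well below the true value, which is why estimators designed for the censored model are needed.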
Recently, a variety of distribution-free or semiparametric methods for estimating
$\beta$ and, in some cases, either $\alpha$ or the distribution of $u$, have been developed (Duncan
1986; Fernandez 1986; Horowitz 1986, 1988; Powell 1984, 1986a, b). These methods
require the distribution of $u$ to satisfy regularity conditions, but it need not be known
otherwise. Among these methods, three - quantile estimation (Powell 1984, 1986b),
symmetrically censored least squares (SCLS) estimation (Powell 1986a) and semiparametric
M estimation (Horowitz 1988) - yield estimators of $\beta$ that are $N^{1/2}$-consistent
and asymptotically normal. These three methods permit the usual kinds of statistical
inferences to be made while minimizing the possibility of obtaining inconsistent
estimates of $\beta$ due to misspecification of the distribution of $u$.
None of the known $N^{1/2}$-consistent semiparametric estimators achieves the
asymptotic efficiency bound of Cosslett (1987). Intuition suggests that SCLS is likely
to be less efficient than the other two, but precise information on the relative
efficiencies of the different estimators is not available. In addition, limited empirical
results (Horowitz and Neumann 1987) suggest that semiparametric estimation may entail
a substantial loss of estimation efficiency relative to parametric maximum likelihood
estimation. It is possible, therefore, that with samples of the sizes customarily
encountered in applications, the use of semiparametric estimators causes a net increase in
mean square estimation error relative to the use of a parametric estimator based on a
misspecified model. However, precise information on the relative errors of parametric
and semiparametric estimators is not available.
Expressions for the asymptotic mean square estimation errors of the various
estimators are available, but their complexity precludes analytic comparisons of estimation
errors, even for very simple models. Consequently, it is necessary to use numerical
experiments to obtain insight into the relative efficiencies of the estimators. This
paper reports the results of a group of such experiments. The experiments consist of
evaluating numerically the asymptotic variances of three semiparametric estimators of
the slope parameter $\beta$ in a variety of censored linear regression models with one
explanatory variable. The estimators considered are quantile estimators, semiparametric
generalized least squares (SGLS) estimators (a special case of semiparametric M estimators),
and SCLS estimators. The numerically determined variances of these estimators
are compared with each other, with the Cosslett efficiency bound, and with an asymptotic
approximation to the mean square estimation error of the maximum likelihood
estimator of $\beta$ based on correctly and erroneously specified distributions of the random
error term $u$.
The next section describes the estimators used in the paper. Section 3 describes
the models on which the numerical experiments were based and presents the results.
Section 4 presents the conclusions of this research.
2 Description of the Estimators
It is assumed in the remainder of this paper that $x$ and $\beta$ are scalars, $u$ is independent
of $x$, and that estimation of $\alpha$ and $\beta$ is based on a simple random sample
$\{y_n, x_n : n = 1, \dots, N\}$ of the variables $(y, x)$ in equation (1).
a) Quantile Estimators
Let $\theta$ be any number such that $0 < \theta < 1$. Let $1(A)$ denote the indicator of the event
$A$. That is, $1(A) = 1$ if $A$ occurs and 0 otherwise. In quantile estimation based on the $\theta$
quantile, the estimators of $\alpha$ and $\beta$ satisfy
$$[a_N(\theta), b_N(\theta)] = \arg\min N^{-1} \sum_{n=1}^{N} \rho_\theta[y_n - \max(0, a + b x_n)], \tag{2}$$
where $a_N(\theta)$ and $b_N(\theta)$ are the estimators, and for any real $z$
$$\rho_\theta(z) \equiv [\theta - 1(z < 0)] z. \tag{3}$$
This estimator can be understood intuitively as follows. Let $u_\theta$ denote the $\theta$ quantile
of the distribution of $u$. Then $\max(0, u_\theta + \alpha + \beta x)$ is the $\theta$ quantile of $y$ conditional
on $x$. For any random variable $z$, $E\rho_\theta(z - \xi)$ is minimized over the parameter $\xi$ by
setting $\xi$ equal to the $\theta$ quantile of the distribution of $z$. Therefore,
$E\rho_\theta[y - \max(0, u_\theta + a + bx)]$ is minimized with respect to $(a, b)$ at $(a, b) = (\alpha, \beta)$.
Quantile estimation consists of minimizing the sample analogue of
$E\rho_\theta[y - \max(0, u_\theta + a + bx)]$.
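The minimization in (2) can be sketched in a few lines of Python. The code below is only an illustration under assumed settings: the simulated design, the brute-force grid search, and the helper names `check_loss`, `censored_quantile_objective`, and `grid_search` are hypothetical and are not the procedure used in the paper. The median ($\theta = 0.5$) is used with a symmetric error, so that $u_\theta = 0$ and the intercept estimate targets $\alpha$ itself.

```python
import numpy as np

def check_loss(z, theta):
    """rho_theta(z) = [theta - 1(z < 0)] * z, as in equation (3)."""
    z = np.asarray(z, dtype=float)
    return (theta - (z < 0).astype(float)) * z

def censored_quantile_objective(a, b, y, x, theta):
    """Sample objective in (2): mean check loss of y - max(0, a + b*x)."""
    return np.mean(check_loss(y - np.maximum(0.0, a + b * x), theta))

def grid_search(y, x, theta, a_grid, b_grid):
    """Brute-force minimizer over a grid (hypothetical helper, for illustration)."""
    best_value, best_a, best_b = np.inf, None, None
    for a in a_grid:
        for b in b_grid:
            value = censored_quantile_objective(a, b, y, x, theta)
            if value < best_value:
                best_value, best_a, best_b = value, a, b
    return best_a, best_b

rng = np.random.default_rng(1)
alpha, beta, n_obs, theta = 1.0, 1.0, 2_000, 0.5   # assumed values
x = rng.normal(size=n_obs)
u = rng.normal(size=n_obs)                          # median of u is 0, so u_theta = 0
y = np.maximum(0.0, alpha + beta * x + u)

grid = np.linspace(0.0, 2.0, 41)
a_hat, b_hat = grid_search(y, x, theta, grid, grid)
print(f"a_hat = {a_hat:.2f}, b_hat = {b_hat:.2f}")
```

A grid search is used here only because the objective is not convex in $(a, b)$ and the problem is two-dimensional; it is a convenience for the sketch, not a recommendation for applied work.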
Powell (1984, 1986b) has shown that, subject to regularity conditions, $b_N(\theta)$ converges
almost surely to $\beta$ and $a_N(\theta)$ converges almost surely to $\alpha + u_\theta$. Moreover,
$N^{1/2}[a_N(\theta) - \alpha - u_\theta, b_N(\theta) - \beta]'$ is asymptotically bivariate normally distributed
with mean zero and covariance matrix $V_Q(\theta) = \omega(\theta) D(\theta)^{-1}$, where
$$\omega(\theta) = \theta(1 - \theta)/[f(u_\theta)]^2, \tag{4}$$
$f$ is the probability density function of $u$,
$$D(\theta) = E[1(\alpha + u_\theta + \beta x > 0) X'X], \tag{5}$$
and $X = (1, x)$. This paper is concerned exclusively with the $(\beta, \beta)$ component of
$V_Q(\theta)$ - that is, with the asymptotic variance of $N^{1/2}[b_N(\theta) - \beta]$.
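To make formulas (4) and (5) concrete, the sketch below evaluates the $(\beta, \beta)$ element of $V_Q(\theta)$ numerically. It assumes, purely for illustration, that $u$ is standard normal and that $x$ is discrete with equal probability on a grid; the function name `quantile_slope_variance` and the parameter values are hypothetical and do not reproduce the paper's experiments.

```python
import numpy as np
from statistics import NormalDist

def quantile_slope_variance(theta, alpha, beta, x_values, x_probs, dist=NormalDist()):
    """(beta, beta) element of V_Q(theta) = omega(theta) * D(theta)^{-1}.

    Assumes x is discrete with support x_values and probabilities x_probs,
    and that u follows `dist` (standard normal by default).
    """
    u_theta = dist.inv_cdf(theta)                              # theta quantile of u
    omega = theta * (1.0 - theta) / dist.pdf(u_theta) ** 2     # equation (4)

    # D(theta) = E[1(alpha + u_theta + beta*x > 0) X'X] with X = (1, x), equation (5)
    uncensored = (alpha + u_theta + beta * x_values > 0).astype(float)
    w = x_probs * uncensored
    D = np.array([[np.sum(w),            np.sum(w * x_values)],
                  [np.sum(w * x_values), np.sum(w * x_values ** 2)]])
    V = omega * np.linalg.inv(D)
    return omega, V[1, 1]

# Assumed design: x uniform on an evenly spaced grid over [-1, 2].
x_values = np.linspace(-1.0, 2.0, 31)
x_probs = np.full(x_values.size, 1.0 / x_values.size)

omega, slope_variance = quantile_slope_variance(0.5, 0.5, 1.0, x_values, x_probs)
print(f"omega(0.5) = {omega:.4f}, asymptotic variance of slope = {slope_variance:.4f}")
```

For the median with a standard normal error, $\omega(0.5) = 0.25 \cdot 2\pi = \pi/2$, which gives a quick check on the first returned value.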
b) The Semiparametric Generalized Least Squares Estimator
If the cumulative distribution function of $u$ were known, $\beta$ could be estimated by
nonlinear generalized least squares (NGLS). Moreover, an asymptotically equivalent
estimator could be obtained by taking one Newton step from any $N^{1/2}$-consistent
estimator toward the NGLS estimator. When the cumulative distribution function of $u$ is
unknown, one might consider replacing it with a consistent estimator. The SGLS
method consists, essentially, of carrying out one-step NGLS estimation after replacing
the unknown distribution function with a consistent estimate.
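The consistent estimate used in the definition that follows is the Kaplan-Meier (1958) product-limit estimator. A minimal sketch of that estimator for data censored from below is given here; the function name `km_cdf_left_censored`, the simulated design, and the handling of ties are assumptions made for illustration rather than the paper's implementation. Because $y - \beta x = \max(-\beta x, \alpha + u)$, the sketch recovers the distribution of $\alpha + u$, that is, the distribution of $u$ shifted by the intercept.

```python
import numpy as np
from statistics import NormalDist

def km_cdf_left_censored(e, censored):
    """Product-limit estimate of P(V <= v) when e = max(c, V) is observed.

    censored[i] is True when only V_i <= e[i] is known (left censoring).
    Returns a step function that is evaluated as cdf(v).
    """
    e = np.asarray(e, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    # Distinct uncensored values and the number of uncensored observations at each.
    points, d = np.unique(e[~censored], return_counts=True)
    # "Risk set" under left censoring: observations with e <= point.
    m = np.searchsorted(np.sort(e), points, side="right")
    factors = 1.0 - d / m
    # tail[j] = product of factors[k] for k >= j; the CDF is 1 at and above the largest point.
    tail = np.cumprod(factors[::-1])[::-1]
    steps = np.append(tail, 1.0)

    def cdf(v):
        # The count of uncensored points <= v selects the step:
        # F(v) = product of factors over points strictly greater than v.
        return steps[np.searchsorted(points, v, side="right")]

    return cdf

rng = np.random.default_rng(2)
alpha, beta, n_obs = 1.0, 1.0, 5_000            # assumed values
x = rng.normal(size=n_obs)
u = rng.normal(size=n_obs)
y = np.maximum(0.0, alpha + beta * x + u)

b = beta                                         # evaluate at the true slope
residual = y - b * x                             # equals max(-b*x, alpha + u)
cdf_hat = km_cdf_left_censored(residual, censored=(y == 0.0))

v = 1.5
print(f"estimate: {cdf_hat(v):.3f}, target: {NormalDist().cdf(v - alpha):.3f}")
```

With no censored observations the function reduces to the ordinary empirical distribution function, which is a convenient sanity check.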
To define the SGLS estimator precisely, let $F$ denote the unknown cumulative
distribution function of $u$, and let $b_N$ be any $N^{1/2}$-consistent estimator of $\beta$. Given
any scalar $b$, let $F_N(\cdot, b)$ denote the Kaplan-Meier (1958) estimator of $F$ based on
$y - bx$. In other words, $y_n - b x_n$ is treated as an estimate of the censored but unobservable
random variable $y_n - \beta x_n$, and $F_N(\cdot, b)$ is the estimator of $F$ that is obtained by
applying the method of Kaplan and Meier (1958) to the sequence $\{y_n - b x_n\}$
$(n = 1, \dots, N)$. Let $F(\cdot, b)$ denote the almost sure limit of $F_N(\cdot, b)$ as $N \to \infty$. It follows
from the strong consistency of the Kaplan-Meier estimator based on $y - \beta x$ that
$F(\cdot, \beta) = F(\cdot)$. For each $b$, let $F_b(\cdot, b) = \partial F(\cdot, b)/\partial b$, and let $F_{Nb}(\cdot, b)$ be the
consistent estimator of $F_b(\cdot, b)$ defined by
(6)