How Efficient is the Kalman Filter at
Estimating Affine Term Structure Models?†
Jens H. E. Christensen
Jose A. Lopez
and
Glenn D. Rudebusch
Federal Reserve Bank of San Francisco
101 Market Street, Mailstop 1130
San Francisco, CA 94105
Preliminary and incomplete draft. Comments are welcome.
Abstract
We perform a carefully orchestrated simulation study to analyze the bias of the Kalman
filter in estimating arbitrage-free Nelson-Siegel (AFNS) models with and without stochastic
volatility. For Gaussian AFNS models, we document significant finite-sample bias in
the estimated mean-reversion parameters. Since the Kalman filter is consistent and ef-
ficient for that model class, this exercise provides a measure of the finite-sample bias
that will affect any estimator. For AFNS models with stochastic volatility, significant
finite-sample upward estimation bias remains, but it is not materially larger than in the
Gaussian model. Hence, we recommend estimation based on the Kalman filter for both
types of AFNS models and corresponding affine term structure models in general.
JEL Classification: C13, C58, G12, G17.
Keywords: arbitrage-free Nelson-Siegel models, finite-sample bias, stochastic volatility
†We thank seminar participants at the Second Humboldt Copenhagen Conference on Financial Econometrics
for comments on an earlier draft of this paper. The views in this paper are solely the responsibility of the
authors and should not be interpreted as reflecting the views of the Federal Reserve Bank of San Francisco or
the Board of Governors of the Federal Reserve System.
This version: August 24, 2015.
1 Introduction
Interest rate volatility is a topic of great research interest given its role in derivatives pricing
and portfolio risk management. However, as compared to the empirical results presented
in the extensive GARCH literature, the results of modeling interest rate volatility within
the more commonly used affine, arbitrage-free models of the term structure have been less
clear-cut, partly due to the difficulty in estimating their parameters.
Estimation of flexible affine term structure models is complicated and time consuming,
partly due to the fairly large number of parameters, and partly due to the latent nature of the
state variables in such models. The latter causes the estimation to be plagued by numerous
local maxima that are distinct in the sense that they are not invariant affine transformations1
of each other and therefore may have very different economic implications; see Duffee (2011)
and Kim and Orphanides (2012) for discussions of these issues.
To overcome those problems, Christensen et al. (2011, henceforth CDR) introduce the
affine arbitrage-free class of Nelson-Siegel term structure models (henceforth referred to as
AFNS models). These are affine term structure models that preserve the level, slope, and
curvature factor loading structure in the bond yield function known from the standard Nelson
and Siegel (1987) yield curve model. These models are easy to estimate because the role
of each factor is predetermined and does not vary for any admissible set of parameters.
Furthermore, in that model class, the state variables are Gaussian with constant volatility.
As a consequence, the models can be estimated with the standard Kalman filter, which
is equivalent to exact maximum likelihood estimation and therefore is both efficient and
consistent in the limit. However, despite its consistency and efficiency, the Kalman filter
remains subject to any unavoidable finite-sample bias.
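To make the estimation idea concrete, the following is a minimal sketch of the Gaussian log-likelihood that the Kalman filter delivers for a generic linear state-space model. The function name `kalman_loglik`, the argument names, and the state-space notation (transition `x_t = c + F x_{t-1} + eta_t`, measurement `y_t = a + H x_t + eps_t`) are illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def kalman_loglik(y, c, F, Q, a, H, R, x0, P0):
    """Gaussian log-likelihood of y (T x m) for the linear state-space model
    x_t = c + F x_{t-1} + eta_t, eta_t ~ N(0, Q)   (transition)
    y_t = a + H x_t + eps_t,     eps_t ~ N(0, R)   (measurement).
    All names are hypothetical; x0 and P0 are the initial state mean and covariance.
    """
    x, P = x0.copy(), P0.copy()
    m = y.shape[1]
    loglik = 0.0
    for t in range(y.shape[0]):
        # Prediction step: propagate the state mean and covariance.
        x = c + F @ x
        P = F @ P @ F.T + Q
        # Prediction error and its covariance.
        v = y[t] - (a + H @ x)
        S = H @ P @ H.T + R
        _, logdet = np.linalg.slogdet(S)
        loglik += -0.5 * (m * np.log(2.0 * np.pi) + logdet + v @ np.linalg.solve(S, v))
        # Update step: condition the state on the new observation.
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ v
        P = P - K @ H @ P
    return loglik
```

Maximizing this function over the model parameters gives the maximum likelihood estimates when the state variables are Gaussian; for non-Gaussian states the same recursion only matches the first two moments, so the result is a quasi-likelihood.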
In a recent paper, Christensen et al. (2014a, henceforth CLR) generalize the AFNS
model framework introduced in CDR by incorporating stochastic volatility into the state
variables. These models are also easy to estimate, again due to the imposed Nelson-Siegel
factor loading structure. CLR estimate their models using the standard Kalman filter and
report model fit on par with the original Gaussian AFNS model. Now, though, the Kalman
filter is no longer efficient and potentially inconsistent because it only approximates the true
probability distribution of the state variables by matching the first and second moment,
essentially treating the state variables as if they were Gaussian. Thus, in addition to any
finite-sample bias, there is potential for added bias arising from the fact that the Kalman filter
is only an approximation to the true likelihood function. Despite this concern, Kalman filter-based
estimation of affine term structure models with stochastic volatility is relatively common
in empirical term structure analysis,2 but the size of any bias in realistic three-factor settings
1See Dai and Singleton (2000) for the definition of this concept.
2For examples, see Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al.
(2015).
has not been studied in detail in the existing term structure literature (to the best of our
knowledge). In this paper, we focus on the AFNS model classes with and without stochastic
volatility. This provides us with an ideal setting to study both the finite-sample bias and the
added bias from using the Kalman filter for estimation of affine non-Gaussian models. As an
alternative, Joslin et al. (2011) and Hamilton and Wu (2012) provide identification schemes
that facilitate the estimation of affine Gaussian models in that they avoid the filtering of the
unobserved latent factors.3 However, it is not obvious if or how those approaches extend to
affine non-Gaussian models. Thus, the AFNS-based identification of affine Gaussian models
provided by CDR and extended by CLR to affine non-Gaussian models remains an important
contribution without which the analysis in this paper would not have been feasible.4
Because interest rates are highly persistent, empirical autoregressive models, including
dynamic term structure models, suffer from substantial small-sample estimation bias. Specif-
ically, model estimates will generally be biased toward a dynamic system that displays much
less persistence than the true process (so estimates of the real-world mean-reversion matrix,
$K^P$, are upward biased). Furthermore, if the degree of interest rate persistence is underestimated,
future short rates would be expected to revert to their mean too quickly causing
their expected longer-term averages to be too stable. Therefore, the bias in the estimated
dynamics distorts the decomposition of yields and contaminates estimates of long-maturity
term premiums.
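The small-sample bias described above can be illustrated with a simple Monte Carlo sketch for a univariate AR(1) process. This is not the paper's simulation design: the function name `ar1_ols_bias`, the persistence value `phi=0.98`, and the sample length of 120 observations (ten years of monthly data) are assumptions chosen for illustration only.

```python
import numpy as np

def ar1_ols_bias(phi=0.98, n_obs=120, n_sims=2000, seed=0):
    """Average OLS estimate of the AR(1) coefficient minus its true value.

    Hypothetical setup: x_t = phi * x_{t-1} + eps_t with standard normal shocks,
    estimated by OLS with an intercept on samples of length n_obs.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_sims)
    for i in range(n_sims):
        eps = rng.standard_normal(n_obs)
        x = np.empty(n_obs)
        # Start from the stationary distribution of the process.
        x[0] = eps[0] / np.sqrt(1.0 - phi**2)
        for t in range(1, n_obs):
            x[t] = phi * x[t - 1] + eps[t]
        # Regress x_t on a constant and x_{t-1}.
        regressors = np.column_stack([np.ones(n_obs - 1), x[:-1]])
        beta = np.linalg.lstsq(regressors, x[1:], rcond=None)[0]
        estimates[i] = beta[1]
    return estimates.mean() - phi
```

A negative return value means the estimated autoregressive coefficient is too low on average. Since a lower autoregressive coefficient corresponds to faster mean reversion, this is the discrete-time counterpart of an upward bias in the mean-reversion parameters.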
To study this finite-sample problem in detail, we start out simulating and estimating
Gaussian AFNS models for which the Kalman filter is an efficient estimator as already noted.
We simulate short ten-year and long forty-year samples to study the finite-sample bias problem
directly. We allow for low and high noise to assess how data quality affects our conclusions.
Furthermore, for the benchmark Gaussian AFNS model, we also analyze samples at weekly
frequency in addition to the monthly frequency used throughout, but since this turns out to
matter little for our conclusions, we do not repeat this exercise for the models with stochastic
volatility. We then proceed to simulate and estimate AFNS models with stochastic volatility
in a similarly careful way.
Our findings can be summarized as follows.
In the Gaussian AFNS model, there is a significant finite-sample upward bias in the
estimates of the mean-reversion rate of the Nelson-Siegel level factor due to its near unit-root
property. In addition, there is a more modest, finite-sample upward estimation bias in the
mean-reversion parameters for the slope and curvature factor thanks to their lower persistence.
Importantly, there is no finite-sample bias in the estimated mean parameters of any of the
factors. Furthermore, all parameters that relate to the model’s Q-dynamics used for pricing
3Andreasen and Christensen (2015) offer an alternative way of estimating non-Gaussian term structure
models.
4The related literature includes Duan and Simonato (1995), Lund (1997), De Jong (2000), Duffee and
Stanton (2004), and Duffee and Stanton (2008) among others.
and fitting the cross section of yields are well determined and without any measurable bias.
This property turns out to hold for non-Gaussian models as well. However, the accuracy of
the estimated Q-dynamics is affected by the amount of noise in the data. Finally, the data
frequency plays no role for these conclusions as both weekly and monthly simulated data
produce similar results. However, in the weekly samples, the parameter standard deviations
estimated from the optimized likelihood function in the Kalman filter tend to be too low. This
makes the upward biased mean-reversion parameters appear even more significant than they
are, which complicates model selection. Hence, we document one of the unusual situations
where more data do not necessarily lead to better inference. For selecting the appropriate
specification of the mean-reversion matrix, which matters for forecast performance, term
premium decompositions etc., we therefore recommend relying on monthly rather than weekly
data.
We then proceed to simulate and estimate AFNS models with stochastic volatility gener-
ated by the level factor in one set of exercises, and with stochastic volatility generated by the
curvature factor in another set of exercises.
First, we find that the finite-sample upward bias in the estimated mean-reversion parameters
is not materially different in the models with stochastic volatility relative to the Gaussian
AFNS model. The intuition behind this result is that the time series properties of the three
state variables are primarily determined by the Nelson-Siegel factor loading structure, which
is almost identical for all AFNS models with and without stochastic volatility. For similar
reasons we also see little bias in the estimated mean parameters in these models.
Second, we analyze in detail the ability of the Kalman filter to estimate the volatility
sensitivity parameters that determine the degree to which the stochastic volatility factor
affects the volatility of the unconstrained factors in each model. For U.S. Treasury yields,
these sensitivity parameters are often estimated to be negligible (see CLR for an example) and
we report similar results. To assess whether this is a general weakness of the Kalman filter
when applied to models with stochastic volatility, we perform separate simulation experiments
with large values for the sensitivity parameters. Our results show that the Kalman filter is in
fact able to estimate them with some accuracy. Thus, when their estimated values are tiny
and insignificant, it is most likely because the data call for them to be so.
Third, in general, it is the case that the parameters that primarily affect the models’
fit to the cross section of yields tend to have small or no bias, but their accuracy varies
positively with the quality of the data. We note one exception though. In the AFNS model
with stochastic volatility generated by the curvature factor, the mean of the curvature factor
under the risk-neutral Q measure is not well identified. However, we show that this can be
solved at practically no cost by fixing it at a low value that is exactly high enough that the
curvature factor does not reach its zero lower bound.
Another key finding is that the Kalman filter is as efficient at filtering state variables in
non-Gaussian models as it is at filtering in Gaussian models, in particular under optimal con-
ditions with high-quality data. As a consequence, the fit of the AFNS models with stochastic
volatility is as good as, if not better than, the fit of the Gaussian AFNS model.
Finally, in light of the low interest rate environment in recent years, we emphasize that
our study has no bearing on how Kalman filter-based estimations perform when yields are near
their lower bound and exhibit asymmetric behavior for that reason. This is a task that we
leave for future research. Still, the results we report could serve as a useful benchmark even
for that kind of exercise.
The rest of the paper is structured as follows. Section 2 describes our sample of U.S.
Treasury yields and motivates our focus on the Nelson-Siegel yield curve model, while Section
3 briefly details the original Gaussian AFNS model of the term structure. Section 4 goes on
to describe the five classes of AFNS models with stochastic volatility dynamics introduced
in CLR. Section 5 details the estimation methodology, while Section 6 describes the simu-
lation study. Section 7 contains the results from the simulation exercises for the Gaussian
AFNS model, while Sections 8 and 9 contain the results for the AFNS models with stochastic
volatility generated by the level and curvature factor, respectively. Section 10 concludes the
paper.
2 Motivation for the Nelson-Siegel Model
In this section, we motivate our focus on the Nelson-Siegel yield curve model using principal
components analysis. Recall that principal components analysis decomposes the observed
data into a number of factors equal to the number of time series and ranks those factors
according to how much of the observed variation each factor explains.
The specific Treasury yields we analyze to obtain realistic parameter sets to be used
in our simulation exercises are zero-coupon yields constructed by the method described in
Gürkaynak et al. (2007) and briefly detailed here.5 For each business day a zero-coupon yield
curve of the Svensson (1995)-type
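$$y_t(\tau) = \beta_0 + \frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau}\beta_1 + \Big[\frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau} - e^{-\lambda_1 \tau}\Big]\beta_2 + \Big[\frac{1 - e^{-\lambda_2 \tau}}{\lambda_2 \tau} - e^{-\lambda_2 \tau}\Big]\beta_3$$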
is fitted to price a large pool of underlying off-the-run Treasury bonds. Thus, for each business
day, we have the fitted values of the four coefficients $(\beta_0(t), \beta_1(t), \beta_2(t), \beta_3(t))$ and two
parameters $(\lambda_1(t), \lambda_2(t))$. From this data set zero-coupon yields for any relevant maturity
can be calculated. As demonstrated by Gürkaynak et al. (2007), this discount function prices
the underlying pool of bonds extremely well. By implication, the zero-coupon yields derived
from this approach constitute a very good approximation to the true underlying Treasury
5The Board of Governors of the Federal Reserve updates the data on its website at
http://www.federalreserve.gov/pubs/feds/2006/index.html.
[Figure 1: line plot of the 3-month, 1-year, 5-year, and 10-year yields. Vertical axis: rate in percent (0 to 10); horizontal axis: 1988 to 2008.]
Figure 1: Time Series of Treasury Yields.
Illustration of the weekly observed Treasury zero-coupon bond yields covering the period from December
4, 1987, to January 2, 2009. The yields shown have maturities: three-month, one-year, five-year,
and ten-year.
Maturity     Mean    Std. dev.   Skewness   Kurtosis
in months    in %    in %
3            4.52    2.02         0.03      2.41
6            4.61    2.05        -0.01      2.40
12           4.77    2.04        -0.04      2.41
24           5.03    1.95        -0.03      2.43
36           5.24    1.86         0.02      2.39
60           5.58    1.72         0.15      2.25
84           5.85    1.62         0.26      2.13
120          6.16    1.52         0.36      2.05
Table 1: Summary Statistics of Treasury Yields.
Summary statistics for the sample of weekly observed Treasury zero-coupon bond yields covering the
period from December 4, 1987, to January 2, 2009.
zero-coupon yield curve.6
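As a small illustration of how zero-coupon yields at any maturity follow from the six fitted Svensson values, the curve can be evaluated with a short function. This is a sketch only: the function name `svensson_yield` and the argument names are assumptions, and the units of maturity and of the decay parameters must match those of the fitted data set, which this section does not state.

```python
import numpy as np

def svensson_yield(tau, beta0, beta1, beta2, beta3, lam1, lam2):
    """Evaluate the Svensson (1995)-type zero-coupon yield curve at maturities tau.

    Argument names are hypothetical; tau must be strictly positive and expressed
    in units consistent with the decay parameters lam1 and lam2.
    """
    tau = np.asarray(tau, dtype=float)
    decay1 = np.exp(-lam1 * tau)
    decay2 = np.exp(-lam2 * tau)
    slope1 = (1.0 - decay1) / (lam1 * tau)
    slope2 = (1.0 - decay2) / (lam2 * tau)
    return (beta0
            + beta1 * slope1
            + beta2 * (slope1 - decay1)
            + beta3 * (slope2 - decay2))
```

Calling the function with the eight maturities used in this paper would return the corresponding eight yields for one business day, given that day's coefficients.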
To have the most active part of the maturity spectrum represented, we construct Treasury
zero-coupon bond yields with the following maturities: 3-month, 6-month, 1-year, 2-year, 3-
year, 5-year, 7-year, and 10-year. We use weekly data (Fridays) and limit our sample to the
6D’Amico and King (2013) show that the Svensson functional form has had some difficulty at times in
fitting the underlying bond prices since the peak of the financial crisis. This explains why we end our sample
on January 2, 2009. Furthermore, we emphasize that we merely use the U.S. Treasury yields to obtain realistic
parameter sets to be used in the model simulations. Hence, ultimately, the accuracy of the Svensson smoothed
curve does not matter for our exercise and the conclusions we draw.
Maturity      Loading on
in months     First P.C.   Second P.C.   Third P.C.
3             -0.38        -0.44          0.52
6             -0.39        -0.38          0.19
12            -0.40        -0.25         -0.21
24            -0.38        -0.03         -0.47
36            -0.36         0.12         -0.42
60            -0.33         0.33         -0.11
84            -0.30         0.44          0.18
120           -0.27         0.53          0.45
% explained   94.12         5.58          0.27
Table 2: Eigenvectors of the First Three Principal Components of Treasury Yields.
The loadings of yields of various maturities on the first three principal components are shown. The
final row shows the proportion of all bond yield variability accounted for by each principal component.
The data consist of weekly U.S. Treasury zero-coupon bond yields from December 4, 1987, to January
2, 2009.
period from December 4, 1987, to January 2, 2009. The summary statistics are provided in
Table 1, while Figure 1 illustrates the constructed time series of the three-month, one-year,
five-year, and ten-year Treasury zero-coupon yields.
Researchers have typically found that three factors are sufficient to model the time-variation
in the cross section of Treasury bond yields (e.g., Litterman and Scheinkman, 1991).
Indeed, for our weekly Treasury bond data, 99.97% of the total variation is accounted for by
three factors. Table 2 reports the eigenvectors that correspond to the first three principal
components of our data. The first principal component accounts for 94.1% of the variation in
the Treasury bond yields, and its loading across maturities is uniformly negative. Thus, like
a level factor, a shock to this component changes all yields in the same direction irrespective
of maturity. The second principal component accounts for 5.6% of the variation in these data
and has sizable negative loadings for the shorter maturities and sizable positive loadings for
the long maturities. Thus, like a slope factor, a shock to this component steepens or flattens
the yield curve. Finally, the third component, which accounts for only 0.3% of the variation,
has a U-shaped factor loading as a function of maturity, which is naturally interpreted as a
curvature factor.
In summary, three factors explain 99.97% of the variation in this set of
Treasury bond yields, and they have properties consistent with an interpretation of level,
slope, and curvature as in the Nelson-Siegel model detailed in the following.
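The principal components decomposition described in this section can be sketched as follows. The function name `pca_loadings` is hypothetical, and the use of the sample covariance matrix of the yield levels is an assumption, since the text does not say whether the covariance or the correlation matrix was used.

```python
import numpy as np

def pca_loadings(yields):
    """Principal components of a T x N panel of yields.

    Returns the eigenvectors (one column per component, ordered by explained
    variance) and the share of total variance explained by each component.
    Assumes the decomposition is applied to the sample covariance matrix.
    """
    yields = np.asarray(yields, dtype=float)
    cov = np.cov(yields, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]           # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    return eigvec, eigval / eigval.sum()
```

The sign of each eigenvector is arbitrary, which is why a uniformly negative first loading is still consistent with a level interpretation.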
3 The AFNS Model with Constant Volatility
In this section, we briefly review the AFNS model with constant volatility, throughout referred
to as the AFNS$_0$ specification.7,8 We start from a standard continuous-time affine arbitrage-free
structure (Duffie and Kan, 1996) that underlies all the models to be estimated in this paper.
To represent an affine diffusion process, define a filtered probability space
$(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$, where the filtration $(\mathcal{F}_t) = \{\mathcal{F}_t : t \geq 0\}$ satisfies
the usual conditions (Williams, 1997). The state variables $X_t$ are assumed to be a Markov process
defined on a set $M \subset \mathbb{R}^n$ that solves the following stochastic differential equation (SDE)9
$$dX_t = K^Q(t)[\theta^Q(t) - X_t]dt + \Sigma(t) D(X_t, t) dW_t^Q, \tag{1}$$
where $W^Q$ is a standard Brownian motion in $\mathbb{R}^n$, the information of which is contained in
the filtration $(\mathcal{F}_t)$. The drift terms $\theta^Q : [0,T] \to \mathbb{R}^n$ and $K^Q : [0,T] \to \mathbb{R}^{n \times n}$ are bounded,
continuous functions.10 Similarly, the volatility matrix $\Sigma : [0,T] \to \mathbb{R}^{n \times n}$ is assumed to be a
bounded, continuous function, while $D : M \times [0,T] \to \mathbb{R}^{n \times n}$ is assumed to have the following
diagonal structure
$$D(X_t, t) = \begin{pmatrix} \sqrt{\gamma^1(t) + \delta^1(t) X_t} & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & \sqrt{\gamma^n(t) + \delta^n(t) X_t} \end{pmatrix},$$
where
$$\gamma(t) = \begin{pmatrix} \gamma^1(t) \\ \vdots \\ \gamma^n(t) \end{pmatrix}, \quad \delta(t) = \begin{pmatrix} \delta^1_1(t) & \ldots & \delta^1_n(t) \\ \vdots & \ddots & \vdots \\ \delta^n_1(t) & \ldots & \delta^n_n(t) \end{pmatrix},$$
$\gamma : [0,T] \to \mathbb{R}^n$ and $\delta : [0,T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions, and $\delta^i(t)$ denotes
the $i$th row of the $\delta(t)$-matrix. Finally, the instantaneous risk-free rate is assumed to be an
affine function of the state variables
$$r_t = \rho_0(t) + \rho_1(t)' X_t,$$
7Our nomenclature follows CLR and draws on Dai and Singleton (2000). Our AFNS$_n$ models are members
of their $A_n(3)$ class of models, which have three state variables and $n$ square-root processes.
8This model has been shown to exhibit both good in-sample fit and out-of-sample forecast accuracy for
various yield curves. The empirical analysis conducted in CDR is based on unsmoothed Fama-Bliss data for
nominal Treasury yields. Christensen et al. (2010) examine yields for nominal and real Treasuries as per
Gürkaynak et al. (2007, 2010), while Christensen et al. (2014b) examine short-term LIBOR and highly-rated
banks’ and financial firms’ corporate bond rates.
9The affine property applies to bond prices; therefore, affine models only impose structure on the factor
dynamics under the pricing measure.
10Stationarity of the state variables is ensured if all the eigenvalues of $K^Q(t)$ are positive (if complex, the real
component should be positive), see Ahn et al. (2002). However, stationarity is not a necessary requirement
for the process to be well defined.
where $\rho_0 : [0,T] \to \mathbb{R}$ and $\rho_1 : [0,T] \to \mathbb{R}^n$ are bounded, continuous functions.
Duffie and Kan (1996) prove that zero-coupon bond prices in this framework are exponential-affine
functions of the state variables
$$P(t,T) = E_t^Q\Big[\exp\Big(-\int_t^T r_u du\Big)\Big] = \exp\big(B(t,T)' X_t + A(t,T)\big),$$
where $B(t,T)$ and $A(t,T)$ are the solutions to the following system of ordinary differential
equations (ODEs)
$$\frac{dB(t,T)}{dt} = \rho_1 + (K^Q)' B(t,T) - \frac{1}{2} \sum_{j=1}^{n} \big(\Sigma' B(t,T) B(t,T)' \Sigma\big)_{j,j} (\delta^j)', \quad B(T,T) = 0, \tag{2}$$
$$\frac{dA(t,T)}{dt} = \rho_0 - B(t,T)' K^Q \theta^Q - \frac{1}{2} \sum_{j=1}^{n} \big(\Sigma' B(t,T) B(t,T)' \Sigma\big)_{j,j} \gamma^j, \quad A(T,T) = 0, \tag{3}$$
and the possible time-dependence of the parameters is suppressed in the notation. These
pricing functions imply that the zero-coupon yields are given by
$$y(t,T) = -\frac{1}{T-t} \log P(t,T) = -\frac{B(t,T)'}{T-t} X_t - \frac{A(t,T)}{T-t}.$$
− − −
As per CDR, assume that the instantaneous risk-free rate is defined by
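$$r_t = L_t + S_t.$$
In addition, assume that the state variables $X_t = (L_t, S_t, C_t)$ are described by the following
system of SDEs under the risk-neutral Q-measure
$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix} \left[ \begin{pmatrix} \theta_1^Q \\ \theta_2^Q \\ \theta_3^Q \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt + \Sigma \begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix}, \quad \lambda > 0.$$
Then, zero-coupon bond yields are given by
$$y(t,T) = L_t + \Big(\frac{1 - e^{-\lambda(T-t)}}{\lambda(T-t)}\Big) S_t + \Big(\frac{1 - e^{-\lambda(T-t)}}{\lambda(T-t)} - e^{-\lambda(T-t)}\Big) C_t - \frac{A(t,T)}{T-t}.$$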
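The factor loadings in the yield function above depend only on the maturity and on $\lambda$, which is what makes the role of each factor predetermined. A minimal sketch of the loading matrix follows; the function name `afns_factor_loadings`, the maturities in years, and the value of `lam` in the usage note are illustrative assumptions, and the yield-adjustment term $-A(t,T)/(T-t)$ is deliberately left out.

```python
import numpy as np

def afns_factor_loadings(maturities, lam):
    """Level, slope, and curvature loadings of the Nelson-Siegel yield function.

    maturities: time to maturity T - t (strictly positive), in units consistent with lam.
    Returns an array with one row per maturity and columns (level, slope, curvature).
    The yield-adjustment term is not included.
    """
    tau = np.asarray(maturities, dtype=float)
    decay = np.exp(-lam * tau)
    level = np.ones_like(tau)
    slope = (1.0 - decay) / (lam * tau)
    curvature = slope - decay
    return np.column_stack([level, slope, curvature])
```

For example, `afns_factor_loadings([0.25, 0.5, 1, 2, 3, 5, 7, 10], lam=0.5)` would give an 8 x 3 matrix for the eight maturities used in this paper, under the assumed value of `lam`.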
This result defines the class of AFNS$_0$ models derived in CDR, and the additional term in
the yield function is a so-called yield-adjustment term that represents convexity effects due
to Jensen’s inequality; see CDR for details. To complete the model, we need to specify the
risk premium structure that generates the connection to the dynamics under the real-world
P-measure. To that end, it is important to note that there are no restrictions on the dynamic
drift components under the empirical P-measure. Therefore, beyond the requirement of
constant volatility, we are free to choose the dynamics under the P-measure. To facilitate
the empirical implementation, we follow CDR and limit our focus to the essentially affine risk
premium introduced in Duffee (2002). In the Gaussian framework, this specification implies
that the risk premiums $\Gamma_t$ depend linearly on the state variables; that is,
$$\Gamma_t = \gamma^0 + \gamma^1 X_t,$$
where $\gamma^0 \in \mathbb{R}^3$ and $\gamma^1 \in \mathbb{R}^{3 \times 3}$ contain unrestricted parameters. The relationship between
real-world yield curve dynamics under the P-measure and risk-neutral dynamics under the
Q-measure is given by
$$dW_t^Q = dW_t^P + \Gamma_t dt.$$
Thus, we can write the P-dynamics of the state variables as
$$dX_t = K^P(\theta^P - X_t)dt + \Sigma dW_t^P,$$
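where both $K^P$ and $\theta^P$ are allowed to vary freely relative to their counterparts under the
Q-measure. Following CDR, we identify this class of models by fixing the means under the
Q-measure at zero, i.e., $\theta^Q = 0$.11 Furthermore, CDR show that $\Sigma$ cannot be more than a
triangular matrix for the model to be identified. Thus, the maximally flexible specification of
the original AFNS model has Q-dynamics given by
$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = -\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix} \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} dt + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},$$
while its P-dynamics are given by
$$\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix} = \begin{pmatrix} \kappa_{11}^P & \kappa_{12}^P & \kappa_{13}^P \\ \kappa_{21}^P & \kappa_{22}^P & \kappa_{23}^P \\ \kappa_{31}^P & \kappa_{32}^P & \kappa_{33}^P \end{pmatrix} \left[ \begin{pmatrix} \theta_1^P \\ \theta_2^P \\ \theta_3^P \end{pmatrix} - \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} \right] dt + \begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} \begin{pmatrix} dW_t^{L,P} \\ dW_t^{S,P} \\ dW_t^{C,P} \end{pmatrix}.$$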
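As a hedged illustration of how the continuous-time P-dynamics above connect to discretely sampled data, the standard solution of a Gaussian mean-reverting SDE with constant $K^P$, $\theta^P$, and $\Sigma$ implies the following transition over a sampling interval $\Delta t$. The symbols $\Delta t$, $\eta_t$, and $V_{\Delta t}$ are notation introduced here for illustration and do not appear in this section of the paper.

```latex
X_t = \left(I - e^{-K^P \Delta t}\right)\theta^P + e^{-K^P \Delta t} X_{t-\Delta t} + \eta_t,
\qquad \eta_t \sim N\left(0, V_{\Delta t}\right),
\qquad V_{\Delta t} = \int_0^{\Delta t} e^{-K^P s}\,\Sigma\Sigma'\,e^{-(K^P)' s}\,ds.
```

In this form, the persistence of the factors over one sampling interval is governed by $e^{-K^P \Delta t}$, so a larger estimated mean-reversion matrix implies less persistence in the discretely sampled factors.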
The main limitation of the AFNS$_0$ class of models is that it is characterized by a constant
volatility matrix $\Sigma$. CLR modify the AFNS$_0$ model in a straightforward fashion in order to
incorporate stochastic volatility. The key assumption to preserving the desirable Nelson-Siegel
factor loading structure in the zero-coupon bond yield function is to maintain the $K^Q$ mean-reversion
matrix under the Q-measure. Furthermore, all model classes will be characterized
by an instantaneous risk-free rate defined as the sum of the first two factors
$$r_t = L_t + S_t.$$
11CDR demonstrate that this choice is without loss of generality.