How Efficient is the Kalman Filter at Estimating Affine Term Structure Models? †

Jens H. E. Christensen, Jose A. Lopez, and Glenn D. Rudebusch

Federal Reserve Bank of San Francisco
101 Market Street, Mailstop 1130
San Francisco, CA 94105

Preliminary and incomplete draft. Comments are welcome.

Abstract

We perform a carefully orchestrated simulation study to analyze the bias of the Kalman filter in estimating arbitrage-free Nelson-Siegel (AFNS) models with and without stochastic volatility. For Gaussian AFNS models, we document significant finite-sample bias in the estimated mean-reversion parameters. Since the Kalman filter is consistent and efficient for that model class, this exercise provides a measure of the finite-sample bias that will affect any estimator. For AFNS models with stochastic volatility, significant finite-sample upward estimation bias remains, but it is not materially larger than in the Gaussian model. Hence, we recommend estimation based on the Kalman filter for both types of AFNS models and corresponding affine term structure models in general.

JEL Classification: C13, C58, G12, G17.
Keywords: arbitrage-free Nelson-Siegel models, finite-sample bias, stochastic volatility

† We thank seminar participants at the Second Humboldt Copenhagen Conference on Financial Econometrics for comments on an earlier draft of this paper. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Federal Reserve Bank of San Francisco or the Board of Governors of the Federal Reserve System.

This version: August 24, 2015.

1 Introduction

Interest rate volatility is a topic of great research interest given its role in derivatives pricing and portfolio risk management.
However, as compared to the empirical results presented in the extensive GARCH literature, the results of modeling interest rate volatility within the more commonly used affine, arbitrage-free models of the term structure have been less clear-cut, partly due to the difficulty in estimating their parameters.

Estimation of flexible affine term structure models is complicated and time consuming, partly due to the fairly large number of parameters, and partly due to the latent nature of the state variables in such models. The latter causes the estimation to be plagued by numerous local maxima that are distinct in the sense that they are not invariant affine transformations¹ of each other and therefore may have very different economic implications; see Duffee (2011) and Kim and Orphanides (2012) for discussions of these issues.

To overcome those problems, Christensen et al. (2011, henceforth CDR) introduce the affine arbitrage-free class of Nelson-Siegel term structure models (henceforth referred to as AFNS models). These are affine term structure models that preserve the level, slope, and curvature factor loading structure in the bond yield function known from the standard Nelson and Siegel (1987) yield curve model. These models are easy to estimate because the role of each factor is predetermined and does not vary for any admissible set of parameters. Furthermore, in that model class, the state variables are Gaussian with constant volatility. As a consequence, the models can be estimated with the standard Kalman filter, which is equivalent to exact maximum likelihood estimation and therefore is both efficient and consistent in the limit. However, despite its consistency and efficiency, the Kalman filter remains subject to any unavoidable finite-sample bias.

In a recent paper, Christensen et al. (2014a, henceforth CLR) generalize the AFNS model framework introduced in CDR by incorporating stochastic volatility into the state variables.
These models are also easy to estimate, again due to the imposed Nelson-Siegel factor loading structure. CLR estimate their models using the standard Kalman filter and report model fit on par with the original Gaussian AFNS model. Now, though, the Kalman filter is no longer efficient and potentially inconsistent because it only approximates the true probability distribution of the state variables by matching the first and second moments, essentially treating the state variables as if they were Gaussian. Thus, in addition to any finite-sample bias, there is potential for added bias arising from the fact that the Kalman filter is only an approximation to the true likelihood function. Despite this concern, Kalman filter-based estimation of affine term structure models with stochastic volatility is relatively common in empirical term structure analysis,² but the size of any bias in realistic three-factor settings has not been studied in detail in the existing term structure literature (to the best of our knowledge).

¹ See Dai and Singleton (2000) for the definition of this concept.
² For examples, see Duffee (1999), Driessen (2005), Feldhütter and Lando (2008), and Christensen et al. (2015).

In this paper, we focus on the AFNS model classes with and without stochastic volatility. This provides us with an ideal setting to study both the finite-sample bias and the added bias from using the Kalman filter for estimation of affine non-Gaussian models. As an alternative, Joslin et al. (2011) and Hamilton and Wu (2012) provide identification schemes that facilitate the estimation of affine Gaussian models in that they avoid the filtering of the unobserved latent factors.³ However, it is not obvious if or how those approaches extend to affine non-Gaussian models.
Thus, the AFNS-based identification of affine Gaussian models provided by CDR and extended by CLR to affine non-Gaussian models remains an important contribution without which the analysis in this paper would not have been feasible.⁴

Because interest rates are highly persistent, empirical autoregressive models, including dynamic term structure models, suffer from substantial small-sample estimation bias. Specifically, model estimates will generally be biased toward a dynamic system that displays much less persistence than the true process (so estimates of the real-world mean-reversion matrix, $K^P$, are upward biased). Furthermore, if the degree of interest rate persistence is underestimated, future short rates would be expected to revert to their mean too quickly, causing their expected longer-term averages to be too stable. Therefore, the bias in the estimated dynamics distorts the decomposition of yields and contaminates estimates of long-maturity term premiums.

To study this finite-sample problem in detail, we start out simulating and estimating Gaussian AFNS models for which the Kalman filter is an efficient estimator as already noted. We simulate short ten-year and long forty-year samples to study the finite-sample bias problem directly. We allow for low and high noise to assess how data quality affects our conclusions. Furthermore, for the benchmark Gaussian AFNS model, we also analyze samples at weekly frequency in addition to the monthly frequency used throughout, but since this turns out to matter little for our conclusions, we do not repeat this exercise for the models with stochastic volatility. We then proceed to simulate and estimate AFNS models with stochastic volatility in a similarly careful way.

Our findings can be summarized as follows. In the Gaussian AFNS model, there is a significant finite-sample upward bias in the estimates of the mean-reversion rate of the Nelson-Siegel level factor due to its near unit-root property.
In addition, there is a more modest finite-sample upward estimation bias in the mean-reversion parameters for the slope and curvature factors thanks to their lower persistence. Importantly, there is no finite-sample bias in the estimated mean parameters of any of the factors. Furthermore, all parameters that relate to the model's Q-dynamics used for pricing and fitting the cross section of yields are well determined and without any measurable bias. This property turns out to hold for non-Gaussian models as well. However, the accuracy of the estimated Q-dynamics is affected by the amount of noise in the data. Finally, the data frequency plays no role for these conclusions as both weekly and monthly simulated data produce similar results. However, in the weekly samples, the parameter standard deviations estimated from the optimized likelihood function in the Kalman filter tend to be too low. This makes the upward biased mean-reversion parameters appear even more significant than they are, which complicates model selection. Hence, we document one of the unusual situations where more data do not necessarily lead to better inference. For selecting the appropriate specification of the mean-reversion matrix, which matters for forecast performance, term premium decompositions, etc., we therefore recommend relying on monthly rather than weekly data.

³ Andreasen and Christensen (2015) offer an alternative way of estimating non-Gaussian term structure models.
⁴ The related literature includes Duan and Simonato (1995), Lund (1997), De Jong (2000), Duffee and Stanton (2004), and Duffee and Stanton (2008), among others.

We then proceed to simulate and estimate AFNS models with stochastic volatility generated by the level factor in one set of exercises, and with stochastic volatility generated by the curvature factor in another set of exercises.
First, we find that the finite-sample upward bias in the estimated mean-reversion parameters is not materially different in the models with stochastic volatility relative to the Gaussian AFNS model. The intuition behind this result is that the time series properties of the three state variables are primarily determined by the Nelson-Siegel factor loading structure, which is almost identical for all AFNS models with and without stochastic volatility. For similar reasons we also see little bias in the estimated mean parameters in these models.

Second, we analyze in detail the ability of the Kalman filter to estimate the volatility sensitivity parameters that determine the degree to which the stochastic volatility factor affects the volatility of the unconstrained factors in each model. For U.S. Treasury yields, these sensitivity parameters are often estimated to be negligible (see CLR for an example) and we report similar results. To assess whether this is a general weakness of the Kalman filter when applied to models with stochastic volatility, we perform separate simulation experiments with large values for the sensitivity parameters. Our results show that the Kalman filter is in fact able to estimate them with some accuracy. Thus, when their estimated values are tiny and insignificant, it is most likely because the data call for them to be so.

Third, in general, it is the case that the parameters that primarily affect the models' fit to the cross section of yields tend to have small or no bias, but their accuracy varies positively with the quality of the data. We note one exception though. In the AFNS model with stochastic volatility generated by the curvature factor, the mean of the curvature factor under the risk-neutral Q measure is not well identified. However, we show that this can be solved at practically no cost by fixing it at a low value that is exactly high enough that the curvature factor does not reach its zero lower bound.
Another key finding is that the Kalman filter is as efficient at filtering state variables in non-Gaussian models as it is at filtering in Gaussian models, in particular under optimal conditions with high-quality data. As a consequence, the fit of the AFNS models with stochastic volatility is as good as, if not better than, the fit of the Gaussian AFNS model.

Finally, in light of the low interest rate environment in recent years, we emphasize that our study has no bearing on how Kalman filter-based estimations perform when yields are near their lower bound and exhibit asymmetric behavior for that reason. This is a task that we leave for future research. Still, the results we report could serve as a useful benchmark even for that kind of exercise.

The rest of the paper is structured as follows. Section 2 describes our sample of U.S. Treasury yields and motivates our focus on the Nelson-Siegel yield curve model, while Section 3 briefly details the original Gaussian AFNS model of the term structure. Section 4 goes on to describe the five classes of AFNS models with stochastic volatility dynamics introduced in CLR. Section 5 details the estimation methodology, while Section 6 describes the simulation study. Section 7 contains the results from the simulation exercises for the Gaussian AFNS model, while Sections 8 and 9 contain the results for the AFNS models with stochastic volatility generated by the level and curvature factor, respectively. Section 10 concludes the paper.

2 Motivation for the Nelson-Siegel Model

In this section, we motivate our focus on the Nelson-Siegel yield curve model using principal components analysis. Recall that principal components analysis decomposes the observed data into a number of factors equal to the number of time series and ranks those factors according to how much of the observed variation each factor explains.
The specific Treasury yields we analyze to obtain realistic parameter sets to be used in our simulation exercises are zero-coupon yields constructed by the method described in Gürkaynak et al. (2007) and briefly detailed here.⁵ For each business day a zero-coupon yield curve of the Svensson (1995)-type

$$
y_t(\tau) = \beta_0 + \frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau}\,\beta_1 + \left[\frac{1 - e^{-\lambda_1 \tau}}{\lambda_1 \tau} - e^{-\lambda_1 \tau}\right]\beta_2 + \left[\frac{1 - e^{-\lambda_2 \tau}}{\lambda_2 \tau} - e^{-\lambda_2 \tau}\right]\beta_3
$$

is fitted to price a large pool of underlying off-the-run Treasury bonds. Thus, for each business day, we have the fitted values of the four coefficients $(\beta_0(t), \beta_1(t), \beta_2(t), \beta_3(t))$ and two parameters $(\lambda_1(t), \lambda_2(t))$. From this data set zero-coupon yields for any relevant maturity can be calculated. As demonstrated by Gürkaynak et al. (2007), this discount function prices the underlying pool of bonds extremely well.

⁵ The Board of Governors of the Federal Reserve updates the data on its website at http://www.federalreserve.gov/pubs/feds/2006/index.html.

[Figure 1: Time Series of Treasury Yields. Illustration of the weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009. The yields shown have maturities: three-month, one-year, five-year, and ten-year. Vertical axis: rate in percent; horizontal axis: 1988 to 2008.]

Maturity     Mean    Std. dev.   Skewness   Kurtosis
in months    in %    in %
  3          4.52    2.02         0.03      2.41
  6          4.61    2.05        -0.01      2.40
 12          4.77    2.04        -0.04      2.41
 24          5.03    1.95        -0.03      2.43
 36          5.24    1.86         0.02      2.39
 60          5.58    1.72         0.15      2.25
 84          5.85    1.62         0.26      2.13
120          6.16    1.52         0.36      2.05

Table 1: Summary Statistics of Treasury Yields. Summary statistics for the sample of weekly observed Treasury zero-coupon bond yields covering the period from December 4, 1987, to January 2, 2009.

By implication, the zero-coupon yields derived from this approach constitute a very good approximation to the true underlying Treasury
zero-coupon yield curve.⁶

To have the most active part of the maturity spectrum represented, we construct Treasury zero-coupon bond yields with the following maturities: 3-month, 6-month, 1-year, 2-year, 3-year, 5-year, 7-year, and 10-year. We use weekly data (Fridays) and limit our sample to the period from December 4, 1987, to January 2, 2009. The summary statistics are provided in Table 1, while Figure 1 illustrates the constructed time series of the three-month, one-year, five-year, and ten-year Treasury zero-coupon yields.

⁶ D'Amico and King (2013) show that the Svensson functional form has had some difficulty at times in fitting the underlying bond prices since the peak of the financial crisis. This explains why we end our sample on January 2, 2009. Furthermore, we emphasize that we merely use the U.S. Treasury yields to obtain realistic parameter sets to be used in the model simulations. Hence, ultimately, the accuracy of the Svensson smoothed curve does not matter for our exercise and the conclusions we draw.

Maturity      Loading on
in months     First P.C.   Second P.C.   Third P.C.
  3           -0.38        -0.44          0.52
  6           -0.39        -0.38          0.19
 12           -0.40        -0.25         -0.21
 24           -0.38        -0.03         -0.47
 36           -0.36         0.12         -0.42
 60           -0.33         0.33         -0.11
 84           -0.30         0.44          0.18
120           -0.27         0.53          0.45
% explained   94.12         5.58          0.27

Table 2: Eigenvectors of the First Three Principal Components of Treasury Yields. The loadings of yields of various maturities on the first three principal components are shown. The final row shows the proportion of all bond yield variability accounted for by each principal component. The data consist of weekly U.S. Treasury zero-coupon bond yields from December 4, 1987, to January 2, 2009.

Researchers have typically found that three factors are sufficient to model the time-variation in the cross section of Treasury bond yields (e.g., Litterman and Scheinkman, 1991). Indeed, for our weekly Treasury bond data, 99.97% of the total variation is accounted for by three factors. Table 2 reports the eigenvectors that correspond to the first three principal components of our data.
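As a quick consistency check on the numbers in Table 2, the following minimal sketch uses only the rounded values printed in the table. It sums the explained shares and checks that the three loading vectors are approximately orthonormal, as eigenvectors of a covariance matrix should be up to rounding. The variable and function names are illustrative and are not taken from the paper.

```python
# Loadings and explained shares copied from Table 2 (rounded to two decimals there).
MATURITIES = [3, 6, 12, 24, 36, 60, 84, 120]  # months
PC1 = [-0.38, -0.39, -0.40, -0.38, -0.36, -0.33, -0.30, -0.27]
PC2 = [-0.44, -0.38, -0.25, -0.03, 0.12, 0.33, 0.44, 0.53]
PC3 = [0.52, 0.19, -0.21, -0.47, -0.42, -0.11, 0.18, 0.45]
EXPLAINED = [94.12, 5.58, 0.27]  # percent of total variation


def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


# Share of variation captured by the first three components together.
total_explained = sum(EXPLAINED)

# Eigenvectors should have unit length and be mutually orthogonal,
# up to the rounding error in the printed table.
squared_norms = [dot(v, v) for v in (PC1, PC2, PC3)]
cross_products = [dot(PC1, PC2), dot(PC1, PC3), dot(PC2, PC3)]

print("total explained (%):", round(total_explained, 2))
print("squared norms:", [round(x, 4) for x in squared_norms])
print("cross products:", [round(x, 4) for x in cross_products])
```

The same lists also make the shapes easy to inspect: the first vector has one sign throughout, the second rises monotonically with maturity, and the third has its minimum at an intermediate maturity.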
The first principal component accounts for 94.1% of the variation in the Treasury bond yields, and its loading across maturities is uniformly negative. Thus, like a level factor, a shock to this component changes all yields in the same direction irrespective of maturity. The second principal component accounts for 5.6% of the variation in these data and has sizable negative loadings for the shorter maturities and sizable positive loadings for the long maturities. Thus, like a slope factor, a shock to this component steepens or flattens the yield curve. Finally, the third component, which accounts for only 0.3% of the variation, has a U-shaped factor loading as a function of maturity, which is naturally interpreted as a curvature factor.

In summary, three factors explain 99.97% of the variation in this set of Treasury bond yields, and they have properties consistent with an interpretation of level, slope, and curvature as in the Nelson-Siegel model detailed in the following.

3 The AFNS Model with Constant Volatility

In this section, we briefly review the AFNS model with constant volatility, throughout referred to as the AFNS$_0$ specification.⁷,⁸ We start from a standard continuous-time affine arbitrage-free structure (Duffie and Kan, 1996) that underlies all the models to be estimated in this paper. To represent an affine diffusion process, define a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$, where the filtration $(\mathcal{F}_t) = \{\mathcal{F}_t : t \geq 0\}$ satisfies the usual conditions (Williams, 1997). The state variables $X_t$ are assumed to be a Markov process defined on a set $M \subset \mathbb{R}^n$ that solves the following stochastic differential equation (SDE)⁹

$$
dX_t = K^Q(t)\,[\theta^Q(t) - X_t]\,dt + \Sigma(t)\,D(X_t, t)\,dW_t^Q, \tag{1}
$$

where $W^Q$ is a standard Brownian motion in $\mathbb{R}^n$, the information of which is contained in the filtration $(\mathcal{F}_t)$.
The drift terms $\theta^Q : [0,T] \to \mathbb{R}^n$ and $K^Q : [0,T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions.¹⁰ Similarly, the volatility matrix $\Sigma : [0,T] \to \mathbb{R}^{n \times n}$ is assumed to be a bounded, continuous function, while $D : M \times [0,T] \to \mathbb{R}^{n \times n}$ is assumed to have the following diagonal structure

$$
\begin{pmatrix}
\sqrt{\gamma^1(t) + \delta^1(t) X_t} & \ldots & 0 \\
\vdots & \ddots & \vdots \\
0 & \ldots & \sqrt{\gamma^n(t) + \delta^n(t) X_t}
\end{pmatrix},
$$

where

$$
\gamma(t) = \begin{pmatrix} \gamma^1(t) \\ \vdots \\ \gamma^n(t) \end{pmatrix},
\qquad
\delta(t) = \begin{pmatrix}
\delta^1_1(t) & \ldots & \delta^1_n(t) \\
\vdots & \ddots & \vdots \\
\delta^n_1(t) & \ldots & \delta^n_n(t)
\end{pmatrix},
$$

$\gamma : [0,T] \to \mathbb{R}^n$ and $\delta : [0,T] \to \mathbb{R}^{n \times n}$ are bounded, continuous functions, and $\delta^i(t)$ denotes the $i$th row of the $\delta(t)$-matrix. Finally, the instantaneous risk-free rate is assumed to be an affine function of the state variables

$$
r_t = \rho_0(t) + \rho_1(t)' X_t,
$$

where $\rho_0 : [0,T] \to \mathbb{R}$ and $\rho_1 : [0,T] \to \mathbb{R}^n$ are bounded, continuous functions.

⁷ Our nomenclature follows CLR and draws on Dai and Singleton (2000). Our AFNS$_n$ models are members of their $A_n(3)$ class of models, which have three state variables and $n$ square-root processes.
⁸ This model has been shown to exhibit both good in-sample fit and out-of-sample forecast accuracy for various yield curves. The empirical analysis conducted in CDR is based on unsmoothed Fama-Bliss data for nominal Treasury yields. Christensen et al. (2010) examine yields for nominal and real Treasuries as per Gürkaynak et al. (2007, 2010), while Christensen et al. (2014b) examine short-term LIBOR and highly-rated banks' and financial firms' corporate bond rates.
⁹ The affine property applies to bond prices; therefore, affine models only impose structure on the factor dynamics under the pricing measure.
¹⁰ Stationarity of the state variables is ensured if all the eigenvalues of $K^Q(t)$ are positive (if complex, the real component should be positive); see Ahn et al. (2002). However, stationarity is not a necessary requirement for the process to be well defined.
Duffie and Kan (1996) prove that zero-coupon bond prices in this framework are exponential-affine functions of the state variables

$$
P(t,T) = E_t^Q\!\left[\exp\!\left(-\int_t^T r_u\,du\right)\right] = \exp\!\left(B(t,T)' X_t + A(t,T)\right),
$$

where $B(t,T)$ and $A(t,T)$ are the solutions to the following system of ordinary differential equations (ODEs)

$$
\frac{dB(t,T)}{dt} = \rho_1 + (K^Q)' B(t,T) - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t,T) B(t,T)' \Sigma\right)_{j,j} (\delta^j)', \qquad B(T,T) = 0, \tag{2}
$$

$$
\frac{dA(t,T)}{dt} = \rho_0 - B(t,T)' K^Q \theta^Q - \frac{1}{2} \sum_{j=1}^{n} \left(\Sigma' B(t,T) B(t,T)' \Sigma\right)_{j,j} \gamma^j, \qquad A(T,T) = 0, \tag{3}
$$

and the possible time-dependence of the parameters is suppressed in the notation. These pricing functions imply that the zero-coupon yields are given by

$$
y(t,T) = -\frac{1}{T-t} \log P(t,T) = -\frac{B(t,T)'}{T-t} X_t - \frac{A(t,T)}{T-t}.
$$

As per CDR, assume that the instantaneous risk-free rate is defined by

$$
r_t = L_t + S_t.
$$

In addition, assume that the state variables $X_t = (L_t, S_t, C_t)$ are described by the following system of SDEs under the risk-neutral Q-measure

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\left[
\begin{pmatrix} \theta_1^Q \\ \theta_2^Q \\ \theta_3^Q \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+ \Sigma
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},
\qquad \lambda > 0.
$$

Then, zero-coupon bond yields are given by

$$
y(t,T) = L_t + \left(\frac{1 - e^{-\lambda (T-t)}}{\lambda (T-t)}\right) S_t + \left(\frac{1 - e^{-\lambda (T-t)}}{\lambda (T-t)} - e^{-\lambda (T-t)}\right) C_t - \frac{A(t,T)}{T-t}.
$$

This result defines the class of AFNS$_0$ models derived in CDR, and the additional term in the yield function is a so-called yield-adjustment term that represents convexity effects due to Jensen's inequality; see CDR for details.

To complete the model, we need to specify the risk premium structure that generates the connection to the dynamics under the real-world P-measure. To that end, it is important to note that there are no restrictions on the dynamic drift components under the empirical P-measure. Therefore, beyond the requirement of constant volatility, we are free to choose the dynamics under the P-measure.
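The factor loadings in the yield function above are simple to evaluate numerically. The following is a minimal sketch, not code from the paper: the function names and the value chosen for $\lambda$ are illustrative assumptions, maturities are measured in years, and the yield-adjustment term $-A(t,T)/(T-t)$ is deliberately left out because it depends on $\Sigma$.

```python
import math


def afns_loadings(tau, lam):
    """Level, slope, and curvature loadings at time to maturity tau = T - t (years)."""
    if tau <= 0 or lam <= 0:
        raise ValueError("tau and lam must be positive")
    x = lam * tau
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    slope = -math.expm1(-x) / x
    curvature = slope - math.exp(-x)
    return 1.0, slope, curvature


def afns_yield_without_adjustment(tau, lam, level, slope_factor, curvature_factor):
    """Yield implied by the three factors, omitting the term -A(t,T)/(T-t)."""
    b_level, b_slope, b_curv = afns_loadings(tau, lam)
    return b_level * level + b_slope * slope_factor + b_curv * curvature_factor


LAMBDA = 0.5  # hypothetical decay rate chosen for illustration, not an estimate

for months in (3, 12, 60, 120):
    print(months, afns_loadings(months / 12.0, LAMBDA))
```

With any positive $\lambda$, the level loading is constant, the slope loading starts near one and decays with maturity, and the curvature loading starts near zero and is hump-shaped, which is what fixes the role of each factor regardless of the other parameter values.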
To facilitate the empirical implementation, we follow CDR and limit our focus to the essentially affine risk premium introduced in Duffee (2002). In the Gaussian framework, this specification implies that the risk premiums $\Gamma_t$ depend linearly on the state variables; that is,

$$
\Gamma_t = \gamma^0 + \gamma^1 X_t,
$$

where $\gamma^0 \in \mathbb{R}^3$ and $\gamma^1 \in \mathbb{R}^{3 \times 3}$ contain unrestricted parameters. The relationship between real-world yield curve dynamics under the P-measure and risk-neutral dynamics under the Q-measure is given by

$$
dW_t^Q = dW_t^P + \Gamma_t\,dt.
$$

Thus, we can write the P-dynamics of the state variables as

$$
dX_t = K^P(\theta^P - X_t)\,dt + \Sigma\,dW_t^P,
$$

where both $K^P$ and $\theta^P$ are allowed to vary freely relative to their counterparts under the Q-measure.

Following CDR, we identify this class of models by fixing the means under the Q-measure at zero, i.e., $\theta^Q = 0$.¹¹ Furthermore, CDR show that $\Sigma$ cannot be more than a triangular matrix for the model to be identified. Thus, the maximally flexible specification of the original AFNS model has Q-dynamics given by

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
= -
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW_t^{L,Q} \\ dW_t^{S,Q} \\ dW_t^{C,Q} \end{pmatrix},
$$

while its P-dynamics are given by

$$
\begin{pmatrix} dL_t \\ dS_t \\ dC_t \end{pmatrix}
=
\begin{pmatrix} \kappa_{11}^P & \kappa_{12}^P & \kappa_{13}^P \\ \kappa_{21}^P & \kappa_{22}^P & \kappa_{23}^P \\ \kappa_{31}^P & \kappa_{32}^P & \kappa_{33}^P \end{pmatrix}
\left[
\begin{pmatrix} \theta_1^P \\ \theta_2^P \\ \theta_3^P \end{pmatrix}
-
\begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix}
\right] dt
+
\begin{pmatrix} \sigma_{11} & 0 & 0 \\ \sigma_{21} & \sigma_{22} & 0 \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}
\begin{pmatrix} dW_t^{L,P} \\ dW_t^{S,P} \\ dW_t^{C,P} \end{pmatrix}.
$$

The main limitation of the AFNS$_0$ class of models is that it is characterized by a constant volatility matrix $\Sigma$. CLR modify the AFNS$_0$ model in a straightforward fashion in order to incorporate stochastic volatility. The key assumption for preserving the desirable Nelson-Siegel factor loading structure in the zero-coupon bond yield function is to maintain the $K^Q$ mean-reversion matrix under the Q-measure. Furthermore, all model classes will be characterized by an instantaneous risk-free rate defined as the sum of the first two factors

$$
r_t = L_t + S_t.
$$

¹¹ CDR demonstrate that this choice is without loss of generality.
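To illustrate why the mean-reversion parameters in $K^P$ are hard to pin down in short samples, the following minimal sketch simulates a single, directly observed Gaussian factor with P-dynamics $dX_t = \kappa(\theta - X_t)\,dt + \sigma\,dW_t^P$ and estimates $\kappa$ from monthly data. It rests on several simplifying assumptions that are not from the paper: one factor instead of three, ordinary least squares on the observed factor instead of the Kalman filter on noisy yields, and hypothetical parameter values. Only the ten-year and forty-year monthly sample lengths mirror the simulation design described in the introduction.

```python
import math
import random


def simulate_ou(kappa, theta, sigma, dt, n_obs, rng):
    """Exact discretization of dX = kappa * (theta - X) dt + sigma dW."""
    phi = math.exp(-kappa * dt)  # AR(1) coefficient implied by kappa
    shock_sd = sigma * math.sqrt((1.0 - phi * phi) / (2.0 * kappa))
    # Draw the initial value from the stationary distribution.
    x = rng.gauss(theta, sigma / math.sqrt(2.0 * kappa))
    path = [x]
    for _ in range(n_obs - 1):
        x = theta + phi * (x - theta) + rng.gauss(0.0, shock_sd)
        path.append(x)
    return path


def ols_ar1_slope(path):
    """Slope from regressing x[t+1] on a constant and x[t]."""
    lagged, lead = path[:-1], path[1:]
    mean_lag = sum(lagged) / len(lagged)
    mean_lead = sum(lead) / len(lead)
    cov = sum((a - mean_lag) * (b - mean_lead) for a, b in zip(lagged, lead))
    var = sum((a - mean_lag) ** 2 for a in lagged)
    return cov / var


def mean_kappa_estimate(kappa, years, n_reps=500, seed=0):
    """Average estimated mean-reversion rate across simulated monthly samples."""
    rng = random.Random(seed)
    dt = 1.0 / 12.0
    estimates = []
    for _ in range(n_reps):
        path = simulate_ou(kappa, 0.05, 0.01, dt, 12 * years, rng)
        phi_hat = ols_ar1_slope(path)
        if phi_hat > 0.0:  # skip draws where the log transform is undefined
            estimates.append(-math.log(phi_hat) / dt)
    return sum(estimates) / len(estimates)


TRUE_KAPPA = 0.1  # hypothetical, highly persistent factor
mean_10y = mean_kappa_estimate(TRUE_KAPPA, years=10)
mean_40y = mean_kappa_estimate(TRUE_KAPPA, years=40)
print("true kappa:", TRUE_KAPPA)
print("mean estimate, 10-year samples:", round(mean_10y, 3))
print("mean estimate, 40-year samples:", round(mean_40y, 3))
```

In this sketch the average estimate exceeds the true value in both cases and the gap shrinks as the sample lengthens, which is the direction of bias discussed for the persistent level factor. The magnitudes depend entirely on the assumed parameter values and should not be read as results from the paper.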