A localization technique for ensemble Kalman filters

Kay Bergemann, Sebastian Reich*

January 21, 2010

arXiv:0909.1678v3 [math.NA] 21 Jan 2010

Abstract

Ensemble Kalman filter techniques are widely used to assimilate observations into dynamical models. The phase space dimension is typically much larger than the number of ensemble members, which leads to inaccurate results in the computed covariance matrices. These inaccuracies can lead, among other things, to spurious long range correlations, which can be eliminated by Schur-product-based localization techniques. In this paper, we propose a new technique for implementing such localization techniques within the class of ensemble transform/square root Kalman filters. Our approach relies on a continuous embedding of the Kalman filter update for the ensemble members, i.e., we state an ordinary differential equation (ODE) whose solutions, over a unit time interval, are equivalent to the Kalman filter update. The ODE formulation forms a gradient system with the observations as a cost functional. Besides localization, the new ODE ensemble formulation should also find useful applications in the context of nonlinear observation operators and observations arriving continuously in time.

Keywords. Data assimilation, ensemble Kalman filter, localization, continuous Kalman filter

*Universität Potsdam, Institut für Mathematik, Am Neuen Palais 10, D-14469 Potsdam, Germany

1 Introduction

We consider ordinary differential equations

$$\dot{x} = f(x,t) \qquad (1)$$

with state variable $x \in \mathbb{R}^n$. Initial conditions at time $t_0$ are not precisely known and we assume instead that

$$x(t_0) \sim N(x_0, B), \qquad (2)$$

where $N(x_0,B)$ denotes an $n$-dimensional Gaussian distribution with mean $x_0 \in \mathbb{R}^n$ and covariance matrix $B \in \mathbb{R}^{n\times n}$. We also assume that we obtain measurements $y(t_j) \in \mathbb{R}^k$ at discrete times $t_j \ge t_0$, $j = 0,1,\ldots,M$, subject to measurement errors, which are also Gaussian distributed with zero mean and covariance matrix $R \in \mathbb{R}^{k\times k}$, i.e.,

$$y(t_j) - Hx(t_j) \sim N(0,R). \qquad (3)$$

Here $H \in \mathbb{R}^{k\times n}$ is the (linear) measurement operator.

Data assimilation is the task of combining the model (1) (here assumed to be perfect), the knowledge about the initial conditions (2), and the available measurements (3) into a prediction of the probability distribution of the solution at any time $t > t_0$. We refer to Lewis et al. (2006) for a detailed introduction and available approaches to data assimilation. In this paper, we focus on the ensemble Kalman filter (EnKF) method, originally proposed by Evensen (see Evensen (2006) for a recent account) and, in particular, on ensemble transform (Bishop et al., 2001), ensemble adjustment (Anderson, 2001), and ensemble square root filters (Tippett et al., 2003) and their sequential implementation (Whitaker and Hamill, 2002; Anderson, 2003).

The EnKF relies on the simultaneous propagation of $m$ independent solutions $x_i(t)$, $i = 1,\ldots,m$, from which we can extract an empirical mean

$$\bar{x}(t) = \frac{1}{m}\sum_{i=1}^{m} x_i(t) \qquad (4)$$

and an empirical covariance matrix

$$P(t) = \frac{1}{m-1}\sum_{i=1}^{m} (x_i(t)-\bar{x}(t))(x_i(t)-\bar{x}(t))^T. \qquad (5)$$

In typical applications from meteorology, the ensemble size $m$ is much smaller than the dimension $n$ of the model phase space and, more importantly, also much smaller than the number of positive Lyapunov exponents. Hence $P(t)$ is highly rank deficient, which can lead to unreliable predictions. Ensemble localization has been introduced by Houtekamer and Mitchell (2001) and Hamill et al. (2001) to overcome this problem.
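To illustrate (4), (5), and the rank deficiency just mentioned, consider the following minimal NumPy sketch; the function name and the column-wise ensemble layout are our own conventions, not part of the paper.

```python
import numpy as np

def ensemble_stats(X):
    """Empirical mean (4) and covariance (5) of an ensemble.

    X : (n, m) array whose columns are the ensemble members x_i(t).
    """
    n, m = X.shape
    xbar = X.mean(axis=1)              # empirical mean, equation (4)
    Xp = X - xbar[:, None]             # deviations x_i - xbar
    P = Xp @ Xp.T / (m - 1)            # empirical covariance, equation (5)
    return xbar, P

# With m = 10 members in an n = 40 dimensional phase space, P has rank
# at most m - 1 = 9: the rank deficiency discussed above.
rng = np.random.default_rng(0)
xbar, P = ensemble_stats(rng.standard_normal((40, 10)))
print(np.linalg.matrix_rank(P))        # prints 9
```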
However, only two techniques are currently available to implement Schur-product-based localization within the framework of ensemble transform/square root Kalman filters. The first option is provided by a sequential processing of observations (Whitaker and Hamill, 2002; Anderson, 2003), while the deterministic ensemble Kalman filter (DEnKF) of Sakov and Oke (2008a) is a second, more recent, option. The DEnKF results in an approximate implementation of ensemble transform/square root Kalman filters. We also mention box/local analysis methods (Evensen, 2003; Ott et al., 2004; Hunt et al., 2007), which assimilate data locally in physical space and which therefore possess a "built in" localization.

In this paper, we demonstrate that techniques proposed by Bergemann et al. (2009) for the filter analysis step can be further generalized to an ordinary differential equation (ODE) formulation in terms of the ensemble members $x_i$, $i = 1,\ldots,m$. This formulation is subsequently used to derive an easy-to-implement localized ensemble Kalman filter, which can process observations simultaneously and can be extended to nonlinear observation operators.

2 Background material

We summarize a number of key results and techniques regarding ensemble Kalman filters. We refer to Evensen (2006) for an introduction and in-depth discussion of such filters.

2.1 Kalman analysis step

Let $n$ denote the dimension of the phase space of the problem. We consider an ensemble of $m$ members $x_i(t) \in \mathbb{R}^n$, which we collect in a matrix $X(t) \in \mathbb{R}^{n\times m}$. In terms of $X$, the ensemble mean is given by

$$\bar{x}(t) = \frac{1}{m} X(t)\, e \in \mathbb{R}^n \qquad (6)$$

and we introduce the ensemble deviation matrix

$$X'(t) = X(t) - \bar{x}(t)e^T \in \mathbb{R}^{n\times m}, \qquad (7)$$

where $e = (1,\ldots,1)^T \in \mathbb{R}^m$.

We now describe the basic Kalman analysis step. Let $\bar{x}_f$ and $X'_f$ denote the forecast mean and deviation matrix, respectively. The ensemble mean is updated according to

$$\bar{x}_a = \bar{x}_f - K(H\bar{x}_f - y), \qquad (8)$$

where

$$K = P_f H^T \left( H P_f H^T + R \right)^{-1} \qquad (9)$$

is the Kalman gain matrix with empirical covariance matrix

$$P_f = \frac{1}{m-1} X'_f (X'_f)^T, \qquad (10)$$

and $R \in \mathbb{R}^{k\times k}$ is the measurement error covariance matrix.

While the update of the mean is common to most ensemble Kalman filters, the update of the ensemble deviation matrix $X'_f$ can be implemented in several ways. In this paper, we focus on ensemble update techniques that employ either a transformation of the form

$$X'_a = A X'_f \qquad (11)$$

with an appropriate matrix $A \in \mathbb{R}^{n\times n}$ (Anderson, 2001) or a transformation

$$X'_a = X'_f\, T \qquad (12)$$

with an appropriate $T \in \mathbb{R}^{m\times m}$ (Bishop et al., 2001; Whitaker and Hamill, 2002; Tippett et al., 2003; Evensen, 2004). The matrices $A$ and $T$ are chosen such that the resulting ensemble deviation matrix $X'_a$ satisfies

$$P_a = \frac{1}{m-1} X'_a (X'_a)^T = (I - KH)P_f. \qquad (13)$$

It has been shown by Tippett et al. (2003) that both formulations (11) and (12) can be made equivalent not only in terms of (13) but also in terms of the resulting ensemble deviation matrix $X'_a$. Since $n \gg m$ in most applications, formulation (12) is generally preferred except when working in a sequential framework (Anderson, 2003).

Note that the transformation matrix $T$ should also satisfy $Te = e$ to guarantee $X'_a e = 0$ (Wang et al., 2004; Livings et al., 2008; Sakov and Oke, 2008b). Otherwise, the update of the ensemble deviation matrix would affect the update of the ensemble mean. Several methods have been proposed recently (including Sakov and Oke (2008a) and Bergemann et al. (2009)) that satisfy (13) only approximately.
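Before turning to these approximate variants, the exact analysis step (8)-(10), together with the target analysis covariance (13), can be summarized in a few lines of NumPy. This is a sketch for orientation only, assuming a linear operator $H$; the function and variable names are ours.

```python
import numpy as np

def kalman_analysis_mean(xbar_f, Xp_f, H, R, y):
    """Exact Kalman update of the ensemble mean, equations (8)-(10),
    together with the analysis covariance (13) that any transformation
    of the form (11) or (12) is required to reproduce."""
    n, m = Xp_f.shape
    Pf = Xp_f @ Xp_f.T / (m - 1)                       # forecast covariance (10)
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)     # Kalman gain (9)
    xbar_a = xbar_f - K @ (H @ xbar_f - y)             # mean update (8)
    Pa = (np.eye(n) - K @ H) @ Pf                      # target covariance (13)
    return xbar_a, Pa
```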
More specifically, Sakov and Oke (2008a) suggest using (11) with

$$A = I - \frac{1}{2} KH, \qquad (14)$$

while Bergemann et al. (2009) use numerical approximations to the underlying ODE formulation

$$\frac{d}{ds} Y = -\frac{1}{2(m-1)}\, Y Y^T H^T R^{-1} H Y \qquad (15)$$

in a fictitious time $s \in [0,1]$. See, for example, Simon (2006) for a derivation of (15). The initial condition is $Y(0) = X'_f$, and the updated ensemble deviation matrix, which satisfies (13) exactly, is provided by the solution at time $s = 1$, i.e.,

$$X'_a = Y(1). \qquad (16)$$

A typical numerical implementation of (15) uses two or four time-steps with the forward Euler method (Bergemann et al., 2009). The resulting transformation of the forecast into the analyzed ensemble deviation matrix is of the form (11) with $A$ defined through the time-stepping method.

Note that the Kalman gain matrix (9) is equivalent to

$$K = P_a H^T R^{-1}, \qquad (17)$$

which is advantageous in connection with (15) since only the inversion of the measurement error covariance matrix $R \in \mathbb{R}^{k\times k}$ is now required to implement a complete Kalman analysis step. Algorithmically, one would first update the ensemble deviation matrix using (15), then form the analyzed ensemble covariance matrix $P_a = X'_a (X'_a)^T/(m-1)$ as well as the Kalman gain matrix (17), and finally update the ensemble mean using (8).

All methods discussed so far have in common that the Kalman update increments for the ensemble mean and ensemble deviation matrix lie in an $(m-1)$-dimensional subspace, denoted by $S_f \subset \mathbb{R}^n$. This space is defined by the range/image of the forecast ensemble deviation matrix $X'_f$. Bergemann et al. (2009) introduced a continuous matrix factorization algorithm for the ensemble $X(t)$ which automatically produces orthogonal vectors that span $S_f$.

It is common practice to apply variance inflation (Anderson and Anderson, 1999) to $X'(t_j)$ before the forecasted ensemble is updated in the Kalman filter analysis step, i.e., the Kalman analysis step uses

$$X_f := \bar{x}(t_j) e^T + \delta X'(t_j), \qquad (18)$$

where $\delta \ge 1$ is an inflation factor, instead of $X_f = X(t_j)$.

2.2 Localization

The idea of localization, as proposed by Houtekamer and Mitchell (2001), is to replace the matrices $HP_f$ and $HP_f H^T$ in the Kalman gain matrix (9) by

$$\widetilde{HP_f} = C_{loc,1} \circ (HP_f), \qquad \widetilde{HP_f H^T} = C_{loc,2} \circ (HP_f H^T), \qquad (19)$$

respectively, where $C_{loc,1} \in \mathbb{R}^{k\times n}$ and $C_{loc,2} \in \mathbb{R}^{k\times k}$ are appropriate localization matrices based on filter functions suggested by Gaspari and Cohn (1999), and $C \circ Y$ denotes the Schur product of two matrices $C$ and $Y$ of identical dimension, i.e.,

$$(C \circ Y)_{i,j} = (C)_{i,j} (Y)_{i,j} \qquad (20)$$

for all indices $i,j$. We denote the resulting modified Kalman gain matrix by $K_{loc,f}$, i.e.,

$$K_{loc,f} = (\widetilde{HP_f})^T \left( \widetilde{HP_f H^T} + R \right)^{-1}. \qquad (21)$$

Localization was also proposed by Hamill et al. (2001), with the only difference that localization is not applied to $HP_f H^T$.

Alternatively, one can localize the Kalman gain matrix formulation (17) and use

$$K_{loc,a} = (\widetilde{HP_a})^T R^{-1} \qquad (22)$$

instead of (9) in an ensemble Kalman filter. Note that (21) and (22) are not equivalent in general and that formulation (22) is easier to implement.

Based on these modified Kalman gain matrices, a Schur-product-based localization is easy to apply to the update (8) of the ensemble mean and to ensemble deviation updates that use perturbed observations (Burgers et al., 1998), which is essentially the localization approach of Houtekamer and Mitchell (2001) and Hamill et al. (2001).
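In code, the localized gain (21) amounts to two elementwise products and one $k \times k$ inversion. The following is a minimal sketch under our own naming; the taper matrices are assumed to be given (e.g., built from a Gaspari and Cohn (1999) filter function).

```python
import numpy as np

def localized_gain(Xp, H, C_loc1, C_loc2, R):
    """Localized Kalman gain (21) via the Schur products (19)-(20).

    Applied with the forecast deviations X'_f this yields K_loc,f; the
    alternative gain (22) instead tapers HP_a, formed from the analysis
    deviations, and multiplies by R^{-1}.
    """
    m = Xp.shape[1]
    HXp = H @ Xp
    HP = C_loc1 * (HXp @ Xp.T) / (m - 1)      # C_loc,1 o (H P), cf. (19)
    HPHT = C_loc2 * (HXp @ HXp.T) / (m - 1)   # C_loc,2 o (H P H^T), cf. (19)
    return HP.T @ np.linalg.inv(HPHT + R)     # gain (21)
```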
However, the popular class of ensemble transform/square root filters, based on (12), has not yet been amenable to Schur-product-based localizations except when observations are treated sequentially, i.e., when $k = 1$ in each transformation step (Whitaker and Hamill, 2002). It is conceivable that localizations could be implemented for ensemble deviation updates of the form (11) through an appropriate modification of the ensemble adjustment technique of Anderson (2001). However, such a modification would lead to a computationally expensive implementation of Schur-product-based localizations. The recently proposed DEnKF filter of Sakov and Oke (2008a), on the other hand, leads to a computationally feasible implementation with the localization directly applied to (14), i.e., one uses

$$A = I - \frac{1}{2} K_{loc,f} H \qquad (23)$$

in (11).

We note that localization generally implies that the update increments for the ensemble mean and the ensemble deviation matrix no longer lie in the subspace $S_f$ defined by the range/image of $X'_f$. While this is a desirable property on the one hand, it can lead to unbalanced fields in the analyzed ensemble $X_a$ on the other hand. This has been investigated, for example, by Houtekamer and Mitchell (2005) and Kepert (2009).

We finally mention an alternative approach to localization. The box/local EnKF filters of Evensen (2003), Ott et al. (2004), and Hunt et al. (2007) assimilate data locally in physical space and possess a "built in" localization based on the spatial structure of the underlying partial differential equation model.

3 Localization based on continuous ensemble updates

We now describe an alternative for introducing localization, which is based on a generalization of the ODE formulation (15) and which leads to an ODE formulation directly in the ensemble members $x_i$.

We first note that the Kalman update (8) for the ensemble mean can also be formulated in terms of an ODE, i.e.,

$$\frac{d}{ds}\bar{x} = -\frac{1}{m-1}\, Y Y^T H^T R^{-1} (H\bar{x} - y) \qquad (24)$$

with $\bar{x}(0) = \bar{x}_f$ and $\bar{x}_a = \bar{x}(1)$. See, for example, Simon (2006) for a derivation of (24).

To further reveal the underlying mathematical structure of (15) and (24), we introduce the cost functional

$$S(x) = \frac{1}{2}(Hx - y)^T R^{-1} (Hx - y) \qquad (25)$$

for each set of observations. Next we combine (15) and (24) to give rise to the differential equations

$$\frac{d}{ds}x_i = -\frac{1}{2} P \left\{ \nabla_{x_i} S(x_i) + \nabla_{\bar{x}} S(\bar{x}) \right\} \qquad (26)$$

in the ensemble members $x_i$, $i = 1,\ldots,m$. The equations are closed through the standard definition

$$\bar{x}(s) = \frac{1}{m}\sum_{i=1}^{m} x_i(s) \qquad (27)$$

for the mean and

$$P(s) = \frac{1}{m-1}\sum_{i=1}^{m} (x_i(s) - \bar{x}(s))(x_i(s) - \bar{x}(s))^T \qquad (28)$$

for the covariance matrix. Since the covariance matrix $P$ is symmetric, a straightforward calculation reveals that

$$\frac{d}{ds}\left\{ S(\bar{x}) + \frac{1}{m}\sum_{i=1}^{m} S(x_i) \right\} \le 0 \qquad (29)$$

along solutions of (26). More precisely, (26) is equivalent to the gradient system

$$\frac{d}{ds}x_i = -P\, \nabla_{x_i} \mathcal{V}(X) \qquad (30)$$

in the ensemble matrix $X(s)$ with potential

$$\mathcal{V}(X) = \frac{m}{2}\left\{ S(\bar{x}) + \frac{1}{m}\sum_{i=1}^{m} S(x_i) \right\}. \qquad (31)$$

The actual decay of the potential $\mathcal{V}$ along solutions of (30) depends crucially on the covariance matrix $P$.

We note that Schur-product-based localizations can easily be applied to (26) to obtain, e.g.,

$$\frac{d}{ds}x_i = -\frac{1}{2} \widetilde{P} \left\{ \nabla_{x_i} S(x_i) + \nabla_{\bar{x}} S(\bar{x}) \right\}, \qquad \widetilde{P} = C_{loc} \circ P, \qquad (32)$$

or

$$\frac{d}{ds}x_i = -\frac{1}{2} (\widetilde{HP})^T R^{-1} \left\{ Hx_i + H\bar{x} - 2y \right\}, \qquad \widetilde{HP} = C_{loc,1} \circ (HP), \qquad (33)$$

in the case of linear observation operators. These modified ensemble update formulations are easy to implement numerically. See Section 4 for details.
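A direct transcription of the localized update (33) into NumPy might read as follows. This is our sketch, anticipating the forward Euler discretization discussed in Section 4; the taper matrix `C_loc1` is assumed to be given, and the matrix $\widetilde{HP}(s)$ is recomputed in each step since it varies along solutions.

```python
import numpy as np

def cenkf_update(X, H, Rinv, C_loc1, y, n_steps=4):
    """Localized continuous ensemble update (33), advanced with forward Euler.

    X      : (n, m) ensemble matrix; columns are the members x_i
    Rinv   : (k, k) inverse observation error covariance
    C_loc1 : (k, n) localization taper applied to HP via the Schur product
    """
    m = X.shape[1]
    ds = 1.0 / n_steps
    for _ in range(n_steps):
        xbar = X.mean(axis=1)
        Xp = X - xbar[:, None]
        HP = (H @ Xp) @ Xp.T / (m - 1)       # HP(s), a k x n matrix
        HP_loc = C_loc1 * HP                 # Schur product, cf. (33)
        # dx_i/ds = -1/2 (HP~)^T R^{-1} (H x_i + H xbar - 2y), all i at once
        G = Rinv @ (H @ X + (H @ xbar - 2.0 * y)[:, None])
        X = X - 0.5 * ds * (HP_loc.T @ G)
    return X
```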
4 Numerical implementation aspects

The various ODE formulations for the ensemble members $x_i$, $i = 1,\ldots,m$, need to be solved over a unit time interval with initial conditions provided by the forecast values $x_{i,f}$ of the ensemble members. We apply the forward Euler method with step-size $\Delta s = 1/4$ (four time-steps) for our experiments. We found that $\Delta s = 1$ (a single time-step) and $\Delta s = 1/3$ (three time-steps) lead to unstable simulations, while $\Delta s = 1/2$ (two time-steps) leads to occasional instabilities for larger values of the ensemble inflation factor $\delta$ in (18). On the other hand, increasing the number of time-steps beyond four did not change the results significantly. We also expect that four time-steps will generally be sufficient in practical applications unless observations strongly contradict their forecast values and large gradient values are generated in (26). The same consideration applies to simulations with large inflation factors $\delta$ in (18). As a safeguard, one can monitor the decay of the potential (31) along numerically generated solutions and adjust the step-size $\Delta s$ if necessary.

Note that the continuous formulations do not require matrix inversions/factorizations except for the computation of $R^{-1}$. The computational cost of localization can be reduced even further by using the following approximation. The matrix $\widetilde{HP}$ in (33) varies along solutions, and an approximate formulation is obtained by replacing $\widetilde{HP}(s)$ by its value at $s = 0$ for all $s > 0$. This leads to a linear ODE in the ensemble members $x_i$ with constant coefficient matrix, i.e.,

$$\frac{d}{ds}x_i = -\frac{1}{2} (\widetilde{HP}(0))^T R^{-1} \left\{ Hx_i + H\bar{x} - 2y \right\}. \qquad (34)$$

Note that $\widetilde{HP}$ and $\widetilde{HPH^T}$ are sparse matrices for compactly supported filter functions (Gaspari and Cohn, 1999). Numerical implementations of (34) should first update the increments $z_i = Hx_i - y$ with Euler's method, i.e.,

$$z_i^{l+1} = z_i^l - \frac{\Delta s}{2}\, \widetilde{HPH^T}(0)\, R^{-1} \left\{ z_i^l + \frac{1}{m}\sum_{j=1}^{m} z_j^l \right\}, \qquad l = 0,\ldots,L, \qquad (35)$$

with $L = 1/\Delta s$ the number of integration steps, and then use the accumulated increments

$$z_i = \sum_{l=0}^{L-1} z_i^l \qquad (36)$$

to compute the final update of the ensemble members $x_i$, $i = 1,\ldots,m$. Overall, matrix-vector products induce a computational complexity of $\mathcal{O}(km)$ in the ensemble size $m$ and the number of observations $k$, independent of the system size $n$. The same order of complexity applies to the serial algorithm of Hamill et al. (2001), with the important difference that (34) can be implemented as a simultaneous update over all observations.

Bergemann et al. (2009) proposed a re-orthogonalization technique for the ensemble deviation matrix $X'$. It should be noted that the re-orthogonalization is not uniquely defined. We implemented several variants of re-orthogonalization in combination with localization but did not find any significant improvements in the results.

[Figure 1: The best RMS error for the Lorenz-96 model with an ensemble size of $m = 10$ and $k = 20$ observations taken in intervals of $\Delta t_{obs} = 0.05$ over a total of 5000 assimilation cycles. Curves: EnKF, ESRF, DEnKF, CEnKF-I ($\Delta s = 1/4$), CEnKF-I ($\Delta s = 1/6$), CEnKF-II ($\Delta s = 1/4$); x-axis: localization radius; y-axis: optimal RMS error.]

5 Numerical experiments

We now report results from two test problems and implementations of (33) and (34) with localization. The results are compared to those from standard ensemble Kalman filter techniques.
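As a concrete reading of the frozen-coefficient update (34)-(36) used below (denoted CEnKF-II in Section 5.1), consider the following sketch. The back-mapping of the accumulated observation-space increments to state space is our interpretation of the algorithm, obtained by substituting the Euler sums back into (34); the tapered matrices are passed in dense form for simplicity, although they are sparse for compactly supported filter functions.

```python
import numpy as np

def cenkf2_update(X, H, Rinv, HP0_loc, HPHT0_loc, y, n_steps=4):
    """Frozen-coefficient variant (34)-(36), forward Euler in observation space.

    HP0_loc   : (k, n) tapered HP, frozen at s = 0
    HPHT0_loc : (k, k) tapered H P H^T, frozen at s = 0
    """
    ds = 1.0 / n_steps
    Z = H @ X - y[:, None]            # columns z_i^0 = H x_i - y
    Zsum = np.zeros_like(Z)           # accumulated increments, cf. (36)
    A = HPHT0_loc @ Rinv              # constant coefficient matrix in (35)
    for _ in range(n_steps):
        Zsum += Z                     # accumulate z_i^l, l = 0, ..., L-1
        Zbar = Z.mean(axis=1, keepdims=True)
        Z = Z - 0.5 * ds * (A @ (Z + Zbar))    # Euler step (35)
    # Map the accumulated observation-space increments back to state space;
    # this back-substitution into (34) is our reading of the final update.
    Zbar_sum = Zsum.mean(axis=1, keepdims=True)
    return X - 0.5 * ds * (HP0_loc.T @ (Rinv @ (Zsum + Zbar_sum)))
```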
5.1 Lorenz-96 model

The standard implementation of the Lorenz-96 model (Lorenz, 1996; Lorenz and Emanuel, 1998) has state vector $x = (x_1,\ldots,x_n)^T \in \mathbb{R}^n$, $n = 40$, and its time evolution is given by the differential equations

$$\dot{x}_j = (x_{j+1} - x_{j-2})x_{j-1} - x_j + 8 \qquad (37)$$

for $j = 1,\ldots,n$. To close the equations, we define $x_{-1} = x_{39}$, $x_0 = x_{40}$, and $x_{41} = x_1$. (A NumPy sketch of this right-hand side and of the localization taper used below is given after Table 3.)

The attractor of this standard implementation has a fractal dimension of about 27 and 13 positive Lyapunov exponents. Localization will be necessary for ensembles with $m \le 13$ ensemble members. We use an ensemble size of $m = 10$ in our experiments. We observe every second grid point, i.e., $k = 20$, and the measurement error covariance is $R = I_{20}$. Measurements are taken in time intervals of $\Delta t_{obs} = 0.05$. After a short spin-up period, a total of $J = 5000$ analysis steps are performed in each experiment. The "true" trajectory $x_{truth}(t_n)$ is generated by integrating the Lorenz-96 model with the implicit midpoint rule and step-size $\Delta t = 0.005$, i.e., we assume that there is no model error. The observations are obtained according to

$$y(t_{obs}) = Hx_{truth}(t_{obs}) + r(t_{obs}), \qquad (38)$$

where $r(t_{obs})$ are i.i.d. Gaussian random numbers with mean zero and covariance matrix $R$.

We implement localization combined with standard ensemble inflation for the following five different ensemble Kalman filters: (i) EnKF with perturbed observations (Burgers et al., 1998; Houtekamer and Mitchell, 2005), (ii) ensemble square root filter (ESRF) with sequential treatment of observations (Whitaker and Hamill, 2002), (iii) DEnKF (Sakov and Oke, 2008a), (iv) formulation (33), denoted CEnKF-I, and (v) formulation (34), denoted CEnKF-II. We implement CEnKF-I with $\Delta s = 1/4$ and $\Delta s = 1/6$, respectively, to demonstrate the impact of the discretization parameter on the results.

For simplicity, localization is performed by multiplying each element of the matrices $\widetilde{HP}$ and $\widetilde{HPH^T}$, respectively, by a distance-dependent factor $\rho_{i,i'}$. This factor is defined by the compactly supported localization function (4.10) from Gaspari and Cohn (1999), the distance $r_{i,i'} = \min\{|i-i'|,\, n-|i-i'|\}$, where $i$ and $i'$ denote the indices of the associated observation/grid points $x_i$ and $x_{i'}$, respectively, and a fixed localization radius $r_0$. The localization radius is varied between $r_0 = 2$ and $r_0 = 30$. The inflation factor $\delta$ in (18) is taken from the range $[\sqrt{1.02}, \sqrt{1.16}]$.

In Figure 1, we display the RMS error

$$\mathrm{rmse} = \sqrt{ \frac{1}{nJ} \sum_{j=1}^{J} \left\| \bar{x}(j \cdot \Delta t_{obs}) - x_{truth}(j \cdot \Delta t_{obs}) \right\|^2 } \qquad (39)$$

for an optimally chosen inflation factor $\delta$ as a function of the localization radius $r_0$. Results are displayed only for those localization radii $r_0$ that lead to at least one simulation with an RMS error of less than one. We conclude that the EnKF yields the lowest filter skill, while all other methods show an almost identical performance.

Table 1: Mean RMS error for localized CEnKF-I over 4000 time steps as a function of the localization radius $r_0$ and the inflation factor $\delta$. For clarity, the value Inf is assigned if the RMS error exceeds the value 2.0 (no filter skill).

δ \ r0 |    5     10     15     20     25     30     35
-------|------------------------------------------------
 1.02  |  0.59   0.62   0.72   0.96   1.41   Inf    Inf
 1.06  |  0.80   0.67   0.62   0.65   0.79   0.99   1.66
 1.10  |  1.09   0.85   0.72   0.69   0.75   1.04   1.13
 1.14  |  1.38   1.01   0.84   0.77   0.78   0.88   1.40
 1.18  |  Inf    1.18   0.96   0.84   0.82   0.86   1.35
Table 2: Mean RMS error for localized CEnKF-II over 4000 time steps as a function of the localization radius $r_0$ and the inflation factor $\delta$.

δ \ r0 |    5     10     15     20     25     30     35
-------|------------------------------------------------
 1.02  |  0.60   0.65   0.77   1.01   1.31   1.80   Inf
 1.06  |  0.80   0.66   0.63   0.68   0.86   1.13   1.51
 1.10  |  1.11   0.84   0.72   0.69   0.74   0.93   1.29
 1.14  |  1.42   1.04   0.83   0.76   0.78   0.86   1.14
 1.18  |  Inf    1.21   0.96   0.85   0.82   0.86   1.16

Table 3: Mean RMS error for localized DEnKF over 4000 time steps as a function of the localization radius $r_0$ and the inflation factor $\delta$.

δ \ r0 |    5     10     15     20     25     30     35
-------|------------------------------------------------
 1.02  |  0.59   0.62   0.75   0.94   1.06   Inf    Inf
 1.06  |  0.82   0.69   0.64   0.68   0.73   0.98   Inf
 1.10  |  1.16   0.89   0.75   0.70   0.72   0.85   1.22
 1.14  |  1.50   1.11   0.89   0.80   0.77   0.85   0.99
 1.18  |  Inf    1.33   1.05   0.91   0.87   0.87   1.01
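As referenced above, here is a minimal sketch of the Lorenz-96 right-hand side (37) and of the ingredients of the taper $\rho_{i,i'}$: the periodic grid distance and the compactly supported function (4.10) of Gaspari and Cohn (1999) in its standard piecewise-rational form. Names, vectorization choices, and the convention that $r_0$ equals the half-width of the taper (so it vanishes beyond $2r_0$) are our assumptions.

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    """Right-hand side (37): dx_j/dt = (x_{j+1} - x_{j-2}) x_{j-1} - x_j + F,
    with the cyclic closure x_{-1} = x_{n-1}, x_0 = x_n, x_{n+1} = x_1."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def periodic_distance(i, j, n):
    """Grid distance r_{i,i'} = min(|i - i'|, n - |i - i'|) on the ring."""
    d = np.abs(i - j)
    return np.minimum(d, n - d)

def gaspari_cohn(r):
    """Compactly supported correlation function (4.10) of Gaspari and Cohn (1999).

    r : distances already scaled by the localization half-width, so the
        taper equals 1 at r = 0 and vanishes for r >= 2.
    """
    r = np.atleast_1d(np.abs(r)).astype(float)
    rho = np.zeros_like(r)
    a = r <= 1.0
    b = (r > 1.0) & (r < 2.0)
    ra, rb = r[a], r[b]
    rho[a] = 1.0 - (5.0/3.0)*ra**2 + (5.0/8.0)*ra**3 + 0.5*ra**4 - 0.25*ra**5
    rho[b] = (4.0 - 5.0*rb + (5.0/3.0)*rb**2 + (5.0/8.0)*rb**3
              - 0.5*rb**4 + (1.0/12.0)*rb**5 - 2.0/(3.0*rb))
    return rho

# Example: taper matrix C_loc,1 for n = 40 grid points, every second one
# observed (k = 20), and localization radius r0, cf. Section 5.1.
n, r0 = 40, 10
obs_idx = np.arange(0, n, 2)                  # observed grid points
I, J = np.meshgrid(obs_idx, np.arange(n), indexing="ij")
C_loc1 = gaspari_cohn(periodic_distance(I, J, n) / r0)   # k x n taper
```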
