Convolution particle filters for parameter estimation in general state-space models

Fabien Campillo and Vivien Rossi

To cite this version: Fabien Campillo, Vivien Rossi. Convolution particle filters for parameter estimation in general state-space models. [Research Report] RR-5939, INRIA. 2006, pp.28. inria-00081956v2

HAL Id: inria-00081956
https://hal.inria.fr/inria-00081956v2
Submitted on 27 Jun 2006

Institut National de Recherche en Informatique et en Automatique (INRIA)
Systèmes numériques, Aspi project
Research report no. 5939, June 2006, 25 pages
ISSN 0249-6399

Abstract: The state-space modeling of partially observed dynamic systems generally requires estimates of unknown parameters. From a practical point of view, it is relevant in such filtering contexts to simultaneously estimate the unknown states and parameters. Efficient simulation-based methods using convolution particle filters are proposed. The regularization properties of these filters are well suited to the context of parameter estimation.
First, the usual non-Bayesian statistical estimates are considered: the conditional least squares estimate (CLSE) and the maximum likelihood estimate (MLE). Second, in a Bayesian context, a Monte Carlo type method is presented. Finally, these methods are compared in several simulated case studies.

Key-words: Hidden Markov models, parameter estimation, particle filter, convolution kernels, conditional least squares estimate, maximum likelihood estimate

French title: Filtres particulaires à convolution pour l'estimation de paramètres dans des modèles à espace d'état généraux

Unité de recherche INRIA Rennes, IRISA, Campus universitaire de Beaulieu, 35042 Rennes Cedex (France)
Telephone: 02 99 84 71 00 (international: +33 2 99 84 71 00). Fax: 02 99 84 71 71 (international: +33 2 99 84 71 71)

Contents

1 Introduction
2 The convolution filters
  2.1 The simple convolution filter (CF)
  2.2 The resampled convolution filter (R-CF)
  2.3 Comments
3 Conditional least squares estimate
  3.1 The theoretical estimate
  3.2 The practical estimate
4 Maximum likelihood estimate
  4.1 Maximum likelihood estimation with the CF
  4.2 Maximum likelihood estimation with the R-CF
5 Optimization difficulties
6 R-CF with unknown parameters approach
  6.1 Algorithm
  6.2 Theoretical study
7 Simulated case studies
  7.1 Comparison of the approaches
    7.1.1 Least squares estimate
    7.1.2 Maximum likelihood estimate
    7.1.3 R-CF based parameter estimate
  7.2 Bearings-only tracking
8 Conclusion and discussion
A Kernel estimation
B Proof of theorem 2

1 Introduction

Consider a general state-space dynamical system described by an unobserved state process $x_t$ and an observation process $y_t$ taking values in $\mathbb{R}^d$ and $\mathbb{R}^q$ respectively. This system depends on an unknown parameter $\theta \in \mathbb{R}^p$. Suppose that the state process is Markovian, and that the observations $y_t$ are independent conditionally on the state process. Suppose also that the distribution law of $y_t$ depends only on $x_t$. Hence this system is completely described by the state process transition density and the emission density, namely

$$
\begin{aligned}
x_t \mid x_{t-1} &\sim f_t(x_t \mid x_{t-1}; \theta), \\
y_t \mid x_t &\sim h_t(y_t \mid x_t; \theta),
\end{aligned}
\tag{1}
$$

and by the initial density law $\pi_0$ of $x_0$.

The goal is to estimate simultaneously the parameter $\theta$ and the state process $x_t$ based on the observations $y_{1:t} = \{y_1, \dots, y_t\}$.

In the nonlinear hidden processes framework, the parameter estimation procedure is often based on an approximation of the optimal filter. The extended Kalman filter and its various alternatives can give good results in practice but suffer from an absence of theoretical backing. Particle filters offer a good alternative: in many practical cases they give better results, and their theoretical properties are becoming increasingly well understood [1] [2] [3].

It is thus particularly appealing to use particle filtering to estimate parameters in partially observed systems. For a review of the question, one can consult Doucet [4] or Liu & West [5]. There are two main approaches:

• The non-Bayesian approach, which consists of minimizing a given cost function such as the conditional least squares criterion, or of maximizing the likelihood function. These methods are usually performed in batch processes but can also be extended to recursive procedures.

• The Bayesian approach, where an augmented state variable which includes the parameter is processed by a filtering procedure. These methods suppose that a prior law is given for the parameter and are performed on-line.

In practice, the first approach could be used as an initialization for the second one.

Because the system is only partially observed, the objective function introduced in the first approach must be approximated for various values of the parameter $\theta$. This is done via the particle approximation of the conditional law $p(y_t \mid y_{1:t-1}; \theta)$. The Monte Carlo nature of this particle approximation makes the optimization problematic. However, recent analyses propose significant improvements of these aspects [6] [4].

The second approach takes place in a classical Bayesian framework: a prior probability law $\rho(\theta)$ is introduced on the parameter $\theta$. A new state variable $(x_t, \theta_t)$, joining all the unknown quantities, is considered and the posterior law $p(x_t, \theta_t \mid y_{1:t})$ is then approximated using particle filters.

In this paper we propose and compare different estimates corresponding to these two approaches and based on the convolution particle filter introduced in [7].

The paper is organized as follows. We first recall the principle of the convolution filter for dynamical systems without unknown parameters. The application and the convergence analysis of this filter require weaker assumptions than the usual particle filters, because convolution kernels are used to weight the particles. Then the conditional least squares estimate and the maximum likelihood estimate are presented. Their adaptation to the state-space model context is possible thanks to the convolution filters. The computation of these two estimates calls upon an optimization procedure, which is problematic for particle filters because of their random nature. Next, the Bayesian estimation approach is presented; it also relies on the convolution particle filter. In this context, the standard particle approach exhibits various drawbacks that the convolution filters avoid thanks to their smooth nature. Finally, these various estimates are compared in a range of simulated cases.
2 The convolution filters

To present the convolution filter, suppose that the parameter $\theta$ is known and consider:

$$
\begin{aligned}
x_t \mid x_{t-1} &\sim f_t(x_t \mid x_{t-1}), \\
y_t \mid x_t &\sim h_t(y_t \mid x_t).
\end{aligned}
\tag{2}
$$

The objective is to estimate recursively the optimal filter

$$
p(x_t \mid y_{1:t}) = \frac{p(x_t, y_{1:t})}{p(y_{1:t})} = \frac{p(x_t, y_{1:t})}{\int p(x_t, y_{1:t}) \, dx_t}
\tag{3}
$$

where $p(x_t, y_{1:t})$ is the joint density of $(x_t, y_{1:t})$.

Assumption: Suppose that we know how to sample from the laws $f_t(\cdot \mid x_{t-1})$, $h_t(\cdot \mid x_t)$ and also from the initial law $\pi_0$.

Note that an explicit expression of the conditional densities $f_t$ and $h_t$ is not needed, whereas the standard particle filtering approaches require $h_t$ to be stated explicitly. For example, in the case of observation equations like $y_t = H(x_t, v_t)$ or $H(x_t, y_t, v_t) = 0$, where $v_t$ is a noise, the conditional density $h_t$ is in general not available.

2.1 The simple convolution filter (CF)

Let $\{x_0^i\}_{i=1 \cdots n}$ be a sample of size $n$ of $\pi_0$. For all $i = 1 \cdots n$, starting from $x_0^i$, $t$ successive simulations from the system (2) lead to a sample $\{x_t^i, y_{1:t}^i\}_{i=1 \cdots n}$ from $p(x_t, y_{1:t})$. We get the following empirical estimate of the joint density:

$$
p(x_t, y_{1:t}) \simeq \frac{1}{n} \sum_{i=1}^{n} \delta_{(x_t^i, y_{1:t}^i)}(x_t, y_{1:t})
\tag{4}
$$

where $\delta_x$ is the Dirac measure in $x$.

The kernel estimate $p_t^n(x_t, y_{1:t})$ of $p(x_t, y_{1:t})$ is then obtained by convolution of the empirical measure (4) with an appropriate kernel (cf. Appendix A):

$$
p_t^n(x_t, y_{1:t}) \stackrel{\text{def}}{=} \frac{1}{n} \sum_{i=1}^{n} K^x_{h_n}(x_t - x_t^i) \, K^{\bar{y}}_{h_n}(y_{1:t} - y_{1:t}^i)
$$

where

$$
K^{\bar{y}}_{h_n}(y_{1:t} - y_{1:t}^i) \stackrel{\text{def}}{=} \prod_{s=1}^{t} K^y_{h_n}(y_s - y_s^i),
$$

in which $K^x_{h_n}$, $K^y_{h_n}$ are Parzen-Rosenblatt kernels of appropriate dimensions. Note that in $K^x_{h_n}(x_t - x_t^i)$ (resp. $K^y_{h_n}(y_t - y_t^i)$), $h_n$ could implicitly depend on $n$, $d$ and $x_t^{1:n}$ (resp. $n$, $q$ and $y_t^{1:n}$) (see Section 2.3).

From (3), an estimate of the optimal filter is then:

$$
p_t^n(x_t \mid y_{1:t}) \stackrel{\text{def}}{=} \frac{\sum_{i=1}^{n} K^x_{h_n}(x_t - x_t^i) \, K^{\bar{y}}_{h_n}(y_{1:t} - y_{1:t}^i)}{\sum_{i=1}^{n} K^{\bar{y}}_{h_n}(y_{1:t} - y_{1:t}^i)}
\tag{5}
$$

The basic convolution filter (CF) is defined by the density estimate (5). A simple recursive algorithm for its practical computation is presented in Table 1.

Table 1: The simple convolution filter (CF).

  for $t = 0$
    initial sampling: $x_0^1, \dots, x_0^n \sim \pi_0$
    weight initialization: $w_0^i \leftarrow 1$ for $i = 1:n$
  for $t \geq 1$
    for $i = 1:n$
      state sampling: $x_t^i \sim f_t(\cdot \mid x_{t-1}^i)$
      observation sampling: $y_t^i \sim h_t(\cdot \mid x_t^i)$
      weight updating: $w_t^i \leftarrow w_{t-1}^i \, K^y_{h_n}(y_t - y_t^i)$
    filter updating: $p_t^n(x_t \mid y_{1:t}) = \dfrac{\sum_{i=1}^{n} w_t^i \, K^x_{h_n}(x_t - x_t^i)}{\sum_{i=1}^{n} w_t^i}$

Convergence of $p_t^n(x_t \mid y_{1:t})$ to the optimal filter is ensured [7] when $h_n \to 0$ and $n h_n^{tq+d} \to \infty$. As for Monte Carlo filters without resampling, this implies that $n$ must grow with $t$ to maintain a good estimation. A better approach with a resampling step is proposed in the next section.
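The recursion of the simple convolution filter can be sketched in a few lines of Python. This is a minimal illustration for scalar $x_t$ and $y_t$, not the authors' implementation: the Gaussian kernel, the fixed bandwidths `h_y` and `h_x`, the toy model in the usage section and all function names are assumptions made for the example. Note that only samplers are passed in; the densities $f_t$ and $h_t$ are never evaluated.

```python
import math
import random


def gaussian_kernel(u, h):
    """Rescaled one-dimensional Gaussian kernel K_h(u) = K(u / h) / h."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def convolution_filter(ys, sample_x0, sample_f, sample_h, n, h_y, rng):
    """Simple convolution filter (CF) for scalar states and observations.

    Returns the particles x_t^i and the unnormalized weights w_t^i at the
    final time. The bandwidth h_y is kept fixed here (an assumption).
    """
    xs = [sample_x0(rng) for _ in range(n)]   # initial sampling from pi_0
    ws = [1.0] * n                            # weight initialization
    for y in ys:                              # loop over t = 1, 2, ...
        for i in range(n):
            xs[i] = sample_f(xs[i], rng)      # state sampling
            y_sim = sample_h(xs[i], rng)      # observation sampling
            ws[i] *= gaussian_kernel(y - y_sim, h_y)  # weight updating
    return xs, ws


def filter_density(x, xs, ws, h_x):
    """Weighted kernel estimate of the filter density, evaluated at x."""
    total = sum(ws)
    return sum(w * gaussian_kernel(x - xi, h_x) for xi, w in zip(xs, ws)) / total


def filter_mean(xs, ws):
    """Weighted particle mean, used as a point estimate of x_t given y_{1:t}."""
    return sum(w * xi for xi, w in zip(xs, ws)) / sum(ws)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy model: x_t = 0.9 x_{t-1} + noise, y_t = x_t + noise.
    sample_x0 = lambda r: r.gauss(0.0, 1.0)
    sample_f = lambda x, r: 0.9 * x + r.gauss(0.0, 0.5)
    sample_h = lambda x, r: x + r.gauss(0.0, 0.5)

    # Simulate a short true trajectory and its observations.
    x, ys = sample_x0(rng), []
    for _ in range(5):
        x = sample_f(x, rng)
        ys.append(sample_h(x, rng))

    xs, ws = convolution_filter(ys, sample_x0, sample_f, sample_h,
                                n=2000, h_y=0.3, rng=rng)
    print("true state:", x, "CF estimate:", filter_mean(xs, ws))
```

Because the weights are products of kernel values over all past observations and are never reset, they degenerate as $t$ grows, which is the practical counterpart of the condition that $n$ must grow with $t$. The sketch is therefore only reasonable for short observation records.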