ebook img

Tutorial on ABC rejection and ABC SMC for parameter estimation and model selection PDF

0.95 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Tutorial on ABC rejection and ABC SMC for parameter estimation and model selection

Tutorial on ABC rejection and ABC SMC for parameter estimation and model selection Tina Toni, Michael P. H. Stumpf [email protected], [email protected] 0 1 In this tutorial we schematically illustrate four algorithms: 0 2 1. ABC rejection for parameter estimation [1, 2], n a 2. ABC SMC for parameter estimation [3, 4, 5], J 9 3. ABC rejection for model selection on the joint space [6], 1 ] 4. ABC SMC for model selection on the joint space [7]. O C We suggest to read this tutorial from the beginning. We start with a detailed t. explanation of the ABC rejection algorithm, which later helps to understand ABC SMC a as it is based on the same concepts. Also, both model selection algorithms are closely t s [ related to parameter estimation algorithms and it is therefore helpful to understand those first. 2 v 2 This tutorial forms a part of the supplementary material of the paper ”Simulation- 7 based model selection for dynamical systems in systems and population biology, 4 4 Bioinformatics, 26 (1), 104-110, 2010” (T. Toni, M. P. H. Stumpf). . 0 1 9 0 : v i X r a 1 o o o o o oo oo o o o o o o o o o o o o o o o o o o o o oo oo o o o ABCfordynamicalsystems ABCframeworkfordynamicalsystems ABC rejection (a) We define a prior distribution P(θ) x andwewouldliketoapproximatetheposterior x x distribution P(θ D ). We start by sampling a x 0 parameter θ∗ fro|m the prior distribution. We time call this sampled parameter a particle. ABCfordynamicalsystems ABC framework for dynamical systems (b) We simulate a data set D∗ according Prior Posterior to some simulation framework f(D θ∗). In P(θ) P(θ|D0) | (a) our examples we use different simulation frameworks. If we simulate a deterministic ox dynamical model, we add some noise at the ox time points of interest. If we simulate a ox ox stochastic dynamical model, we do not add ABCfortidmynaemicalsystems any additional noise to the trajectories. We ABCframeworkfordynamicalsystems (b) compare the simulated data set D∗ (circles) to the experimental data D (crosses) using 0 a distance function, d, and toPlreiorrance (cid:15); iPfosterior ox d(D0,D∗) ≤ (cid:15), we accept θ∗. TP(hθe) toleranceP(θ|D) ox ox ox (cid:15) 0 is the desired level of agreement between time ≥ D and D∗. 0 (c) The particle θ∗ is accepted because Prior Posterior D∗ and D0 are sufficiently close. APBC(fθor)dynamicalsystems P(θ|D0) ABCframeworkfordynamicalsystems (c) (d) We sample another parameter θ∗ from the priordistributionandsimulateacorresponding o dataset D∗. In this case D∗ and D0 are very ox ox different and we reject the particle (we ”throw x x o it away”). ABCfordynamicalsystems time ABC framework for dynamical systems (e) We repeat the whole procedure until Prior Posterior N particles have been accepted. They repre- sent a sample from P(θ d(D ,D∗) (cid:15)), which P(θ) P(θ|D0) 0 ABCframeworkfordynamicalsystems | ≤ (d) approximates the posterior distribution. If x (cid:15) is sufficiently small then the distribution x TinaToni ABCSMC x26/06/2009 3/3 P(θ d(D ,D∗) (cid:15)) will be a good approxi- x 0 mat|ion for the≤“true” posterior distribution, time P(θ D0). x | x x x (f) Many particles were rejected in the Prior Posterior time procedure, for which we have spent a lot Pof(θ) Prior P(θD)Posterior computational effort for simulation. ABC P(θ) | P(θ|D0) rejection is therefore computationally ineffi- (e) (f) cient. We can use ABC SMC to reduce the computational cost. Figure1: SchematicrepresentationofABCre- jection. 2 TinaToni ABCSMC 26/06/2009 3/3 TinaToni ABCSMC 26/06/2009 3/3 August2,2009 1/1 TinaToni ABCSMC 26/06/2009 4/24 TinaToni ABCSMC 26/06/2009 4/24 Population1 Population2 PopulationT Population2 PopulationT ABCSMC(SequentialMonteCarlo) ABC SMC (Toni et al., 2009) (cid:15)1 (cid:15)2 ... (cid:15)T 1 (cid:15)T − (a) As in ABC rejection, we define a prior distributionP(θ)andwewouldliketoapproxi- mateaposteriordistributionP(θ D ). InABC 0 | SMC we do this sequentially by constructing intermediate distributions, which converge to the posterior distribution. We define a Prior IntermediateDistributions Posterior tolerance schedule (cid:15) >(cid:15) >...(cid:15) 0. 1 2 T ≥ (a) ABCSMC(SequentialMonteCarlo) TinaToni,MichaelStumpf ABCdynamicalsystems 03/07/08 1/1 Population1 (b) We sample particles from a prior distribu- (cid:15)1 (cid:15)2 ... (cid:15)T 1 (cid:15)T − tionuntilN particleshavebeenaccepted(have reached the distance smaller than (cid:15) ). For all 1 accepted particles we calculate weights (see [4] for formulas and derivation). We call the sample of all accepted particles ”Population 1”. Prior IntermediateDistributions Posterior (b) ABCSMC(SequentialMonteCarlo) TinaToni,MichaelStumpf ABCdynamicalsystems 03/07/08 1/1 (c) We then sample a particle θ∗ from popu- Population1 (cid:15)1 (cid:15)2 ... (cid:15)T 1 (cid:15)T lation 1 and perturb it to obtain a perturbed − particle θ∗∗ K(θ θ∗), where K is a per- ∼ | turbation kernel (for example a Gaussian random walk). We then simulate a dataset D∗ f(D θ∗∗) and accept the particle θ∗∗ ∼ | if d(D ,D∗∗) (cid:15) . We repeat this until we 0 2 ≤ have accepted N particles in population 2. We calculate weights for all accepted particles. Prior IntermediateDistributions Posterior (c) ABCSMC(SequentialMonteCarlo) TinaToni,MichaelStumpf ABCdynamicalsystems 03/07/08 1/1 (d) We repeat the same procedure for the Population1 Population2 PopulationT (cid:15)1 (cid:15)2 ... (cid:15)T 1 (cid:15)T following populations, until we have accepted − N particles of the last population T and calculated their weights. Population T is a sample of particles that approximates the posterior distribution. ABC SMC is computationally much more Prior IntermediateDistributions Posterior efficient than ABC rejection (see [4] for comparison). (d) FigureTinaT2on:i,MichSaelcSthumpefmaticABCrdeynpamricaelssysteemnstation of 0A3/07B/08C1/1 SMC. 3 o o o o o o o o ABC framework for dynamical systems ABC rejection for model selection A sample from the prior distribu2on (white) and accepted par2cles (blue)   x m m  m  x1  2 3 θ θ θ   θx   θ     θ   θ    θ θ   θ    θ   θ   θ   θ   θ   x 1      2      3 4 5 6 7 8      9 10 11 12 13 14 15 time Prior Posterior P(m,θ) P(m,θD ) 0 | (a) (b) 0 1. Approximation of P(m|D0) 8 0. 6 0. 4 0. 2 0. 0 0. 1 2 3 Model (Mc)d l Figure 3: (a) Prior and posterior distributions, P(m,θ) and P(m,θ D ), are now defined on a 0 | jointmodelandparameterspace. (b)Particles(m,θ)aresampledfromthepriordistributionand accepted/rejected according to the distance between the simulated and experimental datasets. The accepted particles are shown in dark blue. (c) Six particles have been accepted: one from model 1 and five from model 2. The approximated marginal posterior probability of the model can be calculated as P(m = 1D ) = 1, P(m = 2D ) = 5, P(m = 3D ) = 0 . For illustrative | 0 6 | 0 6 | 0 6 purposes we have chosen a small number of particles. In principle this algorithm will yield consistent marginal posterior model distributions for N . →∞ 4 August2,2009 1/1 ABC SMC for model selection 0 8 1. Population 1 0. 6 A sample from the prior distribu2on and accepted par2cles of Pop1 0.   4 m m  m  0. θ θ θ1    θ   θ     θ   θ    θ2 θ   θ    θ   θ   θ 3  θ   θ   2 1      2      3 4 5 6 7 8      9 10 11 12 13 14 15 0. 0 0. 1 2 3 Mdl (a) (b) Popula’on t‐1   m m  m  1  2 3        θ1              θ2       θ3     θ4         θ5    θ6       m* = m 2  m** ~ KM(m|m*) t   m** =mm m13  3  m2  m3  0.0 0.2 0.4 0.6 0.8 1.0 Prior Population 1 1 2 3 1 2 3 θ  θ   5   6   SθAIfa* camc*ce cp~pe ltKep/P tr(eetm,mdje**, c**c t(,a θ (θlm|c*uθ)*l* a*f)rt, oe θm t*h *{e)(. mw  *e*ig, hθt)s}.t ‐1.  0.0 0.2 0.4 0.6 0.8 1.0 1 P o p u 2la t i o n 2 3 1 P o p u l2a t i o n 3 3 Model Model (c) (d) 5 Figure 4: (a) Particles are sampled from the prior P(m,θ) until N particles have been accepted (in this example N = 6 for illustration purposes, in practice N should be larger), that is the distance is smaller than (cid:15) . The weights w are calculated for all accepted particles and 1 t=1 normalized. (b) To obtain the marginal probability of m, we sum over all the weights corresponding to the model of interest: P (m(cid:48)) = (cid:80) w (m,θ). The histogram of the marginal t=1 m=m(cid:48) t=1 population 1 is presented. (c) This figure shows how we propose and accept a particle of population t, t = 2,...,T. We sample a model m∗ from population t 1 with probability P (m∗). For example, we might t−1 − have sampled m∗ = m . We then perturb the model using a model perturbation kernel KM 2 t to obtain m∗∗ KM (mm∗), for example m∗∗ = m . After we have obtained the model t 3 ∼ | m∗∗, we sample a parameter θ∗ belonging to model m∗∗ from population t 1 and perturb − it to obtain θ∗∗ KPt,m∗∗(θ θ∗). We simulate a dataset D∗ for a particle (m∗∗,θ∗∗) and ∼ | accept (d(D ,D∗) (cid:15) ) or reject the particle. If a particle is accepted, we calculate its weight 0 t ≤ w (m∗∗,θ∗∗). t (d) When N particles of population t have been accepted, we normalize the weights and marginalize them in order to obtain the marginal intermediate population of the model P (m). t We continue until population T, which is the approximation of the joint posterior distribution P(m,θ D ). The quantities of interest are the intermediate and the last marginal population of 0 | the model. In this example there are three populations, T =3. 6 References [1] BeaumontMA,ZhangWandBaldingDJ. ApproximateBayesiancomputationinpopulationgenet- ics. Genetics, 162(4):2025–2035, 2002. [2] Marjoram P, Molitor J, Plagnol V and Tavare S. Markov chain Monte Carlo without likelihoods. Proc Natl Acad Sci USA, 100(26):15324–8, 2003. [3] SissonSA,FanYandTanakaMM. SequentialMonteCarlowithoutlikelihoods. Proc Natl Acad Sci USA, 104(6):1760–5, 2007. [4] Toni T, Welch D, Strelkowa N, Ipsen A and Stumpf MPH. Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface, 6:187–202, 2009. [5] Beaumont MA, Cornuet JM, Marin JM and Robert CP. Adaptive approximate Bayesian computa- tion. Biometrika, 96(4):983–990, 2009. [6] GrelaudA,RobertCP,MarinJM,RodolpheFandTalyJF. ABClikelihood-freemethodsformodel choice in Gibbs random fields. Bayesian Analysis, 4(2):317–336, 2009. [7] Toni T and Stumpf MPH. Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics, 26(1):104–10, 2010. 7

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.