Table Of ContentSubmittedtotheAnnalsofAppliedProbability
ADIABATIC REDUCTION OF A PIECEWISE
DETERMINISTIC MARKOV MODEL OF STOCHASTIC
GENE EXPRESSION WITH BURSTING TRANSCRIPTION
2 By Jinzhi Lei , Michael C. Mackey , Romain
1
Yvinec∗ and Changjing Zhuge
0
2
Tsinghua University, McGill University, Universit´e de Lyon
b
e This paper considers adiabatic reduction in a model of stochas-
F tic gene expression with bursting transcription. We prove that an
4 adiabatic reduction can be performed in a stochastic slow/fast sys-
2 tem with a jump Markov process. In thegene expression model, the
production of mRNA(thefast variable) is assumed to follow acom-
] pound Poisson process (the phenomena called bursting in molecular
R
biology)andtheproductionofprotein(theslowvariable)islinearas
P afunctionofmRNA.WhenthedynamicsofmRNAisassumedtobe
. faster than the protein dynamics (due to a mRNA degradation rate
h
t largerthanfortheprotein)weprovethat,withtheappropriatescal-
a
ing,theburstingphenomenacanbetransmittedtotheslowvariable.
m
Weshow that thereduced equation is eithera stochastic differential
[ equation with a jump Markov process or a deterministic ordinary
differential equation depending on the scaling that is appropriate.
1
These results are significant because adiabatic reduction techniques
v
seem to have not been developed for a stochastic differential system
1
1 containingajumpMarkovprocess.Wehopethattheyaregeneraliz-
4 abletogiveinsightintoadiabaticmethodsinmoregeneralstochastic
5 hybridsystems.Inparticular,asimilarproofgivesusamathematical
. justificationoftheburstingofmRNAthroughanappropriatescaling
2
of an “ON-OFF” switching gene model. Last but not least, for our
0
2 particular system, the adiabatic reduction allows us to understand
1 whatarethenecessary conditionsfortheburstingproduction-likeof
: mRNAand protein to occur.
v
i
X
r 1. Introduction. The adiabatic reduction technique gives results that
a
allow one to reduce the dimension of a system and justifies the use of an
effective set of reducedequations in lieu of dealing with a full,higher dimen-
sional, model if different time scales occur in the system. Adiabatic reduc-
tion results for deterministic systems of ordinary differential equations have
∗
corresponding author
AMS 2000 subject classifications: Primary 92C45, 60Fxx; secondary
92C40,60J25,60J75
Keywords and phrases: adiabatic reduction, piecewise deterministic Markov process,
stochastic bursting gene expression, quasi-steady state assumption, scaling limit
1
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
2 R.YVINECET AL.
been available since thevery precise results of Tikhonov (1952) andFenichel
(1979). The simplest results, in the hyperbolic case, give an effective con-
struction of an uniformly asymptotically stable slow manifold (and hence
a reduced equation) and prove the existence of an invariant manifold near
the slow manifold, with (theoretically) any order of approximation of this
invariant manifold.Suchpreciseandgeometric resultshavebeengeneralized
to random systems of stochastic differential equation with Gaussian white
noise (Berglund and Gentz (2006), see also Gardiner (1985) for previous
work on the Fokker-Planck equation). However, to the best of our knowl-
edge, analogous results for stochastic differential equations with jumps have
not been obtained.
The present paper gives a theoretical justification of an adiabatic reduc-
tion of a particular piecewise deterministic Markov process (Davis, 1984).
The results we obtain do not give a bound on the error of the reduced sys-
tem, but they do allow us to justify the use of a reduced system in the case
of a piecewise deterministic Markov process. In that sense, the results are
close to the recent ones by Crudu et al. (2012) and Kang and Kurtz (2012),
where general convergence results for discrete models of stochastic reaction
networks are given. In particular, these papers give alternative scaling of
the traditional ordinary differential equation and the diffusion approxima-
tiondependingonthedifferentscalingchosen(seeBalletal.(2006)forsome
examples in a reaction network model). After the scaling, the limiting mod-
els can be deterministic (ordinary differential equation), stochastic (jump
Markov process), or hybrid (piecewise deterministic process). For illustra-
tive and motivating examples given by a simulation algorithm, see Haseltine
and Rawlings (2002); Rao and Arkin (2003); Goutsias (2005).
Ourparticular modelis meanttodescribestochastic geneexpressionwith
explicit bursting (Friedman, Cai and Xie, 2006). The variables evolve under
the action of a continuous deterministic dynamical system interrupted by
positive jumps of random sizes that model the burst production. In that
sense, the convergence theorems we obtain in this paper can be seen as an
example in which there are an infinite number of reactions (one for each
“burstsize”) and give complementary results to those of Cruduet al. (2012)
and Kang and Kurtz(2012). We hopethat the results here are generalizable
to give insight into adiabatic reduction methods in more general stochastic
hybrid systems (Hespanha, 2006; Bujorianu and Lygeros, 2004). We note
also that more geometrical approaches have been proposed to reduce the
dimension of such systems in Bujorianu and Katoen (2008).
Biologically, the bursting of mRNA or protein is defined as the produc-
tion of several molecules within a very short time, indistinguishable within
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
ADIABATICREDUCTION FOR PDMP 3
the time scale of the measurement. In the biological context of models of
stochasticgeneexpression,explicitmodelsofburstingmRNAand/orprotein
production have been analyzed recently, either using a discrete (Shahrezaei
and Swain, 2008; Lei, 2009) or a continuous formalism (Friedman, Cai and
Xie, 2006; Mackey, Tyran-Kaminska and Yvinec, 2011) as more and more
experimental evidence from single-molecule visualization techniques has re-
vealed the ubiquitous nature of this phenomenon (Ozbudak et al., 2002;
Golding et al., 2005; Raj et al., 2006; Elf, Li and Xie, 2007; Xie et al., 2008;
Raj and van Oudenaarden, 2009; Suter et al., 2011). Traditional models of
gene expression are composed of at least two variables (mRNA and protein,
andsometimes theDNAstate). Theuseofareducedone-dimensionalmodel
(that has theadvantage that it can besolved analytically) has been justified
so far by an argument concerning the stationary distribution in Shahrezaei
and Swain (2008). However, it is clear that two different models may have
the same stationary distribution butvery different behavior (for an example
in that context, see Mackey, Tyran-Kaminska and Yvinec (2011)). Hence,
our results are of importance to rigorously prove the validity of using a
reduced model.
1.1. Continuous-state burstingmodel. Themodelsreferredtoabovehave
explicitly assumed the production of several molecules instantaneously,
through a jump Markov process, in agreement with experimental obser-
vations. In line with experimental observations, it is standard to assume
a Markovian hypothesis (an exponential waiting time between production
jumps) and that the jump sizes are exponentially distributed (geometrically
in the discrete case) as well. The intensity of the jumps can be a bounded
function to allow for self-regulation. The assumption that the intensity is
bounded is crucial, and makes the stochastic production term a compound
Poisson process.
LetX andY denotetheconcentrationsofmRNAandproteinrespectively.
A simple model of single gene expression with bursting in transcription is
given by
dX
(1) = −γ X +N˚(h,ϕ(Y)),
1
dt
dY
(2) = −γ Y +λ X.
2 2
dt
Here γ and γ are the degradation rates for the mRNA and protein re-
1 2
spectively, λ is the mRNA translation rate, and N˚(h,ϕ(Y)) describes the
2
transcription that is assumed to be a compound Poisson white noise occur-
ring at a rate ϕ with a non-negative jump size ∆X distributed with density
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
4 R.YVINECET AL.
h.
The equations (1)-(2) are a short hand notation for
t t ∞ ∞
(3) Xt = X0− γ1Xs−ds+ 1{r≤ϕ(Ys−)}zN(ds,dz,dr)
Z0 Z0 Z0 Z0
t t
(4) Yt = Y0− γ2Ys−ds+ λ2Xsds.
Z0 Z0
where Xs− = limt→s−X(t), and N(ds,dz,dr) is a Poisson random measure
on (0,∞)×[0,∞)2 with intensity dsh(z)dzdr, where s denotes the times of
thejumps,r isthestate-dependency inanacceptance/rejection fashion,and
z the jump size. Note that (X ) is a stochastic process with almost surely
t
finite variation on any bounded interval (0,T), so that the last integral is
well defined as a Stieltjes-integral.
The following discussion is valid for general rate functions ϕ and density
functions h(∆X) that satisfy
(5) ϕ ∈ C1, ϕ and ϕ′ are bounded, ie ϕ ≤ ϕ,ϕ′ ≤ ϕ
∞
(6) h ∈ C0 and xnh(x)dx < ∞, ∀n≥ 1.
Z0
For a general density function h, the average burst size is given by
∞
(7) b = xh(x)dx.
Z0
Remark 1. Hill functions are often used to model gene self-regulation,
so that ϕ is given by
1+yn
ϕ(y) = κ
Λ+∆yn
whereκ,Λ,∆arepositiveparametersandnisapositiveinteger(seeMackey,
Tyran-Kaminska and Yvinec (2011) for more details). An exponential dis-
tributionoftheburstingtranscriptionisoften usedinmodelinggeneexpres-
sion, in accordance with experimental findings, so that the density function
h is given by
1
h(∆X) = e−∆X/b,
b
with b the average burst size.
If ϕ is independent of the state Y, the average transcription rate is λ =
1
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
ADIABATICREDUCTION FOR PDMP 5
bϕ, and the average mRNA and protein concentrations are
bϕ
(8) X = .
eq
γ
1
λ bϕλ
2 2
(9) Y = X = .
eq eq
γ γ γ
2 1 2
1.2. Statement of the results. Inthefollowingdiscussion,weconsiderthe
situation whenmRNAdegradation is afast process,i.e.γ is ‘large enough’,
1
but the average protein concentration Y remains unchanged. Thus, we
eq
always assume one of the following three scaling relations:
(S1) ϕ/γ is independentof γ (= Y γ2 if ϕ is a constant) as γ → ∞ and
1 1 eqbλ2 1
h,λ remain unchanged; or
2
(S2) h → 1 h(∆X) (so that b = Y γ2γ1 ) as γ → ∞ and ϕ,λ remain
γ1 γ1 eqϕ(Y)λ2 1 2
unchanged; or
(S3) λ /γ is independent of γ (= Y γ2γ1 if ϕ is a constant) as γ → ∞
2 1 1 eq bϕ 1
and h, ϕ remain unchanged.
In this paper we determine an effective reduced equation for equation (2)
for each of the three scaling conditions (S1)-(S3). In particular, we show
that under assumption (S1), equation (2) can be approximated by the de-
terministic ordinary differential equation
dY
(10) = −γ Y +λ ψ(Y)
2 2
dt
where
(11) ψ(Y) = bϕ(Y)/γ .
1
We further show that under the scaling relations (S2) or (S3), equation (2)
can be reduced to the stochastic differential equation
dY
(12) = −γ Y +N˚(h¯(∆Y),ϕ(Y)).
2
dt
where h¯ is a suitable density function in the jump size ∆Y (to be detailed
below).
We first explain, using some heuristic arguments, the differences between
the three scaling relations and the associated results. When γ → ∞, apply-
1
ing a standard quasi-equilibrium assumption we have
dX
≈ 0
dt
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
6 R.YVINECET AL.
which yields
1
(13) X(t) ≈ N˚(h(.),ϕ(y)) = N˚(γ h(γ ·),ϕ(y)),
1 1
γ
1
and therefore the second equation (2) becomes
dY λ
≈ −γ Y + 2N˚(h(.),ϕ(y))
2
dt γ
1
γ γ
≈ −γ Y +N˚ 1h( 1),ϕ(y) .
2
λ λ
(cid:18) 2 2 (cid:19)
Hence in (12), h¯(∆Y) = (λ /γ )−1h((λ /γ )−1∆Y) under the scaling (S2)
2 1 2 1
and (S3). Furthermore, wenote that the scaling (S2) also implies γ h(γ ·)=
1 1
h(·), and therefore we can also write h¯(∆Y) = (λ )−1h((λ )−1∆Y) under
2 2
the scaling (S2). Finally, it is intuitively understandable that scaling (S1)
gives (10), as the jumps become more frequent and smaller.
In this paper, we provide three different proofs of the results mentioned
above. In particular, we prove the results using a Master equation approach
(the Kolmogorov forward equation) as well as starting from the stochastic
differential equation. Note that both techniques have been used in the past,
in particular within the context of discrete models of stochastic reaction
networks. For the Master equation approach, see Haseltine and Rawlings
(2005); Zeron and Santill´an (2010) while for the stochastic differential equa-
tion approach, we refer to Crudu et al. (2012); Kang and Kurtz (2012).
1.3. Outline of the paper. This paper is organized as follows. In Section
2, we consider the situation without auto-regulation so the rate ϕ is inde-
pendent of protein concentration Y. In this case the two equations (1)-(2)
form a set of linear stochastic differential equations. We use the method of
characteristic functionals to give a very short proof of the adiabatic reduc-
tion when the white noise process is independent of the stochastic variables
(external regulation). In Section 3 we consider the situation in which the
rate ϕ is dependent on the protein concentration Y. We first give a result
on the evolution equation on densities (Section 3.2), then we prove rigor-
ously a weak convergence of the stochastic process using the machinery of
generators (Section 3.3). Finally, in Section 4, we apply the same formalism
with generators to prove that a binary “ON-OFF“ model converges to a
bursting model, thus illustrating that these techniques can be applicable in
other contexts.
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
ADIABATICREDUCTION FOR PDMP 7
2. The case without auto-regulation. First, we consider the simple
case with no auto-regulation in the mRNA transcription so the rate ϕ is
independent of protein concentration Y. In this case, equation (1) is inde-
pendent of (2) which makes the analysis easier.
Consider the equations
dX
(14) = −γ X +N˚(h,ϕ), X(0) = X ≥ 0,
1 0
dt
dY
(15) = −γ Y +λ X, Y(0) = Y ≥ 0,
2 2 0
dt
where N˚(h,ϕ) is a compound Poisson white noise, and ϕ is a constant.
The solutions X(t) and Y(t) of (14)-(15) are stochastic processes uniquely
determined by the equation parameters and the stochastic process N˚.
We first state main results for each of the scaling relations we have con-
sidered.
Remark 2. In all scaling situations that we consider in the three the-
orems below, the stochastic process X converges only in finite-dimensional
t
law. Here finite-dimensional law means that for any finite number of fixed
times (t ,t ,···t ) the random variable (X(t ),X(t ),···X(t )) converges
1 2 n 1 2 n
inlawtowardstheappropriatedistribution.Convergenceinfinite-dimensional
law is weaker than convergence in law of the stochastic process. We will see
in the proofs of the theorems that this is because a lack of compactness for
the process X(t).
Theorem 1. Consider the equations (14)-(15). If the scaling (S1) is
γ γ
2 1
satisfied, i.e., ϕ = Y , then when γ → ∞,
eq 1
bλ
2
1. The stochastic process X(t) converges in finite-dimensional law to the
(deterministic) steady-state value X ;
eq
2. The stochastic process Y(t) converges in law towards the deterministic
solution of the ordinary differential equation
dY
(16) = −γ Y +λ X , Y(0) = Y ≥ 0.
2 2 eq 0
dt
Theorem 2. Consider the equations (14)-(15). If the scaling (S2) is
γ γ
satisfied, i.e., h → 1 h(∆X) (so b = Y 2 1), then when γ → ∞,
γ1 γ1 eqϕλ 1
2
1. The stochastic process X(t) converges in finite-dimensional law to the
compound Poisson white noise N˚(h,ϕ);
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
8 R.YVINECET AL.
2. The stochastic process Y(t) converges in law to the stochastic process
defined by the solution of the stochastic differential equation
dY
(17) = −γ Y +N˚(h¯,ϕ), Y(0) = Y ≥ 0
2 0
dt
where h¯(∆Y) = (λ )−1h((λ )−1∆Y) is the density function for the
2 2
jump size ∆Y.
Theorem 3. Consider the equations (14)-(15). If the scaling (S3) is
γ γ
2 1
satisfied, i.e., λ = Y , then when γ → ∞,
2 eq 1
bϕ
1. The stochastic process X(t) converges in finite-dimensional law to the
(deterministic) fixed value 0;
2. The stochastic process Y(t) converges in law to the stochastic process
determined by the solution of the stochastic differential equation
dY
(18) = −γ Y +N˚(h¯,ϕ), Y(0) = Y ≥ 0
2 0
dt
where h¯(∆Y) = (λ /γ )−1h((λ /γ )−1∆Y) is the density function for
2 1 2 1
the jump size ∆Y.
Remark 3. Note that scalings (S2) and (S3) give similar results for
the equation governing the protein variable Y but very different results
t
for the asymptotic stochastic process related to the mRNA. In particular,
in Theorem 2, very large bursts of mRNA are transmitted to the protein,
where in Theorem 3, very rarely is mRNA present but when present it is
efficiently synthesized into a burst of protein.
These theorems will be proved using the method of characteristic func-
tionals. For a stochastic process ξ (t ≥ 0), the characteristic functional
t
C : Σ → R is defined as
ξ
∞
(19) C [f]= E eR0 if(t)ξtdt
ξ
h i
for any function f in a suitable function space Σ so that the integral
∞
if(t)ξ dt is well defined. Before continuing, we need to introduce some
0 t
topological background as well as properties of the Fourier transform in
R
nuclear spaces (see Gel’fand and Vilenkin (1964))
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
ADIABATICREDUCTION FOR PDMP 9
2.1. Stochastic process as a distribution. We are going to recall here the
continuous correspondence between a stochastic process and a distribution.
All stochastic processes we consider here are right continuous with finiteleft
limits. It is naturalto definea stochastic processas arandomelement in the
space D = D (0,∞),R , the space of all real-valued functions on (0,∞)
that are right(cid:16)continuou(cid:17)s with finite left limits. Such a space is classically
endowed with the Skorokhod topology (see for instance Jacod and Shiryaev
(1987, Chap. 6)). We define D(R+), the space of smooth functions with
compact support, with the inductive limit topology given by the family of
semi-norms (k = 0,1,2...) p (f) = sup|f(k)| on every D([0,n]), n ∈ N (c.f.
k
(Schaefer, 1971, Example 2, page 57)). Let f ∈ D(R+), and define x˜ in the
dual space D′(R+) such that
∞
(20) x˜(f) = x(t)f(t)dt
Z0
Lemma 4. The map
D (0,∞),R → D′(R+)
(21)
(cid:16) (cid:17)
(x ) 7→ x˜,
t t≥0
where x˜ is defined by equation (20), is continuous.
Proof. Itisaclassical resultthatx ∈ D hasatmostacountablenumber
of discontinuity points so that x is locally integrable, the integral in Eq (20)
is well defined for all f ∈ D(R+), x˜ ∈ D′(R+) and
T
|x˜(f)|≤ | x(s)| ds k f k
∞
(cid:16)Z0 (cid:17)
for any f with support in [0,T] (Rudin, 1991, Section 6.11, page 142). We
conclude by noticing that
| x˜(f)|≤ sup |x(s) |k f k T
∞
s≤T
and x 7→ sup | x(s) | is continuous for the Skorohod topology (Jacod
s≤T
and Shiryaev, 1987, Proposition 2.4, page 339) for all T such that T is not
a discontinuity point.
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012
10 R.YVINECET AL.
2.2. Bochner-Minlos theorem for a nuclear space. Let E be a nuclear
space. Westate akey resultthatwillallow usto uniquelyidentify ameasure
on the dual E′ of E.
Bochner-Minlos Theorem. (Gel’fand and Vilenkin, 1964, Theorem
2, page 146) For a continuous functional C on a nuclear space E that sat-
isfies C(0) = 1, and for any complex z and elements x ∈A, j,k = 1,...,n,
j j
n n
(22) z z¯ C(x −x )≥ 0,
j k j k
j=1k=1
XX
there is a unique probability measure µ on the dual space E′, given by
(23) C(y) = eihx,yidµ(x).
ZE′
Note that the space D(R+) is a nuclear space (Schaefer, 1971, Example
2, page 107).
2.3. The characteristic functional of a Poisson white noise. The use of
the characteristic functional allows us to definea generalized stochastic pro-
cess that does not necessarily have a trajectory in the usual sense (like in
D for instance). Indeed a (compound) Poisson white noise is seen as a ran-
dom measure on the distribution space D, associated with the characteristic
functional (given in Hida and Si (2004), here f ∈ D(R+))
∞ ∞
(24) C [f]= exp ϕ (eizf(t) −1)h(z)dzdt ,
N˚
(cid:20) Z0 Z0 (cid:21)
where ϕ is the Poisson intensity and h the jump size distribution. It is not
hard to see that C [f − g] and C [g − f] are conjugate to each other,
N˚ N˚
C [0] = 1 and C [.] is continuous for h ∈ L1(R+), so the conditions in the
N˚ N˚
Bochner-Minlos Theorem2.2aresatisfiedandthereforeC uniquelydefines
N˚
a measure on D′(R+).
We refer to Prokhorov (1956); Hida and Si (2004, 2008) for further ma-
terial on characteristic functionals and generalized stochastic processes.
2.4. Proofs of Theorems 1 through 3. The proofs of Theorems 1 to 3
are based on the idea of Levy’s continuity theorem. However in the infinite-
dimensional case, the convergence of the Fourier transform does not imply
convergence inlawoftherandomvariable,andoneneedstoimposemorere-
strictions, namelyacompactness condition.Wewillusethefollowinglemma
imsart-aap ver. 2011/11/15 file: adiabatic_aap_finalpreprint_24feb12.tex date: February 27, 2012