Table Of Content

Efficient fluctuation analysis of biochemical pathways beyond the linear noise approximation using iNA Philipp Thomas∗†‡ ¶, Hannes Matuschek§¶ and Ramon Grima∗† ∗School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom †SynthSys Edinburgh, University of Edinburgh, Edinburgh, United Kingdom ‡Department of Physics, Humboldt University of Berlin, Berlin, Germany §Institute of Physics and Astronomy, University of Potsdam, Potsdam, Germany ¶These authors contributed equally. 2 1 0 2 Abstract—Thelinearnoiseapproximationiscommonlyused We recently developed intrinsic Noise Analyzer (iNA) [4], l toobtainintrinsicnoisestatisticssuchasFanofactorsandcoef- the first software package enabling a fluctuation analysis of u ficients of variation for biochemical networks. These estimates biochemical networks via the Linear Noise Approximation J are accurate for networks with large numbers of molecules. 6 However it is well known that many biochemical networks (LNA) and Effective Mesoscopic Rate Equation (EMRE) are characterized by at least one species with a small number approximations of the CME. The former gives the variance ] of molecules. We here describe modifications to the software and covariance of concentration fluctuations in the limit of M intrinsic Noise Analyzer (iNA) which enable it to accurately large molecule numbers while the latter gives the mean computenoisestatisticsoverwiderangesofmoleculenumbers. concentrations for intermediate to large molecule numbers Q This is achieved by calculating the next order corrections to and is more accurate than the conventional Rate Equations . the linear noise approximation’s estimates of variance and o covariance of concentration fluctuations. The efficiency of (REs). i b the methods is significantly improved by automated just-in- In this paper we develop and efficiently implement in - time compilation using the LLVM framework leading to a iNA, the Inverse Omega Square (IOS) method which gives q fluctuation analysis which typically outperforms that obtained accurate estimates for the mean concentrations and vari- [ by means of exact stochastic simulations. iNA is hence partic- ance / covariance of noise about them for systems whose ularly well suited for the needs of the computational biology 1 community. molecule numbers vary over wide ranges (few to thousands v ofmolecules).Thenewmethodistestedontwogeneexpres- 1 Keywords-Stochasticmodeling;LinearNoiseApproximation; sion models involving bimolecular reactions. Remarkably, 3 genetic regulatory circuits 6 the results of a few seconds long IOS calculation is found 1 to agree very well with those from hour long ensemble I. INTRODUCTION 7. averaging of SSA trajectories. 0 Experimental studies have shown that the protein abun- 2 dance varies from few tens to several thousands per protein II. RESULTS 1 species per cell [1]. It is also known that the standard In this section we describe the results of the novel IOS : v deviationoftheconcentrationfluctuationsduetotherandom method implemented inside iNA, compare with the results i timingofmolecularevents(intrinsicnoise)roughlyscalesas X oftheRE,LNAandEMREapproximationsoftheCMEand thesquarerootofthemeannumberofmolecules[2].Hence r with exact stochastic simulations using the SSA and finally a it is expected that intrinsic noise plays an important role in discuss the implementation and its performance. The three the dynamics of those biochemical networks characterized methods (LNA, EMRE, IOS) are derived from the system- by at least one species with low molecule numbers. size expansion (SSE) of the CME [2] which is applicable The stochastic simulation algorithm (SSA) is the conven- to monostable chemical systems. Technical details of the tionalmeansofprobingstochasticityinbiochemicalreaction various approximation methods are provided in the section systems[3].Thismethodsimulateseveryreactioneventand Methods. hence is typically slow for large reaction networks; this is particularly true if one is interested in intrinsic noise A. Applications statistics which require considerable ensemble averaging of We consider a two-stage model of gene expression with the trajectories produced by the SSA. A different route of an enzymatic degradation mechanism: inferring the required statistics involves finding an approximate solution of the chemical master equation (CME), a set Gene−k→0 Gene+mRNA, mRNA−k−d→M ∅, of differential equations for the probabilities of the states of the system, which is mathematically equivalent to the SSA. mRNA−k→s mRNA+Protein, Parameter (i) (ii) (iii) fluctuations with that obtained from the SSA. Notice that k0[G] 2.4min−1µM 0.024min−1µM 0.024min−1µM kdM 20min−1 0.2min−1 0.2min−1 in this case, the two are in severe disagreement. The SSA ks 1.5min−1 1.5min−1 15min−1 predictsthatthemeanconcentrationofproteinislargerthan k−1,k2 2min−1 2min−1 20min−1 that of the mRNA while the REs predict the opposite. It k1 400(µMmin)−1 400(µMmin)−1 4000(µMmin)−1 is also the case that the variance estimate of the LNA is Table I: Kinetic parameters used for the gene expression considerably smaller than that of the SSA. In Fig. 2(c) we model,scheme1,asdiscussedinthemaintext.Thevolume show the mean concentrations computed using the EMRE is fixed to Ω = 10−15l with an enzyme concentration and the variance computed using the IOS method. Note that of 0.1µM corresponding to 60 enzyme molecules. The the latter are in good agreement with the SSA in Fig. 2(b). Michaelis-Menten constant is 0.01µM in all cases. NotefurtherthatwhiletheEMREwasalreadyimplemented in the previous version of iNA, the IOS was not. Hence the present version is the first to provide estimates of the Protein+Enzyme−(cid:41)−k−1(cid:42)−Complex−k→2 Enzyme+∅. (1) mean concentrations and of the size of the noise which k−1 are more accurate than both the REs and the LNA. The The scheme involves the transcription of mRNA, its trans- transient times for this case are given by 37 minutes for lation to protein and subsequent degradation of both mRNA protein and 12 minutes for mRNA concentrations with a and protein. Note that while mRNA is degraded via an un- ratio of approximately 3 in agreement with the median and specificlinearreaction,thedegradationofproteinoccursvia modeofexperimentallymeasuredratios.Hencethisexample anenzymecatalyzedreaction.Thelattermaymodelproteol- provides clear evidence of the need to go beyond the RE ysis, the consumption of protein by a metabolic pathway or andLNAlevelofapproximationforphysiologicallyrelevant other post-translational modifications. A simplified version parameters of the gene expression model. of this model is one in which the protein degradation is Finallyweconsiderparameterset(iii),thecaseofmoder- replaced by the linear reaction: Protein→− ∅. Over the past ate transcription and fast translation (k is an order of mag- s decade the latter model has been the subject of numerous nitudelargerin(iii)comparedto(ii)).Previousstudieshave studies,principallybecauseitcanbesolvedexactlysincethe shownthatincreasedtranslationefficiencyleadstoincreased scheme is composed of purely first-order reactions [5]–[7]. noise in the protein abundance [5] due to the proteins being However, the former model as given by scheme (1), cannot produced in bursts [7]. Indeed a single realization of the be solved exactly because of the bimolecular association SSAshowsproteinsexpressedinsharplypeakedbursts(Fig. reactionbetweenenzymeandprotein.Henceinwhatfollows 3(a)). We used iNA to compute the mean concentrations wedemonstratethepowerofapproximationmethodstoinfer of mRNA and protein according to RE, EMRE and IOS usefulinformationregardingthemechanism’sintrinsicnoise approximation methods (Fig. 3(b)). These are contrasted properties. with the same mean concentrations computed from SSA Weconsiderthemodelwiththreedifferentparametersets simulations (Fig. 3(c)). Note that IOS provides the most (seeTabI)andfixthecompartmentvolumetoonefemtoliter accurate result, followed by the EMRE and the REs. The (one micron cubed). For all three cases, the REs predict latter performs poorly because it does not take into account the same steady state mRNA and protein concentrations: the effect of large fluctuations due to bursty behavior on [mRNA] = 0.12µM and [Protein] = 0.09µM. These the mean concentrations. Transient times of 48 minutes and correspond to 72 and 54 molecule numbers, respectively. 12minutesforproteinandmRNAconcentrationshavebeen We have used iNA to compute the mean concentrations extracted from the time course data as for previous cases; using the REs and the variance of fluctuations using the these are similar to those observed in the expression of the LNA for parameter set (i) in which transcription is fast. E. coli proteome [8], once again showing the necessity of Comparing these with SSA estimates (see Fig. 1) we see approximation methods beyond the RE and LNA to study that the REs and LNA provide reasonably accurate results physiologically relevant cases. for this parameter set. This analysis was within the scope B. Implementation of the previous version of iNA [4]. However, the scenario considered is not particularly realistic. This is since the iNA’sframeworkconsistofthreelayersofabstraction:the ratio of protein and mRNA lifetimes in this example is SBML parser which sets up a mathematical representation approximately 100 (as estimated from the time taken for of the reaction network, a module which performs the SSE theconcentrationstoreach90%oftheirsteadystatevalues) analytically using the computer algebra system Ginac [9] whileanevaluationof1,962genesinbuddingyeastshowed and a just-in-time (JIT) compiler based on the LLVM [10] that the ratios have median and mode close to 3 [7]. framework which compiles the mathematical model into We now consider parameter set (ii), the case of moderate platform specific machine code at runtime yielding high transcription. In Fig. 2(a) and (b) we compare the RE and performanceofSSEandSSAanalysesimplementediniNA. LNA predictions of mean concentrations and variance of As elaborated in the Methods section, in addition to the (a) (b) Figure 1: Gene expression model with fast transcription rate. In panels (a) and (b) we compare the RE predictions of mean concentrations with those obtained from ensemble averaging 3,000 SSA trajectories. The shaded areas denote the region of one standard deviation around the average concentrations which in (a) has been computed using the LNA and in (b) using the SSA. The results in (a) and (b) are in approximate agreement. (a) (b) (c) Figure 2: Gene expression model with moderate transcription rate. In panels (a) and (b) we compare the RE and LNA predictions of mean concentrations and variance of fluctuations with those obtained from ensemble averaging 30,000 stochastic realizations computed using the SSA. Note that the RE and LNA predictions are very different than the actual values.Inpanel(c)weshowthemeanconcentrationpredictionaccordingtotheEMREandthevariancepredictionaccording to IOS. These are in good agreement with those obtained from the SSA. REs, van Kampen’s SSE upon which the LNA, EMRE and −A(cid:126) whose solution is iNA’s steady state analysis. First IOS methods are based, requires the numerical solution of the typically nonlinear REs are solved using the Newton- the following high dimensional system of linear equations Raphson method with line search [11] and consequently the solution to the latter linear equation is found using ∂(cid:126)x ∂StSE =B(cid:126)xSSE+A(cid:126), (2) standard linear algebra. In the present version of iNA, time- course analysis is efficiently performed by means of the all- whosecoefficientsareautomaticallycomputedbyiNAusing roundODEintegratorLSODAwhichautomaticallyswitches the system size expansion from an SBML file. Note that between explicit and implicit methods [12]. the coefficient matrix B ≡ B([X(cid:126)]) and the vector A(cid:126) ≡ A(cid:126)([X(cid:126)]) depend parametrically on the solution of the REs. The number of simultaneous equations to be solved for The vector (cid:126)x is defined as (Ω−1(cid:126)x ,Ω−2(cid:126)x ) the LNA, EMRE analysis is approximately proportional SSE LNA,EMRE IOS where(cid:126)x definesthevariance/covarianceoftheLNA to N2 where N is the number of independent species LNA,EMRE together with the corrections to mean concentrations of the after conservation analysis [13] while for IOS, the number REsaccordingtotheEMREand(cid:126)x definesthecorrections of equations is approximately proportional to N3. The IOS tothevariance/covarianceoftheLNAandthecorrectionsto quadratic and cubic dependencies represent a challenge for themeanconcentrationsoftheEMREaccordingtotheIOS. softwaredevelopment.iNA’spreviousversionaddressedthis Aparticularsimpleresultcanbeinferredbysettingthetime need for performance by providing automated OpenMP derivative of the REs and Eq. (2) to zero yielding B(cid:126)x∗ = parallelism and a bytecode interpreter (BCI) for efficient SSE (a) (b) (c) Figure 3: Gene expression model with efficient translation. In panel (a) we show a single SSA trajectory which illustrates the large protein bursts in the protein concentrations. In panel (b) we show the concentrations predicted by REs, EMRE and IOS methods and in (c) we show the mean concentrations obtained from ensemble averaging 30,000 SSA trajectories. Only the IOS method is in good quantitative agreement with the SSA predictions. These predictions are over 5 times larger than those of REs, the discrepancies stemming from the fact that the latter do not take into account contributions to the mean due to large fluctuations. expression evaluation [4]. The latter concept has proven P +G−(cid:41)−k−1(cid:42)−GP , P +GP −(cid:41)−k−2(cid:42)−GP , (4) N N N N N2 its performance for both SSE and SSA methods while k−1 k−2 maintaining compatibility over many platforms. With the where P denotes the the nuclear protein. Note that protein N present version we introduce a cross platform strategy for P is consumed also by a different pathway involving S C JIT compilation of the SSE or SSA methods. iNA uses which is not explicitly modeled. As in the preceding exam- the modern compiler framework LLVM providing fast JIT ple, degradation occurs via enzyme-catalyzed mechanisms, compilation that allows to emmit and execute platform in this case via cytosolic and nuclear proteases E and F: specific machine code at runtime [10]. This allows us to automatically compile the system size expansion ODEs as PC +E −(cid:41)−k−3(cid:42)−EPC −k−ca→t,3 E, nativemachinecodeexecutabledirectlyontheCPUwithout k−3 resorting to interpreters. Therefore iNA’s JIT feature enjoys P +F −(cid:41)−k−4(cid:42)−FP −k−ca→t,4 F. (5) N N the speed of statically compiled code while maintaining k−4 the flexibility common to interpreters. Moreover, the LLVM The model has been analyzed in detail in Ref. [4] using the compiler framework offers the use of optimizations on EMREs computed by iNA. Therein it has been shown that platformindependentandplatformspecificinstructionlevels intrinsic noise can amplify transient oscillations which are which become advantageous for computationally expensive visibleevenatthecellpopulationlevel.InFig.4wecompare calculations. the outcome of 50,000 independent realizations of the SSA with iNA’s prediction of these synchronous noise-induced C. Performance oscillations using the EMRE for the mean concentrations together with the IOS estimate of variance (the new feature In order to benchmark the present implementation we in the present version of iNA). These estimates agree well consider an ensemble of independent single cell oscillators with those obtained from simulations. with weak negative feedback involving 10 species and 16 Afterconservationanalysisthemodelcomprises7species reactions. The model describes the transcription of mRNA, yielding a total number of 161 simultaneous equations for M, by a a clock gene, G and its translation to a cytosolic the SSE and hence is well suited for direct benchmarking clock protein, P : C purposes. This is particularly challenging for ODE integra- tors since the underlying stochastic dynamics causes the G−k→0 G+M, M −k−d→m ∅, M −k→s M +P . (3) C full system of 161 coupled equations to exhibit damped oscillations. The results of the benchmarks are summarized A negative feedback loop arises from nuclear import of inTab.IIhighlightingtheperformanceofthepresentversion cytosolicproteinanditsbindingtothegenewhichrepresses of iNA over the previous version. The improvements of transcription: iNA’s SSE using the LSODA over the previous Rosenbrock P −(cid:41)k−−(cid:42)in−P , P −k−→dp S, method reduces the execution time by a factor 2−3 while C N C factors of 5−6 are attained using the JIT compiler which kout combined reduce the execution time from 25s to less than The CME is typically intractable for computational pur- 2s. This is compared to the execution time of the SSA poses because of the inherent large state space. iNA cir- which is computationally extremely expensive because of cumvents this problem by approximating the moments of the considerable number of trajectories which need to be probabilitydensitysolutionoftheCMEsystematicallyusing averaged to obtain accurate statistics. van Kampen’s SSE [2], [14]. The starting point of the analysis is van Kampen’s ansatz method SSE,LSODA SSE,Rosenbrock SSA,single/ens. (cid:126)n BCI 12.88s(13.13s) 25.40s(25.66s) 0.15s/2.0h =[X(cid:126)]+Ω−1/2(cid:126)(cid:15), (8) BCI(opt.) 8.21s(8.51s) 26.59s(26.89s) 0.13s/1.8h Ω JIT 1.46s(1.86s) 10.77s(15.51s) 0.10s/1.4h by which one separates the instantaneous concentration into JIT(opt.) 1.12s(6.34s) 10.91s(17.04s) 0.10s/1.4h a deterministic part given by the solution [X(cid:126)] of the macro- Table II: Execution times of iNA for SSE and SSA for scopic REs for the reaction scheme (6) and the fluctuations the gene oscillatory network reproducing Fig. 4 (a) and around it parametrized by (cid:126)(cid:15). fluctuations around it. Note (b), respectively. The table compares the SSE method using that f(cid:126) = lim (cid:126)a/Ω. The change of variable causes Ω→∞ theODEintegratorLSODAandRosenbrockincombination theprobabilitydistributionofmolecularpopulationsP((cid:126)n,t) with a bytecode interpreter (BCI) or iNA’s JIT feature. The to be transformed into a new probability distribution of valueinthebracketsdenotestheoveralltimeincludingbyte- fluctuations Π((cid:126)(cid:15),t). The time derivative, the step operator code assembly or JIT compilation. The table also includes andthepropensityfunctionsarealsotransformed(seefor[4] the execution times using optimizations (opt.) which speeds for details). In particular the propensities expand in powers up execution times at the expense of longer compilation of Ω−1/2 and are given by times. The SSA value denotes the average time for a single aˆ ((cid:126)n,Ω) ∂f(0)([X(cid:126)]) run while the values in the brackets denote the extrapolated j =f(0)([X(cid:126)])+Ω−1/2(cid:15) j +Ω−1f(1)([X(cid:126)]) value for an ensemble of 50,000 independent realizations Ω j α ∂[Xα] j needed to generate Fig. 4(b). Benchmarks were performed 1 ∂f(0)([X(cid:126)]) on MacOS 10.7, Core 2 Duo @1.4Ghz (64Bit) using one + 2Ω−1(cid:15)α(cid:15)β∂[Xj ]∂[X ] +O(Ω−3/2). (9) α β core. Note that here we have used the convention that twice repeated Greek indices are summed over 1 to N, which we III. METHODS use in what follows as well. Consequently the CME up to order Ω−1 can be written as We consider a general reaction network confined in a (cid:32) R (cid:33) volume Ω under well-mixed conditions and involving the ∂Π((cid:126)(cid:15),t) −Ω1/2 ∂[Xα] −(cid:88)S f(0)([X(cid:126)]) ∂ Π((cid:126)(cid:15),t) interaction of N distinct chemical species via R chemical ∂t ∂t αk k α k=1 reactions of the type (cid:16) (cid:17) = Ω0L(0)+Ω−1/2L(1)+Ω−1L(2) Π((cid:126)(cid:15),t)+O(Ω−3/2), s1jX1+...+sNjXN −k→j r1jX1+...+rNjXN, (6) (10) where j is the reaction index running from 1 to R, X where the operators are i denotes chemical species i, k is the reaction rate of the jth 1 j L(0) =−∂ Jβ(cid:15) + ∂ ∂ D , (11) reaction and s and r are the stoichiometric coefficients. α α β 2 α β αβ ij ij Note that our general formulation does not require all L(1) =−∂ D(1)− 1∂ Jβγ(cid:15) (cid:15) + 1∂ ∂ Jγ (cid:15) reactions to be necessarily elementary. α α 2! α α β γ 2! α β αβ γ 1 The CME gives the time-evolution equation for the prob- − ∂ ∂ ∂ D , (12) ability P((cid:126)n,t) that the system is in a particular mesoscopic 3! α β γ αβγ sotfattehe(cid:126)ni=th(snp1e,c.i.e.,s.nINt)iTs gwivheenrebnyi:isthenumberofmolecules L(2) =−∂αJα(1)β(cid:15)β + 21!∂α∂βDα(1β)− 31!∂αJαβγδ(cid:15)β(cid:15)γ(cid:15)δ 1 1 1 + ∂ ∂ Jγδ(cid:15) (cid:15) − ∂ ∂ ∂ Jδ (cid:15) ∂P∂((cid:126)nt,t) =(cid:88)R (cid:18)(cid:89)N Ei−Sij −1(cid:19)aˆj((cid:126)n,Ω)P((cid:126)n,t), (7) + 21!∂2!∂α∂β∂αDβ γ δ , 3! α β γ αβγ δ (13) j=1 i=1 4! α β γ δ αβγδ where S = r −s , aˆ ((cid:126)n,Ω(cid:126)) is the propensity function and the SSE coefficients are given by ij ij ij j such that the probability for the jth reaction to occur R in the time interval [t,t + dt) is given by aˆj((cid:126)n,Ω(cid:126))dt D(n)ij..r =(cid:88)SikSjk...Srkfk(n)([X(cid:126)]), [3] and E−Sij is the step operator defined by its ac- k=1 i tion on a general function of molecular populations as J(n)st..z = ∂ ∂ ... ∂ D(n) . (14) Ei−Sijg(n1,...,ni,...,nN)=g(n1,...,ni−Sij,...,nN) [2]. ij..r ∂[Xs]∂[Xt] ∂[Xz] ij..r (a)SSE (b)SSA Figure 4: Screenshots of iNA’s SSE using the EMRE and IOS analysis versus the average of 50,000 stochastic simulations of the gene oscillatory network which exhibits amplified synchronous oscillations which are not captured by the REs (see Fig.8inRef.[4]).TheEMREmeanconcentrationstogetherwiththeIOSvariancesinpanel(a)areingoodagreementwith SSA simulations in panel (b). The computation of (a) takes less than two seconds compared to more than one hour for the stochastic simulation in (b), see Tab. II, which highlights the efficiency of the system size method implemented in iNA. Note that the above expressions generalize the expansion which is a Fokker-Planck equation with linear drift and carried out in Ref. [15] to include also nonelementary diffusion coefficients, also called the Linear Noise Approx- reactions as for instance trimolecular reactions or reactions imation. If the initial state is deterministic, i.e., (cid:126)n/Ω=[X(cid:126)] with propensities of the Michaelis-Menten type [16]. Note initially, the solution is a multivariate Gaussian distribution also that the Ω1/2 term vanishes since the macroscopic REs [2]centeredaround[(cid:15)] =0foralltimes.Thetimeevolution 0 are given by ∂ [X ]=(cid:80)R S f(0)([X(cid:126)]) leaving us with equationsofthesecondmomentareobtainedbymultiplying t α k=1 αk k a series expansion of the CME in powers of the inverse the latter by (cid:15) and (cid:15) (cid:15) respectively and performing the i i j square root of the volume. In what follows we shall omit integration over(cid:126)(cid:15): the upper index in the bracket of the SSE coefficients in the case of n=0. ∂ 1 [(cid:15) (cid:15) ] =Jα[(cid:15) (cid:15) ] + D +(i↔j), (19) Linear Noise Approximation: We now proceed by con- ∂t i j 0 i α j 0 2 ij structing equations for the moments of the (cid:126)(cid:15) variables. We follow the derivation presented in Ref. [15] and expand the where (i ↔ j) denotes the permutations of indices. It then probabilitydistributionoffluctuationsΠ((cid:126)(cid:15),t)intermsofthe follows that within the LNA all higher even moments can inverse square root of the volume be expressed in terms of the second moment by virtue of Wick’s theorem [17]: ∞ (cid:88) Π((cid:126)(cid:15),t)= Π ((cid:126)(cid:15),t)Ω−j/2. (15) j (cid:88) [(cid:15) (cid:15) ...(cid:15) ] = [(cid:15) (cid:15) ] ...[(cid:15) (cid:15) ] . (20) j=0 1 2 ζ 0 P1 P2 0 Pζ−1 Pζ 0 allpossiblepairings As a consequence there exists an equivalent expansion of P of{1,2..,ζ} the moments Note that all odd moments are zero since Π ((cid:126)(cid:15),t) is cen- ∞ 0 (cid:88) (cid:104)(cid:15) (cid:15) ...(cid:15) (cid:105)= [(cid:15) (cid:15) ...(cid:15) ] Ω−j/2, (16) tered. k l m k l m j j=0 EMRE and IOS approximations: The system size expansion can be used to calculate higher order corrections to the where the following definition has been used moments. The leading order correction to the first moment (cid:90) is given by [(cid:15) (cid:15) ...(cid:15) ] = (cid:15) (cid:15) ...(cid:15) Π ((cid:126)(cid:15),t)d(cid:126)(cid:15). (17) k l m j k l m j 1 Using the expansion of the probability density, Eq. (15), ∂ [(cid:15) ] =Jα[(cid:15) ] + Jαβ[(cid:15) (cid:15) ] +D(1), (21) t i 1 i α 1 2 i α β 0 i together with Eq. (10) and (11), we find after equating all terms of order Ω0 which is equivalent to the EMRE [18] implemented in the ∂ previous version of iNA [4]. The correction to the second Π ((cid:126)(cid:15),t)=L(0)Π ((cid:126)(cid:15),t) (18) ∂t 0 0 moment has been derived in detail in Ref. [15] and is given by 1.35 1.30 "! "! " iNAIOS 1 1 ∂t[(cid:15)i(cid:15)j]2 =Jiα[(cid:15)α(cid:15)j]2+ 2Jiαβ[(cid:15)α(cid:15)β(cid:15)j]1+ 3!Jiαβγ[(cid:15)α(cid:15)β(cid:15)γ(cid:15)j]0 1.25 "! "! IOS +D(1)[(cid:15) ] +J(1)α[(cid:15) (cid:15) ] + 1Jαβ[(cid:15) (cid:15) ] "LNA 1.20 ! analytical + 1Di (1)j+1(i↔i j), α j 0 4 ij α β 0 (22) "IOS 11..1105 # " "! # s!Simtouclhaatisotincs" 2 ij " 1.05 "! which is inherently coupled to the leading order non- " "! " " Gaussian correction of the third moment 1.00 0.00 0.05 0.10 0.15 0.20 0.25 0.30 1 ∆ ∂ [(cid:15) (cid:15) (cid:15) ] =Jα[(cid:15) (cid:15) (cid:15) ] + Jαβ[(cid:15) (cid:15) (cid:15) (cid:15) ] +D(1)[(cid:15) (cid:15) ] t i j k 1 i α j k 1 2 i α β j k 0 i j k 0 Figure 5: Michaelis-Menten reaction. We have verified the 1 1 1 + D [(cid:15) ] + Jα[(cid:15) (cid:15) ] + D soundness of our numerical implementation by comparing 2 jk i 1 2 jk α i 0 3 ijk withtheanalyticalresultusingtheIOSderivedinRef.[19]. +(i↔j)+(j ↔k). (23) The graph shows the ratio of IOS and LNA variance of substrate fluctuations against the ratio δ of the free enzyme Notethattheaboveequationsdependonthefourthmoment concentration and the total enzyme concentration in steady which is readily evaluated using Wick’s theorem: state conditions. This is also compared to the SSA where the ratio of SSA and LNA variance has been used. [(cid:15) (cid:15) (cid:15) (cid:15) ] =[(cid:15) (cid:15) ] [(cid:15) (cid:15) ] +[(cid:15) (cid:15) ] [(cid:15) (cid:15) ] +[(cid:15) (cid:15) ] [(cid:15) (cid:15) ] . i j k l 0 i j 0 k l 0 i k 0 j l 0 i l 0 j k 0 (24) In order to relate the above moments back to the moments Summarizing the system size expansion is obtained by of the concentration variables we use Eqs. (8) and (16) to defining the vector (cid:126)x = (Ω−1(cid:126)x ,Ω−2(cid:126)x ) with SSE LNA,EMRE IOS findexpressionsforthemeanconcentrationsandcovariance components of fluctuations which are given by (cid:126)x = vec([(cid:15) (cid:15) ] ,[(cid:15) ] ), (cid:68)n (cid:69) LNA,EMRE i j 0 i 1 Ωi =[Xi]+Ω−1[(cid:15)i]1+O(Ω−2) (25) (cid:126)xIOS = vec([(cid:15)i(cid:15)j(cid:15)k(cid:15)l]0,[(cid:15)i(cid:15)j(cid:15)k]1,[(cid:15)i(cid:15)j]2,[(cid:15)i]3), (29) (cid:68)(cid:16)n (cid:68)n (cid:69)(cid:17)(cid:16)n (cid:68)n (cid:69)(cid:17)(cid:69) Σ = i − i j − j where the symbol vec(·) denotes the vectorization with ij Ω Ω Ω Ω respect to the independent components of the involved fully =Ω−1[(cid:15) (cid:15) ] +Ω−2([(cid:15) (cid:15) ] −[(cid:15) ] [(cid:15) ] )+O(Ω−3) i j 0 i j 2 i 1 j 1 symmetric tensors. Then the vector (cid:126)x defines the LNA,EMRE (26) LNA covariance together with the EMRE and (cid:126)x defines IOS TheΩ−1 termofthelatterequationgivestheLNAestimate thecorrectionstotheLNAcovarianceandthirdmomentsof for the covariance while including terms to order Ω−2 gives theSSEtogetherwiththeorderΩ−2correctiontotheEMRE mean concentrations. It then follows that (cid:126)x satisfies the the IOS (Inverse Omega Squared) estimate. SSE linear ODE Note that by inspection of Eqs. (21) and (22) it follows these corrections are nonzero if the reaction network ∂(cid:126)xSSE =B(cid:126)x +A(cid:126), (30) involves at least one bimolecular reaction. Moreover we ∂t SSE can also estimate the next order correction to the mean where the upper block triangular matrix B ≡ B([X(cid:126)]) corrections, i.e., the correction to the EMRE, as has been and the full vector A(cid:126) ≡ A(cid:126)([X(cid:126)]) are determined by Eqs. done in Ref. [15] (19,21,22,23,27) and depend parametrically on the solution 1 of the REs. ∂ [(cid:15) ] =Jα[(cid:15) ] + Jαβ[(cid:15) (cid:15) ] t i 3 i α 3 2 i α β 1 InFig.5wehaveverifiedthecorrectnessofiNA’snumer- + 1Jαβγ[(cid:15) (cid:15) (cid:15) ] +J(1)α[(cid:15) ] . (27) icalimplementationoftheIOSapproximationbycomparing 3! i α β γ 1 i α 1 withthecompactIOSanalyticalexpressionrecentlyderived for the case of a simple enzyme catalyzed reaction (see Eq. The solution of the above equation allows us to obtain the (74) in Ref. [19]). mean concentration accurate to order Ω−2 (cid:68)ni(cid:69)=[X ]+Ω−1[(cid:15) ] +Ω−2[(cid:15) ] +O(Ω−3), (28) IV. DISCUSSION Ω i i 1 i 2 In this article we have introduced and implemented the where the first term is the RE solution, the first two terms IOSapproximationinthesoftwarepackageiNA.Thisallows constitute the EMRE estimate and including all three terms the mean concentrations and variances to be determined gives the IOS estimate for the mean concentrations. accurate to order Ω−2, an approximation which is superior to the previously implemented methods of LNA (variances [6] J. Paulsson, “Models of stochastic gene expression,” Physics accurate to order Ω0) and EMRE (mean concentrations of life reviews, vol. 2, no. 2, pp. 157–175, 2005. accuratetoorderΩ−1).Asweshownthisincreasedaccuracy [7] V. Shahrezaei and P. Swain, “Analytical distributions for is desirable to accurately account for the effects of intrinsic stochastic gene expression,” Proceedings of the National noise in biochemical reaction networks under low molecule Academy of Sciences, vol. 105, no. 45, p. 17256, 2008. number conditions. In particular, we have demonstrated [8] Y. Taniguchi, P. Choi, G. Li, H. Chen, M. Babu, J. Hearn, the utility of the software by analyzing an example of A. Emili, and X. Xie, “Quantifying e. coli proteome and gene expression with a functional enzyme and showcased transcriptomewithsingle-moleculesensitivityinsinglecells,” the efficiency of the method using a more complex gene Science, vol. 329, no. 5991, pp. 533–538, 2010. oscillatory network. We have also extended iNA by a more [9] C. Bauer, A. Frink, and R. Kreckel, “Introduction to the efficient JIT compilation strategy in combination with im- ginac framework for symbolic computation within the c++ provednumericalalgorithmswhichoffershighperformance programming language,” Journal of Symbolic Computation, and enables computations feasible even on desktop PCs. vol. 33, no. 1, pp. 1–12, 2002. This feature is particularly important when analyzing noise [10] C. Lattner and V. Adve, “Llvm: A compilation framework in reaction networks of intermediate or large size with for lifelong program analysis & transformation,” in Code well-separated timescales and some bimolecular reactions, GenerationandOptimization,2004.CGO2004.International conditions that have been shown to amplify the deviations Symposium on. IEEE, 2004, pp. 75–86. from the conventional rate equation description [20]. [11] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, We conclude by pointing out that while stiffness arising Numerical recipes: the art of scientific computing, 3rd ed. from multiple timescales in biological networks is still Cambridge University Press, 2007. an insufficiently resolved problem of stochastic simulation [12] L. Petzold, “Automatic selection of methods for solving stiff methods [3], [21], it does not pose significant problems for andnonstiffsystemsofordinarydifferentialequations.”SIAM the LNA, EMRE and IOS approximation methods. This is JournalonScientificandStatisticalComputing,vol.4,no.1, since the implementations of the latter in iNA are able to pp. 136–148, 1983. deal with stiffness natively using well established implicit [13] R.Vallabhajosyula,V.Chickarmane,andH.Sauro,“Conser- methods developed for ordinary differential equations. vation analysis of large biochemical networks,” Bioinformat- ics, vol. 22, no. 3, pp. 346–353, 2006. ACKNOWLEDGMENT RGgratefullyacknowledgessupportbySULSA(Scottish [14] N. Van Kampen, “The expansion of the master equation,” Advances in Chemical Physics, pp. 245–309, 1976. Universities Life Science Alliance). [15] R. Grima, P. Thomas, and A. Straube, “How accurate are AVAILABILITY the nonlinear chemical fokker-planck and chemical langevin The software iNA version 0.3 is freely available un- equations?” The Journal of Chemical Physics, vol. 135, p. der http://code.google.com/p/intrinsic-noise-analyzer/ as ex- 084103, 2011. ecutable binaries for Linux, MacOSX and Microsoft Win- [16] C. Rao and A. Arkin, “Stochastic chemical kinetics and the dows, as well as the full source code under an open source quasi-steady-state assumption: Application to the gillespie license. algorithm,” The Journal of chemical physics, vol. 118, p. 4999, 2003. REFERENCES [17] J.Zinn-Justin,PhaseTransitionsandRenormalizationGroup. [1] Y. Ishihama, T. Schmidt, J. Rappsilber, M. Mann, U. Hartl, Oxford University Press, 2007. M.J.Kerner,andD.Frishman,“Proteinabundanceprofiling of the escherichia coli cytosol,” BMC Genomics, vol. 9, p. [18] R. Grima, “An effective rate equation approach to reaction 102, 2008. kineticsinsmallvolumes:Theoryandapplicationtobiochem- icalreactionsinnonequilibriumsteady-stateconditions,”The [2] N. van Kampen, Stochastic processes in physics and chem- Journal of Chemical Physics, vol. 133, p. 035101, 2010. istry, 3rd ed. North-Holland, 2007. [19] “A study of the accuracy of moment-closure approximations [3] D. Gillespie, “Stochastic simulation of chemical kinetics,” for stochastic chemical kinetics,” The Journal of Chemical Annu. Rev. Phys. Chem., vol. 58, pp. 35–55, 2007. Physics, vol. 136, p. 154105, 2012. [4] P. Thomas, H. Matuschek, and R. Grima, “intrinsic noise [20] P. Thomas, A. Straube, and R. Grima, “Stochastic theory of analyzer: a software package for the exploration of stochas- large-scale enzyme-reaction networks: Finite copy number tic biochemical kinetics using the system size expansion,” corrections to rate equation models,” Journal of Chemical PlosOne, p. (in press), 2012. Physics, vol. 133, no. 19, p. 195101, 2010. [5] E.Ozbudak,M.Thattai,I.Kurtser,A.Grossman,andA.van [21] Y. Cao, D. Gillespie, and L. Petzold, “Adaptive explicit- Oudenaarden, “Regulation of noise in the expression of a implicittau-leapingmethodwithautomatictauselection,”The singlegene,”Naturegenetics,vol.31,no.1,pp.69–73,2002. Journal of Chemical Physics, vol. 126, p. 224101, 2007.

Computation of biochemical pathway fluctuations beyond the linear noise approximation using iNA PDF

0.61 MB·English

by Philipp Thomas

#additional_collections #journals #arxiv

Checking for file health...

Save to my drive

Quick download

Download

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Computation of biochemical pathway fluctuations beyond the linear noise approximation using iNA

See more

The list of books you might like

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.