ebook img

Bayesian Nonparametric Ordination for the Analysis of Microbial Communities PDF

3 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Bayesian Nonparametric Ordination for the Analysis of Microbial Communities

Bayesian Nonparametric Ordination for the Analysis of Microbial Communities Boyu Ren1, Sergio Bacallado2, Stefano Favaro3, Susan Holmes4 and 6 1 Lorenzo Trippa1 0 2 1Harvard University, Cambridge, US n a 2University of Cambridge, Cambridge, UK J 0 3Universita` degli Studi di Torino and Collegio Carlo Alberto, Turin, 2 Italy ] E 4Stanford University, Stanford, US M . t a January 21, 2016 t s [ 1 Abstract v 6 Humanmicrobiomestudiesusesequencingtechnologiestomeasuretheabun- 5 1 danceofbacterialspeciesorOperationalTaxonomicUnits(OTUs)insamplesof 5 biological material. Typically the data are organized in contingency tables with 0 OTU counts across heterogeneous biological samples. In the microbial ecology . 1 community, ordination methods are frequently used to investigate latent factors 0 6 or clusters that capture and describe variations of OTU counts across biologi- 1 cal samples. It remains important to evaluate how uncertainty in estimates of : v each biological sample’s microbial distribution propagates to ordination anal- i X yses, including visualization of clusters and projections of biological samples r on low dimensional spaces. We propose a Bayesian analysis for dependent dis- a tributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior the dependence and similarity between microbial distributions is represented by latent factors that concentrate in a low dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariatedataanalysistechniqueswecanvisualizecredibleregionsinecolog- ical ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets. 1 1 Introduction Next generation sequencing (NGS) has transformed the study of microbial ecology. Through the availability of cheap efficient amplification and sequencing, marker genes such as 16S rRNA are used to provide inventories of bacteria in many different envi- ronments. For instance soil and waste water microbiota have been inventoried (De- Santis et al., 2006) as well as the human body (Dethlefsen et al., 2007). NGS also enables researchers to describe the metagenome by computing counts of DNA reads and matching them to the genes present in various environments. Over the last ten years, numerous studies have shown the effects of environmental and clinical factors on the bacterial communities of the human microbiome. These studiesenhanceourunderstandingofhowthemicrobiomeisinvolvedinobesity(Turn- baugh et al., 2009), Crohn’s disease (Quince et al., 2013) or diabetes (Kostic et al., 2015). Studies are currently underway to improve our understanding of the effects of antibiotics (Dethlefsen and Relman, 2011), pregnancy (DiGiulio et al., 2015) and other perturbations to the human microbiome. Common microbial ecology pipelines either start by grouping the 16S rRNA se- quencesintoknownOperationalTaxonomicUnits(OTUs)ortaxaasdoneinCaporaso et al. (2010) or denoising and grouping the reads into more refined strains sometimes referred to as oligotypes or phylotypes (Rosen et al., 2012; Eren et al., 2014). We will call both types of groupings OTUs to maintain consistency. In all cases the data are analyzed in the form of contingency tables of read counts per sample for the different OTUs. Associated to these contingency tables are clinical and environmental covari- ates such as time, treatment and patients’ BMI, information collected on the same biological samples or environments. These are sometimes misnamed “metadata”; this contiguous information is usually fundamental in the analyses. The data are often assembled in multi-type structures, for instance phyloseq (McMurdie and Holmes, 2013) uses lists (S4 classes) to capture all the different aspects of the data at once. Currently bioinformaticians and statisticians analyze the preprocessed microbiome data using linear ordination methods such as Correspondence Analysis (CA), Canon- ical or Constrained Correspondence Analysis (CCA) and Multidimensional Scaling (MDS) (Caporaso et al., 2010; Oksanen et al., 2015; McMurdie and Holmes, 2013). Distance-based ordination methods use measures of between-sample or Beta diversity, such as the Unifrac distance (Lozupone and Knight, 2005). These analyses can reveal clustering of biological samples or taxa, or meaningful ecological or clinical gradients in the community structure of the bacteria. Clustering, when it occurs indicates a latent variable which is discrete, whereas gradients correspond to latent continuous variables. Following these exploratory stages, confirmatory analyses can include dif- ferential abundance testing (McMurdie and Holmes, 2014), two-sample tests for Beta diversity scores (Anderson et al., 2006), ANOVA permutation tests in CCA (Oksanen et al., 2015), or tests based on generalized linear models that include adjustment for multiple confounders (Paulson et al., 2013). The interaction between these tasks can be problematic. In particular, the uncer- tainty in the estimation of OTUs’ prevalence is often not propagated to subsequent 2 steps (Peiffer et al., 2013). Moreover, unequal sequencing depths generate variations of the number of OTUs with zero counts across biological samples. Finally, the hy- potheses tested in the inferential step are often formulated after significant exploration of the data, and are sensitive to earlier choices in data preprocessing. These issues motivate a Bayesian approach that enables us to integrate the steps of the analytical pipeline. Holmes et al. (2012) have suggested the use of a simple Dirichlet-Multinomial model for these data; however, in that analysis the multinomial probabilities for each sample are independent in the prior and posterior, which fails to capture underlying relationships between biological samples. On the other hand, the correlation between OTUs is negative in both prior and posterior, which is not consistent with the fact that some OTUs can only exist in tight-knit communities. We propose a Bayesian procedure, which jointly models the read counts from dif- ferent OTUs and sample-specific latent multinomial distributions, allowing for corre- lations between OTUs. The prior assigned to these multinomial probabilities is highly flexible, such that the analysis learns the dependence structure from the data, rather than constraining it a priori. The method can deal with uncertainty coherently, pro- vides model-based visualizations of the data and describes the effects of environmental covariates. Bayesian analysis with Dirichlet priors is a suitable starting point for microbiome data, since the OTUs distributions are inherently discrete. Moreover, Bayesian non- parametric priors for discrete distributions, suitable for an unbounded number of OTUs, have been the topic of intense research in recent years. General classes of priors such as normalized random measures have been developed, and their properties in relation to classical estimators of species diversity are well-understood (Ferguson, 1973; Lijoi and Pru¨nster, 2010). The problem of modeling dependent distributions has also been extensively studied since the proposal of the Dependent Dirichlet Process (MacEachern, 2000) by Mu¨ller et al. (2004); Rodr´ıguez et al. (2009) and Griffin et al. (2013). In this paper, we try to capture the variation in the composition of microbial com- munitiesasaresultoftheobservedandunobservedsamples’characteristics. Withthis goal we introduce a model which expresses the dependence between OTUs abundances in different environments through vectors embedded in a low dimensional space. Our model has aspects in common with nonparametric priors for dependent distributions, including a generalized Dirichlet type marginal prior on each distribution, but is also similar in spirit to the multivariate methods currently employed in the microbial ecol- ogy community. Namely, it allows us to visualize the relationship between biological samples through low dimensional projections. The paper is organized as follows, Section 2 describes a prior for dependent micro- bial distributions, first constructing the marginal prior of a single distribution through manipulation of a Gaussian process, and then extending this to multiple correlated distributions. The extension is achieved through a set of continuous latent factors, one for each biological sample, whose prior is one frequently used in Bayesian factor analyses. Section3derivesanMCMCsamplingalgorithmforposteriorinferenceanda fast algorithm to estimate biological samples’ similarity. Section 4 discusses a method 3 for visualizing the uncertainty in ordinations through conjoint analysis. Section 5 contains analyses of simulated data, which serve to demonstrate desirable properties of the method, followed by applications to real microbiome data in Section 6. Sec- tion 7 discusses potential improvement and concludes. The code for implementing the analyses discussed in this article can be found in the repository DirichletFactor. 2 Probability Model We construct discrete distributions {Pj;j ∈ J} indexed by biological samples in a set J. The distributions are supported on a common, countable set of OTUs, Z ,Z ,..., 1 2 in a measure space (Z,F), where Z is assumed to be complete and separable. Every OTU Z is associated to parameters σ ∈ (0,1) and X ∈ R . Every microbial i i i ∞ distribution Pj is associated to a parameter Yj ∈ R . The measure Pj is defined by ∞ Pj(A) = Mj(A)/Mj(Z), (cid:88)∞ (1) Mj(A) = I(Z ∈ A)σ (cid:104)X ,Yj(cid:105)+2, i i i i=1 for every set A ∈ F. Here I(·) and (cid:104)·,·(cid:105) are the standard indicator and inner product functions, while x+ is the positive part of x. In subsection 2.1 we consider a single microbial distribution Pj with fixed parame- ter Yj, and define a prior on σ = (σ ,σ ,...) and (X ) which makes Pj a Dirichlet 1 2 i i 1 ≥ process (Ferguson, 1973). The degree of similarity between the discrete distributions {Pj;j ∈ J} is summarized by the Gram matrix {φ(j,j ) = (cid:104)Yj,Yj(cid:48)(cid:105);j,j ∈ J}. (cid:48) (cid:48) Subsection 2.2 discusses the interpretation of this matrix, and subsection 2.3 proposes a prior for the parameters {Yj,j ∈ J} which has been previously used in Bayesian factor analysis, and which has the effect of shrinking the dimensionality of the Gram matrix φ. The parameters {Yj,j ∈ J} or φ can be used to visualize and understand variations of microbial distributions across biological samples. 2.1 Construction of a Dirichlet Process The prior on σ = (σ ,σ ,...) is the distribution of ordered points (σ > σ ) in a 1 2 i i+1 Poisson process on (0,1) with intensity ν(σ) = ασ 1(1−σ) 1/2, (2) − − where α > 0 is a concentration parameter. Fix j, and let Yj = (Y ,l ≥ 1) be a fixed l,j sequence of real numbers with (cid:80) Y2 = 1. Using the notation X = (X ,X ,...), l l,j i 1,i 2,i we let X , i,l = 1,2,... be independent and N(0,1) a priori. l,i Finally, let G be a nonatomic probability measure on (Z,F), and Z ,Z ,... be 1 2 a sequence of independent random variables with distribution G. We claim that the probability distribution Pj defined in Equation (1) is a Dirichlet Process with base measure G. 4 We note that the point process σ defines an infinite sequence of positive numbers, the products (cid:104)X ,Yj(cid:105), i = 1,2,..., are independent Gaussian N(0,1) variables, and i (cid:82)1 that the intensity ν satisfies the inequality σdν < ∞. These facts directly imply 0 that with probability 1, 0 < Mj(A) < ∞ when G(A) > 0. It also follows that for any sequence of disjoint sets A ,A ,... ∈ F the corresponding random variables Mj(A )’s 1 2 i are independent. In different words, Mj is a completely random measure (Kingman, 1967). The marginal L´evy intensity can be factorized as µ (ds)×G(dz), where M (cid:90) 1 (cid:18)1(cid:19)1/2 (cid:16) s (cid:17) µ (ds) ∝ ν(σ) s 1/2exp − dσ ds M − σ 2σ 0 exp(−s/2) ∝ ds, for s ∈ (0,∞). s The above expression shows that Mj is a Gamma process. We recall that the L´evy intensity of a Gamma process is proportional to the map s (cid:55)→ exp(−c × s) × s 1, − where c is a positive scale parameter. In Ferguson (1973) it is shown that a Dirichlet process can be defined by normalizing a Gamma process. It directly follows that Pj is a Dirichlet Process with base measure G. Remark. Our construction can be extended to a wider class of normalized random measures (James, 2002; Regazzini et al., 2003) by changing the intensity ν that defines the Poisson process σ. If we set ν(σ) = ασ 1 β(1−σ) 1/2+β, − − − β ∈ [0,1), in our definition of Mj , then the L´evy intensity of the random measure in (1) becomes proportional to s 1 βexp(−s/2). − − In this case the L´evy intensity indicates that Mj is a generalized Gamma process (Brix, 1999). We recall that by normalizing this class one obtains normalized generalized Gamma processes (Lijoi et al., 2007), which include the Dirichlet process and the normalized Inverse Gaussian process (Lijoi et al., 2005) as special cases. A few comments capture the relation between our definition of Pj(A) in (1) and alternative definitions of the Dirichlet Process. If we normalize m independent Gamma(α/m,1/2) variables, we obtain a vector with Dirichlet(α/m,...,α/m) distri- bution. To interpret our construction we can note that, when α/m < 1/2, each of theGamma(α/m,1/2)componentscanbe obtained by multiplyingaBeta(α/m,1/2− α/m)variableandanindependentGamma(1/2,1/2). Thedistributionofthe(cid:104)X ,Yj(cid:105)+2 i variables in (1) is in fact a mixture with a Gamma(1/2,1/2) component and a point massatzero. Finallyifweletmincreaseto∞, thelawoftheorderedBeta(α/m,1/2− α/m) converges weakly to the law of ordered points of a Poisson point process on (0,1) with intensity ν (see Appendix A). 5 2.2 Dependent Dirichlet Processes We use the representation for Dirichlet processes from Equation (1) to define a family of dependent Dirichlet processes. Let J be a complete and separable index space and φ : J × J → (−1,1) a strictly positive-definite kernel with φ(j,j) = 1 for j ∈ J. The index space can be for example the space of integer numbers N+. By Mercer’s theorem (Mercer, 1909), there exists a set of sequences {Yj = (Y ,Y ,...)} , such 1,j 2,j j (cid:80) ∈J that Y Y = φ(j,j ) for every pair j,j ∈ J. Geometrically φ(j,j ) is the cosine l l,j l,j(cid:48) (cid:48) (cid:48) (cid:48) of the angle between Yj and Yj(cid:48). A family of dependent Dirichlet processes is defined by setting (cid:80) I(Z ∈ A)×σ (cid:104)X ,Yj(cid:105)+2 Pj(A) = i i i i , ∀j ∈ J, (3) (cid:80) σ (cid:104)X ,Yj(cid:105)+2 i i i for every A ∈ F. Here the sequence (Z ,Z ,...) and the array (X ,X ,...), as in 1 2 1 2 Section 2.1, contain independent and identically distributed random variables, while σ is our Poisson process on the unit interval. We will use the notation Q = (cid:104)X ,Yj(cid:105). i,j i This construction has an interpretable dependency structure between the Pj’s that we state in the next proposition. Proposition 1. There exists a real function η : [0,1] → [0,1] such that the correlation between Pj(A) and Pj(cid:48)(A) is equal to η(φ(j,j )) for every A that satisfies G(A) > 0. (cid:48) In different words, the correlation between Pj(A) and Pj(cid:48)(A) does not depend on the specific measurable set A, it is a function of the angle defined by Yj and Yj(cid:48). The proof is in Appendix B. The degree of similarity between random probability measures in the definition (3) is specified through the kernel φ. The first panel of Figure 1 shows a simulation of Pj’s. In this figure J = {1,2,3,4}. When φ, the cosine of the angle between two vectors Yj and Yj(cid:48), corresponding to distinct biological samples j and j , decreases to −1 the random measures tend to concentrate on two (cid:48) disjoint sets. The second panel shows the function η that maps the φ(j,j )’s into the (cid:48) correlations corr(Pj(A),Pj(cid:48)(A)) = η(φ(j,j )). As expected the correlation increases (cid:48) with φ(j,j ). (cid:48) The next proposition provides mild conditions that guarantee a large support for the dependent Dirichlet processes that we defined. We assume here Z ⊂ R. This will be sufficient for most applications. The proof is in Appendix C. Proposition 2. Consider a collection of probability measures (F ,i = 1,...,d) on i Z, J = {1,...,d}, a positive definite kernel φ and assume that the support of G coincides with Z. The prior distribution in (3) assigns strictly positive probability to (cid:82) (cid:82) the neighborhood {(F ,...,F ) : | f dF − f dF | < (cid:15),i = 1,...,m, j = 1,...,d}, 1(cid:48) d(cid:48) i j(cid:48) i j where (cid:15) > 0 and f , i = 1,...,d, are bounded continuous functions. i In what follows we will replace the constraint (cid:80) Y2 = 1 with the requirement l l,j (cid:80) Y2 < ∞. Thetwoconstraintsareequivalentforourpurpose,becausewenormalize l l,j Mj(·) = (cid:80) I(Z ∈ ·)×σ (cid:104)X ,Yj(cid:105)+2, and (cid:80) Y2 can be viewed as a scale parameter. i i i i l l,j 6 0.0100 1.00 0.0075 0.75 Pop 1 and 2 Probabilities00..00002550 pop1234ulation ))j,j(φ(η(cid:31)00..2550 Pop 1 and 3 con00111..00c0101entration Pop 1 and 4 0.0000 0.00 1 2 3 4 S5pecie6s 7 8 9 10 −1.0 −0.5 φ(j0.,0j(cid:31)) 0.5 1.0 Figure 1: Left panel: realization of 4 microbial distributions from our dependent Dirichlet processes. We illustrate 10 representative OTUs and set α = 100. The miniaturefigureatthetop-leftcornershowstherelativepositionsofthefourmicrobial distributions’ vectors Y. The OTUs are those associated to the 10 largest σ’s. As suggested by this panel, the larger the angle between Y directions of two biological samples, the more the corresponding random distributions tend to concentrate on distinct sets. Right panel: correlation of two random probability measures when the cosineφ(j,j )betweenYj andYj(cid:48) variesfrom−1to1. Weconsiderfivedifferentvalues (cid:48) of the concentration parameter α. In the right panel we also mark with crosses the correlations between Pj(A) and Pj(cid:48)(A) for pairs of biological samples j,j considered (cid:48) in the left panel. 7 2.3 Prior on biological sample parameters This subsection deals with the task of estimating the parameters Yj,j ∈ J = {1,...,J}, that capture most of the variability observed when comparing J biological samples with different OTU counts. We define a joint prior on these factors which makes them concentrate on a low dimensional space; equivalently, the prior tends to shrinks the nuclear norm of the Gram matrix (φ(j ,j )) . The problem of esti- 1 2 j1,j2 ∈J mating low dimensional factor loadings or a low-rank covariance matrix is common in Bayesian factor analysis, and the prior defined below has been used in this area of research. The parameters Yj can be interpreted as key characteristics of the biological sam- ples that affect the relative abundance of OTUs. As in factor analysis, it is difficult to interpret these parameters unambiguously (Press and Shigemasu, 1989; Rowe, 2002); however, the angles between their directions have a clear interpretation. As observed (cid:112) in Figure 1, if the kernel φ(j ,j ) ≈ φ(j ,j )φ(j ,j ), the two microbial distributions 1 2 1 1 2 2 Pj1 and Pj2 will be very similar. If φ(j ,j ) ≈ 0, then there will be little correlation 1 2 (cid:112) between OTUs’ abundances in the two samples. If φ(j ,j ) ≈ − φ(j ,j )φ(j ,j ), 1 2 1 1 2 2 then the two microbial distributions are concentrated on disjoint sets. This interpre- tation suggests PCA of the Gram matrix (φ(j ,j )) as a useful exploratory data 1 2 j1,j2 ∈J analysis technique. It is common in factor analysis to restrict the dimensionality of factor loadings. In our model, this can be accomplished by choosing a number of degrees of freedom m ≤ J, setting (Y ,Y ,...) to be zero and adding an error term (cid:15) in the m+1,j m+2,j definition of Q , the OTU-specific latent weights, i,j Q = (cid:104)X ,Yj(cid:105)+(cid:15) , (4) i,j i i,j where the (cid:15) are independent Normal variables. Recall that each sample-specific i,j random distribution Pj is obtained by normalizing the random variables σ (Q+ )2. i i,j In most applications the dimensionality m is unknown. Several approaches to estimate m have been proposed (Lopes and West, 2004; Lee and Song, 2002; Lucas et al., 2006; Carvalho et al., 2008; Ando, 2009). However, most of them involve either calculation of Bayes Factors or complex MCMC algorithms. Instead we use a Normal shrinkage prior proposed by Bhattacharya and Dunson (2011). This prior includes an infinite sequence of factors (m = ∞), but the variability captured by this sequence of latent factors rapidly decreases to zero. A key advantage of the model is that it does not require to choose the number of factors. The prior is designed to replace direct selection of m with the strictly related goal of shrinking toward zero the unnecessary latent factors. In addition, this prior is nearly conjugate, which simplifies computations. The prior is defined as follows, γ ∼ Gamma(a ,1), γ ∼ Gamma(v/2,v/2), l l l(cid:48),j (cid:32) (cid:33) (cid:89) (5) Y |γ ,γ ∼ N 0,(γ ) 1 γ 1 , l ≥ 1, j ∈ J, l,j l l(cid:48),j l(cid:48),j − k− k l ≤ 8 where the random variables γ = (γ ,γ ; l,j ≥ 1) are independent and, conditionally l l(cid:48),j on these variables, the Y ’s are independent. l,j When a > 1, the shrinkage strength a priori increases with the index l, and l therefore the variability captured by each latent factor tends to decrease with l. We refer to Bhattacharya and Dunson (2011) for a detailed analysis of the prior in (5). In practice, the assumption of infinitely many factors is replaced for data analysis and posterior computations by a finite and sufficiently large number m of factors. This prior model is conditionally conjugate when paired with the dependent Dirichlet processes prior in subsection 2.2, a relevant and convenient characteristic for posterior simulations. 3 Posterior Analysis Given an exchangeable sequence W ,...,W from Pj = Mj ×Mj(Z) 1 as defined in 1 n − subsection 2.1, we can rewrite the likelihood function using variable augmentation as in James et al. (2009), (cid:89)n Pj({W }) = (cid:90) ∞ exp[−Mj(Z) T]×Tn−1 (cid:89)I Mj({W })nidT. (6) i Γ(n) i∗ 0 i=1 i=1 Here W ,...,W is the list of distinct values in (W ,...W ) and n ,...,n are the 1∗ I∗ 1 n 1 I occurrences in (W ,...W ), so that (cid:80)I n = n. We use expression (6) to specify 1 n i=1 i an algorithm that allows us to infer microbial abundances P1,...,PJ in J biological samples. We proceed, similarly to Muliere and Tardella (1998) and Ishwaran and James (2001), using truncated versions of the processes in subsection 2.2. We replace σ = {σ ,i ≥ 1} with a finite number I of independent Beta((cid:15) ,1/2 − (cid:15) ) points in (0,1). i I I Appendix A shows that when I diverges, and (cid:15) = α/I, this finite dimensional version I converges weakly to the process in (2). Each point σ is paired with a multivariate i normal Q = (Q ,...,Q ) with mean zero and covariance Σ. The distribution of i i,1 i,J M = σ (Q+ )2 is a mixture of a point mass at zero and a Gamma distribution. In i,j i i,j this section Q and σ are finite dimensional, and the normalized vectors Pj, which assign random probabilities to I OTUs in J biological samples, are proportional to (M ,...,M ), j = 1,...,J. Note that Pj conditional on 1(Q > 0),...,1(Q > 1,j I,j 1,j I,j 0) follows a Dirichelet distribution with parameters (cid:15) ×1(Q > 0),...,(cid:15) ×1(Q > I 1,j I I,j 0). The algorithm is based on iterative sampling from the full conditional distribu- tions. We first provide a description assuming that Σ is known. We then extend the description to allow sampling under the shrinkage prior in Section 2.3 and to infer Σ. With I OTUs and J biological samples, the typical dataset is n = (n ,...,n ), 1 J where n = (n ,...,n ) and n is the absolute frequency of the ith OTU in the j 1,j I,j i,j jth biological sample. We use the notation nj = (cid:80)I n , n = (cid:80)J n , σ = i=1 i,j i j=1 i,j (σ ,...,σ ), Y = (Yj,j = 1,...,J) and Q = (Q ,1 ≤ i ≤ I,1 ≤ j ≤ J). By using 1 I i,j 9 the representation in (6) we introduce the latent random variables T = (T ,...,T ) 1 J and rewrite the posterior distribution of (σ,Q) : (cid:32) (cid:33) (cid:32) (cid:33) nj p(σ,Q|n) ∝ (cid:89)J (cid:89)I (cid:0)σ Q+2(cid:1)ni,j ×(cid:89)J (cid:88)I σ Q+2 − ×π(σ,Q) (7) i i,j i i,j j=1 i=1 j=1 i=1 ∝(cid:90) π(σ,Q)(cid:89)J (cid:40)(cid:32)(cid:89)I (cid:0)σ Q+2(cid:1)ni,j(cid:33) Tjnj−1exp(cid:0)−Tj(cid:80)iσiQ+i,j2(cid:1)(cid:41)dT, (8) i i,j Γ(nj) j=1 i=1 where π is the prior. In order to obtain approximate (σ,Q) sampling we specify a Gibbs sampler for (σ,Q,T) with target distribution p(σ,Q,T|n) ∝π(σ,Q)(cid:89)J (cid:40)(cid:32)(cid:89)I (cid:0)σ Q+2(cid:1)ni,j(cid:33) Tjnj−1exp(cid:0)−Tj(cid:80)iσiQ+i,j2(cid:1)(cid:41). (9) i i,j Γ(nj) j=1 i=1 The sampler iterates the following steps: [Step 1] Sample T independently, one for each biological sample j = 1,...,J, j (cid:88) T |Q,σ,n ∼ Gamma(nj, σ Q+2). j i i,j i [Step 2] Sample Q independently, one for each OTU i = 1,...,I. The conditional i density of Q = (Q ...Q ) given σ,T,n is log-concave, and the random vectors i i,1 i,J Q , i = 1,...,I, given σ,T,n are conditionally independent. i We simulate, for j = 1,...,J, from (cid:18) (Q −µ )2(cid:19) p(Q |Q ,σ,T,n) ∝ Q+2ni,j ×exp(cid:0)−T σ Q+2(cid:1)×exp − i,j i,j , (10) i,j i,−j i,j j i i,j 2s2 j whereQ = (Q ,...,Q ,Q ,...,Q ),µ = E[Q |Q ],s2 = var[Q |Q ], i, j i,1 i,j 1 i,j+1 i,J i,j i,j i, j j i,j i, j withthep−roviso00 = 1. Sinc−eQ isamultivariateNormalbothµ− ands havesimple− i i,j j closed form expressions. When n = 0 the density in (10) reduces to a mixture of truncated normals: i,j µ s2 (1−p )N(Q ; i,j , j )I(Q > 0)+p N(Q ;µ ,s2)I(Q ≤ 0), 1 i,j ∆ ∆ i,j 1 i,j i,j j i,j i,j i,j Φ(0;µ ,s2)N(0; µi,j , s2j ) p = i,j j ∆i,j ∆i,j , 1 Φ(0;µ ,s2)N(0; µi,j , s2j )+N(0;µ ,s2)(cid:16)1−Φ(0; µi,j , s2j )(cid:17) i,j j ∆i,j ∆i,j i,j j ∆i,j ∆i,j and ∆ = 1+2σ T s2. Here N(·;µ,s2) and Φ(·;µ,s2) are the density and cumulative i,j i j j density functions of a Normal variable with mean µ and variance s2. When n > 0 the density p[Q |Q ,σ,T,n] remains log-concave and the sup- i,j i,j i, j portbecomes(0,+∞). WeupdateQ u−singaMetropolis-Hastingsstepwithproposal i,j 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.