ebook img

ANOVA theory and examples - Lausanne PDF

21 Pages·2010·0.3 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview ANOVA theory and examples - Lausanne

ANOVA (cid:131) Stands for ANalysis Of VAriance (cid:131) But it’s a test of differences in means (cid:131) The idea: The Observations y ij Treatment group i = 1 i = 2 … i = k y y … y 11 21 k,1 y y … y 12 22 k,2 … … … … y y … y 1, n1 2, n2 k, nk means: m m … m 1 2 k 1 The ANOVA table (cid:131) The analysis is usually laid out in a table (cid:131) For a one-way layout (where the response is assumed to vary according to grouping on one factor): Source df SS MS F p-val Treatment k-1 Σ(m-m)2 SST/(k-1) MST/MSE * i Error n-k Σ(y -m)2 SSE/(n-k) ij i Total n-1 Σ(y -m)2 ij m = overall mean, m = mean within group i i Sum of Squares (cid:131) The two-sample t-test tests for equality of the means of two groups. (cid:131) We could express the observations as: X =μ + E i =1,2 ij i ij (cid:131) Where the E are assumed to be N(0,σ2) ij (cid:131) H : μ = μ 1 2 2 Sum of Squares (cid:131) This can also be written as: X =μ+α + E i =1,2 ij i ij – μcould be seen as overall mean – α as deviation from μin group i i (cid:131) Model is overparameterized – Uses more parameters than necessary – Necessitates some constraint, e.g. Σα=0 i i Sum of Squares (cid:131) Goal: test difference between means of two (or more) groups – Between SS measures the difference (cid:131) The difference must be measured relativeto the variance withinthe groups – Within SS (cid:131) F-test:considers the ratio of B/W (cid:131) The larger F is, the more significant the difference 3 The ANOVA Procedure (cid:131) Subdivide observed total sum of squares into several components (cid:131) Pick appropriate significance point for a chosen Type I error αfrom an Ftable (cid:131) Compare the observed components to test the NULL hypothesis Comments (cid:131) Generalizes to any number of groups (cid:131) ANOVAs can be classified in various ways, e.g. – fixed effectsmodels – mixed effectsmodels – random effectsmodel – For now we consider fixed effect models • Parameter αis fixed, but unknown, in group i i X =μ+α+E ij i ij 4 One-Way ANOVA (cid:131) One-Way fixed-effect ANOVA (cid:131) Setup and derivation – Like two-sample t-test for gnumber of groups – Observations (n observations, i=1,2,…,g) i X ,X , ,X K i1 i2 in – Using overparameterized model for X X =μ+α+E j =1,2, ,n i =1,2, ,g ij i ij K i K – E assumed N(0,σ2), Σnα = 0, α fixed in ij i i i group i One-Way ANOVA – Null Hypothesis H is: α = α = … = α = 0 0 1 2 g – Total sum of squares is g ni ∑∑(X −X)2 ij i=1 j=1 – This is subdivided into Band W g g ni B=∑ni(Xi −X)2 W =∑∑(Xij −Xi)2 i=1 i=1 j=1 – with X =∑ni Xij X =∑g ∑ni Xij N=∑g n i j=1 ni i=1 j=1 N i=1 i 5 One-Way ANOVA – Total degrees of freedom: N– 1 • Subdivided into df = g– 1 and df = N -g B W – This gives the test statistic F B N−g F = * W g−1 Assumptions (cid:131) Have random samplesfrom each separate population (cid:131) The variance is the samein each treatment group (cid:131) The samples are sufficiently largethat the CLT holds for each sample mean (or the individual population distributions are normal) 6 What does it mean when we reject H? (cid:131) The null hypothesis H is a jointone: that all population means are equal (cid:131) When we reject the null, that does NOT mean that the means are all different! (cid:131) It means thatat least oneis different (cid:131) To find out whichis different, can do post hoc testing (pairwiset-tests, for example) Additional aspects (cid:131) Why not start off doing separate (z or t) tests for each pair of samples? ... (cid:131) Testing the assumptions (cid:131) Whichmean(s) is/are not equal – can do post hoctesting (pairwiset-tests, for example) (cid:131) Multiple comparisons (multiple testing) (cid:131) ‘Data snooping’ 7 Factorial crossing (cid:131) Compare 2 (or more) sets of conditions in the same experiment (cid:131) Designs with factorial treatment structure allow you to measure interaction between two (or more) sets of conditions that influence the response – you will look at this in more detail during the exercises today (cid:131) Factorial designs may be either observational or experimental 3 types of 2-factor factorial designs (cid:131) 2 experimental factors – you randomize treatments to each unit (cid:131) 2 observational factors – you cross-classify your populations into groups and get a sample from each population (cid:131) 1 experimental and 1 observational factor – you get a sampleof units from each population, then use randomization to assign levels of the experimental factor (treatments), separately within each sample 8 Interaction (cid:131) Interaction is very common (and very important) in science (cid:131) Interaction is a difference of differences (cid:131) Interaction is present if the effect of one factor is differentfor different levels of the other factor (cid:131) Main effects can be difficult to interpret in the presence of interaction, because the effect of one factor depends on the level of the other factor Interaction plot no interaction interaction 9 Two-Way ANOVA (cid:131) Two-Way Fixed Effects ANOVA (cid:131) More complicated setup; example: – Expression levels of one gene in lung cancer patients – adifferent risk classes • E.g.: ultrahigh, very high, intermediate, low – bdifferent age groups – nindividuals for each risk/age combination Two-Way ANOVA (cid:131) Observations: X ijk – iis the risk class (i= 1, 2, …, a) – jindicates the age group – kcorresponds to the individual in each group (k= 1, …, n) • Each group is a possible risk/age combination – The number of individuals in each group is the same, n – This is a “balanced” design (equal numbers in each group – Theory for unbalanced designs is a little more complicated 10

Description:
One-Way ANOVA. ▫ One-Way fixed-effect ANOVA. ▫ Setup and derivation. – Like two-sample t-test for g number of groups. – Observations (ni observations, i=1,2
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.