
NASA Technical Reports Server (NTRS) 20110008344: SMAR Sessions


Today's Topic (SMAR-t): ANOVA

Experimental Designs
- ANOVA is very common with traditional designs of experiments involving one or more "factors," each with two or more "levels."
- Factors can be "between" or "within" subjects,
  - a.k.a. independent vs. dependent measures,
  - a.k.a. grouping vs. repeated factors.

Types of Outcomes for ANOVA
- Continuously scaled outcomes, assumed to follow the normal distribution or to be transformable so that they do (i.e., "normalized").
- Examples: BMI, BP, BMD, strength, standardized scores, viral loads, force, averages or sums of Likert-scaled items (scale scores), optical density, volume, response time, distance, etc.

Analysis of Variance (i.e., ANOVA)
- Independent-measures ANOVA
- Repeated-measures ANOVA
- Mixed factorials
- Analysis of covariance (ANCOVA), using covariates

Quick Review: Gaussian Distribution Function
- a.k.a. the "Normal Distribution," a.k.a. the "Bell-Shaped Curve."
- Has known probabilities associated with it; thus all parametric statistics are based on the Gaussian distribution:

    f_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x-\bar{x})^2 / (2\sigma^2)},  where \bar{x} = mean and \sigma = standard deviation.

- About 68% of all scores fall within 1 SD of the mean.
- About 95% of all scores fall within 2 SD of the mean.
- About 99% of all scores fall within 3 SD of the mean.
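To make the review concrete, here is a minimal Python sketch (not part of the original slides; it assumes numpy and scipy are available, and the mean/SD values are made up for illustration) that evaluates the density formula above and confirms the 68/95/99 rule:

```python
import numpy as np
from scipy import stats

mean, sd = 100.0, 15.0  # hypothetical values for an IQ-like scale

# The slide's density formula, coded directly
def gaussian_pdf(x, mean, sd):
    return np.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / np.sqrt(2 * np.pi * sd ** 2)

print(gaussian_pdf(110.0, mean, sd))        # hand-coded formula
print(stats.norm(mean, sd).pdf(110.0))      # scipy gives the same value

# Probability mass within 1, 2, and 3 SDs of the mean
dist = stats.norm(mean, sd)
for k in (1, 2, 3):
    p = dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)
    print(f"within {k} SD: {p:.4f}")        # ~0.6827, 0.9545, 0.9973
```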
Central Limit Theorem
- States that for any population with mean µ and standard deviation σ, the distribution of sample means for samples of size n approaches a normal distribution with mean µ and standard deviation σ/√n as n approaches infinity,
- REGARDLESS of the shape of the distribution in the population.
- By the time sample sizes reach about 30, the sampling distribution of means is close to normal.

Thus…
- Since we know so much about the normal distribution,
- and we know that sample summaries (means or otherwise) tend to follow that distribution,
  - even for data collected from non-normal populations,
  - and especially with large sample sizes (big n),
- we can usually apply our knowledge of the normal distribution to statistical comparisons, estimates, and probabilities,
  - as long as we do some preliminary screening…

Demo of Central Limit Theorem
(figure in the original slides: simulated sampling distributions; a simulation sketch follows below)

Moving to the t-test for Comparing Two Samples
- Used for comparing two samples collected randomly from two populations.
- Many other flavors of the t-test exist… but we'll start here.
- Example data:

    Sample 1 (from Population 1): X = 0, 2, 4;  \bar{X}_1 = 2, s_1 = 2
    Sample 2 (from Population 2): X = 4, 6, 8;  \bar{X}_2 = 6, s_2 = 2

- The t statistic:

    t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}},  where  s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}
    and  s_p^2 = \frac{SS_1 + SS_2}{df_1 + df_2} = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2}.

Dissect the Formula
- Numerator: the difference between the two sample means, \bar{X}_1 - \bar{X}_2.
- Denominator: divided by some measure of the standard error of the differences.
- The question being asked: are the differences I see between my two means unusual, given the variability among other sample means of this size?
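The original CLT demo was a figure; the following is my own minimal simulation in the same spirit (assuming numpy; the exponential population and the constants are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# A decidedly non-normal population: the exponential is heavily skewed
population = rng.exponential(scale=2.0, size=100_000)

# Distribution of sample means for n = 30, the slides' rule-of-thumb size
n, reps = 30, 10_000
sample_means = rng.choice(population, size=(reps, n)).mean(axis=1)

print("population mean:", population.mean())         # ~ mu
print("mean of sample means:", sample_means.mean())  # ~ mu, as the CLT predicts
print("SD of sample means:", sample_means.std())     # ~ sigma / sqrt(n)
print("sigma / sqrt(n):", population.std() / np.sqrt(n))
```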
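And a sketch of the t computation itself, applied to the demo data above: hand-coded from the pooled-variance formula, then checked against scipy.stats.ttest_ind, the standard equal-variance two-sample test.

```python
import numpy as np
from scipy import stats

sample1 = np.array([0, 2, 4])   # mean 2, s = 2
sample2 = np.array([4, 6, 8])   # mean 6, s = 2

# Pooled variance: s_p^2 = (df1*s1^2 + df2*s2^2) / (df1 + df2)
df1, df2 = len(sample1) - 1, len(sample2) - 1
s2p = (df1 * sample1.var(ddof=1) + df2 * sample2.var(ddof=1)) / (df1 + df2)

# Standard error of the difference, then the t statistic itself
se = np.sqrt(s2p / len(sample1) + s2p / len(sample2))
t = (sample1.mean() - sample2.mean()) / se
print(t)                                   # -2.449

# scipy agrees, and also supplies the p-value for the hypothesis test
print(stats.ttest_ind(sample1, sample2))   # t = -2.449, p ~ .07
```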
Hypothesis Testing Scenario
- The "null" hypothesis for the t-test is that the two groups come from the same population,
  - and thus will have similar means, given their SDs.
- The "alternative" hypothesis is usually that they don't,
  - and thus have "different" means but similar SDs;
  - it can be directional.
- H_null: the two samples are from the same population.
- H_alt: the two samples are from different populations.
- We use the t statistic in an attempt to reject the null, supporting our claim of the alternative:
  - reject the null (alpha < .05) and report the magnitude of the differences.

T-tests on the Computer
- Software gives us a t score and a p-value (ex. t = 4.87, p < .001),
- allowing us to test the hypothesis that the two samples come from the same population (or not),
- and to describe the magnitude of the differences (confidence intervals).

Virtues of the t-test
- EVERYONE seems to understand it!
- With the CLT, it's easy to apply to lots of different data scenarios.
- Other versions make it very flexible:
  - a formula for "repeated measures" designs,
  - formulas for problems associated with non-normality and/or variance heterogeneity.

Consequences of Hypothesis Testing & Alpha

                                       The truth is:
    Your decision:                     H0 really is true        H0 is actually false
                                       (there's no effect)      (there is an effect)
    ---------------------------------------------------------------------------------
    Reject H0 (significant result:     Type I error             power
    conclude the two groups come       probability = α          probability = 1 - β
    from different populations)
    Accept H0 (non-significant         correct retention        Type II error
    result: assume the two groups      probability = 1 - α      probability = β
    come from the same population)

- Given a significant t score comparing means, you have reached either a right conclusion (H0 actually false) or a wrong one, a Type I error (H0 really true).
- Given a non-significant t score, you have reached either a right conclusion (H0 really true) or a wrong one, a Type II error (H0 actually false).

Limitations of t-tests
- Alpha risk is .05 for each t-test:
  - the probability of falsely rejecting the null, concluding there is a difference when it's really due to chance.
  - So comparing 3, 4, 5, or more groups is quite problematic!

Comparing Three Groups
- Covering Groups 1, 2, and 3 pairwise takes three t-tests, each carrying its own alpha risk:
  - t-test 1 (Group 1 vs. Group 2): alpha risk = .05
  - t-test 2 (Group 2 vs. Group 3): alpha risk = .05
  - t-test 3 (Group 1 vs. Group 3): alpha risk = .05
- The risks compound, as the sketch below shows.
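A quick back-of-envelope illustration of that compounding (my own sketch; it treats the tests as independent, which pairwise tests on shared groups are not, so the numbers are approximate):

```python
# Familywise error rate for k independent tests, each at alpha = .05:
# the chance of at least one false rejection grows quickly with k.
alpha = 0.05
for k in (1, 3, 6, 10):
    familywise = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests -> P(at least one Type I error) = {familywise:.3f}")
# Three tests already push the risk to ~.14 instead of the nominal .05.
```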
Analysis of Variance (ANOVA)
- Can compare an unlimited number of groups or occurrences and still keep the alpha risk at .05.
- Able to take multiple grouping (or time) factors into account and determine their independent and combined effects.
- Can examine "trends" in the data and test specific (often complex) hypotheses.
- The analytic focus is on variance, but the interpretation falls back to means; thus the results become intuitive.

Assumptions Required of ANOVA
- Data collected randomly from the population, with roughly equal n per cell,
  - and sufficiently large n (n > 30 is a common rule of thumb).
- Data measured on an interval or ratio scale and normally distributed.
- Homogeneity of variance across groups.
- Sphericity for repeated-measures designs: the variance of the differences between means for any pair of groups is equal to that for any other pair.

Assumption of Randomly Collected Data with Sufficiently Large n
- Is our subject pool at NASA randomly selected from our inference population?
  - Are those bedrest subjects representative of astronauts?
  - Are today's astronauts representative of future ones?
- Regarding n, how big is big enough?
  - Rule of thumb: at least 30 per group.
  - More is better,
    - with cautions about overpowered studies…
  - but BALANCE is critical!!
    - Rule of thumb: the smallest group should not be less than one third the size of the largest group.

Assumption of Interval or Ratio Scale & Normality
- The "bell-shaped" curve is an assumption of all parametric statistics.
- Studies show that ANOVA is robust to violations of normality, but only if the sample size is substantially large and homogeneity of variance is met.

Assumption of Homogeneity of Variance Across Groups
- Variance on the dependent variable should be similar across groups. Why?
- Because we're examining VARIANCE in ANOVA, the variance in each group must be roughly similar before we can conclude that any differences we find are attributable to group differences (not mere variability differences).
- Even in means comparisons (e.g., t-tests), since means are highly affected by variability, we need variability to be similar across groups so that the differences we find can be attributed to true group differences, not merely to variability differences between the groups.

More on Homogeneity of Variance
- If distributions are normal in one group, they should be normal in all groups.
- If the distribution in one group is leptokurtic (tall and skinny), it should be so in all other groups.
- If the distribution in one group is platykurtic (short and fat), it should be so in all other groups.
- Any mismatch is a problem, because we might attribute a statistical difference to real group differences when it is actually due to heterogeneity of variance.
- …Thankfully there are ways to test for this problem, and solutions are sometimes possible. SPSS will test this assumption for us (stay tuned); see also the sketch at the end of these notes.

What About Skewed Data?
- Positive or negative skews in the data can wreak havoc with statistical analysis,
  - so thorough data screening is always recommended:
  - identify outliers (data entry errors?);
  - consider data transformations if necessary
    - a great thing to Google!

Common Transformations
(figure in the original slides: density plots before and after each transformation)
- Square root / reflect and square root
- Logarithm / reflect and logarithm
- Inverse / reflect and inverse

Source: Tabachnick, B.G., & Fidell, L.S. (1989). Using Multivariate Statistics (2nd ed.). New York: Harper-Collins.
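A small sketch of those transformations (my own illustration with made-up data; the +1 offsets guard against log(0) and division by zero and are one common convention, not something from the slides):

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=1_000)   # a positively skewed sample
print("raw:", skew(x))

# The ladder for positive skew, roughly mild to severe:
print("square root:", skew(np.sqrt(x)))
print("logarithm:", skew(np.log(x + 1)))     # +1 guards against log(0)
print("inverse:", skew(1.0 / (x + 1)))       # +1 guards against 1/0

# For NEGATIVE skew, reflect first (subtract from max + 1) so the long
# tail points to the right, then apply the same transformations.
y = 10.0 - x                                 # a negatively skewed sample
reflected = (y.max() + 1) - y
print("reflect + sqrt:", skew(np.sqrt(reflected)))
```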

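Finally, tying the assumption check to the omnibus test: a sketch (group sizes, means, and SDs are invented) that runs Levene's test for homogeneity of variance, which is the check SPSS typically reports for this assumption, followed by a one-way independent-measures ANOVA:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Three hypothetical groups of 30 (balanced, per the rules of thumb above)
g1 = rng.normal(50, 10, size=30)
g2 = rng.normal(55, 10, size=30)
g3 = rng.normal(60, 10, size=30)

# Levene's test: a large p-value means no evidence that the group
# variances differ, i.e. the homogeneity assumption looks tenable.
print(stats.levene(g1, g2, g3))

# One-way ANOVA: a single omnibus F test across all three groups keeps
# the alpha risk at .05, unlike the three pairwise t-tests shown earlier.
print(stats.f_oneway(g1, g2, g3))
```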

