
Unit 7. Hypothesis Testing - University of Massachusetts Amherst PDF

55 Pages·2008·0.44 MB·English


PubHlth 540 Hypothesis Testing Page 1 of 55

Unit 7. Hypothesis Testing

Topic
1. The Logic of Hypothesis Testing
2. Beware the Statistical Hypothesis Test
3. Introduction to Type I, II Error and Statistical Power
4. Normal: Test for μ, σ² Known
5. Normal: Test for μ, σ² Known – Critical Region Approach
6. Normal: Test for μ, σ² Unknown
7. Normal: Test for σ²
8. Normal: Test for μ_DIFFERENCE – Paired Data Setting
9. Normal: Test for [μ_1 − μ_2] – Two Independent Groups
10. Normal: Test for Equality of Two Variances (σ_1², σ_2²)
11. Single Binomial: Test for Proportion π
12. Two Binomials: Test for [π_1 − π_2] – Two Independent Groups
Appendix. URL’s for the Computation of Probabilities

1. The Logic of Hypothesis Testing

What if we want to do comparisons? Test hypotheses? This is inferential statistics. Inferential statistics require probability models. The concept of a probability model was introduced previously.

• Recall that, loosely, a “probability” tells us the chances of observing something.
• We use “probabilities” to compare the reasonableness of competing hypotheses. Thus, they are tools in decision making.

Example
Given a particular exposure (smoking), what is the probability of a particular disease? (“Tobacco companies on trial”)

Example
What are the chances that a person without disease (no HIV infection) will obtain a positive result on an HIV antibody test? (“False positive”)

Inasmuch as we’re after an understanding of nature, we use the tool of “chance” only as long as we have to.

• Probability models, i.e., “chance”, describe the unknown.
• “Noise” in the signal-to-noise concept is “chance”. Thus, what we do know is modeled (“signal”). The rest, representing what we cannot explain, is regarded as “due chance”.
As science progresses, increasingly, “due chance” variability is explained.

• Hypotheses are formulated, experiments are performed, and results are evaluated for their consistency (their non-consistency, actually) with a hypothesis.
• With the conclusion that a hypothesis is reasonable, the investigator has “explained” some of the current total pool of “due chance”. The pool of unknown, the “due chance”, is now smaller.
• Perhaps the next investigator, with his or her refined hypothesis, will reduce further the pool of “due chance”.

Inferential statistics proceeds similarly. Consider the following scenario (hypothetical):

Interest is in investigating whether the type of access to clean injection paraphernalia will affect a person’s frequency of drug injection. A randomized trial investigating, among other things, frequency of injection is comparing two groups: 1) needle exchange and legal pharmacy sales versus 2) legal pharmacy sales only.

Comparison of 2 Groups
Analysis reveals no overall effect of randomization assignment on frequency of drug injection. Persons with access only to pharmacy sales appear to have similar frequencies of drug injection as persons with access to both pharmacy sales and needle exchange.

However, the variability in the data is great. Another way of saying this is to say that the “noise” is great. Perhaps in this “noise” there is another story to uncover. This prompts a closer look. The mechanics of the subsequent closer looks might take the form of stratified analyses, regression modeling, etc.

Comparison of More Than 2 Groups
When the data are analyzed separately for men and women, it appears that access to needle exchange is beneficial among women and harmful among men, at least with respect to frequency of drug injection.

Thus, scientific inquiry, through the use of statistical modeling and hypothesis testing, treats deterministic events as stochastic until their nature is understood.
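
The needle exchange scenario above can be made concrete with a small sketch. Every number below is invented purely for illustration (the notes give no data), and the names `means`, `n_per_cell`, and `pooled_mean` are hypothetical. The sketch only shows how opposite effects in two strata can cancel in a pooled comparison.

```python
# Hypothetical mean injections per month in each sex-by-arm cell.
# These values are made up for illustration; they are NOT trial data.
means = {
    ("women", "pharmacy only"): 30.0,
    ("women", "pharmacy + exchange"): 20.0,
    ("men", "pharmacy only"): 30.0,
    ("men", "pharmacy + exchange"): 40.0,
}
n_per_cell = 50  # assumption: the same number of participants in every cell


def pooled_mean(arm):
    """Mean frequency in one trial arm, ignoring sex (weighted by cell size)."""
    total = sum(m * n_per_cell for (sex, a), m in means.items() if a == arm)
    count = sum(n_per_cell for (sex, a) in means if a == arm)
    return total / count


# Comparison of 2 groups: pooled over sex, the arms look identical.
overall_diff = pooled_mean("pharmacy + exchange") - pooled_mean("pharmacy only")

# Stratified comparison: the arm difference within each sex.
diff_by_sex = {
    sex: means[(sex, "pharmacy + exchange")] - means[(sex, "pharmacy only")]
    for sex in ("women", "men")
}

print(overall_diff)  # 0.0
print(diff_by_sex)   # {'women': -10.0, 'men': 10.0}
```

With these made-up values the pooled difference is zero while the stratum differences have opposite signs, which is the pattern the scenario describes.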
Statistical Hypothesis Testing is a Tool for the Investigation of Research Hypotheses.

Here are some examples of research hypotheses – also some study designs.

• Following counseling, access to needle exchange and pharmacies, compared to access to pharmacies alone, results in a lower 6-month sero-incidence of HIV infection.
  Study design - Randomized controlled trial
  Analysis Goal - Comparison of two groups

• The implementation of the policy of banning legal pharmacy sales of syringes will reduce the prevalence of drug injection in Anchorage.
  Study design - Repeated cross-sectional survey
  Analysis Goal - Comparison of two groups

• The delivery of an educational intervention to injection drug users in residential treatment will produce “safer” injection practices upon discharge.
  Study design - Intervention study
  Analysis Goal - Paired (Pre Test/Post Test) longitudinal comparison

• The cost to Anchorage, Alaska of screening 1000 injection drug users for Hepatitis C is $X.

The logic of proof by contradiction is used to evaluate alternative explanations for observed phenomena in what is called statistical hypothesis testing. As we will see, statistical inference is not biological inference.

In evaluating competing explanations for observed phenomena, we draw upon concepts of null and alternative hypotheses. Following are examples of null (H_O) and alternative (H_A) hypotheses.

• Following counseling, access to needle exchange and pharmacies, compared to access to pharmacies alone, results in a different 6-month sero-incidence of HIV infection.
  Let μ represent the mean 6-month sero-incidence of HIV infection.
  Group 1: Pharmacy Sales Access only (mean = μ_1)
  Group 2: Pharmacy Sales + Needle Exchange Access (mean = μ_2)

    H_O: μ_2 = μ_1
    H_A: μ_2 ≠ μ_1 (two sided)

  Note: For ethical reasons, many randomized trials involving human subjects cannot be justified without the belief that the alternative is two sided. The exception is equivalence trials.

• The implementation of the policy of banning legal pharmacy sales of syringes will reduce the prevalence of drug injection in Anchorage.

  Let π represent the prevalence of drug injection.
  Group 1: 1998 Anchorage population of drug injectors (mean = π_1)
  Group 2: 2000 Anchorage population of drug injectors (mean = π_2)

    H_O: π_2 = π_1
    H_A: π_2 < π_1 (one sided)

Lucky for us, it is possible to identify a reasonably consistent paradigm of steps in constructing a statistical hypothesis test, and it works for a variety of study design and analysis goal settings. Here they are.

1. Identify the research question.
2. State the assumptions necessary for computing probabilities.
3. Specify H_O and H_A.
4. “Reason” an appropriate test statistic.
5. Specify an “evaluation” rule.
6. Perform the calculations.
7. “Evaluate” findings and report.
8. Interpret in the context of biological relevance.
9. (Accompany the procedure with an appropriate confidence interval)

Following is a schematic of the thinking that underlies a statistical hypothesis test. In each picture below, the scenario considered is that there are two candidate source probability distributions that might have given rise to the observed sample mean. These are null versus alternative.

In the top picture, the two candidate source distributions are the same. I’ve drawn two curves only so that I can make the point that the two are one and the same. Take home – The data (the sample mean) is consistent with null.
In the lower picture, the null is the left distribution, the alternative is the right distribution. Take home – The data (the sample mean) is NOT consistent with the null.

The next schematic is intended to show you, with pictures, how the logic of “proof by contradiction” works. The setting is that the investigator wishes to assess, utilizing the tools of statistical hypothesis testing, the relative plausibility of two explanations for the observed data. As before, one explanation is the null and the other is the alternative. In many (but not all) settings, the proof by contradiction strategy is to designate as “null” the “there is nothing going on” explanation and to seek to advance the “alternative” (there is a treatment benefit, there is a change over time, or there is a difference between groups).

Step 1 – Grant the null …
The top picture represents the starting point for “proof by contradiction”. It is saying, schematically, “assume the null hypothesis is true”. Under this assumption, the true and the null curves are essentially the same. This is why the two curves are right on top of each other.

Step 2 – Collect data …
The middle picture represents the starting point for the investigator. He or she collects data and might summarize it in the form of the sample mean. The absence of a graph of a distribution is a reminder that the investigator doesn’t actually know which distribution gave rise to the data.

Step 3 – Argue “yes” or “no”: do the data contradict the null?
Represented in the lower picture is the sample mean again. Also shown is the distribution that gave rise to the sample mean if the null is true (left) and the distribution that gave rise to the sample mean if the alternative is true (right). The shaded area is a probability calculation under the assumption that the null is true.
It answers the question “Under the assumption of the null hypothesis, what are the chances of a value of the sample mean as extreme as, or more extreme than, the one observed?”

Small chances contradict the null, suggesting REJECT.
Large chances are consistent with the null, suggesting ACCEPT.

Let’s look at the question “what are the chances of a sample mean as extreme or more extreme”, separately for two scenarios.

Scenario 1 - NULL is true
• Observed sample mean is close to the null mean.
• Likelihood of being “this far away”, when calculated pretending that the null is true, produces a large value.
• Statistical decision - “do NOT reject”.

Scenario 2 - ALTERNATIVE is true
• Observed sample mean is now close to the alternative mean.
• Likelihood of being “this far away”, when calculated pretending that the null is true, produces a small value.
• Statistical decision - “reject”.

• Do you notice in this logical framework the implicit assumption that the X̄ value that is available to us is close to its true mean?
• In the next pages, you will learn that the calculation shown here to answer the question “If I pretend that the null hypothesis is true, then what were my chances of observing a sample mean as far away as the value obtained?” is a p-value calculation.
• “p-value” goes by a variety of names: p-value, significance level, achieved significance.

Illustration of the logic of hypothesis testing.

You may recall this example from the course introduction. Consider a setting where, with standard care, cancer patients are expected to survive a mean duration of time equal to 38.3 months. Investigators are hopeful that a new therapy will improve survival. Suppose that the new therapy is administered to 100 cancer patients. It is observed that they experience instead an average survival time of 46.9 months.
Is survival statistically significantly improved (relative to standard care) with receipt of the new therapy?

This illustration follows the steps outlined above.

1. Identify the research question
With standard care, the expected survival time is μ = 38.3 months. With the new therapy, the observed 100 survival times, X_1, X_2, ..., X_100, have average X̄_(n=100) = 46.9 months. Is this compelling evidence that μ_true > 38.3?

Assumptions are needed for computing probabilities. For now, we’ll assume that the 100 survival times follow a distribution that is Normal (Gaussian). We’ll suppose further that it is known that σ² = 43.3² months².

Note – In real life, this would not be a very reasonable assumption as survival distributions tend to be quite skewed. Normality is assumed here, and only for illustration purposes, so as to keep the example simple.

2. Specify the null and alternative hypotheses

    H_O: μ_true = μ_O ≤ 38.3 months
    H_A: μ_true = μ_A > 38.3 months
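
This excerpt stops before the test statistic is computed. As a minimal sketch, assuming the known-variance z-test named in the outline (topic 4) is the intended procedure, the chance of a sample mean at least as large as 46.9 when the null mean is 38.3 could be computed as follows. The function name `z_test_upper` is hypothetical; the inputs are the values stated in the example.

```python
import math


def z_test_upper(xbar, mu0, sigma, n):
    """Upper-tail z-test for a Normal mean when sigma is assumed known.

    Returns the z statistic and the one-sided p-value P(Z >= z),
    calculated under the assumption that the null mean mu0 is true.
    """
    standard_error = sigma / math.sqrt(n)
    z = (xbar - mu0) / standard_error
    # Upper-tail area of the standard Normal, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Values from the example: X-bar = 46.9, null mean = 38.3, sigma = 43.3, n = 100.
z, p = z_test_upper(xbar=46.9, mu0=38.3, sigma=43.3, n=100)
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
```

Under these assumptions z is a little under 2 and the one-sided p-value falls between 0.02 and 0.03. A small chance of this kind is what the notes describe as contradicting the null.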
