Hypothesis Testing
Chapter 12
(Part 2)
Decision Making
You must select a significance level, \(\alpha\), before collecting the data.
- When p-value \(< \alpha\) we reject the null hypothesis.
- When p-value \(\geq \alpha\) we fail to reject the null hypothesis.
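The decision rule above can be sketched in a few lines of Python. The function name `decide` and the default `alpha = 0.05` are assumptions made for illustration; the slides do not prescribe either.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject only when p-value < alpha.

    `alpha` must be chosen before looking at the data; 0.05 is just a
    common default used here for illustration.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.03))  # p-value below alpha
print(decide(0.05))  # p-value equal to alpha counts as "fail to reject"
print(decide(0.20))  # p-value above alpha
```

Note that a p-value exactly equal to \(\alpha\) falls on the "fail to reject" side, matching the \(\geq\) in the rule.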
Sample Interpretation
Assuming NULL HYPOTHESIS IS TRUE IN CONTEXT OF PROBLEM, there is about a PVALUE% chance of seeing data at least as extreme as mine. Since the p-value is less than (or greater than or equal to) our threshold ALPHA%, we reject (or fail to reject) the null hypothesis. This means CONCLUSION IN CONTEXT OF PROBLEM.
Two-tail vs one-tail tests
- One-tailed tests provide more power to detect an effect in the hypothesized direction. You may be tempted to use a one-tailed test whenever you have a hypothesis about the direction of an effect.
- Before doing so, consider the consequences of missing an effect in the other direction and the consequences of increasing your Type I error rate.
- You can and almost always should use a two-tailed test NO MATTER how the problem is worded.
- We focus on two-tailed tests in this class because they are the safe option and we do NOT want to miss an effect in the opposite direction.
https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faq-what-are-the-differences-between-one-tailed-and-two-tailed-tests/
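To see where the extra power of a one-tailed test comes from, it can help to compare rejection cutoffs. The sketch below assumes a z-test (standard normal test statistic) and \(\alpha = 0.05\); both are assumptions for illustration, and the test used in class may differ.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-tailed (upper) test: all of alpha sits in one tail.
one_tailed_cutoff = std_normal.inv_cdf(1 - alpha)

# Two-tailed test: alpha is split evenly between the two tails.
two_tailed_cutoff = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed cutoff: z > {one_tailed_cutoff:.3f}")
print(f"two-tailed cutoff: |z| > {two_tailed_cutoff:.3f}")

# A hypothetical test statistic that lands between the two cutoffs.
z = 1.8
print("one-tailed rejects:", z > one_tailed_cutoff)
print("two-tailed rejects:", abs(z) > two_tailed_cutoff)
```

The one-tailed cutoff is smaller, so a moderate effect in the hypothesized direction is easier to declare significant. The price is that nothing in the opposite tail can ever lead to rejection.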
Example
Imagine you have developed a new drug that you believe is an improvement over an existing drug. You wish to maximize your ability to detect the improvement, so you opt for a one-tailed test.
- In doing so, you fail to test for the possibility that the new drug is less effective than the existing drug.
- The consequences in this example are extreme, but they illustrate a danger of inappropriate use of a one-tailed test.
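A small numeric sketch of the drug example: suppose the new drug actually performs worse, giving a strongly negative test statistic. The value `z = -2.5`, the z-test setup, and \(\alpha = 0.05\) are all invented for illustration and do not come from real data.

```python
from statistics import NormalDist

# Hypothetical test statistic: negative means the new drug did WORSE
# than the existing drug. This number is made up for illustration.
z = -2.5
alpha = 0.05  # assumed significance level

std_normal = NormalDist()

# One-tailed test with alternative "new drug is better" (upper tail only).
p_one_tailed = 1.0 - std_normal.cdf(z)

# Two-tailed test with alternative "the drugs differ" (both tails).
p_two_tailed = 2.0 * (1.0 - std_normal.cdf(abs(z)))

print(f"one-tailed p-value: {p_one_tailed:.4f}, reject: {p_one_tailed < alpha}")
print(f"two-tailed p-value: {p_two_tailed:.4f}, reject: {p_two_tailed < alpha}")
```

Under these assumptions the one-tailed test fails to reject even though the data point strongly toward the new drug being less effective, while the two-tailed test flags the difference.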
Deeper Understanding
Which type of error is usually more of a priority?
Which type of error is usually easier to control? Why?
Deeper Understanding (continued)
Can we lower both errors at the same time?
Can we know if we made a correct decision?
Deeper Understanding (continued)
How does sample size impact decision making?
- For the same observed effect, increasing the sample size decreases the standard error, which makes the STAT more extreme and the p-value smaller.
- As a result, it is harder to detect statistical significance and reject the null with a smaller sample size.
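The effect of sample size can be sketched with a one-sample z-test in which the observed difference from the null value is held fixed while \(n\) grows. The function name `two_sided_z_test`, the observed difference of 0.5, and the known standard deviation of 2.0 are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(observed_diff: float, sigma: float, n: int):
    """Return (standard error, z statistic, two-sided p-value).

    observed_diff: sample mean minus the null-hypothesis mean
    sigma: population standard deviation, assumed known
    n: sample size
    """
    se = sigma / sqrt(n)          # standard error shrinks as n grows
    z = observed_diff / se        # so the test statistic gets more extreme
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p


# Same observed difference and spread, increasing sample size.
results = [two_sided_z_test(0.5, 2.0, n) for n in (10, 40, 160)]
for n, (se, z, p) in zip((10, 40, 160), results):
    print(f"n={n:4d}  SE={se:.3f}  z={z:.2f}  p-value={p:.4f}")
```

With these hypothetical numbers, the same observed difference is not significant at \(\alpha = 0.05\) for the two smaller samples but is for the largest one.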