Hypothesis Testing
Chapter 12
(Part 2)

Steps for conducting a Hypothesis test

  1. Formulate a hypothesis. Define \(H_0\) and \(H_A\).
  2. Choose a significance level (\(\alpha\)), which is the Type I error rate you are willing to accept.
  3. Determine the distribution of the test statistic under \(H_0\) (t-distribution or normal).
  4. Collect your sample data and compute the test statistic.
  5. Determine the p-value based on the test statistic.
  6. Draw conclusions in the context of the problem.
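
The six steps above can be sketched in code. This is a minimal illustration, not a prescribed method: it assumes a two-sided one-sample z-test with a known population standard deviation, and the data, `mu0`, `sigma`, and the function name `two_sided_z_test` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mu = mu0 against HA: mu != mu0, assuming sigma is known."""
    n = len(sample)
    standard_error = sigma / sqrt(n)
    # Step 4: test statistic computed from the sample
    z = (mean(sample) - mu0) / standard_error
    # Step 5: two-sided p-value under the standard normal (step 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 6 starts from this decision; the wording in context is up to you
    return z, p_value, p_value < alpha


# Steps 1 and 2: H0: mu = 5.0, HA: mu != 5.0, alpha = 0.05 (all hypothetical)
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.2, 5.5, 5.4, 5.9]
z, p_value, reject = two_sided_z_test(sample, mu0=5.0, sigma=0.5, alpha=0.05)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

When the population standard deviation is unknown, the t-distribution would replace the normal in step 3.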

Decision Making

You must select a significance level, \(\alpha\), before collecting the data.

  • When p-value \(< \alpha\) we reject the null hypothesis.
  • When p-value \(\geq \alpha\) we fail to reject the null hypothesis.
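
The decision rule above is small enough to write as a function. This is a sketch only; the name `decide` and the returned strings are illustrative choices.

```python
def decide(p_value, alpha):
    """Apply the rule: reject H0 only when the p-value is strictly below alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03, 0.05))
print(decide(0.05, 0.05))  # equality falls on the "fail to reject" side
```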

Sample Interpretation


Assuming NULL HYPOTHESIS IS TRUE IN CONTEXT OF PROBLEM, there is about a PVALUE% chance of seeing data at least as extreme as mine. Since the p-value is less than (or greater than or equal to) our threshold ALPHA%, we reject (or fail to reject) the null hypothesis. This means CONCLUSION IN CONTEXT OF PROBLEM.

Two-tailed vs one-tailed tests

  • One-tailed tests provide more power to detect an effect in the hypothesized direction. You may be tempted to use a one-tailed test whenever you have a hypothesis about the direction of an effect.
  • Before doing so, consider the consequences of missing an effect in the other direction and the consequences of increasing your Type I error rate.
  • You can and almost always should use a two-tailed test NO MATTER how the problem is worded.
  • We focus on two-tailed tests in this class because they are the safer option and we do NOT want to miss an effect in the opposite direction.
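
To see where the extra power of a one-tailed test comes from, compare the two p-values for the same test statistic. The sketch below assumes a standard normal test statistic; the value `z = 1.8` and the helper name `p_values` are hypothetical.

```python
from statistics import NormalDist


def p_values(z):
    """Return (upper one-tailed, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return upper, two_sided


alpha = 0.05
z = 1.8  # hypothetical statistic, in the hypothesized direction
one_tailed, two_tailed = p_values(z)
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

When the statistic falls in the hypothesized direction, the one-tailed p-value is half the two-tailed one, so the same data can be significant under one test and not the other.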

https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faq-what-are-the-differences-between-one-tailed-and-two-tailed-tests/

Example

Imagine you have developed a new drug that you believe is an improvement over an existing drug. You wish to maximize your ability to detect the improvement, so you opt for a one-tailed test.

  • In doing so, you fail to test for the possibility that the new drug is less effective than the existing drug.
  • The consequences in this example are extreme, but they illustrate a danger of inappropriate use of a one-tailed test.
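
The danger in the drug example can be made concrete. The sketch below assumes a standard normal test statistic where positive values favor the new drug; the value `z = -2.5` is hypothetical and represents a new drug that performs clearly worse.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical outcome: the new drug does worse than the existing one
z = -2.5

# One-tailed test of HA: the new drug is better (upper tail only)
p_one_tailed = 1 - std_normal.cdf(z)

# Two-tailed test of HA: the new drug differs in either direction
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

The one-tailed test returns a very large p-value and gives no signal that anything is wrong, while the two-tailed test flags the difference.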

Deeper Understanding

Which type of error is usually more of a priority?



Which type of error is usually easier to control? Why?

Deeper Understanding (continued)

Can we lower both errors at the same time?



Can we know if we made a correct decision?

Deeper Understanding (continued)

How does sample size impact decision making?

  • For a fixed observed effect and variability, increasing the sample size decreases the standard error, which makes the test statistic more extreme and the p-value smaller.
  • As a result, with a smaller sample size it is harder to detect statistical significance and reject the null.
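
The effect of sample size can be checked numerically. This sketch holds the observed effect and the standard deviation fixed and varies only `n`; it assumes a two-sided z-test, and the effect size, `sigma`, sample sizes, and the helper name `z_and_p` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_and_p(effect, sigma, n):
    """Return (standard error, z statistic, two-sided p-value) for sample size n."""
    standard_error = sigma / sqrt(n)
    z = effect / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return standard_error, z, p_value


sizes = (10, 40, 160)
# Same observed difference from the null value at every sample size
results = [z_and_p(effect=0.2, sigma=1.0, n=n) for n in sizes]

for n, (se, z, p) in zip(sizes, results):
    print(f"n = {n:3d}: SE = {se:.3f}, z = {z:.3f}, p-value = {p:.4f}")
```

Each fourfold increase in `n` halves the standard error and doubles the test statistic, so the same observed effect moves from not significant to significant at \(\alpha = 0.05\).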