Chi-Square Test
Sal Khan’s tutorial on YouTube shows how to check whether two herbs really help a group of people avoid the flu, and it helped me understand the Chi-Squared test.
In the illustration above, 1 stands for herb 1, 2 for herb 2, and _ for the placebo. Red/green colors represent healthy/sick individuals.
Use the Chi-Squared test when your data fall into n discrete classes arranged in a contingency table. Here I start with only a list of individuals, so I build the contingency table by grouping the data by herb and sickness status and counting the number of patients in each group.
| Outcome / Treatment | Herb1 | Herb2 | Placebo | Total |
|---------------------|-------|-------|---------|-------|
| sick | 20 | 30 | 30 | 80 |
| not sick | 100 | 110 | 90 | 300 |
| Total | 120 | 140 | 120 | 380 |
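The grouping step described above can be sketched in a few lines of Python. The raw per-person records are not shown in this post, so the `records` list below is rebuilt from the counts in the table purely for illustration; the labels `Herb1`, `sick`, and so on are assumed names.

```python
from collections import Counter

# Hypothetical raw data: one (treatment, outcome) pair per person.
# Rebuilt here from the table counts, since the original records are not shown.
records = (
    [("Herb1", "sick")] * 20 + [("Herb1", "not sick")] * 100
    + [("Herb2", "sick")] * 30 + [("Herb2", "not sick")] * 110
    + [("Placebo", "sick")] * 30 + [("Placebo", "not sick")] * 90
)

# Group by treatment and outcome, counting the people in each cell.
cell_counts = Counter(records)

treatments = ["Herb1", "Herb2", "Placebo"]
outcomes = ["sick", "not sick"]

# Arrange the counts as rows (outcomes) by columns (treatments).
table = [[cell_counts[(t, o)] for t in treatments] for o in outcomes]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

print(table)       # [[20, 30, 30], [100, 110, 90]]
print(row_totals)  # [80, 300]
print(col_totals)  # [120, 140, 120]
```

If the data already live in a pandas DataFrame, `pd.crosstab` on the two columns should produce the same table.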
Can you see any difference in the effect of herbs on protection against the flu?
Under the null hypothesis, denoted by H0, we assume that the herbs do nothing. In our example, H0 says that the proportion of people who get sick is the same whether they take herb 1, herb 2, or the placebo. If H0 holds, any differences between the groups in the table are due to chance: the study just happened to include more (or fewer) sick people among those taking a particular herb.
In contrast, the hypothesis under which the herbs do have an effect is known as the alternative hypothesis, denoted by H1. Here H1 is that taking the herbs changes (increases or decreases) the likelihood of getting the flu.
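To make H0 concrete: if the herbs do nothing, each cell's expected count is its row total times its column total divided by the grand total. The chi-square statistic then sums the squared gaps between observed and expected counts, each divided by the expected count. Below is a minimal sketch with the table above hard-coded; the function name `chi_square_statistic` is my own, not from a library.

```python
def chi_square_statistic(observed):
    """Return (statistic, expected counts) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under H0 (outcome independent of treatment).
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected


# Rows: sick, not sick. Columns: Herb1, Herb2, Placebo.
observed = [[20, 30, 30], [100, 110, 90]]
statistic, expected = chi_square_statistic(observed)

print([round(e, 1) for e in expected[0]])  # [25.3, 29.5, 25.3]
print(round(statistic, 2))                 # 2.53
```

In practice `scipy.stats.chi2_contingency(observed)` would be the usual route; it returns the same statistic together with a p-value, the degrees of freedom, and the expected counts.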
Significance level
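The probability of rejecting H0 when it is actually true has an upper bound, known as the significance level. We use a level of 5%: if the probability of seeing data at least this extreme under H0 is 5% or above, we do not reject H0, since we do not want to reject it if we do not have to.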
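As a sketch of how the significance level is used: for the herb table the chi-square statistic works out to about 2.53, with (2 - 1) × (3 - 1) = 2 degrees of freedom. For exactly 2 degrees of freedom the chi-square tail probability has the closed form exp(-x / 2), so the p-value needs only the standard library. The helper name `p_value_dof2` is an assumption for illustration.

```python
import math

ALPHA = 0.05  # significance level: upper bound on P(reject H0 | H0 true)


def p_value_dof2(statistic):
    """Tail probability of a chi-square variable with 2 degrees of freedom.

    Valid only for 2 degrees of freedom, where the survival function
    reduces to exp(-x / 2).
    """
    return math.exp(-statistic / 2)


statistic = 2.5258  # chi-square statistic for the herb table above
p_value = p_value_dof2(statistic)
reject_h0 = p_value < ALPHA

print(round(p_value, 3))  # 0.283
print(reject_h0)          # False
```

Since the p-value is well above 5%, these counts give no grounds to reject H0. For tables with other degrees of freedom the closed form no longer applies, and something like `scipy.stats.chi2.sf` would be needed instead.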