In a sample of 200 workers, 45% said they missed work because of personal illness. Ten years ago, in a sample of 200 workers, 35% said they missed work because of personal illness. At $\alpha = 0.01$, is there a difference in the proportions?

Solution

Given that $n_1 = 200$ and $n_2 = 200$.

The sample proportions are
$\hat{p}_1=0.45$ and $\hat{p}_2=0.35$.

The pooled estimate of the common proportion is
$\hat{p} =\frac{n_1\hat{p}_1+ n_2 \hat{p}_2}{n_1+n_2}=\frac{200*0.45+200*0.35}{200+200} =0.4$
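The pooled estimate above can be checked numerically. The following is a minimal sketch; the variable names (`n1`, `p1_hat`, `p_pooled`, and so on) are illustrative choices, not part of the original solution.

```python
# Sample sizes and sample proportions from the problem statement
n1, n2 = 200, 200
p1_hat, p2_hat = 0.45, 0.35

# Pooled proportion: total "successes" divided by total sample size
p_pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

print(round(p_pooled, 4))
```

Because the two samples have equal size, the pooled estimate is simply the average of the two sample proportions.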

Step 1 State the hypothesis testing problem

The hypothesis testing problem is

$H_0 : p_1 = p_2$ against $H_1 : p_1 \neq p_2$ ($\textit{two-tailed}$)

Step 2 Define test statistic

The test statistic for testing the above hypotheses is
$$ \begin{aligned} Z & =\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}}}. \end{aligned} $$

Under $H_0$, the test statistic $Z$ approximately follows the standard normal distribution $N(0,1)$.

Step 3 Specify the level of significance $\alpha$

The significance level is $\alpha = 0.01$.

Step 4 Determine the critical value

As the alternative hypothesis is two-tailed, the critical values of $Z$ are $-2.58$ and $2.58$ (from the standard normal table).

[Figure: critical values of Z for a two-tailed test at $\alpha = 0.01$]

The rejection region (i.e. critical region) is $Z < -2.58$ or $Z > 2.58$.
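The table lookup for the critical value can also be reproduced in code. This sketch assumes Python's standard-library `statistics.NormalDist`; the names `alpha` and `z_crit` are illustrative.

```python
from statistics import NormalDist

alpha = 0.01

# For a two-tailed test, alpha/2 sits in each tail,
# so the upper critical value is the (1 - alpha/2) quantile.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(round(-z_crit, 2), round(z_crit, 2))
```

The unrounded quantile is about 2.576, which statistical tables usually report as 2.58.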

Step 5 Computation

The test statistic under the null hypothesis is
$$ \begin{aligned} Z_{obs}&= \frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}}}\\ &= \frac{(0.45-0.35)-0}{\sqrt{\frac{0.4*(1-0.4)}{200}+\frac{0.4*(1-0.4)}{200}}}\\ &= 2.041 \end{aligned} $$
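The arithmetic in this step can be verified with a short script. This is a sketch under the same inputs as the problem; the variable names are illustrative, and the hypothesized difference is set to 0 because $H_0$ states $p_1 = p_2$.

```python
from math import sqrt

n1, n2 = 200, 200
p1_hat, p2_hat = 0.45, 0.35

# Pooled proportion under H0: p1 = p2
p_pooled = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

# Standard error of the difference, using the pooled proportion
se = sqrt(p_pooled * (1 - p_pooled) / n1 + p_pooled * (1 - p_pooled) / n2)

# Observed test statistic; the hypothesized difference p1 - p2 is 0
z_obs = ((p1_hat - p2_hat) - 0) / se

print(round(z_obs, 3))
```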

Step 6 Decision

Traditional approach:

The rejection region (i.e. critical region) is $Z < -2.58$ or $Z > 2.58$. The test statistic $Z_{obs} = 2.041$ falls outside the critical region, so we fail to reject the null hypothesis.

OR

$p$-value approach:

The test is two-tailed, so the $p$-value is the total area in both tails beyond the observed test statistic: $p\text{-value} = 2\,P(Z > |Z_{obs}|) = 2\,P(Z > 2.041) = 0.0412$.

The $p$-value is $0.0412$, which is greater than the significance level $\alpha = 0.01$, so we fail to reject the null hypothesis.
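The $p$-value and the resulting decision can be sketched as follows, again assuming the standard-library `statistics.NormalDist`. The value `z_obs = 2.041` is taken from Step 5; the names `p_value` and `reject_h0` are illustrative.

```python
from statistics import NormalDist

alpha = 0.01
z_obs = 2.041  # observed test statistic from Step 5

# Two-tailed p-value: twice the upper-tail area beyond |z_obs|
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))

# Reject H0 only if the p-value is below the significance level
reject_h0 = p_value < alpha

print(round(p_value, 4), reject_h0)
```

Both approaches agree: the observed statistic lies inside the interval $(-2.58, 2.58)$ and the $p$-value exceeds $\alpha$.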

There is not sufficient evidence to conclude that there is a difference in the proportions.
