In the United States, approximately 19% of adult deaths are caused by cancer, 47% by heart disease, and the remaining 34% by other causes. A study was done on causes of death for 650 smokers, with the results given below.

Cause of death | Cancer | Heart disease | Other |
---|---|---|---|
Number of adults | 135 | 310 | 205 |

Use a 0.05 significance level to test the claim that causes of death for smokers do not fit the distribution given above.

#### Solution

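The observed frequencies, together with the proportions specified for the general adult population, are

Cause | Obs. Freq. $(O)$ | Hypothesized prop. $p_i$ |
---|---|---|
Cancer | 135 | 0.19 |
Heart Disease | 310 | 0.47 |
Other | 205 | 0.34 |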
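Before running the test, it can help to confirm that the data are entered consistently. The following is a minimal sketch, with variable names (`observed`, `hypothesized`) chosen here for illustration only, that checks the sample size and that the hypothesized proportions sum to 1.

```python
import math

# Observed counts from the study of 650 smokers (values from the table above).
observed = {"Cancer": 135, "Heart disease": 310, "Other": 205}

# Proportions specified for the general adult population.
hypothesized = {"Cancer": 0.19, "Heart disease": 0.47, "Other": 0.34}

n = sum(observed.values())               # total number of deaths in the sample
prop_total = sum(hypothesized.values())  # should be 1 up to floating-point error

print(f"Sample size N = {n}")
print(f"Proportions sum to 1: {math.isclose(prop_total, 1.0)}")
```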

##### Step 1 Set up the hypotheses

The null and alternative hypotheses are as follows:

`$H_0:p_{\text{Cancer}} = 0.19, p_{\text{Heart disease}}=0.47, p_{\text{Other}}=0.34$`

$H_1:$ Causes of death for smokers do not fit the specified distribution.

##### Step 2 Test statistic

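The test statistic for testing the above hypotheses is

` $$ \begin{equation*} \chi^2= \sum \frac{(O_{i} -E_{i})^2}{E_{i}} \sim \chi^2_{(k-1)} \end{equation*} $$ `

where $O_i$ and $E_i$ are the observed and expected frequencies of the $i$-th category and $k$ is the number of categories. Under $H_0$, the statistic follows approximately a $\chi^2$ distribution with $k-1$ degrees of freedom.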

##### Step 3 Level of Significance

The level of significance is $\alpha =0.05$.

##### Step 4 Critical value of $\chi^2$

With $k=3$ categories, the degrees of freedom are $df=k-1=3-1 =2$.

The critical value of $\chi^2$ for $df=2$ at the $\alpha=0.05$ level of significance is $\chi^2 =5.9915$. The null hypothesis is rejected if the test statistic exceeds this value.
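The critical value can be reproduced without a table in this particular case. For $df=2$ the upper-tail probability of the $\chi^2$ distribution has the closed form $P(\chi^2_2 > x) = e^{-x/2}$, so the critical value solves $e^{-x/2}=\alpha$. The sketch below relies on that special case only; the function name `chi2_critical_df2` is hypothetical, and other degrees of freedom would need a statistical library or table.

```python
import math

def chi2_critical_df2(alpha: float) -> float:
    """Upper-tail critical value of a chi-square distribution with 2 df.

    Valid for df = 2 only, where P(X > x) = exp(-x / 2).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -2.0 * math.log(alpha)

critical_value = chi2_critical_df2(0.05)
print(round(critical_value, 4))  # 5.9915
```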

##### Step 5 Computation of the test statistic

The expected frequencies under $H_0$ can be calculated as

` $$ \begin{equation*} E_{i} =N \times p_i \end{equation*} $$ `

where $N=650$ is the total number of deaths in the sample. For example, $E_{1}$ is given by

` $$ \begin{eqnarray*} E_{1} & = &N \times p_1\\ &=& 650 \times 0.19\\ &=&123.5. \end{eqnarray*} $$ `

Cause | Obs. Freq. $(O)$ | Prop. $p_i$ | Exp. Freq. $(E)$ | $(O-E)^2/E$ |
---|---|---|---|---|
Cancer | 135 | 0.19 | 123.5 | 1.071 |
Heart Disease | 310 | 0.47 | 305.5 | 0.066 |
Other | 205 | 0.34 | 221 | 1.158 |
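The table can be reproduced with a few lines of code. This is a minimal sketch with illustrative variable names; it computes each expected frequency as $N \times p_i$ and each category's contribution $(O-E)^2/E$.

```python
labels = ["Cancer", "Heart disease", "Other"]
observed = [135, 310, 205]
proportions = [0.19, 0.47, 0.34]

n = sum(observed)  # N = 650

# Expected frequency of each category under the null hypothesis.
expected = [n * p for p in proportions]

# Contribution of each category to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

for label, o, e, c in zip(labels, observed, expected, contributions):
    print(f"{label:<14} O={o:>4}  E={e:>6.1f}  (O-E)^2/E={c:.3f}")
```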

The value of the test statistic is

` $$ \begin{eqnarray*} \chi^2&=& \sum \frac{(O_{i} -E_{i})^2}{E_{i}}\\ &=&\frac{(135-123.5)^2}{123.5}+\frac{(310-305.5)^2}{305.5} + \frac{(205-221)^2}{221}\\ &=& 1.071 +0.066 + 1.158\\ &=& 2.295. \end{eqnarray*} $$ `
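The same sum can be wrapped in a small reusable function. The sketch below uses a hypothetical helper name, `chi_square_statistic`, and is not taken from any library. Note that summing the unrounded terms gives about 2.2955, while adding the three rounded terms gives the 2.295 reported here; the difference does not affect the decision.

```python
def chi_square_statistic(observed, proportions):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E with E = N * p."""
    if len(observed) != len(proportions):
        raise ValueError("observed and proportions must have the same length")
    n = sum(observed)
    total = 0.0
    for o, p in zip(observed, proportions):
        e = n * p  # expected frequency under the null hypothesis
        total += (o - e) ** 2 / e
    return total

stat = chi_square_statistic([135, 310, 205], [0.19, 0.47, 0.34])
print(f"chi-square statistic = {stat:.4f}")
```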

##### Step 6 Decision (Traditional approach)

The test statistic $\chi^2 =2.295$ is less than the critical value $5.9915$, so it falls *outside* the critical region. We therefore *fail to reject* the null hypothesis.

**OR**

##### Step 6 Decision ($p$-value approach)

The $p$-value is $P(\chi^2_{2}>2.295) =0.3174$.

As the $p$-value $0.3174$ is *greater than* the significance level $\alpha = 0.05$, we *fail to reject* the null hypothesis.

There is not sufficient evidence to support the claim that causes of death for smokers do not fit the specified distribution.
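Both decision rules can be checked numerically. The sketch below again uses the $df=2$ special case, where $P(\chi^2_2 > x)=e^{-x/2}$; the function name `chi2_sf_df2` is hypothetical, and a general solution would use a statistical library's chi-square survival function.

```python
import math

def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 2 df."""
    if x < 0:
        return 1.0
    return math.exp(-x / 2.0)

alpha = 0.05
stat = 2.295                              # test statistic from Step 5
critical_value = -2.0 * math.log(alpha)   # about 5.9915 for df = 2

p_value = chi2_sf_df2(stat)

# The two approaches always agree: stat > critical value iff p-value < alpha.
reject_traditional = stat > critical_value
reject_p_value = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject_p_value else "Fail to reject H0")
```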
