Among the four northwestern states, Washington has 51% of the total population, Oregon has 30%, Idaho has 11%, and Montana has 8%. A market researcher selects a sample of 1000 subjects, with 450 in Washington, 340 in Oregon, 150 in Idaho and 60 in Montana. At the 0.05 significance level, test the claim that the sample of 1000 has a distribution that agrees with the distribution of state populations.

#### Solution

The proportions are $p_{Washington}=0.51, p_{Oregon} =0.3, p_{Idaho} = 0.11, p_{Montana} = 0.08$.

The observed data are

State | Obs. Freq. $(O)$ | Prop. |
---|---|---|
Washington | 450 | 0.51 |
Oregon | 340 | 0.3 |
Idaho | 150 | 0.11 |
Montana | 60 | 0.08 |

The null and alternative hypotheses are as follows:

$H_0:p_{Washington}=0.51, p_{Oregon} =0.3, p_{Idaho} = 0.11, p_{Montana} = 0.08$

(i.e., The sample of 1000 subjects has a distribution that agrees with the distribution of state populations.)

$H_1:$ The sample of 1000 subjects has a distribution that does not agree with the distribution of state populations.

**Test statistic**

The test statistic for testing the above hypothesis is

` $$ \begin{aligned} \chi^2& = \sum \frac{(O_{i} -E_{i})^2}{E_{i}} \sim \chi^2_{(k-1)}\\ \end{aligned} $$ `

**Level of Significance**

The level of significance is $\alpha =0.05$.

**Critical value of $\chi^2$**

With $k=4$ categories (states), the degrees of freedom are $df=k-1=4-1 =3$.

The critical value of $\chi^2$ for $df=3$ and $\alpha=0.05$ level of significance is $\chi^2 =7.8147$.
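The critical value above can be checked numerically. The following is a minimal sketch that uses only the Python standard library; it relies on a closed-form expression for the upper-tail probability that holds for $df=3$ only, and the helper names `chi2_sf_df3` and `chi2_critical_df3` are hypothetical, introduced here for illustration.

```python
import math

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form valid for df = 3 only (assumption of this sketch).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_critical_df3(alpha: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Solve P(X > x) = alpha by bisection; the tail probability decreases in x."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > alpha:
            lo = mid  # tail still too heavy, move right
        else:
            hi = mid  # tail at or below alpha, move left
    return (lo + hi) / 2

critical_value = chi2_critical_df3(0.05)
print(round(critical_value, 4))  # approximately 7.8147
```

For other degrees of freedom a statistical library or a chi-square table would be the usual route; the closed form here is a convenience for this particular example.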

**Test Statistic**

The expected frequencies can be calculated as

` $$ \begin{aligned} E_{i} =N*p_i \end{aligned} $$ `

For example, $E_{1}$ is given by

` $$ \begin{aligned} E_{1} & = N*p_1\\ &= 1000*0.51\\ &=510. \end{aligned} $$ `

$E_{2}$ is given by

` $$ \begin{aligned} E_{2} & = N*p_2\\ &= 1000*0.3\\ &=300. \end{aligned} $$ `

$E_{3}$ is given by

` $$ \begin{aligned} E_{3} & = N*p_3\\ &= 1000*0.11\\ &=110. \end{aligned} $$ `

$E_{4}$ is given by

` $$ \begin{aligned} E_{4} & = N*p_4\\ &= 1000*0.08\\ &=80. \end{aligned} $$ `
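The four expected frequencies can also be computed in one pass. This is a small sketch of $E_i = N p_i$; the variable names (`n`, `proportions`, `expected`) are chosen here for illustration.

```python
# Sample size and hypothesized population proportions from the problem statement
n = 1000
proportions = {"Washington": 0.51, "Oregon": 0.30, "Idaho": 0.11, "Montana": 0.08}

# The hypothesized proportions should account for the whole population
assert abs(sum(proportions.values()) - 1.0) < 1e-9

# Expected frequency for each state: E_i = N * p_i
expected = {state: n * p for state, p in proportions.items()}

for state, e in expected.items():
    print(f"{state}: {e:.0f}")
```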

State | Obs. Freq. $(O)$ | Prop. $p_i$ | Exp. Freq. $(E)$ | $(O-E)^2/E$ |
---|---|---|---|---|
Washington | 450 | 0.51 | 510 | 7.059 |
Oregon | 340 | 0.3 | 300 | 5.333 |
Idaho | 150 | 0.11 | 110 | 14.545 |
Montana | 60 | 0.08 | 80 | 5 |

The test statistic is

` $$ \begin{aligned} \chi^2&= \sum \frac{(O_{i} -E_{i})^2}{E_{i}} \sim \chi^2_{(k-1)}\\ &=\frac{(450-510)^2}{510}+\frac{(340-300)^2}{300}+\frac{(150-110)^2}{110}+ \frac{(60-80)^2}{80}\\ &= 7.059 +5.333 +14.545+ 5\\ &= 31.937. \end{aligned} $$ `
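The arithmetic above can be reproduced with a short script. This is a sketch using the observed and expected frequencies from the table; the variable names are illustrative.

```python
# Observed and expected frequencies in the order
# Washington, Oregon, Idaho, Montana
observed = [450, 340, 150, 60]
expected = [510, 300, 110, 80]

# Per-state contribution (O - E)^2 / E
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Chi-square statistic is the sum of the contributions
chi_square = sum(contributions)

for c in contributions:
    print(round(c, 3))
print(round(chi_square, 3))
```

Summing the unrounded contributions gives about 31.938; the value 31.937 in the text comes from adding the contributions after rounding each to three decimals. The difference does not affect the decision.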

**Decision (Traditional approach)**

The test statistic $\chi^2 =31.937$ is greater than the critical value $7.8147$, so it falls *inside* the critical region. We therefore $\textit{reject}$ the null hypothesis.

OR

**Decision ($p$-value approach)**

The p-value is $P(\chi^2_{3}>31.937) < 0.0001$, which is $0$ when rounded to four decimal places.

As the p-value is $\textit{less than}$ the significance level of $\alpha = 0.05$, we $\textit{reject}$ the null hypothesis.

We conclude that, at the $\alpha = 0.05$ level of significance, the sample of 1000 subjects has a distribution that does not agree with the distribution of state populations.
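The p-value decision can be checked with a few lines of standard-library Python. As a sketch, it uses a closed-form upper-tail probability that is valid for $df=3$ only; the function name `chi2_sf_df3` is hypothetical and introduced here for illustration.

```python
import math

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for chi-square with 3 df (df = 3 only)."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi_square = 31.937  # test statistic from the worked example
alpha = 0.05         # significance level

p_value = chi2_sf_df3(chi_square)
reject_null = p_value < alpha

print(f"p-value = {p_value:.2e}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The p-value is tiny but not exactly zero, which is why it shows as $0$ only after rounding.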
