Among the four northwestern states, Washington has 51% of the total population, Oregon has 30%, Idaho has 11%, and Montana has 8%. A market researcher selects a sample of 1000 subjects, with 450 in Washington, 340 in Oregon, 150 in Idaho, and 60 in Montana. At the 0.05 significance level, test the claim that the sample of 1000 has a distribution that agrees with the distribution of state populations.

Solution

The proportions are $p_{Washington}=0.51, p_{Oregon} =0.3, p_{Idaho} = 0.11, p_{Montana} = 0.08$.

The observed data is

| State | Obs. Freq. $(O)$ | Prop. $p_i$ |
|---|---|---|
| Washington | 450 | 0.51 |
| Oregon | 340 | 0.30 |
| Idaho | 150 | 0.11 |
| Montana | 60 | 0.08 |

The null and alternative hypotheses are as follows:

$H_0:p_{Washington}=0.51, p_{Oregon} =0.3, p_{Idaho} = 0.11, p_{Montana} = 0.08$

(i.e., the sample of 1000 subjects has a distribution that agrees with the distribution of state populations.)

$H_1:$ The sample of 1000 subjects has a distribution that does not agree with the distribution of state populations.

Test statistic

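The test statistic for testing the above hypotheses, where $O_i$ and $E_i$ are the observed and expected frequencies of category $i$ and $k$ is the number of categories, is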

$$ \begin{aligned} \chi^2& = \sum \frac{(O_{i} -E_{i})^2}{E_{i}} \sim \chi^2_{(k-1)}\\ \end{aligned} $$

Level of Significance

The level of significance is $\alpha =0.05$.

Critical value of $\chi^2$

With $k=4$ states, the degrees of freedom are $df=k-1=4-1 =3$.

The critical value of $\chi^2$ for $df=3$ and $\alpha=0.05$ level of significance is $\chi^2 =7.8147$.
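The critical value can be checked numerically without a statistics table. The sketch below is one way to do so using only the Python standard library. It assumes the closed-form upper-tail probability of the chi-square distribution that holds for $df=3$ specifically, and the helper names `chi2_sf_df3` and `chi2_critical_df3` are illustrative, not part of any library.

```python
import math

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with df = 3.

    Closed form valid only for df = 3:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def chi2_critical_df3(alpha: float, lo: float = 0.0, hi: float = 100.0) -> float:
    """Solve P(X > x) = alpha by bisection; the tail probability decreases in x."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > alpha:
            lo = mid   # tail area still too large, move right
        else:
            hi = mid   # tail area small enough, move left
    return (lo + hi) / 2

critical_value = chi2_critical_df3(0.05)
print(round(critical_value, 4))  # about 7.8147
```

For other degrees of freedom this closed form does not apply; a statistics library or a chi-square table would be needed instead.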

Test Statistic

The expected frequencies can be calculated as

$$ \begin{aligned} E_{i} =N\times p_i \end{aligned} $$

For example, $E_{1}$ is given by

$$ \begin{aligned} E_{1} & = N\times p_1\\ &= 1000\times 0.51\\ &=510. \end{aligned} $$

$E_{2}$ is given by

$$ \begin{aligned} E_{2} & = N\times p_2\\ &= 1000\times 0.3\\ &=300. \end{aligned} $$

$E_{3}$ is given by

$$ \begin{aligned} E_{3} & = N\times p_3\\ &= 1000\times 0.11\\ &=110. \end{aligned} $$

$E_{4}$ is given by

$$ \begin{aligned} E_{4} & = N\times p_4\\ &= 1000\times 0.08\\ &=80. \end{aligned} $$
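The four expected frequencies above follow the same rule, so they can be computed in one pass. The following is a minimal sketch; the variable names `proportions`, `sample_size`, and `expected` are chosen here for illustration.

```python
# Population proportions and sample size taken from the problem statement.
proportions = {"Washington": 0.51, "Oregon": 0.30, "Idaho": 0.11, "Montana": 0.08}
sample_size = 1000

# E_i = N * p_i for each state.
expected = {state: sample_size * p for state, p in proportions.items()}

for state, e in expected.items():
    print(f"{state}: {e:.0f}")
```

Because the proportions sum to 1, the expected frequencies sum to the sample size, which is a quick sanity check on the inputs.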

| State | Obs. Freq. $(O)$ | Prop. $p_i$ | Exp. Freq. $(E)$ | $(O-E)^2/E$ |
|---|---|---|---|---|
| Washington | 450 | 0.51 | 510 | 7.059 |
| Oregon | 340 | 0.30 | 300 | 5.333 |
| Idaho | 150 | 0.11 | 110 | 14.545 |
| Montana | 60 | 0.08 | 80 | 5.000 |

The test statistic is

$$ \begin{aligned} \chi^2&= \sum \frac{(O_{i} -E_{i})^2}{E_{i}} \sim \chi^2_{(k-1)}\\ &=\frac{(450-510)^2}{510}+\frac{(340-300)^2}{300}+\frac{(150-110)^2}{110}+ \frac{(60-80)^2}{80}\\ &= 7.059 +5.333 +14.545+ 5\\ &= 31.937. \end{aligned} $$
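The arithmetic above can be reproduced with a few lines of Python. This is a sketch using the observed and expected frequencies from the solution; the names `observed`, `expected`, `contributions`, and `chi_square` are illustrative.

```python
# Observed and expected frequencies in the order
# Washington, Oregon, Idaho, Montana.
observed = [450, 340, 150, 60]
expected = [510, 300, 110, 80]

# Contribution of each state: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of the contributions.
chi_square = sum(contributions)

for c in contributions:
    print(f"{c:.3f}")
print(f"chi-square = {chi_square:.4f}")
```

Summing the unrounded contributions gives a value of about 31.9376, whereas summing the contributions after rounding each to three decimals gives the 31.937 shown in the text. The difference has no effect on the decision.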

Figure: chi-square critical value

Decision (Traditional approach)

The test statistic $\chi^2 =31.937$ exceeds the critical value $7.8147$, so it falls $\textit{inside}$ the critical region. We therefore $\textit{reject}$ the null hypothesis.

OR

Decision ($p$-value approach)

The p-value is $P(\chi^2_{3}>31.937) < 0.0001$, which rounds to $0$ at four decimal places.

As the p-value is $\textit{less than}$ the significance level $\alpha = 0.05$, we $\textit{reject}$ the null hypothesis.
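The p-value is very small but not exactly zero. As a rough check, it can be evaluated with the standard library alone, assuming the closed-form upper-tail probability of the chi-square distribution that is valid for $df=3$ only. The variable names are illustrative.

```python
import math

chi_square = 31.937  # test statistic from the worked solution
alpha = 0.05         # significance level

# For df = 3: P(X > x) = erfc(sqrt(z)) + 2*sqrt(z/pi)*exp(-z), with z = x/2.
z = chi_square / 2
p_value = math.erfc(math.sqrt(z)) + 2 * math.sqrt(z / math.pi) * math.exp(-z)

print(f"p-value = {p_value:.2e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The result is on the order of $10^{-7}$, far below $\alpha = 0.05$, which is consistent with the decision reached by the traditional approach.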

We conclude, at the $\alpha = 0.05$ level of significance, that the distribution of the sample of 1000 subjects does not agree with the distribution of state populations.

Further Reading