Data from the Framingham study allow us to compare the distribution of initial serum cholesterol levels for two populations of males: those who go on to develop coronary artery disease and those who do not. The mean serum cholesterol level of the population of men who do not develop heart disease is $\mu = 219$ mg/100 ml and the standard deviation is $\sigma = 41$ mg/100 ml. Suppose, however, that you do not know the true population mean; instead, you hypothesize that $\mu$ is equal to 244 mg/100 ml. This is the mean initial serum cholesterol level of men who eventually develop the disease. Since it is believed that the mean serum cholesterol level for the men who do not develop heart disease cannot be higher than the mean level for men who do, a one-sided test conducted at the $\alpha = 0.05$ level of significance is appropriate.
(a) What is the probability of making a Type I error?
(b) If a sample of size 25 is selected from the population of men who do not go on to develop coronary heart disease, what is the probability of making a Type II error?
(c) What is the power of the test?
(d) How could you increase the power?
(e) You wish to test the null hypothesis $H_0: \mu \geq 244$ mg/100 ml against the alternative $H_A: \mu < 244$ mg/100 ml at the $\alpha = 0.05$ level of significance. If the true population mean is as low as 219 mg/100 ml, you want to risk only a 5% chance of failing to reject $H_0$. How large a sample would be required?
(f) How would the sample size change if you were willing to risk a 10% chance of failing to reject a false null hypothesis?
Solution
(a) The probability of a Type I error is
$P(\text{Type I Error}) = P(\text{Reject } H_0 \mid H_0 \text{ is true}) = \alpha = 0.05$, where $\alpha$ is the level of significance of the test.
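The claim that a level-$\alpha$ test rejects a true null hypothesis about 5% of the time can be checked by simulation. The following is a minimal sketch, not part of the original solution: it assumes the sample size $n = 25$ from part (b), draws sample means directly from their sampling distribution at the boundary value $\mu = 244$, and uses hypothetical names (`simulated_type_i_rate`, `trials`, `seed`) chosen for illustration.

```python
import random
from statistics import NormalDist

def simulated_type_i_rate(mu0=244.0, sigma=41.0, n=25, alpha=0.05,
                          trials=20_000, seed=0):
    """Estimate how often H0 is rejected when the true mean equals mu0."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5                      # standard error of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha quantile, about 1.645
    rejections = 0
    for _ in range(trials):
        xbar = rng.gauss(mu0, se)              # sample mean drawn under H0 (assumed normal)
        if (xbar - mu0) / se <= -z_alpha:      # lower-tailed rejection rule
            rejections += 1
    return rejections / trials

rate = simulated_type_i_rate()
print(f"simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to 0.05, with small run-to-run variation depending on the seed and the number of trials.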
(b) The probability of a Type II error is
$P(\text{Type II Error}) = P(\text{Fail to reject } H_0 \mid H_A \text{ is true}) = \beta$.
As the population standard deviation $\sigma$ is known, we reject the null hypothesis $H_0$ if $\frac{\overline{x}-\mu_0}{\sigma/\sqrt{n}}\leq -z_\alpha$, where $\mu_0 = 244$ mg/100 ml and $z_{0.05} \approx 1.64$.
We are given that the true mean for the population of men who do not develop coronary heart disease is $\mu_1 = 219$ mg/100 ml.
$$ \begin{aligned} P(\text{Fail to reject } H_0 \mid H_A\text{ is true}) &= P\bigg(\overline{x} > -z_\alpha \frac{\sigma}{\sqrt{n}}+\mu_0 \,\bigg|\, \mu_1 = 219\bigg)\\ &= P\bigg(\frac{\overline{x}-\mu_1}{\sigma/\sqrt{n}} > -z_\alpha + \frac{\mu_0-\mu_1}{\sigma/\sqrt{n}} \,\bigg|\, \mu_1 = 219\bigg)\\ &= P\bigg(Z > -1.64 + \frac{244-219}{41/\sqrt{25}}\bigg)\\ &= P(Z > 1.4087)\\ &= 0.0793 \end{aligned} $$
So the probability of making a Type II error is $\beta \approx 0.08$ when $\mu_1 = 219$ mg/100 ml.
(c) The power of the test is $1-\beta = 1 - 0.08 = 0.92$ when $\mu_1 = 219$ mg/100 ml.
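The calculations in (b) and (c) can be reproduced numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `type_ii_error` is a hypothetical helper, and it uses the unrounded quantile $z_{0.05} \approx 1.645$ rather than the table value 1.64, so the third decimal place may differ slightly from the hand calculation.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Beta for the lower-tailed z-test of H0: mu >= mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5                      # standard error of the sample mean
    z_alpha = std_normal.inv_cdf(1 - alpha)    # unrounded critical value
    critical_mean = mu0 - z_alpha * se         # reject H0 when xbar <= this value
    # Probability that xbar falls above the critical value under the true mean mu1
    return 1 - std_normal.cdf((critical_mean - mu1) / se)

beta = type_ii_error(mu0=244, mu1=219, sigma=41, n=25, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Both values agree with the hand calculation to two decimal places: $\beta \approx 0.08$ and power $\approx 0.92$.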
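(d) The power of the test can be increased by reducing the probability of a Type II error $\beta$. In practice this means taking a larger sample, which shrinks the standard error $\sigma/\sqrt{n}$, or raising the significance level $\alpha$, at the cost of a higher Type I error rate.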
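To illustrate how power responds to the sample size and to $\alpha$, the following sketch evaluates the power of the same lower-tailed z-test for a few sample sizes. The values of $n$ other than 25 and the alternative level $\alpha = 0.10$ are assumptions picked for illustration, and `power` is a hypothetical helper name.

```python
from statistics import NormalDist

def power(n, mu0=244.0, mu1=219.0, sigma=41.0, alpha=0.05):
    """Power of the lower-tailed z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    shift = (mu0 - mu1) / (sigma / n ** 0.5)   # distance between means in standard errors
    # Reject when Z <= -z_alpha + shift, where Z is standard normal under mu1
    return std_normal.cdf(shift - z_alpha)

powers = {n: power(n) for n in (10, 25, 50, 100)}
for n, p in powers.items():
    print(f"n = {n:3d}: power = {p:.4f}")

# Raising alpha also raises power, here for n = 25
print(f"alpha = 0.10, n = 25: power = {power(25, alpha=0.10):.4f}")
```

Power rises with each increase in $n$, and for a fixed $n$ it is higher at $\alpha = 0.10$ than at $\alpha = 0.05$.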
(e) We are given $\mu_0=244$, $\mu_1=219$, $\sigma =41$, $\alpha =0.05$, and $\beta = 0.05$.
The effect size $ES$ is given by
$$ \begin{aligned} ES &= \frac{|\mu_1-\mu_0|}{\sigma}\\ &= \frac{|219-244|}{41}\\ &= 0.61. \end{aligned} $$
The formula for determining the sample size required to ensure that the test has a specified power is
$$ \begin{aligned} n &= \bigg(\frac{Z_{1-\alpha}+Z_{1-\beta}}{ES}\bigg)^2\\ &= \bigg(\frac{Z_{1-0.05}+Z_{1-0.05}}{0.61}\bigg)^2\\ &= \bigg(\frac{Z_{0.95}+Z_{0.95}}{0.61}\bigg)^2\\ &= \bigg(\frac{1.64+1.64}{0.61}\bigg)^2\\ &=28.91\\ &\approx 29 \end{aligned} $$
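Thus a sample of size about $n=29$ would be required to keep the risk of failing to reject $H_0$ near 5% when the true mean is $\mu_1 = 219$ mg/100 ml. Because $z_{0.95}$ was rounded to 1.64, this figure is slightly understated: with $z_{0.95} = 1.645$ the formula gives $n \approx 29.1$, which rounds up to 30.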
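The sample-size formula can be evaluated without rounding the quantiles or the effect size. This is a minimal sketch with a hypothetical helper `required_sample_size`; because it uses unrounded values from `statistics.NormalDist`, its result can differ slightly from a hand calculation based on table values.

```python
import math
from statistics import NormalDist

def required_sample_size(mu0, mu1, sigma, alpha, beta):
    """Unrounded n from ((z_{1-alpha} + z_{1-beta}) / ES)^2 for a one-sided z-test."""
    std_normal = NormalDist()
    effect_size = abs(mu1 - mu0) / sigma       # ES = |mu1 - mu0| / sigma
    z_sum = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(1 - beta)
    return (z_sum / effect_size) ** 2

n_raw = required_sample_size(244, 219, 41, alpha=0.05, beta=0.05)
n = math.ceil(n_raw)                           # always round a sample size up
print(f"unrounded n = {n_raw:.2f}, required n = {n}")
```

With unrounded inputs the formula gives a value just above 29, so rounding up yields 30.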
(f) We are given $\mu_0=244$, $\mu_1=219$, $\sigma =41$, $\alpha =0.05$, and $\beta = 0.1$.
The effect size $ES$ is given by
$$ \begin{aligned} ES &= \frac{|\mu_1-\mu_0|}{\sigma}\\ &= \frac{|219-244|}{41}\\ &= 0.61. \end{aligned} $$
The formula for determining the sample size required to ensure that the test has a specified power is
$$ \begin{aligned} n &= \bigg(\frac{Z_{1-\alpha}+Z_{1-\beta}}{ES}\bigg)^2\\ &= \bigg(\frac{Z_{1-0.05}+Z_{1-0.1}}{0.61}\bigg)^2\\ &= \bigg(\frac{Z_{0.95}+Z_{0.9}}{0.61}\bigg)^2\\ &= \bigg(\frac{1.64+1.28}{0.61}\bigg)^2\\ &=22.91\\ &\approx 23 \end{aligned} $$
Thus a sample of size about $n=23$ would be required to keep the risk of failing to reject $H_0$ near 10% when the true mean is $\mu_1 = 219$ mg/100 ml. With less rounding ($z_{0.95} = 1.645$, $z_{0.90} = 1.2816$, $ES = 25/41$) the formula gives a value just above 23, so 24 is the safer choice.
Compared with (e), the required $n$ is smaller. Tolerating a higher risk of a Type II error lets us accept the larger standard error $\sigma/\sqrt{n}$ that comes with a smaller sample.
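The trade-off between the tolerated Type II error risk and the sample size can be tabulated with the same formula. In this sketch the helper name `sample_size_for_beta` is hypothetical, the value $\beta = 0.20$ is an extra illustrative case not taken from the problem, and the quantiles are unrounded, so the results sit one unit above the hand calculations that use table values.

```python
import math
from statistics import NormalDist

def sample_size_for_beta(beta, mu0=244.0, mu1=219.0, sigma=41.0, alpha=0.05):
    """Smallest integer n satisfying the one-sided z-test sample-size formula."""
    std_normal = NormalDist()
    effect_size = abs(mu1 - mu0) / sigma
    z_sum = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(1 - beta)
    return math.ceil((z_sum / effect_size) ** 2)

sizes = {beta: sample_size_for_beta(beta) for beta in (0.05, 0.10, 0.20)}
for beta, n in sizes.items():
    print(f"beta = {beta:.2f}: n = {n}")
```

The required sample size falls as the accepted Type II error risk grows.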