An observational study is conducted to investigate the association between age and total serum cholesterol. The correlation is estimated at r=0.35. The study involves n=125 participants and mean (std dev) age is 44.3 (10.0) years with an age range of 35 to 55 years, and mean (std dev) total cholesterol is 202.8 (38.4).
a) Estimate the equation of the line that best describes the association between age (as the independent variable) and total serum cholesterol.
b) Estimate the total serum cholesterol for a 50-year-old person.
c) Estimate the total serum cholesterol for a 70-year-old person.
Solution
a) Let $X$ denote the age and $Y$ denote the total serum cholesterol.
We are given a sample of size $n = 125$. The mean and standard deviation of $X$ are $\overline{X}=44.3$ and $s_x=10$ respectively. The mean and standard deviation of $Y$ are $\overline{Y}=202.8$ and $s_y=38.4$ respectively. The correlation coefficient between $X$ and $Y$ is $r=0.35$.
The regression equation is $Y = a+ b*X$, where
$$ \begin{aligned} b &= r \frac{s_y}{s_x}\\ &= 0.35 \big(\frac{38.4}{10}\big)\\ &=1.344 \end{aligned} $$
$$ \begin{aligned} a&=\overline{Y}-b*\overline{X}\\ &=202.8 -1.344* 44.3\\ &=202.8 - 59.5392\\ &=143.2608 \end{aligned} $$
The estimated equation of the line that best describes the association between age and total serum cholesterol is
$$ Y = 143.2608 + 1.344*X $$
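The slope and intercept can be checked numerically from the summary statistics alone. The following is a minimal sketch; the function name `line_from_summary` and its parameter names are illustrative choices, not part of the original solution.

```python
def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (intercept, slope) of the least-squares line of Y on X,
    computed from the correlation and the means and standard deviations."""
    slope = r * sd_y / sd_x                 # b = r * s_y / s_x
    intercept = mean_y - slope * mean_x     # a = Ybar - b * Xbar
    return intercept, slope


# Values taken from the problem statement.
a, b = line_from_summary(r=0.35, mean_x=44.3, sd_x=10.0, mean_y=202.8, sd_y=38.4)

print(f"slope b     = {b:.4f}")
print(f"intercept a = {a:.4f}")
```

Rounded to four decimal places, the printed values should agree with $b = 1.344$ and $a = 143.2608$ obtained by hand. Note that the sample size $n$ is not needed for these two estimates.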
b) The estimate of the total serum cholesterol for a 50-year-old person is
$$ \begin{aligned} \hat{Y} &=143.2608 + 1.344 * 50\\ &= 210.461 \end{aligned} $$
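c) The estimate of the total serum cholesterol for a 70-year-old person is
$$ \begin{aligned} \hat{Y} &=143.2608 + 1.344 * 70\\ &= 237.341 \end{aligned} $$
Because the ages in the study range from 35 to 55 years, an age of 70 lies outside the observed data. This estimate is an extrapolation and should be interpreted with caution, since the linear relationship may not hold beyond the observed age range.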
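The two predictions can be reproduced with a short script that also flags ages outside the study's stated range of 35 to 55 years. This is a sketch under the assumption that the fitted coefficients above are used as given; the names `predict_cholesterol`, `INTERCEPT`, `SLOPE`, `AGE_MIN` and `AGE_MAX` are hypothetical and introduced only for illustration.

```python
# Fitted coefficients from part (a) and the age range reported in the study.
INTERCEPT = 143.2608
SLOPE = 1.344
AGE_MIN, AGE_MAX = 35, 55


def predict_cholesterol(age):
    """Return (estimate, extrapolated) for a given age.

    `extrapolated` is True when the age falls outside the observed range,
    in which case the linear estimate is less trustworthy."""
    estimate = INTERCEPT + SLOPE * age
    extrapolated = not (AGE_MIN <= age <= AGE_MAX)
    return estimate, extrapolated


for age in (50, 70):
    estimate, extrapolated = predict_cholesterol(age)
    note = " (outside observed age range)" if extrapolated else ""
    print(f"age {age}: estimated total cholesterol = {estimate:.3f}{note}")
```

The flag does not change the arithmetic. It only marks which estimates rely on the fitted line beyond the ages that were actually sampled.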