The following table lists midterm and final exam grades for randomly selected students in a statistics course:
Midterm (x) | 82 | 65 | 93 | 70 | 80 |
---|---|---|---|---|---|
Final (y) | 94 | 77 | 94 | 79 | 91 |
a. Plot the scatter diagram for the paired data.
b. At the 5% level of significance ($\alpha = 0.05$), use hypothesis testing and the linear correlation coefficient ($r$) to determine whether there is a linear correlation.
c. Find the equation of the regression line and plot it on your scatter diagram using an x of 80 and an x of 93.
d. Determine the residual value for an x of 80 and explain its meaning.
Solution
a. Scatter diagram

b. Let $x$ denote Midterm grades and $y$ denote Final grades.
$x$ | $y$ | $x^2$ | $y^2$ | $xy$ |
---|---|---|---|---|
82 | 94 | 6724 | 8836 | 7708 |
65 | 77 | 4225 | 5929 | 5005 |
93 | 94 | 8649 | 8836 | 8742 |
70 | 79 | 4900 | 6241 | 5530 |
80 | 91 | 6400 | 8281 | 7280 |
390 | 435 | 30898 | 38123 | 34265 |
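The last row of the table gives the column totals $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$ and $\sum xy$. With $n = 5$ pairs, the correlation coefficient $r$ is given by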
$$ \begin{aligned} r & = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{\big(n(\sum x^2) -(\sum x)^2\big)\times \big(n(\sum y^2) -(\sum y)^2\big)}}\\ & = \frac{5*34265-(390)(435)}{\sqrt{\big(5*(30898)-(390)^2\big)\times\big(5*(38123)-(435)^2\big)}}\\ &= 0.919. \end{aligned} $$
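The arithmetic above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `pearson_r` and the variable names are illustrative choices, not part of the original solution.

```python
import math

# Paired data from the problem statement
midterm = [82, 65, 93, 70, 80]  # x
final = [94, 77, 94, 79, 91]    # y


def pearson_r(x, y):
    """Correlation coefficient via the sums formula used in the text."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(midterm, final)
print(round(r, 3))  # 0.919
```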
Given that $n = 5$, $r = 0.919$, $\alpha = 0.05$.
State the hypothesis testing problem
The hypothesis testing problem is $H_0: \rho = 0$ against $H_a: \rho \neq 0$.
Define the test statistic
The test statistic for testing the above hypothesis is
$$ \begin{aligned} t&=\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\\ &=\frac{0.919\sqrt{5 -2}}{\sqrt{1-0.919^2}}\\ &=4.037 \end{aligned} $$
The test statistic $t$ follows Student's $t$ distribution with $n-2=5-2=3$ degrees of freedom.
The level of significance is $\alpha = 0.05$.
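As a quick check of the test statistic, the formula can be evaluated directly. This is a sketch under the assumption that the rounded value $r = 0.919$ is used, as in the text; the helper name `t_statistic` is illustrative.

```python
import math

n = 5      # sample size from the problem
r = 0.919  # rounded correlation coefficient from part b


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t = t_statistic(r, n)
print(round(t, 3))  # 4.037
```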
Determine the critical values
For the specified value of $\alpha$ determine the critical region.
$$ \begin{aligned} P(t < t_{1-\alpha/2,n-2} \text{ or } t > t_{\alpha/2,n-2}) = \alpha. \end{aligned} $$

The critical values are $t_{1-\alpha/2,n-2}=-3.182$ and $t_{\alpha/2,n-2}=3.182$.
Decision
As the observed value of the test statistic, $t = 4.037$, exceeds the upper critical value $3.182$, it falls inside the critical region, so we reject the null hypothesis.
We conclude that, at the 5% level of significance, there is a linear correlation between Midterm and Final grades.
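The decision rule for this two-tailed test can be sketched as a comparison of $|t|$ with the critical value. The critical value 3.182 is taken from the text rather than computed here, and the function name `reject_null` is an illustrative assumption.

```python
t_observed = 4.037  # test statistic from the text
t_critical = 3.182  # table value for alpha = 0.05 (two-tailed), 3 degrees of freedom


def reject_null(t_obs, t_crit):
    """Two-tailed test: reject H0 when |t| exceeds the critical value."""
    return abs(t_obs) > t_crit


print(reject_null(t_observed, t_critical))  # True
```

If SciPy is available, `scipy.stats.t.ppf(0.975, 3)` should reproduce the critical value to three decimals.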
c. The slope $b_1$ is given by
$$ \begin{aligned} b_1 & = \frac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2}\\ & = \frac{5*34265-(390)(435)}{5*(30898)-(390)^2}\\ &= \frac{1675}{2390}\\ &= 0.7008. \end{aligned} $$
The estimate of the intercept is
$$ b_0=\overline{y}-b_1\overline{x}=\bigg(\frac{435}{5}\bigg)-0.7008*\bigg(\frac{390}{5}\bigg) = 32.3347, $$
where the unrounded slope $1675/2390$ is used in the calculation.
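The fitted linear regression equation is
$$ \hat{y} = 32.3347+ (0.7008)*x $$
To draw the line on the scatter diagram, join the predicted points at $x=80$ and $x=93$, which are approximately $(80, 88.402)$ and $(93, 97.513)$.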
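The slope, the intercept, and the two points needed to draw the line can be reproduced with a few lines of code. This is a minimal sketch; `fit_line` is an illustrative helper name, and the coefficients are kept unrounded until printing.

```python
midterm = [82, 65, 93, 70, 80]  # x
final = [94, 77, 94, 79, 91]    # y


def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 via the sums formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1


b0, b1 = fit_line(midterm, final)
print(round(b0, 4), round(b1, 4))  # 32.3347 0.7008

# Predicted values at the two x values requested for plotting the line
for x in (80, 93):
    print(x, round(b0 + b1 * x, 3))
```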

d. Estimated value for an $x$ of 80 is
$$ \hat{y}_{x=80} = 32.3347+ (0.7008)*80 = 88.402 $$
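The observed final grade for $x=80$ is $y=91$, so the residual is
$y-\hat{y}_{x=80} = 91-88.402 = 2.598$.
The residual is the difference between the observed and the predicted final grade: the student with a Midterm score of 80 scored about $2.598$ points higher on the Final than the regression line predicts.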
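The residual can be verified numerically. This sketch assumes the unrounded coefficients from part c and the observed pair $(80, 91)$ from the data table; the variable names are illustrative.

```python
# Unrounded coefficients from part c
b1 = 1675 / 2390
b0 = 435 / 5 - b1 * (390 / 5)

x_obs, y_obs = 80, 91       # observed pair from the data table
y_hat = b0 + b1 * x_obs     # predicted final grade
residual = y_obs - y_hat    # observed minus predicted

print(round(y_hat, 3), round(residual, 3))  # 88.402 2.598
```

With the rounded coefficients 32.3347 and 0.7008 the prediction is about 88.399, so the residual would come out near 2.601; the small difference is due to rounding only.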