The following results were obtained experimentally when verifying Hooke's Law.

| Load (N) | Extension (mm) |
|---|---|
| 2 | 2 |
| 5 | 23 |
| 8 | 62 |
| 11 | 119 |
| 15 | 223 |

a. Create a scatter plot.
b. Produce the best-fit straight line and determine its equation.
c. Determine the value, degree, and nature of the correlation in the data.

Solution

Let $x$ denote the Load and $y$ denote the Extension in mm.

a. The scatter diagram is

[Figure: scatter plot of Extension (mm) against Load (N)]

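| | $x$ | $y$ | $x^2$ | $y^2$ | $xy$ |
|---|---|---|---|---|---|
| 1 | 2 | 2 | 4 | 4 | 4 |
| 2 | 5 | 23 | 25 | 529 | 115 |
| 3 | 8 | 62 | 64 | 3844 | 496 |
| 4 | 11 | 119 | 121 | 14161 | 1309 |
| 5 | 15 | 223 | 225 | 49729 | 3345 |
| Total | 41 | 429 | 439 | 68267 | 5269 |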
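
The column totals in the table above can be checked with a few lines of Python. This is only a sketch for verification; the variable names (`loads`, `extensions`, `sum_x`, and so on) are illustrative choices and do not come from the original solution.

```python
# Data from the problem statement: load in newtons, extension in millimetres.
loads = [2, 5, 8, 11, 15]
extensions = [2, 23, 62, 119, 223]

n = len(loads)

# Column totals used in the least-squares and correlation formulas.
sum_x = sum(loads)
sum_y = sum(extensions)
sum_x2 = sum(x * x for x in loads)
sum_y2 = sum(y * y for y in extensions)
sum_xy = sum(x * y for x, y in zip(loads, extensions))

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
```

The printed totals should agree with the "Total" row of the table.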

b. Let the simple linear regression model of $Y$ on $X$ be

$$y=\beta_0 + \beta_1x +e$$

By the method of least squares, the estimates of $\beta_1$ and $\beta_0$ are, respectively,

$$ \begin{aligned} \hat{\beta}_1 & = \frac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2} \end{aligned} $$

and

$$ \begin{aligned} \hat{\beta}_0&=\overline{y}-\hat{\beta}_1\overline{x} \end{aligned} $$

The sample mean of $x$ is

$$ \begin{aligned} \overline{x}&=\frac{1}{n} \sum_{i=1}^n x_i\\ &=\frac{41}{5}\\ &=8.2 \end{aligned} $$

The sample mean of $y$ is

$$ \begin{aligned} \overline{y}&=\frac{1}{n} \sum_{i=1}^n y_i\\ &=\frac{429}{5}\\ &=85.8 \end{aligned} $$

The estimate of the slope $\beta_1$, written $b_1$ from here on, is

$$ \begin{aligned} b_1 & = \frac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2}\\ & = \frac{5\times 5269-(41)(429)}{5\times 439-(41)^2}\\ &= \frac{8756}{514}\\ &= 17.035. \end{aligned} $$

The estimate of the intercept $\beta_0$, written $b_0$, is

$$ \begin{aligned} b_0&=\overline{y}-b_1\overline{x}\\ &=85.8-17.035\times 8.2\\ &=-53.887. \end{aligned} $$

The fitted simple linear regression model for predicting Extension (mm) from Load (N) is

$$ \begin{aligned} \hat{y} &= -53.887+ 17.035\,x \end{aligned} $$
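
As a cross-check of the slope and intercept, the least-squares formulas can be evaluated directly in Python. The helper name `least_squares_line` is hypothetical and introduced only for this illustration; it applies the same formulas as the hand calculation rather than any library routine.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Slope: (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean(y) - slope * mean(x)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Data from the problem statement.
loads = [2, 5, 8, 11, 15]             # N
extensions = [2, 23, 62, 119, 223]    # mm

b0, b1 = least_squares_line(loads, extensions)
print(f"b1 = {b1:.3f}, b0 = {b0:.3f}")
```

Rounded to three decimal places, the output should match the values $b_1 = 17.035$ and $b_0 = -53.887$ obtained by hand.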

c. Correlation coefficient:

The sample variance of $X$ is

$$ \begin{aligned} s_{x}^2 &=\frac{1}{n-1}\sum_{i=1}^{n}(x_i -\overline{x})^2\\ &= \frac{1}{n-1}\bigg(\sum_{i=1}^n x_i^2 - \frac{(\sum_{i=1}^n x_i)^2}{n}\bigg)\\ &= \frac{1}{5 -1}\big(439-\frac{41^2}{5}\big)\\ &= 25.7. \end{aligned} $$

The sample variance of $Y$ is

$$ \begin{aligned} s_{y}^2 &=\frac{1}{n-1}\sum_{i=1}^{n}(y_i -\overline{y})^2\\ &= \frac{1}{n-1}\bigg(\sum_{i=1}^n y_i^2 - \frac{(\sum_{i=1}^n y_i)^2}{n}\bigg)\\ &= \frac{1}{5 -1}\big(68267-\frac{429^2}{5}\big)\\ &= 7864.7. \end{aligned} $$

The covariance between $X$ and $Y$ is

$$ \begin{aligned} s_{xy}&=\frac{1}{n-1}\sum_{i=1}^{n}(x_i -\overline{x})(y_i-\overline{y})\\ &= \frac{1}{n-1}\bigg(\sum_{i=1}^n x_iy_i - \frac{(\sum_{i=1}^n x_i)(\sum_{i=1}^n y_i)}{n}\bigg)\\ &=\frac{1}{5-1}\big(5269 - \frac{41\times 429}{5} \big)\\ &=437.8. \end{aligned} $$

The product-moment correlation coefficient is

$$ \begin{aligned} r &= \frac{Cov(X,Y)}{\sqrt{V(X)\times V(Y)}} \\ &=\frac{s_{xy}}{\sqrt{s_x^2\times s_y^2}}\\ &= \frac{437.8}{\sqrt{25.7\times 7864.7}}\\ &= 0.9738. \end{aligned} $$

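The correlation coefficient is $r = 0.9738$, which is close to $+1$, so there is a strong positive linear correlation between Load and Extension: the extension increases as the load increases.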
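
The variances, covariance, and correlation coefficient can be verified in Python as well. This sketch uses `statistics.mean` and `statistics.variance` from the standard library (the latter uses the $n-1$ divisor, matching the formulas above) and computes the covariance by hand; the variable names are illustrative assumptions.

```python
import math
import statistics

# Data from the problem statement.
loads = [2, 5, 8, 11, 15]             # N
extensions = [2, 23, 62, 119, 223]    # mm

n = len(loads)
mean_x = statistics.mean(loads)
mean_y = statistics.mean(extensions)

# Sample variances (divisor n - 1).
var_x = statistics.variance(loads)
var_y = statistics.variance(extensions)

# Sample covariance, computed from the definition with divisor n - 1.
cov_xy = sum((x - mean_x) * (y - mean_y)
             for x, y in zip(loads, extensions)) / (n - 1)

# Product-moment correlation coefficient.
r = cov_xy / math.sqrt(var_x * var_y)

print(f"var_x = {var_x:.1f}, var_y = {var_y:.1f}, "
      f"cov_xy = {cov_xy:.1f}, r = {r:.4f}")
```

The rounded results should agree with $s_x^2 = 25.7$, $s_y^2 = 7864.7$, $s_{xy} = 437.8$, and $r = 0.9738$.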

Further Reading