Consider the linear regression model,
$$
y_i=f_i(\boldsymbol{x}|\boldsymbol{\beta})+\varepsilon_i,
$$
where $y_i$ is the response or the dependent variable at the $i$th case, $i=1,\cdots, N$ and the predictor or the independent variable is the $\boldsymbol{x}$ term defined in the mean function $f_i(\boldsymbol{x}|\boldsymbol{\beta})$. For simplicity, consider the following simple linear regression (SLR) model,
$$
y_i=\beta_0+\beta_1x_i+\varepsilon_i.
$$
To obtain the (best) estimate of $\beta_0$ and $\beta_1$, we solve for the least residual sum of squares (RSS) given by,
$$
S=\sum_{i=1}^{n}\varepsilon_i^2=\sum_{i=1}^{n}(y_i-\beta_0-\beta_1x_i)^2.
$$
Now suppose we want to fit the model to the following data, Average Heights and Weights for American Women, where weight is the response and height is the predictor. The data is available in R by default.

. . . a love story between theory and practice . . .