The MLE (maximum likelihood estimate) is the value of the parameter that maximizes the likelihood function.
Refer to equation 11.2.2 for the conditional joint density of \(Y_1, \ldots, Y_n\), given \(x = \left( x_1, \ldots, x_n \right)\):
\(f_n\left( y \mid x, \beta_0, \beta_1, \sigma^2 \right) = \frac{1}{\left( 2\pi \sigma^2 \right)^{n/2}} \exp\left( -\frac{1}{2\sigma^2} \sum\limits_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \right)\)
The above expression, viewed as a function of the unknown parameter, is also the likelihood function:
\(L\left( \sigma^2 \right) = \frac{1}{\left( 2\pi \sigma^2 \right)^{n/2}} \exp\left( -\frac{1}{2\sigma^2} \sum\limits_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \right)\)
Note that the likelihood depends only on \(\sigma^2\), since the MLEs of \(\beta_0\) and \(\beta_1\) have already been found; they are the estimated parameters of the linear regression, \(\hat\beta_0\) and \(\hat\beta_1\), which are substituted into the expression.
Taking the natural logarithm gives the log-likelihood:
\(l\left( \sigma^2 \right) = \ln\left( L\left( \sigma^2 \right) \right) = -\frac{n}{2}\ln\left( 2\pi \sigma^2 \right) - \frac{1}{2\sigma^2} \sum\limits_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2\)
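As a quick numerical illustration, here is a minimal sketch (the data arrays x and y are made up for demonstration, and the use of NumPy is an assumption, not part of the original solution) that computes the least-squares estimates \(\hat\beta_0\) and \(\hat\beta_1\) and evaluates \(l\left( \sigma^2 \right)\) at candidate values of \(\sigma^2\):

```python
import numpy as np

# Hypothetical data for illustration; x and y are assumed observed arrays of equal length n.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)

# MLEs of beta_0 and beta_1: the least-squares regression estimates.
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

def log_likelihood(sigma2):
    """Evaluate l(sigma^2) with beta_0, beta_1 replaced by their MLEs."""
    rss = np.sum((y - beta0_hat - beta1_hat * x) ** 2)  # residual sum of squares
    return -n / 2 * np.log(2 * np.pi * sigma2) - rss / (2 * sigma2)

# The log-likelihood can now be compared at candidate variance values.
print(log_likelihood(0.05), log_likelihood(0.5))
```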
The derivative of \(l\) with respect to \(\sigma^2\) is
\(\frac{\partial l\left( \sigma^2 \right)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum\limits_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2.\)
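Setting this derivative equal to zero and solving for \(\sigma^2\) completes the argument: the MLE of the variance is the average squared residual, with \(\hat\beta_0\) and \(\hat\beta_1\) substituted in,
\(\hat\sigma^2 = \frac{1}{n} \sum\limits_{i=1}^{n} \left( y_i - \hat\beta_0 - \hat\beta_1 x_i \right)^2.\)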