
Suppose that we wish to construct the likelihood ratio statistic for comparison of the two linear models \(y=X_{1} \beta_{1}+\varepsilon\) and \(y=X_{1} \beta_{1}+X_{2} \beta_{2}+\varepsilon\), where the components of \(\varepsilon\) are independent normal variables with mean zero and variance \(\sigma^{2} ;\) call the corresponding residual sums of squares \(S S_{1}\) and \(S S\) on \(v_{1}\) and \(v\) degrees of freedom. (a) Show that the maximum value of the log likelihood is \(-\frac{1}{2} n(\log S S+1-\log n)\) for a model whose residual sum of squares is \(S S\), and deduce that the likelihood ratio statistic for comparison of the models above is \(W=n \log \left(S S_{1} / S S\right)\). (b) By writing \(S S_{1}=S S+\left(S S_{1}-S S\right)\), show that \(W\) is a monotonic function of the \(F\) statistic for comparison of the models. (c) Show that \(W \doteq\left(v_{1}-v\right) F\) when \(n\) is large and \(v\) is close to \(n\), and say why \(F\) would usually be preferred to \(W\).

Short Answer

Expert verified
The likelihood ratio statistic \( W = n \log(SS_1 / SS) \) is a monotonic function of the \( F \) statistic for comparing the two models, and \( W \doteq (v_1 - v)F \) when \( n \) is large and \( v \) is close to \( n \); \( F \) is usually preferred because its null distribution is exact under normal errors, whereas that of \( W \) is only a large-sample approximation.

Step by step solution

Step 01: Understanding the Likelihood Function

For the normal linear model the likelihood is \( L(\beta,\sigma^2) = (2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\mathrm{RSS}(\beta)\right\} \), where \( \mathrm{RSS}(\beta) = (y - X\beta)^{\mathrm{T}}(y - X\beta) \) is the residual sum of squares. To compare the two models we maximize this likelihood over the parameters of each model; maximizing over \( \beta \) gives the least squares estimate, whose residual sum of squares is the model's \( SS \).
Step 02: Finding the Maximum Log Likelihood

For a model with residual sum of squares \( SS \), maximizing over \( \sigma^2 \) with \( \beta \) fixed at its least squares estimate gives \( \widehat{\sigma}^2 = SS/n \). Substituting this into the log likelihood yields \[ \log L = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\left(\frac{SS}{n}\right) - \frac{n}{2}, \] and dropping the additive constant \( -\frac{n}{2}\log(2\pi) \), which is the same for every model and so cancels in any comparison, leaves the maximized log likelihood \( -\frac{1}{2}n(\log SS + 1 - \log n) \), as required.
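As a quick sanity check, the sketch below (a minimal example on synthetic data; variable names and numbers are illustrative, and numpy is assumed available) evaluates the normal log likelihood over a grid of \( \sigma^2 \) values and confirms that the maximum is attained near \( \sigma^2 = SS/n \) and matches the closed form above once the constant \( -\frac{n}{2}\log(2\pi) \) is put back.

```python
# Numerical check of Step 02: the normal log likelihood, profiled over sigma^2,
# is maximised at sigma^2 = SS/n, where it equals -(n/2)(log SS + 1 - log n)
# up to the constant -(n/2) log(2*pi).  Synthetic data; names are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
SS = np.sum((y - X @ beta_hat) ** 2)            # residual sum of squares

def log_lik(sigma2):
    """Normal log likelihood with beta fixed at its least squares estimate."""
    return -0.5 * n * np.log(2 * np.pi * sigma2) - SS / (2 * sigma2)

sigma2_grid = np.linspace(0.2, 5.0, 10_000)
values = log_lik(sigma2_grid)

print("argmax over grid        :", sigma2_grid[np.argmax(values)])
print("SS / n                  :", SS / n)
print("max log likelihood      :", values.max())
print("closed form (with 2*pi) :",
      -0.5 * n * (np.log(SS) + 1 - np.log(n)) - 0.5 * n * np.log(2 * np.pi))
```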
Step 03: Deriving the Likelihood Ratio Statistic W

The likelihood ratio statistic is twice the difference between the maximized log likelihoods of the larger and the smaller model. Using the result of Step 02, \[ W = 2\left\{-\tfrac{1}{2}n(\log SS + 1 - \log n)\right\} - 2\left\{-\tfrac{1}{2}n(\log SS_1 + 1 - \log n)\right\} = n(\log SS_1 - \log SS) = n \log(SS_1 / SS). \]
Step 04: Expressing SS1 in Terms of SS and F

Write the residual sum of squares of the smaller model as \( SS_1 = SS + (SS_1 - SS) \), so that \( SS_1/SS = 1 + (SS_1 - SS)/SS \). The difference \( SS_1 - SS \) is the extra sum of squares explained by \( X_2 \), and it is exactly the numerator of the \( F \) statistic for comparing the two models.
Step 05: Showing the Monotonicity in Relation to F

The \( F \) statistic for comparing the models is \[ F = \frac{(SS_1 - SS)/(v_1 - v)}{SS/v}, \] so \( (SS_1 - SS)/SS = (v_1 - v)F/v \) and therefore \[ W = n \log\left(1 + \frac{(v_1 - v)F}{v}\right). \] Since \( \log(1 + x) \) is strictly increasing in \( x \) and \( (v_1 - v)/v > 0 \), \( W \) is a strictly increasing, hence monotonic, function of \( F \).
Step 06: Approximating W with the F Statistic

When \( n \) is large and \( v \) is close to \( n \), the quantity \( (v_1 - v)F/v \) is small unless \( F \) is very large, so the expansion \( \log(1 + x) \approx x \) gives \[ W \approx \frac{n(v_1 - v)F}{v} \approx (v_1 - v)F, \] using \( n/v \approx 1 \). Hence \( W \doteq (v_1 - v)F \).
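The sketch below (synthetic nested models, illustrative names, numpy assumed available) checks Steps 04 to 06 numerically: the identity \( W = n\log\{1 + (v_1 - v)F/v\} \) holds exactly, and \( (v_1 - v)F \) is close to \( W \) when \( n \) is large and \( v \) is close to \( n \).

```python
# Numerical check of Steps 04-06: W = n*log(SS1/SS) equals n*log(1 + (v1-v)*F/v)
# exactly, and is close to (v1-v)*F when n is large and v is close to n.
# Synthetic nested models; all names and coefficient values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, p1, p2 = 500, 3, 2                      # n large, few parameters, so v is close to n
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, p1 - 1))])
X2 = rng.normal(size=(n, p2))
X = np.hstack([X1, X2])
y = X1 @ np.array([1.0, 0.5, -0.5]) + X2 @ np.array([0.2, 0.0]) + rng.normal(size=n)

def rss(M):
    """Residual sum of squares of the least squares fit of y on M."""
    b, *_ = np.linalg.lstsq(M, y, rcond=None)
    return np.sum((y - M @ b) ** 2)

SS1, SS = rss(X1), rss(X)                  # smaller model, larger model
v1, v = n - p1, n - (p1 + p2)              # residual degrees of freedom

F = ((SS1 - SS) / (v1 - v)) / (SS / v)
W = n * np.log(SS1 / SS)

print("W                        :", W)
print("n*log(1 + (v1-v)*F/v)    :", n * np.log(1 + (v1 - v) * F / v))  # identical to W
print("(v1 - v)*F approximation :", (v1 - v) * F)
```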
Step 07: Preference for F over W

\( F \) would usually be preferred to \( W \) because, under normal errors, \( F \) has an exact \( F_{v_1 - v,\, v} \) null distribution whatever the sample size, and it fits directly into the analysis of variance (ANOVA) framework; the \( \chi^2_{v_1 - v} \) reference distribution for \( W \) is only a large-sample approximation, which can be poor when \( n \) is small.
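To illustrate the point, here is a minimal sketch (synthetic data generated under the null hypothesis \( \beta_2 = 0 \), with a deliberately small sample; numpy and scipy are assumed available, and all names are illustrative) comparing the exact \( F \) p-value with the \( \chi^2_{v_1 - v} \) approximation applied to \( W \); the two can differ noticeably when \( n \) is small.

```python
# Illustration of Step 07: under normal errors F has an exact F(v1-v, v) null
# distribution, while W is only approximately chi-squared on v1-v degrees of
# freedom, so the two p-values can disagree in small samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p1, p2 = 15, 2, 3                       # deliberately small sample
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, p1 - 1))])
X2 = rng.normal(size=(n, p2))
X = np.hstack([X1, X2])
y = X1 @ np.array([1.0, 0.5]) + rng.normal(size=n)    # beta_2 = 0: null is true

def rss(M):
    b, *_ = np.linalg.lstsq(M, y, rcond=None)
    return np.sum((y - M @ b) ** 2)

SS1, SS = rss(X1), rss(X)
v1, v = n - p1, n - (p1 + p2)

F = ((SS1 - SS) / (v1 - v)) / (SS / v)
W = n * np.log(SS1 / SS)

print("exact F test p-value      :", stats.f.sf(F, v1 - v, v))
print("chi-squared approx. for W :", stats.chi2.sf(W, v1 - v))
```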


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Linear Models
Linear Models are fundamental in statistical analysis for understanding relationships between variables. They express the response variable, often denoted as \( y \), as a linear combination of predictor variables, plus an error term \( \varepsilon \). In the exercise at hand, we compare two such models where:
  • Model 1: \( y = X_1 \beta_1 + \varepsilon \)
  • Model 2: \( y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon \)
These models assess how additional variables may improve the explanation of the response variable. The error term represents the part of \( y \) not explained by the predictors and is assumed to be normally distributed with mean zero and variance \( \sigma^2 \). Understanding these models helps in gauging the effect of different predictors on the outcome variable.
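As an illustration only (synthetic data and made-up coefficient values), the snippet below fits the two nested models by least squares with numpy.linalg.lstsq and prints their residual sums of squares, which are the \( SS_1 \) and \( SS \) used throughout the exercise.

```python
# Fitting the two models of the exercise by least squares on synthetic data.
# Model 1 uses X1 only; Model 2 uses X1 and X2.  Names are illustrative.
import numpy as np

rng = np.random.default_rng(3)
n = 40
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 1))
y = X1 @ np.array([2.0, 1.0]) + 0.5 * X2[:, 0] + rng.normal(size=n)

b1, *_ = np.linalg.lstsq(X1, y, rcond=None)                    # Model 1 fit
b, *_ = np.linalg.lstsq(np.hstack([X1, X2]), y, rcond=None)    # Model 2 fit

SS1 = np.sum((y - X1 @ b1) ** 2)                 # residual sum of squares, Model 1
SS = np.sum((y - np.hstack([X1, X2]) @ b) ** 2)  # residual sum of squares, Model 2
print("SS1 (Model 1):", SS1)
print("SS  (Model 2):", SS)
```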
Likelihood Function
The Likelihood Function is a core concept in parameter estimation, especially in the context of statistical modeling like linear models. It quantifies how likely it is to observe the given data under specific parameter values.
For models with normally distributed errors, the likelihood function is defined as:\[ L(\beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}\text{RSS}\right) \]where \( \text{RSS} \) is the residual sum of squares.

The goal is to maximize this likelihood with respect to the parameters \( \beta \) and \( \sigma^2 \). Doing so provides maximum likelihood estimates (MLEs), which are parameter values that make the observed data most probable. In the exercise's context, the maximum log likelihood for a model is expressed as:

\[ -\frac{1}{2} n( \log SS + 1 - \log n) \]where \( SS \) is the residual sum of squares for the model. This formulation simplifies the computation and comparison of models, leading us to the likelihood ratio test.
F Statistic
The F Statistic is a fundamental tool for comparing statistical models. Particularly in linear models, it helps determine if the addition of new predictors significantly improves the model.
  • Calculated as: \[ F = \frac{(SS_1 - SS)/(v_1 - v)}{SS/v}, \] where \( SS_1 \) and \( SS \) are the residual sums of squares of the models being compared, and \( v_1 \) and \( v \) are their residual degrees of freedom.
  • The F statistic measures whether the model with more predictors fits significantly better than the simpler model.
  • A large F statistic indicates that the additional predictors substantially reduce the residual sum of squares, suggesting a genuine improvement in fit.
This measure is often preferred over the likelihood ratio because it fits into the Analysis of Variance (ANOVA) framework and, under normal errors, has an exact \( F_{v_1 - v,\, v} \) null distribution, providing a direct route to hypothesis testing.
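For concreteness, a minimal sketch of the calculation in the first bullet, using made-up residual sums of squares and degrees of freedom (scipy assumed available for the tail probability):

```python
# F statistic for two nested models from given residual sums of squares.
# The numbers below are hypothetical, purely for illustration.
from scipy import stats

SS1, SS = 130.0, 100.0      # RSS of the smaller and the larger model (hypothetical)
v1, v = 28, 25              # their residual degrees of freedom (hypothetical)

F = ((SS1 - SS) / (v1 - v)) / (SS / v)
p_value = stats.f.sf(F, v1 - v, v)     # upper-tail probability of F(v1-v, v)
print("F =", F, " p-value =", p_value)
```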
Residual Sum of Squares
Residual Sum of Squares (RSS) plays a crucial role in assessing the fit of a linear model. It measures the total of squared differences between observed values and the values predicted by the model. The RSS is calculated as:
\[ RSS = \sum (y_i - \hat{y}_i)^2 \]where \( y_i \) are the observed values and \( \hat{y}_i \) are the predicted values from the model.

A smaller RSS indicates a better fit, as it signifies that the model's predictions are closer to actual observations. In model comparison, choosing a model with a lower RSS typically means a better overall fit.
  • For Model 1, the RSS is \( SS_1 \), and for Model 2, it is \( SS \).
  • These RSS values are pivotal in calculating the likelihood function and, subsequently, the likelihood ratio statistic.
The comparison between different models often hinges on analyzing these RSS values to determine improvement in model fit when additional variables are included.
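A minimal illustration of this definition, using made-up observed and fitted values:

```python
# RSS as defined above: the sum of squared differences between observed and
# fitted values.  Toy numbers for illustration only.
import numpy as np

y = np.array([3.1, 4.0, 5.2, 6.8])       # observed values (hypothetical)
y_hat = np.array([3.0, 4.2, 5.0, 7.0])   # fitted values (hypothetical)

rss = np.sum((y - y_hat) ** 2)
print(rss)   # approximately 0.13
```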


Most popular questions from this chapter

Over a period of \(2 m+1\) years the quarterly gas consumption of a particular household may be represented by the model $$ Y_{i j}=\beta_{i}+\gamma j+\varepsilon_{i j}, \quad i=1, \ldots, 4, \quad j=-m,-m+1, \ldots, m-1, m, $$ where the parameters \(\beta_{i}\) and \(\gamma\) are unknown, and \(\varepsilon_{i j} \stackrel{\text { iid }}{\sim} N\left(0, \sigma^{2}\right) .\) Find the least squares estimators and show that they are independent with variances \((2 m+1)^{-1} \sigma^{2}\) and \(\sigma^{2} /\left(8 \sum_{i=1}^{m} i^{2}\right) .\) Show also that $$ (8 m-1)^{-1}\left[\sum_{i=1}^{4} \sum_{j=-m}^{m} Y_{i j}^{2}-(2 m+1) \sum_{i=1}^{4} \bar{Y}_{i}^{2}-\frac{2\left(\sum_{j=-m}^{m} j \bar{Y}_{. j}\right)^{2}}{\sum_{i=1}^{m} i^{2}}\right] $$ is unbiased for \(\sigma^{2}\), where \(\bar{Y}_{i}=(2 m+1)^{-1} \sum_{j=-m}^{m} Y_{i j}\) and \(\bar{Y}_{. j}=\frac{1}{4} \sum_{i=1}^{4} Y_{i j}\).

The angles of the triangle \(\mathrm{ABC}\) are measured with \(\mathrm{A}\) and \(\mathrm{B}\) each measured twice and \(\mathrm{C}\) three times. All the measurements are independent and unbiased with common variance \(\sigma^{2}\). Find the least squares estimates of the angles \(\mathrm{A}\) and \(\mathrm{B}\) based on the seven measurements and calculate the variance of these estimates.

Write down the linear model corresponding to a simple random sample \(y_{1}, \ldots, y_{n}\) from the \(N\left(\mu, \sigma^{2}\right)\) distribution, and find the design matrix. Verify that $$ \widehat{\mu}=\left(X^{\mathrm{T}} X\right)^{-1} X^{\mathrm{T}} y=\bar{y}, \quad s^{2}=S S(\widehat{\beta}) /(n-p)=(n-1)^{-1} \sum\left(y_{j}-\bar{y}\right)^{2} $$

Suppose that random variables \(Y_{g j}, j=1, \ldots, n_{g}, g=1, \ldots, G\), are independent and that they satisfy the normal linear model \(Y_{g j}=x_{g}^{\mathrm{T}} \beta+\varepsilon_{g j}\). Write down the covariate matrix for this model, and show that the least squares estimates can be written as \(\left(X_{1}^{\mathrm{T}} W X_{1}\right)^{-1} X_{1}^{\mathrm{T}} W Z\), where \(W=\operatorname{diag}\left\{n_{1}, \ldots, n_{G}\right\}\), and the \(g\) th element of \(Z\) is \(n_{g}^{-1} \sum_{j} Y_{g j} .\) Hence show that weighted least squares based on \(Z\) and unweighted least squares based on \(Y\) give the same parameter estimates and confidence intervals, when \(\sigma^{2}\) is known. Why do they differ if \(\sigma^{2}\) is unknown, unless \(n_{g} \equiv 1 ?\) Discuss how the residuals for the two setups differ, and say which is preferable for model checking.

Consider the straight-line regression model \(y_{j}=\alpha+\beta x_{j}+\sigma \varepsilon_{j}, j=1, \ldots, n\). Suppose that \(\sum x_{j}=0\) and that the \(\varepsilon_{j}\) are independent with means zero, variances \(v\), and common density \(f(\cdot)\). (a) Write down the variance of the least squares estimate of \(\beta\). (b) Show that if \(\sigma\) is known, the log likelihood for the data is $$ \ell(\alpha, \beta)=-n \log \sigma+\sum_{j=1}^{n} \log f\left(\frac{y_{j}-\alpha-\beta x_{j}}{\sigma}\right), $$ derive the expected information matrix for \(\alpha\) and \(\beta\), and show that the asymptotic variance of the maximum likelihood estimate of \(\beta\) can be written as \(\sigma^{2} /\left(i \sum x_{j}^{2}\right)\), where $$ i=\mathrm{E}\left\{-\frac{d^{2} \log f(\varepsilon)}{d \varepsilon^{2}}\right\} . $$ Hence show that the least squares estimate of \(\beta\) has asymptotic relative efficiency \((i v)^{-1} \times 100 \%\). (c) Show that the cumulant-generating function of the Gumbel distribution, \(f(u)=\exp \{-u-\exp (-u)\},-\infty
