Warning: foreach() argument must be of type array|object, bool given in /var/www/html/web/app/themes/studypress-core-theme/template-parts/header/mobile-offcanvas.php on line 20

Question: Entry-level job preferences. Benefits Quarterly published a study of entry-level job preferences. A number of independent variables were used to model the job preferences (measured on a 10-point scale) of 164 business school graduates. Suppose stepwise regression is used to build a model for job preference score (y) as a function of the following independent variables:

x1={1ifflextimeposition0ifnotx2={1iffdaycaresupportrequired0ifnotx3={1iffsupporttransfersupportrequired0ifnotx4=Numberofsickdaysallowed

x5={1iff1applicantmarried0ifnotx6=Numberofchildrenapplicantx6={1iffmaleapplicant0iffemaleapplicant

a. How many models are fit to the data in step 1? Give the general form of these models.

b. How many models are fit to the data in step 2? Give the general form of these models.

c. How many models are fit to the data in step 3? Give the general form of these models.

d. Explain how the procedure determines when to stop adding independent variables to the model.

e. Describe two major drawbacks to using the final stepwise model as the best model for job preference score y.

Short Answer

Expert verified

Answer

a. In step 1 of stepwise regression since there are 7 variables, 7 linear models in one variable is fitted to the data for 7 independent variables. The general model for step 1 isforE(y)=β0+β1xi.

b. In step 2 of stepwise regression since there are 7 independent variables,linear models in two variables are fitted to the data for 7 independent variables. The general model for step 1 isforE(y)=β0+β1x1+β2xi.

c. In step 3 of stepwise regression since there are 7 independent variables,linear models in three variables is fitted to the data for 7 independent variables. The general model for step 1 is forrole="math" localid="1658381585196" E(y)=β0+β1x1+β2x2+β3xi

d. The stepwise regression keeps on adding independent variables till no further independent variable can be added that gives significant t-values.

e. The final model reached with step-wise regression doesn’t account for interaction or higher-order terms which might be more fitted for the data. Also since for every added variable, t-tests are conducted which might lead to the high probability of making type I or type II errors.

Step by step solution

01

Given Information

There are total seven independent variables out of which five are qualitative (binary) while two are quantitative variables.

02

Models in step 1 of stepwise regression

In step 1 of the stepwise regression, linear model in one independent variable is modelled for all the k no of variables in the question.

So, in this situation since there are 7 variables, 7 linear models in one variable is fitted to the data for 7 independent variables.

The general model for step 1 is forE(y)=β0+β1xi.

03

Models in step 2 of stepwise regression 

In step 2 of the stepwise regression, linear model in two independent variables is modelled for selected independent variable in step 1 and all the remaining (k=1) no of variables in the question.

Hence, in this situation since there are 7 independent variables, combination formula is used to choose i items from a total of n items) linear models in two variables is fitted to the data for 7 independent variables. The general model for step 1 is for E(y)=β0+β1x1+β2xi.

04

Models in step 2 of stepwise regression

In step 3 of the stepwise regression, linear model in three independent variables is modelled for selected independent variables in step 2 and all the remaining (k-2) no of variables in the question.

Therefore, in this situation since there are 7 independent variables, (combination formula is used to choose x items from a total of n items) linear models in three variables is fitted to the data for 7 independent variables. The general model for step 1 is forE(y)=β0+β1x1+β2x2+β3xi

05

Procedure of step wise regression

The stepwise regression keeps on adding independent variables till no further independent variable can be added that gives significant t-values. In the question since there are 7 independent variables, the step-wise regression will be run till step 7 and t-test will be conducted to check the significance of each added variable.

06

Drawback of using stepwise regression model

The final model reached with step wise regression doesn’t account for interaction or higher order terms which might be more fitted for the data. Also since for every added variable, t-tests are conducted which might lead to the high probability of making type I or type II error.

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

When a multiple regression model is used for estimating the mean of the dependent variable and for predicting a new value of y, which will be narrower—the confidence interval for the mean or the prediction interval for the new y-value? Why?

Question: Bordeaux wine sold at auction. The uncertainty of the weather during the growing season, the phenomenon that wine tastes better with age, and the fact that some vineyards produce better wines than others encourage speculation concerning the value of a case of wine produced by a certain vineyard during a certain year (or vintage). The publishers of a newsletter titled Liquid Assets: The International Guide to Fine Wine discussed a multiple regression approach to predicting the London auction price of red Bordeaux wine. The natural logarithm of the price y (in dollars) of a case containing a dozen bottles of red wine was modelled as a function of weather during growing season and age of vintage. Consider the multiple regression results for hypothetical data collected for 30 vintages (years) shown below.

  1. Conduct a t-test (atα=0.05 ) for each of the βparameters in the model. Interpret the results.
  2. When the natural log of y is used as a dependent variable, the antilogarithm of a b coefficient minus 1—that is ebi - 1—represents the percentage change in y for every 1-unit increase in the associated x-value. Use this information to interpret each of the b estimates.
  3. Interpret the values of R2and s. Do you recommend using the model for predicting Bordeaux wine prices? Explain

Impact of race on football card values. University of Colorado sociologists investigated the impact of race on the value of professional football players’ “rookie” cards (Electronic Journal of Sociology, 2007). The sample consisted of 148 rookie cards of National Football League (NFL) players who were inducted into the Football Hall of Fame. The price of the card (in dollars) was modeled as a function of several qualitative independent variables: race of player (black or white), card availability (high or low), and player position (quarterback, running back, wide receiver, tight end, defensive lineman, linebacker, defensive back, or offensive lineman).

  1. Create the appropriate dummy variables for each of the qualitative independent variables.
  2. Write a model for price (y) as a function of race. Interpret theβ’s in the model.
  3. Write a model for price (y) as a function of card availability. Interpret theβ’s in the model.
  4. Write a model for price (y) as a function of position. Interpret theβ’s in the model.

Consider the model:

E(y)=β0+β1x1+β2x2+β3x22+β4x3+β5x1x22

where x2 is a quantitative model and

x1=(1receivedtreatment0didnotreceivetreatment)

The resulting least squares prediction equation is

localid="1649802968695" y=2+x1-5x2+3x22-4x3+x1x22

a. Substitute the values for the dummy variables to determine the curves relating to the mean value E(y) in general form.

b. On the same graph, plot the curves obtained in part a for the independent variable between 0 and 3. Use the least squares prediction equation.

Question: Consumer behavior while waiting in line. The Journal of Consumer Research (November 2003) published a study of consumer behavior while waiting in a queue. A sample of n = 148 college students was asked to imagine that they were waiting in line at a post office to mail a package and that the estimated waiting time is 10 minutes or less. After a 10-minute wait, students were asked about their level of negative feelings (annoyed, anxious) on a scale of 1 (strongly disagree) to 9 (strongly agree). Before answering, however, the students were informed about how many people were ahead of them and behind them in the line. The researchers used regression to relate negative feelings score (y) to number ahead in line (x1) and number behind in line (x2).

a.The researchers fit an interaction model to the data. Write the hypothesized equation of this model.

b. In the words of the problem, explain what it means to say that “x1 and x2 interact to affect y.”

c. A t-test for the interaction β in the model resulted in a p-value greater than 0.25. Interpret this result.

d. From their analysis, the researchers concluded that “the greater the number of people ahead, the higher the negative feeling score” and “the greater the number of people behind, the lower the negative feeling score.” Use this information to determine the signs of β1 and β2 in the model.

See all solutions

Recommended explanations on Math Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free