
Variation and Prediction Intervals. In Exercises 17–20, find the (a) explained variation, (b) unexplained variation, and (c) indicated prediction interval. In each case, there is sufficient evidence to support a claim of a linear correlation, so it is reasonable to use the regression equation when making predictions.

Town Courts Listed below are amounts of court income and salaries paid to the town justices (based on data from the Poughkeepsie Journal). All amounts are in thousands of dollars, and all of the towns are in Dutchess County, New York. For the prediction interval, use a 99% confidence level with a court income of $800,000.

\(\begin{array}{l|ccccccccc}\text{Court Income} & 65 & 404 & 1567 & 1131 & 272 & 252 & 111 & 154 & 32\\\text{Justice Salary} & 30 & 44 & 92 & 56 & 46 & 61 & 25 & 26 & 18\end{array}\)

Short Answer


(a) Explained Variation: 3210.364

(b) Unexplained Variation: 1087.191

(c) 99% Prediction Interval: (10.4, 104.6)

Step by step solution

01

Given information

Data are given for two variables, “Court Income” and “Justice Salary”.

02

Regression equation

Let x denote the variable “Court Income.”

Let y denote the variable “Justice Salary.”

The regression equation of y on x has the following notation:

\(\hat y = {b_0} + {b_1}x\), where

\({b_0}\) is the intercept term and \({b_1}\) is the slope coefficient.

The following calculations are done to compute the intercept and the slope coefficient:

The y-intercept is computed below:

\(\begin{array}{c}{b_0} = \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( {398} \right)\left( {4076640} \right) - \left( {3988} \right)\left( {262465} \right)}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}\\ = 27.701478\\ \approx 27.70\end{array}\)

The slope coefficient is computed below:

\(\begin{array}{c}{b_1} = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 9 \right)\left( {262465} \right) - \left( {3988} \right)\left( {398} \right)}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}\\ = 0.0372835\\ \approx 0.04\end{array}\)

Thus, the regression equation becomes as shown:

\(\begin{array}{l}\hat y = 27.701478 + 0.0372835x\\\hat y \approx 27.70 + 0.04x\end{array}\)
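As a cross-check, the sums and both coefficients can be reproduced directly from the nine data pairs. This is a minimal sketch using only plain Python; the variable names are illustrative and not part of the original solution. The slope comes out positive.

```python
# Data from the exercise, in thousands of dollars.
court_income = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]   # x
justice_salary = [30, 44, 92, 56, 46, 61, 25, 26, 18]          # y

n = len(court_income)
sum_x = sum(court_income)
sum_y = sum(justice_salary)
sum_x2 = sum(x * x for x in court_income)
sum_xy = sum(x * y for x, y in zip(court_income, justice_salary))

# Both coefficients share the same denominator.
denominator = n * sum_x2 - sum_x ** 2
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # y-intercept
b1 = (n * sum_xy - sum_x * sum_y) / denominator        # slope

print("sum x, sum y, sum x^2, sum xy:", sum_x, sum_y, sum_x2, sum_xy)
print("b0 =", b0)
print("b1 =", b1)
```

The four sums should match the values substituted into the formulas above (3988, 398, 4076640, and 262465).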

03

Predicted values

The mean value of observed y is computed below:

\(\begin{array}{c}\bar y = \frac{{\sum y }}{n}\\ = \frac{{398}}{9}\\ = 44.222\end{array}\)

The predicted values \(\hat y\) are obtained by substituting each value of x into the regression equation. The deviations \(\hat y - \bar y\) and \(y - \hat y\) are then squared and summed to give the two variations below.

The value of the explained variation is shown below:

\(\sum {{{\left( {\hat y - \bar y} \right)}^2}} = 3210.364\)

Thus, the explained variation is 3210.364.

The value of the unexplained variation is shown below:

\(\sum {{{\left( {y - \hat y} \right)}^2}} = 1087.191\)

Thus, the unexplained variation is 1087.191.
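The two sums can be checked numerically with a short script. The sketch below fits the line using the deviation form of the least-squares formulas, which is algebraically equivalent to the sum formulas in step 02; the names `s_xx`, `s_xy`, and `total` are illustrative and do not appear in the original solution.

```python
# Data from the exercise, in thousands of dollars.
x = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]   # court income
y = [30, 44, 92, 56, 46, 61, 25, 26, 18]            # justice salary

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit in deviation form (same line as in step 02).
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Predicted salary for each observed court income.
y_hat = [b0 + b1 * xi for xi in x]

explained = sum((yh - y_bar) ** 2 for yh in y_hat)
unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
total = sum((yi - y_bar) ** 2 for yi in y)

print("explained variation:  ", round(explained, 3))
print("unexplained variation:", round(unexplained, 3))
print("total variation:      ", round(total, 3))
```

For a least-squares line the total variation equals the explained plus the unexplained variation, which gives a quick consistency check on the two values.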

04

Predicted value at \(\left( {{x_0}} \right)\)

Substitute \({x_0} = 800\) in the regression equation, using the unrounded coefficients, to obtain the predicted value.

\(\begin{array}{c}\hat y = 27.701478 + 0.0372835x\\ = 27.701478 + 0.0372835\left( {800} \right)\\ = 57.5283\\ \approx 58\end{array}\)

05

Formula of prediction interval

The prediction interval is obtained using the formula shown below:

\(\begin{array}{c}PI = \hat y \pm E\\ = \hat y \pm {t_{\frac{\alpha }{2}}}{s_e}\sqrt {1 + \frac{1}{n} + \frac{{n{{\left( {{x_0} - \bar x} \right)}^2}}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}} \end{array}\)

06

Degrees of freedom and critical value

The level of significance is obtained from the confidence level as follows:

\(\begin{array}{c}Confidence\;Level = 99\% \\100\left( {1 - \alpha } \right) = 99\\1 - \alpha = 0.99\\\alpha = 0.01\end{array}\)

The degrees of freedom for computing the t-multiplier are shown below:

\(\begin{array}{c}df = n - 2\\ = 9 - 2\\ = 7\end{array}\)

The two-tailed value of the t-multiplier for 0.01 level of significance and 7 degrees of freedom is 3.4995.

07

Standard error of the estimate

The standard error of the estimate is computed below:

\(\begin{array}{c}{s_e} = \sqrt {\frac{{\sum {{{\left( {y - \hat y} \right)}^2}} }}{{n - 2}}} \\ = \sqrt {\frac{{1087.191}}{{9 - 2}}} \\ = 12.46247\end{array}\)

08

Value of \(\bar x\)

The value of \(\bar x\) is computed as follows:

\(\begin{array}{c}\bar x = \frac{{\sum x }}{n}\\ = \frac{{3988}}{9}\\ = 443.111\end{array}\)

09

Prediction interval

Substitute the values obtained above to calculate the margin of error (E).

\(\begin{array}{c}E = {t_{\frac{\alpha }{2}}}{s_e}\sqrt {1 + \frac{1}{n} + \frac{{n{{\left( {{x_0} - \bar x} \right)}^2}}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}} \\ = \left( {3.4995} \right)\left( {12.46247} \right)\sqrt {1 + \frac{1}{9} + \frac{{9{{\left( {800 - 443.111} \right)}^2}}}{{9\left( {4076640} \right) - {{\left( {3988} \right)}^2}}}} \\ = 47.0986\end{array}\)

Thus, the prediction interval becomes as shown:

\(\begin{array}{c}PI = \left( {\hat y - E,\hat y + E} \right)\\ = \left( {57.5283 - 47.0986,\;57.5283 + 47.0986} \right)\\ = \left( {10.4297,\;104.6269} \right)\\ \approx \left( {10.4,\;104.6} \right)\end{array}\)

Therefore, the 99% prediction interval for the justice salary for a court income of $800,000 is (10.4, 104.6), in thousands of dollars.
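Steps 04 through 09 can be reproduced end to end with the sketch below. It assumes the critical value 3.4995 quoted in step 06 rather than computing it, because the Python standard library has no inverse t distribution; the variable names are illustrative only.

```python
import math

# Data from the exercise, in thousands of dollars.
x = [65, 404, 1567, 1131, 272, 252, 111, 154, 32]   # court income
y = [30, 44, 92, 56, 46, 61, 25, 26, 18]            # justice salary

x0 = 800          # court income of $800,000
t_crit = 3.4995   # assumed: taken from step 06 (df = 7, alpha = 0.01, two-tailed)

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
x_bar = sum_x / n

# Regression coefficients (step 02), kept unrounded.
denominator = n * sum_x2 - sum_x ** 2
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
b1 = (n * sum_xy - sum_x * sum_y) / denominator

# Predicted value at x0 (step 04).
y_hat_0 = b0 + b1 * x0

# Standard error of the estimate (step 07).
unexplained = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(unexplained / (n - 2))

# Margin of error and prediction interval (step 09).
margin = t_crit * s_e * math.sqrt(
    1 + 1 / n + n * (x0 - x_bar) ** 2 / denominator
)
lower, upper = y_hat_0 - margin, y_hat_0 + margin

print("predicted value:", round(y_hat_0, 4))
print("s_e:", round(s_e, 5))
print("E:", round(margin, 4))
print("99% prediction interval:", (round(lower, 1), round(upper, 1)))
```

Keeping the coefficients unrounded matters here: with the rounded values 27.70 and 0.04, the predicted salary at a court income of 800 would be 59.7 instead of 57.5283, and the interval would shift accordingly.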


