Variation and Prediction Intervals. In Exercises 17–20, find the (a) explained variation, (b) unexplained variation, and (c) indicated prediction interval. In each case, there is sufficient evidence to support a claim of a linear correlation, so it is reasonable to use the regression equation when making predictions.

Weighing Seals with a Camera The table below lists overhead widths (cm) of seals measured from photographs and the weights (kg) of the seals (based on “Mass Estimation of Weddell Seals Using Techniques of Photogrammetry,” by R. Garrott of Montana State University). For the prediction interval, use a 99% confidence level with an overhead width of 9.0 cm.

| Overhead Width (cm) | 7.2 | 7.4 | 9.8 | 9.4 | 8.8 | 8.4 |
|---|---|---|---|---|---|---|
| Weight (kg) | 116 | 154 | 245 | 202 | 200 | 191 |

Short Answer

(a) Explained variation: 8880.1818

(b) Unexplained variation: 991.1515

(c) 99% prediction interval: (124.97 kg, 284.55 kg)

Step by step solution

01

Given information

Data are given on two variables, “Overhead Width” and “Weight”.

02

Regression equation

Let x denote the variable “Overhead Width.”

Let y denote the variable “Weight”.

The regression equation of y on x has the following notation:

\(\hat y = {b_0} + {b_1}x\), where

\({b_0}\) is the intercept term

\({b_1}\) is the slope coefficient

The following calculations are done to compute the intercept and the slope coefficient:

The value of the y-intercept is computed below:

\(\begin{aligned}{b_0} &= \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ &= \frac{{\left( {1108} \right)\left( {439} \right) - \left( {51} \right)\left( {9639} \right)}}{{6\left( {439} \right) - {{\left( {51} \right)}^2}}}\\ &= - 156.878788\\ &\approx - 156.88\end{aligned}\)

The value of the slope coefficient is computed below:

\(\begin{aligned}{b_1} &= \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ &= \frac{{\left( 6 \right)\left( {9639} \right) - \left( {51} \right)\left( {1108} \right)}}{{6\left( {439} \right) - {{\left( {51} \right)}^2}}}\\ &= 40.181818\\ &\approx 40.18\end{aligned}\)

Thus, the regression equation becomes:

\(\begin{aligned}\hat y &= - 156.878788 + 40.181818x\\ &\approx - 156.88 + 40.18x\end{aligned}\)
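The intercept and slope formulas above can be checked with a few lines of Python. This is a minimal sketch using only the six data pairs from the problem; the function name `fit_line` and the variable names are illustrative choices, not part of the original solution.

```python
# Paired sample data from the problem statement.
widths = [7.2, 7.4, 9.8, 9.4, 8.8, 8.4]    # overhead width x (cm)
weights = [116, 154, 245, 202, 200, 191]   # weight y (kg)


def fit_line(x, y):
    """Return (b0, b1) for y-hat = b0 + b1 * x using the sum formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    # Both coefficients share the same denominator.
    denom = n * sum_x2 - sum_x ** 2
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    return b0, b1


b0, b1 = fit_line(widths, weights)
print(f"b0 = {b0:.6f}, b1 = {b1:.6f}")
```

With n = 6, the sums are 51 for x, 1108 for y, 439 for x squared, and 9639 for xy, so the printed coefficients should agree with the intercept and slope of the regression equation to the precision shown there.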

03

Predicted values

The mean value of observed y is computed below:

\(\begin{aligned}\bar y &= \frac{{\sum y }}{n}\\ &= \frac{{1108}}{6}\\ &= 184.6667\end{aligned}\)

The predicted values \(\hat y\) are obtained by substituting each value of x into the regression equation. The deviations \(\hat y - \bar y\) and \(y - \hat y\) are then squared and summed to give the variations below.

The value of the explained variation is shown below:

\(\sum {{{\left( {\hat y - \bar y} \right)}^2}} = 8880.1818\)

Thus, the explained variation is equal to 8880.1818.

The value of the unexplained variation is shown below:

\(\sum {{{\left( {y - \hat y} \right)}^2}} = 991.1515\)

Thus, the unexplained variation is equal to 991.1515.
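Both variations can be reproduced from the raw data. The sketch below is a hedged illustration: it computes the coefficients with the equivalent deviation-form formulas (slope as the sum of cross-deviations over the sum of squared x-deviations), which is an assumption made here for brevity rather than the route taken in this solution, and the function name `variations` is hypothetical.

```python
widths = [7.2, 7.4, 9.8, 9.4, 8.8, 8.4]    # overhead width x (cm)
weights = [116, 154, 245, 202, 200, 191]   # weight y (kg)


def variations(x, y):
    """Return (predicted values, explained variation, unexplained variation)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Deviation-form least-squares coefficients (equivalent to the sum formulas).
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * a for a in x]
    explained = sum((p - y_bar) ** 2 for p in y_hat)            # sum of (y-hat - y-bar)^2
    unexplained = sum((b - p) ** 2 for b, p in zip(y, y_hat))   # sum of (y - y-hat)^2
    return y_hat, explained, unexplained


y_hat, explained, unexplained = variations(widths, weights)
total = sum((b - sum(weights) / len(weights)) ** 2 for b in weights)
print([round(p, 4) for p in y_hat])
print(round(explained, 4), round(unexplained, 4), round(total, 4))
```

As a consistency check, the explained and unexplained variations should add up to the total variation, the sum of \({\left( {y - \bar y} \right)^2}\).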

04

Predicted value at \(\left( {{x_0}} \right)\)

Substituting the value of \({x_0} = 9\) in the regression equation, the predicted value is obtained as follows:

\(\begin{aligned}\hat y &= - 156.878788 + 40.181818x\\ &= - 156.878788 + 40.181818\left( 9 \right)\\ &= 204.7576\end{aligned}\)

05

Level of significance and degrees of freedom

The following formula is used to compute the level of significance:

\(\begin{aligned}{\rm{Confidence}}\;{\rm{Level}} &= 99\% \\100\left( {1 - \alpha } \right) &= 99\\1 - \alpha &= 0.99\\ \alpha &= 0.01\end{aligned}\)

The degrees of freedom for computing the value of the t-multiplier are shown below:

\(\begin{aligned}df &= n - 2\\ &= 6 - 2\\ &= 4\end{aligned}\)

06

Value of t-multiplier, \({t_{\frac{\alpha }{2}}}\)

For a level of significance of \(\alpha = 0.01\) (an area of \(\alpha /2 = 0.005\) in each tail) and 4 degrees of freedom, the value of the t-multiplier is 4.6041.
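The t-multiplier is normally read from a t table or from software. As a rough cross-check that needs only the standard library, the sketch below uses the closed-form cumulative distribution function of Student's t distribution with exactly 4 degrees of freedom and finds the critical value by bisection. The closed form and the function names `t_cdf_df4` and `t_critical_df4` are assumptions introduced for illustration; they are not part of the original solution and do not apply to other degrees of freedom.

```python
import math


def t_cdf_df4(t):
    """CDF of Student's t distribution with exactly 4 degrees of freedom."""
    scale = 1 + t * t / 4
    u = t * t / scale
    return 0.5 + 0.375 * (t / math.sqrt(scale)) * (1 - u / 12)


def t_critical_df4(alpha, lo=0.0, hi=50.0, tol=1e-10):
    """Two-tailed critical value: find t with CDF(t) = 1 - alpha / 2 by bisection."""
    target = 1 - alpha / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf_df4(mid) < target:
            lo = mid    # critical value lies above mid
        else:
            hi = mid    # critical value lies at or below mid
    return (lo + hi) / 2


t_crit = t_critical_df4(0.01)
print(round(t_crit, 4))
```

Bisection is used because the CDF is increasing in t, so the search interval can be halved until it brackets the point where the upper-tail area equals \(\alpha /2\).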

07

Standard error of the estimate

The value of the standard error of the estimate is computed below:

\(\begin{array}{c}{s_e} = \sqrt {\frac{{\sum {{{\left( {y - \hat y} \right)}^2}} }}{{n - 2}}} \\ = \sqrt {\frac{{991.1515}}{{6 - 2}}} \\ = 15.74128\end{array}\)

08

Value of \(\bar x\)

The value of \(\bar x\) is computed as follows:

\(\begin{array}{c}\bar x = \frac{{\sum x }}{n}\\ = \frac{{51}}{6}\\ = 8.5\end{array}\)

09

Prediction interval

Substitute the values obtained above to calculate the value of margin of error (E) as shown:

\(\begin{aligned}E &= {t_{\frac{\alpha }{2}}}{s_e}\sqrt {1 + \frac{1}{n} + \frac{{n{{\left( {{x_0} - \bar x} \right)}^2}}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}} \\ &= \left( {4.6041} \right)\left( {15.74128} \right)\sqrt {1 + \frac{1}{6} + \frac{{6{{\left( {9 - 8.5} \right)}^2}}}{{6\left( {439} \right) - {{\left( {51} \right)}^2}}}} \\ &= 79.791718\end{aligned}\)

Thus, the prediction interval becomes:

\(\begin{aligned}PI &= \left( {\hat y - E,\hat y + E} \right)\\ &= \left( {204.7576 - 79.791718,204.7576 + 79.791718} \right)\\ &= \left( {124.966,284.549} \right)\\ &\approx \left( {124.97,284.55} \right)\end{aligned}\)

Therefore, the 99% prediction interval for the weight of a seal with an overhead width of 9.0 cm is (124.97 kg, 284.55 kg).
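The whole prediction-interval calculation can be sketched end to end in Python. The t-multiplier 4.6041 is taken from this solution as a constant rather than computed; the function name `prediction_interval` and its parameters are hypothetical names chosen for this illustration.

```python
import math

widths = [7.2, 7.4, 9.8, 9.4, 8.8, 8.4]    # overhead width x (cm)
weights = [116, 154, 245, 202, 200, 191]   # weight y (kg)
T_CRIT = 4.6041                            # t-multiplier for alpha = 0.01, df = 4 (from the solution)


def prediction_interval(x, y, x0, t_crit):
    """Return (lower, upper) prediction limits for y at x = x0."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y - b1 * sum_x) / n           # same as y-bar - b1 * x-bar
    y0 = b0 + b1 * x0                       # predicted value at x0
    # Standard error of the estimate from the unexplained variation.
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    x_bar = sum_x / n
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)
    return y0 - margin, y0 + margin


lower, upper = prediction_interval(widths, weights, 9.0, T_CRIT)
print(f"({lower:.2f} kg, {upper:.2f} kg)")
```

Because the response variable is weight, the limits are in kilograms even though the input value 9.0 is in centimeters.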

Most popular questions from this chapter

Interpreting a Computer Display. In Exercises 9–12, refer to the display obtained by using the paired data consisting of Florida registered boats (tens of thousands) and numbers of manatee deaths from encounters with boats in Florida for different recent years (from Data Set 10 in Appendix B). Along with the paired boat, manatee sample data, StatCrunch was also given the value of 85 (tens of thousands) boats to be used for predicting manatee fatalities.

Finding a Prediction Interval For a year with 850,000 (x = 85) registered boats in Florida, identify the 95% prediction interval estimate of the number of manatee fatalities resulting from encounters with boats. Write a statement interpreting that interval.

Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Old Faithful Listed below are duration times (seconds) and time intervals (min) to the next eruption for randomly selected eruptions of the Old Faithful geyser in Yellowstone National Park. Is there sufficient evidence to conclude that there is a linear correlation between duration times and interval after times?

| Duration (seconds) | 242 | 255 | 227 | 251 | 262 | 207 | 140 |
|---|---|---|---|---|---|---|---|
| Interval After (min) | 91 | 81 | 91 | 92 | 102 | 94 | 91 |

In Exercises 9–12, refer to the accompanying table, which was obtained using the data from 21 cars listed in Data Set 20 “Car Measurements” in Appendix B. The response (y) variable is CITY (fuel consumption in mi/gal). The predictor (x) variables are WT (weight in pounds), DISP (engine displacement in liters), and HWY (highway fuel consumption in mi/gal).

Which regression equation is best for predicting city fuel consumption? Why?

Super Bowl and \({R^2}\) Let x represent years coded as 1, 2, 3, . . . for years starting in 1980, and let y represent the numbers of points scored in each Super Bowl from 1980. Using the data from 1980 to the last Super Bowl at the time of this writing, we obtain the following values of \({R^2}\) for the different models: linear: 0.147; quadratic: 0.255; logarithmic: 0.176; exponential: 0.175; power: 0.203. Based on these results, which model is best? Is the best model a good model? What do the results suggest about predicting the number of points scored in a future Super Bowl game?

Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Tips Listed below are amounts of bills for dinner and the amounts of the tips that were left. The data were collected by students of the author. Is there sufficient evidence to conclude that there is a linear correlation between the bill amounts and the tip amounts? If everyone were to tip with the same percentage, what should be the value of r?

| Bill (dollars) | 33.46 | 50.68 | 87.92 | 98.84 | 63.6 | 107.34 |
|---|---|---|---|---|---|---|
| Tip (dollars) | 5.5 | 5 | 8.08 | 17 | 12 | 16 |
