
Effects of Clusters. Refer to the Minitab-generated scatterplot given in Exercise 12 of Section 10-1 on page 485.

a. Using the pairs of values for all 8 points, find the equation of the regression line.

b. Using only the pairs of values for the 4 points in the lower left corner, find the equation of the regression line.

c. Using only the pairs of values for the 4 points in the upper right corner, find the equation of the regression line.

d. Compare the results from parts (a), (b), and (c).

Short Answer


a. The regression equation is \(\hat y = 0.085 + 0.985x\).

b. The regression equation with only the 4 lower-left corner values is \(\hat y = 1.5 + 0.00x\).

c. The regression equation with only the 4 upper-right corner values is \(\hat y = 9.5 + 0.00x\).

d. The regression equations obtained in parts (a), (b), and (c) are completely different from one another. The presence of different sets of values affects the regression equation to a large extent.

Step by step solution

01

Given information

A set of 8 pairs of values is considered.

02

Regression equation using all values

a.

The regression equation of y on x has the following notation:

\(\hat y = {b_0} + {b_1}x\), where

\({b_0}\) is the intercept term, and

\({b_1}\) is the slope coefficient.

The 8 data points are read from the scatterplot, and the necessary calculations give the following sums:

\(n = 8,\;\sum x = 44,\;\sum y = 44,\;\sum {{x^2}} = 372,\;\sum {xy} = 370\).


The value of the y-intercept is computed below.

\(\begin{array}{c}{b_0} = \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( {44} \right)\left( {372} \right) - \left( {44} \right)\left( {370} \right)}}{{8\left( {372} \right) - {{\left( {44} \right)}^2}}}\\ = 0.085\end{array}\).

The value of the slope coefficient is computed below.

\(\begin{array}{c}{b_1} = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 8 \right)\left( {370} \right) - \left( {44} \right)\left( {44} \right)}}{{8\left( {372} \right) - {{\left( {44} \right)}^2}}}\\ = 0.985\end{array}\).

Thus, the regression equation becomes

\(\hat y = 0.085 + 0.985x\).
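As a numerical check on the intercept and slope formulas in this step, the sketch below evaluates them in Python directly from the sums that appear in the computations (n = 8, Σx = 44, Σy = 44, Σx² = 372, Σxy = 370). The function name `fit_from_sums` and its parameter names are illustrative choices, not part of the original solution.

```python
def fit_from_sums(n, sx, sy, sxx, sxy):
    """Return (b0, b1) for y-hat = b0 + b1*x, given the summary sums.

    sx = sum of x, sy = sum of y, sxx = sum of x^2, sxy = sum of x*y.
    """
    denom = n * sxx - sx ** 2            # denominator shared by both formulas
    b0 = (sy * sxx - sx * sxy) / denom   # y-intercept
    b1 = (n * sxy - sx * sy) / denom     # slope
    return b0, b1


# Sums taken from the part (a) computations above.
b0, b1 = fit_from_sums(n=8, sx=44, sy=44, sxx=372, sxy=370)
print(f"y-hat = {b0:.3f} + {b1:.3f}x")  # y-hat = 0.085 + 0.985x
```

Both coefficients come out positive, so the fitted line rises with x.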

03

Regression equation using only lower-left points

b.

The 4 pairs of data points in the lower-left corner are considered, and the necessary calculations give the following sums:

\(n = 4,\;\sum x = 6,\;\sum y = 6,\;\sum {{x^2}} = 10,\;\sum {xy} = 9\).

The value of the y-intercept is computed below.

\(\begin{array}{c}{b_0} = \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 6 \right)\left( {10} \right) - \left( 6 \right)\left( 9 \right)}}{{4\left( {10} \right) - {{\left( 6 \right)}^2}}}\\ = 1.5\end{array}\).

The value of the slope coefficient is computed below.

\(\begin{array}{c}{b_1} = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 4 \right)\left( 9 \right) - \left( 6 \right)\left( 6 \right)}}{{4\left( {10} \right) - {{\left( 6 \right)}^2}}}\\ = 0.000\end{array}\).

Thus, the regression equation becomes

\(\hat y = 1.5 + 0.00x\).

04

Regression equation using upper-right corner values

c.

The 4 pairs of data points in the upper-right corner are considered, and the necessary calculations give the following sums:

\(n = 4,\;\sum x = 38,\;\sum y = 38,\;\sum {{x^2}} = 362,\;\sum {xy} = 361\).

The value of the y-intercept is computed below.

\(\begin{array}{c}{b_0} = \frac{{\left( {\sum y } \right)\left( {\sum {{x^2}} } \right) - \left( {\sum x } \right)\left( {\sum {xy} } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( {38} \right)\left( {362} \right) - \left( {38} \right)\left( {361} \right)}}{{4\left( {362} \right) - {{\left( {38} \right)}^2}}}\\ = 9.5\end{array}\).

The value of the slope coefficient is computed below.

\(\begin{array}{c}{b_1} = \frac{{n\left( {\sum {xy} } \right) - \left( {\sum x } \right)\left( {\sum y } \right)}}{{n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}}}\\ = \frac{{\left( 4 \right)\left( {361} \right) - \left( {38} \right)\left( {38} \right)}}{{4\left( {362} \right) - {{\left( {38} \right)}^2}}}\\ = 0.000\end{array}\).

Thus, the regression equation becomes

\(\hat y = 9.5 + 0.00x\).
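The two cluster fits can be cross-checked with the equivalent form \({b_0} = \bar y - {b_1}\bar x\), which makes it plain that a zero slope leaves the intercept equal to the mean of y. The following is a minimal sketch using the sums from parts (b) and (c). The helper name `fit_via_means` is hypothetical, and this form of the intercept is a standard identity rather than the one used in the solution above.

```python
def fit_via_means(n, sx, sy, sxx, sxy):
    """Return (b0, b1) using b1 from the sum formula and b0 = y_bar - b1 * x_bar."""
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    b0 = sy / n - b1 * (sx / n)                     # intercept from the means
    return b0, b1


# Sums taken from the part (b) and part (c) computations.
lower_left = fit_via_means(n=4, sx=6, sy=6, sxx=10, sxy=9)
upper_right = fit_via_means(n=4, sx=38, sy=38, sxx=362, sxy=361)

print(lower_left)   # (1.5, 0.0)
print(upper_right)  # (9.5, 0.0)
```

With a slope of 0, each cluster's regression line is horizontal at the cluster's mean y-value (1.5 and 9.5).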

05

Comparison

d.

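The regression equations obtained in parts (a), (b), and (c) are completely different from one another. Each cluster of 4 points taken alone gives a horizontal line with slope 0 (\(\hat y = 1.5\) and \(\hat y = 9.5\)), whereas all 8 points together give a line with slope 0.985.

Thus, the presence of clusters can greatly influence the regression equation: the combined data produce a steeply rising line even though neither cluster shows any linear trend on its own.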
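The comparison can also be reproduced from raw points. The scatterplot itself is not reproduced in this solution, so the coordinates in the sketch below are an assumption: they are hypothetical values chosen only because they reproduce the sums used in parts (a) through (c). The function name `fit_line` is likewise illustrative.

```python
def fit_line(points):
    """Least-squares (b0, b1) for y-hat = b0 + b1*x, using the sum formulas."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx ** 2
    b0 = (sy * sxx - sx * sxy) / denom
    b1 = (n * sxy - sx * sy) / denom
    return b0, b1


# ASSUMPTION: hypothetical coordinates consistent with the sums in the solution
# (lower left: sum x = 6, sum x^2 = 10, sum xy = 9; upper right: 38, 362, 361).
lower_left = [(1, 1), (1, 2), (2, 1), (2, 2)]
upper_right = [(9, 9), (9, 10), (10, 9), (10, 10)]

fits = {
    "all 8 points": fit_line(lower_left + upper_right),
    "lower left": fit_line(lower_left),
    "upper right": fit_line(upper_right),
}
for label, (b0, b1) in fits.items():
    print(f"{label:13s} y-hat = {b0:.3f} + {b1:.3f}x")
```

Under these assumed coordinates, the slope is 0 within each cluster and about 0.985 for the combined data, which is the contrast described in part (d).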

Most popular questions from this chapter

Different hotels on Las Vegas Boulevard (“the strip”) in Las Vegas are randomly selected, and their ratings and prices were obtained from Travelocity. Using technology, with x representing the ratings and y representing price, we find that the regression equation has a slope of 130 and a y-intercept of -368.

a. What is the equation of the regression line?

b. What does the symbol \(\hat y\) represent?

Critical Thinking: Is the pain medicine Duragesic effective in reducing pain? Listed below are measures of pain intensity before and after using the drug Duragesic (fentanyl) (based on data from Janssen Pharmaceutical Products, L.P.). The data are listed in order by row, and corresponding measures are from the same subject before and after treatment. For example, the first subject had a measure of 1.2 before treatment and a measure of 0.4 after treatment. Each pair of measurements is from one subject, and the intensity of pain was measured using the standard visual analog score. A higher score corresponds to higher pain intensity.

Pain intensity before Duragesic treatment: 1.2, 1.3, 1.5, 1.6, 8, 3.4, 3.5, 2.8, 2.6, 2.2, 3, 7.1, 2.3, 2.1, 3.4, 6.4, 5, 4.2, 2.8, 3.9, 5.2, 6.9, 6.9, 5, 5.5, 6, 5.5, 8.6, 9.4, 10, 7.6

Pain intensity after Duragesic treatment: 0.4, 1.4, 1.8, 2.9, 6.0, 1.4, 0.7, 3.9, 0.9, 1.8, 0.9, 9.3, 8.0, 6.8, 2.3, 0.4, 0.7, 1.2, 4.5, 2.0, 1.6, 2.0, 2.0, 6.8, 6.6, 4.1, 4.6, 2.9, 5.4, 4.8, 4.1

Regression: Use the given data to find the equation of the regression line. Let the response (y) variable be the pain intensity after treatment. What would be the equation of the regression line for a treatment having absolutely no effect?

The following exercises are based on the following sample data consisting of numbers of enrolled students (in thousands) and numbers of burglaries for randomly selected large colleges in a recent year (based on data from the New York Times).

Enrollment (thousands): 53, 28, 27, 36, 42

Burglaries: 86, 57, 32, 131, 157

True or false: If the sample data lead us to the conclusion that there is sufficient evidence to support the claim of a linear correlation between enrollment and number of burglaries, then we could also conclude that higher enrollments cause increases in numbers of burglaries.

Explore! Exercises 9 and 10 provide two data sets from “Graphs in Statistical Analysis,” by F. J. Anscombe, The American Statistician, Vol. 27. For each exercise,

a. Construct a scatterplot.

b. Find the value of the linear correlation coefficient r, then determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

c. Identify the feature of the data that would be missed if part (b) was completed without constructing the scatterplot.

x: 10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5

y: 9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74

Interpreting the Coefficient of Determination. In Exercises 5–8, use the value of the linear correlation coefficient r to find the coefficient of determination and the percentage of the total variation that can be explained by the linear relationship between the two variables.

Crickets and Temperature r = 0.874 (x = number of cricket chirps in 1 minute, y = temperature in °F)
