Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Old Faithful. Listed below are duration times (seconds) and time intervals (min) to the next eruption for randomly selected eruptions of the Old Faithful geyser in Yellowstone National Park. Is there sufficient evidence to conclude that there is a linear correlation between duration times and interval after times?

| Duration (s) | 242 | 255 | 227 | 251 | 262 | 207 | 140 |
|---|---|---|---|---|---|---|---|
| Interval After (min) | 91 | 81 | 91 | 92 | 102 | 94 | 91 |

Short Answer

The scatterplot is:

[Figure: scatterplot of interval after (minutes) against duration (seconds)]

The value of the correlation coefficient is 0.046.

The p-value is 0.921.

There is not enough evidence to support the claim that there is a linear correlation between the two variables.

Step by step solution

Step 1: Given information

The data are recorded for two variables: eruption duration (in seconds) and the time interval (in minutes) to the next eruption of the Old Faithful geyser.

| Duration (s) | Interval After (min) |
|---|---|
| 242 | 91 |
| 255 | 81 |
| 227 | 91 |
| 251 | 92 |
| 262 | 102 |
| 207 | 94 |
| 140 | 91 |

Step 2: Sketch a scatterplot

A scatterplot is a graph of two variables that have paired values. Each variable is scaled on one axis.

Steps to sketch a scatterplot:

  1. Mark two axes, x and y, for duration and interval after, respectively.
  2. Mark the paired data values on the graph corresponding to the axes.

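The resultant graph is shown below.

[Figure: scatterplot with duration (seconds) on the x-axis and interval after (minutes) on the y-axis]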
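The two steps above can be sketched in Python. This is a minimal illustration, assuming matplotlib is installed; matplotlib is not part of the original solution, and the names `durations`, `intervals`, `points`, and `plot_scatter` are made up for this example.

```python
# Paired data from the exercise: duration (s) and interval after (min).
durations = [242, 255, 227, 251, 262, 207, 140]
intervals = [91, 81, 91, 92, 102, 94, 91]

# Each point pairs one duration (x) with its interval after (y).
points = list(zip(durations, intervals))


def plot_scatter():
    """Draw the scatterplot; matplotlib is imported lazily (assumed installed)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(durations, intervals)          # step 2: mark the paired values
    ax.set_xlabel("Duration (seconds)")       # step 1: x-axis
    ax.set_ylabel("Interval after (minutes)") # step 1: y-axis
    ax.set_title("Old Faithful: interval after vs. duration")
    plt.show()
```

Calling `plot_scatter()` opens the figure; the data lists themselves can be inspected without matplotlib.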

Step 3: Compute the correlation coefficient

The formula for the correlation coefficient is

\(r = \frac{{n\sum {xy} - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {n\left( {\sum {{x^2}} } \right) - {{\left( {\sum x } \right)}^2}} \sqrt {n\left( {\sum {{y^2}} } \right) - {{\left( {\sum y } \right)}^2}} }}\).

Let the duration be variable x and the interval after be variable y.

The values are listed in the table below:

| \(x\) | \(y\) | \({x^2}\) | \({y^2}\) | \(xy\) |
|---|---|---|---|---|
| 242 | 91 | 58564 | 8281 | 22022 |
| 255 | 81 | 65025 | 6561 | 20655 |
| 227 | 91 | 51529 | 8281 | 20657 |
| 251 | 92 | 63001 | 8464 | 23092 |
| 262 | 102 | 68644 | 10404 | 26724 |
| 207 | 94 | 42849 | 8836 | 19458 |
| 140 | 91 | 19600 | 8281 | 12740 |
| \(\sum x = 1584\) | \(\sum y = 642\) | \(\sum {x^2} = 369212\) | \(\sum {y^2} = 59108\) | \(\sum xy = 145348\) |

Substitute the values in the formula:

\(\begin{aligned} r &= \frac{{7\left( {145348} \right) - \left( {1584} \right)\left( {642} \right)}}{{\sqrt {7\left( {369212} \right) - {{\left( {1584} \right)}^2}} \sqrt {7\left( {59108} \right) - {{\left( {642} \right)}^2}} }}\\ &= 0.046\end{aligned}\)

Thus, the correlation coefficient is 0.046.
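The column sums and the substitution into the formula above can be checked with a short script. This is a sketch using only the Python standard library; the variable names are illustrative and not taken from the original solution.

```python
import math

x = [242, 255, 227, 251, 262, 207, 140]  # duration (s)
y = [91, 81, 91, 92, 102, 94, 91]        # interval after (min)
n = len(x)

# Column sums from the table.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Computational formula for the linear correlation coefficient r.
numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 1584 642 369212 59108 145348
print(round(r, 3))                           # 0.046
```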

Step 4: Conduct a hypothesis test for correlation

Define \(\rho \) as the population linear correlation coefficient for the two variables.

For testing the claim, form the hypotheses:

\(\begin{array}{l}{{\rm{H}}_0}:\rho = 0\\{{\rm{H}}_{\rm{a}}}:\rho \ne 0\end{array}\)

The sample size is n = 7.

The test statistic is computed as follows:

\(\begin{aligned} t &= \frac{r}{{\sqrt {\frac{{1 - {r^2}}}{{n - 2}}} }}\\ &= \frac{{0.046}}{{\sqrt {\frac{{1 - {{0.046}^2}}}{{7 - 2}}} }}\\ &= 0.103\end{aligned}\)

Thus, the test statistic is 0.103.
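The substitution above can be reproduced in a few lines. This is a minimal sketch; the helper name `t_statistic` is illustrative, and the rounded value r = 0.046 is used to match the worked substitution.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: rho = 0, based on n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


t = t_statistic(0.046, 7)  # rounded r, as in the substitution above
print(round(t, 3))         # 0.103
```

Passing the unrounded r (about 0.04636) instead gives roughly 0.104; the difference comes from rounding only and does not affect the conclusion.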

The degrees of freedom are

\(\begin{aligned} df &= n - 2\\ &= 7 - 2\\ &= 5.\end{aligned}\)

The two-tailed p-value is computed from the t-distribution with 5 degrees of freedom.

\(\begin{aligned} p{\rm{ - value}} &= 2P\left( {T > t} \right)\\ &= 2P\left( {T > 0.103} \right)\\ &= 2\left( {1 - P\left( {T < 0.103} \right)} \right)\\ &= 0.921\end{aligned}\)

Thus, the p-value is 0.921.
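The p-value can also be computed without a table. The sketch below uses the closed-form expression for the Student t distribution that exists for odd degrees of freedom, hard-coded here for df = 5; the function name `two_tailed_p_df5` is illustrative, and the original solution does not say which tool produced its value.

```python
import math


def two_tailed_p_df5(t):
    """Two-tailed p-value, 2 * P(T > |t|), for Student's t with 5 df.

    Closed form valid for df = 5 only; not a general-purpose routine.
    """
    theta = math.atan(abs(t) / math.sqrt(5))
    # P(|T| < |t|) for 5 degrees of freedom.
    central = (2 / math.pi) * (
        theta
        + math.sin(theta) * (math.cos(theta) + (2 / 3) * math.cos(theta) ** 3)
    )
    return 1 - central


x = [242, 255, 227, 251, 262, 207, 140]  # duration (s)
y = [91, 81, 91, 92, 102, 94, 91]        # interval after (min)
n = len(x)

# Unrounded r and t, so that no rounding error carries into the p-value.
num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
den = math.sqrt(n * sum(a * a for a in x) - sum(x) ** 2) * math.sqrt(
    n * sum(b * b for b in y) - sum(y) ** 2
)
r = num / den
t = r / math.sqrt((1 - r ** 2) / (n - 2))

p_value = two_tailed_p_df5(t)
print(round(p_value, 3))  # 0.921
```

With unrounded intermediate values the result agrees with the reported 0.921. Feeding in the rounded t = 0.103 gives about 0.922 instead, a rounding effect that leaves the decision unchanged.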

Since the p-value (0.921) is greater than the significance level of 0.05, we fail to reject the null hypothesis.

Therefore, there is not enough evidence to conclude that the variables x (duration) and y (interval after) are linearly correlated.
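The exercise also allows the decision to be made with critical values of r. As a cross-check, the sketch below converts a two-tailed critical t value into a critical r by solving the test-statistic formula for r. The value 2.571 (two-tailed, 0.05 significance level, 5 degrees of freedom) is an assumption taken from a standard t table, not from the original solution, and the helper name `critical_r` is illustrative.

```python
import math


def critical_r(t_crit, n):
    """Convert a two-tailed critical t value (n - 2 df) into a critical r."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


T_CRIT_DF5 = 2.571  # assumed standard t-table value: alpha = 0.05, two-tailed, df = 5
r = 0.046           # sample correlation coefficient from Step 3

r_crit = critical_r(T_CRIT_DF5, 7)
reject_h0 = abs(r) > r_crit

print(round(r_crit, 2), reject_h0)  # 0.75 False
```

Because |r| = 0.046 is far below the critical value of about 0.75, this route leads to the same conclusion as the p-value: the null hypothesis is not rejected.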


