Chapter 9: Q25 E (page 380)

The accompanying data consists of prices (\$) for one sample of California cabernet sauvignon wines that received ratings of 93 or higher in the May 2013 issue of Wine Spectator and another sample of California cabernets that received ratings of 89 or lower in the same issue.
$\begin{array}{*{20}{c}}{ \ge 93:}&{100}&{100}&{60}&{135}&{195}&{195}&{}\\{}&{125}&{135}&{95}&{42}&{75}&{72}&{}\\{ \le 89:}&{80}&{75}&{75}&{85}&{75}&{35}&{85}\\{}&{65}&{45}&{100}&{28}&{38}&{50}&{28}\end{array}$
Assume that these are both random samples of prices from the population of all wines recently reviewed that received ratings of at least 93 and at most 89 , respectively.
a. Investigate the plausibility of assuming that both sampled populations are normal.
b. Construct a comparative boxplot. What does it suggest about the difference in true average prices?
c. Calculate a confidence interval at the$95\% $confidence level to estimate the difference between${\mu _1}$, the mean price in the higher rating population, and${\mu _2}$, the mean price in the lower rating population. Is the interval consistent with the statement "Price rarely equates to quality" made by a columnist in the cited issue of the magazine?

Short Answer

Expert verified

(a) Plausible

(b) A large difference .

The interval is not consistent with the statement.

Step by step solution

a)Step 1: Determine the normal probability plot

Given:

$\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}$

If we want to perform a two-sample$t$test, then we require that both sampling distributions of the sample mean are approximately normal.

FIRST DATA SET

We will create a normal probability plot.

The data values are on the horizontal axis and the standardized normal scores are on the vertical axis.

If the data contains$n$data values, then the standardized normal scores are the z-scores in the normal probability table of the appendix corresponding to an area of$\frac{{j - 0.5}}{n}$(or the closest area) with$j \in \{ 1,2,3, \ldots ,n\} $.

The smallest standardized score corresponds with the smallest data value, the second smallest standardized score corresponds with the second smallest data value, and so on.

b)Step 2: Determine the normal probability plot

$\begin{array}{l}{{\bar x}_1} = 13.4\\{{\bar x}_2} = 9.7\\{n_1} = 65\\{n_2} = 50\\{\sigma _{{{\bar x}_1}}} = 2.05 \Rightarrow {s_1} = {\sigma _{{{\bar x}_1}}}\sqrt n = 2.05\sqrt {65} \approx 16.5276\\{\sigma _{{{\bar x}_2}}} = 1.76 \Rightarrow {s_2} = {\sigma _{{{\bar x}_2}}}\sqrt n = 1.76\sqrt {50} \approx 12.4451\end{array}$

Let us assume: $\alpha = 0.05$

Given claim: exceeds

The claim is either the null hypothesis or the alternative hypothesis. The null hypothesis and the alternative hypothesis state the opposite of each other. The null hypothesis needs to contain the value mentioned in the claim.

$\begin{array}{l}{H_0}:{\mu _1} = {\mu _2}\\{H_a}:{\mu _1} > {\mu _2}\end{array}$

SECOND DATA SET

We will create a normal probability plot.

The data values are on the horizontal axis and the standardized normal scores are on the vertical axis.

If the data contains $n$ data values, then the standardized normal scores are the z-scores in the normal probability table of the appendix corresponding to an area of $\frac{{j - 0.5}}{n}$ (or the closest area) with$j \in \{ 1,2,3, \ldots ,n\} $.

The smallest standardized score corresponds with the smallest data value, the second smallest standardized score corresponds with the second smallest data value, and so on.

If the pattern in the normal probability plot is roughly linear and does not contain strong curvature, then the population distribution is approximately normal.

Both probability plots do not contain strong curvature and are roughly linear, thus both population distributions are approximately normal.

Since the population distributions are approximately normal, the sampling distribution of the sample mean(s) $\bar x$ are also approximately normal. and thus it is appropriate to use the two-sample$t$ test.

B)Step 3: Fild the quartile for first data set

Given:

$\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}$

Sort the data values from smallest to largest:

$\begin{array}{l} \ge 93:42,60,72,75,95,100,100,125,135,135,195,195\\ \le 89:28,28,35,38,45,50,65,75,75,75,80,85,85,100\end{array}$

FIRST DATA SET

The minimum is $42.$

Since the number of data values is even, the median is the average of the two middle values of the sorted data set:

$M = {Q_2} = \frac{{100 + 100}}{2} = 100$

The first quartile is the median of the data values below the median (or at $25\% $ of the data):

${Q_1} = \frac{{72 + 75}}{2} = 73.5$

The third quartile is the median of the data values above the median (or at $75\% $ of the data):

${Q_3} = \frac{{135 + 135}}{2} = 135$

The maximum is $195.$

Find the quartile for second data set

SECOND DATA SET

The minimum is$28.$

Since the number of data values is even, the median is the average of the two middle values of the sorted data set:

$M = {Q_2} = \frac{{65 + 75}}{2} = 70$

The first quartile is the median of the data values below the median (or at$25\% $of the data):

${Q_1} = 38$

The third quartile is the median of the data values above the median (or at$75\% $of the data):

${Q_3} = 80$

The maximum is $100$ .

Mapping the graph

The whiskers of the boxplot are at the minimum and maximum value. The box starts at the first quartile, ends at the third quartile and has a vertical line at the median.

The first quartile is at $25\% $ of the sorted data list, the median at $50\% $ and the third quartile at$75\% $.

There appears to be a large difference between the true average prices, because the vertical lines corresponding to the median in the box of th boxplots lie are not roughly at the same location (on the horizontal axis)

c)Step 6: Determine the standard deviation

Given:

$\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}$

The mean is the sum of all values divided by the number of values:

$\begin{array}{l}{{\bar x}_1} = \frac{{100 + 100 + 60 + \ldots + 42 + 75 + 72}}{{12}} = 110.75\\{{\bar x}_2} = \frac{{80 + 75 + 75 + \ldots + 38 + 50 + 28}}{{14}} \approx 61.7143\end{array}$

The variance is the sum of squared deviations from the mean divided by$n - 1$. The standard deviation is the square root of the variance: $\begin{array}{l}{s_1} = \sqrt {\frac{{{{(100 - 110.75)}^2} + \ldots . + {{(72 - 110.75)}^2}}}{{12 - 1}}} \approx 48.7445\\{s_2} = \sqrt {\frac{{{{(80 - 61.7143)}^2} + \ldots . + {{(28 - 61.7143)}^2}}}{{14 - 1}}} \approx 23.8438\end{array}$

Find the endpoint of the confidence interval

Given:

$c = 95\% = 0.95$

Determine the degrees of freedom (rounded down to the nearest integer):

$\Delta = \frac{{{{\left( {\frac{{s_1^2}}{{{n_1}}} + \frac{{s_2^2}}{{{n_2}}}} \right)}^2}}}{{\frac{{{{\left( {s_1^2/{n_1}} \right)}^2}}}{{{n_1} - 1}} + \frac{{{{\left( {s_2^2/{n_2}} \right)}^2}}}{{{n_2} - 1}}}} = \frac{{{{\left( {\frac{{{{48.7445}^2}}}{{12}} + \frac{{{{23.8438}^2}}}{{14}}} \right)}^2}}}{{\frac{{{{\left( {{{48.7445}^2}/12} \right)}^2}}}{{12 - 1}} + \frac{{{{\left( {{{23.8438}^2}/14} \right)}^2}}}{{14 - 1}}}} \approx 15$

Determine the t-value by looking in the row starting with degrees of freedom $df = 15$ and in the column with $1 - c/2 = 0.025$ in the Student's t distribution table in the appendix:

${t_{\alpha /2}} = 2.131$

The margin of error is then:

$E = {t_{\alpha /2}} \cdot \sqrt {\frac{{s_1^2}}{{{n_1}}} + \frac{{s_2^2}}{{{n_2}}}} = 2.131 \cdot \sqrt {\frac{{{{48.7445}^2}}}{{12}} + \frac{{{{23.8438}^2}}}{{14}}} \approx 32.9177$

The endpoints of the confidence interval for ${\mu _1} - {\mu _2}$ are: $\begin{array}{l}\left( {{{\bar x}_1} - {{\bar x}_2}} \right) - E = (110.75 - 61.7143) - 32.9177 = 49.0357 - 32.9177 = 16.1180\\\left( {{{\bar x}_1} - {{\bar x}_2}} \right) + E = (110.75 - 61.7143) + 32.9177 = 49.0357 + 32.9177 = 81.9534\end{array}$

Unlock Step-by-Step Solutions & Ace Your Exams!

Full Textbook Solutions
Get detailed explanations and key concepts
Unlimited Al creation
Al flashcards, explanations, exams and more...
Ads-free access
To over 500 millions flashcards
Money-back guarantee
We refund you if you fail your exam.

Start your free trial

Over 30 million students worldwide already upgrade their learning with Vaia!

Short Answer

Step by step solution

a)Step 1: Determine the normal probability plot

b)Step 2: Determine the normal probability plot

B)Step 3: Fild the quartile for first data set

Find the quartile for second data set

Mapping the graph

c)Step 6: Determine the standard deviation

Find the endpoint of the confidence interval

One App. One Place for Learning.

Most popular questions from this chapter

Recommended explanations on Math Textbooks

Applied Mathematics

Logic and Functions

Pure Maths

Mechanics Maths

Statistics

Theoretical and Mathematical Physics

Study anywhere. Anytime. Across all devices.

Company

Product

Help