The accompanying data consists of prices (\$) for one sample of California cabernet sauvignon wines that received ratings of 93 or higher in the May 2013 issue of Wine Spectator and another sample of California cabernets that received ratings of 89 or lower in the same issue.

\(\begin{array}{lrrrrrrr}{ \ge 93:}&{100}&{100}&{60}&{135}&{195}&{195}&{}\\{}&{125}&{135}&{95}&{42}&{75}&{72}&{}\\{ \le 89:}&{80}&{75}&{75}&{85}&{75}&{35}&{85}\\{}&{65}&{45}&{100}&{28}&{38}&{50}&{28}\end{array}\)

Assume that these are both random samples of prices from the populations of all recently reviewed wines that received ratings of at least 93 and at most 89, respectively.

a. Investigate the plausibility of assuming that both sampled populations are normal.

b. Construct a comparative boxplot. What does it suggest about the difference in true average prices?

c. Calculate a confidence interval at the \(95\% \) confidence level to estimate the difference between \({\mu _1}\), the mean price in the higher rating population, and \({\mu _2}\), the mean price in the lower rating population. Is the interval consistent with the statement "Price rarely equates to quality" made by a columnist in the cited issue of the magazine?

Short Answer

(a) Plausible

(b) The boxplot suggests a large difference in true average prices.

(c) \((16.1180,81.9534)\)

The interval is not consistent with the statement.

Step by step solution

01

a) Step 1: Normal probability plot for the first data set

Given:

\(\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}\)

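If we want to use a two-sample \(t\) procedure, then we require that both sampling distributions of the sample mean are approximately normal. With samples this small, that means the sampled populations themselves should be approximately normal.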

FIRST DATA SET

We will create a normal probability plot.

The data values are on the horizontal axis and the standardized normal scores are on the vertical axis.

If the data contains \(n\) data values, then the standardized normal scores are the z-scores in the normal probability table of the appendix corresponding to an area of \(\frac{{j - 0.5}}{n}\) (or the closest area) with \(j \in \{ 1,2,3, \ldots ,n\} \).

The smallest standardized score corresponds with the smallest data value, the second smallest standardized score corresponds with the second smallest data value, and so on.

02

a) Step 2: Normal probability plot for the second data set

SECOND DATA SET

We create a normal probability plot for the second sample in the same way: the sorted data values go on the horizontal axis and the corresponding standardized normal scores, taken from the areas \(\frac{{j - 0.5}}{n}\) with \(n = 14\), go on the vertical axis.

If the pattern in the normal probability plot is roughly linear and does not contain strong curvature, then the population distribution is approximately normal.

Neither probability plot shows strong curvature and both are roughly linear, so it is plausible that both population distributions are approximately normal.

Since the population distributions are approximately normal, the sampling distributions of the sample means \(\bar x\) are also approximately normal, and thus it is appropriate to use a two-sample \(t\) procedure.
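The plotting positions described above can be sketched in Python. This is a minimal illustration, not the textbook's procedure: it uses the standard library's `NormalDist().inv_cdf` in place of the appendix table, and the helper names (`normal_scores`, `plot_points`, `plot_correlation`) are hypothetical. The correlation between the sorted data and the normal scores is added here only as a rough numeric stand-in for judging linearity by eye.

```python
from math import sqrt
from statistics import NormalDist, mean

high = [100, 100, 60, 135, 195, 195, 125, 135, 95, 42, 75, 72]   # ratings >= 93
low = [80, 75, 75, 85, 75, 35, 85, 65, 45, 100, 28, 38, 50, 28]  # ratings <= 89


def normal_scores(n):
    """z-scores with area (j - 0.5)/n to their left, for j = 1..n."""
    return [NormalDist().inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]


def plot_points(data):
    """Pair the k-th smallest data value with the k-th smallest normal score."""
    return list(zip(sorted(data), normal_scores(len(data))))


def plot_correlation(data):
    """Pearson correlation of the plotted points (assumed linearity check)."""
    pts = plot_points(data)
    xs = [x for x, _ in pts]
    zs = [z for _, z in pts]
    x_bar, z_bar = mean(xs), mean(zs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in pts)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    szz = sum((z - z_bar) ** 2 for z in zs)
    return sxz / sqrt(sxx * szz)


if __name__ == "__main__":
    for label, sample in (("high", high), ("low", low)):
        print(label, "correlation of plot points:", round(plot_correlation(sample), 3))
```

A correlation close to 1 corresponds to a roughly linear plot; it does not replace looking at the plot itself.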

03

b) Step 3: Find the quartiles for the first data set

Given:

\(\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}\)

Sort the data values from smallest to largest:

\(\begin{array}{l} \ge 93:42,60,72,75,95,100,100,125,135,135,195,195\\ \le 89:28,28,35,38,45,50,65,75,75,75,80,85,85,100\end{array}\)

FIRST DATA SET

The minimum is \(42.\)

Since the number of data values is even, the median is the average of the two middle values of the sorted data set:

\(M = {Q_2} = \frac{{100 + 100}}{2} = 100\)

The first quartile is the median of the data values below the median (or at \(25\% \) of the data):

\({Q_1} = \frac{{72 + 75}}{2} = 73.5\)

The third quartile is the median of the data values above the median (or at \(75\% \) of the data):

\({Q_3} = \frac{{135 + 135}}{2} = 135\)

The maximum is \(195.\)

04

b) Step 4: Find the quartiles for the second data set

SECOND DATA SET

The minimum is \(28\).

Since the number of data values is even, the median is the average of the two middle values of the sorted data set:

\(M = {Q_2} = \frac{{65 + 75}}{2} = 70\)

The first quartile is the median of the data values below the median (or at \(25\% \) of the data):

\({Q_1} = 38\)

The third quartile is the median of the data values above the median (or at \(75\% \) of the data):

\({Q_3} = 80\)

The maximum is \(100\).
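The quartile calculations for both samples can be checked with a short Python sketch. It assumes the convention used above (each quartile is the median of the lower or upper half of the sorted data), and the function name `five_number_summary` is hypothetical. Other software may use a different quartile convention and give slightly different values.

```python
from statistics import median

high = [100, 100, 60, 135, 195, 195, 125, 135, 95, 42, 75, 72]   # ratings >= 93
low = [80, 75, 75, 85, 75, 35, 85, 65, 45, 100, 28, 38, 50, 28]  # ratings <= 89


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    s = sorted(data)
    n = len(s)
    lower = s[: n // 2]        # values below the median position
    upper = s[(n + 1) // 2:]   # values above the median position
    return (s[0], median(lower), median(s), median(upper), s[-1])


if __name__ == "__main__":
    print("high:", five_number_summary(high))
    print("low: ", five_number_summary(low))
```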

05

b) Step 5: Draw the comparative boxplot

The whiskers of each boxplot extend to the minimum and maximum values. The box starts at the first quartile, ends at the third quartile, and has a vertical line at the median.

The first quartile is at \(25\% \) of the sorted data list, the median at \(50\% \), and the third quartile at \(75\% \).

There appears to be a large difference between the true average prices, because the vertical lines marking the medians of the two boxplots (\(100\) for the higher-rated wines versus \(70\) for the lower-rated wines) are not at roughly the same location on the horizontal axis.

06

c) Step 6: Determine the means and standard deviations

Given:

\(\begin{array}{l} \ge 93:100,100,60,135,195,195,125,135,95,42,75,72\\ \le 89:80,75,75,85,75,35,85,65,45,100,28,38,50,28\end{array}\)

The mean is the sum of all values divided by the number of values:

\(\begin{array}{l}{{\bar x}_1} = \frac{{100 + 100 + 60 + \ldots + 42 + 75 + 72}}{{12}} = 110.75\\{{\bar x}_2} = \frac{{80 + 75 + 75 + \ldots + 38 + 50 + 28}}{{14}} \approx 61.7143\end{array}\)

The variance is the sum of squared deviations from the mean divided by \(n - 1\). The standard deviation is the square root of the variance:

\(\begin{array}{l}{s_1} = \sqrt {\frac{{{{(100 - 110.75)}^2} + \ldots + {{(72 - 110.75)}^2}}}{{12 - 1}}} \approx 48.7445\\{s_2} = \sqrt {\frac{{{{(80 - 61.7143)}^2} + \ldots + {{(28 - 61.7143)}^2}}}{{14 - 1}}} \approx 23.8438\end{array}\)
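As a quick check of the summary statistics, the following Python sketch recomputes the sample means and standard deviations with the standard library. `statistics.stdev` divides by \(n - 1\), which matches the definition above; the variable names are only illustrative.

```python
from statistics import mean, stdev

high = [100, 100, 60, 135, 195, 195, 125, 135, 95, 42, 75, 72]   # ratings >= 93
low = [80, 75, 75, 85, 75, 35, 85, 65, 45, 100, 28, 38, 50, 28]  # ratings <= 89

x_bar_1, x_bar_2 = mean(high), mean(low)   # sample means
s_1, s_2 = stdev(high), stdev(low)         # sample SDs (divisor n - 1)

if __name__ == "__main__":
    print(f"x_bar_1 = {x_bar_1:.4f}, s_1 = {s_1:.4f}")
    print(f"x_bar_2 = {x_bar_2:.4f}, s_2 = {s_2:.4f}")
```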

07

c) Step 7: Find the endpoints of the confidence interval

Given:

\(c = 95\% = 0.95\)

Determine the degrees of freedom (rounded down to the nearest integer):

\(\nu = \frac{{{{\left( {\frac{{s_1^2}}{{{n_1}}} + \frac{{s_2^2}}{{{n_2}}}} \right)}^2}}}{{\frac{{{{\left( {s_1^2/{n_1}} \right)}^2}}}{{{n_1} - 1}} + \frac{{{{\left( {s_2^2/{n_2}} \right)}^2}}}{{{n_2} - 1}}}} = \frac{{{{\left( {\frac{{{{48.7445}^2}}}{{12}} + \frac{{{{23.8438}^2}}}{{14}}} \right)}^2}}}{{\frac{{{{\left( {{{48.7445}^2}/12} \right)}^2}}}{{12 - 1}} + \frac{{{{\left( {{{23.8438}^2}/14} \right)}^2}}}{{14 - 1}}}} \approx 15.4\), which rounds down to \(\nu = 15\).

Determine the t-value by looking in the row starting with degrees of freedom \(df = 15\) and in the column with \((1 - c)/2 = 0.025\) in the Student's t distribution table in the appendix:

\({t_{\alpha /2}} = 2.131\)

The margin of error is then:

\(E = {t_{\alpha /2}} \cdot \sqrt {\frac{{s_1^2}}{{{n_1}}} + \frac{{s_2^2}}{{{n_2}}}} = 2.131 \cdot \sqrt {\frac{{{{48.7445}^2}}}{{12}} + \frac{{{{23.8438}^2}}}{{14}}} \approx 32.9177\)

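The endpoints of the confidence interval for \({\mu _1} - {\mu _2}\) are:

\(\begin{array}{l}\left( {{{\bar x}_1} - {{\bar x}_2}} \right) - E = (110.75 - 61.7143) - 32.9177 = 49.0357 - 32.9177 = 16.1180\\\left( {{{\bar x}_1} - {{\bar x}_2}} \right) + E = (110.75 - 61.7143) + 32.9177 = 49.0357 + 32.9177 = 81.9534\end{array}\)

The \(95\% \) confidence interval is therefore \((16.1180, 81.9534)\). Because the entire interval lies above zero, the data suggest that the higher-rated wines cost more on average, which is not consistent with the statement "Price rarely equates to quality."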
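The interval calculation can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the critical value \(2.131\) is taken from the table lookup above rather than computed, and the names `welch_df`, `welch_ci`, and `T_CRIT` are hypothetical.

```python
from math import floor, sqrt
from statistics import mean, variance

high = [100, 100, 60, 135, 195, 195, 125, 135, 95, 42, 75, 72]   # ratings >= 93
low = [80, 75, 75, 85, 75, 35, 85, 65, 45, 100, 28, 38, 50, 28]  # ratings <= 89

T_CRIT = 2.131  # tabled t value for df = 15 and upper-tail area 0.025 (from the text)


def welch_df(a, b):
    """Unrounded degrees of freedom for the two-sample t procedure."""
    va, vb = variance(a) / len(a), variance(b) / len(b)  # s^2 / n for each sample
    return (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))


def welch_ci(a, b, t_crit):
    """Confidence interval for mu_1 - mu_2 given a tabled critical value."""
    diff = mean(a) - mean(b)
    margin = t_crit * sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff - margin, diff + margin


if __name__ == "__main__":
    print("df (rounded down):", floor(welch_df(high, low)))
    lower, upper = welch_ci(high, low, T_CRIT)
    print(f"95% CI for mu_1 - mu_2: ({lower:.4f}, {upper:.4f})")
```

Small differences in the last decimal places relative to the hand calculation come from rounding the intermediate standard deviations.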

Most popular questions from this chapter

a. Show for the upper-tailed test with \({\sigma _1}\) and \({\sigma _2}\) known that as either \(m\) or \(n\) increases, \(\beta \) decreases when \({\mu _1} - {\mu _2} > {\Delta _0}\).

b. For the case of equal sample sizes \(\left( {m = n} \right)\) and fixed \(\alpha \), what happens to the necessary sample size \(n\) as \(\beta \) is decreased, where \(\beta \) is the desired type II error probability at a fixed alternative?

The article "Flexure of Concrete Beams Reinforced with Advanced Composite Orthogrids" (J. of Aerospace Engr., 1997: 7-15) gave the accompanying data on ultimate load (kN) for two different types of beams.

a. Assuming that the underlying distributions are normal, calculate and interpret a \(99\% \) CI for the difference between true average load for the fiberglass beams and that for the carbon beams.

b. Does the upper limit of the interval you calculated in part (a) give a \(99\% \) upper confidence bound for the difference between the two \(\mu \)'s? If not, calculate such a bound. Does it strongly suggest that true average load for the carbon beams is more than that for the fiberglass beams? Explain.

Tensile-strength tests were carried out on two different grades of wire rod ("Fluidized Bed Patenting of Wire Rods," Wire J., June 1977: 56-61), resulting in the accompanying data.

\(\begin{array}{llll}\hline \text{Grade}&\text{Sample Size}&\text{Sample Mean (kg/mm}^2\text{)}&\text{Sample SD}\\\hline \text{AISI 1064}&{m = 129}&{\bar x = 107.6}&{{s_1} = 1.3}\\\text{AISI 1078}&{n = 129}&{\bar y = 123.6}&{{s_2} = 2.0}\\\hline \end{array}\)

a. Does the data provide compelling evidence for concluding that true average strength for the \(1078\) grade exceeds that for the \(1064\) grade by more than \(10\) kg/mm\(^2\)? Test the appropriate hypotheses using a significance level of \(.01\).

b. Estimate the difference between true average strengths for the two grades in a way that provides information about precision and reliability.

A mechanical engineer wishes to compare strength properties of steel beams with similar beams made with a particular alloy. The same number of beams, \(n\), of each type will be tested. Each beam will be set in a horizontal position with a support on each end, a force of \(2500\) lb will be applied at the center, and the deflection will be measured. From past experience with such beams, the engineer is willing to assume that the true standard deviation of deflection for both types of beam is \(.05\) in. Because the alloy is more expensive, the engineer wishes to test at level \(.01\) whether it has smaller average deflection than the steel beam. What value of \(n\) is appropriate if the desired type II error probability is \(.05\) when the difference in true average deflection favors the alloy by \(.04\) in.?

The accompanying summary data on total cholesterol level (mmol/l) was obtained from a sample of Asian postmenopausal women who were vegans and another sample of such women who were omnivores ("Vegetarianism, Bone Loss, and Vitamin D: A Longitudinal Study in Asian Vegans and Non-Vegans," European J. of Clinical Nutr., 2012: 75–82).

\(\begin{array}{llll}\hline \text{Diet}&\text{Sample Size}&\text{Sample Mean}&\text{Sample SD}\\\hline \text{Vegan}&{88}&{5.10}&{1.07}\\\text{Omnivore}&{93}&{5.55}&{1.10}\\\hline \end{array}\)

Calculate and interpret a \(99\% \) CI for the difference between population mean total cholesterol level for vegans and population mean total cholesterol level for omnivores (the cited article included a \(95\% \) CI). (Note: The article described a more sophisticated statistical analysis for investigating bone density loss taking into account other characteristics ("covariates") such as age, body weight, and various nutritional factors; the resulting CI included 0, suggesting no diet effect.)
