Data 4.3 on page 265 introduces a situation in which a restaurant chain is measuring the levels of arsenic in chicken from its suppliers. The question is whether there is evidence that the mean level of arsenic is greater than \(80 \mathrm{ppb},\) so we are testing \(H_{0}: \mu=80\) vs \(H_{a}:\) \(\mu>80,\) where \(\mu\) represents the average level of arsenic in all chicken from a certain supplier. It takes money and time to test for arsenic, so samples are often small. Suppose \(n=6\) chickens from one supplier are tested, and the levels of arsenic (in ppb) are: \(\begin{array}{llllll}68, & 75, & 81, & 93, & 95, & 134\end{array}\) (a) What is the sample mean for the data? (b) Translate the original sample data by the appropriate amount to create a new dataset in which the null hypothesis is true. How do the sample size and standard deviation of this new dataset compare to the sample size and standard deviation of the original dataset? (c) Write the six new data values from part (b) on six cards. Sample from these cards with replacement to generate one randomization sample. (Select a card at random, record the value, put it back, select another at random, until you have a sample of size \(6,\) to match the original sample size.) List the values in the sample and give the sample mean. (d) Generate 9 more simulated samples, for a total of 10 samples for a randomization distribution. Give the sample mean in each case and create a small dotplot. Use an arrow to locate the original sample mean on your dotplot.

Short Answer

The original sample mean is 91 ppb. To test \(H_0: \mu = 80\), the data are shifted down by 11 so that their mean equals 80, and samples of size 6 are then drawn repeatedly, with replacement, from the shifted data. The means of these simulated samples are displayed on a dotplot, with the original sample mean marked by an arrow.

Step by step solution

01

Calculating Sample Mean

First, calculate the sample mean by adding all the given values and dividing by the total number of values, which is 6 in this case. The data provided includes the arsenic levels of 6 chickens: 68, 75, 81, 93, 95, 134. Hence, the sample mean \(\bar{x} = \frac{(68 + 75 + 81 + 93 + 95 + 134)}{6} = 91\).
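
As a quick check of the arithmetic, here is a minimal Python sketch. The variable names (`arsenic_ppb`, `sample_mean`) are chosen only for illustration; the exercise itself does not prescribe any software.

```python
# Arsenic levels (ppb) for the six chickens in the exercise.
arsenic_ppb = [68, 75, 81, 93, 95, 134]

total = sum(arsenic_ppb)      # 546
n = len(arsenic_ppb)          # 6
sample_mean = total / n       # 546 / 6

print(f"sum = {total}, n = {n}, sample mean = {sample_mean}")
```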
02

Generate New Dataset

Translate the original sample data to create a new dataset in which the null hypothesis is true (\(\mu = 80\)). Subtract the original sample mean (91) from each data point and add the hypothesized mean (80), which amounts to subtracting 11 from each value. The new dataset is: 57, 64, 70, 82, 84, 123, with mean 80. The sample size remains the same (\(n = 6\)) because no data points are added or removed. The standard deviation also remains the same, because adding or subtracting a constant shifts every data point equally and does not change the spread.
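
The shift and the two comparisons can be verified with a short Python sketch. It assumes the sample standard deviation (divisor \(n - 1\)), which is what `statistics.stdev` computes; the names `original`, `shift`, and `shifted` are illustrative.

```python
from statistics import mean, stdev

original = [68, 75, 81, 93, 95, 134]
null_mean = 80

# Shift every value by the same constant so the new mean equals the null value.
shift = null_mean - mean(original)        # 80 - 91 = -11
shifted = [x + shift for x in original]

print(shifted)                            # [57, 64, 70, 82, 84, 123]
print(mean(shifted))                      # 80
print(len(original) == len(shifted))      # True: sample size unchanged
print(round(stdev(original), 2), round(stdev(shifted), 2))  # same value twice
```

Both standard deviations come out to about 23.47 ppb, confirming that the translation leaves the spread unchanged.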
03

Generate Randomization Sample

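For a randomization sample, select 6 data points from the new dataset with replacement (meaning a value can be picked more than once). Suppose the randomly selected values are 57, 70, 123, 70, 84, 123. The sample mean of this randomization sample is \(\frac{57 + 70 + 123 + 70 + 84 + 123}{6} = \frac{527}{6} \approx 87.8\). Your own sample and its mean will differ, since the cards are drawn at random.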
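
Drawing cards with replacement can be imitated in Python with `random.choices`. This is only a sketch: the seed value is an arbitrary assumption made so that the run is repeatable, and a different seed gives a different sample.

```python
import random
from statistics import mean

shifted = [57, 64, 70, 82, 84, 123]   # null-shifted data from Step 2

rng = random.Random(2024)             # arbitrary seed, for reproducibility only
one_sample = rng.choices(shifted, k=len(shifted))   # sampling WITH replacement
print(one_sample, round(mean(one_sample), 2))

# Mean of the example sample written out in Step 3.
example = [57, 70, 123, 70, 84, 123]
print(round(mean(example), 2))        # 87.83
```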
04

Repeat Random Sampling

Repeat step 3 nine more times to have a total of ten samples for a randomization distribution. For each of these nine new samples, calculate the sample mean.
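
If you prefer to simulate the ten samples rather than draw cards, the repetition could be sketched as follows. The helper `randomization_means` is a hypothetical name introduced here, and the seed is arbitrary.

```python
import random
from statistics import mean

shifted = [57, 64, 70, 82, 84, 123]   # null-shifted data from Step 2

def randomization_means(values, n_samples, rng):
    """Return the mean of each of n_samples resamples drawn with replacement."""
    k = len(values)                   # match the original sample size
    return [mean(rng.choices(values, k=k)) for _ in range(n_samples)]

rng = random.Random(1)                # arbitrary seed, for reproducibility only
ten_means = randomization_means(shifted, 10, rng)
for i, m in enumerate(ten_means, start=1):
    print(f"sample {i:2d}: mean = {m:.2f}")
```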
05

Create a Dotplot

A dotplot can be used to visually display the variations in the sample means for the ten random samples. On this dotplot, use an arrow to locate the original sample mean (91).
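
A rough text version of such a dotplot can be produced without any plotting library. In the sketch below the ten means are simulated with an arbitrary seed, each mean is rounded to the nearest integer to place its dot, and `text_dotplot` is a hypothetical helper name; a hand-drawn plot is equally valid for the exercise.

```python
import random
from collections import Counter
from statistics import mean

shifted = [57, 64, 70, 82, 84, 123]   # null-shifted data from Step 2
observed_mean = 91                    # original sample mean

rng = random.Random(1)                # arbitrary seed, for reproducibility only
means = [mean(rng.choices(shifted, k=6)) for _ in range(10)]

def text_dotplot(sample_means, observed):
    """Sideways dotplot: one row per integer value, one '*' per sample mean."""
    counts = Counter(round(m) for m in sample_means)
    marked = round(observed)
    low = min(min(counts), marked)
    high = max(max(counts), marked)
    rows = []
    for value in range(low, high + 1):
        arrow = "  <-- original sample mean" if value == marked else ""
        rows.append(f"{value:>4} | {'*' * counts[value]}{arrow}")
    return "\n".join(rows)

plot = text_dotplot(means, observed_mean)
print(plot)
```

Rows with no stars are values where no simulated mean landed; the arrow row shows where 91 sits relative to the simulated means.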

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Randomization Test
A randomization test is a simulation-based method for testing a hypothesis without assuming a particular distribution for the data. It generates many simulated samples in a way that is consistent with the null hypothesis, computes the statistic of interest for each one, and checks how often the simulated statistics are at least as extreme as the observed statistic. When two groups are compared, the simulation usually reshuffles the group labels (a permutation test). For a single mean, as in this exercise, the data are first shifted so that their mean equals the null value, and samples are then drawn from the shifted data with replacement.

Think of the six cards in part (c): each draw of six cards with replacement is one possible sample from a world in which the null hypothesis holds. With enough draws, you get a sense of whether the original sample mean stands out as unusual or is just another plausible outcome. Because the method does not rely on a normal model, it is useful for small samples and skewed data such as the chicken arsenic levels.
Sample Mean Calculation
The sample mean, symbolized as \(\bar{x}\), is a central concept in statistics and represents the average of a set of observations. It is calculated by summing all the values in a sample and dividing by the number of observations in that sample. For instance, with the chicken supplier's arsenic levels, we calculate the sample mean by adding \(68 + 75 + 81 + 93 + 95 + 134\) and dividing by 6, leading to a sample mean of 91 ppb.

This calculation is a simple but fundamental step in many statistical analyses, including hypothesis testing. The sample mean is the observed estimate of the population mean \(\mu\), the quantity being tested in \(H_0: \mu = 80\) vs \(H_a: \mu > 80\). In the randomization test it plays two roles: its distance from the null value (\(91 - 80 = 11\)) determines how far the data are shifted, and it is the observed statistic that is compared against the simulated sample means.
Standard Deviation
Standard deviation is a measure of the dispersion or spread of a set of data points in a sample. It reveals how much variation there is from the average (mean). A low standard deviation means that data points are generally close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.

When we add or subtract a constant from every value so that the dataset satisfies the null hypothesis, the standard deviation does not change. The operation shifts all data points by the same amount, so their distances from each other and from the mean stay the same. In the arsenic example, the shifted data therefore have exactly the same spread as the original data. This matters because the randomization samples should reflect the variability actually observed in the sample while being centered at the null value.
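
A short derivation makes this precise. The notation is introduced here for illustration: \(c\) is the constant added to each value (\(c = -11\) in the arsenic example), \(y_i = x_i + c\) are the shifted values, and \(s\) denotes the sample standard deviation with divisor \(n - 1\).

\[
\bar{y} = \frac{1}{n}\sum_{i=1}^{n}(x_i + c) = \bar{x} + c, \qquad y_i - \bar{y} = (x_i + c) - (\bar{x} + c) = x_i - \bar{x}
\]

\[
s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = s_x
\]

Every deviation from the mean is unchanged, so the standard deviation is unchanged as well.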
Null Hypothesis
At the heart of hypothesis testing lies the null hypothesis, denoted \(H_0\). It is a statement of no effect or no difference that serves as the starting point for significance testing. In our example, the null hypothesis states that the mean level of arsenic in all chicken from the supplier is 80 ppb (\(H_0: \mu = 80\)).

The null hypothesis is vital because it allows us to calculate the probability of observing a test statistic at least as extreme as the one we observed, given that the null hypothesis is true. If this probability (the p-value) is very low, we have evidence against \(H_0\) and may reject it in favor of the alternative hypothesis, \(H_a\). In the context of the chicken arsenic levels, if the observed sample mean of 91 ppb falls far out in the upper tail of the randomization distribution, we have evidence against \(H_0: \mu = 80\) and in favor of \(H_a: \mu > 80\).
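
With software, the p-value described above can be estimated by generating far more than ten randomization samples. The sketch below goes beyond what the exercise asks for; the number of samples (10,000) and the seed are assumptions chosen for illustration, and the estimate will vary slightly from run to run with other seeds.

```python
import random
from statistics import mean

shifted = [57, 64, 70, 82, 84, 123]   # null-shifted data (mean 80)
observed_mean = 91                    # original sample mean
n_samples = 10_000                    # assumed number of randomization samples

rng = random.Random(0)                # arbitrary seed, for reproducibility only
means = [mean(rng.choices(shifted, k=len(shifted))) for _ in range(n_samples)]

# Right-tail proportion, because the alternative hypothesis is mu > 80.
p_value = sum(m >= observed_mean for m in means) / n_samples
print(f"estimated p-value: {p_value:.3f}")
```

The p-value is the fraction of simulated means that are at least as large as 91; only the upper tail is counted because the test is one-sided.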

Most popular questions from this chapter

You roll a die 60 times and record the sample proportion of 5's, and you want to test whether the die is biased to give more 5's than a fair die would ordinarily give. To find the p-value for your sample data, you create a randomization distribution of proportions of 5's in many simulated samples of size 60 with a fair die. (a) State the null and alternative hypotheses. (b) Where will the center of the distribution be? Why? (c) Give an example of a sample proportion for which the number of 5's obtained is less than what you would expect in a fair die. (d) Will your answer to part (c) lie on the left or the right of the center of the randomization distribution? (e) To find the p-value for your answer to part (c), would you look at the left, right, or both tails? (f) For your answer in part (c), can you say anything about the size of the p-value?

Polling 1000 people in a large community to determine if there is evidence for the claim that the percentage of people in the community living in a mobile home is greater than \(10 \%\).

A study suggests that exposure to UV rays through the car window may increase the risk of skin cancer. \(^{52}\) The study reviewed the records of all 1,050 skin cancer patients referred to the St. Louis University Cancer Center in 2004. Of the 42 patients with melanoma, the cancer occurred on the left side of the body in 31 patients and on the right side in the other 11. (a) Is this an experiment or an observational study? (b) Of the patients with melanoma, what proportion had the cancer on the left side? (c) A bootstrap \(95 \%\) confidence interval for the proportion of melanomas occurring on the left is 0.579 to \(0.861\). Clearly interpret the confidence interval in the context of the problem. (d) Suppose the question of interest is whether melanomas are more likely to occur on the left side than on the right. State the null and alternative hypotheses. (e) Is this a one-tailed or two-tailed test? (f) Use the confidence interval given in part (c) to predict the results of the hypothesis test in part (d). Explain your reasoning. (g) A randomization distribution gives the p-value as 0.003 for testing the hypotheses given in part (d). What is the conclusion of the test in the context of this study? (h) The authors hypothesize that skin cancers are more prevalent on the left because of the sunlight coming in through car windows. (Windows protect against UVB rays but not UVA rays.) Do the data in this study support a conclusion that more melanomas occur on the left side because of increased exposure to sunlight on that side for drivers?

In a test to see whether there is a difference between males and females in average nasal tip angle, the study indicates that " \(p>0.05\)."

The data in Hurricanes contains the number of hurricanes that made landfall on the eastern coast of the United States over the 101 years from 1914 to 2014. Suppose we are interested in testing whether the number of hurricanes is increasing over time. (a) State the null and alternative hypotheses for testing whether the correlation between year and number of hurricanes is positive, which would indicate the number of hurricanes is increasing. (b) Describe in detail how you would create a randomization distribution to test this claim (if you had many more hours to do this exercise and no access to technology).
