Fiber in the Diet. The number of grams of fiber eaten in one day for a sample of ten people are \(10, 11, 11, 14, 15, 17, 21, 24, 28, 115\). (a) Find the mean and the median for these data. (b) The value of 115 appears to be an obvious outlier. Compute the mean and the median for the nine numbers with the outlier excluded. (c) Comment on the effect of the outlier on the mean and on the median.

Short Answer

With the outlier, the mean is 26.6 and the median is 16. Without the outlier, the mean is about 16.78 and the median is 15. The outlier raises the mean by nearly 10 grams but the median by only 1 gram, so the median is the more resistant measure of central tendency in this case.

Step by step solution

01

Calculate the Mean (with outlier)

First, find the sum of all the data points: \(10 + 11 + 11 + 14 + 15 + 17 + 21 + 24 + 28 + 115 = 266\). Then divide this by the total number of data points (10 in this case) to get the mean: \(266 / 10 = 26.6\).
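
The arithmetic in this step can be checked with a few lines of Python. This is only a sketch; the variable names (`fiber_grams`, `total`, `mean_with_outlier`) are illustrative and not part of the original exercise.

```python
# Grams of fiber eaten in one day by each of the ten people in the sample.
fiber_grams = [10, 11, 11, 14, 15, 17, 21, 24, 28, 115]

total = sum(fiber_grams)                       # add up all ten values
mean_with_outlier = total / len(fiber_grams)   # divide by the sample size

print(total, mean_with_outlier)  # prints: 266 26.6
```
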
02

Calculate the Median (with outlier)

First, list the numbers in ascending order: \(10, 11, 11, 14, 15, 17, 21, 24, 28, 115\). Since there are 10 data points, the median is the average of the 5th and 6th data points, which is \( (15 + 17) / 2 = 16\).
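
The rule used here (take the middle value for an odd count, average the two middle values for an even count) can be written out directly. The helper below is a hypothetical illustration named `median`, not a library function; Python's built-in `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value for an odd count; mean of the two middle values for an even count."""
    ordered = sorted(values)          # the rule only works on sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

fiber_grams = [10, 11, 11, 14, 15, 17, 21, 24, 28, 115]
median_with_outlier = median(fiber_grams)   # averages the 5th and 6th values
print(median_with_outlier)  # prints: 16.0
```
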
03

Calculate the Mean (without outlier)

Now, exclude the outlier (115) and repeat the process for finding the mean: \(10 + 11 + 11 + 14 + 15 + 17 + 21 + 24 + 28 = 151\). Then divide this by the number of remaining data points (9 in this case) to get the new mean: \(151 / 9 \approx 16.78\).
04

Calculate the Median (without outlier)

Again, exclude the outlier and list the numbers in ascending order: \(10, 11, 11, 14, 15, 17, 21, 24, 28\). There are now 9 data points and the median is the 5th, which is 15.
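
As a quick check on the two calculations without the outlier, the sketch below drops 115 from the list and uses Python's standard `statistics` module. The variable names are assumptions made for illustration.

```python
import statistics

fiber_grams = [10, 11, 11, 14, 15, 17, 21, 24, 28, 115]

# Keep every value except the single outlier of 115 grams.
without_outlier = [g for g in fiber_grams if g != 115]

mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)  # 5th of the 9 sorted values

print(len(without_outlier), round(mean_without, 2), median_without)  # prints: 9 16.78 15
```
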
05

Discuss the Effect of the Outlier

Notice how the mean dropped sharply when the outlier was excluded, from 26.6 to about 16.78. This shows that the mean is sensitive to extreme values. The median, however, decreased only slightly, from 16 to 15. This demonstrates that the median is resistant to outliers, and hence it is a better representation of the central value when outliers are present.
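
One way to make the comparison concrete is to compute how far each measure moves when the outlier is removed. The following is a minimal sketch under the same data; `mean_shift` and `median_shift` are illustrative names.

```python
import statistics

with_outlier = [10, 11, 11, 14, 15, 17, 21, 24, 28, 115]
without_outlier = with_outlier[:-1]   # the outlier 115 is the last value in the list

# Change in each measure of center caused by the single extreme value.
mean_shift = statistics.mean(with_outlier) - statistics.mean(without_outlier)
median_shift = statistics.median(with_outlier) - statistics.median(without_outlier)

print(round(mean_shift, 2), median_shift)  # prints: 9.82 1.0
```

The mean moves by almost 10 grams while the median moves by 1 gram, which is the sense in which the median is called resistant.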

Most popular questions from this chapter

This exercise describes a sample. The information given includes the five number summary, the sample size, and the largest and smallest data values in the tails of the distribution. (a) Clearly identify any outliers, using the IQR method. (b) Draw a boxplot. Five number summary: \((42, 72, 78, 80, 99)\); \(n = 120\). Tails: \(42, 63, 65, 67, 68, \ldots, 88, 89, 95, 96, 99\).

Ronda Rousey Fight Times. Perhaps the most popular fighter since the turn of the decade, Ronda Rousey is famous for defeating her opponents quickly. The five number summary for the times of her first 12 UFC (Ultimate Fighting Championship) fights, in seconds, is \((14, 25, 44, 64, 289)\). (a) Only three of her fights have lasted more than a minute, at 289, 267, and 66 seconds, respectively. Use the IQR method to see which, if any, of these values are high outliers. (b) Are there any low outliers in these data, according to the IQR method? (c) Draw the boxplot for Ronda Rousey's fight times. (d) Based on the boxplot or five number summary, would we expect Ronda's mean fight time to be greater than or less than her median?

If we have learned to solve problems by one method, we often have difficulty bringing new insight to similar problems. However, electrical stimulation of the brain appears to help subjects come up with fresh insight. In a recent experiment conducted at the University of Sydney in Australia, 40 participants were trained to solve problems in a certain way and then asked to solve an unfamiliar problem that required fresh insight. Half of the participants were randomly assigned to receive non-invasive electrical stimulation of the brain while the other half (control group) received sham stimulation as a placebo. The participants did not know which group they were in. In the control group, \(20\%\) of the participants successfully solved the problem while \(60\%\) of the participants who received brain stimulation solved the problem. (a) Is this an experiment or an observational study? Explain. (b) From the description, does it appear that the study is double-blind, single-blind, or not blind? (c) What are the variables? Indicate whether each is categorical or quantitative. (d) Make a two-way table of the data. (e) What percent of the people who correctly solved the problem had the electrical stimulation? (f) Give values for \(\hat{p}_{E}\), the proportion of people in the electrical stimulation group to solve the problem, and \(\hat{p}_{S}\), the proportion of people in the sham stimulation group to solve the problem. What is the difference in proportions \(\hat{p}_{E}-\hat{p}_{S}\)? (g) Does electrical stimulation of the brain appear to help insight?

The Honeybee dataset contains data collected from the USDA on the estimated number of honeybee colonies (in thousands) for the years 1995 through 2012. We use technology to find that a regression line to predict number of (thousand) colonies from year (in calendar year) is $$\text{Colonies} = 19{,}291.511 - 8.358(\text{Year})$$ (a) Interpret the slope of the line in context. (b) Often researchers will adjust a year explanatory variable such that it represents years since the first year data were collected. Why might they do this? (Hint: Consider interpreting the y-intercept in this regression line.) (c) Predict the bee population in 2100. Is this prediction appropriate (why or why not)?

This exercise deals with an experiment to study the effects of financial incentives to quit smoking. Smokers at a company were invited to participate in a smoking cessation program and randomly assigned to one of two groups. Those in the Reward group would get a cash award if they stopped smoking for six months. Those in the Deposit group were asked to deposit some money which they would get back along with a substantial bonus if they stopped smoking. After six months, 156 of the 914 smokers who accepted the invitation to be in the reward-only program stopped smoking, while 78 of the 146 smokers who paid a deposit quit. Set up a two-way table and compare the success rates between participants who entered the two programs.
