For each set of data: (a) Find the mean \(\bar{x}\). (b) Find the median \(m\). (c) Indicate whether there appear to be any outliers. If so, what are they? $$ 15,\ 22,\ 12,\ 28,\ 58,\ 18,\ 25,\ 18 $$

Short Answer

The mean is 24.5, the median is 20 and the outlier is 58.

Step by step solution

01

Calculate the mean

To calculate the mean, add up all the values: 15 + 22 + 12 + 28 + 58 + 18 + 25 + 18 = 196. Then divide by the total count of numbers which is 8 in this case. This gives a mean of 196 / 8 = 24.5.
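The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative and not part of the original solution.

```python
# Data set from the exercise
data = [15, 22, 12, 28, 58, 18, 25, 18]

total = sum(data)      # sum of all values
count = len(data)      # number of values
mean = total / count   # arithmetic mean

print(total, count, mean)  # 196 8 24.5
```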
02

Find the median

To find the median, arrange the values in ascending order: 12, 15, 18, 18, 22, 25, 28, 58. Because there is an even number of observations (8), the median is the mean of the two middle values, which are the 4th and 5th values in the ordered list: 18 and 22. So the median is (18 + 22) / 2 = 20.
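As a rough sketch, the same procedure (sort, then average the two middle values when the count is even) looks like this in Python. The names `ordered` and `median` are illustrative choices, not part of the original solution.

```python
data = [15, 22, 12, 28, 58, 18, 25, 18]

ordered = sorted(data)  # [12, 15, 18, 18, 22, 25, 28, 58]
n = len(ordered)

if n % 2 == 0:
    # Even count: average the two middle values (4th and 5th here)
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd count: take the single middle value
    median = ordered[n // 2]

print(median)  # 20.0
```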
03

Identify outliers

Identify any outliers using the interquartile range (IQR). First, split the ordered data into a lower half and an upper half. The lower half is 12, 15, 18, 18 and the upper half is 22, 25, 28, 58. The median of each half gives a quartile: the lower quartile is (15 + 18) / 2 = 16.5 and the upper quartile is (25 + 28) / 2 = 26.5. The IQR is the difference between the upper and lower quartiles: 26.5 - 16.5 = 10. A value is an outlier if it is greater than the upper quartile plus 1.5 times the IQR, or less than the lower quartile minus 1.5 times the IQR. Here the cutoffs are 26.5 + 1.5*10 = 41.5 and 16.5 - 1.5*10 = 1.5. Only 58 lies outside this range (58 > 41.5), so it is the only outlier.
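The median-of-halves version of the IQR rule used in this step can be sketched in Python as follows. The names `lower_fence` and `upper_fence` are illustrative labels for the two cutoffs, and the handling of an odd count (leaving the middle value out of both halves) is an assumption that does not matter for this 8-value data set.

```python
from statistics import median

data = [15, 22, 12, 28, 58, 18, 25, 18]
ordered = sorted(data)
half = len(ordered) // 2

# Lower and upper halves of the ordered data.
# Assumption: for an odd count, the middle value is left out of both halves.
lower_half = ordered[:half]    # [12, 15, 18, 18]
upper_half = ordered[-half:]   # [22, 25, 28, 58]

q1 = median(lower_half)        # lower quartile
q3 = median(upper_half)        # upper quartile
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values outside the fences are flagged as outliers
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 16.5 26.5 10.0
print(lower_fence, upper_fence)  # 1.5 41.5
print(outliers)                  # [58]
```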

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Mean Calculation
Understanding how to calculate the mean, or average, of a data set is an essential skill in descriptive statistics. The mean is calculated by adding together all values in the set and then dividing by the number of values. For example, with the numbers 15, 22, 12, 28, 58, 18, 25, and 18, their sum is 196. Since there are 8 numbers, dividing 196 by 8 yields a mean of 24.5.
The mean offers a simple measure of central tendency, serving as a snapshot of the data's 'center', but it's important to remember that the mean can be affected by extremely high or low values, known as outliers.
Median Calculation
The median is another measure of central tendency, representing the middle value in a data set when it's arranged in order. To find the median, first sort the data in ascending order: 12, 15, 18, 18, 22, 25, 28, 58. Since there's an even number of values (8 in total), the median is the average of the fourth and fifth values: (18 + 22) / 2, resulting in a median of 20.
The median is particularly useful because it's not skewed by outliers. In a skewed distribution or when outliers are present, the median can be a better representation of central tendency than the mean.
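One way to see the difference in sensitivity is to make the extreme value even more extreme and compare the two measures. In the sketch below, the value 580 is hypothetical and is not part of the exercise; it simply replaces 58 to exaggerate the effect.

```python
from statistics import mean, median

data = [15, 22, 12, 28, 58, 18, 25, 18]

# Hypothetical variant: replace 58 with a much larger value (580)
exaggerated = [580 if x == 58 else x for x in data]

print(mean(data), median(data))                # 24.5 20.0
print(mean(exaggerated), median(exaggerated))  # 89.75 20.0
```

The mean moves from 24.5 to 89.75, while the median stays at 20 because the two middle values of the ordered list are unchanged.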
Outlier Identification
Outlier identification is crucial in statistical analysis as outliers can greatly influence the results. An outlier is a value that is significantly higher or lower than most of the data. In our example, we determine outliers by using the interquartile range (IQR) method. The IQR represents the spread of the middle 50% of the data. Any number more than 1.5 times the IQR above the upper quartile (third quartile) or below the lower quartile (first quartile) is considered an outlier. In this case, the value 58 is an outlier as it exceeds 26.5 + (1.5 * 10), which equals 41.5.
Detecting outliers allows researchers to decide whether they should be included in the analysis or treated separately, as they might represent errors, unique cases, or variability in the data.
Interquartile Range (IQR)
The interquartile range (IQR) measures the dispersion of a dataset by indicating the range within which the central 50% of the values fall. To find the IQR, the dataset is divided into quarters. After sorting the data into ascending order, you determine the median of the lower and upper halves, known as the first and third quartiles, respectively. The IQR is the difference between these two values. In our case, the lower median (first quartile) is 16.5 and the upper median (third quartile) is 26.5. Subtracting the lower median from the upper median gives us an IQR of 10.
The IQR is a robust measure of spread that, unlike the range, is not affected by outliers in the data. It's commonly used alongside the median to provide a more complete picture of the data's distribution.
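Note that software does not always use the median-of-halves convention for quartiles. As an illustrative sketch, Python's `statistics.quantiles` with `method="inclusive"` interpolates between data points and returns slightly different quartiles for this data set, although the conclusion about the outlier is the same here.

```python
from statistics import quantiles

data = [15, 22, 12, 28, 58, 18, 25, 18]

# Quartiles under the "inclusive" interpolation convention;
# these differ from the median-of-halves values 16.5 and 26.5.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)   # 17.25 25.75 8.5
print(upper_fence)   # 38.5
print(outliers)      # [58]
```

When comparing results with a textbook answer, it helps to check which quartile convention the tool uses.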

Most popular questions from this chapter

Suppose an experiment will randomly divide 40 cases between two possible treatments, \(A\) and \(B\), and will then record two possible outcomes, Successful or Not successful. The outline of a two-way table is shown in Table 2.14. In each case below, fill in the table with possible values to show: (a) A clear association between treatment and outcome. (b) No association at all between treatment and outcome. Table 2.14 Fill in the blanks to show (a) Association or (b) No association
$$\begin{array}{|l|c|c|c|}
\hline & \text{Successful} & \text{Not successful} & \text{Total} \\
\hline \text{Treatment A} & & & 20 \\
\hline \text{Treatment B} & & & 20 \\
\hline \text{Total} & & & 40 \\
\hline
\end{array}$$

Exercise 2.143 on page 102 introduces a study that examines the association between playing football, brain size as measured by left hippocampal volume (in \(\mu \mathrm{L}\)), and percentile on a cognitive reaction test. Figure 2.56 gives two scatterplots. Both have number of years playing football as the explanatory variable, while Graph (a) has cognitive percentile as the response variable and Graph (b) has hippocampal volume as the response variable. (a) The two corresponding correlations are \(-0.465\) and \(-0.366\). Which correlation goes with which scatterplot? (b) Both correlations are negative. Interpret what this means in terms of football, brain size, and cognitive percentile.

When honeybee scouts find a food source or a nice site for a new home, they communicate the location to the rest of the swarm by doing a "waggle dance."\(^{74}\) They point in the direction of the site and dance longer for sites farther away. The rest of the bees use the duration of the dance to predict distance to the site. Table 2.32 Duration of a honeybee waggle dance to indicate distance to the source
$$\begin{array}{cc}
\hline \text{Distance} & \text{Duration} \\
\hline 200 & 0.40 \\
250 & 0.45 \\
500 & 0.95 \\
950 & 1.30 \\
1950 & 2.00 \\
3500 & 3.10 \\
4300 & 4.10 \\
\hline
\end{array}$$
Table 2.32 shows the distance, in meters, and the duration of the dance, in seconds, for seven honeybee scouts.\(^{75}\) This information is also given in HoneybeeWaggle. (a) Which is the explanatory variable? Which is the response variable? (b) Figure 2.70 shows a scatterplot of the data. Does there appear to be a linear trend in the data? If so, is it positive or negative? (c) Use technology to find the correlation between the two variables. (d) Use technology to find the regression line to predict distance from duration. (e) Interpret the slope of the line in context. (f) Predict the distance to the site if a honeybee does a waggle dance lasting 1 second. Lasting 3 seconds.

We use data from HollywoodMovies introduced in Data 2.7 on page 95. The dataset includes information on all movies to come out of Hollywood between 2007 and 2013. The variable AudienceScore in the dataset HollywoodMovies gives audience scores (on a scale from 1 to 100) from the Rotten Tomatoes website. The five number summary of these scores is (19, 49, 61, 74, 96). Are there any outliers in these scores, according to the IQR method? How bad would an average audience score rating have to be on Rotten Tomatoes to qualify as a low outlier?

Give the correct notation for the mean. The average number of text messages sent in a day was 67, in a sample of US smartphone users ages 18-24, according to a survey conducted by Experian.\(^{26}\)
