
Altman and Bland report the survival times for patients with active hepatitis, half treated with prednisone and half receiving no treatment.\({ }^{10}\) The survival times (in months) (Exercise 1.73 and \(\mathrm{EX} 0173\)) are adapted from their data for those treated with prednisone. $$ \begin{array}{rr} 8 & 127 \\ 11 & 133 \\ 52 & 139 \\ 57 & 142 \\ 65 & 144 \\ 87 & 147 \\ 93 & 148 \\ 97 & 157 \\ 109 & 162 \\ 120 & 165 \end{array} $$
a. Can you tell by looking at the data whether it is roughly symmetric? Or is it skewed?
b. Calculate the mean and the median. Use these measures to decide whether or not the data are symmetric or skewed.
c. Draw a box plot to describe the data. Explain why the box plot confirms your conclusions in part b.

Short Answer

The survival times are skewed to the left. The mean (108.15) is less than the median (123.5), and the box plot shows a longer left whisker with the median sitting closer to Q3 than to Q1, both of which indicate left skewness.

Step by step solution

01

(Exercise part a: Symmetry or skewness estimation by looking at the data)

To analyze the data just by looking at it, we observe the spacing between the sorted values. If the spacing is roughly uniform throughout, the data is likely symmetric. If the values are bunched on one side and spread out on the other, the data is skewed, with the tail on the spread-out side. Here the ten smallest values run from 8 to 120, while the ten largest are packed between 127 and 165, which suggests a longer tail toward the low values (left skew). A visual impression like this is not conclusive, so we calculate the mean and median and draw a box plot to confirm it.
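One rough way to make the visual check concrete is to compare how far the lower half and the upper half of the sorted data each spread out. The sketch below is only an illustration of that idea; the variable names are assumptions and the half-by-half comparison is an informal heuristic, not a formal test of skewness.

```python
# Survival times (months) from the exercise, already sorted.
survival = [8, 11, 52, 57, 65, 87, 93, 97, 109, 120,
            127, 133, 139, 142, 144, 147, 148, 157, 162, 165]

half = len(survival) // 2
lower, upper = survival[:half], survival[half:]

# Range covered by each half of the data.
lower_spread = lower[-1] - lower[0]   # 120 - 8
upper_spread = upper[-1] - upper[0]   # 165 - 127

print(lower_spread, upper_spread)     # 112 38
```

The lower half covers a much wider range than the upper half, which hints that the tail of the distribution lies on the low side.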
02

(Exercise part b: Calculating the mean and median)

First, we find the mean and median of the survival times. To find the mean, add up the survival times and divide by the number of data points:
Mean = \(\frac{8+11+52+57+65+87+93+97+109+120+127+133+139+142+144+147+148+157+162+165}{20} = \frac{2163}{20} = 108.15\)
To find the median, sort the data points in ascending order and locate the middle value. The data are already sorted: \(\{8, 11, 52, 57, 65, 87, 93, 97, 109, 120, 127, 133, 139, 142, 144, 147, 148, 157, 162, 165\}\)
Since there are 20 data points, the median is the average of the 10th and 11th data points:
Median = \(\frac{120+127}{2} = 123.5\)
Since the mean (108.15) is less than the median (123.5), the mean has been pulled toward the smaller values, and we conclude that the data are skewed to the left.
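The arithmetic in this step can be checked with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names are illustrative assumptions.

```python
from statistics import median

# Survival times (months) from the exercise.
survival = [8, 11, 52, 57, 65, 87, 93, 97, 109, 120,
            127, 133, 139, 142, 144, 147, 148, 157, 162, 165]

total = sum(survival)            # 2163
avg = total / len(survival)      # 2163 / 20
med = median(survival)           # average of the 10th and 11th values

print(total, avg, med)           # 2163 108.15 123.5

# A mean below the median points to a tail on the low side (left skew).
direction = "left" if avg < med else "right" if avg > med else "none"
print(direction)                 # left
```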
03

(Exercise part c: Drawing a box plot and explaining conclusions)

To draw a box plot, we need to find the 1st quartile (Q1), the median (Q2), and the 3rd quartile (Q3). There are 20 data points, so Q1 is the median of the first 10 data points:
Q1 = Median of \(\{8, 11, 52, 57, 65, 87, 93, 97, 109, 120\} = \frac{65+87}{2} = 76\)
Q3 is the median of the last 10 data points:
Q3 = Median of \(\{127, 133, 139, 142, 144, 147, 148, 157, 162, 165\} = \frac{144+147}{2} = 145.5\)
The box plot will have the following components:
- A vertical line at Q1 (76)
- A vertical line at Q2 (123.5)
- A vertical line at Q3 (145.5)
- A rectangle spanning Q1 to Q3
- Whiskers extending from the minimum value (8) to Q1 and from Q3 to the maximum value (165)
Looking at the box plot, the left whisker (76 - 8 = 68) is much longer than the right whisker (165 - 145.5 = 19.5), and the median line sits closer to Q3 (a distance of 22) than to Q1 (a distance of 47.5). Both features show a longer tail toward the low values, which confirms that the data are skewed to the left, as we concluded in part b.
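The quartiles and whisker lengths can be reproduced with a short sketch. It assumes the same convention as this step (Q1 and Q3 taken as the medians of the lower and upper halves of the sorted data); other quartile conventions exist and can give slightly different values. The variable names are illustrative.

```python
from statistics import median

# Survival times (months) from the exercise, sorted ascending.
survival = sorted([8, 11, 52, 57, 65, 87, 93, 97, 109, 120,
                   127, 133, 139, 142, 144, 147, 148, 157, 162, 165])

half = len(survival) // 2           # n is even, so the halves split cleanly
q1 = median(survival[:half])        # median of the 10 smallest values
q2 = median(survival)               # overall median
q3 = median(survival[half:])        # median of the 10 largest values

left_whisker = q1 - min(survival)   # minimum up to Q1
right_whisker = max(survival) - q3  # Q3 up to maximum

print(q1, q2, q3)                   # 76.0 123.5 145.5
print(left_whisker, right_whisker)  # 68.0 19.5
```

A left whisker several times longer than the right one is the box-plot signature of a tail toward the low values.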


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Data Symmetry
In descriptive statistics, data symmetry indicates whether a dataset is evenly distributed on either side of a central value. Symmetry in data means that the left and right sides of the distribution are mirror images of each other. When considering symmetry, any deviation towards one end signifies skewness.

To determine symmetry just by visual inspection, as was attempted in step 1 of the example problem, look for regular spacing and balanced distribution of values around the middle value. If the values gradually increase and decrease at a consistent rate from the center, we expect symmetry. But often, a visual inspection is insufficient; thus, calculating statistical measures such as the mean and median is crucial for confirming data symmetry or skewness.
Mean and Median
The mean and median are measures of central tendency in descriptive statistics. The mean is the average of all data points, calculated by summing them and dividing by the number of values. In contrast, the median is the middle value when the dataset is sorted from smallest to largest, or the average of the two middle values in an even-sized dataset.

In the example problem, the mean (108.15) is lower than the median (123.5), which points to a skew in the dataset. The mean is pulled toward the tail of a distribution, so a mean below the median suggests a tail of low values (left skew), while a mean above the median suggests a tail of high values (right skew).
Box Plot
A box plot, or box-and-whisker plot, is a graphical representation of a dataset’s distribution and is very helpful in depicting the degree of skewness. It consists of a rectangle (box) and lines (whiskers) extending from either side. The box encloses the interquartile range (Q1 to Q3), with a line inside it that represents the median (Q2). The whiskers stretch out to the minimum and maximum values, excluding outliers.

In step 3 of the solution process, the box plot was drawn to give a visual picture of the distribution. The relative lengths of the whiskers, the placement of the median within the box, and the position of the box within the total range are the key indicators. A longer left whisker, together with a median closer to Q3 than to Q1, as in the example, indicates left skewness, which is often easier to see in a box plot than in a raw data list.
Data Skewness
Data skewness is a measure of asymmetry in a data distribution. If a dataset is skewed to the right, a minority of higher values extends the tail of the distribution to the right. Conversely, left skewness, as in the provided example, means the tail stretches to the left toward the lower values.

Skewness can significantly affect the mean, pulling it in the direction of the tail, while the median is more robust and less influenced by extreme values. In the example problem, the mean falling below the median and the asymmetric box plot both indicate that the dataset is not symmetric but skewed toward the lower values.
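The robustness of the median can be seen with a small what-if sketch. The altered list below is hypothetical and is not part of the Altman and Bland data: it replaces the largest survival time (165) with an arbitrary extreme value (365) purely to show how each measure reacts.

```python
from statistics import mean, median

# Survival times (months) from the exercise.
survival = [8, 11, 52, 57, 65, 87, 93, 97, 109, 120,
            127, 133, 139, 142, 144, 147, 148, 157, 162, 165]

# Hypothetical copy: the largest value is replaced by an extreme one.
# This value is made up for illustration only.
altered = survival[:-1] + [365]

mean_before, mean_after = mean(survival), mean(altered)
median_before, median_after = median(survival), median(altered)

print(mean_before, mean_after)       # 108.15 118.15
print(median_before, median_after)   # 123.5 123.5
```

Changing a single extreme value shifts the mean by 10 months but leaves the median untouched, because the median depends only on the middle of the sorted data.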


Most popular questions from this chapter

Suppose that some measurements occur more than once and that the data \(x_{1}, x_{2}, \ldots, x_{k}\) are arranged in a frequency table as shown here: $$ \begin{array}{cc} \text { Observations } & \text { Frequency } f_{i} \\ \hline x_{1} & f_{1} \\ x_{2} & f_{2} \\ \vdots & \vdots \\ x_{k} & f_{k} \end{array} $$ The formulas for the mean and variance for grouped data are $$ \bar{x}=\frac{\sum x_{i} f_{i}}{n}, \quad \text { where } n=\sum f_{i} $$ and $$ s^{2}=\frac{\sum x_{i}^{2} f_{i}-\frac{\left(\sum x_{i} f_{i}\right)^{2}}{n}}{n-1} $$ Notice that if each value occurs once, these formulas reduce to those given in the text. Although these formulas for grouped data are primarily of value when you have a large number of measurements, demonstrate their use for the sample \(1,0,0,1,3,1,3,2,3,0,0,1,1,3,2\).
a. Calculate \(\bar{x}\) and \(s^{2}\) directly, using the formulas for ungrouped data.
b. The frequency table for the \(n=15\) measurements is as follows: $$ \begin{array}{ll} x & f \\ \hline 0 & 4 \\ 1 & 5 \\ 2 & 2 \\ 3 & 4 \end{array} $$ Calculate \(\bar{x}\) and \(s^{2}\) using the formulas for grouped data. Compare with your answers to part a.

If you scored at the 69th percentile on a placement test, how does your score compare with others?

The data below are 30 waiting times between eruptions of the Old Faithful geyser in Yellowstone National Park. $$ \begin{array}{lllllllllllllll} 56 & 89 & 51 & 79 & 58 & 82 & 52 & 88 & 52 & 78 & 69 & 75 & 77 & 72 & 71 \\ 55 & 87 & 53 & 85 & 61 & 93 & 54 & 76 & 80 & 81 & 59 & 86 & 78 & 71 & 77 \end{array} $$ a. Calculate the range. b. Use the range approximation to approximate the standard deviation of these 30 measurements. c. Calculate the sample standard deviation \(s\). d. What proportion of the measurements lie within two standard deviations of the mean? Within three standard deviations of the mean? Do these proportions agree with the proportions given in Tchebysheff's Theorem?

Find the five-number summary and the IQR for these data: $$ 19,12,16,0,14,9,6,1,12,13,10,19,7,5,8 $$

The International Baccalaureate (IB) program is an accelerated academic program offered at a growing number of high schools throughout the country. Students enrolled in this program are placed in accelerated or advanced courses and must take IB examinations in each of six subject areas at the end of their junior or senior year. Students are scored on a scale of \(1-7,\) with \(1-2\) being poor, 3 mediocre, 4 average, and \(5-7\) excellent. During its first year of operation at John W. North High School in Riverside, California, 17 juniors attempted the IB economics exam, with these results: $$ \begin{array}{cc} \text { Exam Grade } & \text { Number of Students } \\ \hline 7 & 1 \\ 6 & 4 \\ 5 & 4 \\ 4 & 4 \\ 3 & 4 \end{array} $$ Calculate the mean and standard deviation for these scores.
