Chapter 4: Problem 25
A clerk entering salary data into a company spreadsheet accidentally put an extra "0" in the boss's salary, listing it as $2,000,000 instead of $200,000. Explain how this error will affect these summary statistics for the company payroll: a) measures of center: median and mean. b) measures of spread: range, IQR, and standard deviation.
Short Answer
Step by step solution
Analyze the Impact on the Median
Analyze the Impact on the Mean
Analyze the Impact on the Range
Analyze the Impact on the IQR
Analyze the Impact on the Standard Deviation
Key Concepts
These are the key concepts you need to understand to accurately answer the question.
Measures of Center
The **median** is the middle value of a data set arranged in ascending order, so half of the data lies below it and half above. With an odd number of data points it is the single middle value; with an even number it is the average of the two middle values.
Meanwhile, the **mean** provides the average value by summing up all the individual data points and dividing by the total number of items. The mean is sensitive to extreme values, commonly referred to as outliers, and can be greatly affected by them.
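As a minimal sketch of these two definitions, the Python snippet below computes both measures with the standard `statistics` module. The salary lists are invented for illustration and do not come from the problem.

```python
from statistics import mean, median

# Hypothetical salaries in dollars, invented for illustration only.
odd_count = [30_000, 40_000, 50_000, 60_000, 200_000]
even_count = [30_000, 40_000, 50_000, 60_000, 70_000, 200_000]

# Odd number of values: the median is the single middle value (50,000).
median_odd = median(odd_count)

# Even number of values: the median is the average of the two
# middle values, (50,000 + 60,000) / 2 = 55,000.
median_even = median(even_count)

# The mean is the sum divided by the count: 380,000 / 5 = 76,000.
# It sits well above the median because of the one large salary.
mean_odd = mean(odd_count)

print(median_odd, median_even, mean_odd)
```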
Measures of Spread
The **range** shows variability by subtracting the smallest data point from the largest. A larger range indicates more spread within the dataset.
The **Interquartile Range (IQR)** measures the range within the middle 50% of the data, calculated by subtracting the first quartile (Q1) from the third quartile (Q3). It provides insight into the consistency of the middle section of the dataset.
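The **standard deviation** measures the typical distance of the data points from the mean. It is computed from the squared deviations from the mean, so values far from the mean carry extra weight. A higher standard deviation indicates more variability within the dataset, while a lower value points to less spread.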
Impact on Median and Mean
The **median** is resistant to this kind of error. The boss's salary is presumably the largest on the payroll both before and after the mistake, so the order of the middle values does not change and the median stays the same. The median would shift only if an error moved a value across the middle of the sorted data.
The **mean**, on the other hand, is highly sensitive to extreme values. In our example, misreporting the boss's salary as $2,000,000 instead of $200,000 adds $1,800,000 to the payroll total, so the mean rises by $1,800,000 divided by the number of employees. Because every salary enters the sum, one inflated value pulls the average upward.
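The contrast between the two measures of center can be checked numerically. The sketch below assumes a hypothetical eight-person payroll; only the boss's two salary figures come from the problem, and the staff salaries are invented.

```python
from statistics import mean, median

# Hypothetical staff salaries in dollars (an assumption for illustration).
staff = [30_000, 35_000, 40_000, 45_000, 50_000, 55_000, 60_000]

correct = staff + [200_000]      # boss's true salary
mistyped = staff + [2_000_000]   # boss's salary with the extra zero

# The boss is the largest value in both lists, so the two middle
# values (45,000 and 50,000) are the same and the median is unchanged.
median_correct = median(correct)
median_mistyped = median(mistyped)

# The total grows by 1,800,000, so the mean grows by 1,800,000 / 8.
mean_correct = mean(correct)
mean_mistyped = mean(mistyped)
mean_shift = mean_mistyped - mean_correct

print(median_correct, median_mistyped)
print(mean_correct, mean_mistyped, mean_shift)
```

With a larger payroll the shift in the mean would be smaller, since the same $1,800,000 is spread over more employees.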
Impact on Range and IQR
**Range** is directly affected because it depends only on the maximum and minimum values. Assuming the boss's salary is the maximum, the error raises the maximum by $1,800,000 and the range grows by the same amount, greatly overstating the spread.
However, the **Interquartile Range (IQR)** is usually unaffected by an outlier like the inflated salary, because it depends only on Q1 and Q3, which bound the middle half of the data. The IQR would change only if the error altered Q1 or Q3, which is unlikely for a single error at the top of a payroll of any reasonable size.
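A short numerical check of both claims follows, again on a hypothetical eight-person payroll (the staff salaries are invented). Quartile conventions differ between textbooks and software; this sketch uses the default method of Python's `statistics.quantiles`, which may not match the convention used in the textbook.

```python
from statistics import quantiles

# Hypothetical staff salaries in dollars (an assumption for illustration).
staff = [30_000, 35_000, 40_000, 45_000, 50_000, 55_000, 60_000]
correct = staff + [200_000]
mistyped = staff + [2_000_000]


def value_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)


def iqr(data):
    """Q3 minus Q1, using the default 'exclusive' quantile method."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1


range_correct = value_range(correct)
range_mistyped = value_range(mistyped)
iqr_correct = iqr(correct)
iqr_mistyped = iqr(mistyped)

# The range grows by the full 1,800,000; the IQR does not move.
print(range_correct, range_mistyped)
print(iqr_correct, iqr_mistyped)
```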
Impact on Standard Deviation
The **standard deviation** increases substantially. It reflects how far the entries lie from the mean, and each deviation is squared before averaging. The inflated salary now sits very far from the mean and contributes an enormous squared deviation, while the inflated mean also moves the reference point away from the other salaries. As a result, the error makes the standard deviation report far more variability than the payroll actually has.
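The size of the effect can be illustrated with the sample standard deviation from Python's `statistics` module. The staff salaries below are invented for illustration, so the exact ratio applies only to this made-up payroll.

```python
from statistics import mean, stdev

# Hypothetical staff salaries in dollars (an assumption for illustration).
staff = [30_000, 35_000, 40_000, 45_000, 50_000, 55_000, 60_000]
correct = staff + [200_000]
mistyped = staff + [2_000_000]

sd_correct = stdev(correct)     # sample standard deviation
sd_mistyped = stdev(mistyped)

# Share of the total squared deviation that comes from the boss alone
# in the mistyped data: one value dominates the whole calculation.
m = mean(mistyped)
squared_devs = [(x - m) ** 2 for x in mistyped]
boss_share = squared_devs[-1] / sum(squared_devs)

print(round(sd_correct), round(sd_mistyped))
print(round(boss_share, 3))
```

In this made-up payroll the standard deviation grows by more than a factor of ten, and most of the squared deviation comes from the single mistyped value.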