Box plots and the standard normal distribution. What relationship exists between the standard normal distribution and the box-plot methodology (Section 2.8) for describing distributions of data using quartiles? The answer depends on the true underlying probability distribution of the data. Assume for the remainder of this exercise that the distribution is normal.

a. Calculate the values of the standard normal random variable z, call them zL and zU, that correspond to the hinges of the box plot—that is, the lower and upper quartiles, QL and QU—of the probability distribution.

b. Calculate the z values that correspond to the inner fences of the box plot for a normal probability distribution.

c. Calculate the z values that correspond to the outer fences of the box plot for a normal probability distribution.

d. What is the probability that an observation lies beyond the inner fences of a normal probability distribution? The outer fences?

e. Can you better understand why the inner and outer fences of a box plot are used to detect outliers in a distribution? Explain.

Short Answer

a. The z values for the lower and upper quartiles are zL = -0.67449 and zU = 0.67449.

b. The z values that correspond to the inner fences of the box plot are -2.697959 and 2.697959.

c. The z values that correspond to the outer fences of the box plot are -4.72143 and 4.72143.

d. The probability that an observation falls beyond the inner fences is 0.006977; beyond the outer fences it is approximately 0.

e. Yes. For a normal distribution, the probability that an observation falls beyond these fences is very low, so an observation that does is a likely outlier.

Step by step solution

01

Given information

The given distribution is a normal distribution

02

Calculating the lower and upper quartiles

a.

The lower quartile is the 25th percentile.

Let zL be the value of the standard normal random variable that corresponds to QL, i.e.,

\[
P(z < z_L) = 0.25 \;\Rightarrow\; \Phi(z_L) = 0.25 \;\Rightarrow\; z_L = \Phi^{-1}(0.25) = -0.67449
\]

So the lower quartile is -0.67449.

The upper quartile is the 75th percentile.

Let zU be the value of the standard normal random variable that corresponds to QU, i.e.,

\[
P(z < z_U) = 0.75 \;\Rightarrow\; \Phi(z_U) = 0.75 \;\Rightarrow\; z_U = \Phi^{-1}(0.75) = 0.67449
\]

So the upper quartile is 0.67449.
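The two inverse-CDF lookups in part (a) can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original solution.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# inv_cdf(p) returns the z value with P(z < z_p) = p, i.e. Phi^{-1}(p).
z_lower = std_normal.inv_cdf(0.25)  # z value of the lower quartile QL
z_upper = std_normal.inv_cdf(0.75)  # z value of the upper quartile QU

print(round(z_lower, 5), round(z_upper, 5))
```

By symmetry of the normal curve the two values differ only in sign, which is why only one table lookup is really needed.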

03

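Calculating the inner fences of the box plot

b.

\[
\text{IQR} = Q_U - Q_L = 0.67449 - (-0.67449) = 1.34898
\]

The lower inner fence is

\[
\text{LIF} = Q_L - 1.5 \times \text{IQR} = Q_L - 1.5 \times (Q_U - Q_L) = -0.67449 - 1.5 \times 1.34898 \approx -2.697959
\]

The upper inner fence is

\[
\text{UIF} = Q_U + 1.5 \times \text{IQR} = Q_U + 1.5 \times (Q_U - Q_L) = 0.67449 + 1.5 \times 1.34898 \approx 2.697959
\]

So the lower inner fence is -2.697959 and the upper inner fence is 2.697959.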
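The fence arithmetic can be reproduced in a few lines. This is a sketch only: the helper name `fences` and its parameter `k` (the multiple of the IQR) are assumptions introduced for illustration, and the quartiles come from the standard library's `NormalDist`.

```python
from statistics import NormalDist


def fences(k):
    """Return (lower, upper) fences placed k IQRs beyond the quartiles of N(0, 1)."""
    z = NormalDist()  # standard normal by default
    q_lower, q_upper = z.inv_cdf(0.25), z.inv_cdf(0.75)
    iqr = q_upper - q_lower
    return q_lower - k * iqr, q_upper + k * iqr


# Inner fences use k = 1.5.
inner_lower, inner_upper = fences(1.5)
print(round(inner_lower, 6), round(inner_upper, 6))
```

Calling `fences(3)` with the same helper gives fences that lie 3 IQRs beyond the quartiles.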

04

Calculating the outer fences of the box plot

c.

\[
\text{IQR} = Q_U - Q_L = 0.67449 - (-0.67449) = 1.34898
\]

The lower outer fence is

\[
\text{LOF} = Q_L - 3 \times \text{IQR} = Q_L - 3 \times (Q_U - Q_L) = -0.67449 - 3 \times 1.34898 = -4.72143
\]

The upper outer fence is

\[
\text{UOF} = Q_U + 3 \times \text{IQR} = Q_U + 3 \times (Q_U - Q_L) = 0.67449 + 3 \times 1.34898 = 4.72143
\]

So the lower outer fence is -4.72143 and the upper outer fence is 4.72143.
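As a cross-check on the inner and outer fences, both can be written as multiples of zU. The short derivation below assumes only the symmetry zL = -zU of the standard normal distribution; the symbol k for the fence multiplier is introduced here for illustration.

```latex
\begin{aligned}
\text{IQR} &= z_U - z_L = 2 z_U \\
Q_U + k \cdot \text{IQR} &= z_U + 2k\, z_U = (1 + 2k)\, z_U \\
k = 1.5 &: \quad 4 z_U = 4(0.67449) \approx 2.69796 \\
k = 3 &: \quad 7 z_U = 7(0.67449) = 4.72143
\end{aligned}
```

So for normal data the inner fences sit at 4 times the upper-quartile z value and the outer fences at 7 times, with the lower fences at the negatives of these.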

05

Calculating the probabilities

d.

The probability that an observation lies beyond the inner fences of a normal probability distribution is

\[
\begin{aligned}
P(z < -2.697959) + P(z > 2.697959) &= [1 - P(z < 2.697959)] + [1 - P(z < 2.697959)] \\
&= 2 - 2P(z < 2.697959) \\
&= 2 - 2\Phi(2.697959) \\
&= 2 - 2 \times 0.996512 \\
&\approx 0.006977
\end{aligned}
\]

So the probability is 0.006977.

The probability that an observation lies beyond the outer fences of a normal probability distribution is

\[
\begin{aligned}
P(z < -4.72143) + P(z > 4.72143) &= [1 - P(z < 4.72143)] + [1 - P(z < 4.72143)] \\
&= 2 - 2P(z < 4.72143) \\
&= 2 - 2\Phi(4.72143) \\
&= 2 - 2 \times 0.999999 \\
&\approx 0
\end{aligned}
\]

So the probability is approximately 0.
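The two tail probabilities in part (d) can be evaluated without a z table. This is a minimal sketch using the standard library's `NormalDist.cdf`; the helper name `tail_probability` is hypothetical, and the fence values are the ones obtained in parts (b) and (c).

```python
from statistics import NormalDist


def tail_probability(fence):
    """Return P(|z| > fence) for a standard normal z, using symmetry."""
    return 2.0 * (1.0 - NormalDist().cdf(fence))


p_inner = tail_probability(2.697959)  # beyond the inner fences
p_outer = tail_probability(4.72143)   # beyond the outer fences

print(f"beyond inner fences: {p_inner:.6f}")
print(f"beyond outer fences: {p_outer:.2e}")
```

The outer-fence probability is not exactly zero, only a few parts in a million, which is why it rounds to 0 at six decimal places.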

06

Explanation

The inner and outer fences of a box plot are used to detect outliers in a distribution in the following ways:

Values that are beyond the inner fences are deemed potential outliers because they are extreme values that represent relatively rare occurrences. In fact, for a normal probability distribution, less than 1% of the observations are expected to fall outside the inner fences.

Measurements that fall beyond the outer fences are very extreme measurements that require special analysis. Since less than one-hundredth of 1% (0.01%, or 0.0001) of the measurements from a normal distribution are expected to fall beyond the outer fences, these measurements are considered to be outliers.

From part (d) we can clearly see why the inner and outer fences of the box plot are used to detect outliers in a distribution.
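A quick simulation makes the rarity concrete. The sketch below is illustrative only: the sample size, the seed, and the helper name `fraction_beyond` are arbitrary choices, and the fence value is taken from part (b).

```python
import random

INNER_FENCE = 2.697959  # upper inner fence from part (b); the lower fence is its negative


def fraction_beyond(fence, n, seed=0):
    """Fraction of n simulated standard normal draws with |z| > fence."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if abs(rng.gauss(0.0, 1.0)) > fence)
    return hits / n


frac = fraction_beyond(INNER_FENCE, n=200_000)
print(f"{frac:.4%} of simulated observations fall beyond the inner fences")
```

With normal data the simulated fraction should land near the 0.7% found in part (d), so a data set with many more points beyond the inner fences is either non-normal or contains genuine outliers.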

Most popular questions from this chapter

If x is a binomial random variable, compute p(x) for each of the following cases:

  1. n = 4, x = 2, p = .2
  2. n = 3, x = 0, p = .7
  3. n = 5, x = 3, p = .1
  4. n = 3, x = 1, p = .9
  5. n = 3, x = 1, p = .3
  6. n = 4, x = 2, p = .6

4.131 Chemical composition of gold artifacts. The Journal of Open Archaeology Data (Vol. 1, 2012) provided data on the chemical composition of more than 200 pre-Columbian gold and gold-alloy artifacts recovered in the archaeological region inhabited by the Muisca of Colombia (A.D. 600–1800). One of many variables measured was the percentage of copper present in the gold artifacts. Summary statistics for this variable follow: mean = 29.94%, median = 19.75%, standard deviation = 28.37%. Demonstrate why the probability distribution for the percentage of copper in these gold artifacts cannot be normally distributed.

Preventative maintenance tests. The optimal scheduling of preventative maintenance tests of some (but not all) of n independently operating components was developed in Reliability Engineering and System Safety (January 2006). The time (in hours) between failures of a component was approximated by an exponential distribution with mean θ.

a. Suppose θ = 1000 hours. Find the probability that the time between component failures ranges between 1200 and 1500 hours.

b. Again, assume θ = 1000 hours. Find the probability that the time between component failures is at least 1200 hours.

c. Given that the time between failures is at least 1200 hours, what is the probability that the time between failures is less than 1500 hours?

Give the z-score for a measurement from a normal distribution for the following:

a. 1 standard deviation above the mean

b. 1 standard deviation below the mean

c. Equal to the mean

d. 2.5 standard deviations below the mean

e. 3 standard deviations above the mean

If a population data set is normally distributed, what is the proportion of measurements you would expect to fall within the following intervals?

a. μ ± σ

b. μ ± 2σ

c. μ ± 3σ
