Small Sample Size and Outliers. As we have seen, bootstrap distributions are generally symmetric and bell-shaped and centered at the value of the original sample statistic. However, strange things can happen when the sample size is small and there is an outlier present. Use StatKey or other technology to create a bootstrap distribution for the standard deviation based on the following data: \(8, 10, 72, 13, 8, 10, 50\). Describe the shape of the distribution. Is it appropriate to construct a confidence interval from this distribution? Explain why the distribution might have the shape it does.

Short Answer

The bootstrap distribution is unlikely to be symmetric and bell-shaped. With a sample this small and an outlier present, it tends to be irregular, often showing separate clumps rather than one smooth peak. Because the usual bootstrap confidence interval methods rely on a roughly symmetric, bell-shaped distribution, it is not appropriate to construct a confidence interval from this one. The shape is caused by the outlier (72) in the small data set: it lies far from the other data points, so the standard deviation of a bootstrap sample depends heavily on whether, and how many times, 72 is included in that sample.

Step by step solution

01

Generating the Bootstrap Distribution

First, use StatKey or other statistical software to generate a bootstrap distribution for the standard deviation of the given data: \(8, 10, 72, 13, 8, 10, 50\). Each bootstrap sample is drawn from the original data with replacement and has the same size as the original sample (here \(n = 7\)). The statistic (in this case, the standard deviation) is calculated for each bootstrap sample, and the collection of these values over many samples forms the bootstrap distribution.
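As an alternative to StatKey, the resampling can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the textbook's own method; the function name `bootstrap_sds`, the number of resamples (5000), and the seed are assumptions chosen for the example.

```python
import random
import statistics

# Data from the exercise
data = [8, 10, 72, 13, 8, 10, 50]

def bootstrap_sds(sample, n_resamples=5000, seed=0):
    """Return the sample standard deviation of each bootstrap resample.

    n_resamples and seed are illustrative choices, not values from the exercise.
    """
    rng = random.Random(seed)
    n = len(sample)
    sds = []
    for _ in range(n_resamples):
        # Draw n values from the original sample, with replacement
        resample = rng.choices(sample, k=n)
        sds.append(statistics.stdev(resample))
    return sds

original_sd = statistics.stdev(data)
boot = bootstrap_sds(data)
print(f"original sample SD: {original_sd:.2f}")
print(f"bootstrap SDs range from {min(boot):.2f} to {max(boot):.2f}")
```

The list `boot` plays the role of the dotplot that StatKey would draw: one standard deviation per bootstrap sample.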
02

Observing the Shape of the Distribution

Next, observe the shape of the bootstrap distribution. It is unlikely to follow a symmetric, bell-shaped curve, mainly because of the outlier in such a small data set.
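One rough way to inspect the shape without plotting software is a text histogram of the bootstrap standard deviations. The sketch below is self-contained; the bin width of 2, the seed, and the scaling of one `#` per 20 resamples are arbitrary assumptions made for readability.

```python
import random
import statistics
from collections import Counter

data = [8, 10, 72, 13, 8, 10, 50]
rng = random.Random(1)  # seed chosen only for reproducibility

# One standard deviation per bootstrap resample (drawn with replacement)
boot = [statistics.stdev(rng.choices(data, k=len(data))) for _ in range(5000)]

# Count how many bootstrap SDs fall into each bin of width 2
bin_width = 2
counts = Counter(int(sd // bin_width) * bin_width for sd in boot)

# Print one row per bin; each '#' stands for 20 resamples
for left in range(0, int(max(boot)) + bin_width, bin_width):
    bar = "#" * (counts[left] // 20)
    print(f"{left:2d}-{left + bin_width:2d} | {bar}")
```

Gaps or several separate peaks in the printed rows would indicate that the distribution is not a single smooth bell shape.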
03

Discussing the Appropriateness of a Confidence Interval

Decide whether it is appropriate to construct a confidence interval from this distribution. If the bootstrap distribution is irregular or substantially skewed, it is not suitable for forming a confidence interval, since the standard bootstrap methods (based on the standard error or on percentiles) assume the bootstrap distribution is approximately symmetric and bell-shaped.
04

Understanding the Shape of Distribution

Explain why the distribution might have the shape it does. The main cause is the outlier (72) in the data set. Because it lies far from the other values, the standard deviation of a bootstrap sample changes sharply depending on whether 72 is left out, included once, or included several times. With only seven values in each sample, these cases produce noticeably different standard deviations, which shapes the bootstrap distribution. The value 50, which is also well above the remaining values, may contribute in a similar but smaller way.
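The effect of the outlier can be checked by grouping bootstrap samples according to how many times 72 appears in them. The following is a small sketch under the same assumptions as before (standard library only, 5000 resamples, arbitrary seed); the variable name `sds_by_count` is hypothetical.

```python
import random
import statistics
from collections import defaultdict

data = [8, 10, 72, 13, 8, 10, 50]
rng = random.Random(2)  # seed chosen only for reproducibility

# Map: number of times 72 appears in a resample -> SDs of those resamples
sds_by_count = defaultdict(list)
for _ in range(5000):
    resample = rng.choices(data, k=len(data))  # sampling with replacement
    sds_by_count[resample.count(72)].append(statistics.stdev(resample))

# Summarize each group: how many resamples it holds and their average SD
for k in sorted(sds_by_count):
    group = sds_by_count[k]
    print(f"72 appears {k}x: {len(group):4d} resamples, "
          f"mean SD = {statistics.mean(group):.1f}")
```

If the group averages differ markedly, that supports the explanation that the number of copies of 72 in a bootstrap sample largely determines its standard deviation.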

Most popular questions from this chapter

What Proportion of Adults and Teens Text Message? A study of \(n=2252\) adults age 18 or older found that \(72 \%\) of the cell phone users send and receive text messages. \({ }^{15}\) A study of \(n=800\) teens age 12 to 17 found that \(87 \%\) of the teen cell phone users send and receive text messages. What is the best estimate for the difference in the proportion of cell phone users who use text messages, between adults (defined as 18 and over) and teens? Give notation (as a difference with a minus sign) for the quantity we are trying to estimate, notation for the quantity that gives the best estimate, and the value of the best estimate. Be sure to clearly define any parameters in the context of this situation.

Have You Ever Been Arrested? According to a recent study of 7335 young people in the US, \(30 \%\) had been arrested \(^{28}\) for a crime other than a traffic violation by the age of 23. Crimes included such things as vandalism, underage drinking, drunken driving, shoplifting, and drug possession. (a) Is the \(30 \%\) a parameter or a statistic? Use the correct notation. (b) Use the information given to estimate a parameter, and clearly define the parameter being estimated. (c) The margin of error for the estimate in part (b) is \(0.01 .\) Use this information to give a range of plausible values for the parameter. (d) Given the margin of error in part (c), if we asked all young people in the US if they have ever been arrested, is it likely that the actual proportion is less than \(25 \% ?\)

Use data from a study designed to examine the effect of doing synchronized movements (such as marching in step or doing synchronized dance steps) and the effect of exertion on many different variables, such as pain tolerance and attitudes toward others. In the study, 264 high school students in Brazil were randomly assigned to one of four groups reflecting whether or not movements were synchronized (Synch= yes or no) and level of activity (Exertion= high or low). \(^{49}\) Participants rated how close they felt to others in their group both before (CloseBefore) and after (CloseAfter) the activity, using a 7-point scale (1=least close to \(7=\) most close ). Participants also had their pain tolerance measured using pressure from a blood pressure cuff, by indicating when the pressure became too uncomfortable (up to a maximum pressure of \(300 \mathrm{mmHg}\) ). Higher numbers for this Pain Tolerance measure indicate higher pain tolerance. The full dataset is available in SynchronizedMovement. For each of the following problems: (a) Give notation for the quantity we are estimating, and define any relevant parameters. (b) Use StatKey or other technology to find the value of the sample statistic. Give the correct notation with your answer. (c) Use StatKey or other technology to find the standard error for the estimate. (d) Use the standard error to give a \(95 \%\) confidence interval for the quantity we are estimating. (e) Interpret the confidence interval in context. What Proportion Go to Maximum Pressure? We see that 75 of the 264 people in the study allowed the pressure to reach its maximum level of \(300 \mathrm{mmHg}\), without ever saying that the pain was too much (MaxPressure=yes). Use this information to estimate the proportion of people who would allow the pressure to reach its maximum level.

A sample is given. Indicate whether each option is a possible bootstrap sample from this original sample. Original sample: 85, 72, 79, 97, 88. Do the values given constitute a possible bootstrap sample from the original sample? (a) 79, 79, 97, 85, 88 (b) 72, 79, 85, 88, 97 (c) 85, 88, 97, 72 (d) 88, 97, 81, 78, 85 (e) 97, 85, 79, 85, 97 (f) 72, 72, 79, 72, 79

Use data from a study designed to examine the effect of doing synchronized movements (such as marching in step or doing synchronized dance steps) and the effect of exertion on many different variables, such as pain tolerance and attitudes toward others. In the study, 264 high school students in Brazil were randomly assigned to one of four groups reflecting whether or not movements were synchronized (Synch= yes or no) and level of activity (Exertion= high or low). \(^{49}\) Participants rated how close they felt to others in their group both before (CloseBefore) and after (CloseAfter) the activity, using a 7-point scale (1=least close to \(7=\) most close ). Participants also had their pain tolerance measured using pressure from a blood pressure cuff, by indicating when the pressure became too uncomfortable (up to a maximum pressure of \(300 \mathrm{mmHg}\) ). Higher numbers for this Pain Tolerance measure indicate higher pain tolerance. The full dataset is available in SynchronizedMovement. For each of the following problems: (a) Give notation for the quantity we are estimating, and define any relevant parameters. (b) Use StatKey or other technology to find the value of the sample statistic. Give the correct notation with your answer. (c) Use StatKey or other technology to find the standard error for the estimate. (d) Use the standard error to give a \(95 \%\) confidence interval for the quantity we are estimating. (e) Interpret the confidence interval in context. Does Synchronization Increase Feelings of Closeness? Use the closeness ratings given after the activity (CloseAfter) to estimate the difference in mean rating of closeness between those who have just done a synchronized activity and those who do a non-synchronized activity.
