
Let \(x_{1}, x_{2}, \ldots, x_{n}\) be the values of a random sample. A bootstrap sample, \(\mathbf{x}^{*\prime}=\left(x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\right)\), is a random sample of \(x_{1}, x_{2}, \ldots, x_{n}\) drawn with replacement. (a) Show that \(x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\) are iid with common cdf \(\widehat{F}_{n}\), the empirical cdf of \(x_{1}, x_{2}, \ldots, x_{n}\). (b) Show that \(E\left(x_{i}^{*}\right)=\bar{x}\). (c) If \(n\) is odd, show that \(\operatorname{median}\{x_{i}^{*}\}=x_{((n+1)/2)}\). (d) Show that \(V\left(x_{i}^{*}\right)=n^{-1} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}\).

Short Answer

Expert verified
The bootstrap draws \(x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\) are independent and identically distributed (iid) random variables with common cdf \(\widehat{F}_{n}\), the empirical cdf of the data. The expectation of each \(x_{i}^{*}\) is \(\bar{x}\), the mean of the original sample. If \(n\) is odd, the median of the distribution of \(x_{i}^{*}\) is \(x_{((n+1)/2)}\), the median of the original sample. The variance of each \(x_{i}^{*}\) is \(n^{-1}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}\), the variance of the original sample computed with divisor \(n\).

Step by step solution

01

Defining the empirical CDF

The empirical CDF \(\widehat{F}_{n}\) is a step function that jumps up by \(1/n\) at each of the \(n\) data points: \(\widehat{F}_{n}(t)\) is the proportion of sample values less than or equal to \(t\). Because each bootstrap draw selects any one of the \(n\) original values with probability \(1/n\), we have \(P(x_{j}^{*}=x_{i})=\frac{1}{n}\) for every \(i\), and hence \(P(x_{j}^{*}\leq t)=\widehat{F}_{n}(t)\) for all \(t\). Thus \(x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\) are identically distributed with common cdf \(\widehat{F}_{n}\).
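As a numerical check (a sketch using hypothetical data, not part of the textbook solution), the equality between the per-draw probability \(P(x_{j}^{*}\leq t)\) and the ECDF can be verified directly in Python:

```python
import numpy as np

def ecdf(data, t):
    """Empirical CDF F_n(t): fraction of sample values <= t."""
    return np.mean(np.asarray(data) <= t)

x = [2.0, 5.0, 5.0, 9.0]   # hypothetical sample, n = 4
n = len(x)

# Each bootstrap draw is uniform over the n stored values, so
# P(x_j* <= t) = (# of x_j <= t) / n, which is exactly F_n(t).
for t in [1.0, 2.0, 5.0, 9.0, 10.0]:
    draw_prob = sum(1 for xj in x if xj <= t) / n
    assert draw_prob == ecdf(x, t)
```

Note that repeated values (here, 5.0) simply produce a jump of height \(2/n\) at that point.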
02

Independence of bootstrap samples

Since each \(x_{i}^{*}\) in the bootstrap sample is drawn independently from the original sample, \(x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\) are independent.
03

Expectation of a bootstrap sample

Each \(x_{i}^{*}\) takes the value \(x_{j}\) with probability \(1/n\) for every \(j\). Therefore \(E\left(x_{i}^{*}\right)=\sum_{j=1}^{n} x_{j}\cdot\frac{1}{n}=\bar{x}\), the mean of the original sample.
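The probability-weighted sum above can be evaluated explicitly for a small hypothetical sample (an illustrative sketch, not the textbook's code):

```python
import numpy as np

x = np.array([3.0, 7.0, 8.0, 12.0])   # hypothetical sample
n = len(x)

# x_i* equals each x_j with probability 1/n, so its expectation is the
# probability-weighted sum of the sample values -- the sample mean.
expectation = sum(xj * (1.0 / n) for xj in x)
assert np.isclose(expectation, x.mean())
```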
04

Median of a bootstrap sample

If \(n\) is odd, order the sample as \(x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}\). The distribution of \(x_{i}^{*}\) places mass \(1/n\) on each sample value, so \(P\left(x_{i}^{*} \leq x_{((n+1)/2)}\right) \geq \frac{(n+1)/2}{n} > \frac{1}{2}\) and likewise \(P\left(x_{i}^{*} \geq x_{((n+1)/2)}\right) > \frac{1}{2}\). Hence the median of the distribution of \(x_{i}^{*}\) is \(x_{((n+1)/2)}\), the median of the original sample.
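The two mass conditions that define a distributional median can be checked on a small hypothetical odd-sized sample (an illustrative sketch with made-up data):

```python
import numpy as np

x = np.array([4.0, 1.0, 9.0, 2.0, 7.0])   # hypothetical sample, n = 5 (odd)
n = len(x)
order = np.sort(x)
m = order[(n + 1) // 2 - 1]   # x_((n+1)/2): the middle order statistic

# m is a median of the distribution of x_i*: at least half the
# probability mass (counts of sample values) lies at or below m,
# and at least half lies at or above m.
assert np.mean(x <= m) >= 0.5
assert np.mean(x >= m) >= 0.5
assert m == np.median(x)
```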
05

Variance of a bootstrap sample

Using the distribution of \(x_{i}^{*}\), \(V\left(x_{i}^{*}\right)=E\left(x_{i}^{*2}\right)-\left[E\left(x_{i}^{*}\right)\right]^{2}=\frac{1}{n}\sum_{j=1}^{n} x_{j}^{2}-\bar{x}^{2}=n^{-1}\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}\). Note that this is the variance of the empirical distribution, with divisor \(n\), not the usual unbiased sample variance with divisor \(n-1\).
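The algebraic identity \(E(x_{i}^{*2})-\bar{x}^{2}=n^{-1}\sum_{j}(x_{j}-\bar{x})^{2}\) can be confirmed numerically (a sketch on hypothetical data):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])   # hypothetical sample

# V(x_i*) = E(x_i*^2) - [E(x_i*)]^2, each x_j weighted by 1/n.
second_moment = np.mean(x ** 2)
variance = second_moment - x.mean() ** 2

# Matches n^{-1} * sum (x_j - xbar)^2, the divisor-n variance.
assert np.isclose(variance, np.mean((x - x.mean()) ** 2))
```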


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Bootstrap Sampling
Bootstrap sampling is a resampling technique used to estimate the distribution of a statistic by randomly drawing samples with replacement from the original dataset. When working with a dataset, like a set of observations \( x_{1}, x_{2}, \ldots, x_{n} \), the fundamental idea is to generate new datasets, called bootstrap samples. Each of these new samples, \( \mathbf{x}^{* \prime} \), contains the same number of observations as the original but allows for repeated observations due to the replacement factor.

When applied, bootstrap sampling grants us insight into the variability and stability of sample statistics like the mean or variance. This robust approach allows for drawing conclusions about the population from which the original sample was drawn, even when no explicit knowledge about the population is available. The method's strength lies in its simplicity and flexibility, making it invaluable for assessing the uncertainty or confidence in statistical estimates from limited or non-parametric data.
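A minimal sketch of the resampling procedure just described, using hypothetical data and NumPy's random generator (not code from the textbook), shows how bootstrap samples estimate the variability of a statistic such as the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed for reproducibility
x = np.array([10.0, 15.0, 21.0, 20.0, 25.0, 30.0])   # hypothetical data
B = 2000                         # number of bootstrap resamples

# Each resample draws n values from x WITH replacement and records
# its mean; the spread of these means reflects sampling variability.
boot_means = np.array([
    rng.choice(x, size=len(x), replace=True).mean() for _ in range(B)
])

# Bootstrap estimate of the standard error of the sample mean.
se_boot = boot_means.std(ddof=1)
```

Because sampling is with replacement, a single resample may repeat some observations and omit others, which is exactly what lets the procedure mimic drawing fresh samples from the population.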
Empirical Cumulative Distribution Function (ECDF)
The empirical cumulative distribution function (ECDF) represents the proportion of observations less than or equal to a given value. For each data point \( x_{i} \) in a dataset \( \{x_{1}, x_{2}, \ldots, x_{n}\} \), the ECDF \( \widehat{F}_{n} \) increases by \( 1/n \) at that specific value. Essentially, the ECDF is a step function that graphically showcases the distribution of the data.

To bring this into perspective, imagine lining up all data points on the number line; at each point, take a step upward. The height of the step at any given position is the fraction of data points that are at or below that level, a snapshot of the data's relative standing. Unlike theoretical distribution functions, the ECDF is based strictly on the available data, hence its empirical nature, and provides a non-parametric model to understand the inherent distribution of the data.
Independent and Identically Distributed (iid)
When discussing random variables or samples, 'independent and identically distributed' (iid) is a critical concept in statistics and probability. Independence implies that the occurrence of one event does not influence that of another. Identically distributed means that each random variable has the same probability distribution.

In the context of bootstrap sampling, each bootstrap element \( x_{i}^{*} \) is drawn from the original sample independently, meaning that the selection of one does not affect the selection of another. Moreover, they are identically distributed as each element comes from the same original sample and thus follows the empirical cumulative distribution function (ECDF), \( \widehat{F}_{n} \), of the original data. This iid property is foundational in bootstrapping and many other statistical methods as it ensures consistent behavior across samples, which is imperative for valid inference.
Sample Variance
Sample variance is a measure that tells us how widely dispersed the values in a sample are. It's calculated by taking the squared differences between each observation and the sample mean, adding them all up, and then dividing by the number of observations minus one. In a mathematical form, for a sample \( \{x_{1}, x_{2}, \ldots, x_{n}\} \) with a mean of \( \bar{x} \), the sample variance \( s^2 \) is \( s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar{x})^2\).

In bootstrap sampling, each resampled dataset \( \mathbf{x}^{* \prime} \) provides a sample variance that can be used to estimate the variance of the sampling distribution of a statistic. This resulting bootstrap variance captures the variability among the resampled datasets, lending a way to understand uncertainty and construct confidence intervals around statistical estimates. It's a cornerstone for inferential statistics, providing a glimpse into the sample's diversity, and by extension, the underlying population's diversity.
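The distinction between the divisor-\(n-1\) sample variance defined here and the divisor-\(n\) variance appearing in part (d) of the exercise is worth making concrete (a sketch with hypothetical numbers; `ddof` is NumPy's degrees-of-freedom parameter):

```python
import numpy as np

x = np.array([1.0, 3.0, 5.0, 7.0])   # hypothetical sample
n = len(x)

# Unbiased sample variance s^2 divides by n - 1 (ddof=1) ...
s2 = x.var(ddof=1)
# ... while the variance of the empirical distribution, as in
# part (d), divides by n (ddof=0, NumPy's default).
v_empirical = x.var(ddof=0)

# The two differ only by the factor n / (n - 1).
assert np.isclose(s2, v_empirical * n / (n - 1))
```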


Most popular questions from this chapter

This exercise obtains a useful identity for the cdf of a Poisson distribution. (a) Use Exercise \(3.3.5\) to show that this identity is true: $$ \frac{\lambda^{n}}{\Gamma(n)} \int_{1}^{\infty} x^{n-1} e^{-x \lambda} d x=\sum_{j=0}^{n-1} e^{-\lambda} \frac{\lambda^{j}}{j !} $$ for \(\lambda>0\) and \(n\) a positive integer. Hint: Just consider a Poisson process on the unit interval with mean \(\lambda\). Let \(W_{n}\) be the waiting time until the \(n\)th event. Then the left side is \(P\left(W_{n}>1\right)\). Why? (b) Obtain the identity used in Example \(4.3.3\), by making the transformation \(z=\lambda x\) in the above integral.

Consider the following permutation test for the two-sample problem with hypotheses \((4.9.7)\). Let \(\mathbf{x}^{\prime}=\left(x_{1}, x_{2}, \ldots, x_{n_{1}}\right)\) and \(\mathbf{y}^{\prime}=\left(y_{1}, y_{2}, \ldots, y_{n_{2}}\right)\) be the realizations of the two random samples. The test statistic is the difference in sample means \(\bar{y}-\bar{x}\). The estimated \(p\)-value of the test is calculated as follows: 1. Combine the data into one sample \(\mathbf{z}^{\prime}=\left(\mathbf{x}^{\prime}, \mathbf{y}^{\prime}\right)\). 2. Obtain all possible samples of size \(n_{1}\) drawn without replacement from \(\mathbf{z}\). Each such sample automatically gives another sample of size \(n_{2}\), i.e., all elements of \(\mathbf{z}\) not in the sample of size \(n_{1}\). There are \(M=\binom{n_{1}+n_{2}}{n_{1}}\) such samples. 3. For each such sample \(j\): (a) Label the sample of size \(n_{1}\) by \(\mathbf{x}^{*}\) and label the sample of size \(n_{2}\) by \(\mathbf{y}^{*}\). (b) Calculate \(v_{j}^{*}=\bar{y}^{*}-\bar{x}^{*}\). 4. The estimated \(p\)-value is \(\hat{p}^{*}=\#\{v_{j}^{*} \geq \bar{y}-\bar{x}\}/M\). (a) Suppose we have two samples each of size 3 which result in the realizations: \(\mathbf{x}^{\prime}=(10,15,21)\) and \(\mathbf{y}^{\prime}=(20,25,30)\). Determine the test statistic and carry out the permutation test described above, along with the \(p\)-value. (b) If we ignore distinct samples, then we can approximate the permutation test by using the bootstrap algorithm with resampling performed at random and without replacement. Modify the bootstrap program boottesttwo.s to do this and obtain this approximate permutation test based on 3000 resamples for the data of Example \(4.9.2\). (c) In general, what is the probability of having distinct samples in the approximate permutation test described in the last part? Assume that the original data are distinct values.

Let \(X_{1}, \ldots, X_{n}\) be a random sample from a \(N(0,1)\) distribution. Then the probability that the random interval \(\bar{X} \pm t_{\alpha/2, n-1}(s/\sqrt{n})\) traps \(\mu=0\) is \((1-\alpha)\). To verify this empirically, in this exercise, we simulate \(m\) such intervals and calculate the proportion that trap 0, which should be "close" to \((1-\alpha)\). (a) Set \(n=10\) and \(m=50\). Run the R code mat = matrix(rnorm(m*n), ncol = n), which generates \(m\) samples of size \(n\) from the \(N(0,1)\) distribution. Each row of the matrix mat contains a sample. For this matrix of samples, the function below computes the \((1-\alpha)100\%\) confidence intervals, returning them in an \(m \times 2\) matrix. Run this function on your generated matrix mat. What is the proportion of successful confidence intervals? (b) Run the following code which plots the intervals. Label the successful intervals. Comment on the variability of the lengths of the confidence intervals.

In Exercise \(4.2.27\), in finding a confidence interval for the ratio of the variances of two normal distributions, we used a statistic \(S_{1}^{2}/S_{2}^{2}\), which has an \(F\) distribution when those two variances are equal. If we denote that statistic by \(F\), we can test \(H_{0}: \sigma_{1}^{2}=\sigma_{2}^{2}\) against \(H_{1}: \sigma_{1}^{2}>\sigma_{2}^{2}\) using the critical region \(F \geq c\). If \(n=13\), \(m=11\), and \(\alpha=0.05\), find \(c\).

Let \(p\) equal the proportion of drivers who use a seat belt in a country that does not have a mandatory seat belt law. It was claimed that \(p=0.14\). An advertising campaign was conducted to increase this proportion. Two months after the campaign, \(y=104\) out of a random sample of \(n=590\) drivers were wearing their seat belts. Was the campaign successful? (a) Define the null and alternative hypotheses. (b) Define a critical region with an \(\alpha=0.01\) significance level. (c) Determine the approximate \(p\) -value and state your conclusion.
