
The \({\chi ^2}\) goodness-of-fit test (see Chapter 10) is based on an asymptotic approximation to the distribution of the test statistic. For small to medium samples, the asymptotic approximation might not be very good. Simulation can be used to assess how good the approximation is. Simulation can also be used to estimate the power function of a goodness-of-fit test. For this exercise, assume that we are performing the test that was done in Example 10.1.6. The idea illustrated in this exercise applies to all such problems.

a. Simulate \(v = 10,000\) samples of size \(n = 23\) from the normal distribution with mean 3.912 and variance 0.25. For each sample, compute the \({\chi ^2}\) goodness-of-fit statistic \(Q\) using the same four intervals that were used in Example 10.1.6. Use the simulations to estimate the probability that \(Q\) is greater than or equal to the 0.9, 0.95, and 0.99 quantiles of the \({\chi ^2}\) distribution with three degrees of freedom.

b. Suppose that we are interested in the power function of a \({\chi ^2}\) goodness-of-fit test when the actual distribution of the data is the normal distribution with mean 4.2 and variance 0.8. Use simulation to estimate the power function of the level 0.1, 0.05, and 0.01 tests at the specified alternative.

Short Answer


a) The estimates are 0.041, 0.0089, and 0.001, respectively, for the 0.9, 0.95, and 0.99 quantiles of the chi-square distribution with 3 degrees of freedom.

(b) The estimates are 0.039, 0.0101, and 0.014, respectively, for the 0.9, 0.95, and 0.99 quantiles of the chi-square distribution with 3 degrees of freedom.

Step by step solution

01

(a) To estimate the probabilities that \(Q\) exceeds the quantiles of the \({\chi ^2}\) distribution with three degrees of freedom

Generate \(\nu = 10{,}000\) samples of size \(n = 23\) from the normal distribution with mean \(\log (50) \approx 3.912\) and standard deviation \(\sqrt {0.25}  = 0.5\). Hence, \(\mu = 3.912\) and \({\sigma ^2} = 0.25\).

Four intervals are used, as in Example 10.1.6, each with probability 0.25 under the null hypothesis (so each interval has an expected count of \(np_i^0 = 23/4 \approx 5.75\), at least 5). The interval endpoints depend on the probabilities assigned by the distribution under the null hypothesis, which is the normal distribution above. The upper limit of the first interval is

\(U{L_1} = \mu  + 0.5 \times {\Phi ^{ - 1}}(0.25) = 3.912 + 0.5 \times ( - 0.674) = 3.575\)

where \(P(Z < - 0.674) = 0.25\) for a standard normal \(Z\). Hence the probability that an observation falls in the interval \(( - \infty ,3.575)\) is 0.25 under the null hypothesis. The remaining intervals have the same probability, 0.25; their upper limits are computed using \({\Phi ^{ - 1}}(0.5)\) and \({\Phi ^{ - 1}}(0.75)\) in place of \({\Phi ^{ - 1}}(0.25)\):

\(\begin{aligned} U{L_2} &= \mu  + 0.5 \times {\Phi ^{ - 1}}(0.5) = 3.912 + 0.5 \times 0 = 3.912,\\ U{L_3} &= \mu  + 0.5 \times {\Phi ^{ - 1}}(0.75) = 3.912 + 0.5 \times 0.674 = 4.249.\end{aligned}\)
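
These boundaries can be reproduced in R with qnorm. A minimal sketch (the names mu, sigma, and breaks are ours, not from the original solution):

```r
# Interval upper limits under H0: N(mu, sigma^2) with mu = log(50), sigma = 0.5,
# chosen so that each of the four intervals has null probability 0.25
mu     <- log(50)                               # approximately 3.912
sigma  <- 0.5
breaks <- mu + sigma * qnorm(c(0.25, 0.50, 0.75))
breaks                                          # approximately 3.575, 3.912, 4.249
```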

Then, the \({\chi ^2}\) statistic value is computed using

\(Q = \sum\limits_{i = 1}^k {\frac{{{{\left( {{N_i} - np_i^0} \right)}^2}}}{{np_i^0}}} \)

where, in this case, \({N_i}\), \(i = 1,2,3,4\), is the number of observations that fall in the \(i\)th interval, \(n = 23\), and \(p_i^0 = 0.25\) is, by construction, the same for each \(i = 1,2,3,4\).
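
A small helper function makes this statistic easy to reuse in the simulations below. This is a sketch we supply; the name chisq_stat and the use of cut/table are our choices, not part of the original solution:

```r
# Chi-square goodness-of-fit statistic for one sample, using the interval
# upper limits computed above; the four cells have null probability 0.25 each.
chisq_stat <- function(x, breaks, p0 = rep(0.25, 4)) {
  counts <- table(cut(x, c(-Inf, breaks, Inf)))  # N_i for i = 1, ..., 4
  n <- length(x)
  sum((counts - n * p0)^2 / (n * p0))
}
```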

The simulation, run in R, computes the 0.9, 0.95, and 0.99 quantiles of the chi-square distribution with 3 degrees of freedom and uses the interval upper limits derived above. A loop (or vectorized equivalent) then counts how many of the simulated test statistics are greater than or equal to each quantile, and those counts divided by \(\nu\) estimate the required probabilities.
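
The original RStudio code is not reproduced on this page; the following is a minimal sketch of the same procedure, assuming the mu, sigma, breaks, and chisq_stat objects defined above (the seed is arbitrary):

```r
# Part (a): simulate v = 10,000 samples of size n = 23 under H0 and compare Q
# with the 0.9, 0.95, and 0.99 quantiles of the chi-square(3) distribution.
set.seed(1)
v <- 10000
n <- 23
crit <- qchisq(c(0.90, 0.95, 0.99), df = 3)
Q <- replicate(v, chisq_stat(rnorm(n, mean = mu, sd = sigma), breaks))
sapply(crit, function(q) mean(Q >= q))  # estimated exceedance probabilities
```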

The simulation yields the following estimates of these probabilities: 0.041 for the 0.9 quantile, 0.0089 for the 0.95 quantile, and 0.001 for the 0.99 quantile of the chi-square distribution with 3 degrees of freedom. Whether the null hypothesis is rejected depends on the significance level; for example, at significance level \(0.01\), since \(0.01 < 0.041\), one would fail to reject the null hypothesis.

02

(b) To estimate the power function at the specified alternative

For this part, instead of simulating the samples from the null distribution, simulate each sample from the alternative: the normal distribution with mean \(\mu  = 4.2\) and variance \({\sigma ^2} = 0.8\). The test statistic and the chi-square critical values are unchanged. The resulting estimates of the power are \(0.039\), \(0.101\), and \(0.014\) for the level 0.1, 0.05, and 0.01 tests, respectively.
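
A matching sketch for this part, again assuming the objects defined earlier (v, n, crit, breaks, chisq_stat); only the sampling distribution changes:

```r
# Part (b): estimate the power at the alternative N(4.2, 0.8) for the
# level 0.10, 0.05, and 0.01 tests; Q still uses the H0 cell probabilities.
Q_alt <- replicate(v, chisq_stat(rnorm(n, mean = 4.2, sd = sqrt(0.8)), breaks))
sapply(crit, function(q) mean(Q_alt >= q))  # estimated power for each level
```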

Most popular questions from this chapter

Let \(U\) have the uniform distribution on the interval \((0,1)\). Show that the random variable \(W\) defined in Eq. (12.4.6) has the p.d.f. \(h\) defined in Eq. (12.4.5).

Consider the power calculation done in Example 9.5.5.

a. Simulate \({v_0} = 1000\) i.i.d. noncentral t pseudo-random variables with 14 degrees of freedom and noncentrality parameter \(1.936.\)

b. Estimate the probability that a noncentral t random variable with 14 degrees of freedom and noncentrality parameter \(1.936\) is at least \(1.761.\) Also, compute the standard simulation error.

c. Suppose that we want our estimator of the noncentral t probability in part (b) to be within \(0.01\) of the true value with probability \(0.99\). How many noncentral t random variables do we need to simulate?
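
A brief R sketch of how parts (a)–(c) of this question could be approached; the variable names are ours, and the sample-size step in part (c) assumes a normal approximation:

```r
# (a) simulate v0 = 1000 noncentral t variables with 14 df and ncp = 1.936
v0 <- 1000
x  <- rt(v0, df = 14, ncp = 1.936)
# (b) estimate P(T >= 1.761) and its simulation standard error
p_hat  <- mean(x >= 1.761)
se_hat <- sqrt(p_hat * (1 - p_hat) / v0)
# (c) normal-approximation sample size so that |p_hat - p| < 0.01 with prob. 0.99
v_needed <- ceiling((qnorm(0.995) / 0.01)^2 * p_hat * (1 - p_hat))
```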

Describe how to convert a random sample \({{\bf{U}}_{\bf{1}}}{\bf{, \ldots ,}}{{\bf{U}}_{\bf{n}}}\) from the uniform distribution on the interval \({\bf{[0,1]}}\) to a random sample of size \({\bf{n}}\) from the uniform distribution on the interval\({\bf{[a,b]}}\).
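
The conversion described here is a one-line shift and rescale; a minimal sketch with illustrative endpoints \(a\) and \(b\) of our choosing:

```r
# X_i = a + (b - a) * U_i is uniform on [a, b] when U_i is uniform on [0, 1]
u <- runif(5)          # U_1, ..., U_n on [0, 1]
a <- 2; b <- 7         # hypothetical endpoints, for illustration only
x <- a + (b - a) * u
```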

The skewness of a random variable was defined in Definition 4.4.1. Suppose that \({X_1},...,{X_n}\) form a random sample from a distribution \(F\). The sample skewness is defined as

\({M_3} = \frac{{\frac{1}{n}\sum\limits_{i = 1}^n {{{\left( {{X_i} - \bar X} \right)}^3}} }}{{{{\left( {\frac{1}{n}\sum\limits_{i = 1}^n {{{\left( {{X_i} - \bar X} \right)}^2}} } \right)}^{3/2}}}}\)

One might use \({M_3}\) as an estimator of the skewness \(\theta \) of the distribution \(F\). The bootstrap can estimate the bias and standard deviation of the sample skewness as an estimator of \(\theta \).

a. Prove that \({M_3}\) is the skewness of the sample distribution \({F_n}\).

b. Use the 1970 fish price data in Table 11.6 on page 707. Compute the sample skewness and then simulate 1000 bootstrap samples. Use the bootstrap samples to estimate the bias and standard deviation of the sample skewness.
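
A hedged bootstrap sketch for part (b), assuming the 1970 fish-price values from Table 11.6 have already been entered into a numeric vector named prices (the data are not reproduced here):

```r
# Sample skewness M3, matching the formula above
sample_skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^(3 / 2)
}
theta_hat <- sample_skewness(prices)
# 1000 nonparametric bootstrap replicates of the sample skewness
boot <- replicate(1000, sample_skewness(sample(prices, replace = TRUE)))
c(bias = mean(boot) - theta_hat, sd = sd(boot))  # bootstrap bias and sd estimates
```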

Use the method of antithetic variates that was described in Exercise 15. Let \(g(x)\) be the function that we tried to integrate in Example 12.4.1, and let \(f(x)\) be the function \({f_3}\) in Example 12.4.1. Estimate \(\operatorname{Var} \left( {{V^{\left( i \right)}}} \right)\), and compare it to \(\hat \sigma _3^2\) from Example 12.4.1.
