
Let \(R_{1}, R_{2}\) be independent binomial random variables with probabilities \(\pi_{1}, \pi_{2}\) and denominators \(m_{1}, m_{2}\), and let \(P_{i}=R_{i}/m_{i}\). It is desired to test whether \(\pi_{1}=\pi_{2}\). Let \(\widehat{\pi}=\left(m_{1} P_{1}+m_{2} P_{2}\right)/\left(m_{1}+m_{2}\right)\). Show that when \(\pi_{1}=\pi_{2}\), the statistic $$ Z=\frac{P_{1}-P_{2}}{\sqrt{\widehat{\pi}(1-\widehat{\pi})\left(1 / m_{1}+1 / m_{2}\right)}} \stackrel{D}{\longrightarrow} N(0,1) $$ when \(m_{1}, m_{2} \rightarrow \infty\) in such a way that \(m_{1} / m_{2} \rightarrow \xi\) for \(0<\xi<1\). Now consider a \(2 \times 2\) table formed using two independent binomial variables and having entries \(R_{i}, S_{i}\) where \(R_{i}+S_{i}=m_{i}\), \(R_{i} / m_{i}=P_{i}\), for \(i=1,2\). Show that if \(\pi_{1}=\pi_{2}\) and \(m_{1}, m_{2} \rightarrow \infty\), then $$ X^{2}=\frac{\left(m_{1}+m_{2}\right)\left(R_{1} S_{2}-R_{2} S_{1}\right)^{2}}{m_{1} m_{2}\left(R_{1}+R_{2}\right)\left(S_{1}+S_{2}\right)} \stackrel{D}{\longrightarrow} \chi_{1}^{2} $$ Two batches of trees were planted in a park: 250 were obtained from nursery \(A\) and 250 from nursery \(B\). Subsequently 41 and 64 trees from the two groups died. Do trees from the two nurseries have the same survival probabilities? Are the assumptions you make reasonable?

Short Answer

The chi-square statistic \(X^2 \approx 6.38\) exceeds the 5% critical value \(3.84\), so the trees from the two nurseries do not have the same survival probabilities. The assumptions (independent trees, two large samples of comparable size) are reasonable.

Step by step solution

01

Define the Binomial Distributions

Since we have two independent binomial variables, each follows a binomial distribution. Specifically, \(R_1 \sim \text{Binomial}(m_1, \pi_1)\) and \(R_2 \sim \text{Binomial}(m_2, \pi_2)\). The proportion \(P_i = \frac{R_i}{m_i}\) represents the sample proportion of successes.
02

Estimate Common Proportion Under Null Hypothesis

Assuming \(\pi_1 = \pi_2\) under the null hypothesis, the pooled estimate of the common proportion \(\widehat{\pi}\) is calculated as \[\widehat{\pi} = \frac{m_1 P_1 + m_2 P_2}{m_1+m_2}.\] This estimate combines the proportions using the total number of trials.
03

Formulate Z-Statistic

The Z-statistic for testing the equality of proportions is given by \[Z = \frac{P_1 - P_2}{\sqrt{\widehat{\pi} (1 - \widehat{\pi}) \left( \frac{1}{m_1} + \frac{1}{m_2} \right)}}.\] This statistic, under the null hypothesis, approximately follows a standard normal distribution as sample sizes become large.
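As a quick numerical check, the statistic can be computed directly; here is a minimal Python sketch (the helper name `two_proportion_z` is ours, and the tree data from the exercise is used for illustration):

```python
import math

def two_proportion_z(r1, m1, r2, m2):
    """Pooled two-sample z-statistic for H0: pi1 == pi2."""
    p1, p2 = r1 / m1, r2 / m2
    pihat = (r1 + r2) / (m1 + m2)  # pooled estimate of the common proportion
    se = math.sqrt(pihat * (1 - pihat) * (1 / m1 + 1 / m2))
    return (p1 - p2) / se

# Tree data: 209 of 250 survived from nursery A, 186 of 250 from nursery B
z = two_proportion_z(209, 250, 186, 250)
# Two-sided p-value from the standard normal approximation, via math.erf
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(z, 3), round(p_value, 4))
```

The two-sided p-value uses the standard normal approximation of \(Z\) under the null hypothesis.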
04

Investigate Limit Distribution of Z-Statistic

As \(m_1, m_2 \rightarrow \infty\) with \(m_1/m_2 \rightarrow \xi\) for \(0 < \xi < 1\), write \(\pi\) for the common value of \(\pi_1 = \pi_2\). By the central limit theorem each \(P_i\) is approximately \(N(\pi, \pi(1-\pi)/m_i)\), so by independence \(P_1 - P_2\) is approximately \(N(0, \pi(1-\pi)(1/m_1 + 1/m_2))\). Since \(\widehat{\pi} \stackrel{P}{\longrightarrow} \pi\) by the law of large numbers, Slutsky's theorem lets us replace \(\pi\) by \(\widehat{\pi}\) in the standardizing denominator without changing the limit, giving \(Z \stackrel{D}{\longrightarrow} N(0,1)\).
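The two ingredients of the large-sample argument can be displayed explicitly (a sketch, with \(\pi\) denoting the common value of \(\pi_1 = \pi_2\) under the null hypothesis):

```latex
% CLT for each independent sample proportion:
P_i \;\dot\sim\; N\!\left(\pi, \frac{\pi(1-\pi)}{m_i}\right), \quad i = 1, 2,
\qquad\Longrightarrow\qquad
\frac{P_1 - P_2}{\sqrt{\pi(1-\pi)\left(1/m_1 + 1/m_2\right)}}
\stackrel{D}{\longrightarrow} N(0, 1).
% Law of large numbers plus Slutsky's theorem:
\widehat{\pi} \stackrel{P}{\longrightarrow} \pi
\qquad\Longrightarrow\qquad
Z = \frac{P_1 - P_2}{\sqrt{\widehat{\pi}(1-\widehat{\pi})\left(1/m_1 + 1/m_2\right)}}
\stackrel{D}{\longrightarrow} N(0, 1).
```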
05

Formulate Chi-Square Statistic for Contingency Table

Given the 2x2 contingency table with entries \(R_i, S_i\) where \(R_i + S_i = m_i\), the test of \(\pi_1 = \pi_2\) (homogeneity of the two proportions) uses \[X^2 = \frac{(m_1+m_2)(R_1 S_2 - R_2 S_1)^2}{m_1 m_2 (R_1 + R_2)(S_1 + S_2)}.\] Algebra shows that \(X^2 = Z^2\), and since the square of a standard normal variable is \(\chi^2_1\), the statistic asymptotically follows a \(\chi^2_1\) distribution when \(\pi_1 = \pi_2\).
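One way to see the \(\chi^2_1\) limit is through the identity \(X^2 = Z^2\); a sketch of the algebra, using only the definitions above:

```latex
% The cross term reduces to a difference of proportions:
R_1 S_2 - R_2 S_1 = R_1(m_2 - R_2) - R_2(m_1 - R_1) = m_2 R_1 - m_1 R_2
                  = m_1 m_2 (P_1 - P_2).
% The column totals are pooled counts:
R_1 + R_2 = (m_1 + m_2)\widehat{\pi}, \qquad
S_1 + S_2 = (m_1 + m_2)(1 - \widehat{\pi}).
% Substituting and cancelling, using m_1 m_2/(m_1+m_2) = 1/(1/m_1 + 1/m_2):
X^2 = \frac{(m_1 + m_2)\, m_1^2 m_2^2 (P_1 - P_2)^2}
           {m_1 m_2 (m_1 + m_2)^2\, \widehat{\pi}(1 - \widehat{\pi})}
    = \frac{(P_1 - P_2)^2}{\widehat{\pi}(1 - \widehat{\pi})(1/m_1 + 1/m_2)} = Z^2.
```

Since \(Z \stackrel{D}{\longrightarrow} N(0,1)\) and squaring is continuous, the continuous mapping theorem then gives \(X^2 \stackrel{D}{\longrightarrow} \chi^2_1\).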
06

Apply Procedure to Tree Data

For the provided data: \(R_1 = 250 - 41 = 209\) survivors and \(S_1 = 41\) deaths from nursery \(A\), and \(R_2 = 250 - 64 = 186\), \(S_2 = 64\) from nursery \(B\). Substituting gives \[X^2 = \frac{500(209 \cdot 64 - 186 \cdot 41)^2}{250 \cdot 250 \cdot 395 \cdot 105} \approx 6.38,\] which exceeds the 5% critical value \(\chi^2_1(0.95) = 3.84\), so the null hypothesis of equal survival probabilities is rejected at the 5% level.
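The shortcut formula is easy to evaluate programmatically; a minimal sketch (the function name `chi_square_2x2` is our own):

```python
def chi_square_2x2(r1, s1, r2, s2):
    """Shortcut Pearson X^2 for a 2x2 table built from two independent binomials."""
    m1, m2 = r1 + s1, r2 + s2
    num = (m1 + m2) * (r1 * s2 - r2 * s1) ** 2
    den = m1 * m2 * (r1 + r2) * (s1 + s2)
    return num / den

# Survived/died counts: nursery A 209/41, nursery B 186/64
x2 = chi_square_2x2(209, 41, 186, 64)
print(round(x2, 3))  # compare with the 5% critical value chi^2_1(0.95) = 3.841
```

For the tree data this evaluates to about 6.38, above the 5% critical value 3.84.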
07

Validate Assumptions

Check assumptions: (1) the two samples of trees are independent of one another, and trees die independently within each batch; (2) both sample sizes are large (250 each) and every expected cell count (197.5 and 52.5) comfortably exceeds 5, so the chi-square approximation is accurate; (3) the two samples are of comparable size, matching the limiting regime assumed above. Independence within a batch could fail if, say, disease spread between neighbouring trees, but absent such effects the assumptions are reasonable.


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Binomial Distribution
In a binomial distribution, we deal with situations where there are two possible outcomes, often labeled as "success" and "failure." This distribution is a cornerstone of probability theory, especially when dealing with independent trials. Imagine flipping a coin where heads are a success and tails are a failure.
The binomial distribution helps in determining the probability of obtaining a certain number of successes in a fixed number of trials. Each trial is independent, meaning the outcome of one flip doesn't affect another.
  • For example, in our exercise, the variables \( R_1 \) and \( R_2 \) are independent binomial random variables.
  • The probability of success in each trial is given by \( \pi_1 \) and \( \pi_2 \) respectively.
These concepts are pivotal when comparing the proportions of successes in two different groups, which leads us into hypothesis testing.
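The binomial probability mass function underlying these ideas can be evaluated directly; a small illustrative sketch (the helper `binomial_pmf` is ours, not from the exercise):

```python
from math import comb

def binomial_pmf(k, m, pi):
    """P(R = k) for R ~ Binomial(m, pi): k successes out of m independent trials."""
    return comb(m, k) * pi**k * (1 - pi)**(m - k)

# e.g. the probability of exactly 2 heads in 3 fair coin flips: 3 * 0.5^3 = 0.375
print(binomial_pmf(2, 3, 0.5))
```

Summing the pmf over all \(k\) from 0 to \(m\) gives 1, as any probability distribution must.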
Contingency Table Analysis
Contingency table analysis allows us to examine the relationships between categorical variables, often presented in a matrix format. A common form of this analysis is a 2x2 table.
In our exercise, the 2x2 table is formed using two independent binomial variables. This table represents observed frequencies of different combinations of outcomes from these variables.
  • To fill the table, numbers of observed successful (\( R_i \)) and unsuccessful (\( S_i \)) outcomes are arranged in rows and columns.
  • By analyzing this data, we can assess whether there's a statistically significant difference in outcomes between groups.
Contingency tables are powerful in hypothesis testing because they give us a visual and statistical method to determine if there's any association between the different categorical variables.
Chi-Square Distribution
The Chi-Square Distribution is fundamental in testing the relationship between categorical variables, often used in contingency table analysis. It helps us determine how well the observed data fit a particular distribution by calculating how much observed counts deviate from expected counts.
  • In this exercise, we compute a Chi-Square statistic using the formula provided, which considers all observed frequencies.
  • The key is to compare this calculated statistic against a Chi-Square Distribution with the appropriate degrees of freedom. Here there is 1 degree of freedom, since a 2x2 table has \((2-1)(2-1) = 1\).
  • If the computed value of \( X^2 \) exceeds the critical value from the Chi-Square table, it suggests a significant difference between groups, indicating that the null hypothesis (that there's no difference between proportions) might not hold.
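The observed-versus-expected view of the statistic can be sketched in a few lines of Python; `pearson_x2` is our own illustrative helper, applied here to the tree data:

```python
def pearson_x2(table):
    """Pearson X^2 = sum of (O - E)^2 / E over a 2x2 table given as rows."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    x2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n  # E_ij under the null hypothesis
            x2 += (table[i][j] - expected) ** 2 / expected
    return x2

# Tree data: rows = nurseries, columns = (survived, died)
print(round(pearson_x2([[209, 41], [186, 64]]), 3))
```

For a 2x2 table this reproduces the value of the shortcut formula, since the two expressions are algebraically identical.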
Statistical Inference
Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution. It is central to performing hypothesis tests and drawing conclusions from sample data about a population, with the main focus on estimating population parameters and testing hypotheses.
In the provided exercise, we use statistical inference to determine whether the survival probabilities of trees from two different nurseries are the same.
  • We start by forming hypotheses: the null hypothesis assumes the survival rates are equal, while the alternative hypothesis says they differ.
  • Next, we compute statistics, such as the Z-statistic and Chi-Square statistic, to test these hypotheses.
  • Finally, we use the results of these statistics to draw conclusions about the null hypothesis.
By analyzing the given data, statistical inference enables us to understand broader patterns and draw conclusions about the population from our sample observations.


    Most popular questions from this chapter

    Show how to use inversion to generate Bernoulli random variables. If \(0<\pi<1\), what distribution has \(\sum_{j=1}^{m} I\left(U_{j} \leq \pi\right) ?\)

    The Cholesky decomposition of a \(p \times p\) symmetric positive definite matrix \(\Omega\) is the unique lower triangular \(p \times p\) matrix \(L\) such that \(L L^{\mathrm{T}}=\Omega\). Find the distribution of \(\mu+L Z\), where \(Z\) is a vector containing a standard normal random sample \(Z_{1}, \ldots, Z_{p}\), and hence give an algorithm to generate from the multivariate normal distribution.

    Suppose that \(Y_{1}, \ldots, Y_{4}\) are independent normal variables, each with variance \(\sigma^{2}\), but with means \(\mu+\alpha+\beta+\gamma, \mu+\alpha-\beta-\gamma, \mu-\alpha+\beta-\gamma, \mu-\alpha-\beta+\gamma\). Let \(Z^{\mathrm{T}}=\frac{1}{4}\left(Y_{1}+Y_{2}+Y_{3}+Y_{4}, Y_{1}+Y_{2}-Y_{3}-Y_{4}, Y_{1}-Y_{2}+Y_{3}-Y_{4}, Y_{1}-Y_{2}-Y_{3}+Y_{4}\right)\) Calculate the mean vector and covariance matrix of \(Z\), and give the joint distribution of \(Z_{1}\) and \(V=Z_{2}^{2}+Z_{3}^{2}+Z_{4}^{2}\) when \(\alpha=\beta=\gamma=0\). What is then the distribution of \(Z_{1} /(V / 3)^{1 / 2} ?\)

    Let \(Y_{1}, \ldots, Y_{n}\) be defined by \(Y_{j}=\mu+\sigma X_{j}\), where \(X_{1}, \ldots, X_{n}\) is a random sample from a known density \(g\) with distribution function \(G\). If \(M=m(Y)\) and \(S=s(Y)\) are location and scale statistics based on \(Y_{1}, \ldots, Y_{n}\), that is, they have the properties that \(m(Y)=\mu+\sigma m(X)\) and \(s(Y)=\sigma s(X)\) for all \(X_{1}, \ldots, X_{n}, \sigma>0\) and real \(\mu\), then show that \(Z(\mu)=n^{1 / 2}(M-\mu) / S\) is a pivot. When \(n\) is odd and large, \(g\) is the standard normal density, \(M\) is the median of \(Y_{1}, \ldots, Y_{n}\) and \(S=\) IQR their interquartile range, show that \(S / 1.35 \stackrel{P}{\longrightarrow} \sigma\), and hence show that as \(n \rightarrow \infty, Z(\mu) \stackrel{D}{\longrightarrow} N\left(0, \tau^{2}\right)\), for known \(\tau>0 .\) Hence give the form of a \(95 \%\) confidence interval for \(\mu\). Compare this interval and that based on using \(Z(\mu)\) with \(M=\bar{Y}\) and \(S^{2}\) the sample variance, for the data for day 4 in Table \(2.1\).

    \(W_{i}, X_{i}, Y_{i}\), and \(Z_{i}, i=1,2\), are eight independent, normal random variables with common variance \(\sigma^{2}\) and expectations \(\mu_{W}, \mu_{X}, \mu_{Y}\) and \(\mu_{Z} .\) Find the joint distribution of the random variables $$ \begin{aligned} T_{1} &=\frac{1}{2}\left(W_{1}+W_{2}\right)-\mu_{W}, T_{2}=\frac{1}{2}\left(X_{1}+X_{2}\right)-\mu_{X} \\ T_{3} &=\frac{1}{2}\left(Y_{1}+Y_{2}\right)-\mu_{Y}, T_{4}=\frac{1}{2}\left(Z_{1}+Z_{2}\right)-\mu_{Z} \\ T_{5} &=W_{1}-W_{2}, T_{6}=X_{1}-X_{2}, T_{7}=Y_{1}-Y_{2}, T_{8}=Z_{1}-Z_{2} \end{aligned} $$ Hence obtain the distribution of $$ U=4 \frac{T_{1}^{2}+T_{2}^{2}+T_{3}^{2}+T_{4}^{2}}{T_{5}^{2}+T_{6}^{2}+T_{7}^{2}+T_{8}^{2}} $$ Show that the random variables \(U /(1+U)\) and \(1 /(1+U)\) are identically distributed, without finding their probability density functions. Find their common density function and hence determine \(\operatorname{Pr}(U \leq 2)\).
