
(a) Show that when data \((X, Y)\) are available, but with values of \(Y\) missing at random, the log likelihood contribution can be written $$ \ell(\theta) \equiv I \log f(Y \mid X ; \theta)+\log f(X ; \theta) $$ and deduce that the expected information for \(\theta\) depends on the missingness mechanism but that the observed information does not.

(b) Consider binary pairs \((X, Y)\) with indicator \(I\) equal to zero when \(Y\) is missing; \(X\) is always seen. Their joint distribution is given by $$ \operatorname{Pr}(Y=1 \mid X=0)=\theta_{0}, \quad \operatorname{Pr}(Y=1 \mid X=1)=\theta_{1}, \quad \operatorname{Pr}(X=1)=\lambda, $$ while the missingness mechanism is $$ \operatorname{Pr}(I=1 \mid X=0)=\eta_{0}, \quad \operatorname{Pr}(I=1 \mid X=1)=\eta_{1}. $$

(i) Show that the likelihood contribution from \((X, Y, I)\) is $$ \begin{aligned} &\left[\left\{\theta_{1}^{Y}\left(1-\theta_{1}\right)^{1-Y}\right\}^{X}\left\{\theta_{0}^{Y}\left(1-\theta_{0}\right)^{1-Y}\right\}^{1-X}\right]^{I} \\ &\quad \times\left\{\eta_{0}^{I}\left(1-\eta_{0}\right)^{1-I}\right\}^{1-X}\left\{\eta_{1}^{I}\left(1-\eta_{1}\right)^{1-I}\right\}^{X} \times \lambda^{X}(1-\lambda)^{1-X}. \end{aligned} $$ Deduce that the observed information for \(\theta_{1}\) based on a random sample of size \(n\) is $$ -\frac{\partial^{2} \ell\left(\theta_{0}, \theta_{1}\right)}{\partial \theta_{1}^{2}}=\sum_{j=1}^{n} I_{j} X_{j}\left\{\frac{Y_{j}}{\theta_{1}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{1}\right)^{2}}\right\}. $$ Give corresponding expressions for \(\partial^{2} \ell\left(\theta_{0}, \theta_{1}\right) / \partial \theta_{0}^{2}\) and \(\partial^{2} \ell\left(\theta_{0}, \theta_{1}\right) / \partial \theta_{0} \partial \theta_{1}\).

(ii) Statistician A calculates the expected information treating \(I_{1}, \ldots, I_{n}\) as fixed and thereby ignores the missing data mechanism. Show that he gets \(i_{A}\left(\theta_{1}, \theta_{1}\right)=M \lambda /\left\{\theta_{1}\left(1-\theta_{1}\right)\right\}\), where \(M=\sum I_{j}\), and find the corresponding quantities \(i_{A}\left(\theta_{0}, \theta_{1}\right)\) and \(i_{A}\left(\theta_{0}, \theta_{0}\right)\). If he uses this procedure for many sets of data, deduce that on average \(M\) is replaced by \(n \operatorname{Pr}(I=1)=n\left\{\lambda \eta_{1}+(1-\lambda) \eta_{0}\right\}\).

(iii) Statistician B calculates the expected information taking into account the missingness mechanism. Show that she gets \(i_{B}\left(\theta_{1}, \theta_{1}\right)=n \lambda \eta_{1} /\left\{\theta_{1}\left(1-\theta_{1}\right)\right\}\), and obtain \(i_{B}\left(\theta_{0}, \theta_{1}\right)\) and \(i_{B}\left(\theta_{0}, \theta_{0}\right)\).

(iv) Show that A and B get the same expected information matrices only if \(Y\) is missing completely at random. Does this accord with the discussion above?

(c) Statistician C argues that expected information should never be used in data analysis: even if the data actually observed are complete, unless it can be guaranteed that data could not be missing at random for any reason, every expected information calculation should involve every potential missingness mechanism. Such a guarantee is impossible in practice, so no expected information calculation is ever correct. Do you agree?

(Kenward and Molenberghs, 1998)

Short Answer

Expert verified
Under missingness at random the observed information does not depend on the missingness mechanism, but the expected information does; Statisticians A and B obtain the same expected information only when the data are missing completely at random, so expected information should be used with caution, although Statistician C's blanket rejection of it goes too far.

Step by step solution

01

Understand the Likelihood Contribution

Here \(I\) is the indicator that \(Y\) is observed, so when \(I = 0\) the first term, which involves \(Y\), drops out. The expression \(\ell(\theta) = I \log f(Y \mid X ; \theta) + \log f(X ; \theta)\) separates the contribution of the (possibly missing) response \(Y\) from that of the always-observed \(X\); under missingness at random the mechanism generating \(I\) contributes only a factor that does not involve \(\theta\), as sketched below.
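A brief sketch of why this holds (assuming missingness at random, so that \(\operatorname{Pr}(I=0 \mid X, Y)=\operatorname{Pr}(I=0 \mid X)\) and the mechanism does not involve \(\theta\)): when \(Y\) is missing its density integrates out, since \(\int f(y \mid X ; \theta) \operatorname{Pr}(I=0 \mid X)\, d y=\operatorname{Pr}(I=0 \mid X)\). The contribution of \((X, I)\), together with \(Y\) when \(I=1\), is therefore $$ f(X ; \theta)\, \operatorname{Pr}(I \mid X)\, \{f(Y \mid X ; \theta)\}^{I}, $$ and because \(\operatorname{Pr}(I \mid X)\) is free of \(\theta\), taking logs gives \(\ell(\theta) \equiv I \log f(Y \mid X ; \theta)+\log f(X ; \theta)\) up to an additive constant.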
02

Evaluate Expected and Observed Information

The observed information is minus the second derivative of \(\ell(\theta)\) evaluated at the data actually seen, so it depends only on the realized values of \((X, Y, I)\). The expected information additionally averages over \(I\), and since the distribution of \(I\) is the missingness mechanism, that mechanism enters the expected but not the observed information under the MAR (missing at random) assumption.
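Concretely (continuing the sketch above), the observed information is $$ J(\theta)=-\frac{\partial^{2} \ell(\theta)}{\partial \theta^{2}}=-I\, \frac{\partial^{2} \log f(Y \mid X ; \theta)}{\partial \theta^{2}}-\frac{\partial^{2} \log f(X ; \theta)}{\partial \theta^{2}}, $$ a function of the realized \((X, Y, I)\) only, whereas the expected information \(i(\theta)=\mathrm{E}\{J(\theta)\}\) requires averaging over \(I\), and \(\mathrm{E}(I \mid X)=\operatorname{Pr}(I=1 \mid X)\) is precisely the missingness mechanism.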
03

Formulate Likelihood for (X, Y, I)

Because the missingness depends only on \(X\), the joint density of \((X, Y, I)\) factors as \(\{f(Y \mid X)\}^{I} \times \operatorname{Pr}(I \mid X) \times f(X)\). Substituting the Bernoulli probabilities \(\theta_{0}, \theta_{1}\) for \(Y\) given \(X\), \(\eta_{0}, \eta_{1}\) for \(I\) given \(X\), and \(\lambda\) for \(X\), with each factor raised to the appropriate indicator power, gives the likelihood contribution displayed in the problem.
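Taking logs, the part of the contribution that involves \((\theta_{0}, \theta_{1})\) is $$ I\left[X\left\{Y \log \theta_{1}+(1-Y) \log \left(1-\theta_{1}\right)\right\}+(1-X)\left\{Y \log \theta_{0}+(1-Y) \log \left(1-\theta_{0}\right)\right\}\right], $$ with the remaining terms involving only \(\eta_{0}, \eta_{1}\) and \(\lambda\).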
04

Derive Observed Information

Take logs and differentiate twice with respect to \(\theta_{1}\): only the term multiplied by \(I_{j} X_{j}\) involves \(\theta_{1}\), and minus its second derivative is the displayed observed information. The analogous term multiplied by \(I_{j}(1-X_{j})\) gives the second derivative with respect to \(\theta_{0}\), and the cross derivative is zero because no term contains both parameters.
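In detail, for a sample of size \(n\), $$ \frac{\partial \ell}{\partial \theta_{1}}=\sum_{j=1}^{n} I_{j} X_{j}\left\{\frac{Y_{j}}{\theta_{1}}-\frac{1-Y_{j}}{1-\theta_{1}}\right\}, \qquad -\frac{\partial^{2} \ell}{\partial \theta_{1}^{2}}=\sum_{j=1}^{n} I_{j} X_{j}\left\{\frac{Y_{j}}{\theta_{1}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{1}\right)^{2}}\right\}, $$ and by the same argument $$ -\frac{\partial^{2} \ell}{\partial \theta_{0}^{2}}=\sum_{j=1}^{n} I_{j}\left(1-X_{j}\right)\left\{\frac{Y_{j}}{\theta_{0}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{0}\right)^{2}}\right\}, \qquad \frac{\partial^{2} \ell}{\partial \theta_{0}\, \partial \theta_{1}}=0, $$ the cross derivative vanishing because no term of \(\ell\) involves both parameters.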
05

Calculate Expected Information (Statistician A)

Statistician A conditions on the observed missingness pattern, treating \(I_{1}, \ldots, I_{n}\) as fixed constants, and takes the expectation of the observed information over \((X_{j}, Y_{j})\) only. The number of complete cases \(M=\sum I_{j}\) then plays the role of the sample size in his expected information for \(\theta_{1}\) and \(\theta_{0}\), as sketched below.
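A sketch of the calculation: holding \(I_{1}, \ldots, I_{n}\) fixed and using \(\mathrm{E}(Y_{j} \mid X_{j}=1)=\theta_{1}\) and \(\operatorname{Pr}(X_{j}=1)=\lambda\), $$ i_{A}\left(\theta_{1}, \theta_{1}\right)=\sum_{j=1}^{n} I_{j}\, \mathrm{E}\left[X_{j}\left\{\frac{Y_{j}}{\theta_{1}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{1}\right)^{2}}\right\}\right]=\sum_{j=1}^{n} I_{j}\, \lambda\left\{\frac{1}{\theta_{1}}+\frac{1}{1-\theta_{1}}\right\}=\frac{M \lambda}{\theta_{1}\left(1-\theta_{1}\right)}, $$ with \(i_{A}\left(\theta_{0}, \theta_{0}\right)=M(1-\lambda) /\left\{\theta_{0}\left(1-\theta_{0}\right)\right\}\) and \(i_{A}\left(\theta_{0}, \theta_{1}\right)=0\). Over repeated data sets, \(\mathrm{E}(M)=n \operatorname{Pr}(I=1)=n\left\{\lambda \eta_{1}+(1-\lambda) \eta_{0}\right\}\).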
06

Calculate Expected Information (Statistician B)

Statistician B averages over the missingness indicators as well, so her calculation combines the sample size \(n\), the distribution of \(X\) through \(\lambda\), and the observation probabilities \(\eta_{0}\) and \(\eta_{1}\); each \(\theta\) parameter picks up the observation probability attached to the corresponding value of \(X\), as shown below.
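Averaging over the missingness indicators as well, with \(\mathrm{E}(I_{j} \mid X_{j}=1)=\eta_{1}\) and \(\mathrm{E}(I_{j} \mid X_{j}=0)=\eta_{0}\), $$ i_{B}\left(\theta_{1}, \theta_{1}\right)=\sum_{j=1}^{n} \mathrm{E}\left[I_{j} X_{j}\left\{\frac{Y_{j}}{\theta_{1}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{1}\right)^{2}}\right\}\right]=\frac{n \lambda \eta_{1}}{\theta_{1}\left(1-\theta_{1}\right)}, \qquad i_{B}\left(\theta_{0}, \theta_{0}\right)=\frac{n(1-\lambda) \eta_{0}}{\theta_{0}\left(1-\theta_{0}\right)}, \qquad i_{B}\left(\theta_{0}, \theta_{1}\right)=0. $$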
07

Compare Statistician A and B Results

Verify that the expected information matrices of A and B coincide only when \(Y\) is missing completely at random: averaged over data sets, A replaces \(M\) by \(n\{\lambda \eta_{1}+(1-\lambda)\eta_{0}\}\), and his matrix then equals B's only if \(\eta_{0}=\eta_{1}\), that is, the probability that \(Y\) is observed does not depend on \(X\). This accords with part (a): the observed information is unaffected by the mechanism, whereas the expected information is not.
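Comparing the diagonal entries with \(M\) replaced by its average \(n\left\{\lambda \eta_{1}+(1-\lambda) \eta_{0}\right\}\), $$ i_{A}=i_{B} \iff \lambda \eta_{1}+(1-\lambda) \eta_{0}=\eta_{1} \text{ and } \lambda \eta_{1}+(1-\lambda) \eta_{0}=\eta_{0} \iff \eta_{0}=\eta_{1} \quad(0<\lambda<1), $$ which is exactly the missing-completely-at-random condition.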
08

Discuss Statistician C's Perspective

Statistician C's position is too extreme. The example shows that expected information can mislead when the missingness mechanism is ignored, but this argues for conditioning on the data actually obtained, that is, for using the observed information, rather than for declaring every expected information calculation incorrect. When the realized data are complete and missingness is at random, inference based on observed information is valid without modelling every conceivable mechanism, so one can disagree with C while taking the warning about expected information seriously.


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Missing Data Mechanism
In statistical inference, understanding the missing data mechanism is crucial. It refers to the process that leads to data being incomplete. Specifically, this mechanism characterizes why certain data points are missing and influences how we account for them. There are different types of missing data mechanisms like Missing Completely at Random (MCAR), Missing at Random (MAR), and Not Missing at Random (NMAR).

For instance, in the given exercise each pair \((X, Y)\) comes with an indicator \(I\) recording whether \(Y\) is observed, and the missingness mechanism determines how the likelihood is constructed. When data are MAR, the probability of data being missing depends only on the observed data. Therefore, the missingness mechanism directly affects the expected information but not the observed information (the three mechanism types are written out in symbols below). This distinction highlights the significance of understanding the missing data mechanism, as it affects both the statistical analysis and the choice of methods for handling incomplete data.
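In symbols, with \(I\) the indicator that \(Y\) is observed (a standard summary added here for reference): $$ \text{MCAR: } \operatorname{Pr}(I=1 \mid X, Y)=\operatorname{Pr}(I=1), \qquad \text{MAR: } \operatorname{Pr}(I=1 \mid X, Y)=\operatorname{Pr}(I=1 \mid X), \qquad \text{NMAR: } \operatorname{Pr}(I=1 \mid X, Y) \text{ depends on } Y. $$ In part (b) of the exercise the mechanism depends on \(X\) only, so the data are MAR, and MCAR corresponds to the special case \(\eta_{0}=\eta_{1}\).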
Likelihood Functions
Likelihood functions are a cornerstone of statistical inference, providing a method to estimate parameters of a statistical model. The likelihood function represents the probability of observed data given specific model parameters. When we have missing data, especially in cases as presented in the problem, the likelihood function needs to be modified.

In our example, the contribution to the likelihood depends on whether \(Y\) is observed. The formula \(\ell(\theta) = I \log f(Y \mid X ; \theta) + \log f(X ; \theta)\) separates the contribution of the response from that of the covariate, and the indicator \(I\) ensures that the term involving \(Y\) is included only when \(Y\) is actually seen. The likelihood therefore uses all of the information in the data that were observed, which is what is needed for parameter estimation under missingness at random.
Observed Information
Observed information is minus the second derivative of the log-likelihood with respect to the parameter of interest, evaluated at the data in hand. It measures how much the data actually observed tell us about a parameter. Unlike expected information, observed information does not involve the missingness mechanism: in missing at random (MAR) settings it relies strictly on the available data.

In the provided solution, the expression \[-\frac{\partial^{2} \ell\left(\theta_{0}, \theta_{1}\right)}{\partial \theta_{1}^{2}}=\sum_{j=1}^{n} I_{j} X_{j}\left\{\frac{Y_{j}}{\theta_{1}^{2}}+\frac{1-Y_{j}}{\left(1-\theta_{1}\right)^{2}}\right\}\]gives us the observed information for \(\theta_{1}\). This formula highlights how only the seen data \(X\) and \(Y\) (when \(I=1\)) contribute to the information measure, thereby limiting the analysis to what is directly observed.
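As a purely illustrative sketch (not part of the original solution), the Python code below evaluates this observed information from arrays of indicators and responses; the function name, parameter values, and simulated mechanism are assumptions chosen only for the example.

```python
import numpy as np

def observed_info_theta1(I, X, Y, theta1):
    """Observed information for theta1: sum of Y/theta1^2 + (1-Y)/(1-theta1)^2
    over the complete cases with I = 1 and X = 1."""
    Y = np.where(I == 1, Y, 0)  # Y is irrelevant (and may be missing) when I = 0
    return np.sum(I * X * (Y / theta1**2 + (1 - Y) / (1 - theta1)**2))

# Simulate one data set from the model in part (b) -- illustrative values only
rng = np.random.default_rng(0)
n, lam, theta0, theta1, eta0, eta1 = 10_000, 0.4, 0.3, 0.6, 0.9, 0.5
X = rng.binomial(1, lam, n)
Y = rng.binomial(1, np.where(X == 1, theta1, theta0))
I = rng.binomial(1, np.where(X == 1, eta1, eta0))  # Y is observed iff I = 1

theta1_hat = Y[(I == 1) & (X == 1)].mean()  # complete-case MLE of theta1
print(observed_info_theta1(I, X, Y, theta1_hat))
```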
Expected Information
Expected information, or Fisher information, is the expectation of the observed information, equivalently the variance of the score (the first derivative of the log-likelihood). Because it averages over the full distribution of the data, it takes the missingness mechanism into account. In a statistical model it helps in assessing the efficiency of estimators and is widely used in large-sample inference.

In this scenario, we evaluate expected information differently depending on whether we consider the missingness mechanism. Statistician A treats the missingness indicators as fixed constants and ignores the mechanism that generated them, so his calculation is conditional on the observed pattern of missingness. Statistician B averages over the mechanism as well and computes \[i_{B}\left(\theta_{1}, \theta_{1}\right)=n \lambda \eta_{1} /\left\{\theta_{1}\left(1-\theta_{1}\right)\right\},\] showing explicitly how the probability of observing \(Y\) enters the pre-data expected information; on average the two calculations agree only when the data are missing completely at random.
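To make the comparison concrete, here is a small hypothetical numerical check (the parameter values are illustrative assumptions, not taken from the text): A's expected information for \(\theta_1\), with \(M\) replaced by its average \(n\{\lambda\eta_1+(1-\lambda)\eta_0\}\), matches B's only when \(\eta_0 = \eta_1\).

```python
def info_A_avg(n, lam, theta1, eta0, eta1):
    # Statistician A's i_A(theta1, theta1) with M replaced by its mean n*Pr(I = 1)
    M_bar = n * (lam * eta1 + (1 - lam) * eta0)
    return M_bar * lam / (theta1 * (1 - theta1))

def info_B(n, lam, theta1, eta0, eta1):
    # Statistician B's i_B(theta1, theta1) = n*lam*eta1 / {theta1*(1 - theta1)}
    return n * lam * eta1 / (theta1 * (1 - theta1))

n, lam, theta1 = 1_000, 0.4, 0.6
print(info_A_avg(n, lam, theta1, 0.9, 0.5), info_B(n, lam, theta1, 0.9, 0.5))  # MAR: the two differ
print(info_A_avg(n, lam, theta1, 0.7, 0.7), info_B(n, lam, theta1, 0.7, 0.7))  # MCAR (eta0 == eta1): they agree
```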


Most popular questions from this chapter

In a competing risks model with \(k=2\), write $$ \begin{aligned} \operatorname{Pr}(Y \leq y) &=\operatorname{Pr}(Y \leq y \mid I=1) \operatorname{Pr}(I=1)+\operatorname{Pr}(Y \leq y \mid I=2) \operatorname{Pr}(I=2) \\ &=p F_{1}(y)+(1-p) F_{2}(y), \end{aligned} $$ say. Hence find the cause-specific hazard functions \(h_{1}\) and \(h_{2}\), and express \(F_{1}, F_{2}\) and \(p\) in terms of them. Show that the likelihood for an uncensored sample may be written $$ p^{r}(1-p)^{n-r} \prod_{j=1}^{r} f_{1}\left(y_{j}\right) \prod_{j=r+1}^{n} f_{2}\left(y_{j}\right) $$ and find the likelihood when there is censoring. If \(f\left(y_{1} \mid y_{2}\right)\) and \(f\left(y_{2} \mid y_{1}\right)\) are arbitrary densities with support \(\left[y_{2}, \infty\right)\) and \(\left[y_{1}, \infty\right)\), show that the joint density $$ f\left(y_{1}, y_{2}\right)= \begin{cases}p f_{1}\left(y_{1}\right) f\left(y_{2} \mid y_{1}\right), & y_{1} \leq y_{2} \\ (1-p) f_{2}\left(y_{2}\right) f\left(y_{1} \mid y_{2}\right), & y_{1}>y_{2}\end{cases} $$ produces the same likelihoods. Deduce that the joint density is not identifiable.

Consider data from the straight-line regression model with \(n\) observations and $$ x_{j}= \begin{cases}0, & j=1, \ldots, m \\ 1, & \text { otherwise }\end{cases} $$ where \(m \leq n .\) Give a careful interpretation of the parameters \(\beta_{0}\) and \(\beta_{1}\), and find their least squares estimates. For what value(s) of \(m\) is \(\operatorname{var}\left(\widehat{\beta}_{1}\right)\) minimized, and for which maximized? Do your results make qualitative sense?

(a) Suppose that \(Y_{1}\) and \(Y_{2}\) have gamma densities (2.7) with parameters \(\lambda, \kappa_{1}\) and \(\lambda, \kappa_{2}\). Show that the conditional density of \(Y_{1}\) given \(Y_{1}+Y_{2}=s\) is $$ \frac{\Gamma\left(\kappa_{1}+\kappa_{2}\right)}{s^{\kappa_{1}+\kappa_{2}-1} \Gamma\left(\kappa_{1}\right) \Gamma\left(\kappa_{2}\right)} u^{\kappa_{1}-1}(s-u)^{\kappa_{2}-1}, \quad 0<u<s, $$ and establish that this is an exponential family. Give its mean and variance. (b) Show that \(Y_{1} /\left(Y_{1}+Y_{2}\right)\) has the beta density. (c) Discuss how you would use samples of form \(y_{1} /\left(y_{1}+y_{2}\right)\) to check the fit of this model with known \(\kappa_{1}\) and \(\kappa_{2}\).

Use the relation \(\mathcal{F}(y)=\exp \left\{-\int_{0}^{y} h(u)\, d u\right\}\) between the survivor and hazard functions to find the survivor functions corresponding to the following hazards: (a) \(h(y)=\lambda\); (b) \(h(y)=\lambda y^{\alpha}\); (c) \(h(y)=\alpha y^{\kappa-1} /\left(\beta+y^{\kappa}\right)\). In each case state what the distribution is. Show that \(\mathrm{E}\{1 / h(Y)\}=\mathrm{E}(Y)\) and hence find the means in (a), (b), and (c).

Suppose \(Y=\tau \varepsilon\), where \(\tau \in \mathbb{R}_{+}\) and \(\varepsilon\) is a random variable with known density \(f\). Show that this scale model is a group transformation model with free action \(g_{\tau}(y)=\tau y\). Show that \(s_{1}(Y)=\bar{Y}\) and \(s_{2}(Y)=\left(\sum Y_{j}^{2}\right)^{1 / 2}\) are equivariant and find the corresponding maximal invariants. Sketch the orbits when \(n=2\).
