
The logistic density with location and scale parameters \(\mu\) and \(\sigma\) is $$ f(y ; \mu, \sigma)=\frac{\exp \{(y-\mu) / \sigma\}}{\sigma[1+\exp \{(y-\mu) / \sigma\}]^{2}}, \quad -\infty<y<\infty,\ -\infty<\mu<\infty,\ \sigma>0. $$ (a) If \(Y\) has density \(f(y ; \mu, 1)\), show that the expected information for \(\mu\) is \(1 / 3\). (b) Instead of observing \(Y\), we observe the indicator \(Z\) of whether or not \(Y\) is positive. When \(\sigma=1\), show that the expected information for \(\mu\) based on \(Z\) is \(e^{\mu} /\left(1+e^{\mu}\right)^{2}\), and deduce that the maximum efficiency of sampling based on \(Z\) rather than \(Y\) is \(3 / 4\). Why is this greatest at \(\mu=0\)? (c) Find the expected information \(I(\mu, \sigma)\) based on \(Y\) when \(\sigma\) is unknown. Without doing any calculations, explain why both parameters cannot be estimated based only on \(Z\).

Short Answer

The expected information for \(\mu\) based on \(Y\) is \(\frac{1}{3}\). The expected information for \(\mu\) based on \(Z\) is \(\frac{e^{\mu}}{(1+e^{\mu})^2}\), giving a maximum efficiency of \(\frac{3}{4}\), attained at \(\mu = 0\). Because \(Z\) depends on the parameters only through \(\mu/\sigma\), the two parameters cannot both be estimated from \(Z\).

Step by step solution

01

Information from Y for μ when σ=1

The Fisher information for \( \mu \) is \( I(\mu) = \mathbb{E} \left[ \left( \frac{\partial}{\partial \mu} \log f(Y; \mu) \right)^2 \right] \). With \( \sigma = 1 \) the density is \( f(y; \mu, 1) = \frac{e^{y-\mu}}{(1+e^{y-\mu})^2} \), so \( \log f(y; \mu) = (y-\mu) - 2 \log (1+e^{y-\mu}) \) and the score is \( \frac{\partial}{\partial \mu} \log f(y;\mu) = -1 + \frac{2e^{y-\mu}}{1+e^{y-\mu}} = 1 - \frac{2}{1+e^{y-\mu}} \). To take the expectation, set \( U = 1/(1+e^{Y-\mu}) \); this equals one minus the CDF evaluated at \( Y \), so \( U \) is uniformly distributed on \( (0,1) \). Hence \( I(\mu) = \mathbb{E}\{(1-2U)^2\} = \int_0^1 (1-2u)^2 \, du = \frac{1}{3} \).
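As a quick numerical sanity check (an illustrative sketch only, using `scipy.integrate.quad` and an arbitrarily chosen value of \( \mu \)), the integral defining the expected information can be evaluated directly:

```python
import numpy as np
from scipy.integrate import quad

def score_mu(y, mu):
    # d/dmu log f(y; mu, 1) = 1 - 2/(1 + exp(y - mu))
    return 1.0 - 2.0 / (1.0 + np.exp(y - mu))

def logistic_pdf(y, mu):
    e = np.exp(y - mu)
    return e / (1.0 + e) ** 2

mu = 0.7   # arbitrary choice; the answer should not depend on mu
info, _ = quad(lambda y: score_mu(y, mu) ** 2 * logistic_pdf(y, mu), -50.0, 50.0)
print(info)   # ~ 0.3333 = 1/3
```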
02

Information from Z for μ when σ=1

When observing \( Z \) instead of \( Y \), \( Z = 1 \) if \( Y > 0 \) and \( Z = 0 \) otherwise. With \( \sigma = 1 \) the CDF of \( Y \) is \( F(y) = \frac{e^{y-\mu}}{1+e^{y-\mu}} \), so \( p = P(Z = 1) = P(Y > 0) = 1 - F(0) = \frac{e^{\mu}}{1+e^{\mu}} \). Thus \( Z \) is a Bernoulli random variable with log-likelihood \( Z \log p + (1-Z) \log (1-p) = Z\mu - \log(1+e^{\mu}) \), whose derivative with respect to \( \mu \) is \( Z - p \). The expected information is therefore \( I_Z(\mu) = \mathbb{E}\{(Z-p)^2\} = \operatorname{var}(Z) = p(1-p) = \frac{e^{\mu}}{(1+e^{\mu})^2} \).
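For a cross-check (a sketch assuming nothing beyond the Bernoulli model above), the information can be computed directly from its definition by summing the squared score over the two possible values of \( Z \):

```python
import numpy as np

def info_Z(mu):
    """Expected information for mu from Z, via E[(d/dmu log P(Z = z; mu))^2]."""
    p = np.exp(mu) / (1.0 + np.exp(mu))        # P(Z = 1) = P(Y > 0)
    score_z1 = 1.0 - p                         # d/dmu log p
    score_z0 = -p                              # d/dmu log (1 - p)
    return p * score_z1 ** 2 + (1.0 - p) * score_z0 ** 2   # = p (1 - p)

for mu in (-2.0, 0.0, 1.5):
    print(mu, info_Z(mu), np.exp(mu) / (1.0 + np.exp(mu)) ** 2)  # the two columns agree
```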
03

Efficiency of Sampling with Z

The efficiency of sampling based on \( Z \) rather than \( Y \) is the ratio of the two expected informations, \( \frac{I_Z(\mu)}{I_Y(\mu)} = \frac{e^{\mu}/(1+e^{\mu})^2}{1/3} = \frac{3 e^{\mu}}{(1+e^{\mu})^2} \). At \( \mu = 0 \), \( e^{\mu} = 1 \), so \( I_Z(0) = \frac{1}{4} \) and the efficiency is \( 3 \times \frac{1}{4} = \frac{3}{4} \); for any other \( \mu \) the ratio is smaller, so \( \frac{3}{4} \) is the maximum efficiency. It is greatest at \( \mu = 0 \) because the threshold \( 0 \) then coincides with the median of \( Y \), so that \( P(Z=1) = \frac{1}{2} \): a binary observation is most informative when its two outcomes are equally likely, whereas for large \( |\mu| \) nearly all observations fall on the same side of zero and each \( Z \) tells us little about \( \mu \).
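Tabulating the efficiency at a few values of \( \mu \) (a small illustrative sketch) shows that it peaks at \( 3/4 \) when \( \mu = 0 \) and falls away as \( |\mu| \) grows:

```python
import numpy as np

def efficiency(mu):
    # information from Z divided by information from Y (which is 1/3)
    return 3.0 * np.exp(mu) / (1.0 + np.exp(mu)) ** 2

for mu in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"mu = {mu:3.1f}   efficiency = {efficiency(mu):.3f}")
# 0.750, 0.705, 0.590, 0.315, 0.053
```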
04

Information based on Y for μ and σ unknown

When \( \sigma \) is also unknown, the expected information is the \( 2 \times 2 \) matrix with entries \( \mathbb{E}\{(\partial_{\mu}\ell)^2\} \), \( \mathbb{E}\{\partial_{\mu}\ell\,\partial_{\sigma}\ell\} \) and \( \mathbb{E}\{(\partial_{\sigma}\ell)^2\} \), where \( \ell = \log f(Y;\mu,\sigma) \). Writing \( W = (Y-\mu)/\sigma \) and \( p = e^{W}/(1+e^{W}) \), the scores are \( \partial_{\mu}\ell = (2p-1)/\sigma \) and \( \partial_{\sigma}\ell = \{W(2p-1)-1\}/\sigma \). The cross term vanishes because the logistic density is symmetric about \( \mu \), and evaluating the remaining expectations gives \[ I(\mu,\sigma) = \frac{1}{\sigma^{2}} \begin{pmatrix} \frac{1}{3} & 0 \\ 0 & \frac{3+\pi^{2}}{9} \end{pmatrix}. \]
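The entries of this matrix can be checked by numerical integration (a sketch under the parametrization above; `scipy.integrate.quad` and the integration bounds are implementation choices, not part of the exercise):

```python
import numpy as np
from scipy.integrate import quad

# With w = (y - mu)/sigma and p = e^w/(1 + e^w), the scores are
#   d log f / d mu    = (2p - 1)/sigma
#   d log f / d sigma = {w(2p - 1) - 1}/sigma,
# so I(mu, sigma) = A / sigma**2, where A does not depend on (mu, sigma).

def g(w):                       # standard logistic density, overflow-safe form
    e = np.exp(-np.abs(w))
    return e / (1.0 + e) ** 2

def u(w):                       # 2p - 1 = tanh(w/2)
    return np.tanh(w / 2.0)

a11 = quad(lambda w: u(w) ** 2 * g(w), -40, 40)[0]
a12 = quad(lambda w: u(w) * (w * u(w) - 1.0) * g(w), -40, 40)[0]
a22 = quad(lambda w: (w * u(w) - 1.0) ** 2 * g(w), -40, 40)[0]

print(np.array([[a11, a12], [a12, a22]]))
print(np.array([[1.0 / 3.0, 0.0], [0.0, (3.0 + np.pi ** 2) / 9.0]]))  # should match
```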
05

Inability to estimate both μ and σ based on Z

Based only on \( Z \), the likelihood depends on the parameters solely through \( P(Z=1) = P(Y>0) = \frac{e^{\mu/\sigma}}{1+e^{\mu/\sigma}} \), which is a function of the single quantity \( \mu/\sigma \). Any two parameter pairs with the same ratio \( \mu/\sigma \) therefore give exactly the same distribution for \( Z \), so \( \mu \) and \( \sigma \) are not separately identifiable: at best the ratio \( \mu/\sigma \) can be estimated. Intuitively, \( Z \) records only the sign of \( Y \) and discards its magnitude, which is what would be needed to learn about the scale \( \sigma \) separately from the location \( \mu \).
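A two-line illustration (with arbitrary parameter values) of why the pair is not identifiable from \( Z \): any \( (\mu, \sigma) \) with the same ratio gives the same distribution for \( Z \).

```python
import numpy as np

def p_z1(mu, sigma):
    # P(Z = 1) = P(Y > 0) = exp(mu/sigma) / (1 + exp(mu/sigma))
    r = mu / sigma
    return np.exp(r) / (1.0 + np.exp(r))

print(p_z1(1.0, 2.0), p_z1(3.0, 6.0))   # identical, so Z cannot separate mu from sigma
```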


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Fisher Information
Fisher Information is a crucial concept for understanding how much information an observable random variable carries about an unknown parameter. For the logistic distribution in this exercise, it quantifies how sharply the data can pin down the parameter in question, for example the location parameter \( \mu \). The Fisher Information is the expectation of the squared derivative of the log-likelihood function with respect to the parameter of interest.

When \( Y \) follows a logistic distribution with scale parameter equal to 1, we use its probability density function (PDF) to determine the Fisher Information for \( \mu \). It is the expected value of \[ \left( 1 - \frac{2}{1+e^{(y-\mu)}} \right)^2, \] and integrating this against the density over all \( y \) gives \( \frac{1}{3} \). This number tells us how precisely \( \mu \) can be estimated from observations of \( Y \): the larger the information, the smaller the asymptotic variance of the maximum likelihood estimator.
Parameter Estimation
Parameter Estimation refers to the process of using sample data to infer the values of parameters within a mathematical model. For the logistic distribution, the parameters \( \mu \) and \( \sigma \) must be estimated to describe the underlying distribution that the data follow.

For example, in part (b) of our exercise, instead of observing \( Y \) directly, we observe only whether \( Y \) is positive. The resulting indicator \( Z \) is a Bernoulli random variable. Even without the exact values of \( Y \), we can still estimate \( \mu \) from this binary data: setting the derivative of the Bernoulli log-likelihood to zero gives the maximum likelihood estimate, albeit with some loss of precision compared with observing \( Y \) directly.

The aim is always to find the parameter values which maximize the likelihood function, allowing us to create the best fitting model given the observed data.
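As a concrete, purely illustrative sketch of this point, the following simulation estimates \( \mu \) both from a full logistic sample and from the indicators alone when \( \sigma = 1 \); the sample size, seed, and true value are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
mu_true = 0.8
y = rng.logistic(loc=mu_true, scale=1.0, size=5000)

# MLE of mu from the indicators Z = 1{Y > 0}: invert p = e^mu / (1 + e^mu).
p_hat = np.mean(y > 0)
mu_hat_z = np.log(p_hat / (1.0 - p_hat))

# MLE of mu from the full sample, by maximizing the logistic log-likelihood.
negloglik = lambda m: -np.sum((y - m) - 2.0 * np.log1p(np.exp(y - m)))
mu_hat_y = minimize_scalar(negloglik, bounds=(-5.0, 5.0), method="bounded").x

print(mu_hat_z, mu_hat_y)   # both near 0.8; the Z-based estimate is noisier
```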
Efficiency of Sampling
Efficiency in the context of sampling and estimation refers to the quality of the statistical estimate. A more efficient estimator will give tighter estimates and provide more information per data point about the parameter.

When estimating parameters of the logistic distribution, we can ask whether the raw observations or some reduced summary of them should be used. If we use the indicator \( Z \) instead of \( Y \), the Fisher Information changes, and so does our ability to make precise estimates. The efficiency of sampling based on \( Z \) is the ratio \( \frac{I_Z(\mu)}{I_Y(\mu)} = \frac{3 e^{\mu}}{(1+e^{\mu})^2} \), which reaches its maximum of \( \frac{3}{4} \) at \( \mu = 0 \): dichotomizing at zero is most informative when zero is the median of \( Y \), so that the two outcomes of \( Z \) are equally likely.

This insight helps in ensuring that we use the best sampling technique based on our model and what we are trying to optimize in parameter estimation.
Probability Density Function
Understanding probability density functions (PDFs) is fundamental to a range of statistical models, including the logistic model used here. The PDF describes the relative likelihood that a continuous random variable takes values near a particular point.

For the logistic distribution, the PDF can be written as: \[ f(y; \mu, \sigma) = \frac{\exp((y-\mu)/\sigma)}{\sigma[1+\exp((y-\mu)/\sigma)]^2}. \]This formula gives the density of observing a specific outcome \( y \) given the parameters \( \mu \) and \( \sigma \). In our exercise, setting \(\sigma=1\) simplifies the formula and makes the calculations more straightforward.

These densities are not only crucial for understanding the relationship and spread of data points around our target parameter but also serve as a base for further calculations that involve expectations, such as finding Fisher Information.
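A minimal check (with arbitrary parameter values) that the formula above defines a proper density; it also agrees with `scipy.stats.logistic`, whose standard form \( e^{-z}/(1+e^{-z})^2 \) equals the form used here by symmetry:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import logistic

mu, sigma = -0.4, 2.0   # arbitrary values for the check

def f(y):
    e = np.exp((y - mu) / sigma)
    return e / (sigma * (1.0 + e) ** 2)

print(quad(f, -80.0, 80.0)[0])                          # ~ 1.0: a proper density
print(f(1.3), logistic.pdf(1.3, loc=mu, scale=sigma))   # same value
```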


Most popular questions from this chapter

The Laplace or double exponential distribution has density $$ f(y ; \mu, \sigma)=\frac{1}{2 \sigma} \exp (-|y-\mu| / \sigma), \quad -\infty<y<\infty,\ -\infty<\mu<\infty,\ \sigma>0. $$ Sketch the log likelihood for a typical sample, and explain why the maximum likelihood estimate is only unique when the sample size is odd. Derive the score statistic and observed information. Is maximum likelihood estimation regular for this distribution?

Find the likelihood for a random sample \(y_{1}, \ldots, y_{n}\) from the geometric density \(\operatorname{Pr}(Y=y)=\pi(1-\pi)^{y}, y=0,1, \ldots\), where \(0<\pi<1\)

In a first-order autoregressive process, \(Y_{0}, \ldots, Y_{n}\), the conditional distribution of \(Y_{j}\) given the previous observations, \(Y_{1}, \ldots, Y_{j-1}\), is normal with mean \(\alpha y_{j-1}\) and variance one. The initial observation \(Y_{0}\) has the normal distribution with mean zero and variance one. Show that the log likelihood is proportional to \(y_{0}^{2}+\sum_{j=1}^{n}\left(y_{j}-\alpha y_{j-1}\right)^{2}\), and hence find the maximum likelihood estimate of \(\alpha\) and the observed information.

Let \(Y_{1}, \ldots, Y_{n}\) and \(Z_{1}, \ldots, Z_{m}\) be two independent random samples from the \(N\left(\mu_{1}, \sigma_{1}^{2}\right)\) and \(N\left(\mu_{2}, \sigma_{2}^{2}\right)\) distributions respectively. Consider comparison of the model in which \(\sigma_{1}^{2}=\sigma_{2}^{2}\) and the model in which no restriction is placed on the variances, with no restriction on the means in either case. Show that the likelihood ratio statistic \(W_{\mathrm{p}}\) to compare these models is large when the ratio \(T=\sum\left(Y_{j}-\bar{Y}\right)^{2} / \sum\left(Z_{j}-\bar{Z}\right)^{2}\) is large or small, and that \(T\) is proportional to a random variable with the \(F\) distribution.

Data are available from \(n\) independent experiments concerning a scalar parameter \(\theta\). The log likelihood for the \(j\) th experiment may be summarized as a quadratic function, \(\ell_{j}(\theta) \doteq \hat{\ell}_{j}-\frac{1}{2} J_{j}\left(\hat{\theta}_{j}\right)\left(\theta-\hat{\theta}_{j}\right)^{2}\), where \(\hat{\theta}_{j}\) is the maximum likelihood estimate and \(J_{j}\left(\hat{\theta}_{j}\right)\) is the observed information. Show that the overall log likelihood may be summarized as a quadratic function of \(\theta\), and find the overall maximum likelihood estimate and observed information.
