Chapter 10: Problem 6
The rate of growth of an epidemic such as AIDS for a large population can be estimated fairly accurately and treated as a known function \(g(t)\) of time \(t\). In a smaller area where few cases have been observed, the rate is hard to estimate because data are scarce. However, predictions of the numbers of future cases in such an area must be made in order to allocate resources such as hospital beds. A simple assumption is that cases in the area arise in a nonhomogeneous Poisson process with rate \(\lambda g(t)\), for which the mean number of cases in period \(\left(t_{1}, t_{2}\right)\) is \(\lambda \int_{t_{1}}^{t_{2}} g(t) \, dt\). Suppose that \(N_{1}=n_{1}\) individuals with the disease have been observed in the period \((-\infty, 0)\), and that predictions are required for the number \(N_{2}\) of cases to be observed in a future period \(\left(t_{1}, t_{2}\right)\).

(a) Find the conditional distribution of \(N_{2}\) given \(N_{1}+N_{2}\), and show it to be free of \(\lambda\). Deduce that a \((1-2 \alpha)\) prediction interval \(\left(n_{-}, n_{+}\right)\) for \(N_{2}\) is found by solving approximately the equations

$$ \begin{aligned} &\alpha=\operatorname{Pr}\left(N_{2} \leq n_{-} \mid N_{1}+N_{2}=n_{1}+n_{-}\right) \\ &\alpha=\operatorname{Pr}\left(N_{2} \geq n_{+} \mid N_{1}+N_{2}=n_{1}+n_{+}\right) \end{aligned} $$

(b) Use a normal approximation to the conditional distribution in (a) to show that, for moderate to large \(n_{1}\), \(n_{-}\) and \(n_{+}\) are the solutions to the quadratic equation

$$ (1-p)^{2} n^{2}+p(p-1)\left(2 n_{1}+z_{\alpha}^{2}\right) n+n_{1} p\left\{n_{1} p-(1-p) z_{\alpha}^{2}\right\}=0 $$

where \(\Phi\left(z_{\alpha}\right)=\alpha\) and

$$ p=\int_{t_{1}}^{t_{2}} g(t) \, dt \Big/ \left\{\int_{t_{1}}^{t_{2}} g(t) \, dt+\int_{-\infty}^{0} g(t) \, dt\right\} $$

(c) Find approximate \(0.90\) prediction intervals for the special case where \(g(t)=2^{t / 2}\), so that the doubling time for the epidemic is two years, \(n_{1}=10\) cases have been observed until time 0, and \(t_{1}=0\), \(t_{2}=1\) (next year) (Cox and Davison, 1989).

(d) Show that conditional on \(A\), \(R_{1}\) has a generalized linear model density with

$$ b(\theta)=\log \left\{\sum_{u=u_{-}}^{u_{+}}\binom{m_{1}}{u}\binom{m_{0}}{a-u} e^{u \theta}\right\}, \quad u_{-}=\max \left\{0, a-m_{0}\right\}, \quad u_{+}=\min \left\{m_{1}, a\right\} $$

Deduce that a score test of \(\Delta=1\) based on data from \(n\) independent \(2 \times 2\) tables \(\left(R_{0 j}, m_{0 j}-R_{0 j} ; R_{1 j}, m_{1 j}-R_{1 j}\right)\) is obtained by treating \(\sum R_{1 j}\) as approximately normal with mean and variance

$$ \sum_{j=1}^{n} \frac{m_{1 j} a_{j}}{m_{0 j}+m_{1 j}}, \quad \sum_{j=1}^{n} \frac{m_{0 j} m_{1 j} a_{j}\left(m_{0 j}+m_{1 j}-a_{j}\right)}{\left(m_{0 j}+m_{1 j}\right)^{2}\left(m_{0 j}+m_{1 j}-1\right)} $$

When continuity-corrected, this is the Mantel-Haenszel test (Mantel and Haenszel, 1959).
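
Part (c) can be checked numerically. The following is a minimal Python sketch, not the textbook's own solution: the function name `prediction_interval` and all variable names are illustrative assumptions. It computes \(p\) from the closed-form integrals of \(g(t)=2^{t/2}\) over \((-\infty, 0)\) and \((0, 1)\), then solves the quadratic from part (b) with \(\alpha=0.05\).

```python
import math
from statistics import NormalDist


def prediction_interval(n1, p, alpha):
    """Return the two roots (n_minus, n_plus) of the part (b) quadratic.

    Hypothetical helper for illustration; n1 is the observed count,
    p the binomial probability from part (b), alpha the tail probability.
    """
    # Only z_alpha squared enters the quadratic, so its sign does not matter.
    z2 = NormalDist().inv_cdf(alpha) ** 2
    a = (1 - p) ** 2
    b = p * (p - 1) * (2 * n1 + z2)
    c = n1 * p * (n1 * p - (1 - p) * z2)
    root = math.sqrt(b * b - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)


# g(t) = 2^(t/2) = exp(k t) with k = ln(2) / 2.
k = math.log(2) / 2
past = 1 / k                       # integral of g over (-inf, 0)
future = (math.exp(k) - 1) / k     # integral of g over (0, 1)
p = future / (future + past)       # simplifies to 1 - 1/sqrt(2)

n_minus, n_plus = prediction_interval(n1=10, p=p, alpha=0.05)
print(f"p = {p:.4f}, n_minus = {n_minus:.2f}, n_plus = {n_plus:.2f}")
```

Under these assumptions \(p = 1 - 1/\sqrt{2} \approx 0.293\), and the roots come out at about 0.7 and 8.7, which suggests somewhere between roughly 1 and 9 new cases in the next year. Since only \(z_{\alpha}^{2}\) appears in the quadratic, the same equation yields both limits at once.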