We continue with the LR-test on the HIP study.

Question

Let Y_T and Y_C be the numbers of breast cancer deaths in the treatment and control groups respectively. Assuming these are independent of each other, the probability of having y_t breast cancer deaths in the treatment group and y_c breast cancer deaths in the control group is the product

\mathbf{P}(Y_T=y_t, Y_C=y_c) = \mathbf{P}(Y_T=y_t) \mathbf{P}(Y_C=y_c).
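Recall the HIP mammography study data: 39 breast cancer deaths among 31000 women in the treatment group and 63 among 31000 women in the control group.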

We use the binomial model for Y_T and Y_C:

Y_T \sim \text{Binom}(31000, \pi_T)
Y_C \sim \text{Binom}(31000, \pi_C)
The likelihood ratio test statistic is

\Lambda(y_T, y_C) = -2\log \frac{\max_{\Theta_0} \mathbf{P}(y_T,y_C;\pi_T,\pi_C)}{\max_{\Theta_A} \mathbf{P}(y_T,y_C;\pi_T,\pi_C)}
= -2\log \frac{\max_{\pi_T=\pi_C=\pi\in[0,1]} \mathbf{P}(y_T,y_C;\pi)}{\max_{\pi_T\neq\pi_C} \mathbf{P}(y_T,y_C;\pi_T,\pi_C)}
= -2\log \frac{\max_{\pi_T=\pi_C=\pi\in[0,1]} \mathbf{P}\left(\text{Binom}(31000,\pi) = y_T\right) \mathbf{P}\left(\text{Binom}(31000,\pi) = y_C\right)}{\max_{\pi_T\neq\pi_C} \mathbf{P}\left(\text{Binom}(31000,\pi_T) = y_T\right) \mathbf{P}\left(\text{Binom}(31000,\pi_C) = y_C\right)}
= -2\log \frac{\mathbf{P}\left(\text{Binom}(31000,{\color{blue}{\hat{\pi}^{\text{MLE}}}}) = y_T\right) \mathbf{P}\left(\text{Binom}(31000,{\color{blue}{\hat{\pi}^{\text{MLE}}}}) = y_C\right)}{\mathbf{P}\left(\text{Binom}(31000,{\color{blue}{\hat{\pi}^{\text{MLE}}_T}}) = y_T\right) \mathbf{P}\left(\text{Binom}(31000,{\color{blue}{\hat{\pi}^{\text{MLE}}_C}}) = y_C\right)}
where we have used \mathbf{P}\left(\text{Binom}(n,p)=y\right) to denote the probability that a binomial variable with parameters n, p takes the value y.

Based on the observed data, find the parameters (\pi_T, \pi_C) that maximize the numerator and the denominator in the definition of the test statistic \Lambda. That is, find the 3 different maximum likelihood estimates (in blue) in the expression above.

The value of \pi that maximizes \mathbf{P}(\text{Binom}(31000,\pi) = 39)\mathbf{P}(\text{Binom}(31000,\pi) = 63):

{\color{blue}{\hat{\pi}^{\text{MLE}}}} =\quad

The value of \pi_T that maximizes \mathbf{P}(\text{Binom}(31000,\pi_T) = 39):

{\color{blue}{\hat{\pi}^{\text{MLE}}_T}} =\quad

The value of \pi_C that maximizes \mathbf{P}(\text{Binom}(31000,\pi_C) = 63):

{\color{blue}{\hat{\pi}^{\text{MLE}}_C}} =\quad

What is the value of the test statistic \Lambda based on the observed data? (Enter the value to 3 decimal places.)

Answer 1

To find the maximum likelihood estimates (MLEs), we maximize the numerator over a single common rate \pi (the null hypothesis \pi_T = \pi_C) and the denominator over \pi_T and \pi_C separately.

For the numerator, the value of \pi that maximizes \mathbf{P}(\text{Binom}(31000,\pi) = 39)\mathbf{P}(\text{Binom}(31000,\pi) = 63) is the MLE for the overall breast cancer death rate. We can calculate it using the MLE formula for the binomial distribution:

{\color{blue}{\hat{\pi }^{\text {MLE}}}} = \frac{y_T + y_C}{31000 + 31000}

Substituting the values y_T = 39 and y_C = 63, we have:

{\color{blue}{\hat{\pi }^{\text {MLE}}}} = \frac{39 + 63}{31000 + 31000} = \frac{102}{62000} = 0.00164516

For the denominator, we want to find the values of \pi_T and \pi_C that maximize the probabilities \mathbf{P}(\text{Binom}(31000,\pi_T) = 39) and \mathbf{P}(\text{Binom}(31000,\pi_C) = 63) respectively.

To find the MLE for \pi_T, we maximize the binomial likelihood:

\mathbf{P}(\text{Binom}(31000,\pi_T) = 39) = \binom{31000}{39} \pi_T^{39} (1-\pi_T)^{31000-39}

Taking the logarithm, differentiating with respect to \pi_T, and setting the derivative to zero gives:

\frac{39}{\pi_T} - \frac{31000-39}{1-\pi_T} = 0

Solving for \pi_T, we have:

\hat{\pi}^{\text{MLE}}_T = \frac{39}{31000} \approx 0.00125806

This is the standard closed-form binomial MLE (observed count divided by number of trials), so no numerical solver is needed.

Similarly, for \pi_C, the value that maximizes \mathbf{P}(\text{Binom}(31000,\pi_C) = 63) is \hat{\pi}^{\text{MLE}}_C = \frac{63}{31000} \approx 0.00203226.

Therefore, the 3 different maximum likelihood estimates for the parameters (\pi_T, \pi_C) are:

{\color{blue}{\hat{\pi }^{\text {MLE}}}} = 0.00164516
{\color{blue}{\hat{\pi }^{\text {MLE}}_T}} = 0.00125806
{\color{blue}{\hat{\pi }^{\text {MLE}}_C}} = 0.00203226
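
As a quick check, all three estimates follow from the closed-form binomial MLE (observed count divided by number of trials). The following is a minimal Python sketch; the variable names are chosen here for illustration and are not part of the original problem.

```python
# Observed HIP data as stated in the question: 31000 women per group.
n = 31000
y_t, y_c = 39, 63

# Under the null hypothesis both groups share one rate, so the counts are pooled.
pi_hat = (y_t + y_c) / (2 * n)

# Under the alternative each group has its own rate.
pi_hat_t = y_t / n
pi_hat_c = y_c / n

print(f"{pi_hat:.8f} {pi_hat_t:.8f} {pi_hat_c:.8f}")
# prints: 0.00164516 0.00125806 0.00203226
```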

To find the value of the test statistic \Lambda based on the observed data, we substitute these values into the formula:

\Lambda(y_T, y_C) = -2\log \frac{\mathbf{P}(\text{Binom}(31000,\hat{\pi }^{\text {MLE}}) = y_T)\mathbf{P}(\text{Binom}(31000,\hat{\pi }^{\text {MLE}}) = y_C) } {\mathbf{P}(\text{Binom}(31000,\hat{\pi }^{\text {MLE}}_T) = y_T)\mathbf{P}(\text{Binom}(31000,\hat{\pi }^{\text {MLE}}_C) = y_C)}

Substituting the observed values y_T = 39, y_C = 63, and the MLEs, we can calculate the test statistic:

\Lambda(39, 63) = -2\log \frac{\mathbf{P}(\text{Binom}(31000,0.00164516) = 39)\mathbf{P}(\text{Binom}(31000,0.00164516) = 63) } {\mathbf{P}(\text{Binom}(31000,0.00125806) = 39)\mathbf{P}(\text{Binom}(31000,0.00203226) = 63)}

Evaluating this expression gives \Lambda(39, 63) \approx 5.710.
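
The expression above can be evaluated numerically. The sketch below works on the log scale, because the individual binomial probabilities involve very large binomial coefficients; it uses only the Python standard library. The helper names `binom_logpmf` and `lr_statistic` are hypothetical, introduced here for illustration.

```python
import math

def binom_logpmf(y, n, p):
    """Log of P(Binom(n, p) = y), using log-gamma for the binomial coefficient."""
    log_coef = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_coef + y * math.log(p) + (n - y) * math.log1p(-p)

def lr_statistic(y_t, y_c, n):
    """Likelihood ratio statistic for H0: pi_T = pi_C, with n subjects per group."""
    pi_hat = (y_t + y_c) / (2 * n)          # pooled MLE under the null
    pi_hat_t, pi_hat_c = y_t / n, y_c / n   # separate MLEs under the alternative
    log_num = binom_logpmf(y_t, n, pi_hat) + binom_logpmf(y_c, n, pi_hat)
    log_den = binom_logpmf(y_t, n, pi_hat_t) + binom_logpmf(y_c, n, pi_hat_c)
    return -2.0 * (log_num - log_den)

lam = lr_statistic(39, 63, 31000)
print(f"{lam:.3f}")  # prints: 5.710
```

The binomial coefficients are identical in the numerator and denominator, so they cancel in the ratio; they are kept here only so that the code mirrors the formula term by term.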

Answer 2

To find the maximum likelihood estimates (MLE) for the parameters (\(\pi_T, \pi_C\)) that maximize the numerator and denominator in the definition of the test statistic \(\Lambda\), we need to calculate the individual MLEs for each parameter.

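1. MLE for the numerator (a single common \(\pi\) under the null hypothesis, so the two groups are pooled):
\(\hat{\pi}^{\text{MLE}} = \frac{39 + 63}{31000 + 31000} = \frac{102}{62000} \approx 0.00164516\)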

2. MLE for \(\pi_T\):
\(\hat{\pi}_T^{\text{MLE}} = \frac{39}{31000}\)

3. MLE for \(\pi_C\):
\(\hat{\pi}_C^{\text{MLE}} = \frac{63}{31000}\)

Now, let's calculate the value of the test statistic \(\Lambda\) based on the observed data.

\(\Lambda(y_T, y_C) = -2\log\left(\frac{\mathbf{P}(\text{Binom}(31000,\hat{\pi}^{\text{MLE}}) = y_T) \cdot \mathbf{P}(\text{Binom}(31000,\hat{\pi}^{\text{MLE}}) = y_C)}{\mathbf{P}(\text{Binom}(31000,\hat{\pi}_T^{\text{MLE}}) = y_T) \cdot \mathbf{P}(\text{Binom}(31000,\hat{\pi}_C^{\text{MLE}}) = y_C)}\right)\)

Substituting the MLEs and the observed data (\(y_T = 39\) and \(y_C = 63\)) gives \(\Lambda \approx 5.710\).
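
For a hand calculation, it may help to note that the binomial coefficients cancel between the numerator and the denominator, so the statistic reduces to a sum of log-ratios. A worked version follows, writing \(n = 31000\) as shorthand (a symbol introduced here for brevity) and using the pooled estimate \(\hat{\pi}^{\text{MLE}} = \frac{102}{62000}\) together with \(\hat{\pi}_T^{\text{MLE}} = \frac{39}{31000}\) and \(\hat{\pi}_C^{\text{MLE}} = \frac{63}{31000}\):

\[
\Lambda = 2\left[ y_T \log\frac{\hat{\pi}_T^{\text{MLE}}}{\hat{\pi}^{\text{MLE}}} + (n - y_T)\log\frac{1-\hat{\pi}_T^{\text{MLE}}}{1-\hat{\pi}^{\text{MLE}}} + y_C \log\frac{\hat{\pi}_C^{\text{MLE}}}{\hat{\pi}^{\text{MLE}}} + (n - y_C)\log\frac{1-\hat{\pi}_C^{\text{MLE}}}{1-\hat{\pi}^{\text{MLE}}} \right]
\]

With \(y_T = 39\) and \(y_C = 63\), each ratio simplifies to a ratio of integers:

\[
\Lambda = 2\left[ 39\log\frac{39}{51} + 30961\log\frac{30961}{30949} + 63\log\frac{63}{51} + 30937\log\frac{30937}{30949} \right] \approx 5.710
\]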