Although the [mathjaxinline]\, X_ i \,[/mathjaxinline] are discrete, we can also use a logistic regression model to analyze the data. That is, now we assume
[mathjax]Y_ i | X_ i \sim \textsf{Ber}\left( \frac{1}{1 + e^{-(X_ i \beta _1 + \beta _0)}} \right),[/mathjax]
for [mathjaxinline]\, \beta _0, \beta _1 \in \mathbb {R} \,[/mathjaxinline], and that given [mathjaxinline]\, X_ i \,[/mathjaxinline], the [mathjaxinline]\, Y_ i \,[/mathjaxinline] are independent.
Calculate the maximum likelihood estimator [mathjaxinline]\, \widehat{\beta }_0 \,[/mathjaxinline], [mathjaxinline]\, \widehat{\beta }_1 \,[/mathjaxinline] for [mathjaxinline]\, \beta _0 \,[/mathjaxinline] and [mathjaxinline]\, \beta _1 \,[/mathjaxinline], where we again assume that all [mathjaxinline]\, f_{kl} > 0 \,[/mathjaxinline]. Express your answer in terms of [mathjaxinline]f_{00}[/mathjaxinline] (enter “A"), [mathjaxinline]f_{01}[/mathjaxinline] (enter “B"), [mathjaxinline]f_{10}[/mathjaxinline] (enter “C"), [mathjaxinline]f_{11}[/mathjaxinline] (enter “D") and [mathjaxinline]n[/mathjaxinline].
[mathjaxinline]\widehat{\beta }_{0}[/mathjaxinline]
ln(B/A)
[mathjaxinline]\widehat{\beta }_{1}[/mathjaxinline]
ln((A*D)/(B*C))
To find the maximum likelihood estimators, we maximize the likelihood function
[mathjax]L(\beta_0, \beta_1) = \prod_{i=1}^n \left(\frac{1}{1 + e^{-(X_i \beta_1 + \beta_0)}}\right)^{Y_i}\left(\frac{e^{-(X_i \beta_1 + \beta_0)}}{1 + e^{-(X_i \beta_1 + \beta_0)}}\right)^{1-Y_i}.[/mathjax]
It is easier to work with the log-likelihood:
[mathjax]\log L(\beta_0, \beta_1) = \sum_{i=1}^n \left[ Y_i \log\left(\frac{1}{1 + e^{-(X_i \beta_1 + \beta_0)}}\right) + (1-Y_i) \log\left(\frac{e^{-(X_i \beta_1 + \beta_0)}}{1 + e^{-(X_i \beta_1 + \beta_0)}}\right) \right][/mathjax]
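Since each [mathjaxinline]X_i[/mathjaxinline] and [mathjaxinline]Y_i[/mathjaxinline] is binary, this log-likelihood depends on the sample only through the four cell counts. A minimal Python sketch (the function name and the use of A, B, C, D for [mathjaxinline]f_{00}, f_{01}, f_{10}, f_{11}[/mathjaxinline] follow the answer-entry conventions above):

```python
import math

def log_likelihood(b0, b1, A, B, C, D):
    """Log-likelihood of the logistic model with binary X and Y.

    A = f00, B = f01, C = f10, D = f11 count the (X, Y) cells, so the
    sum over observations collapses to a weighted sum over four cells.
    """
    def ll_cell(x, y, count):
        p = 1.0 / (1.0 + math.exp(-(x * b1 + b0)))  # P(Y = 1 | X = x)
        return count * (y * math.log(p) + (1 - y) * math.log(1 - p))

    return (ll_cell(0, 0, A) + ll_cell(0, 1, B)
            + ll_cell(1, 0, C) + ll_cell(1, 1, D))
```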
Taking partial derivatives of the log-likelihood with respect to [mathjaxinline]\beta_0[/mathjaxinline] and [mathjaxinline]\beta_1[/mathjaxinline] and setting them to zero gives the score equations. Writing [mathjaxinline]p_i = \frac{1}{1 + e^{-(X_i \beta_1 + \beta_0)}}[/mathjaxinline] for the success probability of observation [mathjaxinline]i[/mathjaxinline]:
[mathjax]\frac{\partial \log L}{\partial \beta_0} = \sum_{i=1}^n (Y_i - p_i) = 0[/mathjax]
[mathjax]\frac{\partial \log L}{\partial \beta_1} = \sum_{i=1}^n X_i (Y_i - p_i) = 0[/mathjax]
Because [mathjaxinline]X_i \in \{0, 1\}[/mathjaxinline], [mathjaxinline]p_i[/mathjaxinline] takes only two values:
[mathjax]p_i = \begin{cases} p_0 := \dfrac{1}{1 + e^{-\beta_0}} & \text{if } X_i = 0, \\ p_1 := \dfrac{1}{1 + e^{-(\beta_1 + \beta_0)}} & \text{if } X_i = 1. \end{cases}[/mathjax]
In terms of the cell counts [mathjaxinline]f_{kl} = \#\{ i : X_i = k, Y_i = l \}[/mathjaxinline], the second score equation involves only the observations with [mathjaxinline]X_i = 1[/mathjaxinline]:
[mathjax]f_{11} - (f_{10} + f_{11})\, p_1 = 0 \quad \Longrightarrow \quad \widehat{p}_1 = \frac{f_{11}}{f_{10} + f_{11}} = \frac{D}{C + D}.[/mathjax]
Subtracting the second score equation from the first leaves only the observations with [mathjaxinline]X_i = 0[/mathjaxinline]:
[mathjax]f_{01} - (f_{00} + f_{01})\, p_0 = 0 \quad \Longrightarrow \quad \widehat{p}_0 = \frac{f_{01}}{f_{00} + f_{01}} = \frac{B}{A + B}.[/mathjax]
Inverting the logistic function via [mathjaxinline]\beta_0 = \log \frac{p_0}{1 - p_0}[/mathjaxinline] and [mathjaxinline]\beta_1 + \beta_0 = \log \frac{p_1}{1 - p_1}[/mathjaxinline] yields
[mathjax]\widehat{\beta}_0 = \log \frac{f_{01}}{f_{00}} = \log \left( \frac{B}{A} \right),[/mathjax]
[mathjax]\widehat{\beta}_1 = \log \frac{f_{11}}{f_{10}} - \log \frac{f_{01}}{f_{00}} = \log \left( \frac{A D}{B C} \right).[/mathjax]
Both estimators are well defined because all [mathjaxinline]f_{kl} > 0[/mathjaxinline].
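The closed forms can be verified numerically by checking that both score equations vanish at [mathjaxinline](\widehat{\beta}_0, \widehat{\beta}_1)[/mathjaxinline]; a small Python sketch (the cell counts 10, 20, 30, 40 are arbitrary example values):

```python
import math

def mle(A, B, C, D):
    """Closed-form logistic-regression MLE from the 2x2 cell counts."""
    b0 = math.log(B / A)              # logit of P(Y=1 | X=0) = B/(A+B)
    b1 = math.log((A * D) / (B * C))  # log odds ratio of the 2x2 table
    return b0, b1

def scores(b0, b1, A, B, C, D):
    """The two score equations; both should vanish at the MLE."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    p0, p1 = sig(b0), sig(b0 + b1)    # fitted P(Y=1 | X=0), P(Y=1 | X=1)
    s0 = (B + D) - (A + B) * p0 - (C + D) * p1   # sum of Y_i - p_i
    s1 = D - (C + D) * p1                        # sum of X_i (Y_i - p_i)
    return s0, s1

b0_hat, b1_hat = mle(10, 20, 30, 40)
s0, s1 = scores(b0_hat, b1_hat, 10, 20, 30, 40)
```

With these counts the fitted conditional probabilities are [mathjaxinline]\frac{20}{30}[/mathjaxinline] and [mathjaxinline]\frac{40}{70}[/mathjaxinline], and both score equations evaluate to zero up to rounding.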
Note that although the question only provides the frequencies [mathjaxinline]f_{00}, f_{01}, f_{10}, f_{11}[/mathjaxinline] rather than the individual pairs [mathjaxinline](X_i, Y_i)[/mathjaxinline], this is all the information the MLE needs: because [mathjaxinline]X_i[/mathjaxinline] is binary, the log-likelihood depends on the data only through these four cell counts, which are therefore sufficient statistics. The fitted model reproduces the empirical conditional frequencies, [mathjaxinline]\widehat{\mathbb{P}}(Y = 1 | X = 0) = \frac{B}{A + B}[/mathjaxinline] and [mathjaxinline]\widehat{\mathbb{P}}(Y = 1 | X = 1) = \frac{D}{C + D}[/mathjaxinline], as expected since a logistic model with two free parameters saturates the [mathjaxinline]2 \times 2[/mathjaxinline] table.
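As a final check, we can simulate from the model and recover [mathjaxinline]\beta_0, \beta_1[/mathjaxinline] from the four counts alone; in this sketch the design [mathjaxinline]X_i \sim \textsf{Ber}(1/2)[/mathjaxinline] and the true values [mathjaxinline]\beta_0 = -1, \beta_1 = 2[/mathjaxinline] are assumptions made purely for illustration:

```python
import math
import random

def simulate_counts(n, b0, b1, seed=0):
    """Draw (X_i, Y_i) pairs from the logistic model and tabulate them.

    X_i ~ Ber(1/2) is an illustration-only assumption; the estimator
    uses the cell counts and does not depend on the marginal law of X.
    """
    rng = random.Random(seed)
    f = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p = 1.0 / (1.0 + math.exp(-(x * b1 + b0)))  # P(Y=1 | X=x)
        y = 1 if rng.random() < p else 0
        f[(x, y)] += 1
    return f

f = simulate_counts(20000, b0=-1.0, b1=2.0)
A, B, C, D = f[(0, 0)], f[(0, 1)], f[(1, 0)], f[(1, 1)]
b0_hat = math.log(B / A)               # estimate of beta_0
b1_hat = math.log((A * D) / (B * C))   # estimate of beta_1
```

With 20,000 draws the estimates land close to the true values, reflecting that each empirical conditional frequency concentrates around its population counterpart.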