Return to the original model. We now introduce a Poisson intensity parameter \(\lambda_t\) for every time point and denote by \(\theta_t\) the parameter \(\eta\) that gives the canonical exponential family representation, as above. We choose a linear model connecting the time points \(t\) with the canonical parameter \(\theta\) of the Poisson distribution, i.e.,

\theta_t = a + bt.

In other words, we choose a generalized linear model with Poisson distribution and its canonical link function. This also means that, conditioned on \(t\), we assume the \(Y_t\) to be independent.
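To spell out why the log link is canonical here, write the Poisson PMF in exponential family form:

P(Y = y \mid \lambda) = \frac{\lambda^y e^{-\lambda}}{y!} = \exp\left( y \log \lambda - \lambda - \log(y!) \right)

so the canonical parameter is \(\theta = \log \lambda\), and inverting gives \(\lambda = e^{\theta}\).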

Imagine we observe the following data:

\(t_1 = 1\): 1 outbreak
\(t_2 = 2\): 3 outbreaks
\(t_3 = 4\): 10 outbreaks

We want to produce a maximum likelihood estimator for \((a,b)\). To this end, write down the log likelihood \(\ell(a,b)\) of the model for the provided three observations at \(t_1\), \(t_2\), and \(t_3\) (plug in their values).

\(\ell(a,b) =\)

What is its gradient? Enter your answer as a pair of derivatives.

\(\partial_a \ell(a,b) =\)

\(\partial_b \ell(a,b) =\)

To calculate the log likelihood, we sum the log probabilities of observing the given outbreak counts at each time point. The log probability of observing \(Y_t\) outbreaks at time \(t\) is given by:

\log P(Y_t | \lambda_t) = Y_t \log \lambda_t - \lambda_t - \log(Y_t!)

Using the linear model \(\theta_t = a + bt\) and inverting the canonical link \(\theta_t = \log \lambda_t\), we can express \(\lambda_t\) in terms of \(a\) and \(b\) as:

\lambda_t = e^{a + bt}
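Substituting this into the log probability above gives each observation's contribution directly in terms of \(a\) and \(b\):

\log P(Y_t | \lambda_t) = Y_t (a + bt) - e^{a + bt} - \log(Y_t!)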

Plugging in the observed values, we have:

\log P(Y_1 | \lambda_1) = 1 \log (e^{a + b}) - e^{a + b} - \log(1!)
\log P(Y_2 | \lambda_2) = 3 \log (e^{a + 2b}) - e^{a + 2b} - \log(3!)
\log P(Y_3 | \lambda_3) = 10 \log (e^{a + 4b}) - e^{a + 4b} - \log(10!)

The log likelihood is the sum of these log probabilities. Using \(\log(e^{a+bt}) = a + bt\) and \(\log(1!) = 0\), it simplifies to

\ell(a, b) = (a + b) + 3(a + 2b) + 10(a + 4b) - e^{a + b} - e^{a + 2b} - e^{a + 4b} - \log(3!) - \log(10!)
           = 14a + 47b - e^{a + b} - e^{a + 2b} - e^{a + 4b} - \log(3! \, 10!)
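As a quick numerical sanity check, here is a minimal sketch (assuming NumPy and SciPy are available) that evaluates this closed-form expression and compares it against SciPy's Poisson log-PMF at an arbitrary point:

    import numpy as np
    from scipy.special import gammaln
    from scipy.stats import poisson

    t = np.array([1.0, 2.0, 4.0])   # observation times t_1, t_2, t_3
    y = np.array([1, 3, 10])        # observed outbreak counts

    def loglik(a, b):
        """Log likelihood: sum_i [ y_i (a + b t_i) - e^{a + b t_i} - log(y_i!) ]."""
        theta = a + b * t           # canonical parameter theta_t = a + b t
        return np.sum(y * theta - np.exp(theta) - gammaln(y + 1))

    # Check against SciPy's Poisson log-PMF with rate lambda_t = e^{theta_t}:
    a0, b0 = 0.1, 0.5
    assert np.isclose(loglik(a0, b0),
                      poisson.logpmf(y, np.exp(a0 + b0 * t)).sum())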

Now let's calculate the gradient of the log likelihood with respect to \(a\) and \(b\). Recall that \(\lambda_1 = e^{a+b}\), \(\lambda_2 = e^{a+2b}\), and \(\lambda_3 = e^{a+4b}\).

Taking the partial derivative of \(\ell(a, b)\) with respect to \(a\): the linear part \(14a + 47b\) contributes \(14\), each exponential satisfies \(\partial_a e^{a+bt} = e^{a+bt}\), and the factorial terms are constants that vanish. Hence

\frac{\partial \ell(a, b)}{\partial a} = 14 - e^{a + b} - e^{a + 2b} - e^{a + 4b}

Equivalently, \(\partial_a \ell = \sum_i (y_i - \lambda_{t_i})\): the total observed count minus the total expected count.

Similarly, taking the partial derivative of \(\ell(a, b)\) with respect to \(b\): the linear part contributes \(47\), and each exponential now picks up its time point as a factor, since \(\partial_b e^{a+bt} = t \, e^{a+bt}\). Hence

\frac{\partial \ell(a, b)}{\partial b} = 47 - e^{a + b} - 2e^{a + 2b} - 4e^{a + 4b}

Equivalently, \(\partial_b \ell = \sum_i t_i (y_i - \lambda_{t_i})\).

Therefore, the gradient of the log likelihood is:

\frac{\partial \ell(a, b)}{\partial a} = 14 - e^{a + b} - e^{a + 2b} - e^{a + 4b}
\frac{\partial \ell(a, b)}{\partial b} = 47 - e^{a + b} - 2e^{a + 2b} - 4e^{a + 4b}

Setting both components to zero gives the first-order conditions for the maximum likelihood estimator \((\hat{a}, \hat{b})\); these equations have no closed-form solution and must be solved numerically.
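To verify the analytic gradient and actually locate the maximizer, here is a small sketch (again assuming NumPy and SciPy) that checks \(\partial_a \ell\) against a finite difference and then maximizes \(\ell\) by minimizing \(-\ell\):

    import numpy as np
    from scipy.optimize import minimize

    t = np.array([1.0, 2.0, 4.0])
    y = np.array([1, 3, 10])

    def neg_loglik(p):
        a, b = p
        lam = np.exp(a + b * t)                    # lambda_t = e^{a + b t}
        return -np.sum(y * (a + b * t) - lam)      # constant log(y_i!) terms dropped

    def neg_grad(p):
        a, b = p
        lam = np.exp(a + b * t)
        return -np.array([np.sum(y - lam),         # d ell/da = 14 - sum_i lambda_i
                          np.sum(t * (y - lam))])  # d ell/db = 47 - sum_i t_i lambda_i

    # Finite-difference check of d ell/da at an arbitrary point:
    p0, eps = np.array([0.1, 0.5]), 1e-6
    fd = (neg_loglik(p0 + [eps, 0]) - neg_loglik(p0 - [eps, 0])) / (2 * eps)
    assert np.isclose(fd, neg_grad(p0)[0])

    res = minimize(neg_loglik, x0=np.zeros(2), jac=neg_grad, method="BFGS")
    a_hat, b_hat = res.x                           # maximum likelihood estimates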

As a cross-check, we can derive the same result starting directly from the probability mass function (PMF) of a Poisson distribution with rate \(\lambda_t\):

\[P(Y_t = y_t \mid \lambda_t) = \frac{e^{-\lambda_t} \lambda_t^{y_t}}{y_t!}\]

Note that the linear model specifies the canonical parameter, \(\theta_t = \log \lambda_t = a + bt\), so the rate is \(\lambda_t = e^{a+bt}\) rather than \(a + bt\) itself. The log likelihood is therefore

\[\ell(a,b) = \sum_{i=1}^{3} \log\left(\frac{e^{-e^{a+bt_i}} \left(e^{a+bt_i}\right)^{y_i}}{y_i!}\right) = \sum_{i=1}^{3} \left( y_i (a + bt_i) - e^{a+bt_i} - \log(y_i!) \right)\]

Substituting the provided values of \(t_1\), \(t_2\), \(t_3\) and the corresponding outbreak counts recovers

\[\ell(a,b) = 14a + 47b - e^{a+b} - e^{a+2b} - e^{a+4b} - \log(3! \, 10!)\]

and taking the partial derivatives with respect to \(a\) and \(b\) reproduces the gradient computed above.
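Since this is exactly a Poisson GLM with the canonical log link, the same MLE can also be obtained from an off-the-shelf fitter; a minimal sketch assuming statsmodels is installed:

    import numpy as np
    import statsmodels.api as sm

    t = np.array([1.0, 2.0, 4.0])
    y = np.array([1, 3, 10])

    X = sm.add_constant(t)                              # design matrix with columns [1, t]
    model = sm.GLM(y, X, family=sm.families.Poisson())  # canonical log link is the default
    fit = model.fit()
    a_hat, b_hat = fit.params                           # MLE of (a, b)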