Let \boldsymbol{\beta} \in \mathbb{R}^1 and let \ell_2(\boldsymbol{\beta}) = \sum_{i=1}^2 \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} + c for some constant c. For the Poisson GLM, recall that \phi = 1 and b(\theta) = e^\theta.
What is \nabla \ell_2(\beta) for any \beta?
Let
X_1^T = X_1 = 0.1, Y_1 = 1,
X_2^T = X_2 = 0.2, Y_2 = 2,
\boldsymbol{\beta} = 0,
Step size t=0.01.
What is the new \beta = \beta + t \cdot \nabla \ell _2(\beta )?
To find the gradient of the function \ell_2(\beta), we will differentiate with respect to the vector \boldsymbol{\beta}.
First, let's simplify the expression inside the summation:
\frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} = \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{1} = Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}
Now, let's compute the gradient:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \nabla \left(Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}\right)
Since the gradient is a linear operator, we can distribute it inside the summation:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(\nabla (Y_i X_i^T \boldsymbol{\beta}) - \nabla e^{X_i^T \boldsymbol{\beta}}\right)
Now, let's compute the gradients of the individual terms:
\nabla (Y_i X_i^T \boldsymbol{\beta}) = Y_i X_i
and
\nabla e^{X_i^T \boldsymbol{\beta}} = e^{X_i^T \boldsymbol{\beta}} X_i
Therefore, the gradient of \ell_2(\beta) is:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(Y_i X_i - e^{X_i^T \boldsymbol{\beta}} X_i\right)
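For the scalar case here, the gradient can be sketched directly in code. The helper name grad_ell2 is illustrative, not from the problem; X and Y hold the two covariates and counts, and \phi = 1 is baked in as in the Poisson GLM:

```python
import math

# Illustrative helper: gradient of the Poisson log-likelihood term
# sum_i (Y_i * X_i * beta - exp(X_i * beta)) for scalar beta, with phi = 1.
def grad_ell2(beta, X, Y):
    return sum(y * x - math.exp(x * beta) * x for x, y in zip(X, Y))

# With the data of this problem: X = [0.1, 0.2], Y = [1, 2], beta = 0.
print(grad_ell2(0.0, [0.1, 0.2], [1, 2]))  # 0.2 (up to floating point)
```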
Plugging in the given values:
X_1^T = X_1 = 0.1, Y_1 = 1,
X_2^T = X_2 = 0.2, Y_2 = 2,
\boldsymbol{\beta} = 0,
we have:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(Y_i X_i - e^{X_i^T \boldsymbol{\beta}} X_i\right)
= \left(1 \cdot 0.1 - e^{0.1 \cdot 0} \cdot 0.1\right) + \left(2 \cdot 0.2 - e^{0.2 \cdot 0} \cdot 0.2\right)
= 0.1 - 0.1 + 0.4 - 0.2
= 0.2
Therefore, \nabla \ell_2(\beta) = 0.2.
Finally, to find the new \boldsymbol{\beta}, we can use the update rule:
\boldsymbol{\beta}_{\text{new}} = \boldsymbol{\beta} + t \cdot \nabla \ell_2(\beta)
Plugging in t = 0.01 and \nabla \ell_2(\beta) = 0.2:
\boldsymbol{\beta}_{\text{new}} = 0 + 0.01 \cdot 0.2 = 0.002
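The whole update can be sketched end to end. This is a minimal one-step gradient-ascent sketch under the problem's assumptions (scalar \beta, \phi = 1); the function name grad_ell2 is illustrative:

```python
import math

# Illustrative gradient of the scalar Poisson log-likelihood, phi = 1.
def grad_ell2(beta, X, Y):
    return sum(y * x - math.exp(x * beta) * x for x, y in zip(X, Y))

# Data and step size from the problem statement.
X, Y = [0.1, 0.2], [1, 2]
beta, t = 0.0, 0.01

# One gradient-ascent step: beta_new = beta + t * grad.
beta_new = beta + t * grad_ell2(beta, X, Y)
print(beta_new)  # 0.002 (up to floating point)
```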