Let \boldsymbol{\beta} \in \mathbb{R}^1 and let \ell_2(\boldsymbol{\beta}) = \sum_{i=1}^2 \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} + c for some constant c. For the Poisson GLM, recall that \phi = 1 and b(\theta) = e^\theta.
What is \nabla \ell_2(\beta) for any \beta?
Let
X_1^T = X_1 = 0.1, Y_1 = 1,
X_2^T = X_2 = 0.2, Y_2 = 2,
\boldsymbol{\beta} = 0,
Step size t=0.01.
What is the new \beta = \beta + t \cdot \nabla \ell _2(\beta )?
To find the gradient of the function \ell_2(\beta), we will differentiate with respect to the vector \boldsymbol{\beta}.
First, let's simplify the expression inside the summation:
\frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} = \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{1} = Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}
Now, let's compute the gradient:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \nabla \left(Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}\right)
Since the gradient is a linear operator, we can distribute it inside the summation:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(\nabla (Y_i X_i^T \boldsymbol{\beta}) - \nabla e^{X_i^T \boldsymbol{\beta}}\right)
Now, let's compute the gradients of the individual terms:
\nabla (Y_i X_i^T \boldsymbol{\beta}) = Y_i X_i
and
\nabla e^{X_i^T \boldsymbol{\beta}} = e^{X_i^T \boldsymbol{\beta}} X_i
Therefore, the gradient of \ell_2(\beta) is:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(Y_i X_i - e^{X_i^T \boldsymbol{\beta}} X_i\right)
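For the scalar case here, the gradient can be sketched directly in code. The helper name grad_ell2 is illustrative, not from the problem; X and Y hold the two covariates and counts, and \phi = 1 is baked in as in the Poisson GLM:

```python
import math

# Illustrative helper: gradient of the Poisson log-likelihood term
# sum_i (Y_i * X_i * beta - exp(X_i * beta)) for scalar beta, with phi = 1.
def grad_ell2(beta, X, Y):
    return sum(y * x - math.exp(x * beta) * x for x, y in zip(X, Y))

# With the data of this problem: X = [0.1, 0.2], Y = [1, 2], beta = 0.
print(grad_ell2(0.0, [0.1, 0.2], [1, 2]))  # 0.2 (up to floating point)
```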
Plugging in the given values:
X_1^T = X_1 = 0.1, Y_1 = 1,
X_2^T = X_2 = 0.2, Y_2 = 2,
\boldsymbol{\beta} = 0,
we have:
\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(Y_i X_i - e^{X_i^T \boldsymbol{\beta}} X_i\right)
= \left(1 \cdot 0.1 - e^{0.1 \cdot 0} \cdot 0.1\right) + \left(2 \cdot 0.2 - e^{0.2 \cdot 0} \cdot 0.2\right)
= 0.1 - 0.1 + 0.4 - 0.2
= 0.2
Therefore, \nabla \ell_2(\beta) = 0.2.
Finally, to find the new \boldsymbol{\beta}, we can use the update rule:
\boldsymbol{\beta}_{\text{new}} = \boldsymbol{\beta} + t \cdot \nabla \ell_2(\beta)
Plugging in t = 0.01 and \nabla \ell_2(\beta) = 0.2:
\boldsymbol{\beta}_{\text{new}} = 0 + 0.01 \cdot 0.2 = 0.002
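The whole update can be sketched end to end. This is a minimal one-step gradient-ascent sketch under the problem's assumptions (scalar \beta, \phi = 1); the function name grad_ell2 is illustrative:

```python
import math

# Illustrative gradient of the scalar Poisson log-likelihood, phi = 1.
def grad_ell2(beta, X, Y):
    return sum(y * x - math.exp(x * beta) * x for x, y in zip(X, Y))

# Data and step size from the problem statement.
X, Y = [0.1, 0.2], [1, 2]
beta, t = 0.0, 0.01

# One gradient-ascent step: beta_new = beta + t * grad.
beta_new = beta + t * grad_ell2(beta, X, Y)
print(beta_new)  # 0.002 (up to floating point)
```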