Question

In the setting of deterministic design for linear regression, we assume that the design matrix \mathbb {X} is deterministic instead of random. The model still prescribes \mathbf Y= \mathbb {X} {\boldsymbol \beta }+ {\boldsymbol \varepsilon }, where {\boldsymbol \varepsilon }= (\varepsilon _1, \ldots , \varepsilon _ n) is a random vector that represents noise. Take note that the only random object on the right-hand side is {\boldsymbol \varepsilon }, and that \mathbf Y is still random.

For the rest of this section, we will always assume (\mathbb {X}^ T \mathbb {X})^{-1} exists; i.e. \mathrm{rank}(\mathbb {X}) = p.

Recall that the Least-Squares Estimator \hat{{\boldsymbol \beta }} has the formula

\hat{{\boldsymbol \beta }} = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbf Y.

If we assume that the vector {\boldsymbol \varepsilon } is a random vector with mean \mathbb E[{\boldsymbol \varepsilon }] = 0, then in the deterministic design setting: "The LSE \hat{{\boldsymbol \beta }} is a random variable, with mean..." (choose all that apply)

0

(\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]

\mathbb {X}^ T \mathbb {X} \beta

\beta

\epsilon

Answer 1

The correct answers are:

- (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]
- \beta

Since \mathbb {X} is deterministic and {\boldsymbol \varepsilon } is random, \mathbf Y is random, and so is the LSE \hat{{\boldsymbol \beta }} = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbf Y. The matrix (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T is deterministic, so by linearity of expectation \mathbb E[\hat{{\boldsymbol \beta }}] = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]. Because \mathbb E[{\boldsymbol \varepsilon }] = 0, we have \mathbb E[\mathbf Y] = \mathbb {X} {\boldsymbol \beta }, and therefore \mathbb E[\hat{{\boldsymbol \beta }}] = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb {X} {\boldsymbol \beta } = {\boldsymbol \beta }. In other words, the LSE is unbiased, and the two correct options are the same vector written in two ways.

- Option 0 is incorrect: the mean of \hat{{\boldsymbol \beta }} is {\boldsymbol \beta }, which equals 0 only in the special case {\boldsymbol \beta } = 0.
- Option (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y] is correct: it follows from linearity of expectation, since (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T is deterministic.
- Option \mathbb {X}^ T \mathbb {X} \beta is incorrect: it equals \mathbb {X}^ T \mathbb E[\mathbf Y] and is missing the factor (\mathbb {X}^ T \mathbb {X})^{-1}. It coincides with {\boldsymbol \beta } only in special cases, for example when \mathbb {X}^ T \mathbb {X} is the identity.
- Option \beta is correct: substituting \mathbb E[\mathbf Y] = \mathbb {X} {\boldsymbol \beta } into the previous expression gives exactly {\boldsymbol \beta }. That {\boldsymbol \beta } is deterministic is no obstacle, since a mean is always deterministic.
- Option \epsilon is incorrect: {\boldsymbol \varepsilon } is a random vector in \mathbb R^ n, whereas the mean of \hat{{\boldsymbol \beta }} is a deterministic vector in \mathbb R^ p.
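The mean of \hat{{\boldsymbol \beta }} under a deterministic design can also be checked numerically. The following is a minimal sketch, not part of the original question: the design matrix, the value of beta, the Gaussian noise, and the helper name simulate_lse_mean are all illustrative assumptions. It keeps \mathbb {X} fixed, redraws only the zero-mean noise, and averages the resulting estimates.

```python
import numpy as np


def simulate_lse_mean(X, beta, n_draws=20000, sigma=1.0, seed=0):
    """Average the LSE over many noise draws while X stays fixed.

    Assumption for illustration: the noise is i.i.d. Gaussian with mean 0.
    Only the zero mean matters for the expectation of the estimator.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    eps = rng.normal(0.0, sigma, size=(n_draws, n))  # zero-mean noise draws
    Y = X @ beta + eps  # each row is one realization of Y = X beta + eps
    # Solve (X^T X) B = X^T Y^T; column j of B is the LSE for draw j.
    B = np.linalg.solve(X.T @ X, X.T @ Y.T)
    return B.mean(axis=1)


# Hypothetical fixed design with n = 50 rows and p = 3 columns (full rank).
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x, x**2])
beta = np.array([2.0, -1.0, 0.5])

mean_hat = simulate_lse_mean(X, beta)
print("empirical mean of the LSE:", mean_hat)
print("true beta:                ", beta)
```

With these settings the empirical mean should land close to beta rather than to 0, up to Monte Carlo error.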

Answer 2

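In the deterministic design setting, since we assume that the vector {\boldsymbol \varepsilon } has mean 0, we get \mathbb E[\mathbf Y] = \mathbb {X} {\boldsymbol \beta }, so the LSE \hat{{\boldsymbol \beta }} is a random variable with mean {\boldsymbol \beta }, not 0. The correct answers would be:

(\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]

\beta

These two expressions are equal, because (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb {X} {\boldsymbol \beta } = {\boldsymbol \beta }. It is also important to note that {\boldsymbol \beta } is not a random variable, whereas {\boldsymbol \varepsilon } is; the randomness of \hat{{\boldsymbol \beta }} comes entirely from {\boldsymbol \varepsilon }.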
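The deterministic candidate options can also be compared exactly, without any simulation, by evaluating each one on a small concrete design. This is only an illustrative sketch: the matrix X and the vector beta below are made-up values, and the option \epsilon is left out because it is a random vector in \mathbb R^ n rather than a fixed vector in \mathbb R^ p.

```python
import numpy as np

# Hypothetical fixed design (n = 4, p = 2, full column rank) and coefficients.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
beta = np.array([1.0, 2.0])

EY = X @ beta   # E[Y] = X beta, because E[eps] = 0
G = X.T @ X     # Gram matrix X^T X, invertible since rank(X) = p

# Each candidate expression for the mean of the LSE, evaluated exactly.
candidates = {
    "0": np.zeros_like(beta),
    "(X^T X)^{-1} X^T E[Y]": np.linalg.solve(G, X.T @ EY),
    "X^T X beta": G @ beta,
    "beta": beta,
}

for name, value in candidates.items():
    print(f"{name}: {value}  equals beta? {np.allclose(value, beta)}")
```

On this example only the expressions (X^T X)^{-1} X^T E[Y] and beta agree with each other, which matches the algebraic identity (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb {X} {\boldsymbol \beta } = {\boldsymbol \beta }.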