In this problem, we will explore the intersection of Bayesian and frequentist inference. Let X_1, X_2, \ldots, X_n \stackrel{\text{i.i.d.}}{\sim} \textsf{N}(0, \theta), for some unknown positive number \theta, which is our parameter of interest. Suppose that we are unable to come up with a prior distribution for \theta.

Let's take a Bayesian approach here to arrive at an estimator.

Perform the following steps:

Compute the Jeffreys prior.

Use Bayes' formula to compute the posterior distribution.

From the posterior distribution, compute the Bayesian estimator of \theta. Recall that this is defined in lecture to be the mean of the posterior distribution.

What is the Bayesian estimator \hat{\theta }^{\text {Bayes}}?

(Enter Sigma_i(X_i) for \displaystyle \sum _{i=1}^{n} X_ i and Sigma_i(X_i^2) for \displaystyle \sum _{i=1}^{n} X_ i^2. Do not worry if the parser does not render properly; the grader works independently. If you wish to have proper rendering, enclose Sigma_i(X_i) and Sigma_i(X_i^2) by brackets. )

\hat{\theta }^{\text {Bayes}}=
In this Bayesian problem, which, if any, of the prior or the posterior, is proper?

The prior only.

The posterior only.

Both the prior and the posterior.

Neither the prior nor the posterior.

In this problem, we are asked to compute the Bayesian estimator \(\hat{\theta}^{\text{Bayes}}\) for the parameter \(\theta\) using Jeffreys' prior for the normal distribution.

1. Jeffreys' prior for a normal distribution with known mean and unknown variance \(\theta\) is given by \(p(\theta) \propto \frac{1}{\theta}\).

2. Using Bayes' formula, the posterior distribution is given by:
\[p(\theta | x) \propto p(x | \theta) \cdot p(\theta)\]
Since the likelihood satisfies \(p(x | \theta) \propto \theta^{-n/2} \exp\left(-\frac{1}{2\theta}\sum_{i=1}^{n} x_i^2\right)\) and the prior is \(p(\theta) \propto \frac{1}{\theta}\), the posterior is proportional to:
\[p(\theta | x) \propto \theta^{-\left(\frac{n}{2}+1\right)} \exp\left(-\frac{1}{2\theta}\sum_{i=1}^{n} x_i^2\right)\]

3. The Bayesian estimator \(\hat{\theta}^{\text{Bayes}}\) is the mean of the posterior distribution obtained in the previous step.

The posterior is an inverse-gamma distribution with shape \(\alpha = \frac{n}{2}\) and scale \(\beta = \frac{1}{2}\sum_{i=1}^{n} x_i^2\), and the mean of an inverse-gamma distribution is \(\frac{\beta}{\alpha - 1}\) (which exists provided \(\alpha > 1\), i.e. \(n > 2\)).

Therefore, the Bayesian estimator is
\[\hat{\theta}^{\text{Bayes}} = \frac{\sum_{i=1}^{n} X_i^2}{n - 2}.\]
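
As a quick sanity check on this summary, here is a minimal numerical sketch (assuming Python with NumPy and SciPy; the data are simulated purely for illustration). It integrates the unnormalized posterior kernel directly, without relying on the inverse-gamma identification, and compares the resulting mean with \(\sum_{i} X_i^2/(n-2)\):

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
n, theta_true = 20, 2.5                            # illustrative sample size and true variance
x = rng.normal(0.0, np.sqrt(theta_true), size=n)   # X_i ~ N(0, theta_true)
S = np.sum(x**2)                                   # S = sum_i X_i^2

# Unnormalized posterior kernel: theta^{-(n/2 + 1)} * exp(-S / (2 theta))
def kernel(t):
    return t ** (-(n / 2 + 1)) * np.exp(-S / (2 * t))

# epsabs=0 makes quad work to relative tolerance, since the kernel values are tiny
Z, _ = quad(kernel, 0, np.inf, epsabs=0)                   # normalizing constant (finite)
m, _ = quad(lambda t: t * kernel(t), 0, np.inf, epsabs=0)  # unnormalized first moment

print("posterior mean via quadrature:", m / Z)
print("closed form S / (n - 2):      ", S / (n - 2))
```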

To compute the Jeffreys prior, we need to find the Fisher information for this model.

The Fisher information I(\theta) is defined as the expected value of the negative second derivative of the log-likelihood function. Here the mean is known to be 0, so the log-likelihood for the \textsf{N}(0, \theta) model is:

L(\theta) = -\frac{n}{2} \log(2\pi\theta) - \frac{1}{2\theta}\sum_{i=1}^{n}X_i^2

Taking the derivative of the log-likelihood function with respect to \theta, we get:

\frac{d}{d\theta} L(\theta) = -\frac{n}{2\theta} + \frac{1}{2\theta^2}\sum_{i=1}^{n}X_i^2

Differentiating once more with respect to \theta, we get:

\frac{d^2}{d\theta^2} L(\theta) = \frac{n}{2\theta^2} - \frac{1}{\theta^3}\sum_{i=1}^{n}X_i^2

Finally, taking the expectation of the negative second derivative, and using \mathbb{E}\left[\sum_{i=1}^{n}X_i^2\right] = n\theta, we get:

I(\theta) = \mathbb{E}\left[-\frac{d^2}{d\theta^2} L(\theta)\right] = -\frac{n}{2\theta^2} + \frac{n\theta}{\theta^3} = \frac{n}{2\theta^2}

Now, we can compute the Jeffreys prior by taking the square root of the Fisher information:

\pi(\theta) \propto \sqrt{I(\theta)} = \sqrt{\frac{n}{2\theta^2}} \propto \frac{1}{\theta}

Since we are unable to come up with a prior distribution for \theta, we use this Jeffreys prior as an uninformative prior. Note that it is improper: \int_{0}^{\infty}\frac{1}{\theta}\, d\theta diverges.
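
For readers who want to double-check the calculus, here is a small symbolic sketch (assuming SymPy is available; the symbol S is just an illustrative stand-in for \sum_{i=1}^{n}X_i^2). It differentiates the log-likelihood twice, substitutes the expectation \mathbb{E}\left[\sum_i X_i^2\right] = n\theta, and recovers I(\theta) = \frac{n}{2\theta^2}:

```python
import sympy as sp

theta, n, S = sp.symbols("theta n S", positive=True)   # S stands for sum_i X_i^2

# Log-likelihood of n i.i.d. observations from N(0, theta), written in terms of S
loglik = -n / 2 * sp.log(2 * sp.pi * theta) - S / (2 * theta)

d2 = sp.diff(loglik, theta, 2)                 # second derivative with respect to theta
fisher = sp.simplify(-d2.subs(S, n * theta))   # E[S] = n*theta, so I(theta) = E[-d2]

print(fisher)            # n/(2*theta**2)
print(sp.sqrt(fisher))   # proportional to 1/theta, i.e. the Jeffreys prior
```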

Next, we can use Bayes' formula to compute the posterior distribution:

p(\theta|X_1, X_2, \ldots, X_n) = \frac{p(X_1, \ldots, X_n|\theta) \cdot \pi(\theta)}{\int_{0}^{\infty} p(X_1, \ldots, X_n|\theta) \cdot \pi(\theta)\, d\theta}

Substituting the likelihood and the Jeffreys prior, and dropping factors that do not depend on \theta, we get:

p(\theta|X_1, X_2, \ldots, X_n) \propto (2\pi\theta)^{-\frac{n}{2}}\exp\left(-\frac{1}{2\theta}\sum_{i=1}^{n}X_i^2\right) \cdot \frac{1}{\theta}

Simplifying the expression, we get:

p(\theta|X_1, X_2, \ldots, X_n) \propto \theta^{-\left(\frac{n}{2}+1\right)}\exp\left(-\frac{1}{2\theta}\sum_{i=1}^{n}X_i^2\right)

Therefore, the posterior distribution is an inverse-gamma distribution with shape parameter \alpha = \frac{n}{2} and scale parameter \beta = \frac{1}{2}\sum_{i=1}^{n}X_i^2. In particular, the posterior is proper even though the Jeffreys prior is not, so of the two, only the posterior is proper.
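
A short symbolic check of the properness claim (again assuming SymPy; the sample size is fixed to an arbitrary value just for the check): the prior kernel 1/\theta has an infinite integral over (0, \infty), while the posterior kernel integrates to a finite constant.

```python
import sympy as sp

theta, S = sp.symbols("theta S", positive=True)
n = 10   # an arbitrary fixed sample size for this check

# Jeffreys prior kernel 1/theta: the integral over (0, oo) diverges, so the prior is improper
print(sp.integrate(1 / theta, (theta, 0, sp.oo)))          # oo

# Posterior kernel theta^{-(n/2 + 1)} * exp(-S / (2 theta)): finite integral, so the posterior is proper
post = theta ** (-(sp.Rational(n, 2) + 1)) * sp.exp(-S / (2 * theta))
print(sp.simplify(sp.integrate(post, (theta, 0, sp.oo))))  # a finite multiple of S**(-n/2)
```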

Now, we can compute the Bayesian estimator of \theta, which is the mean of the posterior distribution. The mean of an inverse-gamma distribution with parameters \alpha and \beta is \frac{\beta}{\alpha - 1} (provided \alpha > 1), so:

\hat{\theta}^{\text {Bayes}} = \frac{\beta}{\alpha - 1} = \frac{\frac{1}{2}\sum_{i=1}^{n}X_i^2}{\frac{n}{2} - 1} = \frac{\sum_{i=1}^{n}X_i^2}{n - 2}

So, for n > 2, the Bayesian estimator \hat{\theta }^{\text {Bayes}} is \frac{\sum_{i=1}^{n}X_i^2}{n - 2}, which is entered as Sigma_i(X_i^2)/(n - 2) in the notation above.
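
Finally, the closed form can be cross-checked against SciPy's built-in inverse-gamma distribution (a sketch assuming SciPy; the data are again simulated only for illustration):

```python
import numpy as np
from scipy.stats import invgamma

rng = np.random.default_rng(1)
n, theta_true = 50, 1.7                            # illustrative values
x = rng.normal(0.0, np.sqrt(theta_true), size=n)   # X_i ~ N(0, theta_true)
S = np.sum(x**2)

# Posterior under the Jeffreys prior: Inverse-Gamma(shape = n/2, scale = S/2)
posterior = invgamma(a=n / 2, scale=S / 2)

print("posterior mean (scipy):", posterior.mean())   # equals (S/2) / (n/2 - 1)
print("closed form S / (n - 2):", S / (n - 2))
```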