Suppose that \(X_1 = \Theta + W_1\) and \(X_2 = 2\Theta + W_2\), where \(\Theta, W_1, W_2\) are independent standard normal random variables. If the values that we observe happen to be \(X_1 = -1\) and \(X_2 = 1\), then what is the MAP estimate of \(\Theta\)?

To find the maximum a posteriori (MAP) estimate of \(\Theta\), we need to calculate the posterior probability density function (PDF) of \(\Theta\) given the observed values \(X_1 = -1\) and \(X_2 = 1\).

First, condition on \(\Theta = \theta\). Since \(X_1 = \Theta + W_1\) and \(X_2 = 2\Theta + W_2\), where \(W_1\) and \(W_2\) are independent standard normal random variables, we have \(X_1 \mid \Theta = \theta \sim N(\theta, 1)\) and \(X_2 \mid \Theta = \theta \sim N(2\theta, 1)\), and the two observations are conditionally independent given \(\theta\). The joint conditional density therefore factors:
\(f(x_1, x_2 \mid \theta) = f(x_1 \mid \theta) \cdot f(x_2 \mid \theta) = \frac{1}{2\pi} \exp\left(-\frac{(x_1 - \theta)^2}{2} - \frac{(x_2 - 2\theta)^2}{2}\right)\)
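As a quick sanity check on these conditional densities, here is a minimal sketch (assuming SciPy is available) that evaluates the likelihood of the observed data as a function of \(\theta\):

```python
from scipy.stats import norm

x1, x2 = -1.0, 1.0  # observed values

def likelihood(theta):
    # Given Theta = theta: X1 ~ N(theta, 1) and X2 ~ N(2*theta, 1), independent
    return norm.pdf(x1, loc=theta, scale=1.0) * norm.pdf(x2, loc=2 * theta, scale=1.0)

print(likelihood(0.0))  # likelihood of the data if theta were 0
```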

Next, the prior. Since \(\Theta\) is itself a standard normal random variable, its prior density is \(f_\Theta(\theta) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\theta^2}{2}\right)\), i.e. a normal prior with mean \(\mu = 0\) and variance \(\sigma^2 = 1\).

The posterior probability density function (PDF) of \(\Theta\) given the observed values \(X_1 = -1\) and \(X_2 = 1\) is given by Bayes' theorem as follows:
\(f(\theta \mid x_1, x_2) = \frac{f(x_1, x_2 \mid \theta) \cdot f_\Theta(\theta)}{f(x_1, x_2)}\)

The denominator is the marginal density of the data, obtained by integrating the numerator over all possible values of \(\theta\):
\(f(x_1, x_2) = \int_{-\infty}^{\infty} f(x_1, x_2 \mid \theta) \cdot f_\Theta(\theta)\, d\theta\)
The key observation is that this integral depends on the observed values \(x_1, x_2\) but not on \(\theta\): it is a normalizing constant. For the MAP estimate we therefore never need to evaluate it; it suffices to maximize the numerator \(f(x_1, x_2 \mid \theta) \cdot f_\Theta(\theta)\) over \(\theta\).
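Although it is not needed for the MAP estimate, the normalizing constant can be computed numerically; a minimal sketch, assuming SciPy:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

x1, x2 = -1.0, 1.0

def numerator(theta):
    # f(x1, x2 | theta) * f_Theta(theta): likelihood times prior
    return norm.pdf(x1, loc=theta) * norm.pdf(x2, loc=2 * theta) * norm.pdf(theta)

evidence, _ = quad(numerator, -np.inf, np.inf)
print(evidence)  # a fixed number; dividing by it cannot move the argmax over theta
```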

Combining the likelihood and the prior, the posterior satisfies
\(f(\theta \mid x_1, x_2) \propto \exp\left(-\frac{(x_1 - \theta)^2}{2} - \frac{(x_2 - 2\theta)^2}{2} - \frac{\theta^2}{2}\right)\)
Maximizing the posterior is therefore equivalent to minimizing the quadratic
\(q(\theta) = (x_1 - \theta)^2 + (x_2 - 2\theta)^2 + \theta^2\)
Setting \(q'(\theta) = -2(x_1 - \theta) - 4(x_2 - 2\theta) + 2\theta = 0\) gives \(12\theta = 2x_1 + 4x_2\), so
\(\hat{\theta}_{\text{MAP}} = \frac{x_1 + 2x_2}{6}\)

With the observed values \(x_1 = -1\) and \(x_2 = 1\), the MAP estimate is \(\hat{\theta}_{\text{MAP}} = \frac{-1 + 2}{6} = \frac{1}{6} \approx 0.167\).
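To double-check this value numerically, one can minimize the negative log posterior directly; a minimal sketch, again assuming SciPy:

```python
from scipy.optimize import minimize_scalar

x1, x2 = -1.0, 1.0

def neg_log_posterior(theta):
    # Negative log posterior, up to an additive constant
    return 0.5 * (theta**2 + (x1 - theta)**2 + (x2 - 2 * theta)**2)

res = minimize_scalar(neg_log_posterior)
print(res.x)  # approximately 0.1667, i.e. 1/6
```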

Here is the same calculation laid out as explicit steps. To find the Maximum A Posteriori (MAP) estimate of Θ given X1 = -1 and X2 = 1, we calculate the posterior distribution and find the value of Θ that maximizes it.

1. Begin by writing the joint probability density function of X1 and X2 given Θ. Because W1 and W2 are independent, X1 and X2 are conditionally independent given Θ:
P(X1, X2 | Θ) = P(X1 | Θ) * P(X2 | Θ)

2. Since W1 and W2 are standard normal, conditioning on Θ gives X1 ~ N(Θ, 1) and X2 ~ N(2Θ, 1), so the conditional densities are:
P(X1 | Θ) = (1 / √(2π)) * exp(-(X1 - Θ)^2 / 2)
P(X2 | Θ) = (1 / √(2π)) * exp(-(X2 - 2Θ)^2 / 2)

3. For the MAP estimate we maximize the posterior, which by Bayes' rule is proportional to the likelihood times the prior. Since Θ is itself standard normal, the prior is P(Θ) = (1 / √(2π)) * exp(-Θ^2 / 2). Multiplying the three factors:
P(Θ | X1, X2) ∝ P(X1 | Θ) * P(X2 | Θ) * P(Θ)
= (1 / (2π)^(3/2)) * exp(-(X1 - Θ)^2 / 2) * exp(-(X2 - 2Θ)^2 / 2) * exp(-Θ^2 / 2)
= (1 / (2π)^(3/2)) * exp(-(X1^2 - 2X1Θ + Θ^2 + X2^2 - 4X2Θ + 4Θ^2 + Θ^2) / 2)
= (1 / (2π)^(3/2)) * exp(-(X1^2 + X2^2 - 2X1Θ - 4X2Θ + 6Θ^2) / 2)
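The expansion of the exponent is easy to get wrong by hand; here is a quick symbolic check, a sketch assuming SymPy is installed:

```python
import sympy as sp

X1, X2, T = sp.symbols('X1 X2 Theta')
exponent = -((X1 - T)**2 + (X2 - 2*T)**2 + T**2) / 2  # likelihood plus prior terms
print(sp.expand(exponent))
# matches -(X1^2 + X2^2 - 2*X1*Theta - 4*X2*Theta + 6*Theta^2) / 2
```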

4. Now we need the value of Θ that maximizes the posterior. It is easier to maximize the log of the posterior, which is equivalent since the logarithm is a monotonically increasing function:
log P(Θ | X1, X2) = constant - (X1^2 + X2^2 - 2X1Θ - 4X2Θ + 6Θ^2) / 2
where the constant collects all terms that do not involve Θ.

5. To find the maximum, differentiate the log posterior with respect to Θ and set the derivative equal to zero:
d/dΘ (log P(Θ | X1, X2)) = (2X1 + 4X2 - 12Θ) / 2 = 0

6. Solve the equation for Θ:
2X1 + 4X2 - 12Θ = 0
12Θ = 2X1 + 4X2
Θ = (X1 + 2X2) / 6
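As a check on the calculus, differentiating and solving symbolically recovers the same closed form; a sketch assuming SymPy:

```python
import sympy as sp

X1, X2, T = sp.symbols('X1 X2 Theta')
log_post = -((X1 - T)**2 + (X2 - 2*T)**2 + T**2) / 2  # log posterior up to a constant
print(sp.solve(sp.diff(log_post, T), T))  # -> [X1/6 + X2/3], i.e. (X1 + 2*X2)/6
```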

7. Substitute the observed values X1 = -1 and X2 = 1 into the equation to get the MAP estimate of Θ:
Θ = (-1 + 2(1)) / 6
Θ = 1/6
Θ ≈ 0.167

Therefore, the MAP estimate of Θ, given X1 = -1 and X2 = 1, is 1/6 ≈ 0.167.
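Finally, a brute-force numeric spot check of the whole calculation, assuming only NumPy:

```python
import numpy as np

x1, x2 = -1.0, 1.0
theta = np.linspace(-2.0, 2.0, 400001)  # dense grid over plausible values
log_post = -(theta**2 + (x1 - theta)**2 + (x2 - 2 * theta)**2) / 2
print(theta[np.argmax(log_post)])  # approximately 0.1667 = 1/6
```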