Question

Let X_1, \ldots , X_ n \stackrel{iid}{\sim }X\sim \mathbf{P} for some unknown distribution \mathbf{P} with continuous cdf F. Below we describe a \chi ^2 test for the null and alternative hypotheses

\displaystyle H_0: \mathbf{P} \displaystyle \in \{ N(\mu , \sigma ^2) \} _{\mu \in \mathbb {R}, \sigma ^2 > 0}
\displaystyle H_1: \mathbf{P} \displaystyle \notin \{ N(\mu , \sigma ^2) \} _{\mu \in \mathbb {R}, \sigma ^2 > 0}.
We divide the sample space into 5 disjoint subsets referred to as bins:

\displaystyle A_1 \displaystyle = (-\infty , -2), \quad A_2 = (-2, -0.5),
\displaystyle A_3 \displaystyle = (-0.5, 0.5), \quad A_4 = (0.5, 2)
\displaystyle A_5 \displaystyle = (2, \infty ).
Now, define discrete random variables Y_ i as functions of X_ i by

\displaystyle Y_ i\, =\, k\qquad \text {if }\, X_ i\in A_ k.
For example, if \, X_ i = 0.1,\, then \, X_ i\in A_3\, and so \, Y_ i = 3.\, \, In other words, \, Y_ i\, is the label of the bin that contains \, X_ i.
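The labelling rule above can be sketched in a few lines of Python. This is only an illustration: the names `CUTS` and `bin_label` are hypothetical, and a point lying exactly on a boundary (an event of probability zero when the cdf is continuous) is assigned to the upper bin by assumption.

```python
from bisect import bisect_right

# Interior bin boundaries taken from A_1, ..., A_5 in the text.
CUTS = [-2.0, -0.5, 0.5, 2.0]

def bin_label(x: float) -> int:
    """Return the label k of the bin A_k that contains x (k = 1, ..., 5)."""
    # bisect_right counts how many cut points are <= x, so adding 1
    # turns that count into a bin label. Boundary points go to the
    # upper bin, which is an arbitrary choice made for this sketch.
    return bisect_right(CUTS, x) + 1

labels = [bin_label(x) for x in (-3.0, -1.0, 0.1, 1.0, 2.5)]
print(labels)  # [1, 2, 3, 4, 5]
```

In particular, `bin_label(0.1)` returns 3, matching the example in the text.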

By the definition above,

Y_1, \ldots , Y_ n \stackrel{iid}{\sim } Y

and Y follows the multinomial distribution on \{ 1, 2, 3, 4, 5 \} with (vector) parameter \, \mathbf{p}=\begin{pmatrix} p_1& p_2& p_3& p_4& p_5\end{pmatrix} \in \Delta _5\, where p_ j denotes the probability that Y = j.

Assume the following special case of the null hypothesis holds:

X_1, \ldots , X_ n \stackrel{iid}{\sim }\mathcal{N}(0,1).

What is the vector parameter \mathbf{p} \in \Delta _5 of the multinomial distribution followed by Y_ i? Fill in the first three entries p_1,\, p_2,\, p_3\, below.

(Enter Phi(x) for the cdf \Phi (x) of a standard normal distribution, e.g. type Phi(1) for \Phi (1), or enter your answers accurate to 3 decimal places)

\mathbf{p}_1 =
\mathbf{p}_2 =
\mathbf{p}_3 =
(What are p_4 and p_5 in terms of p_1,\, p_2,\, p_3?)

Answer 1

To find the vector parameter \mathbf{p} \in \Delta _5 of the multinomial distribution followed by Y_i, we need to determine the probabilities p_1, p_2, p_3, p_4, and p_5.

Since X_i follows a standard normal distribution, we can use the cumulative distribution function (CDF) of a standard normal distribution, denoted as \Phi(x), to find these probabilities.

We divide the sample space into 5 disjoint bins:

A_1 = (-∞, -2)
A_2 = (-2, -0.5)
A_3 = (-0.5, 0.5)
A_4 = (0.5, 2)
A_5 = (2, ∞)

To find p_1, we calculate the probability that Y_i takes the value 1, which means X_i falls into bin A_1. The probability is given by the difference in the CDF values at the upper and lower bounds of A_1:

p_1 = \Phi(-2) - \Phi(-∞)

Since the CDF approaches 0 as the argument approaches -∞, we have:

p_1 = \Phi(-2) - 0

Similarly, to find p_2, we calculate the probability that Y_i takes the value 2, which means X_i falls into bin A_2. The probability is given by the difference in the CDF values at the upper and lower bounds of A_2:

p_2 = \Phi(-0.5) - \Phi(-2)

To find p_3, we calculate the probability that Y_i takes the value 3, which means X_i falls into bin A_3. The probability is given by the difference in the CDF values at the upper and lower bounds of A_3:

p_3 = \Phi(0.5) - \Phi(-0.5)

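Note: Since the standard normal density is symmetric about 0, the cdf satisfies \Phi(-x) = 1 - \Phi(x). Hence p_4 = \Phi(2) - \Phi(0.5) = \Phi(-0.5) - \Phi(-2) = p_2 and p_5 = 1 - \Phi(2) = \Phi(-2) = p_1. Therefore: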

p_4 = p_2
p_5 = p_1

So the vector parameter \mathbf{p} is given by:

\mathbf{p} = \begin{pmatrix} p_1 & p_2 & p_3 & p_4 & p_5 \end{pmatrix} = \begin{pmatrix} \Phi(-2) & \Phi(-0.5) - \Phi(-2) & \Phi(0.5) - \Phi(-0.5) & \Phi(-0.5) - \Phi(-2) & \Phi(-2) \end{pmatrix} \approx \begin{pmatrix} 0.023 & 0.286 & 0.383 & 0.286 & 0.023 \end{pmatrix}
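As a numerical check, the entries of \mathbf{p} can be evaluated with the Python standard library alone. This is a minimal sketch: the helper `Phi` is defined here through the error function (it is not a built-in), and p_4 and p_5 are computed directly from their own bins so that the symmetry claims can be verified rather than assumed.

```python
from math import erf, sqrt

def Phi(x: float) -> float:
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Bin probabilities under N(0, 1), each computed from its own interval.
p1 = Phi(-2.0)                # P(X < -2)
p2 = Phi(-0.5) - Phi(-2.0)    # P(-2 < X < -0.5)
p3 = Phi(0.5) - Phi(-0.5)     # P(-0.5 < X < 0.5)
p4 = Phi(2.0) - Phi(0.5)      # P(0.5 < X < 2)
p5 = 1.0 - Phi(2.0)           # P(X > 2)

p = [p1, p2, p3, p4, p5]
print([round(v, 3) for v in p])  # [0.023, 0.286, 0.383, 0.286, 0.023]
```

Up to floating-point error, the output agrees with p_4 = p_2 and p_5 = p_1, and the five entries sum to 1.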

Answer 2

To find the vector parameter \textbf{p} of the multinomial distribution followed by Y\_i, we need to calculate the probabilities p\_j for j = 1, 2, 3, 4, 5.

Since we are assuming X\_1, ..., X\_n ~ N(0,1), we need to find the probabilities of Y\_i = j for j = 1, 2, 3, 4, 5.

We can calculate these probabilities by finding the area under the standard normal curve in each of the respective bins.

The bins A\_1, A\_2, A\_3, A\_4, and A\_5 correspond to the intervals (-∞, -2), (-2, -0.5), (-0.5, 0.5), (0.5, 2), and (2, ∞) respectively.

To calculate p\_1, we need to find the probability that X\_i falls in the interval (-∞, -2).

p\_1 = P(X\_i ∈ A\_1)
= P(X\_i < -2)
= Phi(-2), where Phi is the standard normal cdf.

To calculate p\_2, we need to find the probability that X\_i falls in the interval (-2, -0.5).

p\_2 = P(X\_i ∈ A\_2)
= P(-2 < X\_i < -0.5)
= Phi(-0.5) - Phi(-2), since P(a < X\_i < b) = Phi(b) - Phi(a) for the cdf Phi.

To calculate p\_3, we need to find the probability that X\_i falls in the interval (-0.5, 0.5).

p\_3 = P(X\_i ∈ A\_3)
= P(-0.5 < X\_i < 0.5)
= Phi(0.5) - Phi(-0.5), since P(a < X\_i < b) = Phi(b) - Phi(a) for the cdf Phi.

Therefore, we have:

p\_1 = Phi(-2)
p\_2 = Phi(-0.5) - Phi(-2)
p\_3 = Phi(0.5) - Phi(-0.5)

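To find p\_4 and p\_5 in terms of p\_1, p\_2, and p\_3, note that the five probabilities must sum to 1 and that A\_1, A\_2, and A\_3 together cover the interval from -∞ to 0.5. The sum constraint alone only gives p\_4 + p\_5 = 1 - p\_1 - p\_2 - p\_3, so a second fact is needed.

Since the standard normal density is symmetric about 0, Phi(-x) = 1 - Phi(x), and so:

p\_4 = P(X\_i ∈ A\_4) = Phi(2) - Phi(0.5) = Phi(-0.5) - Phi(-2) = p\_2

Similarly,

p\_5 = P(X\_i ∈ A\_5) = 1 - Phi(2) = Phi(-2) = p\_1

Therefore, p\_4 and p\_5 are determined once p\_1 and p\_2 are known, and the identity p\_5 = 1 - p\_1 - p\_2 - p\_3 - p\_4 serves as a consistency check.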

Please note that p\_1, p\_2, and p\_3 can be evaluated numerically from the standard normal cdf Phi(x): to 3 decimal places, p\_1 ≈ 0.023, p\_2 ≈ 0.286, and p\_3 ≈ 0.383.
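One way to sanity-check these values is a small simulation: draw standard normal samples, bin them, and compare the empirical bin frequencies with the probabilities computed from Phi. The sketch below uses only the Python standard library. The names `Phi`, `CUTS`, `p` and `freqs`, the sample size, and the seed are all arbitrary choices made for illustration.

```python
import random
from bisect import bisect_right
from math import erf, sqrt

def Phi(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Interior boundaries of the bins A_1, ..., A_5.
CUTS = [-2.0, -0.5, 0.5, 2.0]

# Theoretical bin probabilities: differences of the cdf at consecutive edges.
edges = [0.0] + [Phi(c) for c in CUTS] + [1.0]
p = [edges[k + 1] - edges[k] for k in range(5)]

# Empirical bin frequencies from simulated N(0, 1) draws.
rng = random.Random(0)   # fixed seed, chosen arbitrarily for reproducibility
n = 100_000
counts = [0] * 5
for _ in range(n):
    x = rng.gauss(0.0, 1.0)
    counts[bisect_right(CUTS, x)] += 1   # zero-based bin index
freqs = [c / n for c in counts]

for k in range(5):
    print(f"bin {k + 1}: theoretical {p[k]:.4f}, empirical {freqs[k]:.4f}")
```

With a sample this large, the empirical frequencies should fall within about 0.01 of the theoretical values, although the exact figures depend on the seed.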