Let X_1, \ldots , X_ n \stackrel{iid}{\sim }X\sim \mathbf{P} for some unknown distribution \mathbf{P} with continuous cdf F. Below we describe a \chi ^2 test for the null and alternative hypotheses
\displaystyle H_0: \mathbf{P} \displaystyle \in \{ N(\mu , \sigma ^2) \} _{\mu \in \mathbb {R}, \sigma ^2 > 0}
\displaystyle H_1: \mathbf{P} \displaystyle \notin \{ N(\mu , \sigma ^2) \} _{\mu \in \mathbb {R}, \sigma ^2 > 0}.
We divide the sample space into 5 disjoint subsets referred to as bins:
\displaystyle A_1 \displaystyle = (-\infty , -2), \quad A_2 = (-2, -0.5),
\displaystyle A_3 \displaystyle = (-0.5, 0.5), \quad A_4 = (0.5, 2),
\displaystyle A_5 \displaystyle = (2, \infty ).
Now, define discrete random variables Y_ i as functions of X_ i by
\displaystyle Y_ i\, =\, k\qquad \text {if }\, X_ i\in A_ k.
For example, if \, X_ i = 0.1,\, then \, X_ i\in A_3\, and so \, Y_ i = 3.\, \, In other words, \, Y_ i\, is the label of the bin that contains \, X_ i.
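The binning rule can be sketched in a few lines of Python. This is only an illustration: the names `INNER_EDGES` and `bin_label` are hypothetical and do not come from the problem statement. Boundary points such as x = -2 belong to no open bin, but they have probability zero under a continuous cdf, so the sketch simply assigns them to the bin on their right.

```python
import bisect

# Interior cut points separating the five bins A_1, ..., A_5.
INNER_EDGES = [-2.0, -0.5, 0.5, 2.0]

def bin_label(x):
    """Return k such that x lies in A_k (k = 1, ..., 5).

    bisect_right counts how many cut points are <= x; adding 1 turns
    that count into the 1-based bin label. Exact boundary values are
    placed in the bin to their right (an assumption of this sketch).
    """
    return bisect.bisect_right(INNER_EDGES, x) + 1

print(bin_label(0.1))   # 3, matching the example X_i = 0.1
print(bin_label(-3.0))  # 1
```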
By the definition above,
Y_1, \ldots , Y_ n \stackrel{iid}{\sim } Y
and Y follows the multinomial distribution on \{ 1, 2, 3, 4, 5 \} with (vector) parameter \, \mathbf{p}=\begin{pmatrix} p_1& p_2& p_3& p_4& p_5\end{pmatrix} \in \Delta _5\, where p_ j denotes the probability that Y = j.
Assume the following special case of the null hypothesis holds:
X_1, \ldots , X_ n \stackrel{iid}{\sim }\mathcal{N}(0,1).
What is the vector parameter \mathbf{p} \in \Delta _5 of the multinomial distribution followed by Y_ i? Fill in the first three entries p_1,\, p_2,\, p_3\, below.
(Enter Phi(x) for the cdf \Phi (x) of a standard normal distribution, e.g. type Phi(1) for \Phi (1), or enter your answers accurate to 3 decimal places)
\mathbf{p}_1 =
\mathbf{p}_2 =
\mathbf{p}_3 =
(What are p_4 and p_5 in terms of p_1,\, p_2,\, p_3?)
To find the vector parameter \mathbf{p} \in \Delta _5 of the multinomial distribution followed by Y_i, we need to determine the probabilities p_1, p_2, p_3, p_4, and p_5.
Since X_i follows a standard normal distribution, we can use the cumulative distribution function (CDF) of a standard normal distribution, denoted as \Phi(x), to find these probabilities.
We divide the sample space into 5 disjoint bins:
A_1 = (-∞, -2)
A_2 = (-2, -0.5)
A_3 = (-0.5, 0.5)
A_4 = (0.5, 2)
A_5 = (2, ∞)
To find p_1, we calculate the probability that Y_i takes the value 1, which means X_i falls into bin A_1. The probability is given by the difference in the CDF values at the upper and lower bounds of A_1:
p_1 = \Phi(-2) - \Phi(-∞)
Since the CDF approaches 0 as the argument approaches -∞, we have:
p_1 = \Phi(-2) - 0
Similarly, to find p_2, we calculate the probability that Y_i takes the value 2, which means X_i falls into bin A_2. The probability is given by the difference in the CDF values at the upper and lower bounds of A_2:
p_2 = \Phi(-0.5) - \Phi(-2)
To find p_3, we calculate the probability that Y_i takes the value 3, which means X_i falls into bin A_3. The probability is given by the difference in the CDF values at the upper and lower bounds of A_3:
p_3 = \Phi(0.5) - \Phi(-0.5)
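Note: The standard normal density is symmetric about 0, so \Phi(-x) = 1 - \Phi(x), and the bins A_4 and A_5 are the mirror images of A_2 and A_1. Hence p_4 is equal to p_2 and p_5 is equal to p_1. Therefore: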
p_4 = p_2
p_5 = p_1
So the vector parameter \mathbf{p} is given by:
\mathbf{p} = \begin{pmatrix} p_1 & p_2 & p_3 & p_4 & p_5 \end{pmatrix} = \begin{pmatrix} \Phi(-2) - 0 & \Phi(-0.5) - \Phi(-2) & \Phi(0.5) - \Phi(-0.5) & \Phi(-0.5) - \Phi(-2) & \Phi(-2) - 0 \end{pmatrix}
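To turn these expressions into decimals, here is a minimal sketch using only the Python standard library. The helper names `Phi` and `bin_probabilities` are illustrative, and \Phi is computed from the error function through the identity \Phi(x) = (1 + erf(x / \sqrt{2})) / 2.

```python
import math

def Phi(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Bin endpoints, including the two infinite ends.
EDGES = [-math.inf, -2.0, -0.5, 0.5, 2.0, math.inf]

def bin_probabilities(edges=EDGES):
    """Return P(a < X < b) for each pair of consecutive endpoints (a, b)."""
    return [Phi(b) - Phi(a) for a, b in zip(edges[:-1], edges[1:])]

p = bin_probabilities()
print([round(v, 3) for v in p])  # [0.023, 0.286, 0.383, 0.286, 0.023]
```

The printed vector is symmetric, as expected from p_4 = p_2 and p_5 = p_1, and its entries sum to 1.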
To find the vector parameter \mathbf{p} of the multinomial distribution followed by Y_i, we need to calculate the probabilities p_j for j = 1, 2, 3, 4, 5.
Since we are assuming X_1, ..., X_n ~ N(0,1), each p_j = P(Y_i = j) is the area under the standard normal curve over the bin A_j.
The bins A_1, A_2, A_3, A_4, and A_5 correspond to the intervals (-∞, -2), (-2, -0.5), (-0.5, 0.5), (0.5, 2), and (2, ∞) respectively.
To calculate p_1, we need to find the probability that X_i falls in the interval (-∞, -2).
p_1 = P(X_i ∈ A_1)
= P(X_i < -2)
= Phi(-2), where Phi is the standard normal cdf.
To calculate p_2, we need to find the probability that X_i falls in the interval (-2, -0.5).
p_2 = P(X_i ∈ A_2)
= P(-2 < X_i < -0.5)
= Phi(-0.5) - Phi(-2), as Phi is a cumulative distribution function.
To calculate p_3, we need to find the probability that X_i falls in the interval (-0.5, 0.5).
p_3 = P(X_i ∈ A_3)
= P(-0.5 < X_i < 0.5)
= Phi(0.5) - Phi(-0.5), as Phi is a cumulative distribution function.
Therefore, we have:
p_1 = Phi(-2)
p_2 = Phi(-0.5) - Phi(-2)
p_3 = Phi(0.5) - Phi(-0.5)
To find p_4 and p_5 in terms of p_1, p_2, and p_3, note that the five probabilities must sum to 1 and that p_1, p_2, and p_3 together cover the interval from -∞ to 0.5.
The bins A_4 and A_5 are the mirror images of A_2 and A_1 about 0, and the standard normal density is symmetric about 0, so:
p_4 = P(X_i ∈ A_4) = Phi(2) - Phi(0.5) = p_2
Similarly,
p_5 = P(X_i ∈ A_5) = 1 - Phi(2) = Phi(-2) = p_1
As a check, p_4 + p_5 = 1 - p_1 - p_2 - p_3, or equivalently p_5 = 1 - p_1 - p_2 - p_3 - p_4.
Therefore, p_4 and p_5 are determined once we know the values of p_1, p_2, and p_3.
Evaluating the standard normal cdf Phi(x) numerically gives, to 3 decimal places, p_1 ≈ 0.023, p_2 ≈ 0.286, and p_3 ≈ 0.383.
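As a sanity check on the closed-form bin probabilities, one can simulate standard normal draws and compare the empirical bin frequencies with Phi differences. The sketch below is an illustration only: the sample size, the seed, and all names (`Phi`, `INNER_EDGES`, `exact`, `freq`, `max_gap`) are arbitrary choices made here, not part of the original problem.

```python
import bisect
import math
import random

def Phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

INNER_EDGES = [-2.0, -0.5, 0.5, 2.0]
edges = [-math.inf] + INNER_EDGES + [math.inf]

# Exact bin probabilities under N(0, 1).
exact = [Phi(b) - Phi(a) for a, b in zip(edges[:-1], edges[1:])]

# Empirical frequencies from simulated draws (seed fixed for repeatability).
rng = random.Random(0)
n = 200_000
counts = [0] * 5
for _ in range(n):
    # bisect_right gives the 0-based index of the bin containing the draw.
    counts[bisect.bisect_right(INNER_EDGES, rng.gauss(0.0, 1.0))] += 1
freq = [c / n for c in counts]

# Largest absolute gap between empirical and exact probabilities.
max_gap = max(abs(f - e) for f, e in zip(freq, exact))
print([round(e, 3) for e in exact])
print([round(f, 3) for f in freq])
print(max_gap)
```

With this many draws the standard error of each frequency is roughly 0.001, so the two printed vectors should agree to about two decimal places.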