Question

Let \(\widehat{\mathbf{p}}\) denote the MLE for a categorical statistical model \((\{a_1, \ldots, a_K\}, \{\mathbf{P}_{\mathbf{p}}\}_{\mathbf{p} \in \Delta_K})\). Let \(\mathbf{p}^*\) denote the true parameter. Then \(\sqrt{n}(\widehat{\mathbf{p}} - \mathbf{p}^*)\) is asymptotically normal and

\[n \sum_{i=1}^K \frac{(\widehat{p}_i - p_i^*)^2}{p_i^*} \xrightarrow[n \to \infty]{(d)} \chi_{K-1}^2.\]

Consider the particular categorical distribution from the previous problems in this lecture, where we have the statistical experiment \(X_1, \ldots, X_n \stackrel{iid}{\sim} \mathbf{P}_{\mathbf{p}}\) and associated statistical model \((\{1,2,3\}, \{\mathbf{P}_{\mathbf{p}}\}_{\mathbf{p} \in \Delta_3})\). We will use the above fact to test between the following null and alternative hypotheses:

\[H_0: \mathbf{p}^* = [1/3~~1/3~~1/3]^T\]
\[H_1: \mathbf{p}^* \neq [1/3~~1/3~~1/3]^T.\]

Consider the test

\[\psi = \mathbf{1}\left( n \sum_{i=1}^3 \frac{(\widehat{p}_i - \frac{1}{3})^2}{1/3} > C \right),\]

for a threshold \(C\).

Compute the asymptotic p-value of the test \(\psi\) on the data set

\[\mathbf{x} = 1, 3, 1, 2, 2, 2, 1, 1, 3, 1, 1, 2.\]

Give a numerical value with at least 4 decimals. Use this tool to find the tail probabilities of a \(\chi^2\) distribution (you may also use any other software). If you are using this tool, note that you need to set "Choose Type of Control" to "Adjust X-axis quantile (Chi square) value" to find the tail probability associated with an x-axis value for a chi-squared distribution with degrees of freedom set in the "Degrees of Freedom" box.

Answer 1

To compute the asymptotic p-value of the test, we calculate the test statistic:

\[T = n \sum_{i=1}^3 \frac{(\hat{p}_i - \frac{1}{3})^2}{\frac{1}{3}} = 36 \sum_{i=1}^3 (\hat{p}_i - \frac{1}{3})^2\]

Here n = 12, and dividing by 1/3 multiplies each term by 3, which gives the factor 36.

First, we need to find the MLEs of the probabilities, which are the sample proportions. The data contain six 1s, four 2s, and two 3s, so

\[\hat{p}_1 = \frac{6}{12}, \quad \hat{p}_2 = \frac{4}{12}, \quad \hat{p}_3 = \frac{2}{12}\]

Substituting these values into the test statistic:

\[T = 36 \left((\frac{6}{12} - \frac{1}{3})^2 + (\frac{4}{12} - \frac{1}{3})^2 + (\frac{2}{12} - \frac{1}{3})^2 \right) = 36 \cdot \frac{2}{36} = 2\]

The asymptotic p-value is the probability that a chi-squared random variable with 2 degrees of freedom exceeds the observed value T = 2. Using the tool provided or any other software, we find that this tail probability is approximately 0.3679. For 2 degrees of freedom the tail has the closed form e^{-T/2}, so here it equals e^{-1}.

The test rejects only for large values of T, so the p-value is this upper-tail probability itself, with no factor of 2:

p-value = P(\chi^2_2 > 2) = 0.3679

Therefore, the asymptotic p-value of the test is approximately 0.3679.
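
The calculation can be checked numerically. The following is a minimal Python sketch, using only the standard library, that recomputes the sample proportions, the statistic, and the tail probability from the data in the question. The variable names are illustrative, and the closed form exp(-T/2) used for the tail is specific to 2 degrees of freedom.

```python
import math
from collections import Counter

# Data set from the question.
x = [1, 3, 1, 2, 2, 2, 1, 1, 3, 1, 1, 2]
n = len(x)
categories = [1, 2, 3]
p0 = 1 / 3  # null probability of each category under H_0

# MLE of a categorical distribution: the sample proportions.
counts = Counter(x)
p_hat = [counts[a] / n for a in categories]

# Test statistic T = n * sum_i (p_hat_i - p0)^2 / p0.
T = n * sum((p - p0) ** 2 / p0 for p in p_hat)

# Upper-tail probability of a chi-squared variable with 2 degrees of
# freedom; for df = 2 the tail is exactly exp(-t / 2).
p_value = math.exp(-T / 2)

print(p_hat)
print(round(T, 4), round(p_value, 4))
```

For other numbers of categories the closed form no longer applies, and a library routine such as `scipy.stats.chi2.sf` would be the usual choice.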

Answer 2

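To compute the asymptotic p-value of the test, we calculate the test statistic and then the chi-square tail probability beyond it. The threshold C is needed only for the reject or fail-to-reject decision at a fixed significance level.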

Step 1: Calculate the test statistic:
For each category i, calculate:
\[z_i = \frac{(\hat{p}_i - \frac{1}{3})^2}{1/3}\]
Then, compute the sum of these values:
\[T = n \sum_{i=1}^3 z_i\]

Step 2: Find the threshold value:
For a chosen significance level alpha, the threshold C is the (1 - alpha)-quantile of the chi-square distribution with K - 1 degrees of freedom, that is, the value whose upper-tail probability equals alpha. Here K = 3, so there are 2 degrees of freedom; for example, alpha = 0.05 gives C of approximately 5.991.

Step 3: Compare the test statistic with the threshold value:
If T > C, reject the null hypothesis. Otherwise, fail to reject it.
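
Steps 2 and 3 can be sketched in Python for this problem, where K = 3 and the limiting distribution has 2 degrees of freedom. The sketch assumes a significance level of alpha = 0.05, which the question does not specify, and relies on the fact that for 2 degrees of freedom the upper tail is exp(-c/2), so the threshold solves exp(-C/2) = alpha. The function name `threshold_df2` is illustrative.

```python
import math

def threshold_df2(alpha):
    """Threshold C with P(chi2_2 > C) = alpha (valid for 2 degrees of freedom only)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -2.0 * math.log(alpha)

alpha = 0.05  # assumed level; not given in the question
C = threshold_df2(alpha)

# Statistic for the data in the question: counts (6, 4, 2) out of n = 12.
n, p0 = 12, 1 / 3
p_hat = [6 / 12, 4 / 12, 2 / 12]
T = n * sum((p - p0) ** 2 / p0 for p in p_hat)

reject = T > C  # value of the test psi: True means reject H_0
print(round(C, 4), round(T, 4), reject)
```

With these inputs the statistic falls below the threshold, so the test does not reject the null hypothesis at the assumed level.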

Step 4: Calculate the p-value:
The p-value is the probability, under the null hypothesis, of observing a test statistic at least as extreme as the one calculated. Under the null hypothesis, T converges in distribution to a chi-square distribution with (K-1) degrees of freedom, so the asymptotic p-value is the upper-tail probability of that distribution beyond the observed T:

\[\text{p-value} = P(\chi^2_{K-1} > T)\]

This tail probability can be found with a chi-square distribution table or with software like R or Python. For the data set in the question, the counts are 6, 4, and 2, so the MLE is (1/2, 1/3, 1/6) and T = 2. With K - 1 = 2 degrees of freedom, the asymptotic p-value is \(P(\chi^2_2 > 2) = e^{-1} \approx 0.3679\).
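
For a general number of categories, the tail probability can also be computed without a table. The following is a sketch using only the Python standard library; `chi2_sf` is a hypothetical helper name, not a library function. It starts from the known tails for 1 and 2 degrees of freedom and applies the standard recurrence for the regularized upper incomplete gamma function to reach higher integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for a positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: df = 1 (normal tail via erfc) or df = 2 (exponential tail).
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    # Recurrence: sf(x; k + 2) = sf(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        tail += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return tail

K = 3
T = 2.0  # statistic for the data set in the question
print(round(chi2_sf(T, K - 1), 4))
```

If SciPy is available, `scipy.stats.chi2.sf(T, K - 1)` computes the same quantity and is the more robust choice.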