Question

The problems on this page concern the data set

S = \{ 0.28, 0.2, 0.01, 0.80, 0.1 \} .

Let x_i denote the i-th element of the data set S.
In this problem, you will test the null and alternative hypotheses

H_0 : \text {the data set is distributed as} \, \text {Unif}(0,1)
H_1 : \text {the data set is not distributed as} \, \text {Unif}(0,1) .
What is the value of the Kolmogorov-Smirnov test statistic on the data set S? Enter \, T_5^{\text {KS}}/\sqrt{5},\, the KS statistic without the factor of \sqrt{n}, below.

T_5^{\text {KS}}/\sqrt{5}=\quad

Answer 1

To calculate the Kolmogorov-Smirnov (KS) test statistic, we first need to sort the data set in ascending order:

S = {0.01, 0.1, 0.2, 0.28, 0.8}

Next, we calculate the cumulative distribution function (CDF) for the uniform distribution on the interval (0, 1):

CDF(x) = (x - 0)/(1 - 0) = x

Now, we calculate the empirical distribution function (EDF) for the data set S:

EDF(x) = (number of elements in S ≤ x) / (total number of elements in S)

For each data point x_i in S, we calculate EDF(x_i). In this case, since all elements in S are unique, the EDF values will be 1/5, 2/5, 3/5, 4/5, and 5/5 for the corresponding data points.
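
As a quick check of these EDF values, here is a minimal Python sketch. The helper name `edf` is our own and does not come from the problem statement.

```python
# Data set from the question.
S = [0.28, 0.2, 0.01, 0.80, 0.1]

def edf(x, data):
    """Fraction of observations in `data` that are <= x (hypothetical helper)."""
    return sum(1 for v in data if v <= x) / len(data)

# Evaluate the EDF at each data point, in ascending order.
edf_values = [edf(x, S) for x in sorted(S)]
print(edf_values)  # [0.2, 0.4, 0.6, 0.8, 1.0]
```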

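Finally, we calculate the maximum absolute difference between the CDF and the EDF over all x. Because the EDF is a step function that jumps at each data point, this supremum is attained at a data point, either at the jump (where the EDF equals i/5) or just before it (where the EDF equals (i-1)/5):

D = max over i of max(|CDF(x_i) - i/5|, |CDF(x_i) - (i-1)/5|)

Calculating each difference at the jumps:

|0.01 - 0.2| = 0.19
|0.1 - 0.4| = 0.3
|0.2 - 0.6| = 0.4
|0.28 - 0.8| = 0.52
|0.8 - 1| = 0.2

and just before the jumps:

|0.01 - 0| = 0.01
|0.1 - 0.2| = 0.1
|0.2 - 0.4| = 0.2
|0.28 - 0.6| = 0.32
|0.8 - 0.8| = 0

The maximum absolute difference is D = 0.52, attained at x = 0.28.

The KS test statistic is T_5^{\text {KS}} = \sqrt{5} \cdot D, so D is already the statistic without the factor of \sqrt{n}. It must not be divided by \sqrt{5} again.

Therefore, T_5^{\text {KS}}/\sqrt{5} = 0.52, and T_5^{\text {KS}} = \sqrt{5} \cdot 0.52 \approx 1.163.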
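
The calculation can be sketched in Python as follows. This is a minimal illustration that assumes the convention implied by the question, in which T_n^KS carries the factor \sqrt{n}, so that T_5^KS/\sqrt{5} is the supremum distance itself. The function name `ks_distance_uniform` is hypothetical. It checks the EDF both at each jump and just before it, since the supremum can occur on either side.

```python
import math

# Data set from the question.
S = [0.28, 0.2, 0.01, 0.80, 0.1]

def ks_distance_uniform(data):
    """sup_x |F_n(x) - x| against the Unif(0,1) CDF F(x) = x (hypothetical helper)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The EDF equals i/n at the jump and (i-1)/n just before it.
        d = max(d, abs(i / n - x), abs((i - 1) / n - x))
    return d

D = ks_distance_uniform(S)       # statistic without the sqrt(n) factor
T = math.sqrt(len(S)) * D        # statistic with the sqrt(n) factor
print(round(D, 4), round(T, 4))  # 0.52 1.1628
```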

Answer 2

To find the value of the Kolmogorov-Smirnov (KS) test statistic on the data set S, we need to follow these steps:

Step 1: Calculate the empirical cumulative distribution function (ECDF) for the data set.

Step 2: Calculate the distance between the ECDF and the cumulative distribution function (CDF) of the Uniform(0, 1) distribution.

Step 3: Find the maximum distance between the two distributions, which gives us the KS test statistic.

Let's go through each step in detail.

Step 1: Calculate the empirical cumulative distribution function (ECDF)
The ECDF is defined as the fraction of data points in the dataset that are less than or equal to a given value.

For the data set S = {0.28, 0.2, 0.01, 0.80, 0.1}, the ECDF can be calculated as follows:

Sorted Data: {0.01, 0.1, 0.2, 0.28, 0.8}

ECDF Values: {1/5, 2/5, 3/5, 4/5, 5/5}

Step 2: Calculate the distance between the ECDF and the CDF of the Uniform(0, 1) distribution
The CDF of the Uniform(0, 1) distribution is given by F(x) = x for 0 ≤ x ≤ 1.

We calculate the distance (D) between the ECDF and the CDF of the Uniform(0, 1) distribution for each data point:

D = |ECDF(x) - F(x)|

For our data set, the distances are:

D = {0.19, 0.3, 0.4, 0.52, 0.2}

Step 3: Find the maximum distance
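The maximum distance between the ECDF and the CDF of the Uniform(0, 1) distribution is attained at x = 0.28, where |4/5 - 0.28| = 0.52. The distances just before each jump, where the ECDF equals (i-1)/5, are {0.01, 0.1, 0.2, 0.32, 0}, so none of them exceeds this value.

So, sup |ECDF(x) - F(x)| = 0.52.

This supremum is already the KS statistic without the factor of √n, because T_5^{KS} = √5 · sup |ECDF(x) - F(x)|. It should not be divided by √n again.

Since n = 5, we have:

T_5^{KS} = √5 · 0.52 ≈ 1.163

Therefore, T_5^{KS}/√5 = 0.52.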
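
As an independent sanity check on the supremum distance, one can evaluate |ECDF(x) - F(x)| on a fine grid over [0, 1] instead of only at the data points. The sketch below is a brute-force illustration; the grid size `steps` is an arbitrary choice of ours. In general a grid maximum only bounds the supremum from below, but here the grid contains the data point 0.28, where the largest gap occurs.

```python
# Data set from the question.
S = [0.28, 0.2, 0.01, 0.80, 0.1]
n = len(S)

def ecdf(x):
    """Fraction of data points <= x (hypothetical helper)."""
    return sum(1 for v in S if v <= x) / n

# Evaluate |ECDF(x) - x| on a uniform grid over [0, 1]; F(x) = x for Unif(0,1).
steps = 10_000  # arbitrary grid resolution
grid_max = max(abs(ecdf(k / steps) - k / steps) for k in range(steps + 1))
print(round(grid_max, 4))
```

The grid maximum agrees with the value obtained from the data points alone, which supports reading T_5^{KS}/√5 as the supremum distance.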