Problem 6 [Total 5 marks] The drug Ziac is used to treat hypertension. In a large clinical trial it was found that 32% of those in the placebo group experienced dizziness.

(a) [1 mark] Assuming that the same 32% proportion applies to subjects taking the drug, find the probability that among 20 drug users, exactly 6 have the side-effect.
(b) [1 mark] If 10 of the 20 drug users were to experience dizziness, find an exact 95% confidence interval for the proportion of drug users having dizziness. Would it appear that the proportion of drug users having dizziness is different from the 32% proportion? Why?
(c) [2 marks] In a subsequent clinical trial, it turned out that 4 out of 20 subjects taking the drug experienced the side effect. Carry out an exact hypothesis test at the 0.05 significance level to test that the true proportion of the drug users experiencing dizziness differs from 32%. State clearly the hypotheses and give the p-value.
(d) [1 mark] In the future, the researchers plan to collect more data to improve the accuracy of the current results. Compute the sample size required to obtain an approximate 95% confidence interval with half-width 0.05 for the true proportion.

(a) To find the probability that exactly 6 out of 20 drug users experience the side effect, we can use the binomial probability formula. The formula is:

P(X=k) = C(n, k) * p^k * (1-p)^(n-k)

Where:
P(X=k) is the probability of getting exactly k successes.
C(n, k) is the number of combinations of n things taken k at a time.
p is the probability of success on a single trial.
n is the total number of trials.

In this case, n = 20, p = 0.32, and we want to find P(X=6). Plugging the values into the formula, we get:

P(X=6) = C(20, 6) * 0.32^6 * (1-0.32)^(20-6)

Since C(20, 6) = 38760, this evaluates to P(X=6) = 38760 * 0.32^6 * 0.68^14 ≈ 0.188. In other words, there is about an 18.8% chance that exactly 6 out of 20 drug users experience the side effect.
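The binomial formula above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the helper name binom_pmf is illustrative and not part of the original problem.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# n = 20 drug users, p = 0.32 chance of dizziness, exactly k = 6 affected
prob_exactly_6 = binom_pmf(6, 20, 0.32)
print(f"C(20, 6) = {comb(20, 6)}")
print(f"P(X = 6) = {prob_exactly_6:.4f}")
```

The printed probability should come out at roughly 0.188.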

(b) An exact 95% confidence interval for a binomial proportion is the Clopper-Pearson interval, which is built directly from the binomial distribution rather than from a normal approximation. (The Wilson score interval is a good approximate interval, but it is not exact.) With x observed successes in n trials, the lower bound p_L and the upper bound p_U solve:

P(X >= x | p = p_L) = alpha/2
P(X <= x | p = p_U) = alpha/2

Where:
X is a binomial random variable with n trials and success probability p.
x is the observed number of drug users experiencing the side effect (10 in this case).
n is the sample size (20 in this case).
alpha is 1 minus the confidence level (0.05 for a 95% confidence level, so each tail gets 0.025).

Solving these two equations numerically for x = 10 and n = 20 gives an interval of approximately (0.272, 0.728) around the sample proportion p_hat = 10/20 = 0.5.

To determine whether the proportion of drug users having dizziness is different from the 32% proportion, we check whether 0.32 falls within the confidence interval. Here 0.32 lies inside (0.272, 0.728), so the data do not provide evidence at the 5% level that the proportion among drug users differs from 32%.
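As a sketch of how an exact (Clopper-Pearson) interval can be computed, the code below solves the two binomial tail equations P(X >= x | p_L) = alpha/2 and P(X <= x | p_U) = alpha/2 by bisection, using only the Python standard library. Unlike score-based or normal-approximation intervals, this construction works with the binomial distribution directly. The function names (binom_cdf, bisect_root, clopper_pearson) are illustrative assumptions, not part of the original problem.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Find a root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, conf: float = 0.95) -> tuple[float, float]:
    """Exact two-sided confidence interval for a binomial proportion."""
    alpha = 1 - conf
    # Lower bound: the p at which P(X >= x) equals alpha/2
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: 1 - binom_cdf(x - 1, n, p) - alpha / 2, 0.0, 1.0)
    # Upper bound: the p at which P(X <= x) equals alpha/2
    upper = 1.0 if x == n else bisect_root(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


lower, upper = clopper_pearson(10, 20)
print(f"95% exact CI: ({lower:.3f}, {upper:.3f})")
print("0.32 inside interval:", lower <= 0.32 <= upper)
```

For 10 dizzy subjects out of 20 the interval should be roughly (0.272, 0.728), which contains 0.32.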

(c) To carry out an exact hypothesis test, we can use the binomial test. The hypotheses for the test would be:

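Null hypothesis (H0): p = 0.32, i.e. the true proportion of drug users experiencing dizziness is equal to 32%.
Alternative hypothesis (Ha): p ≠ 0.32, i.e. the true proportion of drug users experiencing dizziness differs from 32%.

To test these hypotheses, we calculate the p-value using the binomial distribution with n = 20 and p = 0.32. The p-value is the probability, assuming the null hypothesis is true, of observing a result as extreme or more extreme than the one observed. Here the observed count is 4, below the expected count of 20 * 0.32 = 6.4, and the lower-tail probability is P(X <= 4) ≈ 0.183. The two-sided p-value depends on the convention used: summing the probabilities of all outcomes no more likely than the observed one gives about 0.34, while doubling the smaller tail gives about 0.37.

Under either convention the p-value is well above the significance level of 0.05, so we fail to reject the null hypothesis. The data do not provide sufficient evidence that the true proportion of drug users experiencing dizziness differs from 32%.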
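The exact two-sided p-value can be computed by enumerating the binomial distribution under H0. The sketch below, using only the Python standard library, reports two common conventions: summing the probabilities of all outcomes no more likely than the observed count, and doubling the smaller tail probability. The function name exact_binomial_test and the small tie tolerance are illustrative assumptions.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_test(x: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided exact binomial p-values under H0: p = p0.

    Returns (p_small, p_doubled):
      p_small   sums P(X = k) over all k with P(X = k) <= P(X = x)
      p_doubled is twice the smaller tail probability, capped at 1
    """
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    observed = pmf[x]
    # Relative tolerance guards against floating-point ties
    p_small = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    lower_tail = sum(pmf[: x + 1])  # P(X <= x)
    upper_tail = sum(pmf[x:])       # P(X >= x)
    p_doubled = min(1.0, 2 * min(lower_tail, upper_tail))
    return p_small, p_doubled


p_small, p_doubled = exact_binomial_test(4, 20, 0.32)
print(f"p-value (sum of small probabilities): {p_small:.4f}")
print(f"p-value (doubled tail):               {p_doubled:.4f}")
print("Reject H0 at 0.05:", p_small < 0.05)
```

For 4 dizzy subjects out of 20, both p-values should land between roughly 0.34 and 0.37, so H0 is not rejected at the 0.05 level.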

(d) To compute the sample size required to obtain an approximate 95% confidence interval with half-width 0.05, we can use the formula for sample size in proportion estimation. The formula is:

n = (z^2 * p_hat * (1 - p_hat)) / (E^2)

Where:
n is the required sample size.
z is the z-score corresponding to the desired confidence level (For a 95% confidence level, z ≈ 1.96).
p_hat is the estimated proportion (can be the observed proportion from previous data).
E is the desired half-width of the confidence interval (0.05 in this case).

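The result is rounded up to the next whole number. Taking p_hat = 4/20 = 0.2 from the most recent trial gives n = (1.96^2 * 0.2 * 0.8) / 0.05^2 ≈ 245.9, so 246 subjects are required. If the 32% figure is used instead, n ≈ 334.4, so 335 subjects; the most conservative choice, p_hat = 0.5, gives 385 subjects.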
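The sample-size formula above is easy to evaluate for several candidate values of p_hat. The sketch below is a minimal standard-library example; the function name sample_size_for_proportion and the choice of which planning values to show are assumptions, since the problem does not say which estimate the researchers would use.

```python
from math import ceil


def sample_size_for_proportion(p_guess: float, half_width: float,
                               z: float = 1.96) -> int:
    """Smallest n for which z * sqrt(p(1-p)/n) <= half_width."""
    return ceil(z**2 * p_guess * (1 - p_guess) / half_width**2)


# Candidate planning values for the true proportion (assumed, see lead-in)
candidates = [
    ("latest trial, 4/20", 0.20),
    ("placebo rate", 0.32),
    ("conservative worst case", 0.50),
]
for label, p_guess in candidates:
    n_required = sample_size_for_proportion(p_guess, 0.05)
    print(f"{label:<24} p = {p_guess:.2f} -> n = {n_required}")
```

The worst-case value p = 0.5 maximizes p(1 - p), so it yields a sample size that is sufficient whatever the true proportion turns out to be.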