I am working on a problem in which the efficacy of a drug used to treat panic disorder is tested. Patients are given a placebo for one week and then the drug for one week, and the number of panic attacks is measured for each week. The data look like this:

Placebo Drug
0 0
2 1
0 3
1 2
0 0
10 5
2 0
6 4
1 1
1 0
1 3
0 2
3 1
3 1
3 4
4 2
6 4
15 21
28 8
30 0
13 0
15 12
18 18
9 6
8 7
22 14
13 3
12 8
6 7
4 5
0 2

Where each line is a subject. I am pretty sure that I need to do a goodness-of-fit chi square test, where my null hypothesis is that the distribution of panic attacks on the drug is the same as (or greater than) the distribution of panic attacks on the placebo. My alternative hypothesis is that the distribution of panic attacks on the drug is less than the distribution of panic attacks on the placebo. I also know that the chi-square test statistic is found by ((O-E)^2)/E.

I'm confused about how to set up my categories to perform the chi-square test. I don't think I'm supposed to use the raw panic attack data shown above because that gives me some divide by zero errors when I try to calculate my test statistic. I tried to create a 2x2 grid (placebo vs. drug by no panic attack vs. panic attack) but that loses information about the number of panic attacks. Any advice?

To perform a goodness-of-fit chi-square test, you need to determine appropriate categories for your data. In this case, you have two variables: treatment (placebo or drug) and outcome (number of panic attacks). To properly set up the categories, you can create frequency tables for both the placebo and drug groups.

First, create a frequency table for the placebo group. Count the number of occurrences for each possible number of panic attacks (0, 1, 2, etc.). The table may look like this:

Number of Panic Attacks in Placebo Group:
Number of Panic Attacks Frequency
0 5
1 4
2 2
3 3
4 2
6 3
8 1
9 1
10 1
12 1
13 2
15 2
18 1
22 1
28 1
30 1

Do the same for the drug group:

Number of Panic Attacks in Drug Group:
Number of Panic Attacks Frequency
0 6
1 4
2 4
3 3
4 3
5 2
6 1
7 2
8 2
12 1
14 1
18 1
21 1
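If you would rather not tally by hand, the counting step can be sketched in a few lines of Python. This is only an illustration: the two lists are the columns from the question typed in subject order, and `frequency_table` is a helper name made up for this example.

```python
from collections import Counter

# Weekly panic-attack counts per subject, copied from the question (31 subjects).
placebo = [0, 2, 0, 1, 0, 10, 2, 6, 1, 1, 1, 0, 3, 3, 3, 4, 6, 15, 28, 30,
           13, 15, 18, 9, 8, 22, 13, 12, 6, 4, 0]
drug = [0, 1, 3, 2, 0, 5, 0, 4, 1, 0, 3, 2, 1, 1, 4, 2, 4, 21, 8, 0,
        0, 12, 18, 6, 7, 14, 3, 8, 7, 5, 2]


def frequency_table(counts):
    """Map each observed number of attacks to how many subjects reported it."""
    return dict(sorted(Counter(counts).items()))


placebo_freq = frequency_table(placebo)
drug_freq = frequency_table(drug)

for name, table in (("Placebo", placebo_freq), ("Drug", drug_freq)):
    print(name)
    for attacks, freq in table.items():
        print(f"  {attacks:>2} {freq}")
```

Values that nobody reported (for example 5 attacks in the placebo week) simply do not appear in the resulting table.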

Once you have these frequency tables, you can determine the expected frequencies for each category. The chi-square test is not directional, so the null hypothesis it actually tests is that panic attacks follow the same distribution on the drug as on the placebo; it cannot test "the same as or greater than" directly. Under that null hypothesis, you assume that both groups share the same category probabilities.

To calculate the expected frequencies, pool the two groups and allocate each category's total in proportion to group size. Both groups contain the same 31 subjects (who reported 236 panic attacks in total on the placebo and 144 on the drug), so under the null hypothesis each group is expected to hold half of the observations in every category. Here's how you can calculate the expected frequencies:

Expected Frequency = (Number of subjects in the group / Number of subjects in both groups) * Total observations in the category

For example, the raw data contain 5 placebo weeks and 6 drug weeks with 0 panic attacks, so for the placebo group with 0 panic attacks:
Expected Frequency = (31 / (31 + 31)) * (5 + 6) = 5.5
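One common way to get expected counts when comparing two groups is row total × column total / grand total. The sketch below applies that rule to the question's data; treat the rule itself as an assumption about the intended test, and note that `expected_counts` is a name invented for this example.

```python
from collections import Counter

# Weekly panic-attack counts per subject, copied from the question (31 subjects).
placebo = [0, 2, 0, 1, 0, 10, 2, 6, 1, 1, 1, 0, 3, 3, 3, 4, 6, 15, 28, 30,
           13, 15, 18, 9, 8, 22, 13, 12, 6, 4, 0]
drug = [0, 1, 3, 2, 0, 5, 0, 4, 1, 0, 3, 2, 1, 1, 4, 2, 4, 21, 8, 0,
        0, 12, 18, 6, 7, 14, 3, 8, 7, 5, 2]


def expected_counts(group_a, group_b):
    """Expected (group_a, group_b) counts per category if both share one distribution."""
    freq_a, freq_b = Counter(group_a), Counter(group_b)
    n_a, n_b = len(group_a), len(group_b)
    grand_total = n_a + n_b
    expected = {}
    for category in sorted(set(freq_a) | set(freq_b)):
        # Column total: observations in this category across both groups.
        column_total = freq_a[category] + freq_b[category]
        expected[category] = (n_a * column_total / grand_total,
                              n_b * column_total / grand_total)
    return expected


expected = expected_counts(placebo, drug)
print(expected[0])  # expected (placebo, drug) counts for zero attacks
```

Because both groups have the same size, every expected count is just half of the category's pooled total, which is why most of the unbinned categories end up far too small for the test.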

Repeat this calculation for each category in both frequency tables. Once you have the expected frequencies, you can perform the chi-square test statistic calculation:

Chi-square Test Statistic = Σ((O-E)^2 / E)

That is, for every category in both groups, take the observed frequency minus the expected frequency, square the difference, and divide by the expected frequency; then sum these values to obtain the chi-square test statistic.

Finally, compare the calculated statistic to the critical value from the chi-square distribution with (number of groups - 1) × (number of categories - 1) degrees of freedom, which with two groups is the number of categories minus one, to decide whether to reject the null hypothesis.

Note: With the raw counts as categories, almost every expected frequency falls below 5, which is the usual rule-of-thumb minimum for the chi-square approximation, so you should combine adjacent categories with very low counts until the expected frequencies are large enough. However, always ensure that the modification does not alter the interpretation of your analysis or violate any assumptions of the chi-square test.
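To make the combining step concrete, here is a rough sketch that bins the counts and then computes the statistic. The bin edges in `BINS` are hypothetical, picked only for illustration, and the expected counts use the row total × column total / grand total rule; neither choice comes from the question.

```python
# Weekly panic-attack counts per subject, copied from the question (31 subjects).
placebo = [0, 2, 0, 1, 0, 10, 2, 6, 1, 1, 1, 0, 3, 3, 3, 4, 6, 15, 28, 30,
           13, 15, 18, 9, 8, 22, 13, 12, 6, 4, 0]
drug = [0, 1, 3, 2, 0, 5, 0, 4, 1, 0, 3, 2, 1, 1, 4, 2, 4, 21, 8, 0,
        0, 12, 18, 6, 7, 14, 3, 8, 7, 5, 2]

# Hypothetical bins for illustration: inclusive (low, high); None means no upper limit.
BINS = [(0, 0), (1, 2), (3, 5), (6, 10), (11, None)]


def bin_counts(values, bins=BINS):
    """Count how many values fall into each bin."""
    return [sum(1 for v in values if v >= low and (high is None or v <= high))
            for low, high in bins]


def chi_square_statistic(observed_rows):
    """Sum of (O - E)^2 / E with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed_rows]
    col_totals = [sum(col) for col in zip(*observed_rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed_rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


observed = [bin_counts(placebo), bin_counts(drug)]
statistic = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(BINS) - 1)
print(observed, round(statistic, 3), degrees_of_freedom)
```

With these particular bins every expected count is at least 5.5, but different bin edges will give a different statistic, so the choice should be justified before looking at the result.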

To set up your categories and perform a chi-square test for this scenario, you can create a contingency table that includes the number of subjects falling into each category. Since you want to compare the distribution of panic attacks on the drug to the placebo, you can use the following categories:

Placebo vs. Drug: This represents the two treatment conditions.
No Panic Attack vs. Panic Attack: This represents the occurrence or absence of panic attacks.

To construct the contingency table, you will count the number of subjects in each combination of these categories.

Here's how you can set up the table:

          No Panic Attack   Panic Attack
Placebo   [number count]    [number count]
Drug      [number count]    [number count]

With the data above, 5 of the 31 subjects had no panic attack during the placebo week and 26 had at least one, while during the drug week 6 had no panic attack and 25 had at least one, so the contingency table would look like this:

          No Panic Attack   Panic Attack
Placebo   5                 26
Drug      6                 25
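Counting the four cells directly from the raw data in the question could be sketched like this (the function name `two_by_two` is made up for the example):

```python
# Weekly panic-attack counts per subject, copied from the question (31 subjects).
placebo = [0, 2, 0, 1, 0, 10, 2, 6, 1, 1, 1, 0, 3, 3, 3, 4, 6, 15, 28, 30,
           13, 15, 18, 9, 8, 22, 13, 12, 6, 4, 0]
drug = [0, 1, 3, 2, 0, 5, 0, 4, 1, 0, 3, 2, 1, 1, 4, 2, 4, 21, 8, 0,
        0, 12, 18, 6, 7, 14, 3, 8, 7, 5, 2]


def two_by_two(group_a, group_b):
    """Rows are groups; columns are [no panic attack, at least one panic attack]."""
    def split(values):
        no_attack = sum(1 for v in values if v == 0)
        return [no_attack, len(values) - no_attack]
    return [split(group_a), split(group_b)]


table = two_by_two(placebo, drug)
print(table)
```

As you noted, this table throws away how many attacks each subject had, so it answers a narrower question than the full counts do.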

Once you have set up the contingency table, you can perform the chi-square test. The test compares the observed frequencies (O) with the expected frequencies (E) under the assumption of the null hypothesis.

To calculate the expected frequencies, you can use the row and column totals of the table. For each cell, the expected count is given by (row total × column total) / grand total.

Let's assume the row totals are R1 and R2, the column totals are C1 and C2, and the grand total is T.

For the cell in the top left:
E = (R1 * C1) / T

For the cell in the top right:
E = (R1 * C2) / T

For the cell in the bottom left:
E = (R2 * C1) / T

For the cell in the bottom right:
E = (R2 * C2) / T

After calculating the expected frequencies for each cell, you can use the formula ((O - E)^2) / E to find the contribution to the chi-square statistic from each cell.

Finally, sum up all the contributions from each cell to get the chi-square test statistic.
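The expected-count and summation steps above can be sketched as follows. The table values are the cell counts tallied from the raw data in the question (zero attacks versus at least one), and the function names are just for illustration.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum the (O - E)^2 / E contributions over every cell."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


# Rows: placebo, drug. Columns: no panic attack, at least one panic attack.
# Counts tallied from the question's raw data.
table = [[5, 26], [6, 25]]

expected = expected_counts(table)
statistic = chi_square_statistic(table)
print(expected, round(statistic, 4))
```

This sketch applies no continuity correction; some textbooks and software apply one for 2x2 tables by default, which would give a smaller statistic.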

Once you have the chi-square statistic, you can compare it to the chi-square distribution with the appropriate degrees of freedom to determine the p-value. For a contingency table the degrees of freedom are (rows - 1) × (columns - 1), so for this table with 2 rows and 2 columns it is (2 - 1) × (2 - 1) = 1.
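For the special case of 1 degree of freedom you do not need a statistics library to get the p-value, because the statistic is then distributed as a squared standard normal. A minimal sketch using only the standard library (the function name is invented here, and it is valid for df = 1 only):

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


# 3.84 is the familiar 5% critical value for df = 1, so this prints a value close to 0.05.
print(chi_square_p_value_df1(3.84))
```

For any other number of degrees of freedom you would need a table or a statistics package instead of this shortcut.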

A smaller p-value would provide evidence against the null hypothesis, suggesting that the distribution of panic attacks on the drug is different from the placebo.

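Note: It is important to remember that the chi-square test assumes the observations are independent of one another. Here every subject appears in both the placebo row and the drug row, so the two rows are paired rather than independent, and the result should be interpreted with caution. Also make sure there are no other factors, such as the fixed placebo-then-drug order, that could confound the results.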

Hope this helps!