You would like to determine the percentage of coffee drinkers in your university, and have collected the following binary data set from randomly selected students on campus, with 1 for a coffee drinker and 0 otherwise:

0,0,0,1,1,0,1,0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,0,1.

Let Yi denote the i-th number in this list. You decide to model this data set under the following assumptions: Y1,…,Yn are identically distributed as some random variable Y; Y1,…,Yn are independent; and each Yi only takes the value 0 or 1.
Under these assumptions, how many unknowns are needed to specify the distribution of Y?

Well, first of all, let me just say that determining the percentage of coffee drinkers in your university sounds like a really stimulating task!

To answer your question: since Y only takes the values 0 and 1, it follows a Bernoulli distribution, with probability p of success (Y = 1) and 1-p of failure (Y = 0), so we only need to determine the value of one unknown: p. Equivalently, this is a binomial distribution with a single trial (n = 1); because the number of trials is fixed, the probability of success p is the only quantity left to determine.

So, to summarize, you only need to determine the value of one unknown (p) to fully specify the distribution of Y. Just like a good cup of coffee, one unknown is all you need to get a satisfying answer!

To specify the distribution of Y under these assumptions, we need to determine the parameters of the distribution. In this case, there are two possible values for Y (0 or 1), so we can use a Bernoulli distribution.

The Bernoulli distribution is characterized by a single parameter, p, which represents the probability of success (in this case, the probability of being a coffee drinker).

Therefore, we need one unknown (p) to specify the distribution of Y.
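As a minimal sketch of why a single number is enough, the Bernoulli probabilities can be written as a small function. The name `bernoulli_pmf` and the value p = 0.3 are illustrative assumptions, not part of the question; the point is only that P(Y = 1) and P(Y = 0) are both fixed once p is chosen.

```python
def bernoulli_pmf(y: int, p: float) -> float:
    """Return P(Y = y) for a Bernoulli variable with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if y == 1:
        return p          # probability of a coffee drinker
    if y == 0:
        return 1.0 - p    # probability of a non-drinker
    return 0.0            # Y never takes any other value


# p = 0.3 is an arbitrary illustrative value, not an estimate from the data.
p = 0.3
print("P(Y = 1) =", bernoulli_pmf(1, p))
print("P(Y = 0) =", bernoulli_pmf(0, p))
```

Both probabilities are determined by p alone, and they always sum to 1.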

To determine the distribution of Y, note that any random variable taking only the values 0 and 1 follows a Bernoulli distribution, so no further modeling assumption is needed. What remains is the probability of success (p), which represents the probability of a student being a coffee drinker.

In this case, we can estimate p from the sample by counting the number of 1's in the data set and dividing by the total number of observations:

estimate of p = (number of 1's) / (total number of observations)

From the given data set: 0,0,0,1,1,0,1,0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,0,1

There are 10 occurrences of 1 in the data set, and the total number of observations is 24. Therefore, the sample estimate of p is 10/24 ≈ 0.4167.
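The count can be checked with a short script. This is only a sketch: the variable names (`data`, `ones`, `p_hat`) are chosen here for illustration, and the list is copied from the data set in the question.

```python
# Data set from the question: 1 = coffee drinker, 0 = otherwise.
data = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1,
        0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]

n = len(data)        # total number of observations
ones = sum(data)     # number of 1's, since every entry is 0 or 1
p_hat = ones / n     # sample proportion, used as the estimate of p

print(n)                 # 24
print(ones)              # 10
print(round(p_hat, 4))   # 0.4167
```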

The value 10/24 computed from the sample is only an estimate; the true p remains unknown. To specify the distribution of Y, we need the value of p, which represents the probability of success in a Bernoulli distribution. Since p is the single unknown parameter, only one unknown is needed to specify the distribution of Y in this case.