Maximum Likelihood Estimate

Let theta* = (theta*_a, theta*_b, ..., theta*_z) be the parameters of the multinomial model M* that maximize the likelihood of generating a document D.

Further, it is known that the letter 'e' is twice as likely to occur as the letter 'z' in document D.

What is the correct expression relating theta*_e and theta*_z?

To find the correct expression relating theta*_e and theta*_z, we need to use the Maximum Likelihood Estimation (MLE) principle.

In a multinomial model, the probability of generating a document D is determined by the probabilities of each letter occurring in the document. The MLE principle states that we need to find the values of the parameters (theta*) that maximize the likelihood of the observed data (document D).

Let n_e denote the number of occurrences of the letter 'e' in document D, and let n_z denote the number of occurrences of the letter 'z'.
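For reference, the standard derivation of the multinomial MLE can be sketched as follows. This is a sketch under stated assumptions, not part of the original question: the symbols n_i (the count of letter i in D), N (the total number of letters in D), and the Lagrange multiplier lambda are notation introduced here, and the constant term in the log-likelihood (the multinomial coefficient) is dropped because it does not depend on theta. The block assumes the amsmath package.

```latex
\begin{align*}
\log L(\theta) &= \sum_{i \in \{a,\dots,z\}} n_i \log \theta_i + \text{const},
  \qquad \text{subject to } \sum_{i} \theta_i = 1 \\
\mathcal{L}(\theta, \lambda) &= \sum_{i} n_i \log \theta_i
  + \lambda \Bigl(1 - \sum_{i} \theta_i\Bigr) \\
\frac{\partial \mathcal{L}}{\partial \theta_i} &= \frac{n_i}{\theta_i} - \lambda = 0
  \;\Longrightarrow\; \theta_i = \frac{n_i}{\lambda} \\
\sum_{i} \theta_i = 1 &\;\Longrightarrow\; \lambda = \sum_{i} n_i = N
  \;\Longrightarrow\; \theta^*_i = \frac{n_i}{N}
\end{align*}
```

In particular, whenever n_z > 0, theta*_e / theta*_z = n_e / n_z: the ratio of two MLE parameters equals the ratio of the corresponding letter counts.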

The probability of generating the letter 'e' in document D, given the parameter theta*_e, can be written as:
P('e' | theta*_e) = theta*_e

Similarly, the probability of generating the letter 'z', given the parameter theta*_z, can be written as:
P('z' | theta*_z) = theta*_z

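Reading "twice as likely to occur in document D" as a statement about observed frequencies, the letter 'e' occurs twice as often as the letter 'z': n_e = 2 * n_z. For a multinomial model, the MLE of each parameter is the relative frequency of the corresponding letter, theta*_i = n_i / N, where N is the total number of letters in D. We can therefore express the relationship as:
P('e' | theta*_e) = 2 * P('z' | theta*_z)

Substituting the above expressions, we have:
theta*_e = n_e / N = 2 * n_z / N = 2 * theta*_z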

Therefore, the correct expression relating theta*_e and theta*_z is:
theta*_e = 2 * theta*_z
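As a quick numerical check, here is a minimal Python sketch that estimates the parameters from letter counts. The function name mle_letter_probs and the toy document are hypothetical, made up for illustration. The sketch assumes the model covers only the letters a to z and that the MLE is the relative frequency n_c / N.

```python
from collections import Counter
import string


def mle_letter_probs(document: str) -> dict[str, float]:
    """Return the multinomial MLE theta*_c = n_c / N for each letter seen in `document`.

    Hypothetical helper: only the letters a-z are counted, case-insensitively.
    """
    letters = [ch for ch in document.lower() if ch in string.ascii_lowercase]
    if not letters:
        raise ValueError("document contains no letters a-z")
    counts = Counter(letters)   # n_c for every letter c that occurs
    total = len(letters)        # N, the total number of letters
    return {ch: n / total for ch, n in counts.items()}


# Toy document (made up): 'e' occurs 4 times and 'z' occurs 2 times.
toy_document = "eeee zz abc"
theta = mle_letter_probs(toy_document)

# Ratio of the two estimated parameters; with n_e = 2 * n_z it should be about 2.
ratio = theta["e"] / theta["z"]
print(f"theta_e = {theta['e']:.4f}, theta_z = {theta['z']:.4f}, ratio = {ratio:.2f}")
```

Because both estimates share the same denominator N, the ratio depends only on the counts n_e and n_z, which is why the other letters in the toy document do not affect it.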