1. In an ANOVA, one group has a much larger sample mean than the other. The analyst decides to remove this group and to conduct the analysis on the remaining groups only. Which of the following statements is correct?

Select one:
a. This analysis is not correct, as now the estimate of variance will be based on fewer observations and this will not provide the best test.
b. This analysis is doubtful, as the assumptions of ANOVA may now not be satisfied.
c. This analysis is correct but we will no longer be able to compare the largest mean with the other means.
d. This analysis is good, as it allows the analyst now to investigate if there are differences in the means of the remaining groups.

2. An ANOVA for comparing three means is based on equal sample sizes of 10 each. The within sum of squares is 540. Compute the 5% critical difference for a Kramer-Tukey procedure. You are given that the corresponding critical value from the studentised range distribution is 3.5064. Give your answer to one decimal place.
Answer:

3. Which one of the following statements is true regarding the equation of regression for the response variable y on the explanatory variable x?
Select one:
a. When x changes by 1 the value of y changes by 1 as well.
b. The equation must be of the form y = ax + b.
c. The equation gives the average value of y for a given value of x.
d. The equation gives the value of y for a given value of x.

4. Which one of the following statements is NOT true regarding the assumptions of regression?
Select one:
a. The error terms are independent.
b. The error terms are normally distributed.
c. The error terms have mean 0.
d. The error terms can have different variances.

5. A simple linear regression analysis is run on 50 observations. What are the degrees of freedom for the residual sum of squares?
Answer:


6. A linear regression analysis gives the regression equation as y = 4.5x - 2. What is the change in y when x decreases by 2? Give your answer to one decimal place.
Answer:


7. A regression line is fitted to data, where the x variable takes values between 20 and 50. A prediction is to be made for x = 70. Which one of the following is correct?
Select one:
a. The prediction can be made by using the equation of regression.
b. We should not predict outside the range of x values as the model may not hold outside this range.
c. The prediction will give the value of y as twice the value when x = 35.
d. We will have to use a computer to predict the value of y for the given value of x.

8. Which one of the following is correct?
Select one:
a. Correlation is a measure of linear relationship between variables.
b. A high correlation indicates a good linear relationship between the variables.
c. If the correlation is less than 0.9 then the relationship between the variables is not linear.
d. A correlation of 0 indicates that there is no relationship between the variables.

1: (A) This analysis is not correct, as now the estimate of variance will be based on fewer observations and this will not provide the best test.

2: 5.0

3: (C) The equation gives the average value of y for a given value of x.

4: (D) The error terms can have different variances.

5: 48

6: -9.0

7: (B) We should not predict outside the range of x values as the model may not hold outside this range.

8: (A) Correlation is a measure of linear relationship between variables.

1. The correct statement is a. This analysis is not correct, as now the estimate of variance will be based on fewer observations and this will not provide the best test. Dropping the group discards its observations from the pooled within-group variance estimate, so the error mean square rests on fewer degrees of freedom and the test loses power. All groups should stay in the ANOVA; the outlying mean can then be compared with the others through a multiple-comparison procedure.

2. To compute the 5% critical difference for a Kramer-Tukey procedure with equal sample sizes, we multiply the corresponding critical value from the studentized range distribution (3.5064) by the standard error sqrt(MSE/n), where MSE is the within mean square and n is the common group size. The within sum of squares has N - k = 30 - 3 = 27 degrees of freedom, so MSE = 540/27 = 20, and n = 10. Using this information, we can calculate the critical difference as follows:

Critical Difference = 3.5064 * sqrt(20/10) = 3.5064 * 1.4142 ≈ 4.96

So the 5% critical difference for the Kramer-Tukey procedure is approximately 5.0.

3. The correct statement is c. The equation gives the average value of y for a given value of x. In a regression equation, the coefficients (a and b) represent the relationship between the response variable (y) and the explanatory variable (x). The equation provides an estimate of the expected average value of y for a given value of x, considering the linear relationship between the variables.

4. The correct answer is d. The statement that the error terms can have different variances is the one that is NOT true: regression assumes that the error terms have constant variance (homoscedasticity). This assumption implies that the spread of the residuals should be relatively constant across all levels of the predictor variable. When it is violated, the least-squares coefficient estimates remain unbiased, but the usual standard errors are unreliable, which leads to incorrect inference.

5. The degrees of freedom for the residual sum of squares in a simple linear regression analysis with 50 observations is (n - 2), where n is the number of observations, because two parameters (the intercept and the slope) are estimated. In this case, it would be (50 - 2) = 48. The residual degrees of freedom represent the number of independent pieces of information available for estimating the variability not explained by the regression model.

6. The change in y when x decreases by 2 is the slope multiplied by the change in x. In this case, the regression equation is y = 4.5x - 2. When x decreases by 2, we can substitute x - 2 into the equation to find the new value of y:

y_new = 4.5(x - 2) - 2
y_new = 4.5x - 9 - 2
y_new = 4.5x - 11

Subtracting the original value gives (4.5x - 11) - (4.5x - 2) = -9. The intercept cancels, leaving 4.5 * (-2) = -9. So, the change in y when x decreases by 2 is -9 (or -9.0 to one decimal place).
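As a quick numerical check for question 6, the sketch below evaluates the fitted line y = 4.5x - 2 at two points that are 2 apart. The starting value `x0 = 10.0` is an arbitrary assumption; since the line is linear, any starting value gives the same difference.

```python
def predict(x, slope=4.5, intercept=-2.0):
    """Fitted value from the regression line y = 4.5x - 2."""
    return slope * x + intercept

x0 = 10.0                                # arbitrary starting value (assumption)
change = predict(x0 - 2) - predict(x0)   # new y minus old y
print(change)  # -9.0
```

The intercept drops out of the difference, so only the slope and the change in x matter.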

7. The correct statement is b. We should not predict outside the range of x values as the model may not hold outside this range. When making predictions using a regression model, it is important to consider the range of the predictor variable for which the model was developed. Extrapolating outside this range can potentially lead to inaccurate predictions as the model's assumptions may not hold.

8. The correct statement is a. Correlation is a measure of linear relationship between variables. Correlation measures the strength and direction of the linear association between two variables. It ranges from -1 to +1, where -1 indicates a perfect negative linear relationship, +1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. Correlation does not capture non-linear relationships between variables, which is why option d is false.
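To illustrate why a correlation of 0 does not rule out a relationship, here is a minimal sketch with made-up data in which y is determined exactly by x (y = x^2) yet the Pearson correlation is zero. The helper `pearson_r` is written out by hand for illustration and is not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data: x symmetric about 0, y an exact quadratic function of x
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0: no linear association, despite a perfect quadratic relationship
```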