Suppose that in the aerobic exercise group we also measured the number of hours of aerobic exercise per week and the mean is 5.2 hours with a standard deviation of 2.1 hours. The sample correlation is -0.42.

a) Is there evidence of a significant correlation between number of hours of exercise per week and HDL cholesterol level? Run the test at a 5% level of significance.
b) Estimate the equation of the regression line that best describes the relationship between number of hours of exercise per week and HDL cholesterol level (Assume that the dependent variable is HDL level).
c) Estimate the HDL level for a person who exercises 7 hours per week.
d) Estimate the HDL level for a person who does not exercise.

To answer these questions, we will need to use statistical tools like hypothesis testing and regression analysis. Let's go step by step:

a) Hypothesis Testing:
To test the correlation between the number of hours of exercise per week and HDL cholesterol level, we will use a hypothesis test. We need to test if the correlation is significantly different from zero.

Null hypothesis (H0): The population correlation between the number of hours of exercise per week and HDL cholesterol level is zero (ρ = 0).
Alternative hypothesis (H1): The population correlation between the number of hours of exercise per week and HDL cholesterol level is not zero (ρ ≠ 0).

To test the hypothesis, we will use the sample correlation coefficient and the sample size. We will calculate the t-statistic using the formula:
t = (r * sqrt(n-2)) / sqrt(1 - r^2)

where r is the sample correlation coefficient and n is the sample size.

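We then compare the t-statistic to the critical value of the t-distribution with n - 2 degrees of freedom at a 5% level of significance (two-tailed test). If |t| exceeds the critical value, we reject H0 and conclude that there is evidence of a significant correlation; otherwise, we fail to reject H0.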
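As a rough illustration of the test in part a), the calculation can be sketched in Python. The sample size is not restated in this part of the problem, so n = 30 below is only a placeholder, and the function name correlation_t_statistic is made up for this example. The critical value 2.048 is the two-tailed 5% value for 28 degrees of freedom from a standard t table and must be changed if n changes.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0 given sample correlation r and sample size n."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r) >= 1:
        raise ValueError("r must be strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

r = -0.42           # sample correlation given in the problem
n = 30              # PLACEHOLDER sample size (assumption); use the actual group size
t_critical = 2.048  # two-tailed 5% critical value for df = 28, from a t table

t, df = correlation_t_statistic(r, n)
reject_h0 = abs(t) > t_critical  # True means evidence of a nonzero correlation
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

Because r is negative, t is negative as well; the two-tailed test looks only at its absolute value.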

b) Regression Analysis:
To estimate the equation of the regression line, we will use simple linear regression. The equation of a regression line is of the form: Y = a + bX, where Y is the dependent variable (HDL cholesterol level) and X is the independent variable (number of hours of exercise per week). We need to estimate the values of a and b.

The formula for the slope (b) of a regression line is: b = r * (sy / sx),
where r is the sample correlation coefficient, sy is the standard deviation of the dependent variable, and sx is the standard deviation of the independent variable.

The formula for the y-intercept (a) is: a = ȳ - b * x̄,
where ȳ is the mean of the dependent variable and x̄ is the mean of the independent variable.
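The slope and intercept formulas can be sketched in Python as follows. The mean and standard deviation of hours of exercise (5.2 and 2.1) and the correlation (-0.42) come from the problem; the HDL mean and standard deviation are not restated here, so the values 50.0 and 10.0 below are placeholders only, and regression_line is a name made up for this example.

```python
def regression_line(r, x_mean, x_sd, y_mean, y_sd):
    """Return (a, b) for the least-squares line Y = a + bX from summary statistics."""
    if x_sd <= 0:
        raise ValueError("x_sd must be positive")
    b = r * (y_sd / x_sd)      # slope: b = r * (sy / sx)
    a = y_mean - b * x_mean    # intercept: a = y-bar - b * x-bar
    return a, b

r = -0.42        # sample correlation (from the problem)
x_mean = 5.2     # mean hours of exercise per week (from the problem)
x_sd = 2.1       # SD of hours of exercise per week (from the problem)
y_mean = 50.0    # PLACEHOLDER HDL mean (assumption); use the actual value
y_sd = 10.0      # PLACEHOLDER HDL SD (assumption); use the actual value

a, b = regression_line(r, x_mean, x_sd, y_mean, y_sd)
print(f"Estimated line: HDL = {a:.2f} + ({b:.2f}) * hours")
```

The slope always has the same sign as r, so with r = -0.42 the fitted line slopes downward whatever the HDL values turn out to be.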

c) To estimate the HDL level for a person who exercises 7 hours per week, we can plug in the value of X (7) into the regression equation: Y = a + bX.

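d) To estimate the HDL level for a person who does not exercise, we can plug in the value of X as 0 into the regression equation: Y = a + bX. With X = 0 the estimate is simply the y-intercept a. If no one in the sample exercised close to 0 hours per week, this estimate is an extrapolation beyond the observed data and should be interpreted with caution.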
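Parts c) and d) both amount to evaluating the fitted line at a given X. A minimal Python sketch, using placeholder coefficients a = 60.4 and b = -2.0 that are assumptions for illustration only (the real values come from part b), with predict_hdl as a made-up helper name:

```python
def predict_hdl(a, b, hours):
    """Evaluate the fitted line Y = a + b * X at X = hours."""
    return a + b * hours

a = 60.4   # PLACEHOLDER intercept (assumption); use the estimate from part b)
b = -2.0   # PLACEHOLDER slope (assumption); use the estimate from part b)

hdl_at_7 = predict_hdl(a, b, 7)   # part c): 7 hours of exercise per week
hdl_at_0 = predict_hdl(a, b, 0)   # part d): no exercise, equals the intercept a

print(f"Estimated HDL at 7 hours: {hdl_at_7:.1f}")
print(f"Estimated HDL at 0 hours: {hdl_at_0:.1f}")
```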

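To obtain numerical answers, substitute the sample size n and the mean and standard deviation of HDL level for the aerobic exercise group, which are not restated in this part of the problem, into the formulas above.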