Question

1. The question asks us to choose a data point on our graph of median home prices (x) against the percentage of students who scored Advanced on the math MCAS (y). We chose (476, 44).

We have to find the residual. I'm confused because the equation I was given is e = y - yhat.
I know one is the expected value and one is what we got.
What do I plug in for y and what do I plug in for yhat?
The value of Rsquared is .72 on the calculator, and if you square it, it's .5184. Do I need to do anything with Rsquared, or is that not involved in the residual?
2. Explain what a residual measures.

Answer 1

For 1):

Y(hat) is the predicted y value. You get it by substituting an x value into the regression equation and evaluating the result. Y is the observed value, that is, the y-coordinate of the actual data point.

Let's use an example. Suppose the regression equation is this:
y(hat) = 2.75 + .5x

If x = 1, then y(hat) = 3.25

Suppose y = 3. Then y - y(hat) would be 3 - 3.25 = -0.25 (using the above example). This would be your residual.
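
The same arithmetic can be written as a short Python sketch. The function names `predict` and `residual` are made up for illustration, and the intercept and slope are the ones from the example equation above, not from your MCAS data.

```python
def predict(x, intercept=2.75, slope=0.5):
    """Predicted value y(hat) from the example line y(hat) = 2.75 + .5x."""
    return intercept + slope * x


def residual(x, y):
    """Residual e = y - y(hat): observed value minus predicted value."""
    return y - predict(x)


y_hat = predict(1)    # 2.75 + 0.5 * 1 = 3.25
e = residual(1, 3)    # 3 - 3.25 = -0.25
print(y_hat, e)
```

For your own data point, you would swap in the intercept and slope your calculator reports and call `residual(476, 44)`.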

For 2):
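A residual measures how far an observed y is from the regression line's prediction, y(hat), at the same value of x. It is the vertical distance between the data point and the line: positive when the point lies above the line, negative when it lies below.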

I hope this helps.

Answer 2

1. To find the residual, you need to plug in the actual value of the dependent variable (y) and the predicted value of the dependent variable (yhat) into the equation e = y - yhat.

In this case, you have chosen a data point (476, 44) which means the actual value of the dependent variable (y) is 44. To find the predicted value of the dependent variable (yhat), you need to use the regression equation or the line of best fit that represents the relationship between the independent variable (x) and the dependent variable (y).

For example, suppose the equation of the line of best fit were yhat = 10x + 20. These are made-up numbers chosen only to show the steps; a line actually fitted to this data would give predictions on the percentage scale. You would plug the x-value of your chosen data point (476) into the equation to find the predicted value of the dependent variable:
yhat = 10(476) + 20 = 4,760 + 20 = 4,780.

Now that you have both the actual (y) and predicted (yhat) values, you can calculate the residual:
e = y - yhat = 44 - 4,780 = -4,736.

So, with this made-up equation, the residual for the data point (476, 44) is -4,736.
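
In practice, yhat comes from the line your calculator fits to the whole data set. The sketch below shows one way that could look in Python, using the standard least-squares formulas. Apart from the point (476, 44), every value in the data lists is hypothetical and is there only so the code runs; `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for the line yhat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: median home prices (x) and percent scoring Advanced (y).
# Only the pair (476, 44) comes from the question.
prices = [250, 320, 400, 476, 550, 610]
pct_advanced = [18, 25, 30, 44, 47, 58]

intercept, slope = fit_line(prices, pct_advanced)

x, y = 476, 44                  # the chosen data point
y_hat = intercept + slope * x   # predicted value from the fitted line
e = y - y_hat                   # residual: actual minus predicted

print(f"yhat = {y_hat:.2f}, residual = {e:.2f}")
```

With your real data, the intercept and slope should match what the calculator's linear regression reports, and the residual follows from the same subtraction.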

Regarding the R-squared value, it is a measure of how well the regression line fits the data points. It represents the proportion of the variance in the dependent variable that can be explained by the independent variable(s). If the .72 on your calculator is actually the correlation coefficient r, then .5184 is the R-squared value; if .72 is already R-squared, there is no reason to square it again. Either way, it is not directly involved in calculating residuals. The residual is simply the difference between the actual value of the dependent variable and the predicted value, regardless of the R-squared value.
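
To see how R-squared relates to the residuals without being part of any single residual, here is a minimal Python sketch. It assumes a simple linear regression with one x variable; the data values are hypothetical, and the helper names are made up for illustration. R-squared is computed from all the residuals together, and for this kind of model it equals the square of the correlation coefficient r.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for yhat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys):
    """R-squared = 1 - (sum of squared residuals) / (total sum of squares)."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [250, 320, 400, 476, 550, 610]
ys = [18, 25, 30, 44, 47, 58]

r = correlation(xs, ys)
r2 = r_squared(xs, ys)
print(f"r = {r:.4f}, r squared = {r * r:.4f}, R-squared = {r2:.4f}")
```

The last two printed numbers agree, which is the same relationship as .72 squared giving .5184 in the question.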

2. A residual measures the difference between the actual value of the dependent variable (y) and the predicted value of the dependent variable (yhat) based on a regression model or a line of best fit. It quantifies how far off the model is at that data point: a positive residual means the model underestimated the observed value, and a negative residual means it overestimated it.

Residuals are useful in evaluating the accuracy and reliability of a regression model. If the residuals are small and randomly distributed around zero, it suggests that the model is a good fit for the data. On the other hand, if the residuals are large or exhibit a pattern, it indicates that the model may not adequately capture the relationship between the independent and dependent variables.

By examining the residuals, you can assess the quality of the regression model and identify any outliers or influential data points that might be affecting the model's performance. Residual analysis helps to validate the assumptions of the regression analysis and identify areas for improvement or further investigation.
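
As a rough illustration of residual analysis, the Python sketch below fits a line to a small hypothetical data set that contains one deliberately unusual point, then lists the residuals and flags the largest one. The data and the helper `fit_line` are assumptions made for this example. For a least-squares line with an intercept, the residuals always sum to zero (up to rounding), so it is their individual sizes and any pattern in them that carry the information.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for yhat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data; the point (4, 9.0) is placed well above the trend on purpose.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 2.9, 4.2, 9.0, 6.1, 6.9]

intercept, slope = fit_line(xs, ys)

# One residual per data point: observed y minus predicted yhat.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Index of the point the line misses by the most, a candidate outlier.
worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.3f}")
print(f"sum of residuals = {sum(residuals):.6f}")
print(f"largest residual is at x = {xs[worst]}")
```

Plotting these residuals against x is the usual next step: a random scatter around zero supports the linear model, while a curve or funnel shape suggests it is missing something.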