To conduct a valid regression analysis, both x and y must be approximately normally distributed.



FALSE... Sorry Math Guru.

Conditions for a regression analysis are:

We have n observations on an explanatory variable x and a response variable y. Our goal is to study or predict the behavior of y for given values of x.

For any fixed value of x, the response varies according to a normal distribution. Repeated responses y are independent of each other.
The mean response μ_y has a straight-line relationship with x:
μ_y = α + βx

The slope β and intercept α are unknown parameters.

The standard deviation of y (call it σ) is the same for all values of x. The value of σ is unknown.


To conduct a valid regression analysis, there are assumptions that need to be met, but the normality of the distribution of the variables (x and y) is not one of them.

The assumptions for a valid regression analysis typically include linearity, independence of the observations, constant variance (homoscedasticity), and, when there is more than one explanatory variable, absence of severe multicollinearity. Additionally, for hypothesis tests and confidence intervals, the residuals should be approximately normally distributed. But the variables themselves (x and y) are not required to be normally distributed.
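
A quick simulation can make this concrete. The sketch below is only an illustration under made-up assumptions: the parameter values (alpha = 1, beta = 2, sigma = 0.5), the sample size, and the choice of an exponential distribution for x are all hypothetical and not taken from the question. It draws a strongly skewed x, generates y from the straight-line model with normal errors, fits the line by least squares with NumPy, and compares the skewness of x, y, and the residuals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for this illustration.
alpha, beta, sigma = 1.0, 2.0, 0.5
n = 5000

# x is strongly right-skewed (exponential), so it is clearly not normal.
x = rng.exponential(scale=2.0, size=n)

# For each fixed x, y varies normally around the line alpha + beta * x.
y = alpha + beta * x + rng.normal(0.0, sigma, size=n)

# Least-squares fit; np.polyfit returns the highest-degree coefficient first.
beta_hat, alpha_hat = np.polyfit(x, y, deg=1)
residuals = y - (alpha_hat + beta_hat * x)


def skewness(values):
    """Sample skewness: third central moment over variance to the power 1.5."""
    centered = np.asarray(values, dtype=float) - np.mean(values)
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5


print(f"estimated intercept: {alpha_hat:.3f}, estimated slope: {beta_hat:.3f}")
print(f"skewness of x:         {skewness(x):.2f}")
print(f"skewness of y:         {skewness(y):.2f}")
print(f"skewness of residuals: {skewness(residuals):.2f}")
```

In a run like this, x and y should both come out heavily skewed, while the residuals should be roughly symmetric and the fitted slope and intercept should land close to the values used to generate the data. That is the point of the answer above: the normality condition is about the variation of y around the line, not about x or y on their own.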
