Question

3. For the following data set, the dependent variable (response) is the first variable. Choose the independent variables (predictors) as you judge appropriate. Use a spreadsheet or a statistical package (e.g., SPSS or SYSTAT or MINITAB) to perform the necessary regression calculations and to obtain the required graphs.

Consumer Prices, Capacity Utilization, and Money Supply (n = 41, k = 4)
Year ChgCPI CapUtil ChgM1 ChgM2 ChgM3
1960 0.7 80.1 0.5 4.9 5.2
1961 1.3 77.3 3.2 7.4 8.1
1962 1.6 81.4 1.8 8.1 8.9
1963 1.0 83.5 3.7 8.4 9.3
1964 1.9 85.6 4.6 8.0 9.0
1965 3.5 89.5 4.7 8.1 9.0
1966 3.0 91.1 2.5 4.6 4.8
1967 4.7 87.2 6.6 9.3 10.4
1968 6.2 87.1 7.7 8.0 8.8
1969 5.6 86.6 3.3 3.7 1.4
1970 3.3 79.4 5.1 6.5 9.9
1971 3.4 77.9 6.5 13.4 14.6
1972 8.7 83.4 9.2 13.0 14.2
1973 12.3 87.7 5.5 6.6 11.2
1974 6.9 83.4 4.3 5.4 8.6
1975 4.9 72.9 4.7 12.7 9.4
1976 6.7 78.2 6.7 13.4 11.9
1977 9.0 82.6 8.0 10.3 12.2
1978 13.3 85.2 8.0 7.5 11.8
1979 12.5 85.3 6.9 7.9 10.0
1980 8.9 79.5 7.0 8.6 10.3
1981 3.8 78.3 6.9 9.7 13.0
1982 3.8 71.8 8.7 8.8 9.1
1983 3.9 74.4 9.8 11.3 9.6
1984 3.8 79.8 5.8 8.6 10.9
1985 1.1 78.8 12.3 8.0 7.3
1986 4.4 78.7 16.9 9.5 9.1
1987 4.4 81.3 3.5 3.6 5.4
1988 4.6 83.8 4.9 5.8 6.6
1989 6.1 83.6 0.8 5.5 3.8
1990 3.1 81.4 4.0 3.8 1.9
1991 2.9 77.9 8.7 3.0 1.3
1992 2.7 79.4 14.3 1.6 0.3
1993 2.7 80.4 10.3 1.5 1.5
1994 2.5 82.5 1.8 0.4 1.9
1995 3.3 82.6 -2.1 4.1 6.1
1996 1.7 81.6 -4.1 4.8 7.5
1997 1.6 82.7 -0.7 5.7 9.2
1998 2.7 81.4 2.2 8.8 11.0
1999 3.4 80.6 2.5 6.1 8.3
2000 1.6 80.7 -3.3 6.1 8.9

Variable Names: ChgCPI = percent change in the Consumer Price Index (CPI) over previous year, CapUtil = percent utilization of manufacturing capacity in current year, ChgM1 = percent change in currency and demand deposits (M1) over previous year, ChgM2 = percent change in small time deposits and other near-money (M2) over previous year, ChgM3 = percent change in large time deposits, Eurodollars, and other institutional balances (M3) over previous year

Write a concise report answering the following questions 1 through 10. Label sections of your report to correspond to the questions. Insert tables and graphs in your report as appropriate.

1. Is this cross-sectional data or time-series data? What is the unit of observation?

2. Are the X and Y data well-conditioned? If not, make any transformations that may be necessary and explain.
3. State your a priori hypotheses about the sign (+ or -) of each predictor and your reasoning about cause and effect. Would the intercept have meaning in this problem? Explain.
4. Perform the regression and write the estimated regression equation (round off to 3 or 4 significant digits for clarity). Do the coefficient signs agree with your a priori expectations?
5. Does the 95 percent confidence interval for each predictor coefficient include zero? What conclusion can you draw?
6. Do a two-tailed ‘t’ test for zero slope for each predictor coefficient at α = .05.
7. (a) Which p-values indicate predictor significance at α = .05? (b) Do the p-values support the conclusions you reached from the ‘t’ tests? (c) Do you prefer the ‘t’ test or the p-value approach? Why?
8. Based on the R² and ANOVA table for your model, how would you describe the fit?
9. Use the standard error to construct an approximate prediction interval for Y. Based on the width of this prediction interval, would you say the predictions are good enough to have practical value?
10. (a) Generate a correlation matrix for your predictors. Round the results to three decimal places.
(b) Based on the correlation matrix, is collinearity a problem? Explain.

Source: Applied Statistics in Business and Economics, Tata McGraw-Hill

*****

Answer 1

To answer these questions, we need to perform regression analysis on the given data set. Let's go through each question step by step:

1. Is this cross-sectional data or time-series data? What is the unit of observation?

Answer: This is time-series data because it includes observations over multiple years. The unit of observation is each year in the data set.

2. Are the X and Y data well-conditioned? If not, make any transformations that may be necessary and explain.

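Answer: Data are well-conditioned when all variables have similar magnitudes, with no very large or very small values that could cause rounding problems or awkward coefficients. Here every variable is a percentage, and the values range from -4.1 (ChgM1) to 91.1 (CapUtil), so the data are well-conditioned and no transformation is needed. It is still worth scanning for outliers and unusual values (for example, ChgM1 = 16.9 in 1986) before fitting the model. Collinearity among the predictors is examined in question 10.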

3. State your a priori hypotheses about the sign (+ or -) of each predictor and your reasoning about cause and effect. Would the intercept have meaning in this problem? Explain.

Answer: The a priori hypotheses should rest on economic theory. The response is ChgCPI, so each hypothesis concerns a predictor's effect on inflation. CapUtil: positive (+), because when factories run near capacity, added demand tends to bid prices up rather than raise output. ChgM1, ChgM2, ChgM3: positive (+), because money growth that outpaces output is expected to push prices up, although the effect may appear with a lag. The intercept is the predicted ChgCPI when every predictor equals zero. It has no practical meaning here, because zero capacity utilization lies far outside the observed range (71.8 to 91.1 percent).

4. Perform the regression and write the estimated regression equation (round off to 3 or 4 significant digits for clarity). Do the coefficient signs agree with your a priori expectations?

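Answer: To perform the regression analysis, we can use a statistical package such as SPSS or a spreadsheet such as Excel. With all four predictors, the fitted equation has the form ChgCPI = b0 + b1 CapUtil + b2 ChgM1 + b3 ChgM2 + b4 ChgM3, where the output supplies the intercept b0 and the slopes b1 through b4. Compare the sign of each slope with the hypotheses from question 3.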
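For readers without a statistics package, the fit can be sketched in plain Python. This is a minimal illustration, not the textbook's method: it assumes all four predictors are kept, solves the normal equations directly, and uses helper names (`load`, `solve`, `ols`) invented for this example.

```python
# Rows copied from the question's table: Year ChgCPI CapUtil ChgM1 ChgM2 ChgM3
DATA = """
1960 0.7 80.1 0.5 4.9 5.2
1961 1.3 77.3 3.2 7.4 8.1
1962 1.6 81.4 1.8 8.1 8.9
1963 1.0 83.5 3.7 8.4 9.3
1964 1.9 85.6 4.6 8.0 9.0
1965 3.5 89.5 4.7 8.1 9.0
1966 3.0 91.1 2.5 4.6 4.8
1967 4.7 87.2 6.6 9.3 10.4
1968 6.2 87.1 7.7 8.0 8.8
1969 5.6 86.6 3.3 3.7 1.4
1970 3.3 79.4 5.1 6.5 9.9
1971 3.4 77.9 6.5 13.4 14.6
1972 8.7 83.4 9.2 13.0 14.2
1973 12.3 87.7 5.5 6.6 11.2
1974 6.9 83.4 4.3 5.4 8.6
1975 4.9 72.9 4.7 12.7 9.4
1976 6.7 78.2 6.7 13.4 11.9
1977 9.0 82.6 8.0 10.3 12.2
1978 13.3 85.2 8.0 7.5 11.8
1979 12.5 85.3 6.9 7.9 10.0
1980 8.9 79.5 7.0 8.6 10.3
1981 3.8 78.3 6.9 9.7 13.0
1982 3.8 71.8 8.7 8.8 9.1
1983 3.9 74.4 9.8 11.3 9.6
1984 3.8 79.8 5.8 8.6 10.9
1985 1.1 78.8 12.3 8.0 7.3
1986 4.4 78.7 16.9 9.5 9.1
1987 4.4 81.3 3.5 3.6 5.4
1988 4.6 83.8 4.9 5.8 6.6
1989 6.1 83.6 0.8 5.5 3.8
1990 3.1 81.4 4.0 3.8 1.9
1991 2.9 77.9 8.7 3.0 1.3
1992 2.7 79.4 14.3 1.6 0.3
1993 2.7 80.4 10.3 1.5 1.5
1994 2.5 82.5 1.8 0.4 1.9
1995 3.3 82.6 -2.1 4.1 6.1
1996 1.7 81.6 -4.1 4.8 7.5
1997 1.6 82.7 -0.7 5.7 9.2
1998 2.7 81.4 2.2 8.8 11.0
1999 3.4 80.6 2.5 6.1 8.3
2000 1.6 80.7 -3.3 6.1 8.9
"""
PREDICTORS = ["CapUtil", "ChgM1", "ChgM2", "ChgM3"]

def load(text):
    """Return (X, y): X rows start with 1.0 for the intercept; y is ChgCPI."""
    rows = [[float(v) for v in line.split()] for line in text.strip().splitlines()]
    return [[1.0] + r[2:] for r in rows], [r[1] for r in rows]

def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    m = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)

X, y = load(DATA)
b = ols(X, y)                                  # b[0] is the intercept
resid = [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]
n, k = len(y), len(b) - 1
ybar = sum(y) / n
sse = sum(e * e for e in resid)                # unexplained sum of squares
sst = sum((yi - ybar) ** 2 for yi in y)        # total sum of squares
r2 = 1 - sse / sst
s = (sse / (n - k - 1)) ** 0.5                 # standard error of the regression

terms = " ".join(f"{c:+.4g} {name}" for c, name in zip(b[1:], PREDICTORS))
print(f"ChgCPI = {b[0]:.4g} {terms}")
print(f"R^2 = {r2:.3f}, s = {s:.3f}, n = {n}, k = {k}")
```

The printed equation should be checked against the output of whichever package is used for the report; the sketch does not report the coefficient standard errors, t statistics, or p-values that questions 5 through 7 require.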

5. Does the 95 percent confidence interval for each predictor coefficient include zero? What conclusion can you draw?

Answer: The 95 percent confidence interval for a coefficient gives a range of plausible values for the true slope; the procedure that builds it captures the true value in 95 percent of samples. If the interval includes zero, we cannot reject the hypothesis of zero slope at α = .05, so that predictor does not contribute significantly given the other predictors in the model. If the interval excludes zero, the predictor is significant.
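Written out, the interval takes the usual form below. The degrees of freedom assume that all four predictors are kept (n = 41, k = 4); with fewer predictors the critical value changes slightly.

```latex
b_j \pm t_{\alpha/2,\;n-k-1}\, s_{b_j},
\qquad n-k-1 = 41-4-1 = 36,
\qquad t_{.025,\,36} \approx 2.028
```

Here b_j is the estimated coefficient and s_{b_j} is its standard error, both read from the regression output.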

6. Do a two-tailed 't' test for zero slope for each predictor coefficient at α = .05.

Answer: For each predictor, test H0: βj = 0 against H1: βj ≠ 0 using t = bj / (standard error of bj). With all four predictors in the model, the degrees of freedom are n - k - 1 = 41 - 4 - 1 = 36, and the two-tailed critical value at α = .05 is about ±2.028. Reject H0 when |t| exceeds the critical value.
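The decision rule can be stated compactly. As before, the critical value assumes k = 4 predictors and would differ for a smaller model.

```latex
t_{\text{calc}} = \frac{b_j - 0}{s_{b_j}},
\qquad \text{reject } H_0\colon \beta_j = 0
\ \text{ if } \ \lvert t_{\text{calc}} \rvert > t_{.025,\,36} \approx 2.028
```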

7. (a) Which p-values indicate predictor significance at α=0.05? (b) Do the p-values support the conclusions you reached from the 't' tests? (c) Do you prefer the 't' test or the p-value approach? Why?

Answer: (a) A predictor is significant at α = .05 when its two-tailed p-value is less than .05. (b) The p-values always agree with the 't' tests, because both come from the same t statistic: p < .05 exactly when |t| exceeds the critical value. (c) Both approaches give the same decision. The p-value is often preferred because it needs no table lookup and shows how strong the evidence is, not just whether a cutoff was crossed.

8. Based on the R² and ANOVA table for your model, how would you describe the fit?

Answer: R² is the proportion of variance in the dependent variable that is explained by the independent variables. A value near 1 indicates a good fit, and a value near 0 indicates a poor fit. The ANOVA table tests the overall significance of the model: the F statistic compares explained with unexplained variation per degree of freedom, and a p-value below .05 indicates that at least one predictor helps explain the dependent variable.
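Both measures come from the sums of squares in the ANOVA table. The notation below (SSR for regression, SSE for error, SST for total) is one common convention and may differ from the labels a given package prints; the degrees of freedom again assume k = 4.

```latex
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST},
\qquad
F = \frac{SSR / k}{SSE / (n-k-1)}
\quad \text{with } (k,\; n-k-1) = (4,\; 36) \text{ degrees of freedom}
```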

9. Use the standard error to construct an approximate prediction interval for Y. Based on the width of this prediction interval, would you say the predictions are good enough to have practical value?

Answer: An approximate 95 percent prediction interval is the fitted value plus or minus about two standard errors of the regression, so the interval is roughly four standard errors wide. Compare that width with the spread of ChgCPI in the data (0.7 to 13.3 percent): if the interval covers a large share of that range, the predictions are too imprecise to have much practical value. A narrow interval indicates more precise and more useful predictions.
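The quick interval can be written as follows. It is an approximation: it ignores the extra uncertainty that arises when the predictor values are far from their means, so the exact interval from a statistical package will be somewhat wider.

```latex
\hat{y}_i \pm t_{\alpha/2,\;n-k-1}\, s \;\approx\; \hat{y}_i \pm 2s,
\qquad
s = \sqrt{\frac{SSE}{n-k-1}}
```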

10. (a) Generate a correlation matrix for your predictors. Round the results to three decimal places. (b) Based on the correlation matrix, is collinearity a problem? Explain.

Answer: A correlation matrix can be generated to measure the pairwise relationships between the predictors. Correlation values range from -1 to +1, with 0 indicating no correlation and values closer to -1 or +1 indicating stronger correlations. Collinearity may be a problem if there are high correlations between predictors. High correlations can lead to unstable coefficient estimates and difficulty in determining the independent effect of each predictor.
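The matrix for part (a) can be produced with a few lines of plain Python. This is a sketch under the assumption that all four predictors are of interest; the helper name `pearson` is invented for the example, and a spreadsheet or statistics package gives the same numbers.

```python
# Rows copied from the question's table: Year ChgCPI CapUtil ChgM1 ChgM2 ChgM3
DATA = """
1960 0.7 80.1 0.5 4.9 5.2
1961 1.3 77.3 3.2 7.4 8.1
1962 1.6 81.4 1.8 8.1 8.9
1963 1.0 83.5 3.7 8.4 9.3
1964 1.9 85.6 4.6 8.0 9.0
1965 3.5 89.5 4.7 8.1 9.0
1966 3.0 91.1 2.5 4.6 4.8
1967 4.7 87.2 6.6 9.3 10.4
1968 6.2 87.1 7.7 8.0 8.8
1969 5.6 86.6 3.3 3.7 1.4
1970 3.3 79.4 5.1 6.5 9.9
1971 3.4 77.9 6.5 13.4 14.6
1972 8.7 83.4 9.2 13.0 14.2
1973 12.3 87.7 5.5 6.6 11.2
1974 6.9 83.4 4.3 5.4 8.6
1975 4.9 72.9 4.7 12.7 9.4
1976 6.7 78.2 6.7 13.4 11.9
1977 9.0 82.6 8.0 10.3 12.2
1978 13.3 85.2 8.0 7.5 11.8
1979 12.5 85.3 6.9 7.9 10.0
1980 8.9 79.5 7.0 8.6 10.3
1981 3.8 78.3 6.9 9.7 13.0
1982 3.8 71.8 8.7 8.8 9.1
1983 3.9 74.4 9.8 11.3 9.6
1984 3.8 79.8 5.8 8.6 10.9
1985 1.1 78.8 12.3 8.0 7.3
1986 4.4 78.7 16.9 9.5 9.1
1987 4.4 81.3 3.5 3.6 5.4
1988 4.6 83.8 4.9 5.8 6.6
1989 6.1 83.6 0.8 5.5 3.8
1990 3.1 81.4 4.0 3.8 1.9
1991 2.9 77.9 8.7 3.0 1.3
1992 2.7 79.4 14.3 1.6 0.3
1993 2.7 80.4 10.3 1.5 1.5
1994 2.5 82.5 1.8 0.4 1.9
1995 3.3 82.6 -2.1 4.1 6.1
1996 1.7 81.6 -4.1 4.8 7.5
1997 1.6 82.7 -0.7 5.7 9.2
1998 2.7 81.4 2.2 8.8 11.0
1999 3.4 80.6 2.5 6.1 8.3
2000 1.6 80.7 -3.3 6.1 8.9
"""
PREDICTORS = ["CapUtil", "ChgM1", "ChgM2", "ChgM3"]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Drop Year and ChgCPI (the response); keep the four predictor columns.
rows = [[float(v) for v in line.split()[2:]] for line in DATA.strip().splitlines()]
cols = [[r[i] for r in rows] for i in range(len(PREDICTORS))]
corr = [[round(pearson(a, b), 3) for b in cols] for a in cols]

print(" " * 8 + "".join(f"{name:>9s}" for name in PREDICTORS))
for name, row in zip(PREDICTORS, corr):
    print(f"{name:8s}" + "".join(f"{v:9.3f}" for v in row))
```

Off-diagonal entries close to +1 or -1 flag pairs of predictors that carry nearly the same information; the three money supply measures are the natural pairs to inspect first.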

Answer 2

1. This is time-series data because the data is collected over multiple years. The unit of observation is the year.

2. Well-conditioned data have variables of similar magnitude. All five variables here are percentages with at most two digits before the decimal point, so no rescaling is needed. It is still worth checking for outliers, non-linearity, and heteroscedasticity, and transforming the data only if such problems appear.

3. A priori hypotheses about the sign of each predictor should rest on economic theory or logical reasoning. Because the percent change in the Consumer Price Index (CPI) is the response, the hypotheses concern the predictors: higher capacity utilization and faster growth in the money supply (M1, M2, M3) would each be expected to have a positive effect on inflation. The intercept represents the estimated response when all predictors equal zero. It has no meaningful interpretation here, since capacity utilization never approaches zero.

4. Performing the regression analysis with statistical software provides the estimated regression equation, including a coefficient for each predictor variable. Comparing the sign of each coefficient with the a priori expectations shows whether they agree.

5. The 95 percent confidence interval for each predictor coefficient can be calculated to determine if it includes zero. If the confidence interval includes zero, it suggests that the predictor may not have a significant effect on the response variable.

6. A two-tailed 't' test can be performed to test if the slope for each predictor coefficient is significantly different from zero at a significance level of 0.05.

7. (a) The p-values for each predictor coefficient can be examined to identify the ones that are statistically significant at α=0.05.
(b) Comparing the p-values with the conclusions drawn from the 't' tests can help determine if they support each other.
(c) The preference between the 't' test and p-value approach can depend on the specific research question and the interpretation of the results. Both approaches provide information about the significance of the predictor coefficients.

8. The R-squared value and ANOVA table provide information about the quality of fit for the model. The R-squared value indicates the proportion of variance in the response variable that can be explained by the predictors. The ANOVA table provides information about the overall significance of the model in predicting the response variable.

9. Using the standard error, an approximate prediction interval for the response variable can be constructed. The width of this prediction interval can give an indication of the accuracy of the predictions and whether they have practical value.

10. (a) A correlation matrix can be generated for the predictor variables to assess the relationships between them.
(b) Based on the correlation matrix, it can be determined if there are high correlations between predictor variables, indicating collinearity. Collinearity can be problematic as it can affect the stability and interpretation of regression coefficients.