Instructions for Data Sets: In each data set, the dependent variable (response) is the first variable. Choose the independent variables (predictors) as you judge appropriate. Use a spreadsheet or a statistical package (e.g., MegaStat or MINITAB) to perform the necessary regression calculations and to obtain the required graphs. Write a concise report answering questions 13.9 through 13.25. Label sections of your report to correspond to the questions. Insert tables and graphs in your report as appropriate.

13.9 Is this cross-sectional data or time-series data? What is the unit of observation (e.g., firm, individual, year)?
13.10 Are the X and Y data well-conditioned? If not, make any transformations that may be necessary and explain.
13.11 State your a priori hypotheses about the sign (+ or −) of each predictor and your reasoning about cause and effect. Would the intercept have meaning in this problem? Explain.
13.12 Does your sample size fulfill Evans's Rule (n/k ≥ 10) or at least Doane's Rule (n/k ≥ 5)?
13.13 Perform the regression and write the estimated regression equation (round off to 3 or 4 significant digits for clarity). Do the coefficient signs agree with your a priori expectations?
13.14 Does the 95 percent confidence interval for each predictor coefficient include zero? What conclusion can you draw? Note: Skip this question if you are using MINITAB, since predictor confidence intervals are not shown.
13.15 Do a two-tailed t test for zero slope for each predictor coefficient at α = .05. State the degrees of freedom and look up the critical value in Appendix D (or from Excel).
13.16 (a) Which p-values indicate predictor significance at α = .05? (b) Do the p-values support the conclusions you reached from the t tests? (c) Do you prefer the t test or the p-value approach? Why?
13.17 Based on the R² and ANOVA table for your model, how would you describe the fit?
13.18 Use the standard error to construct an approximate prediction interval for Y. Based on the width of this prediction interval, would you say the predictions are good enough to have practical value?
13.19 (a) Generate a correlation matrix for your predictors. Round the results to three decimal places. (b) Based on the correlation matrix, is collinearity a problem? What rules of thumb (if any) are you using?
13.20 (a) If you did not already do so, re-run the regression requesting variance inflation factors (VIFs) for your predictors. (b) Do the VIFs suggest that multicollinearity is a problem? Explain.
13.21 (a) If you did not already do so, request a table of standardized residuals. (b) Are any residuals outliers (three standard errors) or unusual (two standard errors)?
13.22 If you did not already do so, request leverage statistics. Are any observations influential? Explain.
13.23 If you did not already do so, request a histogram of standardized residuals and/or a normal probability plot. Do the residuals suggest non-normal errors? Explain.
13.24 If you did not already do so, request a plot of residuals versus the fitted Y. Is heteroscedasticity a concern?
13.25 If you are using time-series data, perform one or more tests for autocorrelation (visual inspection of residuals plotted against observation order, runs test, Durbin-Watson test). Is autocorrelation a concern?

DATA SET C Assessed Value of Small Medical Office Buildings (n = 32, k = 5)

Obs Assessed Floor Offices Entrances Age Freeway
1 1796 4790 4 2 8 0
2 1544 4720 3 2 12 0
3 2094 5940 4 2 2 0
4 1968 5720 4 2 34 1
5 1567 3660 3 2 38 1
6 1878 5000 4 2 31 1
7 949 2990 2 1 19 0
8 910 2610 2 1 48 0
9 1774 5650 4 2 42 0
10 1187 3570 2 1 4 1
11 1113 2930 3 2 15 1
12 671 1280 2 1 31 1
13 1678 4880 3 2 42 1
14 710 1620 1 2 35 1
15 678 1820 2 1 17 1
16 1585 4530 2 2 5 1
17 842 2570 2 1 13 0
18 1539 4690 2 2 45 0
19 433 1280 1 1 45 1
20 1268 4100 3 1 27 0
21 1251 3530 2 2 41 1
22 1094 3660 2 2 33 0
23 638 1110 1 2 50 1
24 999 2670 2 2 39 1
25 653 1100 1 1 20 1
26 1914 5810 4 3 17 0
27 772 2560 2 2 24 0
28 890 2340 3 1 5 0
29 1282 3690 2 2 15 1
30 1264 3580 3 2 27 0
31 1162 3610 2 1 8 1
32 1447 3960 3 2 17 0

To answer questions 13.9 through 13.25 for Data Set C, follow these steps:

1. Identify the type of data: Determine whether the data are cross-sectional or time-series. Here the data are cross-sectional: each row is a different building, and no time variable orders the observations.

2. Identify the unit of observation: Determine the unit of observation in the data set. In this case, the unit of observation is small medical office buildings.

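3. Check if the X and Y data are well-conditioned: Check that the response (Y) and predictor (X) values are of moderate and broadly similar magnitude, since very large or very small numbers can cause rounding problems and awkward coefficients. Here Assessed ranges from 433 to 2094 and Floor from 1100 to 5940, while Offices, Entrances, Age, and Freeway are small whole numbers. None of the values is extreme, so no transformation is strictly required, although dividing Floor by 1,000 would bring it closer in magnitude to the other predictors.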
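The magnitude check in this step can be sketched in a few lines of Python. This is only an illustration: the full column names (Assessed, Floor, Offices, Entrances, Age, Freeway) are an assumed reading of the abbreviated table header, and the data are retyped from the listing above.

```python
# Data Set C retyped from the listing above, four observations per line.
# Assumed column order: Assessed, Floor, Offices, Entrances, Age, Freeway.
RAW = """
1796 4790 4 2 8 0   1544 4720 3 2 12 0   2094 5940 4 2 2 0   1968 5720 4 2 34 1
1567 3660 3 2 38 1   1878 5000 4 2 31 1   949 2990 2 1 19 0   910 2610 2 1 48 0
1774 5650 4 2 42 0   1187 3570 2 1 4 1   1113 2930 3 2 15 1   671 1280 2 1 31 1
1678 4880 3 2 42 1   710 1620 1 2 35 1   678 1820 2 1 17 1   1585 4530 2 2 5 1
842 2570 2 1 13 0   1539 4690 2 2 45 0   433 1280 1 1 45 1   1268 4100 3 1 27 0
1251 3530 2 2 41 1   1094 3660 2 2 33 0   638 1110 1 2 50 1   999 2670 2 2 39 1
653 1100 1 1 20 1   1914 5810 4 3 17 0   772 2560 2 2 24 0   890 2340 3 1 5 0
1282 3690 2 2 15 1   1264 3580 3 2 27 0   1162 3610 2 1 8 1   1447 3960 3 2 17 0
"""
NAMES = ["Assessed", "Floor", "Offices", "Entrances", "Age", "Freeway"]

values = [float(v) for v in RAW.split()]
rows = [values[i:i + 6] for i in range(0, len(values), 6)]  # one list per building

# Minimum and maximum of each column, to compare magnitudes across variables.
ranges = {name: (min(col), max(col)) for name, col in zip(NAMES, zip(*rows))}
for name, (low, high) in ranges.items():
    print(f"{name:10s} min = {low:g}  max = {high:g}")
```

Columns whose ranges differ by several orders of magnitude are the usual candidates for rescaling.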

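4. Formulate a priori hypotheses: State the expected sign of each predictor's coefficient and the reasoning behind it, and decide whether the intercept has meaning. The response is Assessed, and the five candidate predictors are the remaining columns: Floor, Offices, Entrances, Age, and Freeway (a 0/1 indicator). For example, one might expect Floor to have a positive sign (more space, higher value) and Age a negative sign (older buildings tend to be worth less); the signs of the other predictors should be argued from the context of the problem. The intercept would describe a building with zero floor space and no offices, which lies outside the observed data, so it is unlikely to have a practical interpretation.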

5. Check sample size: Determine if the sample size fulfills Evans's Rule (n/k ≥ 10) or Doane's Rule (n/k ≥ 5), where n is the number of observations and k is the number of predictors. In this case, n = 32 and k = 5, so n/k = 6.4. Evans's Rule is not fulfilled, but Doane's Rule is. Using fewer predictors would raise the ratio (for example, k = 3 gives n/k ≈ 10.7).
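The arithmetic behind the two rules is simple enough to check directly. The following is a small sketch; the function name is made up for illustration.

```python
def sample_size_rules(n, k):
    """Return n/k and whether Evans's Rule (>= 10) and Doane's Rule (>= 5) hold."""
    ratio = n / k
    return {"ratio": ratio, "evans": ratio >= 10, "doane": ratio >= 5}

# Data Set C: 32 observations, 5 predictors.
result = sample_size_rules(n=32, k=5)
print(result)
```

With n = 32 and k = 5 the ratio is 6.4, which satisfies Doane's Rule but not Evans's Rule.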

6. Perform the regression and obtain the estimated regression equation: Use a statistical package (e.g., MegaStat or MINITAB) to regress Assessed on the chosen predictors. With all five predictors, the fitted equation has the form Assessed = b0 + b1 Floor + b2 Offices + b3 Entrances + b4 Age + b5 Freeway. Round the coefficients to three or four significant digits and compare their signs with the a priori expectations from step 4. The coefficient values come from the software output and are not reproduced here.
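For readers without MegaStat or MINITAB, the least-squares fit can be sketched in plain Python by solving the normal equations (X'X)b = X'y. This is an illustrative sketch, not a replacement for a statistical package: the column names are an assumed reading of the abbreviated header, the data are retyped from the listing above, all five predictors are used, and the `solve` helper is written here for the example.

```python
# Data Set C retyped from the listing above, four observations per line.
# Assumed column order: Assessed, Floor, Offices, Entrances, Age, Freeway.
RAW = """
1796 4790 4 2 8 0   1544 4720 3 2 12 0   2094 5940 4 2 2 0   1968 5720 4 2 34 1
1567 3660 3 2 38 1   1878 5000 4 2 31 1   949 2990 2 1 19 0   910 2610 2 1 48 0
1774 5650 4 2 42 0   1187 3570 2 1 4 1   1113 2930 3 2 15 1   671 1280 2 1 31 1
1678 4880 3 2 42 1   710 1620 1 2 35 1   678 1820 2 1 17 1   1585 4530 2 2 5 1
842 2570 2 1 13 0   1539 4690 2 2 45 0   433 1280 1 1 45 1   1268 4100 3 1 27 0
1251 3530 2 2 41 1   1094 3660 2 2 33 0   638 1110 1 2 50 1   999 2670 2 2 39 1
653 1100 1 1 20 1   1914 5810 4 3 17 0   772 2560 2 2 24 0   890 2340 3 1 5 0
1282 3690 2 2 15 1   1264 3580 3 2 27 0   1162 3610 2 1 8 1   1447 3960 3 2 17 0
"""
TERMS = ["Intercept", "Floor", "Offices", "Entrances", "Age", "Freeway"]

values = [float(v) for v in RAW.split()]
rows = [values[i:i + 6] for i in range(0, len(values), 6)]
y = [r[0] for r in rows]              # response: Assessed
X = [[1.0] + r[1:] for r in rows]     # leading 1.0 gives the intercept term


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):      # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * y_i for row, y_i in zip(X, y)) for i in range(p)]
coefs = solve(XtX, Xty)

fitted = [sum(b * x for b, x in zip(coefs, row)) for row in X]
residuals = [y_i - f_i for y_i, f_i in zip(y, fitted)]

for term, b in zip(TERMS, coefs):
    print(f"{term:10s} {b:.4g}")      # four significant digits
```

The printed coefficients should agree with a statistical package to the displayed precision; the package output remains the reference for standard errors, t statistics, and p-values.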

7. Check confidence intervals: Determine whether the 95 percent confidence interval for each predictor coefficient includes zero. An interval that includes zero means the coefficient does not differ significantly from zero at α = 0.05. Note that if you are using MINITAB, predictor confidence intervals are not shown. The intervals come from the regression output, which is not reproduced here.

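8. Conduct t tests: Perform a two-tailed t test for zero slope for each predictor coefficient at α = 0.05. With all five predictors, the degrees of freedom are n − k − 1 = 32 − 5 − 1 = 26, and the critical value from Appendix D or Excel is t.025 = 2.056. Reject the hypothesis of zero slope for any predictor whose t statistic exceeds 2.056 in absolute value. The t statistics come from the regression output, which is not reproduced here.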

9. Analyze p-values: A predictor is significant at α = 0.05 when its p-value is less than 0.05. Because the two-tailed p-value and the t test are based on the same t statistic, they always lead to the same conclusion. State whether you prefer the t test or the p-value approach and why (for example, the p-value needs no table lookup and shows the strength of the evidence). The p-values come from the regression output, which is not reproduced here.

10. Evaluate model fit: Assess the fit using R² and the ANOVA table. R² gives the proportion of variation in Assessed explained by the predictors, and the F test (with 5 and 26 degrees of freedom if all five predictors are used) tests overall significance. These statistics come from the regression output, which is not reproduced here.

11. Construct a prediction interval: Use the standard error of the regression, s, to construct an approximate 95 percent prediction interval for Y as ŷ ± 2s. Judge the width of this interval against the typical size of Assessed to decide whether the predictions have practical value. The standard error comes from the regression output, which is not reproduced here.
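A quick approximate interval of this kind takes the fitted value plus or minus about two standard errors. The numbers in this sketch are hypothetical, for illustration only.

```python
def approx_prediction_interval(y_hat, std_err, multiplier=2.0):
    """Approximate 95 percent prediction interval: y_hat +/- multiplier * s."""
    half_width = multiplier * std_err
    return y_hat - half_width, y_hat + half_width

# Hypothetical fitted value and standard error, for illustration only.
lower, upper = approx_prediction_interval(y_hat=1200.0, std_err=100.0)
print(lower, upper)
```

Comparing the interval width (here 4s) with the fitted value gives a rough sense of whether the prediction is precise enough to be useful.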

12. Examine the correlation matrix: Generate a correlation matrix for the predictors (Floor, Offices, Entrances, Age, Freeway) and round the results to three decimal places. The matrix can be computed directly from the data above. Large pairwise correlations between predictors suggest collinearity; state the rule of thumb you apply.
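The predictor correlation matrix can be computed directly from the listed data. This is a minimal sketch in plain Python: the column names are an assumed reading of the abbreviated header, and the data are retyped from the listing above.

```python
import math

# Data Set C retyped from the listing above, four observations per line.
# Assumed column order: Assessed, Floor, Offices, Entrances, Age, Freeway.
RAW = """
1796 4790 4 2 8 0   1544 4720 3 2 12 0   2094 5940 4 2 2 0   1968 5720 4 2 34 1
1567 3660 3 2 38 1   1878 5000 4 2 31 1   949 2990 2 1 19 0   910 2610 2 1 48 0
1774 5650 4 2 42 0   1187 3570 2 1 4 1   1113 2930 3 2 15 1   671 1280 2 1 31 1
1678 4880 3 2 42 1   710 1620 1 2 35 1   678 1820 2 1 17 1   1585 4530 2 2 5 1
842 2570 2 1 13 0   1539 4690 2 2 45 0   433 1280 1 1 45 1   1268 4100 3 1 27 0
1251 3530 2 2 41 1   1094 3660 2 2 33 0   638 1110 1 2 50 1   999 2670 2 2 39 1
653 1100 1 1 20 1   1914 5810 4 3 17 0   772 2560 2 2 24 0   890 2340 3 1 5 0
1282 3690 2 2 15 1   1264 3580 3 2 27 0   1162 3610 2 1 8 1   1447 3960 3 2 17 0
"""
PREDICTORS = ["Floor", "Offices", "Entrances", "Age", "Freeway"]

values = [float(v) for v in RAW.split()]
rows = [values[i:i + 6] for i in range(0, len(values), 6)]
cols = list(zip(*[r[1:] for r in rows]))   # predictor columns only


def corr(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


matrix = [[round(corr(a, b), 3) for b in cols] for a in cols]

print(" " * 10 + "".join(f"{name:>10s}" for name in PREDICTORS))
for name, row in zip(PREDICTORS, matrix):
    print(f"{name:10s}" + "".join(f"{r:10.3f}" for r in row))
```

Only the off-diagonal entries matter for judging collinearity, since each predictor correlates perfectly with itself.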

13. Check for multicollinearity: If not already done, re-run the regression requesting variance inflation factors (VIFs). For predictor j, VIF = 1/(1 − Rj²), where Rj² is the R² from regressing that predictor on all the other predictors. A VIF of 1 indicates no collinearity, and values above about 10 are a common warning sign. The VIFs come from the regression output, which is not reproduced here.

14. Request standardized residuals: If not already done, request a table of standardized residuals. Classify a residual as unusual if it lies more than two standard errors from zero and as an outlier if it lies more than three. The standardized residuals come from the regression output, which is not reproduced here.
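The two- and three-standard-error rules can be applied mechanically to a column of standardized residuals. The residual values in this sketch are hypothetical, not output for Data Set C.

```python
def classify_residuals(std_residuals):
    """Label each standardized residual as 'outlier' (>3), 'unusual' (>2), or 'ok'."""
    labels = []
    for e in std_residuals:
        if abs(e) > 3:
            labels.append("outlier")
        elif abs(e) > 2:
            labels.append("unusual")
        else:
            labels.append("ok")
    return labels

# Hypothetical standardized residuals, for illustration only.
example = [0.4, -2.3, 1.1, 3.2]
labels = classify_residuals(example)
print(list(zip(example, labels)))
```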

15. Check for influential observations: Request leverage statistics to determine whether any observations are influential. A common rule flags an observation as high-leverage when its leverage exceeds 2(k + 1)/n, which is 2(6)/32 = 0.375 with all five predictors. Evaluate the impact of any flagged observations on the regression model. The leverage statistics come from the regression output, which is not reproduced here.
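One widely used screening rule compares each leverage value with 2(k + 1)/n. The sketch below assumes that rule; the leverage values are hypothetical and cover only a few observations.

```python
def high_leverage(leverages, n, k):
    """Return the 2(k + 1)/n threshold and the 1-based positions that exceed it."""
    threshold = 2 * (k + 1) / n
    flagged = [i + 1 for i, h in enumerate(leverages) if h > threshold]
    return threshold, flagged

# Hypothetical leverage values for the first three observations only.
threshold, flagged = high_leverage([0.10, 0.42, 0.20], n=32, k=5)
print(threshold, flagged)
```

With n = 32 and k = 5 the threshold is 0.375, so only the second hypothetical value would be flagged.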

16. Analyze residuals for normality: Request a histogram of standardized residuals and/or a normal probability plot. A roughly bell-shaped histogram, or a probability plot that stays close to a straight line, is consistent with normal errors. These plots come from the regression output, which is not reproduced here.

17. Assess heteroscedasticity: Request a plot of residuals versus the fitted Y. A fan or funnel pattern (residual spread that widens or narrows as the fitted Y increases) suggests heteroscedasticity; a band of roughly constant width does not. The plot comes from the regression output, which is not reproduced here.

18. Test for autocorrelation: This step applies only to time-series data, where you would inspect residuals plotted against observation order, conduct a runs test, or use the Durbin-Watson test. Because Data Set C is cross-sectional (step 1), the observation order carries no time meaning and tests for autocorrelation are not needed.

Note: The data set identifies the response (Assessed) and the five candidate predictors, so steps 6 through 17 can be completed by running the regression on the data above in a statistical package. The numerical results are not reported here.