The following data on the number of iron workers in the United States for the years 1978 through 2008 are
provided by the U.S. Bureau of Labor Statistics. Using regression techniques discussed in this section, analyze
the data for trend. Develop a scatter plot of the data and fit the trend line through the data. Discuss the strength
of the model.
Year Union members (1000s)
1978 17,340
1979 16,996
1980 16,975
1981 16,913
1982 17,002
1983 16,960
1984 16,740
1985 16,568
1986 16,390
1987 16,598
1988 16,748
1989 16,360
1990 16,269
1991 16,110
1992 16,211
1993 16,477
1994 16,334
1995 16,305
1996 16,145
1997 15,776
1998 15,472
1999 16,685
2000 15,359
2001 16,670
2002 16,098
2003 16,212
2004 16,316
2005 16,718
2006 16,707
2007 16,113
2008 15,128

y(predicted) = -34.92903226(X) + 86023.07742, where X = Year (1978, 1979, etc.)
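To see what this equation says in practice, here is a minimal Python sketch that plugs a few years into it. The slope and intercept are copied from the line above; the function name `predict` is just an illustrative choice.

```python
# Coefficients copied from the fitted trend line above.
SLOPE = -34.92903226
INTERCEPT = 86023.07742


def predict(year):
    """Return the predicted count (in thousands) for a given year."""
    return SLOPE * year + INTERCEPT


# Evaluate the line at the start, middle, and end of the data range.
for year in (1978, 1993, 2008):
    print(year, round(predict(year), 1))
```

The slope means the fitted line drops by roughly 35 (thousand) per year, so the predicted value for 2008 is about 1,050 (thousand) lower than for 1978.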


The scatter plot slopes downward.

Correlation = -0.633023997, which is consistent with a downward-sloping plot. A correlation of -0.633 is moderately strong (r-squared is about 0.40, so the year explains roughly 40% of the variation), suggesting that the number of iron workers has been decreasing over the 31 years covered.

How did you calculate the correlation?

To analyze the trend in the number of iron workers in the United States from 1978 to 2008, we can use regression techniques. Regression helps us understand how one variable (in this case, the year) affects another variable (the number of union members).

First, let's create a scatter plot of the data. The x-axis will represent the years, and the y-axis will represent the number of union members. We can use a spreadsheet program like Microsoft Excel or Google Sheets to create the plot. Simply input the years and the corresponding number of union members into two separate columns, then select the data and create a scatter plot.

Once the scatter plot is created, we can fit a trend line through the data. The trend line is a line that best fits the overall trend in the data points. It helps us visualize the general direction and slope of the data.

To draw a trend line, select the scatter plot in the spreadsheet and add a trend line (regression line). In Excel, use the "Add Trendline" feature; in Google Sheets, turn on the "Trendline" option for the data series in the chart editor. Choose the linear option to fit a straight line.
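If you want to check the line the spreadsheet draws, the same least-squares fit can be worked out by hand. The following is a rough sketch in plain Python using the data from the question; the variable names are my own, and it should give approximately the slope and intercept quoted earlier in the thread.

```python
# Data from the question: years 1978..2008 and counts in thousands.
years = list(range(1978, 2009))
members = [
    17340, 16996, 16975, 16913, 17002, 16960, 16740, 16568, 16390, 16598,
    16748, 16360, 16269, 16110, 16211, 16477, 16334, 16305, 16145, 15776,
    15472, 16685, 15359, 16670, 16098, 16212, 16316, 16718, 16707, 16113,
    15128,
]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(members) / n

# Sum of squared deviations of x, and sum of cross-deviations of x and y.
sxx = sum((x - mean_x) ** 2 for x in years)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, members))

# Ordinary least-squares slope and intercept.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"y(predicted) = {slope:.8f} * year + {intercept:.5f}")
```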

Now, to assess the strength of the model, you can look at the correlation coefficient (r-value) and the coefficient of determination (r-squared value). These values describe the strength of the linear relationship between the year and the number of union members. For r, values closer to -1 or 1 indicate a stronger relationship; for r-squared, values closer to 1 indicate a better fit.

To calculate these values, you can use the built-in functions in the spreadsheet program (for example, CORREL and RSQ in Excel or Google Sheets) or a statistical tool such as R or Python. Give the function the column of years and the column of union members, and it returns the correlation coefficient or the coefficient of determination.

After obtaining these values, you can interpret them as follows:
- A correlation coefficient (r-value) close to 1 indicates a strong positive relationship between the year and the number of union members. If the value is close to -1, it shows a strong negative relationship.
- The coefficient of determination (r-squared value) represents the proportion of the variation in the number of union members that can be explained by the year. A value close to 1 suggests a good fit of the trend line to the data.
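To answer the "how did you calculate it" question directly: the correlation coefficient is the sum of cross-deviations of x and y divided by the square root of the product of their sums of squared deviations. Here is a small sketch in plain Python with the data from the question. The helper name `pearson_r` is just illustrative, and the result should be close to the -0.633 reported earlier in the thread.

```python
import math

# Data from the question: years 1978..2008 and counts in thousands.
years = list(range(1978, 2009))
members = [
    17340, 16996, 16975, 16913, 17002, 16960, 16740, 16568, 16390, 16598,
    16748, 16360, 16269, 16110, 16211, 16477, 16334, 16305, 16145, 15776,
    15472, 16685, 15359, 16670, 16098, 16212, 16316, 16718, 16707, 16113,
    15128,
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(years, members)
r_squared = r ** 2  # share of variation explained by the linear trend

print(f"r = {r:.6f}, r-squared = {r_squared:.4f}")
```

For a straight-line fit with a single predictor, r-squared is simply r multiplied by itself, so a correlation of about -0.633 corresponds to roughly 40% of the variation being explained by the year.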

By analyzing the scatter plot, the trend line, and the strength of the model, you can draw conclusions about the trend in the number of iron workers in the United States and the effectiveness of the regression analysis in capturing that trend.