# statistics

Suppose for the same 3000 women, we also measured their weight. Suppose the data were again normally distributed. The average weight is 143 lbs., and the standard deviation is 30 lbs. Suppose the correlation between height and weight is r = +.69.
A. What is the slope of the best-fit line to predict height from weight? What is the intercept of the line? (Make height the Y variable and weight the X variable.)
B. Write a sentence or two that says what the equation of the line tells you about the relation between the variables: that is, when weight increases by one pound, how much does the predicted value of height increase?
C. What is your best guess for the height of a woman who weighs 143 lbs? What is your best guess for the height of a woman who weighs 158 lbs?

4 points for part A: 1 point for correct final answer, 1 point for showing work – this applies to both slope and intercept. You don’t need any original data here – you should use the shortcut formulas in the book, first for the slope, and then for the Y-intercept.
Slope = ? Intercept = ?

2 points for part B: The slope is ______, so predicted height will _______ with weight. For each additional pound added, one should predict an increase of _____ inches in height.

4 points for part C: 2 points for each guess: 1 for correct final answer, 1 for showing work.
Plug and chug into regression equation: Y = ____________

For the second, Y = _________
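The rubric's "shortcut formulas" and "plug and chug" steps can be sketched in code. This is a minimal sketch, assuming the book's shortcut formulas are the standard summary-statistic ones: slope = r × (SD of Y / SD of X) and intercept = mean of Y − slope × mean of X. The weight statistics and r come from the problem; the height mean and SD belong to Question 1 and are not shown in this post, so the values below are hypothetical placeholders and the printed numbers are not the homework answers.

```python
def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept) of the least-squares line predicting Y from X."""
    slope = r * sd_y / sd_x                 # shortcut formula for the slope
    intercept = mean_y - slope * mean_x     # line passes through (mean_x, mean_y)
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted Y for a given X on the regression line."""
    return intercept + slope * x


# Given in the problem statement.
R = 0.69
MEAN_WEIGHT, SD_WEIGHT = 143.0, 30.0

# HYPOTHETICAL placeholders: substitute the height mean and SD from Question 1.
MEAN_HEIGHT, SD_HEIGHT = 65.0, 3.0

slope, intercept = line_from_summary(R, MEAN_WEIGHT, SD_WEIGHT, MEAN_HEIGHT, SD_HEIGHT)
print(f"predicted height = {intercept:.3f} + {slope:.3f} * weight")
for weight in (143, 158):
    print(weight, "lbs ->", round(predict(slope, intercept, weight), 2), "in")
```

Whatever height values are used, the prediction at the mean weight (143 lbs) equals the mean height, which is a quick check on the arithmetic.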

3. (20 pts) For the same data as in Questions 1-2, make a fairly detailed drawing by hand of what the scatterplot would look like. (You don’t have the original data, but you can actually provide quite a bit of information about the scatterplot!) Be sure to clearly indicate each of the following: which variable is X or Y, the range on the X and Y axis of each of the variables (you can figure out the approximate range that will include most of the scores of height and weight by knowing the means and standard deviations, and by using Table A in Appendix D), the equation of the best-fit line for predicting Y from X, and a sense of the dispersion of points that is close to a correlation of +.69 (see examples in chapter 6). Be sure to accurately draw the regression line on the scatterplot, label it with its equation (taken from Question 2), and also to include a few sample deviates for the best-fit line (again, see examples in chapter 6); make sure the deviates go in the right direction. (Note: You don’t need to draw all 3000 data points; just include enough to give a sense of the spread of the data.)

Point allocation:
2 points for labeling X axis _______

2 points for labeling Y axis __________

2 points for correct units and range on X axis (mean ± 3 standard deviations, since the N is large)

2 points for correct units and range on Y axis (mean ± 3 standard deviations, since the N is large)

6 points for correct best-fit line for predicting Y (drawn precisely and labeled with the equation from Question 2)

2 points for a scatter of points that is roughly like r = +.69 (see examples in chapter 6)

2 points for sample deviates from best fit line for predicting Y ( )

1 point for title for the graph

1 point for concentration of points in middle of graph, reflecting normally distributed variables
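The axis ranges and the "look" of r = +.69 in the rubric can be previewed numerically before drawing by hand. The sketch below is only an illustration, assuming both variables are normal: it computes the mean ± 3 SD range (which covers roughly 99.7% of a normal distribution) and simulates correlated points by building each Y z-score from the X z-score plus independent noise. The function names are made up for this example, and the height mean and SD are hypothetical placeholders because Question 1 is not shown here.

```python
import math
import random


def axis_range(mean, sd, k=3.0):
    """Interval mean ± k standard deviations, used as the plotted axis range."""
    return mean - k * sd, mean + k * sd


def simulate_points(n, r, mean_x, sd_x, mean_y, sd_y, seed=0):
    """Draw n (x, y) pairs from a bivariate normal with correlation r."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        zx = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        zy = r * zx + math.sqrt(1.0 - r * r) * noise  # corr(zx, zy) = r
        points.append((mean_x + sd_x * zx, mean_y + sd_y * zy))
    return points


def pearson_r(points):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


MEAN_WEIGHT, SD_WEIGHT = 143.0, 30.0   # from the problem
MEAN_HEIGHT, SD_HEIGHT = 65.0, 3.0     # HYPOTHETICAL: use Question 1's values

print("X (weight) axis range:", axis_range(MEAN_WEIGHT, SD_WEIGHT))
sample = simulate_points(3000, 0.69, MEAN_WEIGHT, SD_WEIGHT, MEAN_HEIGHT, SD_HEIGHT)
print("correlation of simulated sample:", round(pearson_r(sample), 3))
```

Feeding `sample` to any plotting tool should show the elongated cloud, densest near the two means, that the hand drawing is meant to imitate.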

4. (15 pts) Use the SPSS data file ‘hw3-spring2015’ for this problem. The data represent the scores of some students on two tests. Make a scatterplot of the data (however, you don’t need to include it when you turn in your assignment).
A. Is the relation linear or nonlinear? Is it perfect or imperfect? Is it positive or negative?
B. What is the value of the correlation coefficient between the scores for each student on the first and second exams?
C. What is the equation of the best-fit line when you try to predict a student’s score on the second test (Y) by looking at their score on the first test (X)?
D. What would you predict as a student’s score on the second test if that student scored a 75 on the first test? What would you predict as a student’s score on the second test if that student scored an 88 on the first test? (Do these by hand: show your work, and round each answer to the nearest whole number.)

The SPSS file is on Blackboard.

A. 3 points, 1 for each subquestion: The relationship is _____,____,________.

B. 1 point, straight out of SPSS. The correlation coefficient is _______.

C. 5 points: 1 for correct slope = ______, 1 for correct intercept = _____, 1 for correct general form of the equation: ____________, 2 for naming the variables (1 point for _____ and 1 point for ______, rather than just generic X and Y).

D. 6 points: for each part, 1 for correct final answer, 2 for showing work.
For the first part:

Answer is ________. You should just plug 75 into the regression equation in Part C and show your work.

For the second part:
Answer is______. You should just plug in 88 into the regression equation in Part C and you should show your work.
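For anyone who wants to cross-check the SPSS output for parts B through D, the same quantities can be computed from paired scores directly. This is a rough sketch, not the SPSS procedure itself: the scores below are made-up toy data, since the contents of the 'hw3-spring2015' file are not shown here, and the function name is hypothetical.

```python
import math


def fit_line(xs, ys):
    """Return (r, slope, intercept) for predicting ys from xs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# HYPOTHETICAL toy scores, not the data in the SPSS file.
first_test = [60, 70, 80, 90]
second_test = [65, 70, 85, 88]

r, slope, intercept = fit_line(first_test, second_test)
print(f"r = {r:.3f}")
print(f"predicted second test = {intercept:.2f} + {slope:.2f} * first test")
for score in (75, 88):
    predicted = intercept + slope * score
    # round() sends exact .5 ties to the nearest even integer.
    print(score, "->", round(predicted))
```

Naming the variables in the printed equation ("second test", "first test") mirrors what part C of the rubric asks for instead of generic X and Y.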

5. (20 pts) Answer each of the following True-False questions. Assume that all of the assumptions for correlation and linear regression have been met (including the assumption that the X and Y variables are each normally distributed). In your write-up, just list the sub-question letter (A-J) and whether the statement is True or False – no need to restate the question or to justify your answer.
A. Correlation always implies causation.
B. If a correlation is negative, then as it becomes even more negative, r² decreases.
C. If the units used to measure the Y variable change (like from inches to centimeters), then the value of r will change.
D. As |r| increases, the average deviation of data from the predicted value (according to the best-fit regression line) increases.
E. The best-fit regression line to predict Y when you know X will always go through the point ZX = 0 and ZY = 0.
F. If a positive correlation exists between X and Y, and the range of X is then greatly restricted, |r| must increase.
G. If a positive correlation exists between X and Y, and a new data point is added whose ZX = 3 and ZY = 0, the correlation will increase. (Note: for G, H, I, and J, assume there are many, many data points in the dataset, so that the introduction of new data points doesn’t change the values of the averages in any meaningful way.)
H. If no correlation exists between X and Y, and a new data point is added whose ZX = 2.5 and ZY = 2.5, r will increase.
I. If a negative correlation exists between X and Y, and a new data point is added whose ZX = 2.5 and ZY = 2.5, |r| will decrease.
J. If a positive correlation exists between X and Y, and a new data point is added whose ZX = 2.5 and ZY = 2.5, |r| will decrease.
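Statements like these can be explored empirically before committing to True or False. The sketch below is one way to experiment, assuming z-scores are taken against the population standard deviation of the existing data; the helper names and the six toy points are hypothetical. Note that with so few points, a new point shifts the means noticeably, unlike the "many, many data points" assumption in the note for G through J, so treat the output as a rough guide only.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_with_extra_point(xs, ys, zx, zy):
    """r after appending one point placed at z-scores (zx, zy) of the current data."""
    new_x = statistics.fmean(xs) + zx * statistics.pstdev(xs)
    new_y = statistics.fmean(ys) + zy * statistics.pstdev(ys)
    return pearson_r(xs + [new_x], ys + [new_y])


# HYPOTHETICAL toy data with a positive correlation.
BASE_X = [1, 2, 3, 4, 5, 6]
BASE_Y = [2, 1, 4, 3, 6, 5]

print("r before:", round(pearson_r(BASE_X, BASE_Y), 4))
print("r after adding (zx=3, zy=0):", round(r_with_extra_point(BASE_X, BASE_Y, 3.0, 0.0), 4))
print("r with Y rescaled by 2.54:", round(pearson_r(BASE_X, [2.54 * y for y in BASE_Y]), 4))
```

Swapping in other z-score pairs, or a negatively correlated `BASE_Y`, lets each scenario in the list be tried the same way.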

1. We do not do your homework for you. Although it might take more effort to do the work on your own, you will profit more from your effort. We will be happy to evaluate your work though.

2. 5.
A. False
B. False
