Consider the following data on lengths of 30 pieces of metal.

18 26 15 17 7 27 24 17 10 17
23 29 28 18 10 23 16 6 12 26
5 12 23 22 24 14 16 26 19 22
a. Define an outlier. Give an example of an outlier from the above data set. [3]
b. (i) Calculate and interpret the: range, mean, mode and median of the lengths of metal pieces. [8]
(ii) Calculate the sample standard deviation of the above data set. [4]
(iii) Calculate and interpret the Lower quartile, the Second quartile, and the Upper quartile. [9]
(iv) Calculate the Inter-quartile range and the Semi-inter quartile range. [4]
(v) Calculate and interpret the 20th percentile and 85th percentile from the data set above. [6]

An outlier is a value that is much larger than the rest or much smaller than the rest.

The range is the difference between the largest and the smallest value.

mean = add them all and divide by 30

The mode is the value that occurs most often. There can be more than one mode if several values tie for the highest count.

Median: put all of the data in order from lowest to highest and find the middle number. With 30 values there is no single middle number, so average the 15th and 16th values.

I hope you are allowed to use a statistical calculator because this would be a lot of work to do data piece by data piece.

The lower quartile is the value that the lowest 1/4 of the data falls below. You can find it by taking the median of the lower half of the sorted data (the 15 values below the overall median).
The second quartile is just the median.

The upper quartile is the value that the lowest 3/4 of the data falls below. Find it by taking the median of the upper half of the sorted data (the 15 values above the overall median).

The interquartile range is the difference between the lower and upper quartiles.

20th percentile: 20% of the data falls at or below that point. 20% of 30 = 6, so look at the 6th value in the sorted list.

85th percentile: 85% of the data falls at or below this point.

0.85 x 30 = 25.5, so round up and look at the 26th value in the sorted list.

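a. An outlier is a data point that is significantly different from the other data points in the dataset. It is usually an unusually high or low value that does not follow the general pattern of the data. A common rule flags any value more than 1.5 times the interquartile range below the lower quartile or above the upper quartile. For this dataset (lower quartile 14, upper quartile 24, interquartile range 10) those limits are -1 and 39, so strictly no value qualifies. The most extreme value is 5, the smallest length, which is the closest thing to an outlier here.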
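The 1.5 x IQR check can be sketched in a few lines of Python. This is only an illustration: the helper name iqr_fences, the 1.5 multiplier, and the median-of-halves quartile convention are assumptions, and other quartile conventions can give slightly different limits.

```python
import statistics

lengths = [18, 26, 15, 17, 7, 27, 24, 17, 10, 17,
           23, 29, 28, 18, 10, 23, 16, 6, 12, 26,
           5, 12, 23, 22, 24, 14, 16, 26, 19, 22]

def iqr_fences(values, k=1.5):
    """Return (lower_limit, upper_limit) for the k x IQR outlier rule."""
    s = sorted(values)
    half = len(s) // 2
    # Median-of-halves quartiles (assumed convention):
    # Q1 is the median of the lower half, Q3 the median of the upper half.
    q1 = statistics.median(s[:half])
    q3 = statistics.median(s[len(s) - half:])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

low, high = iqr_fences(lengths)
# Any value outside [low, high] would be flagged as an outlier.
outliers = [x for x in lengths if x < low or x > high]
print("limits:", low, high)
print("flagged values:", outliers)
```

An empty list of flagged values means that no length is extreme enough to count as an outlier under this particular rule.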

b. (i) To calculate the range, subtract the smallest value from the largest value:
Range = 29 - 5 = 24
The range represents the difference between the highest and lowest values in the dataset.

To calculate the mean, sum up all the values and divide by the total number of values:
Mean = (18 + 26 + 15 + 17 + 7 + 27 + 24 + 17 + 10 + 17 + 23 + 29 + 28 + 18 + 10 + 23 + 16 + 6 + 12 + 26 + 5 + 12 + 23 + 22 + 24 + 14 + 16 + 26 + 19 + 22) / 30
Mean = 552 / 30 = 18.4
The mean represents the average value of the dataset: on average, a piece of metal is 18.4 units long.

The mode is the value that appears most frequently in the dataset. In this case, 17, 23 and 26 each appear three times, more often than any other value, so the dataset has three modes (it is multimodal).

The median is the middle value when the dataset is arranged in ascending order. In this case, we have 30 values, so the median is the average of the 15th and 16th values when arranged in ascending order:
Median = (18 + 18) / 2
Median = 18
Half of the pieces are 18 or shorter and half are 18 or longer.
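The measures in part (i) can be checked with Python's standard statistics module. This is a minimal sketch with illustrative variable names; multimode (Python 3.8 or later) is used because a dataset can have more than one mode.

```python
import statistics

lengths = [18, 26, 15, 17, 7, 27, 24, 17, 10, 17,
           23, 29, 28, 18, 10, 23, 16, 6, 12, 26,
           5, 12, 23, 22, 24, 14, 16, 26, 19, 22]

# Range: largest value minus smallest value.
data_range = max(lengths) - min(lengths)

# Mean: sum of all values divided by how many there are.
mean = statistics.mean(lengths)

# Mode(s): every value that ties for the highest frequency.
modes = sorted(statistics.multimode(lengths))

# Median: for an even count, the average of the two middle sorted values.
median = statistics.median(lengths)

print("range :", data_range)
print("mean  :", mean)
print("modes :", modes)
print("median:", median)
```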

(ii) To calculate the sample standard deviation, use the following formula:
1. Calculate the mean of the dataset.
2. Subtract the mean from each value and square the result.
3. Sum up all the squared differences.
4. Divide the sum by (n-1), where n is the total number of values.
5. Take the square root of the result.

Calculating step by step:
1. Mean = 552 / 30 = 18.4
2. Subtract the mean from each value and square the result:

(18-18.4)^2 + (26-18.4)^2 + (15-18.4)^2 + (17-18.4)^2 + (7-18.4)^2 + (27-18.4)^2 + (24-18.4)^2 + (17-18.4)^2 + (10-18.4)^2 + (17-18.4)^2 +
(23-18.4)^2 + (29-18.4)^2 + (28-18.4)^2 + (18-18.4)^2 + (10-18.4)^2 + (23-18.4)^2 + (16-18.4)^2 + (6-18.4)^2 + (12-18.4)^2 + (26-18.4)^2 +
(5-18.4)^2 + (12-18.4)^2 + (23-18.4)^2 + (22-18.4)^2 + (24-18.4)^2 + (14-18.4)^2 + (16-18.4)^2 + (26-18.4)^2 + (19-18.4)^2 + (22-18.4)^2 = 1339.2

3. Sum of squared differences = 1339.2
4. Divide by (n-1): 1339.2 / (30-1) = 46.18 (approximately)
5. Sample standard deviation = sqrt(46.18) = 6.80 (approximately)
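The five steps for the sample standard deviation can be followed directly in code. This is a minimal sketch with illustrative variable names; statistics.stdev from the standard library is included only as a cross-check, since it also divides by n-1.

```python
import math
import statistics

lengths = [18, 26, 15, 17, 7, 27, 24, 17, 10, 17,
           23, 29, 28, 18, 10, 23, 16, 6, 12, 26,
           5, 12, 23, 22, 24, 14, 16, 26, 19, 22]

n = len(lengths)

# Step 1: mean of the dataset.
mean = sum(lengths) / n

# Steps 2 and 3: square each deviation from the mean and add them up.
sum_sq = sum((x - mean) ** 2 for x in lengths)

# Step 4: divide by n - 1 (sample variance).
variance = sum_sq / (n - 1)

# Step 5: square root gives the sample standard deviation.
std_dev = math.sqrt(variance)

print("sum of squared differences:", round(sum_sq, 1))
print("sample variance           :", round(variance, 2))
print("sample standard deviation :", round(std_dev, 2))

# Cross-check against the library routine.
print("statistics.stdev          :", round(statistics.stdev(lengths), 2))
```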

(iii) To calculate quartiles, we first need to arrange the dataset in ascending order:
5, 6, 7, 10, 10, 12, 12, 14, 15, 16, 16, 17, 17, 17, 18, 18, 19, 22, 22, 23, 23, 23, 24, 24, 26, 26, 26, 27, 28, 29.

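Lower quartile (Q1) is the median of the lower half of the dataset. In this case, it is the median of the first 15 sorted values, which is the 8th value: Q1 = 14. About a quarter of the pieces are 14 or shorter.

Second quartile (Q2) is the median of the whole dataset: the average of the 15th and 16th sorted values, Q2 = (18 + 18) / 2 = 18. About half of the pieces are 18 or shorter.

Upper quartile (Q3) is the median of the upper half of the dataset. In this case, it is the median of the last 15 sorted values, which is the 23rd value overall: Q3 = 24. About three quarters of the pieces are 24 or shorter.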
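The median-of-halves approach to quartiles can be sketched as follows. The function name quartiles_by_halves is hypothetical. The call to statistics.quantiles is shown only for comparison: its default "exclusive" method interpolates between sorted values, so it follows a different convention and need not agree exactly with the hand method.

```python
import statistics

lengths = [18, 26, 15, 17, 7, 27, 24, 17, 10, 17,
           23, 29, 28, 18, 10, 23, 16, 6, 12, 26,
           5, 12, 23, 22, 24, 14, 16, 26, 19, 22]

def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(values)
    n = len(s)
    half = n // 2  # size of each half; the middle value is left out when n is odd
    q1 = statistics.median(s[:half])      # median of the lower half
    q2 = statistics.median(s)             # overall median
    q3 = statistics.median(s[n - half:])  # median of the upper half
    return q1, q2, q3

q1, q2, q3 = quartiles_by_halves(lengths)
print("Q1, Q2, Q3 (median of halves):", q1, q2, q3)

# Library quartiles under a different (interpolating) convention.
alt = statistics.quantiles(lengths, n=4)
print("statistics.quantiles:", alt)
```

Which convention to report depends on the one used in the course; the median-of-halves version matches the hand calculation described here.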

(iv) To calculate the interquartile range (IQR), subtract the lower quartile (Q1 = 14) from the upper quartile (Q3 = 24):
IQR = Q3 - Q1 = 24 - 14 = 10.
The interquartile range is the width of the interval that contains the middle 50% of the dataset.

The semi-interquartile range (SIQR) is half of the interquartile range:
SIQR = IQR / 2 = 10 / 2 = 5.
The semi-interquartile range is a measure of spread: the middle 50% of the values lie within 5 units of the midpoint between the two quartiles, (14 + 24) / 2 = 19.

(v) To calculate the 20th percentile, multiply the total number of values by 0.20 to find its position in the sorted data:
Position = 30 * 0.20 = 6.

Because 6 is a whole number, take the average of the 6th and 7th sorted values (some textbooks simply take the 6th value). Both are 12, so the 20th percentile is 12. Roughly 20% of the pieces are 12 or shorter.

To calculate the 85th percentile, multiply the total number of values by 0.85 and round up to the next whole number:
Position = 30 * 0.85 = 25.5, which rounds up to 26.

From the dataset arranged in ascending order, the 85th percentile is the 26th value, which is 26. Roughly 85% of the pieces are 26 or shorter.
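The position-based percentile method can be sketched as below. The function name percentile_by_position is hypothetical, and the convention is an assumption: when n * p / 100 is a whole number the value at that position is averaged with the next one, otherwise the position is rounded up. Other textbooks and software interpolate differently.

```python
import math

lengths = [18, 26, 15, 17, 7, 27, 24, 17, 10, 17,
           23, 29, 28, 18, 10, 23, 16, 6, 12, 26,
           5, 12, 23, 22, 24, 14, 16, 26, 19, 22]

def percentile_by_position(values, p):
    """Return the p-th percentile (0 < p < 100) using the position rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    s = sorted(values)
    pos = len(s) * p / 100  # 1-based position in the sorted data
    if pos == int(pos):
        # Whole-number position: average this value and the next one.
        k = int(pos)
        return (s[k - 1] + s[k]) / 2
    # Otherwise round the position up and take that value.
    return s[math.ceil(pos) - 1]

p20 = percentile_by_position(lengths, 20)
p85 = percentile_by_position(lengths, 85)
print("20th percentile:", p20)
print("85th percentile:", p85)
```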