Five friends compare their pocket money: \$5, \$8, \$8, \$10, \$24. What is the "average" amount? Is it a fair representation of what most friends receive? Why or why not?
The mean is the value that would balance the data if each value were a weight on a number line. It spreads the total evenly across all values — but a single very large or small value can drag it far from the centre of the group.
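The balance idea can be checked numerically: the signed distances of the values from the mean sum to zero. The sketch below is a minimal illustration only. The three values (2, 4, 9) are assumed for the example and are not from the lesson, and the helper name `deviations_from_mean` is hypothetical.

```python
def deviations_from_mean(values):
    """Return the mean and each value's signed distance from it."""
    mean = sum(values) / len(values)
    return mean, [v - mean for v in values]

# Illustrative values only (an assumption, not data from the lesson).
mean, deviations = deviations_from_mean([2, 4, 9])

print(mean)             # 5.0
print(deviations)       # [-3.0, -1.0, 4.0]
print(sum(deviations))  # 0.0  (distances below and above the mean cancel)
```

The single value 9 sits 4 units above the mean and is balanced by two values that sit 3 and 1 units below it.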
The mean can be misleading. If one friend earns \$5 and another earns \$500, the mean of \$252.50 represents neither person well. Always check whether the mean is a fair summary: if the data has outliers or is very spread out, the median may be a better measure of the "typical" value.
Divide by n, not by the range. A common error: students add the values then divide by the largest value or the range instead of the count $n$.
$$\bar{x} = \frac{\sum x}{n} = \frac{\text{sum of all values}}{\text{number of values}}$$
Worked example: Pocket money values: \$5, \$8, \$8, \$10, \$24.
Step 1 — Add: $5 + 8 + 8 + 10 + 24 = 55$
Step 2 — Count: $n = 5$
Step 3 — Divide: $\bar{x} = \dfrac{55}{5} = \mathbf{\$11}$
Notice that $\$11$ is higher than 4 of the 5 values. The outlier $\$24$ has pulled the mean upward.
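The three steps of the worked example can be sketched in Python as follows. The variable names are illustrative choices, not part of the lesson.

```python
pocket_money = [5, 8, 8, 10, 24]  # dollars, from the worked example

total = sum(pocket_money)  # Step 1: add the values
n = len(pocket_money)      # Step 2: count the values
mean = total / n           # Step 3: divide by n (not by the range or the largest value)

# Values that fall below the mean
below = [v for v in pocket_money if v < mean]

print(total, n, mean)  # 55 5 11.0
print(len(below))      # 4
```

Four of the five values fall below the mean, which matches the observation that the single value of 24 pulls the mean upward.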
An outlier is a data value that is much larger or smaller than the others. Because the mean uses every value in the calculation, a single extreme outlier can significantly distort it.
Example: Five test scores: 72, 75, 78, 80, 20.
With all 5: $\bar{x} = \dfrac{72+75+78+80+20}{5} = \dfrac{325}{5} = \mathbf{65}$
Without the outlier (20): $\bar{x} = \dfrac{72+75+78+80}{4} = \dfrac{305}{4} = \mathbf{76.25}$
The outlier drags the mean from 76.25 down to 65, a difference of 11.25 marks. In this case, the mean of 65 understates the performance of most students: four of the five scored 72 or higher.
When to be careful: House prices, incomes, and any data with a few very large values — the mean alone can give a misleading picture.
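The effect of the outlier in the test-score example can be reproduced with a short sketch. It also prints the median, which the lesson mentions as an alternative but does not calculate here; using `statistics.median` from Python's standard library for that is a choice made for this illustration.

```python
from statistics import median

scores = [72, 75, 78, 80, 20]  # test scores from the example
without_outlier = [s for s in scores if s != 20]

mean_all = sum(scores) / len(scores)
mean_trimmed = sum(without_outlier) / len(without_outlier)

print(mean_all)                 # 65.0
print(mean_trimmed)             # 76.25
print(mean_trimmed - mean_all)  # 11.25
print(median(scores))           # 75  (barely moved by the outlier)
```

The median of all five scores stays close to the four typical scores, while the mean drops by more than 11 marks.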
When data is given as a frequency table, use: $$\bar{x} = \frac{\sum f \cdot x}{\sum f}$$
Add an extra column for $f \times x$ (frequency × value), then divide the column total $\sum f \cdot x$ by the total frequency $\sum f$.
| Value (x) | Frequency (f) | f × x |
|---|---|---|
| 1 | 2 | 2 |
| 2 | 4 | 8 |
| 3 | 6 | 18 |
| 4 | 5 | 20 |
| 5 | 3 | 15 |
| Total | 20 | 63 |
$$\bar{x} = \frac{63}{20} = \mathbf{3.15}$$
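The table calculation above can be sketched in Python. Storing the table as a list of (value, frequency) pairs is an assumption made for this example.

```python
# (value, frequency) pairs from the table above
table = [(1, 2), (2, 4), (3, 6), (4, 5), (5, 3)]

fx_column = [x * f for x, f in table]  # the extra f × x column
total_fx = sum(fx_column)              # sum of f × x
total_f = sum(f for _, f in table)     # total frequency
mean = total_fx / total_f

print(fx_column)          # [2, 8, 18, 20, 15]
print(total_fx, total_f)  # 63 20
print(mean)               # 3.15
```

Note that the divisor is the total frequency (20), not the number of rows in the table (5).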
When data is grouped into class intervals, we cannot know exact values. Instead, we assume every value in a class sits at its midpoint.
Midpoint of an interval = $\dfrac{\text{lower bound} + \text{upper bound}}{2}$
Example: interval 20–<30 → midpoint = $\dfrac{20+30}{2} = 25$
Then use: $\bar{x} \approx \dfrac{\sum f \times \text{midpoint}}{\sum f}$. This gives an estimate of the mean, not its exact value.
Important: always state this is an estimated mean (or "approximate mean") when working with grouped data.
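The midpoint method can be sketched as below. The class intervals and frequencies are illustrative assumptions, not data from the lesson, and each class is read as "lower bound up to, but not including, upper bound".

```python
# Each class is (lower bound, upper bound, frequency).
# The frequencies are illustrative assumptions, not data from the lesson.
classes = [(20, 30, 4), (30, 40, 10), (40, 50, 6)]

# Midpoint of each class: (lower bound + upper bound) / 2
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]

# The f × midpoint column
f_times_mid = [f * m for (_, _, f), m in zip(classes, midpoints)]

total_f = sum(f for _, _, f in classes)
estimated_mean = sum(f_times_mid) / total_f

print(midpoints)       # [25.0, 35.0, 45.0]
print(f_times_mid)     # [100.0, 350.0, 270.0]
print(estimated_mean)  # 36.0  (an estimate only, since exact values are unknown)
```

The result is only as good as the assumption that every value in a class sits at its midpoint, which is why it should be reported as an estimated mean.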
$$\bar{x} = \frac{\sum x}{n} \quad \text{(raw data)}$$
$$\bar{x} = \frac{\sum f \cdot x}{\sum f} \quad \text{(frequency table)}$$
$$\bar{x} \approx \frac{\sum f \times \text{midpoint}}{\sum f} \quad \text{(grouped data — estimate only)}$$
Outliers strongly affect the mean. If outliers are present, consider whether the mean is representative.
Q1. Calculate the mean of: 4, 7, 9, 6, 14, 8.
Q2. A frequency table shows: value 1 with frequency 3, value 2 with frequency 5, value 3 with frequency 4. What is the mean?
Q3. A data set is: 2, 5, 6, 8, 9. The mean is 6. Which value, when removed, increases the mean?
Q4. Grouped data: 10–<20: 4 students, 20–<30: 9 students, 30–<40: 7 students. What is the estimated mean?
Q5. A data set has a mean of 10. A new data value of 100 is added. What happens to the mean?
Q6. Eight test scores: 62, 71, 68, 74, 70, 69, 72, 20. (a) Calculate the mean. (b) Calculate the mean without the outlier. (c) Is the mean with or without the outlier more representative of most students? Explain.
Q7. Calculate the mean from this frequency table. Show the full f × x column and working.
Values and frequencies: (3, 4), (5, 6), (7, 8), (9, 5), (11, 2). (Format: value, frequency)
Q8. Estimate the mean from this grouped frequency table (total n = 40). Show all working including midpoints and f × midpoint column.
0–<10: 6, 10–<20: 12, 20–<30: 14, 30–<40: 8.
Q6 solution.
(a) Sum = 62+71+68+74+70+69+72+20 = 506. Mean = 506÷8 = 63.25.
(b) Without 20: sum = 506−20 = 486. n = 7. Mean = 486÷7 ≈ 69.4.
(c) The mean without the outlier (≈69.4) is more representative: it is close to all 7 remaining scores, which range from 62 to 74. The mean of 63.25 is pulled down by the single low score of 20 and is lower than six of those seven scores, so it makes the group's performance look weaker than it was.
Q7 solution.
f × x: 3×4=12, 5×6=30, 7×8=56, 9×5=45, 11×2=22. Sum(fx) = 165. Sum(f) = 25.
$\bar{x} = \dfrac{165}{25} = \mathbf{6.6}$
Q8 solution.
Midpoints: 5, 15, 25, 35. f × midpoint: 6×5=30, 12×15=180, 14×25=350, 8×35=280. Sum = 840. n = 40.
Estimated mean = $\dfrac{840}{40} = \mathbf{21}$ (an estimate, because the data is grouped).
A class of 25 students has a mean test score of 68. Another class of 15 students has a mean test score of 72.
(a) What is the total number of marks scored by each class?
(b) What is the combined mean test score for all 40 students?
(c) Why can't you simply average 68 and 72 to get the combined mean? Explain the difference.