You score 40, 45, 48, 50, 52, 55, 96 on 7 tests. Which number best describes your "typical" performance — the mean (about 55) or the middle value (50)? Why might the middle value be more honest?
When data has outliers or is skewed, the median (middle value) and mode (most common value) often describe the data more fairly than the mean.
Must order first: The median is defined on data that has been sorted from smallest to largest. Picking the "middle" of unordered data will usually give the wrong answer.
Even n — average the two middle values: For 8 values, the median is the average of the 4th and 5th values, not just the 4th. Many students forget to average.
Mode is not always meaningful: If every value appears exactly once, there is no mode. Don't invent one.
Step 1: Order the data from smallest to largest.
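Step 2: Find the middle position. For odd n, it is position $\frac{n+1}{2}$. For even n, there are two middle positions, $\frac{n}{2}$ and $\frac{n}{2}+1$, and the median is the average of those two values.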
Odd example (n = 5): Data: 3, 7, 9, 12, 15. Position = (5+1)/2 = 3rd. Median = 9.
Even example (n = 6): Data: 3, 5, 7, 9, 11, 14. Positions 3 and 4. Median = (7+9)/2 = 8.
The median does not have to be a value in the data set (as seen above — 8 is not in the list).
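The two steps above can be sketched in Python. This is a minimal illustration rather than a prescribed method; the function name `median_of` is made up for this example.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)   # Step 1: order from smallest to largest
    n = len(ordered)
    middle = n // 2            # 0-based index of the (upper) middle value
    if n % 2 == 1:
        # Odd n: single middle value, at 1-based position (n + 1) / 2
        return ordered[middle]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([3, 7, 9, 12, 15]))     # 9
print(median_of([14, 3, 9, 5, 11, 7]))  # 8.0 (same data as the even example, unordered)
```

Sorting inside the function means the caller cannot forget Step 1.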
Data: 40, 45, 48, 50, 52, 55, 96 (n = 7, ordered).
Mean: Sum = 386. $\bar{x} = 386 \div 7 \approx 55.1$
Median: Position = (7+1)/2 = 4th value = 50.
Without the outlier (96), the other six scores have a mean of 290 ÷ 6 ≈ 48.3. Including it pulls the mean up to about 55.1, a shift of almost 7 marks. The median, however, stays at exactly 50 whether the last value is 96 or 9600. The median is resistant to outliers.
For these 7 test scores, the median (50) is the more honest description of "typical" performance: five of the seven scores lie between 45 and 55 inclusive.
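To see this resistance in action, here is a short sketch using Python's standard `statistics` module. It recomputes both measures for the same seven scores, first with 96 as the last value and then with 9600, the two cases mentioned above.

```python
from statistics import mean, median

scores = [40, 45, 48, 50, 52, 55, 96]

# Swap the last score for a far more extreme value and compare both measures.
for last in (96, 9600):
    data = scores[:-1] + [last]
    print(f"last = {last}: mean = {mean(data):.1f}, median = {median(data)}")
```

The mean changes enormously between the two runs, while the median is 50 in both.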
The mode is the value that appears most often. Count the frequency of each value — the one with the highest count is the mode.
One mode: 2, 3, 3, 4, 5, 5, 5, 7 → mode = 5 (appears 3 times).
Bimodal: 2, 2, 3, 3, 4 → modes = 2 and 3 (each appears twice, so the data is bimodal).
No mode: 1, 2, 3, 4, 5 → each appears once, so there is no mode.
The mode is the only measure of centre that works for categorical data: the modal colour of cars in a car park could be "white" — you can't find the mean or median of colours.
In a frequency table: the mode is the value with the highest frequency.
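Counting frequencies is easy to sketch in Python with `collections.Counter`, which builds the frequency table for you. The helper name `modes_of` and the list of car colours are made up for illustration. The function follows this lesson's convention that data in which every value appears once has no mode.

```python
from collections import Counter


def modes_of(values):
    """Return a list of modes; empty if no value appears more than once."""
    counts = Counter(values)           # frequency table: value -> count
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value appears once: no mode
    return [value for value, count in counts.items() if count == highest]


print(modes_of([2, 3, 3, 4, 5, 5, 5, 7]))             # [5]
print(modes_of([2, 2, 3, 3, 4]))                      # [2, 3]
print(modes_of([1, 2, 3, 4, 5]))                      # []
print(modes_of(["white", "red", "white", "silver"]))  # ['white']
```

The last call shows the mode working on categorical data, where a mean or median would not make sense.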
| Measure | Best used when… | Weakness |
|---|---|---|
| Mean | Data is symmetric, no outliers | Pulled strongly by outliers |
| Median | Data is skewed or has outliers (house prices, incomes) | Ignores the size of extreme values |
| Mode | Categorical data, or "most popular" value needed | May not exist; ignores other values |
Real-world example: Why do you think newspaper articles about house prices usually report the median sale price rather than the mean? (A few luxury mansions would inflate the mean, making typical houses seem more expensive than they are.)
Median: Order data. Odd n → position $\frac{n+1}{2}$. Even n → average of positions $\frac{n}{2}$ and $\frac{n}{2}+1$.
Mode: Most frequent value. There can be none, one, two (bimodal), or more.
Resistant: Median barely moves when outliers are added. Mean is strongly affected.
Choose: mean for symmetric data; median for skewed/outlier data; mode for categorical or "most popular".
Find the median of: 11, 5, 9, 3, 15, 7, 13.
Find the median of: 14, 4, 10, 6, 12, 8.
Find the mode of: 3, 7, 5, 7, 2, 9, 7, 4, 5.
A real estate report says the average house price in a suburb is $1.4 million, but most houses sell for around $900 000. Which measure of centre is most likely being reported, and what should be used instead?
Which of the following data sets is bimodal? A. 1, 2, 3, 4, 5 B. 2, 2, 3, 4, 4, 5 C. 1, 1, 1, 2, 3 D. 5, 6, 7, 8, 9
Q6. Find the median and mode of: 14, 9, 7, 14, 11, 8, 6, 14, 10, 9. Show all working including ordering the data.
Q7. A data set of 5 values has a mean of 6.2. Four of the values are: 4, 7, 8, 5. Find the missing fifth value. Show your working using the mean formula.
Q8. A data set of house sale prices (in $000s): 420, 450, 460, 470, 480, 490, 820. Calculate both the mean and the median. Which measure better represents the typical house price? Explain why, referring to the shape of the data.
Ordered: 6, 7, 8, 9, 9, 10, 11, 14, 14, 14.
n = 10 (even). Median = average of 5th and 6th values = (9 + 10) ÷ 2 = 9.5.
Mode = 14 (appears 3 times — most frequent).
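If you want to check this working on a computer, one option is Python's standard `statistics` module (the `multimode` function needs Python 3.8 or later). This is only a check; the written working above is still what the question asks for.

```python
from statistics import median, multimode

data = [14, 9, 7, 14, 11, 8, 6, 14, 10, 9]

print(sorted(data))     # [6, 7, 8, 9, 9, 10, 11, 14, 14, 14]
print(median(data))     # 9.5
print(multimode(data))  # [14]
```

Note that `multimode` returns every value when all values appear once, whereas this lesson says such data has no mode.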
Mean = 6.2, n = 5. Total sum = 6.2 × 5 = 31.
Known values: 4 + 7 + 8 + 5 = 24.
Missing value = 31 − 24 = 7.
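The same reasoning (total = mean × n, then subtract the known values) can be written as a small Python sketch. The function name `missing_value` is hypothetical, and the result is rounded because decimal values such as 6.2 are not stored exactly in floating point.

```python
def missing_value(mean, n, known):
    """Find the one missing value given the mean, the count n, and the other values."""
    total = mean * n              # rearranged mean formula: sum = mean x n
    return total - sum(known)     # what is left over after the known values


answer = missing_value(6.2, 5, [4, 7, 8, 5])
print(round(answer, 6))  # 7.0
```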
Sum = 420+450+460+470+480+490+820 = 3590. n = 7.
Mean = 3590 ÷ 7 ≈ 512.9 (in $000s), which is about $512 900.
Ordered (already ordered). Median = 4th value = $470 000.
The median ($470 000) is more representative. The data is skewed right — six of the seven prices are between $420k and $490k, but the $820k outlier pulls the mean up to $513k. The median is not affected by this extreme value and better represents what most buyers actually paid.
A data set has 9 values. When a 10th value of 100 is added, the median does not change but the mean increases by 5.
(a) What was the original mean? (Hint: if adding 100 increases the mean by 5, you can set up an equation.)
(b) Where does the value 100 sit when the 10 values are ordered? Does this make sense given that the median didn't change?
(c) If the original median of the 9 values is the 5th value = 40, and the new median of 10 values is the average of the 5th and 6th values = 40, what can you say about the 6th value in the ordered set of 10?