Module 5 · L6 of 15 ~35 min ⚡ +95 XP available

Measures of Centre and Spread

A billionaire walks into a room of ten people earning average wages. The mean income skyrockets — but it represents nobody in that room. One extreme value can make a distribution look completely different depending on which measure you choose. By the end of this lesson you'll know exactly when to use mean versus median, and how to quantify spread with range, IQR, and standard deviation.

Today's hook: In 2023, Australia's mean household income was ~$120,000 but the median was ~$90,000. Politicians often quote "average income" (the mean), which makes the economy look stronger. The $30,000 gap arises because a small number of very high earners pull the mean upward. Which number tells the truth?

Recall — your gut answer first

+5 XP warm-up

A data set has mean $10$ and standard deviation $2$. If every data point increases by 5, what happens to the mean and standard deviation? Predict before reading on.

The shift and scale rules — memorise these

+5 XP to read

Two rules underlie most exam questions in this topic. Lock them in before diving into the content.

Adding a constant $c$ shifts every measure of centre by $c$ and leaves every measure of spread unchanged. Multiplying by a constant $k$ scales the measures of centre by $k$ and the range, IQR, and standard deviation by $|k|$; the variance scales by $k^2$.
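The two rules can be checked numerically. The sketch below uses Python's standard `statistics` module on a small hypothetical data set (the values 8, 10, 12 and the constants `c = 5`, `k = 3` are illustrative assumptions, chosen so that the mean is 10 and the sample standard deviation is 2).

```python
import statistics

data = [8, 10, 12]  # hypothetical values: mean = 10, sample SD = 2
c, k = 5, 3         # illustrative constants, not from the lesson

shifted = [x + c for x in data]  # add a constant to every value
scaled = [k * x for x in data]   # multiply every value by a constant

def summary(values):
    """Return (mean, sample standard deviation, sample variance)."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        statistics.variance(values),
    )

for label, values in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    mean, sd, var = summary(values)
    print(f"{label:8s} mean = {mean}, SD = {sd}, variance = {var}")
```

With these values the shifted set has mean 15 and SD 2, while the scaled set has mean 30, SD 6, and variance 36 (that is, $3^2 \times 4$).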

$$\bar{x} = \frac{\sum x}{n} \qquad \text{IQR} = Q_3 - Q_1 \qquad \text{Outlier fences: } Q_1 - 1.5 \times \text{IQR},\; Q_3 + 1.5 \times \text{IQR}$$

Mean

$\bar{x} = \frac{\sum x}{n}$. Sensitive to outliers — one extreme value pulls it strongly.

Median

Middle value when ordered. Robust to outliers — the preferred measure for skewed data.

Grouped data mean

$\bar{x} \approx \frac{\sum f \cdot m}{\sum f}$ where $m$ = mid-interval value. An estimate only.

What you'll master

Know

Key facts

Mean $= \frac{\sum x}{n}$; median = middle value; mode = most frequent
Range = max $-$ min; IQR $= Q_3 - Q_1$
Adding $c$ shifts mean by $c$ but leaves SD unchanged

Understand

Concepts

Mean is sensitive to outliers; median is robust
Standard deviation measures the typical distance of values from the mean
When to use each measure depending on data shape

Can do

Skills

Calculate mean, median, mode, range, IQR, and standard deviation
Estimate mean from grouped data using mid-interval values
Choose appropriate measures for skewed or outlier-prone data

Key terms

Mean $\bar{x}$: Arithmetic average: $\bar{x} = \frac{\sum x}{n}$. Uses every data point; sensitive to outliers.

Median: Middle value of ordered data. Position $= \frac{n+1}{2}$. Robust to outliers.

Mode: Most frequently occurring value. A data set may have no mode or multiple modes.

IQR: Interquartile range $= Q_3 - Q_1$. Spread of the middle 50% of data; robust to outliers.

Standard deviation $s$: Typical distance from the mean: $s = \sqrt{\frac{\sum(x-\bar{x})^2}{n-1}}$. Not robust to outliers.

Outlier: A value below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$ (the 1.5 × IQR rule).

Types of data — classify before you analyse

core concept

Before analysing any data set, identify what type of data you have. Appropriate displays, statistics, and analyses all depend on this.

Categorical data describes qualities or characteristics:

Nominal: categories with no natural order — eye colour, blood type, postcode
Ordinal: categories with a natural order — exam grade, satisfaction rating, shirt size

Numerical data represents counts or measurements:

Discrete: countable values (usually integers) — number of students, goals scored
Continuous: any value in an interval — height, weight, temperature, time

Data type	Displays	Statistics
Categorical	Bar chart, pie chart	Mode, proportions
Numerical discrete	Dot plot, histogram	Mean, median, SD, IQR
Numerical continuous	Histogram, box plot	Mean, median, SD, IQR, percentiles

Trap: Numbers are not always numerical data. A postcode is categorical — it's a label, not a measurement.

Categorical: nominal (no order) vs ordinal (natural order); Numerical: discrete (countable) vs continuous (measurable interval)

Pause — copy the data classification hierarchy: categorical (nominal = no order; ordinal = natural order) and numerical (discrete = countable; continuous = measurable interval) into your book.

Quick check: Which of the following is an example of numerical continuous data?

Measures of centre and spread

Mean, median, and mode

core concept

We just saw that numerical data can be discrete or continuous, and categorical data can be nominal or ordinal. That raises a question: once we know the data type, which measures of centre are actually meaningful, and when does the mean become misleading? In short: the mean $\bar{x} = \frac{\sum x}{n}$ is sensitive to outliers, while the median is robust and preferred for skewed distributions.

Three ways to find the "middle" of a data set — each with strengths and weaknesses.

In skewed data, one outlier can move the mean dramatically while the median barely changes.

Mean ($\bar{x}$): $\bar{x} = \dfrac{\sum x}{n}$. Uses every data point; sensitive to outliers.

Median: middle value when ordered. For $n$ values: position $= \dfrac{n+1}{2}$. Robust.

Mode: most frequent value. A data set can have multiple modes or no mode.
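To see the three measures side by side, here is a minimal sketch of the billionaire-in-the-room scenario using Python's standard `statistics` module. The ten wage figures and the billionaire's income are hypothetical values invented for illustration, not real data.

```python
import statistics

# Hypothetical annual incomes (dollars) for ten wage earners; illustrative only.
wages = [52_000, 55_000, 58_000, 60_000, 60_000,
         63_000, 65_000, 68_000, 70_000, 74_000]

# The same room after one extreme earner walks in (assumed income).
with_billionaire = wages + [1_000_000_000]

for label, incomes in [("ten earners", wages), ("plus billionaire", with_billionaire)]:
    mean = statistics.mean(incomes)
    median = statistics.median(incomes)
    modes = statistics.multimode(incomes)  # list of all most-frequent values
    print(f"{label}: mean = {mean:,.0f}, median = {median:,.0f}, mode(s) = {modes}")
```

Under these assumed figures the mean jumps from 62,500 to more than 90 million, while the median only moves from 61,500 to 63,000 and the mode stays at 60,000.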

Measure	Best for	Avoid when
Mean	Symmetric data, no outliers	Skewed data or outliers present
Median	Skewed data, data with outliers	A total or exact average is needed for further calculations
Mode	Categorical data, identifying peaks	Continuous data with no repeats

Australian household income. In 2023, mean household income ≈ $120,000; median ≈ $90,000. The mean is pulled up by a small number of very high earners. When politicians quote "average income," they usually mean the mean — which makes the economy look stronger than it feels for most people. The median gives a truer picture of the typical household.

$\bar{x} = \frac{\sum x}{n}$ — sensitive to outliers, good for symmetric data; Median — robust to outliers, preferred for skewed distributions (income, house prices)

Pause — copy the formula $\bar{x} = \frac{\sum x}{n}$ and the outlier-sensitivity decision rule: use mean for symmetric data, median for skewed distributions (e.g. income, house prices) into your book.

Did you get this? True or false: when data is right-skewed with outliers, the median is usually a better measure of centre than the mean.

Worked examples

PROBLEM 1 · OUTLIER EFFECT

Data: 12, 15, 18, 20, 22, 25, 28, 100. Find (a) mean and median, (b) range and IQR, (c) standard deviation, then note the effect of the outlier 100.

$\bar{x} = \dfrac{12+15+18+20+22+25+28+100}{8} = \dfrac{240}{8} = 30$

Median = avg of 4th & 5th values = $\frac{20+22}{2} = 21$. Mean is pulled far above median by the outlier.
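Parts (b) and (c) can be sketched in Python as follows. This assumes quartiles are found as the medians of the lower and upper halves of the ordered data (leaving out the middle value when $n$ is odd) and that $s$ is the sample standard deviation with $n-1$ in the denominator. The helper names `quartiles` and `spread` are hypothetical, introduced only for this illustration.

```python
import statistics

data = [12, 15, 18, 20, 22, 25, 28, 100]

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: the middle value is left out of both halves when n is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

def spread(values):
    """Return (range, IQR, sample standard deviation)."""
    q1, q3 = quartiles(values)
    return max(values) - min(values), q3 - q1, statistics.stdev(values)

with_outlier = spread(data)
without_outlier = spread(data[:-1])  # same data with the value 100 dropped

for label, (rng, iqr, s) in [("with 100", with_outlier), ("without 100", without_outlier)]:
    print(f"{label}: range = {rng}, IQR = {iqr}, s = {s:.2f}")
```

Under this quartile convention the range falls from 88 to 16 and $s$ from about 28.75 to about 5.57 once 100 is removed, while the IQR stays at 10. That is the sense in which the IQR is robust and the range and standard deviation are not.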

PROBLEM 2 · GROUPED DATA MEAN

Estimate the mean from: Score 0–10 (freq 5), 10–20 (freq 12), 20–30 (freq 8), 30–40 (freq 5).

Mid-interval values: $m = 5, 15, 25, 35$.

Use the midpoint of each class interval as the representative value.
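The remaining arithmetic, $\bar{x} \approx \frac{\sum f \cdot m}{\sum f}$, can be sketched as below. The variable names are illustrative, and the result is an estimate only, since the raw scores are unknown.

```python
# Class intervals and frequencies from Problem 2: ((lower, upper), frequency)
classes = [((0, 10), 5), ((10, 20), 12), ((20, 30), 8), ((30, 40), 5)]

# Sum of frequencies
total_f = sum(f for _, f in classes)

# Sum of frequency x mid-interval value, with midpoint = (lower + upper) / 2
total_fm = sum(f * (lower + upper) / 2 for (lower, upper), f in classes)

estimate = total_fm / total_f
print(f"sum f = {total_f}, sum f*m = {total_fm}, estimated mean = {estimate:.2f}")
```

This gives $\sum f = 30$, $\sum f \cdot m = 580$, and an estimated mean of approximately 19.33.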

PROBLEM 3 · 1.5 × IQR OUTLIER TEST

Data: 10, 12, 14, 15, 16, 18, 50. Identify any outliers using the 1.5 × IQR rule.

Ordered: 10, 12, 14, 15, 16, 18, 50. $Q_1 = 12$, $Q_3 = 18$, IQR $= 6$.

With 7 values: lower half is 10,12,14 → $Q_1=12$; upper half is 16,18,50 → $Q_3=18$.
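The fences and the outlier check that complete this problem can be sketched as follows, using the same halves-based quartile convention described above. Variable names are illustrative.

```python
import statistics

data = [10, 12, 14, 15, 16, 18, 50]

ordered = sorted(data)
half = len(ordered) // 2                   # 3 values in each half; the middle value 15 is left out
q1 = statistics.median(ordered[:half])     # median of 10, 12, 14
q3 = statistics.median(ordered[-half:])    # median of 16, 18, 50
iqr = q3 - q1

# 1.5 x IQR fences
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any value outside the fences is flagged as an outlier
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {lower_fence} and {upper_fence}; outliers: {outliers}")
```

The fences come out as $12 - 9 = 3$ and $18 + 9 = 27$, so 50 is the only value flagged as an outlier.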

Fill the gap: A data set has mean $= 15$ and $\text{SD} = 3$. Every value is increased by 10. The new mean $= \underline{\quad}$ and the new $\text{SD} = \underline{\quad}$.

Common errors · the 3 traps that cost marks

Trap 01

"Mean is always the best measure"

For household incomes, house prices, survival times after illness — any right-skewed distribution — the mean is distorted. The median is usually more representative. Always ask: is the data skewed? Are there outliers?

Trap 02

Adding 5 increases the standard deviation by 5

Adding a constant shifts all data by the same amount. The distances between points stay exactly the same. Standard deviation, variance, IQR, and range are all unchanged. Only measures of centre shift.

Trap 03

Forgetting that the grouped mean is an estimate

$\bar{x} \approx \frac{\sum f \cdot m}{\sum f}$ uses midpoints — the actual values within each class are unknown. The estimate could be off if values cluster near one end of a class. Always write "estimate" or "approximately."
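A small sketch of how far the estimate can drift: the raw scores below are hypothetical, deliberately chosen to sit near the top of classes of width 10 (0–10, 10–20, 20–30), so the midpoint-based estimate lands well below the true mean.

```python
from collections import Counter
import statistics

# Hypothetical raw scores clustered near the upper end of each class.
raw = [8, 9, 9, 17, 18, 19, 19, 28]
width = 10  # assumed class width: 0-10, 10-20, 20-30

# Frequency of each class, keyed by class index (0 for 0-10, 1 for 10-20, ...)
freq = Counter(x // width for x in raw)

# Grouped estimate: each value is replaced by its class midpoint
estimate = sum(f * (idx * width + width / 2) for idx, f in freq.items()) / len(raw)
true_mean = statistics.mean(raw)

print(f"grouped estimate = {estimate}, true mean = {true_mean}")
```

For this made-up data the grouped estimate is 12.5 while the true mean is 15.875, which is why the grouped result should always be labelled an estimate.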

Match each measure to its property. Which statement correctly describes the measure listed?

Quick-fire activities

Data: 4, 7, 8, 8, 9, 12, 15. Find the mean, median, and mode.

Data: 2, 5, 6, 8, 10, 12, 15. Find range, IQR, and standard deviation.

Add 3 to every value in the previous data set (2, 5, 6, 8, 10, 12, 15). What happens to the mean, median, range, and standard deviation?

Grouped data: 0–5 (freq 4), 5–10 (freq 8), 10–15 (freq 6), 15–20 (freq 2). Estimate the mean.

Explain why the ABS reports median household income rather than mean household income.

Revisit your thinking

Adding 5 to every data point shifts the mean by 5 (10 → 15) but leaves the standard deviation unchanged (still 2). Standard deviation measures spread — the distance between data points — and adding a constant moves every point by the same amount, so the distances between them do not change. This is fundamental: adding $c$ to all data shifts measures of centre by $c$ but preserves all measures of spread.
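The same argument can be written as a short derivation. The notation $y_i = x_i + c$ for the shifted values is introduced here for illustration; $s$ is the sample standard deviation with $n-1$ in the denominator.

$$\bar{y} = \frac{\sum (x_i + c)}{n} = \frac{\sum x_i}{n} + \frac{nc}{n} = \bar{x} + c$$

$$y_i - \bar{y} = (x_i + c) - (\bar{x} + c) = x_i - \bar{x} \quad\Rightarrow\quad s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = s_x$$

With $\bar{x} = 10$, $s_x = 2$, and $c = 5$, this gives $\bar{y} = 15$ and $s_y = 2$.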

Short answer

Apply · Band 4 · 3 marks

Q1. The ages of 10 employees at a tech startup are: 22, 23, 24, 25, 26, 27, 28, 30, 32, 65. (a) Calculate the mean, median, and mode. (b) Calculate range and IQR. (c) Identify any outliers using the 1.5 × IQR rule. (d) Which measure of centre best represents the typical age? Justify. (3 marks)

Apply · Band 4 · 3 marks

Q2. A data set of 50 test scores is in the grouped frequency table below. (a) Estimate the mean using mid-interval values. (b) The actual mean from raw data is 58.2. Calculate the percentage error and explain why the estimate differs. (c) If every student received 5 bonus marks, what happens to the estimated mean and the estimated standard deviation? (3 marks)

Score	0–20	20–40	40–60	60–80	80–100
Frequency	3	8	15	18	6

Analyse · Band 5 · 3 marks

Q3. Two schools report HSC Maths Advanced results. School A: mean = 82, median = 80, SD = 8. School B: mean = 82, median = 75, SD = 18. (a) Describe the likely distribution shape at each school. (b) A parent argues "Both schools have the same mean — they perform equally." Evaluate this claim using at least two statistical measures. (c) As a principal, which school's data is more concerning, and what targeted intervention would you propose? (3 marks)

Comprehensive answers

Drill 1: Mean $= 63/7 = 9$; Median $= 8$; Mode $= 8$.

Drill 2: Range $= 13$; $Q_1 = 5$, $Q_3 = 12$, IQR $= 7$ (lower half 2, 5, 6; upper half 10, 12, 15); $s \approx 4.42$. (Using $\sum x^2 = 598$, $\bar{x} = 58/7 \approx 8.29$, $s = \sqrt{(598 - 58^2/7)/6} \approx 4.42$.)

Drill 3: Mean increases by 3 (→ 11.29). Median increases by 3 (→ 11). Range unchanged (13). SD unchanged (≈ 4.42).

Drill 4: Midpoints: 2.5, 7.5, 12.5, 17.5. $\bar{x} \approx \frac{4(2.5)+8(7.5)+6(12.5)+2(17.5)}{20} = \frac{180}{20} = 9$.

Drill 5: Income is right-skewed — a few very high earners pull the mean well above what most households earn. The median represents the typical household more accurately.

Q1 (3 marks): (a) Mean $= 302/10 = 30.2$; Median $= (26+27)/2 = 26.5$; No mode [1.5]. (b) Range $= 43$; $Q_1 = 24$, $Q_3 = 30$, IQR $= 6$ [0.5]. (c) Fences: $24-9=15$, $30+9=39$. Outlier: 65 [0.5]. (d) Median (26.5): the mean (30.2) is distorted by the 65-year-old outlier; most employees are in their twenties [0.5].

Q2 (3 marks): (a) $\bar{x} \approx \frac{3(10)+8(30)+15(50)+18(70)+6(90)}{50} = \frac{2820}{50} = 56.4$ [1]. (b) $\% \text{ error} = \frac{|56.4-58.2|}{58.2} \times 100 \approx 3.1\%$; the estimate assumes all values sit at the midpoint — actual values may cluster toward one end of each interval [1]. (c) Estimated mean increases by 5 to 61.4; estimated standard deviation is unchanged [1].

Q3 (3 marks): (a) School A: approximately symmetric (mean ≈ median), moderate spread. School B: right-skewed (mean $>$ median), large spread, with many lower scores and a few very high scores pulling the mean up [1]. (b) The claim is flawed: School B's median is 5 marks lower, so half of School B's students score 75 or below, while half of School A's students score 80 or above. School B's SD (18 vs 8) shows far greater variability: some students excel while many struggle. School A delivers consistent results for most students [1]. (c) School B is more concerning. Intervention: diagnostic testing to identify at-risk students, targeted support or tutoring for low performers, differentiated instruction, and extension for high performers, addressing the inequitable spread of outcomes [1].
