hscscience Maths Std · Y12
Module 8 · L5 of 12 · ~25 min · MS12-9

Box Plots and Outliers

A single box can reveal what pages of numbers cannot. The box plot — also called a box-and-whisker plot — distils an entire data set into five numbers: minimum, lower quartile, median, upper quartile, and maximum. With one glance you can see the centre, spread, skewness, and any unusual values. This lesson shows you how to construct box plots, identify outliers using the 1.5 × IQR rule, and compare distributions side by side.

Today's hook — Data: 5, 8, 12, 15, 18, 20, 22, 25, 30, 100. Without calculating, which value seems like an outlier? How might you decide mathematically?

01
Recall — your gut answer first
+5 XP warm-up

Data: 5, 8, 12, 15, 18, 20, 22, 25, 30, 100. Without calculating, which value seems like an outlier? How might you decide mathematically?

Before reading on — write your gut feeling. We will revisit this at the end of the lesson.

02
Key ideas for this lesson
reference

Box plots (box-and-whisker plots) summarise data using five numbers. Two rules underpin every box plot question.

Five-number summary: Min, $Q_1$, Median ($Q_2$), $Q_3$, Max. These five values summarise the centre, spread, and shape of a data set.

Outlier rule: A value is an outlier if it falls below $Q_1 - 1.5 \times IQR$ or above $Q_3 + 1.5 \times IQR$.

Outlier fences:
  • $IQR = Q_3 - Q_1$
  • Lower fence $= Q_1 - 1.5 \times IQR$
  • Upper fence $= Q_3 + 1.5 \times IQR$
  • Values beyond the fences are outliers
The IQR measures the spread of the middle 50% of the data; box plots make this visible at a glance.
Box = middle 50%
The box spans $Q_1$ to $Q_3$ and always contains the middle half of the data. Its width is the IQR.
Whiskers stay within the fences
Each whisker extends to the most extreme value within its fence, which is not necessarily the min or max when outliers exist.
Outliers: investigate
Outliers should be investigated — they may be data errors or genuine extreme values. Never remove automatically.
03
What you will master
Know

Key facts

  • Five-number summary: Min, $Q_1$, Median, $Q_3$, Max
  • Outlier rule: $1.5 \times IQR$ fences
  • Box plot structure and components
Understand

Concepts

  • Why box plots reveal skewness
  • How outliers are defined mathematically
  • When to exclude vs investigate outliers
Can do

Skills

  • Find the five-number summary and IQR
  • Apply the 1.5 × IQR rule to identify outliers
  • Draw and compare side-by-side box plots
04
Key terms
Five-number summary: Min, $Q_1$, Median, $Q_3$, Max. The five values that define a box plot.
Interquartile range (IQR): $IQR = Q_3 - Q_1$. The spread of the middle 50% of the data.
Quartiles: $Q_1$ (25th percentile), $Q_2$ (median, 50th percentile), $Q_3$ (75th percentile).
Outlier: A data value that falls outside the lower or upper fence. Plotted as an individual point on a box plot.
Lower fence: $Q_1 - 1.5 \times IQR$. Values below this are outliers.
Upper fence: $Q_3 + 1.5 \times IQR$. Values above this are outliers.
05
Five-number summary
core concept

The five-number summary consists of five values that together describe the entire distribution:

  1. Minimum: The smallest value in the data set
  2. $Q_1$ (lower quartile): The 25th percentile — median of the lower half
  3. Median ($Q_2$): The 50th percentile — middle value
  4. $Q_3$ (upper quartile): The 75th percentile — median of the upper half
  5. Maximum: The largest value in the data set

Example: Data set: 8, 12, 15, 18, 20, 22, 25, 30, 35, 40

  • $n = 10$, so median = average of 5th and 6th values = $(20 + 22)/2 = 21$
  • Lower half: 8, 12, 15, 18, 20 → $Q_1 = 15$
  • Upper half: 22, 25, 30, 35, 40 → $Q_3 = 30$
  • $IQR = Q_3 - Q_1 = 30 - 15 = 15$

Five-number summary: 8, 15, 21, 30, 40

Finding quartiles: Split the data at the median. For the lower half, find its median to get $Q_1$. For the upper half, find its median to get $Q_3$. Do not include the median value in either half when $n$ is odd.
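
The quartile method above can be sketched in Python. This is a minimal illustration, not a standard library routine: the function name `five_number_summary` is hypothetical, it assumes at least two data values, and it follows the median-of-halves rule from this lesson. Spreadsheet and library quartile functions often interpolate instead and can give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the median position
    upper = data[half + n % 2:]   # skips the middle value when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

summary = five_number_summary([8, 12, 15, 18, 20, 22, 25, 30, 35, 40])
minimum, q1, med, q3, maximum = summary
iqr = q3 - q1
print(summary, iqr)
```

Run on the example data, this reproduces the five-number summary 8, 15, 21, 30, 40 and an IQR of 15.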
What to write in your book
  • Five-number summary: Min, $Q_1$, Median, $Q_3$, Max.
  • $IQR = Q_3 - Q_1$ — the range of the middle 50% of data.
  • To find $Q_1$: median of the lower half. To find $Q_3$: median of the upper half.

Quick check: For the data set 4, 7, 9, 12, 15, 18, 21, what is $Q_1$?

06
Identifying outliers — the 1.5 × IQR rule
core concept

An outlier is any value that falls below the lower fence or above the upper fence:

$$\text{Lower fence} = Q_1 - 1.5 \times IQR$$ $$\text{Upper fence} = Q_3 + 1.5 \times IQR$$

Worked example: Data: 5, 8, 12, 15, 18, 20, 22, 25, 30, 100

  • $Q_1 = 12$, $Q_3 = 25$, $IQR = 13$
  • Lower fence $= 12 - 1.5 \times 13 = 12 - 19.5 = -7.5$
  • Upper fence $= 25 + 1.5 \times 13 = 25 + 19.5 = 44.5$
  • $100 > 44.5$, so 100 is an outlier
  • No values below $-7.5$, so no low outliers
Important: Outliers should always be investigated, not automatically removed. They may represent genuine extreme values (e.g., a very tall person) or data entry errors (e.g., 100 cm when 10.0 cm was intended). Always report analysis both with and without the outlier if in doubt.
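
The fence calculation in the worked example can be checked with a short Python sketch. The helper names `outlier_fences` and `find_outliers` are hypothetical, and the quartiles are passed in as already-known values ($Q_1 = 12$, $Q_3 = 25$) rather than computed. The code only flags values; deciding what to do with them is still a judgement call.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """List values strictly below the lower fence or strictly above the upper fence."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

data = [5, 8, 12, 15, 18, 20, 22, 25, 30, 100]
fences = outlier_fences(12, 25)
outliers = find_outliers(data, 12, 25)
print(fences, outliers)
```

For this data the fences come out as $-7.5$ and $44.5$, and the only flagged value is 100.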
What to write in your book
  • Lower fence $= Q_1 - 1.5 \times IQR$. Upper fence $= Q_3 + 1.5 \times IQR$.
  • Values outside the fences are outliers — marked as individual points on a box plot.
  • Always investigate outliers before deciding whether to include or exclude them.

True or false: An outlier identified by the 1.5 × IQR rule should always be removed from the data set before analysis.

PROBLEM 1 · FIVE-NUMBER SUMMARY AND OUTLIERS

Daily temperatures (°C): 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35. Find the five-number summary, IQR, and identify any outliers.

Step 1: $n = 13$. Min $= 14$, Max $= 35$ (count values, identify extremes).
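
Only the first step is shown here. One way the remaining steps could run, assuming the median-of-halves method from this lesson (with $n$ odd, the median is left out of both halves) and the rule that an outlier lies strictly below or above a fence:

$$\begin{aligned}
\text{Median} &= 7\text{th value} = 21 \\
Q_1 &= \text{median of } 14, 16, 17, 18, 19, 20 = (17 + 18)/2 = 17.5 \\
Q_3 &= \text{median of } 22, 23, 24, 25, 30, 35 = (24 + 25)/2 = 24.5 \\
IQR &= 24.5 - 17.5 = 7 \\
\text{Lower fence} &= 17.5 - 1.5 \times 7 = 7 \\
\text{Upper fence} &= 24.5 + 1.5 \times 7 = 35
\end{aligned}$$

Under these assumptions the five-number summary is 14, 17.5, 21, 24.5, 35. The value 35 sits exactly on the upper fence rather than above it, so no outliers are flagged.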
PROBLEM 2 · FIND AND CONFIRM AN OUTLIER

Data: 10, 12, 15, 18, 20, 22, 25, 28, 30, 60. Find the five-number summary and identify any outliers.

Step 1: $n = 10$. Median $= (20+22)/2 = 21$ (average of 5th and 6th values).
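
Only the first step is shown here. A possible continuation, assuming the median-of-halves method from this lesson:

$$\begin{aligned}
Q_1 &= \text{median of } 10, 12, 15, 18, 20 = 15 \\
Q_3 &= \text{median of } 22, 25, 28, 30, 60 = 28 \\
IQR &= 28 - 15 = 13 \\
\text{Lower fence} &= 15 - 1.5 \times 13 = -4.5 \\
\text{Upper fence} &= 28 + 1.5 \times 13 = 47.5
\end{aligned}$$

On this working the five-number summary is 10, 15, 21, 28, 60. Since $60 > 47.5$, the value 60 is an outlier, and no value falls below $-4.5$.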
08
Drawing and interpreting box plots
core concept

To draw a box plot:

  1. Draw a horizontal number line covering the data range
  2. Draw a box from $Q_1$ to $Q_3$ with a vertical line at the median
  3. Extend whiskers from the box to the most extreme values within the fences
  4. Plot any outliers as individual points beyond the whiskers
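
Steps 3 and 4 can be sketched in Python to show where the whiskers stop. The function name `box_plot_parts` is hypothetical, the quartiles are supplied by the caller, and the sketch only computes the positions to draw; it does not produce a plot.

```python
def box_plot_parts(values, q1, q3):
    """Split data into whisker ends and outliers, given the quartiles."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Values on or inside the fences are candidates for the whisker ends.
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {
        "box": (q1, q3),
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(outliers),
    }

parts = box_plot_parts([5, 8, 12, 15, 18, 20, 22, 25, 30, 100], q1=12, q3=25)
print(parts)
```

With the data 5, 8, 12, 15, 18, 20, 22, 25, 30, 100, the right whisker stops at 30 (the largest value inside the upper fence of 44.5) and 100 is plotted as a separate point.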

Interpreting box plots:

  • Median line position: Left of box centre = right-skewed data; right of centre = left-skewed
  • Box width (IQR): Wider box means more spread in the middle 50%
  • Whisker length: Longer whisker on one side indicates more spread in that tail
  • Outlier dots: Individual points beyond the whiskers
Skewness rule: If the right whisker is longer than the left whisker (or the median is closer to $Q_1$ than $Q_3$), the distribution is right-skewed (positively skewed). The reverse means left-skewed.
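
The two indicators in the skewness rule can be checked separately, as in this rough Python sketch. The names `skew_signal` and `describe_skew` are hypothetical, and the sketch assumes the minimum and maximum passed in are the whisker ends (no outliers). The two indicators can disagree on real data, so both are reported rather than merged into one verdict.

```python
def skew_signal(left_gap, right_gap):
    """Compare a left-hand distance with a right-hand distance."""
    if right_gap > left_gap:
        return "right-skewed"
    if right_gap < left_gap:
        return "left-skewed"
    return "roughly symmetric"

def describe_skew(minimum, q1, med, q3, maximum):
    """Report the whisker-length and median-position indicators separately."""
    return {
        "whiskers": skew_signal(q1 - minimum, maximum - q3),
        "median": skew_signal(med - q1, q3 - med),
    }

result = describe_skew(8, 15, 21, 30, 40)
print(result)
```

For the summary 8, 15, 21, 30, 40, the right whisker (10) is longer than the left (7) and the median is closer to $Q_1$ (6 away) than to $Q_3$ (9 away), so both indicators point to right skew.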
What to write in your book
  • Box: $Q_1$ to $Q_3$, line at median. Whiskers to most extreme non-outlier values.
  • Outliers plotted as dots beyond whiskers.
  • Longer right whisker or median closer to $Q_1$ = right-skewed distribution.

Fill the gap: A box plot has $Q_1 = 20$, $Q_3 = 35$ and $IQR = 15$. The upper fence is ____.

Trap 01
Including median in quartile calculation
When $n$ is odd, the median is not included in either half when calculating $Q_1$ or $Q_3$. Example: for 7 values, the lower half for $Q_1$ is only the bottom 3 values.
Trap 02
Whiskers to min/max regardless of outliers
If outliers exist, whiskers extend to the most extreme value within the fence, not to min/max. Outliers are plotted as separate dots. Drawing whiskers to an outlier loses marks.
Trap 03
Automatically removing outliers
The 1.5 × IQR rule identifies outliers but does not automatically remove them. Always state that outliers need investigation. Removing without justification is poor statistical practice.
09
Comparing distributions — side-by-side box plots
core concept

Side-by-side box plots are ideal for comparing two or more groups. Always comment on:

  • Centre: Compare medians. Which group has a higher typical value?
  • Spread: Compare IQRs. Which group is more variable?
  • Skewness: Is one group more symmetric than the other?
  • Outliers: Does one group have more extreme values?

Example: Class A: median = 72, IQR = 10. Class B: median = 75, IQR = 18.

Class B has a higher median (centre) but also a wider IQR (more variability). Class A is more consistent. Class B has a higher typical score but greater variation between students.
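
A comparison that quotes the actual values can be sketched as below. The function `compare_groups` and the dictionary layout are hypothetical, and the sketch covers centre and spread only; ties are not treated specially, and skewness and outliers would still need a comment of their own.

```python
def compare_groups(a, b):
    """a and b are dicts with 'name', 'median' and 'iqr' keys."""
    hi_c, lo_c = (a, b) if a["median"] >= b["median"] else (b, a)
    hi_s, lo_s = (a, b) if a["iqr"] >= b["iqr"] else (b, a)
    centre = f"{hi_c['name']} has the higher median ({hi_c['median']} vs {lo_c['median']})."
    spread = f"{lo_s['name']} is more consistent (IQR {lo_s['iqr']} vs {hi_s['iqr']})."
    return centre, spread

class_a = {"name": "Class A", "median": 72, "iqr": 10}
class_b = {"name": "Class B", "median": 75, "iqr": 18}
centre, spread = compare_groups(class_a, class_b)
print(centre)
print(spread)
```

Each sentence names the group and gives both values, which is the habit a written comparison should follow.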

HSC tip: In a written comparison, always address both centre and spread. A complete comparison references both the medians (or means) and the IQR (or range). One-dimensional comparisons lose marks.
What to write in your book
  • Compare medians (centre), IQR (spread), skewness, and outliers.
  • A higher median does not mean a group is better if the spread is much larger.
  • Always use data values in your comparison — not just "higher" or "lower".

Drill 1. Find the five-number summary and identify any outliers for: 5, 8, 10, 12, 15, 18, 20, 22, 25, 30. Then decide whether to investigate or keep all values.

Drill 2. Two box plots show Class A (median=70, IQR=8) and Class B (median=75, IQR=15). Compare the two classes in two sentences, addressing both centre and spread.

Top 3 list: Name THREE real-world situations where outliers might appear in data and explain whether you would investigate or remove each one.

10
Revisit your thinking

For the data set 5, 8, 12, 15, 18, 20, 22, 25, 30, 100: $Q_1 = 12$, $Q_3 = 25$, $IQR = 13$. Upper fence $= 25 + 19.5 = 44.5$. Since $100 > 44.5$, it is confirmed as an outlier. The value 100 stands out visually and mathematically — it could be a genuine extreme value or a data error (perhaps 10.0 was intended). This is exactly why the 1.5 × IQR rule is useful: it gives an objective, mathematical criterion rather than relying on guesswork.

What has changed in your understanding? What did you get right? What surprised you?

01
Multiple choice
+5 XP per correct · +25 XP all-correct


Q1. A data set has $Q_1 = 20$ and $Q_3 = 35$. What is the upper fence for outliers?

Q2. For the data 3, 5, 7, 9, 11, 13, 15, what is the IQR?

Q3. A box plot has its median line much closer to $Q_1$ than to $Q_3$. This indicates:

Q4. On a box plot, where does a whisker end if there are outliers?

Q5. Class A has a median of 72 and IQR = 10; Class B has a median of 75 and IQR = 20. Which statement is correct?

02
Short answer
Apply · Band 4 · 2 marks

SA 1. Find the five-number summary for: 18, 20, 22, 25, 28, 30, 32, 35, 38, 40, 45, 80. (a) State all five values and the IQR. (b) Identify any outliers using the 1.5 × IQR rule. (c) Describe the skewness. (2 marks)

Apply · Band 4 · 2 marks

SA 2. Two classes took the same test. Class A: min=50, Q1=65, med=75, Q3=82, max=90. Class B: min=40, Q1=60, med=70, Q3=85, max=95. Both have one outlier each (Class A: 45, Class B: 35). (a) Compare the centres and spreads. (b) Which class performed better overall? Justify. (2 marks)

Analyse · Band 5 · 3 marks

SA 3. A real estate agent shows a box plot of house prices: median = $800K, Q1 = $650K, Q3 = $1.2M, with multiple outliers above $2.5M. (a) Explain why reporting only the median would mislead buyers. (b) A researcher removes all outliers before calculating correlation. Explain why this is problematic. (c) Design a three-step protocol for handling outliers that distinguishes data errors from genuine extreme values. (3 marks)

Comprehensive answers

MC 1 — C: $IQR = 35 - 20 = 15$. Upper fence $= 35 + 1.5 \times 15 = 35 + 22.5 = 57.5$.

MC 2 — B: Lower half: 3,5,7 → $Q_1 = 5$. Upper half: 11,13,15 → $Q_3 = 13$. $IQR = 13 - 5 = 8$.

MC 3 — D: Median closer to $Q_1$ means more data piles up on the left, with the tail pulling right — right-skewed.

MC 4 — A: Whiskers end at the most extreme value within the fence. Outliers are plotted separately as dots.

MC 5 — C: Class B median (75) is higher (better centre), but Class A IQR (10) is smaller (more consistent results).

SA 1 (2 marks): $n=12$. Min=18, $Q_1 = (22+25)/2 = 23.5$, Median$=(30+32)/2=31$, $Q_3=(38+40)/2=39$, Max=80. $IQR=15.5$. Upper fence $= 39+23.25=62.25$. $80>62.25$ so 80 is an outlier. Right-skewed (long right tail/outlier). [1 mark five-number summary + IQR; 1 mark outlier + skewness]

SA 2 (2 marks): Centre: Class A higher median (75 vs 70). Spread: Class A IQR=17, Class B IQR=25 — Class A more consistent. Both have low outliers. Class A performed better overall — higher median and smaller spread, meaning more students scored well and consistently. [1 mark comparison; 1 mark justified conclusion]

SA 3 (3 marks): (a) Median $800K suggests affordability but $Q_3=\$1.2M$ means 25% cost over $1.2M. Outliers above $2.5M exist — buyers need full distribution context. [1 mark] (b) Outliers may be genuine signals (e.g., patients who respond unusually to treatment). Removing them can hide real effects and produce falsely strong correlations. [1 mark] (c) Step 1: Verify data entry (typos, unit errors). Step 2: Check measurement conditions (equipment failure). Step 3: If no error found, keep but report analysis with and without the outlier. [1 mark]

Drill 1: Min=5, $Q_1=10$, Med=16.5, $Q_3=22$, Max=30. $IQR=12$. Fences: $-8$ to $40$. No outliers: all values lie within the fences.

Drill 2: Class B centre is higher (75 vs 70 median) suggesting better typical performance. However, Class A is more consistent (IQR=8 vs 15), meaning Class A students performed more uniformly.
