Mathematics Advanced • Year 12 • Module 5 • Lesson 7
Representing Data
Build procedural fluency in stem-and-leaf plots, five-number summaries, box plots, histograms (with frequency density) and cumulative frequency.
1. Quick recall
Answer each question in the space provided. 1 mark each
Q1.1 List the five numbers in a five-number summary in order.
_______ , _______ , _______ , _______ , _______
Q1.2 Complete: in a histogram with unequal class widths, the bar height equals __________________ and the bar area equals __________________.
Q1.3 On a box plot, the whiskers extend to the most extreme data values within ____________________ and ____________________; any point outside these is shown as ____________________.
2. Worked example — five-number summary, box plot & stem-and-leaf for 20 test scores
Data: 42, 45, 48, 52, 55, 58, 62, 64, 65, 68, 70, 72, 75, 78, 82, 85, 88, 90, 95, 98.
Problem. Find the five-number summary, check for outliers, and draw a stem-and-leaf plot.
Step 1 — Count and confirm order.
n = 20; data already in ascending order.
Step 2 — Median (average of 10th and 11th).
Median = (68 + 70)/2 = 69
Step 3 — Quartiles (medians of each half).
Lower 10 {42,…,68}: Q₁ = (55 + 58)/2 = 56.5
Upper 10 {70,…,98}: Q₃ = (82 + 85)/2 = 83.5
IQR = 83.5 − 56.5 = 27
Step 4 — Outlier fences (1.5 × IQR rule).
Lower fence = 56.5 − 1.5(27) = 16
Upper fence = 83.5 + 1.5(27) = 124
All values lie inside [16, 124] → no outliers.
Step 5 — Five-number summary.
Min = 42, Q₁ = 56.5, Median = 69, Q₃ = 83.5, Max = 98
Step 6 — Stem-and-leaf plot (key: 4 | 2 = 42).
4 | 2 5 8
5 | 2 5 8
6 | 2 4 5 8
7 | 0 2 5 8
8 | 2 5 8
9 | 0 5 8
Conclusion. Five-number summary 42, 56.5, 69, 83.5, 98; no outliers; the distribution looks roughly symmetric around 69.
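The steps of this worked example can also be checked by machine. The sketch below is one possible way to do it, assuming the median-of-halves quartile method used in this lesson (for odd n the middle value is left out of both halves). The function names `median`, `five_number_summary` and `outlier_fences` are made up for illustration; statistical software often uses other quartile conventions and may return slightly different quartiles.

```python
def median(sorted_vals):
    """Median of a list that is already in ascending order."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using medians of each half."""
    vals = sorted(data)
    n = len(vals)
    lower = vals[: n // 2]          # values below the median position
    upper = vals[(n + 1) // 2 :]    # values above the median position
    return vals[0], median(lower), median(vals), median(upper), vals[-1]


def outlier_fences(q1, q3):
    """Lower and upper fences from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


scores = [42, 45, 48, 52, 55, 58, 62, 64, 65, 68,
          70, 72, 75, 78, 82, 85, 88, 90, 95, 98]
summary = five_number_summary(scores)
low, high = outlier_fences(summary[1], summary[3])
outliers = [x for x in scores if x < low or x > high]
print("summary:", summary)
print("fences:", (low, high))
print("outliers:", outliers)
```

Running it on your own data is a quick way to confirm hand working before sketching a box plot.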
3. Faded example — fill in the missing steps
Find the five-number summary for the 12 values 23, 25, 31, 32, 32, 38, 41, 45, 45, 45, 52, 58 and decide whether 58 is an outlier. 4 marks
Step 1 — Count. n = ____.
Step 2 — Median. Average of 6th and 7th values = ( ____ + ____ ) / 2 = ____________
Step 3 — Quartiles.
Lower 6 {23, 25, 31, 32, 32, 38}: Q₁ = ( ____ + ____ ) / 2 = ____________
Upper 6 {41, 45, 45, 45, 52, 58}: Q₃ = ( ____ + ____ ) / 2 = ____________
IQR = ____________
Step 4 — Fences. Lower = ____________ · Upper = ____________
Step 5 — Five-number summary. Min ____, Q₁ ____, Median ____, Q₃ ____, Max ____.
Conclusion. 58 is / is not an outlier because ______________________________.
4. Graduated practice — build each representation asked for
Show your working. Sketch any plots clearly with labelled axes / keys.
Foundation — single-step tasks (4 questions)
| Q | Marks | Task | Answer space |
|---|---|---|---|
| 4.1 | 1 | State the five-number summary for 12, 14, 16, 18, 20, 22, 24. | |
| 4.2 | 1 | For class width 5 and frequency 12, find the frequency density. | |
| 4.3 | 1 | From a class set {10–20: 4, 20–30: 8, 30–40: 12, 40–50: 6}, write the cumulative-frequency column. | |
| 4.4 | 1 | State whether each is suitable for box plot, histogram, both, or neither: (a) shape with two peaks; (b) outlier identification; (c) labelling individual data values. | |
Standard — typical HSC difficulty (6 questions)
Show your working in the space below each part.
4.5 Construct a stem-and-leaf plot (key: 2 | 3 = 23) for 23, 25, 31, 32, 32, 38, 41, 45, 45, 45, 52, 58. State the mode. 2 marks
4.6 From the stem plot in 4.5, find the median, Q₁ and Q₃. 2 marks
4.7 Use the 1.5 × IQR rule to decide whether 58 is an outlier in the data of 4.5. 2 marks
4.8 Sketch a box plot for the data in 4.5. Label min, Q₁, median, Q₃ and max along the axis, and show any outliers as dots beyond a whisker. 2 marks
4.9 A histogram has classes 0–5, 5–10, 10–25, 25–30 with frequencies 8, 12, 30, 6. Compute the frequency density for each class. Which class has the tallest bar in a frequency-density histogram? 2 marks
4.10 Using the cumulative-frequency column from 4.3 (n = 30), estimate the median and Q₃ by linear interpolation between the upper class boundaries. 2 marks
Extension — combine concepts (2 questions)
4.11 Two data sets have identical box plots but very different histograms. (i) Explain in one sentence how this is possible. (ii) Sketch one box plot and two different histograms that could share it. 3 marks
4.12 From a cumulative-frequency graph reading {(20, 3), (40, 11), (60, 26), (80, 44), (100, 50)} (upper boundary, c.f.), find the 80th percentile by interpolation, showing how you used the value 0.8 × 50 = 40 on the c.f. axis. 3 marks
5. Self-check the easy 3
Tick the first three once you've checked your method works.
How did this worksheet feel?
What I'll revisit before next class:
Answers
Q1.1 — Five-number summary order
Min, Q₁, Median, Q₃, Max.
Q1.2 — Histograms with unequal class widths
Bar height = frequency density (= frequency / class width). Bar area = frequency.
Q1.3 — Box-plot whiskers
Whiskers extend to the most extreme data values within the fences Q₁ − 1.5 × IQR and Q₃ + 1.5 × IQR. Any value outside the fences is plotted as an individual dot or cross (an outlier).
Q3 — Faded example for 23, 25, 31, 32, 32, 38, 41, 45, 45, 45, 52, 58
Step 1: n = 12.
Step 2: median = (38 + 41)/2 = 39.5.
Step 3: Q₁ = (31 + 32)/2 = 31.5, Q₃ = (45 + 45)/2 = 45, IQR = 13.5.
Step 4: Lower fence = 31.5 − 1.5(13.5) = 11.25; Upper fence = 45 + 1.5(13.5) = 65.25.
Step 5: Five-number summary = 23, 31.5, 39.5, 45, 58.
Conclusion: 58 is not an outlier (58 ≤ 65.25).
Q4.1 — Five-number summary of 12, 14, 16, 18, 20, 22, 24
n = 7. Min = 12, Median (4th) = 18, lower half {12, 14, 16} → Q₁ = 14, upper half {20, 22, 24} → Q₃ = 22, Max = 24. Summary: 12, 14, 18, 22, 24.
Q4.2 — Frequency density
Density = frequency / class width = 12 / 5 = 2.4.
Q4.3 — Cumulative-frequency column
Running totals: 4, 12, 24, 30.
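A running total like this is easy to check in code. The following is a minimal sketch using `itertools.accumulate` from the Python standard library; the variable names are chosen only for illustration.

```python
from itertools import accumulate

classes = ["10–20", "20–30", "30–40", "40–50"]
frequencies = [4, 8, 12, 6]

# Each cumulative frequency is the sum of all frequencies up to and
# including that class.
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:>6} {f:>3} {cf:>3}")
```

The last cumulative frequency should always equal n, which is a useful self-check.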
Q4.4 — Choice of display
(a) Bimodal shape: histogram (a box plot would hide the two peaks). (b) Outlier identification: both (a box plot is the most direct, via the 1.5 × IQR rule). (c) Labelling individual values: neither; a stem-and-leaf plot is needed because it preserves every value.
Q4.5 — Stem-and-leaf plot
Key: 2 | 3 = 23.
2 | 3 5
3 | 1 2 2 8
4 | 1 5 5 5
5 | 2 8
Mode = 45 (appears 3 times).
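The plot above can be reproduced with a short script. This is only a sketch, assuming non-negative whole numbers below 100 so that the stem is the tens digit and the leaf is the units digit; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mode


def stem_and_leaf(data):
    """Map each stem (tens digit) to its leaves (units digits) in order."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    return dict(stems)


values = [23, 25, 31, 32, 32, 38, 41, 45, 45, 45, 52, 58]
plot = stem_and_leaf(values)

print("Key: 2 | 3 = 23")
# Print every stem in the range, including any stem with no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")

print("Mode:", mode(values))
```

Because the leaves are stored in sorted order, the median and quartiles can be read straight off the rows by counting.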
Q4.6 — Median, Q₁, Q₃
n = 12, so median = (6th + 7th)/2 = (38 + 41)/2 = 39.5. Q₁ = (31 + 32)/2 = 31.5, Q₃ = (45 + 45)/2 = 45.
Q4.7 — Is 58 an outlier?
IQR = 45 − 31.5 = 13.5. Upper fence = 45 + 1.5(13.5) = 65.25. Since 58 ≤ 65.25, 58 is not an outlier.
Q4.8 — Box plot for 4.5 data
Number-line box plot with whisker left to 23, box from 31.5 to 45, internal line at 39.5, whisker right to 58. No outliers (so no separate dots).
Q4.9 — Frequency densities
0–5: 8/5 = 1.6. 5–10: 12/5 = 2.4. 10–25: 30/15 = 2.0. 25–30: 6/5 = 1.2. Tallest bar in a density histogram is the 5–10 class (density 2.4), even though the 10–25 class has the highest frequency.
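The same calculation can be sketched in code, which makes it easy to see that the tallest bar and the largest frequency need not belong to the same class. The helper name `frequency_density` and the tuple layout (lower boundary, upper boundary, frequency) are assumptions made for this illustration.

```python
def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)


# (lower boundary, upper boundary, frequency) for each class in Q4.9
classes = [(0, 5, 8), (5, 10, 12), (10, 25, 30), (25, 30, 6)]

densities = {(lo, hi): frequency_density(lo, hi, f) for lo, hi, f in classes}
tallest = max(densities, key=densities.get)
most_frequent = max(classes, key=lambda c: c[2])[:2]

for (lo, hi), d in densities.items():
    print(f"{lo}–{hi}: density {d}")
print("tallest bar:", tallest)
print("highest frequency:", most_frequent)
```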
Q4.10 — Median & Q₃ by interpolation (cumulative {4, 12, 24, 30}, classes 10–20, 20–30, 30–40, 40–50)
Median position n/2 = 15. The c.f. passes 15 inside the 30–40 class (c.f. jumps from 12 to 24 across this class). Linear interpolation: median ≈ 30 + 10 × (15 − 12)/(24 − 12) = 30 + 10(3/12) = 32.5.
Q₃ position 3n/4 = 22.5, also inside 30–40 class: Q₃ ≈ 30 + 10 × (22.5 − 12)/12 = 30 + 10(10.5/12) ≈ 38.75.
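The interpolation used for both estimates follows one pattern: find the class where the cumulative frequency first reaches the target position, then move proportionally across that class. A minimal sketch, assuming values are spread evenly within each class; `interpolate_position` is a hypothetical name.

```python
def interpolate_position(boundaries, frequencies, position):
    """Estimate the data value at a given cumulative-frequency position.

    boundaries has one more entry than frequencies: class i runs from
    boundaries[i] to boundaries[i + 1].
    """
    below = 0  # cumulative frequency at the lower boundary of class i
    for i, f in enumerate(frequencies):
        if f > 0 and position <= below + f:
            width = boundaries[i + 1] - boundaries[i]
            return boundaries[i] + width * (position - below) / f
        below += f
    raise ValueError("position exceeds the total frequency")


boundaries = [10, 20, 30, 40, 50]
frequencies = [4, 8, 12, 6]
n = sum(frequencies)

median_est = interpolate_position(boundaries, frequencies, n / 2)
q3_est = interpolate_position(boundaries, frequencies, 3 * n / 4)
print("median ≈", median_est)
print("Q3 ≈", q3_est)
```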
Q4.11 — Same box plot, different histograms
(i) A box plot compresses the data into only five numbers (min, Q₁, median, Q₃, max). Any rearrangement of the data that preserves those five numbers — including a bimodal vs unimodal version — produces the same box plot but very different histograms.
(ii) Sketch one box plot with whiskers at 0 and 100 and box 25–75 with median 50. Then show two histograms over the same range: one unimodal bell-shaped around 50, and one bimodal with peaks near 20 and 80 — both can have min 0, max 100, median 50, Q₁ 25, Q₃ 75.
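A small numerical illustration of the same idea is sketched below. The two 11-value data sets are invented for this example (they are not part of the question), and quartiles are found with the median-of-halves method used in this lesson. Both sets give the summary 0, 25, 50, 75, 100, yet their counts in classes of width 20 have different shapes.

```python
def median(vals):
    """Median of a list that is already in ascending order."""
    n, mid = len(vals), len(vals) // 2
    return vals[mid] if n % 2 else (vals[mid - 1] + vals[mid]) / 2


def five_number_summary(data):
    """(min, Q1, median, Q3, max) using medians of each half."""
    vals = sorted(data)
    n = len(vals)
    return (vals[0], median(vals[: n // 2]), median(vals),
            median(vals[(n + 1) // 2 :]), vals[-1])


def bin_counts(data, width=20, bins=5):
    """Frequencies for classes 0-20, 20-40, ...; the maximum joins the last class."""
    counts = [0] * bins
    for x in data:
        counts[min(int(x // width), bins - 1)] += 1
    return counts


# Invented illustrative data: one central cluster vs two clusters.
one_peak = [0, 20, 25, 40, 45, 50, 55, 60, 75, 80, 100]
two_peaks = [0, 20, 25, 25, 25, 50, 75, 75, 75, 80, 100]

print(five_number_summary(one_peak), bin_counts(one_peak))
print(five_number_summary(two_peaks), bin_counts(two_peaks))
```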
Q4.12 — 80th percentile by interpolation
0.8 × 50 = 40 on the cumulative-frequency axis. From the table, c.f. = 26 at upper boundary 60 and c.f. = 44 at upper boundary 80, so c.f. = 40 is reached inside the 60–80 class. Interpolating linearly: P₈₀ ≈ 60 + 20 × (40 − 26)/(44 − 26) = 60 + 20(14/18) ≈ 75.6.
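Reading a percentile off a cumulative-frequency graph amounts to inverse linear interpolation between two neighbouring points. The sketch below assumes the readings are given as (upper boundary, c.f.) pairs with strictly increasing c.f.; `value_at_cf` is a hypothetical helper, and it refuses targets below the first reading because the lowest class boundary is not given.

```python
from bisect import bisect_left


def value_at_cf(points, target):
    """Data value at which the cumulative frequency reaches target.

    points: (upper boundary, cumulative frequency) pairs in ascending order.
    """
    cfs = [cf for _, cf in points]
    i = bisect_left(cfs, target)  # first reading with c.f. >= target
    if i == 0 or i == len(points):
        raise ValueError("target lies outside the interpolable range")
    (x0, c0), (x1, c1) = points[i - 1], points[i]
    return x0 + (x1 - x0) * (target - c0) / (c1 - c0)


readings = [(20, 3), (40, 11), (60, 26), (80, 44), (100, 50)]
n = readings[-1][1]
target = 0.8 * n  # position of the 80th percentile on the c.f. axis
p80 = value_at_cf(readings, target)
print(f"P80 ≈ {p80:.1f}")
```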