Mathematics Advanced • Year 12 • Module 5 • Lesson 7
Representing Data
Practise HSC-style writing on data representation — including a structured extended response on choosing and critiquing displays.
1. Short-answer questions
1.1 A small data set has five-number summary 5, 12, 18, 23, 38. (a) Compute the IQR. (b) Use the 1.5 × IQR rule to determine whether 38 is an outlier. 2 marks Band 3
1.2 A histogram of weekly sales (in $) uses class intervals 0–100 (frequency 12), 100–200 (frequency 24), 200–500 (frequency 30). Calculate the frequency density of each class and state which class produces the tallest bar in a density histogram. 3 marks Band 3-4
1.3 The cumulative-frequency table below summarises the IQ scores of 200 students:
| Upper boundary | 80 | 90 | 100 | 110 | 120 | 130 |
|---|---|---|---|---|---|---|
| Cumulative | 10 | 40 | 90 | 150 | 185 | 200 |
(a) Estimate the median by linear interpolation. (b) Estimate the 90th percentile. 4 marks Band 4
Stuck on 1.3? Find which class interval contains the target cumulative count, then interpolate linearly within it.
2. Extended response
2.1 A school principal wants to publish a one-page report on student performance in the Year 12 trial exam. He has the following 25 marks (out of 100):
38, 42, 45, 48, 50, 52, 55, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 85, 88, 90, 92, 99
(a) Find the five-number summary and the IQR. (b) Test whether either extreme value (38 or 99) is an outlier under the 1.5 × IQR rule. (c) The principal is choosing between three single-display options for the front page: a histogram (6 equal classes of width 12 starting at 36), a box plot, or a stem-and-leaf plot. For each display, state one feature of these 25 marks it shows clearly and one feature it hides. Conclude with a recommendation and a single-sentence justification using the lesson principle "No single representation tells the whole story".
8 marks Band 5-6
Explicit marking criteria
Part (a) — 2 marks
• 1 mark — correct min/max (38, 99) and median (= 13th value = 68).
• 1 mark — correct Q₁ (= mean of 6th and 7th values = 53.5) and Q₃ (= mean of 19th and 20th values = 81), so IQR = 27.5.
Part (b) — 2 marks
• 1 mark — computes lower fence = 53.5 − 1.5(27.5) = 12.25 and upper fence = 81 + 1.5(27.5) = 122.25.
• 1 mark — explicitly concludes neither 38 nor 99 is an outlier (38 ≥ 12.25 and 99 ≤ 122.25).
Part (c) — 4 marks
• 1 mark — histogram: identifies a strength (e.g. shows distribution shape / symmetry) and a weakness (loses individual values; class boundaries can hide or create features).
• 1 mark — box plot: identifies a strength (compact, shows five-number summary and outliers — here, none — instantly) and a weakness (cannot reveal multiple peaks or the gap structure of the data).
• 1 mark — stem-and-leaf: identifies a strength (preserves every raw value; both shape and exact data visible) and a weakness (clumsy for large data sets; less compact for one-page report).
• 1 mark — recommendation: states a clear choice and justifies it using the principle "no single representation tells the whole story" — e.g. recommends a box plot + histogram pair, or stem-and-leaf for this small data set.
Your response:
Stuck on (c)? For each display, ask: "what does the principal's audience get to see?" and "what is lost?".
How did this worksheet feel?
What I'll revisit before next class:
1.1 — IQR & outlier test (2 marks)
Sample response. (a) IQR = Q₃ − Q₁ = 23 − 12 = 11. (b) Upper fence = 23 + 1.5(11) = 39.5. Since 38 ≤ 39.5, 38 is not an outlier.
Marking notes. 1 mark — IQR correctly computed from the five-number summary. 1 mark — upper fence quoted and explicitly compared with 38. Just writing "no, it's not an outlier" without computing the fence scores 0.5.
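The fence check in 1.1 can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper name `iqr_fences` and its parameter `k` are hypothetical and not part of the worksheet.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (iqr, lower_fence, upper_fence) for the k x IQR rule."""
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr

# Five-number summary from question 1.1: 5, 12, 18, 23, 38
iqr, lower, upper = iqr_fences(12, 23)
print(iqr, lower, upper)            # 11 -4.5 39.5

# A value is an outlier only if it lies outside the fences.
print(38 < lower or 38 > upper)     # False
```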
1.2 — Frequency density (3 marks)
Sample response. Class widths 100, 100, 300; densities 12/100 = 0.12, 24/100 = 0.24, 30/300 = 0.10. Tallest density bar = 100–200 class (density 0.24).
Marking notes. 1 mark — all three densities correct (one mark in total, not one per class). 1 mark — identifies the 100–200 class as the tallest. 1 mark — recognises that the 200–500 class has the highest frequency but the lowest density because of its width.
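As a quick check of the arithmetic in 1.2, the sketch below divides each frequency by its class width and picks the class with the largest density. The variable names are illustrative only.

```python
# Class intervals and frequencies from question 1.2
classes = [((0, 100), 12), ((100, 200), 24), ((200, 500), 30)]

# Frequency density = frequency / class width
densities = {}
for (low, high), freq in classes:
    densities[(low, high)] = freq / (high - low)

# The tallest bar in a density histogram has the largest density,
# which need not be the class with the largest frequency.
tallest = max(densities, key=densities.get)
print(tallest)   # (100, 200)
```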
1.3 — Median and 90th percentile by interpolation (4 marks)
Sample response. (a) Median position n/2 = 100, inside the 100–110 class (c.f. rises from 90 to 150). Median ≈ 100 + 10 × (100 − 90)/(150 − 90) = 100 + 10(10/60) ≈ 101.7.
(b) P₉₀ position 0.9 × 200 = 180, inside the 110–120 class (c.f. 150 → 185). P₉₀ ≈ 110 + 10 × (180 − 150)/(185 − 150) = 110 + 10(30/35) ≈ 118.6.
Marking notes. (a) 1 mark — identifies median position 100 and the 100–110 class; 1 mark — interpolation arithmetic correct. (b) 1 mark — identifies P₉₀ position 180 and the 110–120 class; 1 mark — interpolation arithmetic correct. Accept answers within ±0.2 due to rounding.
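The interpolation in 1.3 follows the same pattern for any percentile, so it can be sketched as one small function. The function name `estimate_percentile` is hypothetical. Because the table does not give the lower boundary of the first class, the sketch refuses to interpolate there rather than guess.

```python
# Cumulative-frequency table from question 1.3
upper = [80, 90, 100, 110, 120, 130]
cum = [10, 40, 90, 150, 185, 200]

def estimate_percentile(p):
    """Estimate the p-th percentile by linear interpolation within a class."""
    target = p * cum[-1] / 100            # target cumulative count
    for i in range(1, len(cum)):
        if cum[i - 1] < target <= cum[i]:
            width = upper[i] - upper[i - 1]
            fraction = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
            return upper[i - 1] + width * fraction
    # Assumption: the first class cannot be handled without its lower boundary.
    raise ValueError("target falls in the first class")

print(round(estimate_percentile(50), 1))   # 101.7
print(round(estimate_percentile(90), 1))   # 118.6
```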
2.1 — Extended response (8 marks): sample Band-6 response with annotations
Sample Band-6 response.
(a) Five-number summary. n = 25, so the median is the 13th value = 68. Excluding the median, the lower half is positions 1–12 and the upper half is positions 14–25. Q₁ = mean of the 6th and 7th values = (52 + 55)/2 = 53.5 (position (n + 1)/4 = 6.5); Q₃ = mean of the 19th and 20th values = (80 + 82)/2 = 81 (position 3(n + 1)/4 = 19.5). IQR = 81 − 53.5 = 27.5. Min = 38, Max = 99. [1 mark — min/max/median; 1 mark — Q₁/Q₃/IQR.]
(b) Outlier test. Lower fence = 53.5 − 1.5(27.5) = 12.25. Upper fence = 81 + 1.5(27.5) = 122.25. Since 38 ≥ 12.25 and 99 ≤ 122.25, neither extreme is an outlier. [1 mark — fences computed; 1 mark — explicit conclusion for both 38 and 99.]
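Parts (a) and (b) can be checked with a short Python sketch. It assumes the median-exclusive quartile method used in the sample response (for odd n, the median is left out of both halves). The function name `five_number_summary` is illustrative.

```python
from statistics import median

marks = [38, 42, 45, 48, 50, 52, 55, 58, 60, 62, 64, 66, 68,
         70, 72, 74, 76, 78, 80, 82, 85, 88, 90, 92, 99]

def five_number_summary(data):
    """Min, Q1, median, Q3, max using the median-exclusive method."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower_half = xs[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper_half = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower_half), median(xs), median(upper_half), xs[-1]

lo, q1, med, q3, hi = five_number_summary(marks)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [m for m in marks if m < lower_fence or m > upper_fence]

print(lo, q1, med, q3, hi)        # 38 53.5 68 81.0 99
print(lower_fence, upper_fence)   # 12.25 122.25
print(outliers)                   # []
```

A median-inclusive method would instead give Q₁ = 55 and Q₃ = 80; the conclusion that neither extreme is an outlier is the same under both methods.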
(c) Comparing the three displays.
Histogram (classes of width 12, starting at 36; six classes are needed to reach 99). Classes 36–48, 48–60, 60–72, 72–84, 84–96, 96–108, each including its upper boundary, have frequencies 4, 5, 6, 5, 4, 1. Shows clearly: the overall shape, which is roughly symmetric with a single peak in the 60–72 class. Hides: the exact value of each mark, and any structure finer than 12-mark bins. [1 mark — histogram strength + weakness.]
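The frequencies 4, 5, 6, 5, 4, 1 correspond to classes that include their upper boundary (so 48 is counted in 36–48). The sketch below counts the marks under that convention and under the opposite one, to illustrate how class boundaries can change what a histogram shows. Variable names are illustrative.

```python
marks = [38, 42, 45, 48, 50, 52, 55, 58, 60, 62, 64, 66, 68,
         70, 72, 74, 76, 78, 80, 82, 85, 88, 90, 92, 99]

edges = list(range(36, 109, 12))        # 36, 48, 60, 72, 84, 96, 108
bins = list(zip(edges, edges[1:]))      # six classes of width 12

# Convention A: each class includes its upper boundary, lo < m <= hi
upper_inclusive = [sum(1 for m in marks if lo < m <= hi) for lo, hi in bins]

# Convention B: each class includes its lower boundary, lo <= m < hi
lower_inclusive = [sum(1 for m in marks if lo <= m < hi) for lo, hi in bins]

print(upper_inclusive)   # [4, 5, 6, 5, 4, 1]
print(lower_inclusive)   # [3, 5, 6, 6, 4, 1]
```

Marks that sit exactly on a boundary (48, 60, 72) move between classes when the convention changes, so a response should state which convention it uses.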
Box plot. Whiskers at 38 and 99; box from 53.5 to 81 with internal line at 68. Shows clearly: centre (median 68), middle 50% spread (IQR 27.5), the absence of outliers, and roughly symmetric shape (median is close to the centre of the box). Hides: bimodality and individual values — two very different data sets could produce this identical box plot. [1 mark — box plot strength + weakness.]
Stem-and-leaf (key 3 | 8 = 38).
3 | 8
4 | 2 5 8
5 | 0 2 5 8
6 | 0 2 4 6 8
7 | 0 2 4 6 8
8 | 0 2 5 8
9 | 0 2 9
Shows clearly: every individual mark, the central peak in the 60s–70s, and the gap between 92 and the top mark of 99. Hides: nothing critical for 25 values, but it takes more space than the other two displays and is less compact for a one-page summary. [1 mark — stem-and-leaf strength + weakness.]
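For checking a hand-drawn plot, the stem-and-leaf display above can be regenerated with a minimal sketch like the following. It assumes two-digit marks with the tens digit as the stem, and it simply omits any stem that has no leaves (not an issue for this data set).

```python
from collections import defaultdict

marks = [38, 42, 45, 48, 50, 52, 55, 58, 60, 62, 64, 66, 68,
         70, 72, 74, 76, 78, 80, 82, 85, 88, 90, 92, 99]

# Group the units digits (leaves) under each tens digit (stem).
stems = defaultdict(list)
for m in sorted(marks):
    stems[m // 10].append(m % 10)

lines = [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in sorted(stems.items())]
print("\n".join(lines))
```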
Recommendation. Because "no single representation tells the whole story", I recommend the principal print a box plot for the headline at-a-glance summary plus a small stem-and-leaf in the appendix: the box plot conveys centre and spread instantly, while the stem-and-leaf preserves every value for any teacher who wants to drill into individual marks. With only 25 students the stem-and-leaf remains compact, so this combination loses nothing the histogram would have shown. [1 mark — clear recommendation that explicitly invokes the lesson principle and justifies the pairing.]
Total: 8/8.
Band descriptors for marker.
Band 3: Computes most of (a) and (b) but with arithmetic slips; for (c) names displays without distinguishing strengths from weaknesses. ≈ 3-4 marks.
Band 4: (a) and (b) fully correct; in (c) one strength/weakness per display, but the recommendation is generic and does not invoke the lesson principle. ≈ 5-6 marks.
Band 5: All parts correct; (c) recommends a single display with a complete justification but does not pair displays or quote the principle. ≈ 7 marks.
Band 6: Full computational accuracy in (a) and (b); in (c) clearly states one specific strength and one specific weakness for each display, and concludes with a justified recommendation that explicitly references "no single representation tells the whole story" (or equivalent) and pairs / sequences displays to cover both shape and individual values. 8/8.