Mathematics Advanced • Year 12 • Module 5 • Lesson 8

Comparing Data Sets

Practise HSC-style writing on z-scores and comparative analysis — including a structured extended response on the fairness of raw-mark comparisons.

Master · Past-Paper Style

1. Short-answer questions

1.1 Two classes sat the same test. Class A: x̄ = 68, s = 12. Class B: x̄ = 72, s = 6. A student in A scored 80 and a student in B scored 78. (a) Calculate each z-score. (b) Which student performed better relative to their class?

2 marks Band 3

1.2 Parallel box plots for two basketball teams show: Sharks five-number summary 62, 78, 85, 92, 110; Eagles 70, 82, 88, 94, 105. Write a 3-sentence comparison using Centre → Spread → Shape.

3 marks Band 3-4

1.3 A national maths test has x̄ = 500 and s = 100. (a) A student scores 650 — find z. (b) Another student in a school test has z = 1.5 with x̄ = 60 and s = 10 — find their raw mark. (c) Which student has the better relative performance?

4 marks Band 4

Stuck on 1.3(c)? Compare the two z-scores directly.

2. Extended response

2.1 A Year 12 student wins three subject prizes by scoring the highest raw mark in three subjects taken at the same school. The school cohort summary statistics for each subject are:

Subject	Cohort size	Cohort mean	Cohort SD	Student's raw mark
Mathematics Advanced	120	62	14	92
English Advanced	180	74	8	90
Visual Arts	40	82	5	91

(a) Calculate the student's z-score for each subject.

(b) Convert each z-score to the raw mark that would be needed in Mathematics Advanced to match the relative performance the student achieved in that subject.

(c) The Principal proposes to rank dux candidates by average raw mark across these three subjects. The student instead argues the cohort should be ranked by average z-score. Use your results in (a) and (b), and the Real-World Anchor about ATAR scaling, to write a structured response that (i) compares the two ranking systems for this student, (ii) explains why raw-mark comparisons can be statistically misleading across subjects with different cohort means and SDs, and (iii) reaches a recommendation.

8 marks Band 5-6

Explicit marking criteria

Part (a) — 2 marks

• 2 marks — three z-scores correctly computed: Maths Adv ≈ 2.14, English Adv = 2.00, Visual Arts = 1.80 (1 mark for two correct, 2 marks for all three).

Part (b) — 2 marks

• 1 mark — applies x = 62 + z(14) for each subject (correct inverse formula in the Maths Adv context).

• 1 mark — produces all three equivalent raw marks (≈ 92, 90, 87.2) and notes that the student's actual Maths Adv mark (92) is the highest of the three on this common scale.

Part (c) — 4 marks

• 1 mark — direct comparison: computes the student's average raw mark (≈ 91.0) and average z-score (≈ 1.98) and notes that both ranking systems would favour this student strongly.

• 1 mark — why raw averages mislead: explains that raw marks live on different scales (different means and SDs by subject), so a 90 in English (z = 2) and a 91 in Visual Arts (z = 1.8) are not equivalent relative performances even though their raw values are similar.

• 1 mark — link to ATAR Real-World Anchor: explicitly references the lesson principle that "raw HSC marks across different subjects are statistically misleading without scaling" and explains why z-scores (or scaled marks) restore fairness.

• 1 mark — recommendation: recommends ranking by average z-score (or equivalently scaled marks), with a one-sentence justification that explicitly handles a hypothetical case where one student's raw average could exceed another's despite a lower average z-score.

Your response:

Stuck on (c)? Construct a hypothetical Student 2 whose raw average is higher but z-average is lower — does the Principal's system get it "right"?

How did this worksheet feel?

Got it · Partly · Lost

What I'll revisit before next class:

Answers — sample responses + marking notes

1.1 — Class A vs Class B (2 marks)

Sample response. (a) z_A = (80 − 68)/12 = 1.00; z_B = (78 − 72)/6 = 1.00. (b) The two students have equal relative performance — each scored exactly 1 SD above their own class mean. Raw marks alone (80 vs 78) cannot decide because the classes have different means and spreads.

Marking notes. 1 mark — both z-scores correct. 1 mark — conclusion is "equal" with a one-sentence justification appealing to "1 SD above the mean in both cases". Students who say "Class A student is better because 80 > 78" score 0.
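The z-score arithmetic in 1.1 can be checked with a few lines of Python. This is only a sketch for self-checking; the function name `z_score` is an assumption made here, not something defined in the lesson.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Values taken from question 1.1.
z_a = z_score(80, 68, 12)  # Class A student
z_b = z_score(78, 72, 6)   # Class B student

print(f"z_A = {z_a:.2f}, z_B = {z_b:.2f}")
print("Equal relative performance" if z_a == z_b else "Different relative performance")
```

Both students come out at the same z-score, which is why the 80 vs 78 raw-mark comparison cannot separate them.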

1.2 — Sharks vs Eagles (3 marks)

Sample response. Centre: The Eagles' median score (88) is higher than the Sharks' (85), so the Eagles typically score more. Spread: The Eagles' IQR (12) and range (35) are both narrower than the Sharks' (14 and 48), so the Eagles are more consistent. Shape: Both distributions are roughly symmetric, with the median central in each box and whiskers of similar length on each side (Sharks 16 below Q₁ and 18 above Q₃; Eagles 12 and 11), so neither team's scores show a marked skew, although the Sharks' longer whiskers mean more extreme games at both ends.

Marking notes. 1 mark per sentence (Centre / Spread / Shape) for a comparison sentence that names a statistic and is set in context (basketball scoring). Bald statements like "the medians differ" with no number score 0.5; statements without context score 0.5.
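The statistics behind a Centre → Spread → Shape comparison can be read straight off each five-number summary. The sketch below shows one way to do that; the helper name `spread_summary` and its dictionary keys are assumptions made for illustration.

```python
def spread_summary(five_number):
    """Derive comparison statistics from (min, Q1, median, Q3, max)."""
    minimum, q1, median, q3, maximum = five_number
    return {
        "median": median,
        "iqr": q3 - q1,
        "range": maximum - minimum,
        "lower_whisker": q1 - minimum,  # length of the whisker below the box
        "upper_whisker": maximum - q3,  # length of the whisker above the box
    }


# Five-number summaries from question 1.2.
sharks = spread_summary((62, 78, 85, 92, 110))
eagles = spread_summary((70, 82, 88, 94, 105))

for team, stats in (("Sharks", sharks), ("Eagles", eagles)):
    print(team, stats)
```

Comparing the lower and upper whisker lengths for each team gives a quick check on any claim about skew before it goes into the Shape sentence.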

1.3 — National test vs school test (4 marks)

Sample response. (a) z = (650 − 500)/100 = 1.50. (b) x = 60 + 1.5(10) = 75. (c) Both students have z = 1.50, so their relative performance is equal; despite scoring 650 vs 75, each student is exactly 1.5 SDs above their own cohort's mean.

Marking notes. (a) 1 mark — z correct. (b) 1 mark — raw mark correct via inverse formula. (c) 2 marks — explicit comparison of z-scores and a one-sentence justification that "raw scores live on different scales" or equivalent. A student who concludes "650 > 75 so the national test student is better" scores 0/2 on (c).
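Parts (a) and (b) of 1.3 use the z-score formula in both directions: z = (x − x̄)/s, and its inverse x = x̄ + z·s. A minimal sketch, with the function names `z_score` and `raw_mark` assumed here for illustration:

```python
def z_score(x, mean, sd):
    """Forward formula: z = (x - mean) / sd."""
    return (x - mean) / sd


def raw_mark(z, mean, sd):
    """Inverse formula: x = mean + z * sd."""
    return mean + z * sd


# Values taken from question 1.3.
z_national = z_score(650, 500, 100)  # national test student
school_mark = raw_mark(1.5, 60, 10)  # school test student with z = 1.5

print(f"National test z-score: {z_national:.2f}")
print(f"School test raw mark:  {school_mark:.0f}")
```

Feeding the school mark back through `z_score` with the school's mean and SD should return 1.5, which is a useful way to confirm the inverse formula was applied correctly.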

2.1 — Extended response (8 marks): sample Band-6 response with annotations

Sample Band-6 response.

(a) z-scores.

Mathematics Advanced: z = (92 − 62)/14 ≈ 2.14.
English Advanced: z = (90 − 74)/8 = 2.00.
Visual Arts: z = (91 − 82)/5 = 1.80. [2 marks — all three z-scores correct.]

(b) Equivalent Maths Adv raw marks. Apply x = x̄_MA + z · s_MA = 62 + z(14):

Maths Adv (z = 2.14): 62 + 2.14(14) ≈ 92. ✓ (matches the actual raw mark).
English Adv (z = 2.00): 62 + 2.00(14) = 90.
Visual Arts (z = 1.80): 62 + 1.80(14) = 87.2. [1 mark — inverse formula in the Maths Adv frame; 1 mark — all three equivalent marks, with the comment that the student's Maths Adv mark is the highest on this common scale.]
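The working for (a) and (b) can be reproduced for all three subjects at once. The sketch below takes the cohort statistics from the table in question 2.1; the variable names (`subjects`, `equivalents`, `MA_MEAN`, `MA_SD`) are assumptions made for illustration.

```python
# (cohort mean, cohort SD, student's raw mark) from the table in question 2.1.
subjects = {
    "Mathematics Advanced": (62, 14, 92),
    "English Advanced": (74, 8, 90),
    "Visual Arts": (82, 5, 91),
}

# Common scale: the Mathematics Advanced cohort.
MA_MEAN, MA_SD = 62, 14

equivalents = {}
for name, (mean, sd, mark) in subjects.items():
    z = (mark - mean) / sd                     # part (a): z-score in own subject
    equivalents[name] = MA_MEAN + z * MA_SD    # part (b): same z on the Maths Adv scale
    print(f"{name}: z = {z:.2f}, equivalent Maths Adv mark = {equivalents[name]:.1f}")

best = max(equivalents, key=equivalents.get)
print("Strongest relative performance:", best)
```

Putting every result on one scale makes the ordering of the three performances visible even though the raw marks (92, 90, 91) are nearly identical.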

(c) Which ranking system should the Principal use?

Direct comparison for this student: average raw mark = (92 + 90 + 91)/3 ≈ 91.0. Average z-score = (2.14 + 2.00 + 1.80)/3 ≈ 1.98. [1 mark — both averages computed.]

Why raw averages mislead. The three raw marks (92, 90, 91) look almost identical, but they correspond to different relative standings: the Maths Adv mark sits 30 marks above a cohort mean of 62, which is about 2.14 SDs (SD 14), whereas the Visual Arts mark sits only 9 marks above a cohort mean of 82, which is 1.80 SDs (SD 5). The Maths Adv result is therefore the more impressive relative performance. Treating the marks as if they live on the same scale ignores this. [1 mark — explains that the raw scale differs between subjects.]

This is exactly the principle in the lesson's Real-World Anchor on ATAR scaling: "comparing raw HSC marks across different subjects is statistically misleading without scaling". Scaled marks (built from z-score-like transformations) ensure that the same relative performance in any subject — whether a tough cohort like Physics or an easier cohort like Visual Arts — is treated equivalently. [1 mark — explicit link to the lesson principle and to scaling.]

Recommendation. Rank dux candidates by average z-score (equivalently, by scaled marks), not by average raw mark. Consider a hypothetical Student 2 who scores 97, 97, 97 in three "easy" subjects, each with cohort mean 95 and SD 2. Their average raw mark is 97 (higher than this student's 91), but each z-score is (97 − 95)/2 = 1.00, so their average z-score is only 1.00 against this student's 1.98. The Principal's system would rank Student 2 first even though their relative performance is clearly weaker. A system that ranks by raw average will therefore systematically reward students who pick the "easiest" cohorts. Ranking by z-score (or scaled marks) prevents that perverse incentive and rewards the genuine relative performance the dux award is meant to recognise. [1 mark — clear recommendation, justified with a counter-example that exposes the perverse-incentive flaw in raw-mark averaging.]

Total: 8/8.
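The two ranking systems in (c) can be compared numerically. The sketch below uses the student's actual results from question 2.1 and a hypothetical Student 2; the marks of 97 in cohorts with mean 95 and SD 2 are illustrative assumptions chosen here, as is the helper name `average_raw_and_z`.

```python
def average_raw_and_z(results):
    """results: list of (raw mark, cohort mean, cohort SD) tuples.

    Returns (average raw mark, average z-score).
    """
    n = len(results)
    raw_avg = sum(mark for mark, _, _ in results) / n
    z_avg = sum((mark - mean) / sd for mark, mean, sd in results) / n
    return raw_avg, z_avg


# Student 1: the prize winner from question 2.1.
student_1 = [(92, 62, 14), (90, 74, 8), (91, 82, 5)]

# Student 2: hypothetical, assumed for illustration only.
student_2 = [(97, 95, 2), (97, 95, 2), (97, 95, 2)]

raw_1, z_1 = average_raw_and_z(student_1)
raw_2, z_2 = average_raw_and_z(student_2)

print(f"Student 1: raw average = {raw_1:.1f}, z average = {z_1:.2f}")
print(f"Student 2: raw average = {raw_2:.1f}, z average = {z_2:.2f}")
print("Raw-average ranking puts first:", "Student 2" if raw_2 > raw_1 else "Student 1")
print("z-average ranking puts first:  ", "Student 2" if z_2 > z_1 else "Student 1")
```

With these assumed numbers the two systems disagree about who ranks first, which is the kind of case the Band 6 counter-example argument needs.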

Band descriptors for marker.

Band 3: Two of three z-scores correct; (b) uses the formula but stops short of computing all three equivalents; (c) names the issue but with no quantitative comparison or link to ATAR scaling. ≈ 3-4 marks.

Band 4: All z-scores and equivalents correct; (c) computes both averages and recommends z-scores but justifies only intuitively. ≈ 5-6 marks.

Band 5: All computations correct; (c) recommends z-scores with both a quantitative justification and a brief reference to ATAR-style scaling; missing the counter-example or perverse-incentive argument. ≈ 7 marks.

Band 6: All parts complete; (c) recommends z-scores, computes both averages, explicitly invokes the ATAR Real-World Anchor / scaling principle, and supports the recommendation with a counter-example or perverse-incentive argument showing that raw-average rankings can be systematically unfair. 8/8.