Mathematics Standard • Year 11 • Module 4 • Lesson 2

Collecting Data — Past-Paper Style

Practise HSC Mathematics Standard 2-style writing on sampling and bias — multi-mark short answers and one structured extended response.

1. Short-answer questions

1.1 A NSW school has 800 students: 480 in the senior school and 320 in the junior school. The school counsellor wants to interview a stratified random sample of 50 students about wellbeing. Calculate the number of senior students and the number of junior students to interview.    2 marks    Band 3

1.2 A radio station invites listeners to call in and vote on a controversial city policy. After two hours, 8,400 votes have been counted and the station reports "the public has spoken — 73% support the policy." Identify the sampling method, name two distinct sources of bias, and in one sentence explain why the 73% headline is misleading.    3 marks    Band 3-4

1.3 A researcher uses a list of all 12,000 patients of a medical practice and selects every 60th patient for a study on blood pressure.
(a) Name the sampling method and calculate the sample size.
(b) Explain in one sentence one risk specific to this sampling method when the list is ordered by date of registration.    3 marks    Band 4

Stuck on 1.3(b)? Systematic sampling can lock in a hidden periodic pattern in the list: if registrations follow a regular cycle that lines up with the sampling interval, the same kind of patient is picked every time.

2. Extended response

2.1 A NSW council wants to estimate the average weekly grocery spend of households in its local government area (LGA). The LGA has 24,000 households, with the following breakdown by suburb group:

Inner suburbs: 9,600 households

Middle suburbs: 8,400 households

Outer suburbs: 6,000 households

The council has access to three possible data-collection plans:

Plan A: Mail a paper survey to every household; analyse only those returned.

Plan B: Set up a stall at the largest shopping centre on Saturday and survey 600 shoppers as they leave.

Plan C: Stratified random sample of 600 households across the three suburb groups, with phone interviews and in-person follow-up to achieve a 100% response rate.

(a) For Plan C, calculate how many households should be sampled from each of the three strata.
(b) Identify the main bias in Plan A and the main bias in Plan B, naming the bias type in each case.
(c) Recommend the best plan with reasoning. Your recommendation must reference the strengths of stratified sampling, the response-rate problem with Plan A, and the selection problem with Plan B. End with a clear conclusion sentence.    7 marks    Band 5-6

Explicit marking criteria

Part (a) — 2 marks

1 mark — correct method (population fraction × 600 in each stratum).

1 mark — all three numbers correct (Inner = 240, Middle = 210, Outer = 150) and sum to 600.

Part (b) — 2 marks

1 mark — names Plan A's bias as non-response bias with a one-line explanation.

1 mark — names Plan B's bias as selection bias (or convenience-sample bias) with a one-line explanation.

Part (c) — 3 marks

1 mark — references the strength of stratified sampling (proportional representation across suburb groups).

1 mark — explicitly contrasts Plan C's 100% response with Plan A's low return rate and Plan B's restricted population.

1 mark — explicit conclusion sentence naming Plan C as the best choice and summarising why.

Your response:

Stuck on (c)? Three pillars: stratification (representativeness), response rate (data quality), restricted populations (selection bias). Hit each, then write a clear "Plan ___ is best because ___" sentence.

How did this worksheet feel?

What I'll revisit before next class:

Answers — sample responses + marking notes

1.1 — Senior/junior stratified sample of 50 (2 marks)

Sample response.
Senior fraction: 480/800 = 0.60 → 0.60 × 50 = 30 senior students.
Junior fraction: 320/800 = 0.40 → 0.40 × 50 = 20 junior students.
Check: 30 + 20 = 50. ✓

Marking notes. 1 mark — correct method shown (proportion × 50). 1 mark — both numbers correct with the totals check.
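
The proportion × sample size method in the sample response can be sketched in Python. This is a minimal illustration, not part of the marking scheme: the function name `proportional_allocation` and the dictionary layout are assumptions made for the example, and the function refuses to guess a rounding rule when a share is not a whole number.

```python
from fractions import Fraction


def proportional_allocation(strata_sizes, sample_size):
    """Return the sample count for each stratum, proportional to its size."""
    total = sum(strata_sizes.values())
    allocation = {}
    for name, size in strata_sizes.items():
        # Exact fractions avoid floating-point rounding in the share.
        share = Fraction(size, total) * sample_size
        if share.denominator != 1:
            raise ValueError(f"{name}: share {share} is not a whole number")
        allocation[name] = int(share)
    return allocation


# Figures from question 1.1.
school = {"senior": 480, "junior": 320}
allocation = proportional_allocation(school, 50)
print(allocation)                # {'senior': 30, 'junior': 20}
print(sum(allocation.values()))  # 50, the totals check
```

The final line mirrors the totals check in the sample response: the stratum counts must add back to the required sample size.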

1.2 — Radio call-in poll (3 marks)

Sample response. Sampling method = self-selected (voluntary response) sample. Two sources of bias: (i) self-selection bias — only listeners motivated enough to call in are heard; (ii) selection bias — the sample is restricted to that station's listenership, which is not the wider public. The 73% headline is misleading because the sample is not representative of the public — it reflects the opinion of motivated listeners of one radio station, not "the public".

Marking notes. 1 mark — correctly names sampling method. 1 mark — two distinct biases named. 1 mark — clear explanation of why the headline is misleading (must mention either the listenership restriction or the motivation effect).

1.3 — Every 60th patient (3 marks)

(a) Sample response. Method = systematic sample. Sample size = 12,000 ÷ 60 = 200 patients.

(b) Sample response. If the list is ordered by registration date and registrations follow a regular cycle (for example, roughly 60 new patients each week), every 60th patient may have registered at the same point in that cycle, so the sample could over-represent one type of patient and introduce a hidden periodic bias.

Marking notes. (a) 1 mark — names systematic; 1 mark — correct sample size calculation. (b) 1 mark — identifies the risk that an ordered list can lock in a hidden pattern.
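
The sample-size calculation in (a) and the hidden-pattern risk in (b) can be illustrated with a short Python sketch. The helper `systematic_sample` and the "every 6th entry shares a trait" pattern are hypothetical, invented only to show how a repeating pattern can line up with the sampling interval; they are not data from the question.

```python
def systematic_sample(population_size, step, start=0):
    """Return the list positions chosen by taking every `step`-th entry."""
    return list(range(start, population_size, step))


# Part (a): every 60th patient from a list of 12,000.
positions = systematic_sample(12_000, 60)
sample_size = len(positions)
print(sample_size)  # 200

# Part (b), hypothetical pattern: suppose every 6th entry on the list
# shares some trait. Because 60 is a multiple of 6, a start at position 0
# selects only entries with that trait.
has_trait = [position % 6 == 0 for position in range(12_000)]
population_share = sum(has_trait) / len(has_trait)
sample_share = sum(has_trait[p] for p in positions) / sample_size
print(population_share)  # about 0.167 of the whole list
print(sample_share)      # 1.0 of the sample
```

In this made-up case one sixth of the list has the trait but every sampled patient has it, which is the kind of locked-in pattern the marking note refers to.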

2.1 — Grocery-spend study (7 marks): sample Band-6 response with annotations

Sample Band-6 response.

(a) Stratified allocation for Plan C.

Inner: 9,600 / 24,000 × 600 = 240 households.
Middle: 8,400 / 24,000 × 600 = 210 households.
Outer: 6,000 / 24,000 × 600 = 150 households.
Check: 240 + 210 + 150 = 600. ✓ [1 mark — method; 1 mark — all three correct + total check.]
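
Once the allocation is known, the random selection within each stratum could be carried out along the lines below. This is only a sketch under stated assumptions: the household ID ranges are hypothetical (the question gives counts, not a household list), and the fixed seed is there purely so the example is repeatable.

```python
import random

# Hypothetical household IDs: the question gives only the count per stratum.
strata = {
    "inner": range(0, 9_600),         # 9,600 households
    "middle": range(9_600, 18_000),   # 8,400 households
    "outer": range(18_000, 24_000),   # 6,000 households
}
allocation = {"inner": 240, "middle": 210, "outer": 150}

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Simple random sample, without replacement, inside each stratum.
sample = {
    name: rng.sample(households, allocation[name])
    for name, households in strata.items()
}

sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)                # {'inner': 240, 'middle': 210, 'outer': 150}
print(sum(sizes.values()))  # 600
```

Sampling separately inside each stratum is what fixes the 240 : 210 : 150 split; a single random draw of 600 from all 24,000 households would only match those proportions on average.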

(b) Identifying the bias in each plan.

Plan A — non-response bias. Most households will not return the mailed survey; those who do are likely to differ systematically (perhaps older or more engaged residents), so the sample of "returners" is not representative of all households. [1 mark.]

Plan B — selection bias (convenience sample). Surveying only Saturday shoppers at the largest shopping centre excludes households who shop at smaller stores, who shop online, or who shop on weekdays — and excludes any non-shopping household members entirely. [1 mark.]

(c) Recommendation.

Plan C is strongest because the stratified random allocation guarantees that inner, middle, and outer suburbs are represented in correct proportion (240:210:150), so no single suburb group can dominate the sample. [1 mark — strength of stratification.]

In addition, Plan C's phone + in-person follow-up achieves a 100% response rate — a 600-household sample with full response is far more reliable than Plan A's mailed census where only a self-selected fraction returns, and it avoids Plan B's selection problem of sampling only Saturday shoppers at one centre. [1 mark — contrasts response rate and restricted population.]

Conclusion: Plan C is the best choice because it combines proportional representation across all three suburb groups with a 100% response rate, eliminating both the non-response bias of Plan A and the selection bias of Plan B. [1 mark — explicit conclusion.]

Total: 7/7.

Band descriptors for marker.

Band 3: Calculates (a) correctly but identifies only one bias and gives no clear recommendation. ≈ 3 marks.

Band 4: (a) correct, both biases named correctly, but the (c) recommendation is a single sentence and references only one of the three pillars. ≈ 5 marks.

Band 5: (a) and (b) fully correct; (c) covers all three pillars but the conclusion sentence is missing or hedged. ≈ 6 marks.

Band 6: Complete — calculations correct with check, both biases named with explanations, (c) covers stratification, response rate, restricted population, AND a clear conclusion sentence naming Plan C as best. 7/7.