Mathematics Standard • Year 11 • Module 4 • Lesson 2
Collecting Data — Problem Set
Apply census, sampling and bias concepts to real Australian polling, market-research and public-health scenarios.
Problem 1 — Stratified election poll
A polling company wants to predict the federal election result in a NSW electorate of 90,000 enrolled voters. The age breakdown of the electorate is: 18–34 years 27,000; 35–54 years 36,000; 55+ years 27,000. They will conduct a stratified random sample of 1,000 voters.
Set up: What are we solving for?
(i) Calculate the number of voters from each age stratum the poll should sample. 2 marks
(ii) Explain in one sentence why a stratified sample is preferred to a simple random sample of 1,000 voters here. 1 mark
(iii) The company calls landlines only. Identify the bias this introduces and name the age stratum most likely to be under-represented. 2 marks
Stuck? Revisit lesson § Sampling Methods (stratified proportions) and § Types of Bias (selection bias).
Problem 2 — School canteen satisfaction survey
A NSW high school has 600 students. The student council wants to know whether students are satisfied with the canteen. They are deciding between three sampling plans.
Plan A: Hand out a paper form to every student in Roll Call on Friday morning.
Plan B: Stand outside the canteen at lunch and survey the next 100 students who walk past.
Plan C: Use the school roll to randomly pick 100 students in total, with the number taken from each year group proportional to year size, and survey them in class.
Set up: What are we solving for?
(i) Name each plan's sampling method (census, convenience, stratified random, etc.). 2 marks
(ii) Identify the main bias in Plan B and explain in one sentence how it distorts the survey. 2 marks
(iii) Which plan should the council use, and why? Write a one-sentence justification. 1 mark
Stuck? Surveying only canteen users excludes students who avoid the canteen; that is where the bias lives.
Problem 3 — Energy-drink street survey
A marketing company is testing a new energy drink. They stand outside a city gym at 6 am on a weekday and survey the next 80 people who walk in, asking "Would you buy this product?".
Set up: What are we solving for?
(i) Name the sampling method used. 1 mark
(ii) Identify TWO types of bias in this approach and explain each in one sentence. 2 marks
(iii) Recommend a better sampling plan in 2–3 sentences. Your plan must explain how it reduces at least one of the biases you identified. 2 marks
Stuck? Two key questions to ask: who is selected? (selection bias) and who is excluded? (also selection bias).
Problem 4 — Should the principal census or sample?
A principal of a 2,000-student school wants to know how many students travel to school by bus. The IT team can either: (a) attach a one-question survey to the next school newsletter sent to every family, or (b) randomly select 200 students and follow up personally for a 100% response rate.
Set up: What are we solving for?
(i) Identify Option (a) as census or sample, and explain what type of bias is likely to dominate it. 2 marks
(ii) Identify Option (b) as census or sample, and name the sampling method. 1 mark
(iii) Which option will give a more accurate estimate of bus usage, and why? Write a one-sentence justification referencing response rates. 2 marks
Stuck? A targeted sample with 100% response beats a census with 25% response: quality of response beats quantity sent.
Problem 5 — Design and critique a Year 11 survey
Your school has 240 Year 11 students (140 girls, 100 boys). You want to survey 60 of them about study habits. Below are two draft survey questions.
Question A: "How many hours did you study last week?"
Question B: "Most successful students study more than 20 hours a week. How many hours did you study last week?"
Set up: What are we solving for?
(i) Design a stratified random sample of 60 students from this Year 11 cohort. State the number of girls and boys. 2 marks
(ii) Compare Question A and Question B. Name the bias that Question B introduces, and explain in one sentence how a student is likely to respond differently to Question B. 2 marks
(iii) Rewrite Question B to remove the bias while still measuring study hours. 1 mark
Stuck? Revisit lesson § Types of Bias: response bias from a leading question is the most common HSC trap.
How did this worksheet feel?
What I'll revisit before next class:
Problem 1 — Stratified election poll
Set up. We are allocating 1,000 sample slots in proportion to each age stratum, then evaluating one source of bias.
(i) 18–34: 27,000/90,000 × 1,000 = 300. 35–54: 36,000/90,000 × 1,000 = 400. 55+: 27,000/90,000 × 1,000 = 300. Total = 1,000. ✓
(ii) Stratified sampling guarantees each age group is represented proportionally; a simple random sample could under-represent one group by chance, distorting the poll prediction.
(iii) Selection bias. The 18–34 stratum is most likely to be under-represented because younger voters disproportionately live in mobile-only households with no landline, so they can never be called.
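The arithmetic in part (i) can be checked with a short Python sketch. The function name `proportional_allocation` and the dictionary labels are illustrative assumptions, not part of the lesson; the rule it applies is the one used above: stratum sample = stratum size ÷ population × total sample size.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Share out sample_size across strata in proportion to each stratum's size.

    Hypothetical helper for checking worksheet answers. Results are rounded to
    whole people, so in general the total can differ slightly from sample_size
    when the shares do not divide evenly.
    """
    population = sum(strata_sizes.values())
    allocation = {}
    for name, size in strata_sizes.items():
        # stratum sample = (stratum size / population) x total sample size
        allocation[name] = round(size * sample_size / population)
    return allocation


# Problem 1: electorate of 90,000 voters, sample of 1,000
electorate = {"18-34": 27_000, "35-54": 36_000, "55+": 27_000}
poll = proportional_allocation(electorate, 1_000)
print(poll)                 # {'18-34': 300, '35-54': 400, '55+': 300}
print(sum(poll.values()))   # 1000
```

Always finish by adding the stratum samples and confirming they equal the intended sample size, as the ✓ in part (i) does.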
Problem 2 — Canteen satisfaction
Set up. We are naming each sampling method, finding the bias in Plan B, then recommending the best plan.
(i) Plan A = census. Plan B = convenience sample. Plan C = stratified random sample.
(ii) Selection bias. Surveying only students walking past the canteen excludes students who avoid the canteen entirely — those students are likely the most dissatisfied, so the survey will look more positive than reality.
(iii) Use Plan C: it samples randomly within year groups in proportion, capturing both canteen users and non-users without the selection bias of Plan B and without the non-response risk of Plan A (absent students and unreturned forms).
Problem 3 — Gym energy-drink survey
Set up. We are naming the method, finding two biases, and recommending a better plan.
(i) Convenience sample.
(ii) Two biases: Selection bias — only fitness-focused early-morning gym-goers are sampled (a very narrow slice of the wider energy-drink market). Time-of-day selection bias — 6 am attendees are not representative of all gym users; lunch-time, evening, and weekend gym-goers are excluded.
(iii) A better plan is to take a stratified random sample across (a) different gym types (CBD, suburban, regional), (b) different times of day, and (c) the wider population via shopping centres or supermarket exits. This prevents the result being driven by a single demographic and time slot.
Problem 4 — Census vs targeted sample
Set up. We are classifying each option, then judging which produces a more accurate estimate.
(i) Option (a) is a census (sent to every family). The dominant problem is non-response bias: only a fraction of families return the survey, and those who do may differ systematically from those who don't (e.g. families with no bus users may not bother replying).
(ii) Option (b) is a sample — specifically a simple random sample of 200 students with personal follow-up.
(iii) Option (b) will be more accurate because a randomly selected 200 with 100% response (200/200) gives an unbiased estimate, whereas a census with low response (e.g. 500/2,000) is heavily distorted by who chose to respond.
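The effect described in part (iii) can be illustrated with a small Python sketch. Every number in it is a hypothetical assumption chosen for illustration, not data from the problem: 800 of the 2,000 students are assumed to catch the bus, bus families are assumed to reply 40% of the time, and all other families 15% of the time.

```python
def census_estimate(bus_users, non_bus_users, bus_reply_pct, non_bus_reply_pct):
    """Return (estimated bus share, number of replies) for a newsletter census.

    Hypothetical model: each group replies at its own fixed percentage, and the
    estimate is simply the bus share among the replies received.
    """
    bus_replies = bus_users * bus_reply_pct // 100
    non_bus_replies = non_bus_users * non_bus_reply_pct // 100
    total_replies = bus_replies + non_bus_replies
    return bus_replies / total_replies, total_replies


# Assumed (made-up) school: 800 bus users, 1,200 non-bus users
true_share = 800 / 2_000
estimate, replies = census_estimate(800, 1_200, bus_reply_pct=40, non_bus_reply_pct=15)

print(replies)               # 500
print(f"{true_share:.0%}")   # 40%
print(f"{estimate:.0%}")     # 64%
```

Under these assumptions the census collects 500 replies, more than the 200 in Option (b), yet overstates bus usage because bus families were more likely to answer. A random sample of 200 with every student responding has no such systematic distortion.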
Problem 5 — Year 11 study survey
Set up. We are allocating a stratified sample, identifying response bias, then fixing the question.
(i) Girls: 140/240 × 60 = 35. Boys: 100/240 × 60 = 25. Total = 60. ✓
(ii) Response bias. Question B is leading — by stating that successful students study more than 20 hours, it pressures the respondent to inflate their study figure to look successful, biasing the data upward.
(iii) Sample rewrite: "How many hours did you study (outside of class) last week?" No prior claim, no implied benchmark, no judgement.