Lesson 1 ~25 min Unit 4 · Data & Probability +85 XP

Collecting Data

Discover how to gather information fairly. Understand populations, samples, sampling methods, and how to design unbiased surveys.

Today’s hook: Every 5 years, the Australian Bureau of Statistics (ABS) runs a census that counts every person in Australia. It costs over $500 million. Why don’t all researchers do this? And when they use a smaller sample, how do they make sure the results are trustworthy?
Think First
warm-up

How would you find out the most popular sport in your school? What would you actually do?

Record your ideas in your workbook.
1
The Big Idea
+5 XP

When we want information about a group, we can either ask everyone (a census) or carefully choose a smaller sample. Good data collection gives results we can trust.

The population is every member of the group we want to study. A sample is the subset we actually survey. Data flows: population → sample → results → conclusions about the population.

[Diagram: Population → Sample → Data → Conclusions (inferred about the population). Census: data from everyone. Survey: data from a sample. The sample must represent the population.]
Census
Data from the whole population. Accurate but expensive (ABS Census costs ~$500M).
Sample
A carefully chosen subset. Cheaper and faster — but must be representative.
Key question
Does the sample represent the whole population? If not, the conclusions cannot be trusted.
2
What You’ll Master
objectives

Know

  • The difference between population, sample, and census
  • Four sampling methods: random, systematic, stratified, convenience
  • Features of a well-designed survey question

Understand

  • Why a census is not always practical
  • How bias enters a survey through sampling or question wording
  • Why random sampling produces fairer results

Can Do

  • Identify sources of bias in a given survey scenario
  • Design fair, unbiased survey questions
  • Calculate stratified sample sizes
3
Words You Need
vocabulary
Population: The entire group being studied (e.g. all Year 8 students in Australia).
Sample: A smaller group selected from the population to represent it.
Census: Data collected from every member of the population. Accurate but costly and time-consuming.
Bias: Systematic error that makes results unrepresentative of the population.
Random sample: Every member of the population has an equal chance of selection.
Systematic sample: Every nth member of the population is selected (e.g. every 10th name on a roll).
Stratified sample: Population split into subgroups; a proportional sample is taken from each subgroup.
Convenience sample: Whoever is available is surveyed. Easiest to do but most biased method.
4
Spot the Trap
heads-up

Wrong: “A bigger sample is always better.” — A large biased sample gives worse results than a small well-chosen random sample.

Right: Sample quality matters more than size. Ask “How was the sample chosen?” before trusting any result.

Wrong leading question: “Don’t you think PE should be compulsory?” — pushes respondents towards saying “yes”.

Right neutral question: “Should PE be compulsory? Yes / No / Unsure” — gives respondents genuine, balanced options.

5
Types of Sampling
+5 XP

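Four main sampling methods. Random, systematic, and stratified sampling can each give a representative sample when done carefully; convenience sampling is the least reliable: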

1. Random
Draw names from a hat or use a random number generator. Every member has an equal chance. Best for fairness.
2. Systematic
Select every nth person on a list (e.g. every 5th name). Easy to do; fairly unbiased if the list isn’t ordered in a biased way.
3. Stratified
Divide population into subgroups (strata), take proportional samples from each. Best for representing all subgroups.

4. Convenience
Ask whoever is nearby (friends, people at the canteen). Most biased method. Use only when nothing else is possible, and always state the limitation.
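Optional: try it in code. The four methods above can be sketched in Python. This is a minimal sketch only: the roll of 40 numbered students, the two year groups, and every function name here are made up for illustration and are not part of the lesson.

```python
import random

def random_sample(population, k, seed=None):
    """Random sample: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, start=0):
    """Systematic sample: every `step`-th member, beginning at index `start`."""
    return population[start::step]

def stratified_sample(strata, fraction, seed=None):
    """Stratified sample: the same fraction is drawn at random from each subgroup."""
    rng = random.Random(seed)
    chosen = []
    for members in strata.values():
        k = round(len(members) * fraction)
        chosen.extend(rng.sample(members, k))
    return chosen

def convenience_sample(population, k):
    """Convenience sample: whoever is nearest (here, the first k on the list)."""
    return population[:k]

# Hypothetical roll: students numbered 1 to 40, split into two year groups.
roll = list(range(1, 41))
strata = {"Year 7": roll[:20], "Year 8": roll[20:]}

print(random_sample(roll, 8, seed=1))        # 8 students chosen by chance
print(systematic_sample(roll, 5, start=4))   # every 5th name: 5, 10, ..., 40
print(stratified_sample(strata, 0.2, seed=1))  # 4 from each year group
print(convenience_sample(roll, 8))           # just the first 8 names
```

Notice that the convenience sample only ever contains students 1 to 8, so anyone further down the roll has no chance of being asked.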

6
Designing Good Survey Questions
+5 XP

A good survey question is clear, neutral, and specific. Response options must cover all possibilities without overlapping.

Leading question
“Don’t you think homework is harmful?” — “Don’t you think” pushes towards “yes”.
Fix: “Do you think homework is beneficial, harmful, or neither?”
Overlapping response categories
“Hours of TV: 0–2 / 2–4 / 4–6” — the value 2 appears in two categories, so respondents don’t know which to tick.
Fix: “0 to less than 2 / 2 to less than 4 / 4 or more” — non-overlapping intervals.
Double-barrelled question
“Do you enjoy maths and science?” — you might like one but not the other, making the answer ambiguous.
Fix: Ask about maths and science in separate questions.
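Optional: try it in code. One way to check that the fixed TV-hours categories above do not overlap is to write them as a rule that returns exactly one category for any number of hours. This is a small sketch; the function name `tv_category` is made up for this example.

```python
def tv_category(hours):
    """Return the single response category that `hours` falls into."""
    if hours < 0:
        raise ValueError("hours cannot be negative")
    if hours < 2:
        return "0 to less than 2"
    if hours < 4:
        return "2 to less than 4"
    return "4 or more"

# The boundary values 2 and 4 each land in exactly one category.
for h in [0, 1.5, 2, 3.9, 4, 7]:
    print(h, "->", tv_category(h))
```

With the original categories "0–2 / 2–4 / 4–6", no rule like this could be written: 2 hours would fit two boxes, and 7 hours would fit none.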
7
Sources of Bias
+5 XP

Bias makes results unrepresentative. Know these four key sources:

  • Self-selection bias: Only people with strong opinions volunteer to respond (common in online polls).
  • Question wording bias: Leading or emotive language pushes respondents toward a particular answer.
  • Sampling bias: The sample doesn’t represent the population (e.g. only surveying canteen users about food preferences).
  • Response bias: People don’t answer honestly (e.g. underreporting unhealthy habits).
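Optional: try it in code. Self-selection bias can be shown with simple arithmetic. The numbers below are invented purely for illustration: a town of 1,000 people where 200 support a policy and are keen to vote in an online poll, while the other 800 do not support it and mostly do not bother to respond. The function name `poll_result` is also made up.

```python
def poll_result(groups):
    """groups: list of (number_of_people, supports_policy, chance_of_responding).

    Returns (true support in the population, support among expected responders).
    """
    total_people = sum(n for n, _, _ in groups)
    true_support = sum(n for n, supports, _ in groups if supports) / total_people
    responders = sum(n * p for n, _, p in groups)
    supporting_responders = sum(n * p for n, supports, p in groups if supports)
    return true_support, supporting_responders / responders

# Assumed for illustration only: supporters respond 50% of the time, others 5%.
groups = [
    (200, True, 0.50),
    (800, False, 0.05),
]

true_support, poll_support = poll_result(groups)
print(f"True support in the town: {true_support:.0%}")
print(f"Support shown by the poll: {poll_support:.0%}")
```

Under these assumptions the poll reports far higher support than really exists, simply because of who chose to respond.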
Watch Me Solve It · Stratified sample
+15 XP per step
PROBLEM
A school of 600 students has 40% Year 7, 35% Year 8, and 25% Year 9. A stratified sample of 60 is needed. How many from each year?
  1. Find the number in each year group
    Year 7: $0.40 \times 600 = 240$   Year 8: $0.35 \times 600 = 210$   Year 9: $0.25 \times 600 = 150$
    Convert percentages to actual counts first.
  2. Find the sampling fraction
    $\dfrac{\text{sample size}}{\text{population}} = \dfrac{60}{600} = \dfrac{1}{10}$
    We take 1 student for every 10 in the school.
  3. Apply the fraction to each group
    Year 7: $240 \times \tfrac{1}{10} = 24$   Year 8: $210 \times \tfrac{1}{10} = 21$   Year 9: $150 \times \tfrac{1}{10} = 15$
    Check: $24 + 21 + 15 = 60$ ✓
Answer: 24 from Year 7, 21 from Year 8, 15 from Year 9
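Optional: try it in code. The three steps above can be sketched in Python. The function name `stratified_sizes` and its parameters are made up for this example; `Fraction` is used so the arithmetic stays exact instead of producing rounding errors.

```python
from fractions import Fraction

def stratified_sizes(population, sample_size, percentages):
    """Return how many to sample from each group.

    percentages: dict mapping group name -> whole-number percentage of the population.
    """
    # Step 2: sampling fraction = sample size / population
    fraction = Fraction(sample_size, population)
    sizes = {}
    for group, percent in percentages.items():
        # Step 1: convert the percentage to an actual count
        group_size = Fraction(percent, 100) * population
        # Step 3: apply the sampling fraction to the group
        sizes[group] = group_size * fraction
    return sizes

sizes = stratified_sizes(600, 60, {"Year 7": 40, "Year 8": 35, "Year 9": 25})
for group, n in sizes.items():
    print(group, n)
print("Total:", sum(sizes.values()))  # check: should equal the sample size, 60
```

This sketch assumes the numbers work out to whole students, as they do here. If a group size came out as a fraction, you would need to decide how to round so the total still matches the sample size.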
Brain Trainer · Data Collection
4 problems
  1. A researcher surveys every 8th student on the school roll of 400. What type of sampling is this?

    Systematic sampling: every nth person on an ordered list.
    Systematic sample
  2. A town of 2,000 people has 50% adults, 30% teenagers, and 20% children. How many of each group in a stratified sample of 100?

    Sampling fraction = 100/2000 = 1/20. Adults: 1,000 ÷ 20 = 50. Teenagers: 600 ÷ 20 = 30. Children: 400 ÷ 20 = 20. Check: 50 + 30 + 20 = 100.
    50 adults, 30 teenagers, 20 children
  3. Name ONE advantage and ONE disadvantage of a census over a sample.

    Advantage: complete and accurate data, no sampling error. Disadvantage: expensive, time-consuming, impractical for large or spread-out populations.
    Accurate vs expensive
  4. Rewrite this biased question as a fair one: "How much do you hate waking up early for school?"

    The original assumes the person hates it. A fair version: "How do you feel about starting school at 8:30 am? Very positive / Positive / Neutral / Negative / Very negative"
    Neutral, balanced options
8
Common Pitfalls
heads-up
Confusing sample size with sample quality
A large biased sample (e.g. 500 gym members asked about daily exercise) gives worse results than 50 randomly chosen people from the whole community.
Fix: Always ask “How was the sample selected?” before trusting any statistic.
Not stating limitations of convenience sampling
In an exam, just saying “I surveyed my friends” without acknowledging it’s biased will lose marks.
Fix: Always name and explain the limitation when convenience sampling is used.
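Optional: try it in code. The first pitfall above (sample size versus sample quality) can be explored with a small simulation. Everything here is invented for illustration: a community of 10,000 people, of whom 1,000 are gym members, with 90% of gym members and 20% of everyone else exercising daily. The fixed seed just makes the run repeatable.

```python
import random

rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Hypothetical community: True means "exercises daily".
gym = [rng.random() < 0.90 for _ in range(1_000)]     # assumed 90% for gym members
others = [rng.random() < 0.20 for _ in range(9_000)]  # assumed 20% for everyone else
community = gym + others

def proportion(sample):
    """Fraction of the sample that exercises daily."""
    return sum(sample) / len(sample)

true_rate = proportion(community)
big_biased = proportion(rng.sample(gym, 500))          # 500 people, gym members only
small_random = proportion(rng.sample(community, 50))   # 50 people, whole community

print(f"Whole community:        {true_rate:.0%}")
print(f"500 gym members:        {big_biased:.0%}")
print(f"50 chosen at random:    {small_random:.0%}")
```

Under these assumptions, the 500 gym members give an estimate far from the whole-community figure, while the 50 randomly chosen people land much closer, even though the sample is ten times smaller.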
Copy This Into Your Book

Key Definitions

  • Population: entire group being studied
  • Sample: subset actually surveyed
  • Census: data from whole population
  • Bias: systematic error making results unrepresentative

Stratified Sample

  • Sampling fraction $= \dfrac{\text{sample size}}{\text{population}}$
  • Group sample $=$ group size $\times$ fraction
  • All group samples must sum to total sample

Sampling Methods (convenience is the least reliable)

  • Random → every member equal chance
  • Systematic → every nth person
  • Stratified → proportional from each group
  • Convenience → most biased, avoid if possible

Good Survey Questions

  • Clear and specific
  • Neutral wording (not leading)
  • Non-overlapping, exhaustive responses
  • One idea per question (not double-barrelled)
1
Which statement best describes a census?
+10 XP
2
A researcher uses a random number generator to select 30 students from a roll of 500. This is:
+10 XP
3
Which survey question is most biased?
+10 XP
4
A school wants to survey students about a new uniform, ensuring fair representation from all year groups. The best sampling method is:
+10 XP
5
A website poll finds 80% of voters support a policy. The main problem with this result is:
+10 XP
Show Your Working
9 marks total
Analyse Medium 2 MARKS

Q6. A survey question states: “Most people agree that too much homework is given. How much homework do you get? None / Some / Too much.” Identify TWO flaws in this question and explain how each creates bias.

Answer in your workbook.
Create Medium 3 MARKS

Q7. Design a fair survey to find the average daily screen time of Year 8 students. Write THREE survey questions. For each, explain one feature that makes it fair.

Answer in your workbook.
Apply Hard 4 MARKS

Q8. A school of 600 students has 40% in Year 7, 35% in Year 8, and 25% in Year 9. A stratified sample of 60 students is required. How many should come from each year group? Show all working.

Answer in your workbook.
Comprehensive Answers

Quick Check

1. B — Census collects from every member of the population.

2. C — Random number generator → random sample.

3. A — “Don’t you think…” is leading language.

4. D — Stratified sampling proportionally represents each year group.

5. B — Self-selection bias: motivated people respond, others don’t.

Model Answers

Q6 (2 marks): Flaw 1 — Leading language: “Most people agree…” biases respondents towards “Too much” before they’ve even thought about it [1 mark]. Flaw 2 — Incomplete responses: there is no option for “A lot but not too much” or “A little”; “Some” is vague and doesn’t cover all amounts [1 mark].

Q7 (3 marks): One mark per question with valid reasoning. Example Q1: “How many hours per day do you use a screen for entertainment? Less than 1 / 1 to less than 2 / 2 to less than 4 / 4 or more” (fair: neutral wording, non-overlapping, exhaustive options). Q2: “Do you use a screen for school work? Yes / No / Sometimes” (fair: separates school from entertainment use). Q3: “What type of screen do you use most? Phone / Tablet / Computer / TV” (fair: specific, no leading language).

Q8 (4 marks): Sampling fraction $= \frac{60}{600} = \frac{1}{10}$ [1]. Year 7: $240 \times \frac{1}{10} = 24$ [1]. Year 8: $210 \times \frac{1}{10} = 21$ [1]. Year 9: $150 \times \frac{1}{10} = 15$ [1]. Check: $24 + 21 + 15 = 60$ ✓

Stretch Challenge · +25 XP, +10 coins

The Canteen Survey Problem

A survey of 50 students standing outside the canteen at lunch finds 80% prefer pizza. Give two reasons this might not represent the whole school. How would you redesign the study to get a more representative result?

Solution

Reason 1: Convenience sampling. Students outside the canteen at lunch are likely canteen users, so they are more predisposed to prefer canteen food (pizza). Students who bring lunch are excluded entirely.

Reason 2: The sample of 50 is small and not randomly chosen; those who happen to be at the canteen that day may not reflect the range of food preferences across the whole school, all year groups, and all genders.

Improved design: Use stratified random sampling, taking a proportional sample from each year group across multiple days and times, surveying students both in and away from the canteen.

Quick Review

Census

Data from every member of the population

Sample

A subset of the population, chosen to represent it

Random sampling

Every member has an equal chance of selection

Stratified

Proportional samples from each subgroup

Bias

Systematic error making results unrepresentative

Fair questions

Clear, neutral, specific, non-overlapping options

Badges This Lesson

Data Detective
Survey Designer
Bias Buster
Sample Selector
Question Crafter
Stats Starter