Mathematics • Year 9 • Unit 4 • Lesson 10

Data Collection and Sampling

Build fluency with the four sampling methods (simple random, stratified, systematic, convenience) and the three types of bias (selection, response, non-response) — one step at a time, from a fully worked stratified-sample example through guided practice to independent problems.

Build · I Do / We Do / You Do

1. I do — fully worked example

Read every line. Each step is followed by a short reason so you can see why, not just what.

Problem. A company has $1000$ employees: $600$ full-time, $300$ part-time, $100$ casual. Design a stratified sample of 50 employees that reflects the make-up of the workforce.

Step 1 — Identify the strata.

Three groups: full-time, part-time, casual.

Reason: stratified sampling divides the population into meaningful groups (strata) and samples from each in proportion.

Step 2 — Find each group's proportion.

Full-time: $600/1000 = 0.6$ (or $60\%$)
Part-time: $300/1000 = 0.3$ (or $30\%$)
Casual: $100/1000 = 0.1$ (or $10\%$)

Reason: each group's share of the total population determines its share of the sample.

Step 3 — Multiply each proportion by the sample size $50$.

Full-time: $0.6 \times 50 = 30$ employees
Part-time: $0.3 \times 50 = 15$ employees
Casual: $0.1 \times 50 = 5$ employees

Reason: this gives a sample that mirrors the workforce — no group is over- or under-represented.

Step 4 — Check the totals.

$30 + 15 + 5 = 50$ ✓

Reason: always check the parts add to the whole sample. If they don't, recheck your calculations and any rounding.

Answer: $\mathbf{30}$ full-time, $\mathbf{15}$ part-time, $\mathbf{5}$ casual.
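Optional computer check. Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration: the function name `stratified_allocation` and the dictionary layout are made up for this sketch and are not part of the lesson.

```python
from fractions import Fraction

def stratified_allocation(strata, sample_size):
    """Return each stratum's exact share of the sample (Steps 2 and 3)."""
    total = sum(strata.values())
    # proportion of the population x sample size, kept as an exact fraction
    return {name: Fraction(count, total) * sample_size
            for name, count in strata.items()}

workforce = {"full-time": 600, "part-time": 300, "casual": 100}
allocation = stratified_allocation(workforce, 50)

# Step 4: the parts must add up to the whole sample.
assert sum(allocation.values()) == 50

for name, share in allocation.items():
    print(name, share)
```

Exact fractions are used instead of decimals so that the Step 4 check is not thrown off by rounding.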

Stuck? Revisit lesson § "Sampling Methods": stratified sampling is the second method listed and is widely used in real research.

2. We do — fill in the missing steps

Same structure as Section 1, but with the working faded. Fill in each blank line. 5 marks

Problem. A high school has $600$ students: $300$ in Year 9, $200$ in Year 10, $100$ in Year 11. Design a stratified sample of $60$ students.

Step 1 — Identify the strata: the three year groups: Year __, Year __, Year __.

Step 2 — Find each proportion:

Year 9: $\dfrac{300}{600} = \_\_\_\_$

Year 10: $\dfrac{200}{600} = \_\_\_\_$ (about $33.3\%$)

Year 11: $\dfrac{100}{600} = \_\_\_\_$

Step 3 — Multiply by sample size $60$:

Year 9: $\_\_\_\_ \times 60 = \_\_\_\_$

Year 10: $\_\_\_\_ \times 60 = \_\_\_\_$

Year 11: $\_\_\_\_ \times 60 = \_\_\_\_$

Step 4 — Check total: __________ $= 60$ ✓ (or note any rounding needed).

Stuck? Revisit lesson § "Check Understanding" — this exact problem is the worked Check.

3. You do — independent practice

Show your working in the space under each problem. The first four are foundation (simple proportion or definitions). The middle two are standard (identify bias). The last two are extension (design + critique).

Foundation — definitions and proportion

3.1 Define population and sample in one short sentence each.    1 mark

3.2 A factory has $400$ workers. Calculate the sampling fraction if a sample of $80$ is taken.    1 mark

3.3 A school has $200$ boys and $300$ girls. A stratified sample of $50$ is taken. How many boys? How many girls?    1 mark

3.4 Name the sampling method used in each scenario: (a) selecting every $10$th name on a list, (b) asking the first $20$ people who walk by, (c) drawing names from a hat.    1 mark

Standard — identify the bias

3.5 A survey asks: "Don't you think the new uniform policy is unfair to students?" What type of bias is present and why?    2 marks

3.6 A researcher phones random households at 10 am on a Tuesday to ask about employment. What type of bias is most likely, and what kinds of people will be under-represented?    2 marks

Extension — push your thinking

3.7 A town has $4500$ residents: $1500$ under 18, $2000$ aged $18$-$65$, $1000$ over $65$. Design a stratified sample of $150$. Show your calculations and verify the total.    3 marks

3.8 A news website posts an online poll asking "Should the federal government raise taxes?" After $24$ hours, $80\%$ of voters say "no". Explain (a) which sampling method this is closest to, (b) at least two reasons why the result is biased, and (c) what would be a more reliable way to estimate public opinion.    3 marks

Stuck on 3.8? Revisit lesson § "Common Misconceptions" — the third misconception listed addresses online polls directly.

How did this worksheet feel?

What I'll revisit before next class:

Answers — Do not peek before attempting

Section 2 — We do (school 600 → sample 60)

Step 1: Year 9, Year 10, Year 11.
Step 2: Year 9 $= \mathbf{0.5}$, Year 10 $= \dfrac{1}{3} \approx \mathbf{0.333}$, Year 11 $= \dfrac{1}{6} \approx \mathbf{0.167}$.
Step 3: Year 9 $= 0.5 \times 60 = \mathbf{30}$; Year 10 $= \dfrac{1}{3} \times 60 = \mathbf{20}$; Year 11 $= \dfrac{1}{6} \times 60 = \mathbf{10}$.
Step 4: $30 + 20 + 10 = \mathbf{60}$ ✓.
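A note on rounding: multiplying a rounded decimal such as $0.333$ by $60$ gives about $19.98$, not exactly $20$. Working with the exact fraction avoids this. A minimal Python sketch of the difference (the variable names are illustrative only):

```python
from fractions import Fraction

year_10_share = Fraction(200, 600)   # simplifies to exactly 1/3
year_11_share = Fraction(100, 600)   # simplifies to exactly 1/6

exact_year_10 = year_10_share * 60   # exact arithmetic gives a whole number
exact_year_11 = year_11_share * 60
rounded_year_10 = 0.333 * 60         # rounded decimal lands just short of 20

print(exact_year_10, exact_year_11, rounded_year_10)
```

If your calculator answer is very close to a whole number, go back to the fraction before deciding the sample size.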

3.1 — Population vs sample

Population: the entire group being studied. Sample: a subset of the population used to make inferences about the whole.

3.2 — Sampling fraction 80/400

$\dfrac{80}{400} = \dfrac{1}{5}$ (or $20\%$).

3.3 — Boys 200, girls 300, sample 50

Total = 500. Boys: $\dfrac{200}{500} \times 50 = \mathbf{20}$. Girls: $\dfrac{300}{500} \times 50 = \mathbf{30}$. Check: $20 + 30 = 50$ ✓.

3.4 — Name the sampling method

(a) Systematic sampling (every $n$th member).
(b) Convenience sampling (whoever is handy — usually biased).
(c) Simple random sampling (every member has an equal chance).
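To see how the three methods differ in practice, here is a hedged Python sketch using the standard `random` module. The list of 100 names and the function names are hypothetical, chosen only for this illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance, like names drawn from a hat."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every step-th member."""
    start = rng.randrange(step)
    return population[start::step]

def convenience_sample(population, n):
    """Take whoever comes first: easy, but usually biased."""
    return population[:n]

names = [f"person_{i}" for i in range(1, 101)]  # hypothetical list of 100 names
rng = random.Random(1)  # fixed seed so the run is repeatable

hat_draw = simple_random_sample(names, 20, rng)
every_10th = systematic_sample(names, 10, rng)
first_20 = convenience_sample(names, 20)
```

Notice that only the convenience sample gives the same people every time: the first 20 on the list, which is why it is the method most likely to be biased.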

3.5 — Leading question

Response bias. The question is worded to suggest the policy is unfair: the phrase "Don't you think... unfair" pushes respondents toward agreeing. A neutral version: "How fair do you find the new uniform policy: very fair / fair / neutral / unfair / very unfair?"

3.6 — Phone survey at 10 am Tuesday

Non-response bias (and a form of selection bias). Under-represented: full-time workers, school-age children, anyone away from home during working hours. The sample will skew toward retirees, parents at home with young children, and people working night shifts.

3.7 — Town stratified sample of 150

Proportions: Under 18 $= 1500/4500 = 1/3$. Aged 18-65 $= 2000/4500 = 4/9$. Over 65 $= 1000/4500 = 2/9$.
Sample numbers:
Under 18: $\dfrac{1}{3} \times 150 = \mathbf{50}$.
Aged 18-65: $\dfrac{4}{9} \times 150 \approx \mathbf{67}$ (66.67, round to nearest).
Over 65: $\dfrac{2}{9} \times 150 \approx \mathbf{33}$ (33.33).
Check: $50 + 67 + 33 = 150$ ✓.
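The rounding and the final check can also be sketched in Python. This is an illustration under stated assumptions: the function name `rounded_allocation` is made up, and the second example (three equal groups) is a hypothetical case added to show why the check matters.

```python
from fractions import Fraction

def rounded_allocation(strata, sample_size):
    """Round each exact share to the nearest whole person and report
    whether the rounded numbers still add up to the sample size."""
    total = sum(strata.values())
    exact = {name: Fraction(count, total) * sample_size
             for name, count in strata.items()}
    rounded = {name: round(share) for name, share in exact.items()}
    return rounded, sum(rounded.values()) == sample_size

town = {"under 18": 1500, "18-65": 2000, "over 65": 1000}
counts, adds_up = rounded_allocation(town, 150)

# Hypothetical case: three equal groups and a sample of 100.
# Each share is 33.33..., so the rounded parts total 99, not 100.
equal_groups = {"A": 100, "B": 100, "C": 100}
counts_equal, adds_up_equal = rounded_allocation(equal_groups, 100)
```

In the town problem the rounded numbers happen to total $150$. In the hypothetical equal-groups case they do not, so one group would need an extra person. That is why the check is always worth doing.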

3.8 — Online news poll

(a) Closest to convenience sampling — also called "self-selected" sampling.
(b) Reasons it is biased: (1) only people who read that news website see the poll (selection bias); (2) of those, only people who feel strongly about taxation will bother to vote (self-selection, a form of non-response bias); (3) many online polls let people vote more than once, so a few motivated voters can distort the count.
(c) A more reliable method: a stratified random sample across age, income and state, contacted by random phone or in-person surveying (official Australian Bureau of Statistics (ABS) surveys rely on random sampling of this kind).
The lesson's "Common Misconceptions" specifically warns about online polls.