mathlab
Lesson 2 ~25 min Unit 4 · Data & Chance +85 XP

Collecting Data

Surveys, observations and experiments — choosing the right method and avoiding bias to get data that actually tells you something useful.

Today’s hook: Instagram uses millions of data points to decide what to show you. But someone had to design HOW to collect that data, and with the wrong method you’d get garbage results. How do you collect data that actually tells you something useful?
Think First
warm-up

Imagine you want to find out the most popular sport in your school. You ask the first 10 students you see at lunch — they’re all in the cricket team. Will your results be reliable? What could go wrong? Jot your thoughts before reading on.

Record your answer in your workbook.
1
The Big Idea
+5 XP

Data can be collected by survey (asking people), observation (watching and recording), or experiment (manipulating conditions). The method must match the research question. Equally important: your data must represent the right group — that’s the difference between population and sample.

The population is everyone or everything you want to study. A sample is the smaller group you actually collect data from. A random sample gives everyone an equal chance of selection. This helps avoid bias, where your results unfairly favour certain outcomes. Biased data leads to wrong conclusions.

[Figure: a sample selected randomly from the population (all students). The sample must represent the population.]
Population → Sample (random) → Data → Conclusions
Match method to question
Survey = ask. Observation = watch. Experiment = manipulate.
Random beats convenient
Asking only your friends is convenient but biased. Random sampling is harder but fairer.
Define your population first
Before collecting any data, write down exactly who or what you’re studying.
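
The warm-up scenario can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the school of 500 students, the 50-member cricket team and the sport preferences are all hypothetical, chosen so that the true answer is known in advance.

```python
import random

# Hypothetical school of 500 students, built deterministically for illustration.
# Students 0-49 are the cricket team and all prefer cricket.
# Among the other 450, every fifth student prefers cricket; the rest prefer soccer.
population = []
for student_id in range(500):
    if student_id < 50:
        population.append("cricket")
    elif student_id % 5 == 0:
        population.append("cricket")
    else:
        population.append("soccer")


def share_preferring(responses, sport):
    """Fraction of responses that name the given sport."""
    return responses.count(sport) / len(responses)


true_share = share_preferring(population, "cricket")

# Convenience sample: the first 10 students you meet, all from the cricket team.
convenience_sample = population[:10]

# Random sample: every student has the same chance of being chosen.
rng = random.Random(1)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 50)

print(f"True share preferring cricket: {true_share:.2f}")
print(f"Convenience sample estimate:   {share_preferring(convenience_sample, 'cricket'):.2f}")
print(f"Random sample estimate:        {share_preferring(random_sample, 'cricket'):.2f}")
```

In this made-up school the convenience sample reports that everyone prefers cricket, while the true share is 28%. The random sample will not match the true share exactly, but it should land much closer.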
2
What You'll Master
objectives

Know

  • Three data collection methods: survey, observation, experiment
  • The difference between population and sample
  • What bias is and why it is a problem

Understand

  • Why the collection method must match the research question
  • Why a random sample is more reliable than a convenient one
  • How biased questions and biased samples both cause bad data

Can Do

  • Select the appropriate collection method for a given question
  • Identify bias in survey questions and rewrite them fairly
  • Decide when to use a census (whole population) vs a sample
3
Words You Need
vocabulary
Survey: A collection method where you ask people questions (questionnaire, interview, online form).
Observation: Watching and recording data without interfering (e.g. counting cars passing a corner).
Experiment: Deliberately changing a variable and measuring the effect (controlled conditions).
Population: The entire group you want to draw conclusions about.
Sample: A smaller group selected from the population to represent it.
Bias: A tendency in data collection that makes results unfairly favour one outcome.
4
Spot the Trap
heads-up

Wrong question: “Don’t you agree that homework is too much?” — This is a leading question. It pushes respondents toward one answer and produces biased data.

Better question: “How many hours of homework do you do per night?” or “Do you think the amount of homework is: too much / about right / too little?”

Wrong sampling: Surveying only your friends about “the most popular music in your school.” Your friend group may not represent the whole school.

Better sampling: Use the class roll and randomly select student numbers. Give everyone a fair chance of being chosen.

5
Survey Design: Open vs Closed Questions
+5 XP

Surveys use two question types. Closed questions give fixed options (tick-box, yes/no, multiple choice) — easy to analyse but may miss the full picture. Open questions let people answer freely — richer responses but harder to quantify.

Closed: “How many hours of screen time do you have per day?   Less than 1 / 1 to less than 2 / 2 to less than 3 / 3 or more” is easy to tally and graph. Open: “What do you think about your screen time?” gives nuance, but responses must be manually categorised. Good surveys use neutral wording, clear categories that do not overlap, and no double-barrelled questions (asking two things at once).

[Figure: Closed questions (fixed options, easy to analyse, quick to complete) vs open questions (free response, rich detail, harder to tally). Good survey rules: neutral wording, no leading questions, clear options, and a pilot test on 2–3 people first.]
Closed = easy to analyse. Open = richer data. Use both wisely.
Avoid double-barrelled
“Do you like PE and Art?” is two questions in one. Split them up.
Cover all options
For closed questions, ensure the categories cover all possible responses (including “other”) and do not overlap.
Pilot first
Test your survey on 2–3 people before using it. Check for confusing wording.
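
To see why closed questions are easy to tally, here is a minimal Python sketch. The list of answers is invented for illustration, and the category boundaries are one possible choice in which an answer of exactly 1 hour fits only one box.

```python
from collections import Counter

# Hypothetical answers (hours of screen time per day) from a small pilot survey.
reported_hours = [0.5, 1, 1.5, 2, 2, 2.5, 3, 4, 0, 1]


def category(hours):
    """Place an answer in exactly one category; boundaries do not overlap."""
    if hours < 1:
        return "Less than 1 hour"
    if hours < 2:
        return "1 to less than 2 hours"
    if hours < 3:
        return "2 to less than 3 hours"
    return "3 hours or more"


# Count how many answers fall in each category.
tally = Counter(category(h) for h in reported_hours)
for label, count in tally.items():
    print(f"{label}: {count}")
```

An open question has no equivalent of the `category` function: someone has to read each free response and decide how to group it.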
6
Population vs Sample
+5 XP

A census collects data from the whole population. It is accurate but expensive and time-consuming. A sample collects from a subset. It is faster and cheaper but introduces some uncertainty. The key is making the sample representative of the population.

Census: small population (e.g. testing all 30 students in your class), or when accuracy is critical (e.g. a national election count). Sample: large population (e.g. all Year 7 students in Australia), or when testing is destructive (e.g. crash-testing every car off the line would leave none to sell).

[Figure: Census (all of the population, very accurate, expensive and slow, best for small populations) vs sample (subset only, some uncertainty, faster and cheaper, best for large populations). A sample must be random and representative.]
Census = all. Sample = subset. Both need careful planning.
Bigger sample, less error
A larger sample is more reliable but costs more. Balance is key.
Representative matters more
1,000 randomly chosen people beat 10,000 hand-picked friends.
5% rule of thumb
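As a rough classroom guide, a sample of about 5% of the population is often used. For a reliable estimate, the number of people sampled matters more than the percentage, so avoid very small samples.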
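
The idea that a bigger random sample gives less error can be checked with a small simulation. The sketch below assumes a hypothetical population of 500 students in which 28% prefer cricket; the numbers and the helper name `average_error` are illustrative only.

```python
import random

# Hypothetical population: 500 students, 140 of whom (28%) prefer cricket.
# 1 means "prefers cricket", 0 means "prefers something else".
population = [1] * 140 + [0] * 360
true_share = sum(population) / len(population)


def average_error(sample_size, trials, rng):
    """Average gap between a random sample's estimate and the true share."""
    total_gap = 0.0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        estimate = sum(sample) / sample_size
        total_gap += abs(estimate - true_share)
    return total_gap / trials


rng = random.Random(2)  # fixed seed so the run is repeatable
for size in (10, 50, 100):
    print(f"n = {size:3d}: average error about {average_error(size, 200, rng):.3f}")
```

The average error should shrink as the sample size grows, but each extra student adds less improvement than the one before, which is why balancing cost against accuracy matters.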
7
Random Sampling and Why Randomness Matters
+5 XP

A random sample gives every member of the population an equal chance of being selected. This eliminates bias from the selection process. Simple methods: lottery method (names in a hat), random number generator, and systematic sampling (every nth person from a list).

Convenience sampling (asking whoever is nearby) is fast but biased. Systematic sampling: number all 200 students, randomly pick a start point from 1 to 10 (say 7), then select every 10th student: 7, 17, 27, 37 … 197. This gives a sample of 20 with a fair spread across the whole population, without needing to know everyone personally.

[Figure: Systematic sampling, every 10th student from a random start point: 7, 17, 27, 37, 47. This gives a fair spread across the entire population, and the random start removes selection bias.]
Random = everyone has equal chance. No cherry-picking.
Number your list first
To pick randomly you need a complete numbered list of your population.
Random ≠ haphazard
Random means equal probability for all. Grabbing whoever walks past is NOT random.
Technology helps
Spreadsheets can generate random numbers. For example, =RANDBETWEEN(1, 200) in Excel returns a random whole number from 1 to 200.
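
The lottery and systematic methods described above can also be sketched in Python. This assumes a hypothetical roll of 200 students numbered 1 to 200; the variable names are illustrative.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical numbered list of the whole population: students 1 to 200.
student_numbers = list(range(1, 201))

# Simple random sample (the "names in a hat" idea): 20 students, no repeats.
lottery_sample = rng.sample(student_numbers, 20)

# Systematic sample: random start between 1 and 10, then every 10th student.
step = 10
start = rng.randint(1, step)
systematic_sample = list(range(start, 201, step))

print("Lottery sample:   ", sorted(lottery_sample))
print("Systematic sample:", systematic_sample)
```

If the random start happens to be 7, the systematic sample is 7, 17, 27 and so on up to 197. Note that `rng.sample` never picks the same student twice, whereas a spreadsheet formula that draws one random number at a time can repeat a number, so repeats need to be redrawn.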
Watch Me Solve It · Choose the collection method
+15 XP per step
Q1
PROBLEM
Identify the best collection method for each: (a) how many vehicles pass a school gate per hour, (b) whether a new fertiliser makes plants grow taller, (c) students’ opinions on the cafeteria menu.
  1. Identify what is being found out
    (a) Counting physical events: no asking and no manipulation needed
    (b) Testing a cause-and-effect relationship (fertiliser → growth)
    (c) Finding people’s opinions and feelings
  2. Match to method
    (a) Observation: stand and count vehicles
    (b) Experiment: grow plants with and without fertiliser
    (c) Survey: questionnaire given to students
  3. Justify each
    (a) You can’t ask a car its opinion. (b) You need controlled conditions to isolate the fertiliser’s effect. (c) Opinions live in people’s minds; you must ask them.
    The method must match where the data lives: in events, in a system, or in people.
Answers: (a) Observation  (b) Experiment  (c) Survey
Watch Me Solve It · Spot and fix bias
+15 XP per step
Q2
PROBLEM
A survey asks: “Most students agree the school day is too long, don’t you?” Identify the bias and rewrite the question fairly.
  1. Identify the bias
    “Most students agree” pressures the respondent to agree. “don’t you?” is a further push. This is a leading question.
    Leading questions tell people what the expected answer is, producing biased data.
  2. Remove the leading language
    Remove “Most students agree” and the tag question. Make it neutral with balanced options.
  3. Rewrite
    “How do you feel about the length of the school day?   Too short / About right / Too long”
    Neutral wording and balanced options give every response an equal chance.
Fixed question: “How do you feel about the length of the school day? Too short / About right / Too long”
Watch Me Solve It · Population vs sample decision
+15 XP per step
Q3
PROBLEM
A food company wants to test the quality of every mango in a shipment of 50,000. A school wants to know how Year 7 students (240 students) feel about PE. For each, decide: census or sample?
  1. Analyse the mango case
    Testing quality means tasting/cutting the mango. If you test every mango, none remain to sell.
    Destructive testing makes a census impractical, because nothing would be left to sell. A sample is essential.
  2. Analyse the Year 7 case
    Population = 240 students. A 2-minute survey × 240 students = 480 minutes = 8 hours of total response time. Manageable for a school study.
  3. Make decisions
    Mangoes → Sample (testing is destructive)
    Year 7 PE survey → Census (small population, non-destructive, manageable)
    When testing doesn’t destroy the subject and the population is small, a census is better.
Answers: Mangoes → Sample (destructive). Year 7 PE → Census (small, non-destructive).
9
Common Pitfalls
heads-up
Using leading questions without realising it
Phrases like “Don’t you think…”, “Most people believe…” or “Surely you agree…” all push the respondent. Even the order of options can bias results (people tend to choose the first or last option more often).
Fix: Write your question, then ask: “Does this hint at which answer I prefer?” If yes, rewrite it neutrally.
Convenience sampling from one group
Surveying only students in the library about reading habits, or only athletes about exercise — these groups are not representative of all students. Your conclusions only apply to that group, not the whole school.
Fix: Always state your population first, then use a method (lottery, systematic) that gives all members an equal chance.
Not defining the population before collecting
“Students” could mean just your class, your year group, your school, or all NSW students. The conclusion you can draw changes dramatically depending on which group you mean.
Fix: Write a clear population statement before you start: “All Year 7 students at [school name] in [year].”
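
The first fix above (“Does this hint at which answer I prefer?”) can be partly automated for obvious cases. The sketch below is a rough helper, not a reliable bias detector: the phrase list and the function name `find_leading_phrases` are illustrative assumptions, and a question can still be leading without containing any of these phrases.

```python
# Phrases the lesson flags as leading. This list is illustrative, not complete.
LEADING_PHRASES = [
    "don't you",
    "most people",
    "most students",
    "surely you agree",
]


def find_leading_phrases(question):
    """Return the flagged phrases that appear in a survey question."""
    # Normalise curly apostrophes so "don’t" and "don't" match the same phrase.
    text = question.lower().replace("’", "'")
    return [phrase for phrase in LEADING_PHRASES if phrase in text]


questions = [
    "Most students agree the school day is too long, don't you?",
    "How do you feel about the length of the school day?",
]
for q in questions:
    flagged = find_leading_phrases(q)
    print(q, "->", flagged if flagged else "no flagged phrases")
```

A check like this only catches wording you already know to look for, so it works best as a first pass before piloting the survey on real people.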
Copy Into Your Books

Collection Methods

  • Survey: ask people (questionnaire)
  • Observation: watch and record
  • Experiment: manipulate & measure

Population vs Sample

  • Population = everyone of interest
  • Sample = representative subset
  • Census = data from all of population

Avoiding Bias

  • Neutral question wording
  • No leading questions
  • Random sampling, not convenience

Random Sampling Methods

  • Lottery (names in a hat)
  • Random number generator
  • Systematic (every nth person)

Brain Trainer · Collecting Data
4 problems

Four drill problems. Think carefully before revealing each answer.

  1. Which method is best for finding out how far students travel to school?

    Survey: ask students to self-report their distance or travel time. Observation would require following every student home, which is impractical.
    Answer: Survey (questionnaire)
  2. Spot the bias: “Most students think homework is too much, don’t they?”

    Leading question bias. “Most students think” and “don’t they?” both push respondents to agree. Rewrite: “How much homework do you receive per night? Too much / About right / Too little”
    Answer: Leading question bias
  3. A school has 400 students. A researcher wants a 5% sample. How many students should be selected?

    5% of 400 = 0.05 × 400 = 20 students.
    Answer: 20 students
  4. Why would a phone survey of Year 7 students be problematic?

    Many Year 7 students may not have their own phone, or may not answer unknown numbers. This creates sampling bias: only students with phones who answer calls would be included, under-representing students without phones.
    Answer: Sampling bias (excludes students without phones or who don’t answer)
Complete in your workbook.
1
A scientist tests whether a new fertiliser makes plants grow taller. What collection method is this?
+10 XP
2
A researcher surveys 30 students from a Year 7 cohort of 200. What is the population?
+10 XP
3
Which question contains bias?
+10 XP
4
Why is a random sample better than a convenience sample?
+10 XP
5
A fruit company wants to check the quality of 50,000 mangoes. Why must they use a sample, not a census?
+10 XP
Show Your Working
9 marks total
Apply Medium 3 MARKS

Q6. A student wants to investigate: “Do Year 7 students at our school prefer playing sport indoors or outdoors?” (a) What collection method? (b) Describe one random sampling method for 180 Year 7 students. (c) Write one well-designed closed question.

Answer in your workbook.
Understand Easy 2 MARKS

Q7. The question “Don’t you love the new school canteen?” is poorly designed. (a) Identify the problem. (b) Rewrite it as a fair question with three response options.

Answer in your workbook.
Reason Hard 4 MARKS

Q8. Two researchers investigate students’ screen time. Researcher A surveys 20 friends at lunch. Researcher B uses a random number generator to select 50 from 500 students. (a) Which is more reliable? (b) Give two reasons why. (c) What type of bias does Researcher A have?

Answer in your workbook.
Comprehensive Answers

Quick Check

1. C — Experiment. A variable (fertiliser) is manipulated and growth is measured.

2. B — All 200 Year 7 students. The population is everyone the researcher wants to draw conclusions about.

3. A — “Do you agree the government is wasting money?” — leading (loaded language).

4. D — Every person has an equal chance, which eliminates selection bias.

5. B — Quality testing is destructive; testing every mango leaves none to sell.

Show Your Working Model Answers

Q6 (3 marks): (a) Survey, because opinions exist in students’ minds [1]. (b) Number all 180 students 1–180, then use a random number generator to select 18 students (10%) [1]. (c) “When you play sport, do you prefer: Indoors / Outdoors / No preference?” [1].

Q7 (2 marks): (a) Leading question — “Don’t you love” implies the canteen is good and pushes agreement [1]. (b) “How would you rate the new school canteen? Excellent / Satisfactory / Needs improvement” [1].

Q8 (4 marks): (a) Researcher B [1]. (b) Larger sample (50 vs 20) gives a better estimate [1]; random selection means every student had an equal chance, reducing bias [1]. (c) Convenience sampling bias — friends tend to have similar habits, so results don’t represent all 500 students [1].

Stretch Challenge · +25 XP, +10 coins

Design Your Own Survey

You want to investigate: “Do students at your school get enough sleep?” Design a complete data collection plan: (1) your exact population, (2) collection method and why, (3) sample size and how you will select it randomly, (4) three well-designed questions (at least one closed, one open), and (5) one potential source of bias and how you will minimise it.

Model answer

Population: All students currently enrolled at [school name].
Method: Survey. Sleep is self-reported, so you must ask.
Sample: about 10%, which is 50 students if 500 are enrolled.
Selection: Number all students on the roll and use =RANDBETWEEN(1, 500) in a spreadsheet, redrawing any number that repeats.
Q1 (closed): “On a typical school night, how many hours of sleep do you get? Less than 7 / 7–8 / More than 8.”
Q2 (closed): “Do you feel tired during school lessons? Always / Sometimes / Rarely / Never.”
Q3 (open): “What is the main reason you don’t get as much sleep as you’d like?”
Bias: Self-reporting bias (students may exaggerate).
Minimise by: assuring anonymity and explaining the importance of accurate data.

Quick Review

Survey

Ask people — for opinions, preferences, self-reported data

Observation

Watch and record — for counting events without interference

Experiment

Manipulate a variable — to test cause and effect

Population vs Sample

Define population first, then select a representative sample

Random sampling

Equal chance for all — eliminates selection bias

Bias

Leading questions and convenience sampling both cause biased data

Interactive: Survey Question Builder

Build a survey question and see how small changes in wording shift the level of bias. Observe how neutral wording changes simulated response distributions.
