Mathematics • Year 9 • Unit 4 • Lesson 19
Data and Probability in the Real World
Use measures of centre, probability rules and bias-spotting in real Y9 contexts: salaries, sport, school surveys, traffic and pizza orders. Then explain in your own words why online polls don't represent public opinion.
1. Word problems
Each scenario uses measures of centre, a probability rule, or a bias judgement. Show working.
1.1 — Salaries at a small company. Yearly salaries (in $000) for the 7 staff at a startup are: 55, 58, 60, 62, 65, 70, 280.
(a) Find the mean and the median salary.
(b) Which figure would the company use in its job ad to make the role look most attractive? Which would be more honest?
(c) Why is the median more representative here? Reference the high salary specifically. 4 marks
1.2 — Pizza topping order. A pizza shop runs a deal: a customer picks one of 4 bases (thin, normal, thick, wholemeal) and one of 6 toppings (margherita, pepperoni, hawaiian, veggie, BBQ chicken, supreme).
(a) How many different pizza combinations are possible (use the multiplication principle)?
(b) A customer chooses at random. Find P(thin base AND pepperoni topping).
(c) Find P(NOT a wholemeal base). 4 marks
1.3 — School survey on canteen food. A school is asked to survey which canteen item is most popular. The Year 12 prefects survey themselves at lunch (12 students).
(a) Identify two distinct sources of bias in this survey (be specific — e.g., who's being asked, when, by whom).
(b) Suggest one specific change to the sampling method that would reduce bias (refer to the lesson's data-collection checklist).
(c) The principal claims "12 students is plenty — that's a fair sample of the school." Explain in one sentence why this is wrong. 4 marks
1.4 — Traffic light timing. A traffic light cycles: 30 seconds green, 5 seconds amber, 25 seconds red.
(a) Find the theoretical P(arriving on a red) for a car arriving at a random time.
(b) A driver passes this light 20 times during the week. How many "reds" should they expect? (Round to the nearest whole number.)
(c) The driver claims they got 16 reds in 20 trips. Comment on whether this seems consistent with the expected value. 4 marks
1.5 — Maths test scores. A Year 9 class of 24 sits a Maths test. The teacher reports: mean = 68, median = 73, IQR = 14, range = 60.
(a) The mean is less than the median. What does this tell you about the skew?
(b) Sketch (or describe) the box plot if Q1 = 65 and Q3 = 79 — give all five summary numbers if max = 95 and min = 35.
(c) A new student scores 25. Will this raise or lower the mean? Will it move the median? Explain. 4 marks
2. Explain your thinking
Use full sentences. 4 marks
2.1 A news website runs an online poll: "Should homework be banned? Click here to vote." After a week, 86% of voters say YES. The website reports "the public overwhelmingly supports banning homework." Explain in your own words:
(i) What kind of sample this is (use the lesson term).
(ii) Which source of bias is strongest, and why.
(iii) Why a "large" sample size doesn't fix the problem here (reference the lesson misconception).
(iv) One change that would make the poll trustworthy.
How did this worksheet feel?
What I'll revisit before next class:
1.1 — Startup salaries
(a) Sum = 55+58+60+62+65+70+280 = 650. Mean = 650 ÷ 7 ≈ $92.9k. Sorted: 55, 58, 60, 62, 65, 70, 280; middle (4th) value = $62k median.
(b) The company would advertise the $92.9k mean (sounds higher); the median ($62k) is more honest because it represents what most staff actually earn.
(c) The $280k salary is an outlier that pulls the mean far above what any "ordinary" employee earns: the other six staff all earn $70k or less. The median ignores that single high value and gives the salary in the middle of the seven, which is closer to most people's actual pay.
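The effect of the outlier can be checked with a short Python sketch. It uses only the seven salaries from the question; the variable names are illustrative, and the second calculation (dropping the 280) is an extra comparison, not part of the question.

```python
from statistics import mean, median

# Yearly salaries in $000, copied from question 1.1
salaries = [55, 58, 60, 62, 65, 70, 280]

mean_salary = mean(salaries)      # 650 / 7
median_salary = median(salaries)  # 4th of the 7 sorted values

# Same measures with the 280 removed, to show how far one value moves each
without_top = [s for s in salaries if s != 280]

print(f"mean   = {mean_salary:.1f}")
print(f"median = {median_salary}")
print(f"mean without 280   = {mean(without_top):.1f}")
print(f"median without 280 = {median(without_top)}")
```

Removing the single high salary drops the mean from about 92.9 to about 61.7, while the median only moves from 62 to 61. That is the sense in which the median is resistant to outliers.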
1.2 — Pizza combos
(a) 4 × 6 = 24 combinations.
(b) P(thin AND pepperoni) = (1/4) × (1/6) = 1/24 ≈ 0.042.
(c) P(NOT wholemeal) = 1 − 1/4 = 3/4 = 0.75.
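As a check on all three parts, the combinations can be listed and counted directly. This is a minimal sketch that assumes every base and topping pair is equally likely (the question's "chooses at random"); the helper name probability is made up for illustration.

```python
from fractions import Fraction
from itertools import product

bases = ["thin", "normal", "thick", "wholemeal"]
toppings = ["margherita", "pepperoni", "hawaiian", "veggie", "BBQ chicken", "supreme"]

# Every (base, topping) pair: the multiplication principle predicts 4 x 6 of them
combos = list(product(bases, toppings))

def probability(event):
    """P(event) when each combination is equally likely."""
    favourable = sum(1 for combo in combos if event(combo))
    return Fraction(favourable, len(combos))

p_thin_pepperoni = probability(lambda c: c == ("thin", "pepperoni"))
p_not_wholemeal = probability(lambda c: c[0] != "wholemeal")

print(len(combos), p_thin_pepperoni, p_not_wholemeal)  # prints: 24 1/24 3/4
```

Counting favourable outcomes gives the same answers as the multiplication and complement rules, which is a useful cross-check for students who are unsure which rule applies.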
1.3 — School canteen survey
(a) Two biases: (i) only Year 12 prefects are asked (selection bias: Years 7 to 11 are not represented, and prefects may have different tastes); (ii) the survey is run at a single lunchtime by the prefects themselves (response/observer bias: answers given in front of school leaders may be the socially acceptable ones, and anyone not present at that time is missed).
(b) Stratified random sample — pick proportional numbers of students randomly from each year group (e.g., if Year 7-12 each have 100 students, pick 5 random students from each year).
(c) Twelve students out of perhaps 600 is far too few, and because all twelve come from one year group and one role, the sample cannot represent the whole school even before other biases are considered.
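The stratified sample suggested in (b) could be drawn along these lines. This is only a sketch: the student IDs are invented, the class sizes follow the example in (b) (100 students in each of Years 7 to 12), and the function name stratified_sample and the seed value are hypothetical.

```python
import random

# Hypothetical rolls: 100 made-up student IDs for each of Years 7 to 12
rolls = {year: [f"Y{year}-{n:03d}" for n in range(1, 101)] for year in range(7, 13)}

def stratified_sample(rolls, per_year, seed=None):
    """Pick per_year students at random, without repeats, from every year group."""
    rng = random.Random(seed)
    return {year: rng.sample(students, per_year) for year, students in rolls.items()}

sample = stratified_sample(rolls, per_year=5, seed=19)
total = sum(len(chosen) for chosen in sample.values())

for year, chosen in sample.items():
    print(year, chosen)
print("students surveyed:", total)
```

Every year group contributes the same number of students here because the assumed year groups are the same size; with unequal year groups the number picked from each would be scaled in proportion.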
1.4 — Traffic light
(a) Cycle length = 30 + 5 + 25 = 60 s. P(red) = 25/60 = 5/12 ≈ 0.417.
(b) Expected reds in 20 = 20 × 5/12 ≈ 8.3 ≈ 8 reds.
(c) 16 reds in 20 is roughly double the expected 8. With only 20 trips, random variation is wide, but doubling the expected is unusual — possibly the driver only counts memorably bad trips, or hits the light at a fixed time of day when red is more likely. Not strong evidence of a faulty light, but worth flagging.
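For teachers who want to put a number on "unusual" in (c), the sketch below works out the chance of 16 or more reds in 20 trips. It goes beyond the Year 9 content and rests on an assumption the question does not state: that each trip is independent and has the same 5/12 chance of a red (a binomial model). The function name prob_at_least is illustrative.

```python
from fractions import Fraction
from math import comb

green, amber, red = 30, 5, 25              # seconds, from question 1.4
p_red = Fraction(red, green + amber + red)  # 25/60 = 5/12

trips = 20
expected_reds = trips * p_red               # 25/3, about 8.3

def prob_at_least(k, n, p):
    """P(X >= k) when X counts successes in n independent trials of chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_16_or_more = prob_at_least(16, trips, p_red)

print("P(red) =", p_red)
print("expected reds =", float(expected_reds))
print("P(16 or more reds) =", float(p_16_or_more))
```

Under that model the chance of 16 or more reds comes out well under 1%, which supports the comment that the driver's claim is unusual and that selective memory or a fixed travel time is a more likely explanation than chance alone.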
1.5 — Maths test
(a) Mean (68) < median (73) → left skew (tail to the low end — a few low scorers pull the mean down).
(b) Five-number summary: min = 35, Q1 = 65, median = 73, Q3 = 79, max = 95.
(c) A score of 25 is below the current mean of 68, so it lowers the mean: the new mean is (24 × 68 + 25) ÷ 25 = 1657 ÷ 25 ≈ 66.3. It also becomes the new minimum, a low outlier that strengthens the left skew. The median barely moves: with 25 students it is the 13th score in order, which is the old 12th score, so it can only stay at 73 or drop to the next score down.
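The arithmetic in (b) and (c) can be verified from the summary figures alone, with no need for the individual scores. A minimal sketch, with variable names chosen for illustration:

```python
n, old_mean = 24, 68   # class size and mean from question 1.5
new_score = 25

old_total = n * old_mean                       # 24 x 68 = 1632
new_mean = (old_total + new_score) / (n + 1)   # 1657 / 25

q1, q3, lowest, highest = 65, 79, 35, 95
iqr = q3 - q1                   # should match the reported IQR of 14
score_range = highest - lowest  # should match the reported range of 60

print(new_mean, iqr, score_range)  # prints: 66.28 14 60
```

The IQR and range agree with the teacher's reported figures, and the one new score pulls the mean down by about 1.7 marks.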
2.1 — Explain your thinking (sample response)
(i) This is a convenience / self-selection sample (in the lesson's terms): people only vote if they happen to visit that site AND care enough to click.
(ii) The strongest bias is self-selection bias: people who really want homework banned are far more motivated to vote than people who are happy with the current system, so the sample over-represents one opinion.
(iii) Even a million votes can't fix self-selection. As the lesson says, "a large biased sample is worse than a small representative one." More voters from the same biased pool just gives a more confident but still wrong answer.
(iv) Run a proper stratified random sample of, say, 1 000 randomly chosen Australians of various ages, and ask the question with neutral wording. Then the result actually represents the public.
Marking: 1 mark per part (i)–(iv).
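The point in (iii) can be illustrated with a small calculation. All three input figures below are hypothetical, chosen only to show the mechanism; they are not data from the poll. The sketch assumes that people who want homework banned are ten times more likely to vote than people who don't.

```python
from fractions import Fraction

# All three figures are hypothetical, chosen only for illustration
true_yes = Fraction(40, 100)       # share of the public who want homework banned
vote_rate_yes = Fraction(30, 100)  # chance a YES supporter bothers to vote
vote_rate_no = Fraction(3, 100)    # chance a NO supporter bothers to vote

def poll_yes_share(true_yes, vote_rate_yes, vote_rate_no):
    """Share of poll votes that say YES when the two groups vote at different rates."""
    yes_votes = true_yes * vote_rate_yes
    no_votes = (1 - true_yes) * vote_rate_no
    return yes_votes / (yes_votes + no_votes)

share = poll_yes_share(true_yes, vote_rate_yes, vote_rate_no)
print(f"true YES share = {float(true_yes):.0%}, poll YES share = {float(share):.0%}")
# prints: true YES share = 40%, poll YES share = 87%
```

With these made-up rates, a minority opinion (40%) shows up as an overwhelming majority (about 87%) in the poll. The number of visitors does not appear anywhere in the calculation, which is why collecting more votes from the same pool leaves the result just as skewed.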