Mathematics • Year 10 • Unit 4 • Lesson 12
Scatter Plots and Correlation — Mixed Challenge
Pull every Lesson 12 idea together: bivariate data, scatter plots, direction + strength + shape, correlation coefficients, and the lesson's two biggest traps — "correlation = causation" and "r close to ±1 means linear".
1. Mixed problems — choose the right tool
Each question uses a different idea from Lesson 12. Decide whether the question is about plotting, describing, ordering by strength, or a misconception before you start writing. (3 marks each)
1.1 Plot these five points and describe the correlation (direction + strength + shape):
(2, 9), (4, 7), (6, 5), (8, 3), (10, 1).
[Blank axes for plotting: horizontal axis marked 0, 2, 4, 6, 8, 10]
1.2 Order the following correlation coefficients from strongest to weakest LINEAR relationship: r = +0.4, r = −0.92, r = 0, r = +0.78, r = −0.3.
1.3 A scatter plot of (hours of TV per day) vs (test score %) shows points scattered all over with no clear trend. What correlation would you describe, and what does this tell you about predicting test score from TV hours?
1.4 Bivariate data on (age of car in years) vs (resale price in $1000s) gives r = −0.85. (a) Describe the correlation in words. (b) Does an older car CAUSE a lower price, or is the relationship more nuanced? Explain.
1.5 Sketch (or describe in words) what each of these scatter plots looks like: (a) strong positive linear, (b) weak negative linear, (c) perfect quadratic (U-shape) — and state the approximate r for each.
1.6 For each correlated pair, decide if causation is plausible, and if so, in which direction (X → Y, Y → X, or neither): (a) hours of revision and test score, (b) shoe size and height, (c) global average temperature and atmospheric CO₂ levels.
2. Find the mistake
Another Year 10 student has tried to interpret a research result. Their reasoning is shown below. Exactly one line contains a Lesson 12 misconception. Spot it, explain why it's wrong, then write the correct interpretation. (3 marks)
Student's reasoning — "Vitamin C and the common cold":
Line 1: A study finds r = −0.6 between vitamin C intake (mg/day) and number of cold days/year.
Line 2: r = −0.6 indicates a moderate negative linear relationship.
Line 3: Conclusion: "Therefore taking vitamin C causes fewer colds."
(a) Which line contains the misconception?
(b) Explain in one or two sentences why that claim is wrong, quoting the lesson's correlation/causation warning.
(c) Re-state Line 3 correctly, naming one possible confounding variable.
Stuck? People who take vitamin C may also eat more fruit/veg, wash their hands more, or have better general health.
3. Open-ended challenge — design a bivariate study
This question has many valid answers. Be creative but follow every rule. (4 marks)
3.1 Design a small bivariate study about ANY real-world topic. Your write-up must include:
- a research question in the form "Is there a relationship between A and B?",
- at least seven data points (made-up but realistic),
- a quick sketch of the scatter plot,
- a one-sentence description of the correlation (direction + strength + shape),
- one paragraph (3–4 sentences) discussing whether the correlation might imply causation — and if not, naming at least one plausible confounding variable.
How did this worksheet feel?
What I'll revisit before next class:
1.1 — Plot and describe
Points sit exactly on the line y = 11 − x; perfect straight-line downward trend. Description: "Strong (in fact, perfect) negative linear correlation." (r = −1)
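To check this answer numerically, the short Python sketch below tests that all five points lie on y = 11 − x and then calculates r. The worksheet does not state a formula for r, so the helper `pearson_r` is an assumption: it uses the standard Pearson formula (sum of products of deviations, divided by the square root of the product of the two sums of squared deviations).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

points = [(2, 9), (4, 7), (6, 5), (8, 3), (10, 1)]
xs = [x for x, _ in points]
ys = [y for _, y in points]

# Does every point satisfy y = 11 - x?
on_line = all(y == 11 - x for x, y in points)
r = pearson_r(xs, ys)
print(on_line, round(r, 3))  # True -1.0
```

Because every point sits exactly on a downward-sloping line, the calculation gives r = −1, which matches the "perfect negative linear" description.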
1.2 — Ordering by strength
Strength = |r|. Order: |−0.92| = 0.92, |+0.78| = 0.78, |+0.4| = 0.4, |−0.3| = 0.3, |0| = 0.
Strongest → weakest: r = −0.92, r = +0.78, r = +0.4, r = −0.3, r = 0.
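The same ordering can be reproduced in a couple of lines of Python by sorting on |r|. This is only a quick check of the working above; the variable names are illustrative.

```python
# The five coefficients from question 1.2
r_values = [0.4, -0.92, 0, 0.78, -0.3]

# Strength is the absolute value |r|, so sort on abs, largest first
strongest_first = sorted(r_values, key=abs, reverse=True)
print(strongest_first)  # [-0.92, 0.78, 0.4, -0.3, 0]
```

Sorting on `abs` rather than on the raw value is the key step: a plain descending sort would wrongly place r = −0.92 last.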
1.3 — No correlation
Description: "No (or very weak) correlation". This means knowing one variable does NOT help predict the other — the scatter plot has no usable trend, so any "line of best fit" through random points would be misleading.
1.4 — Age of car vs price
(a) Strong negative linear correlation: older cars tend to sell for less.
(b) Age is a CAUSAL factor (cars depreciate as they age and wear out) but other variables (model, condition, mileage) also matter. So age → lower price is plausible but not the whole story.
1.5 — Three sketches with r
(a) Strong positive linear: points climb from bottom-left to top-right, close to a line. r ≈ +0.9.
(b) Weak negative linear: points trend downward but are scattered. r ≈ −0.3.
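(c) Perfect U-shape (e.g. y = x²): points form a parabola. r ≈ 0 when the x-values are spread evenly on both sides of the turning point, because the falling half cancels the rising half. r measures only the LINEAR relationship, so it can be near 0 even though the pattern is perfect. This is the Lesson 12 misconception trap.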
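The value of r for a perfect parabola depends on which x-values are sampled. The sketch below is an illustration under assumptions: the two sets of x-values are made up for the demonstration, and `pearson_r` is a hand-written helper using the standard Pearson formula, which the worksheet itself does not state.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# x-values balanced on both sides of the turning point of y = x^2
xs_balanced = [-3, -2, -1, 0, 1, 2, 3]
r_balanced = pearson_r(xs_balanced, [x ** 2 for x in xs_balanced])

# Same curve, but only the rising half is sampled
xs_right = [0, 1, 2, 3, 4, 5, 6]
r_right = pearson_r(xs_right, [x ** 2 for x in xs_right])

print(round(r_balanced, 3), round(r_right, 3))
```

With the balanced x-values r comes out as 0, while the rising half alone gives r of about 0.96 even though the points still lie on a curve. A value of r on its own says nothing about shape, so always look at the scatter plot as well.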
1.6 — Causation direction
(a) Yes: revision → score (plausible mechanism: practice improves performance).
(b) Not really. Shoe size doesn't CAUSE height — both grow together in childhood. The link is via a third variable (age/growth).
(c) Plausible BOTH ways via feedback: more CO₂ → warming (greenhouse effect), and warming can release more CO₂ (e.g. permafrost) — this is a feedback loop, not one-way causation.
2 — Find the mistake
(a) The mistake is on Line 3.
(b) Lesson 12 explicitly warns "correlation ≠ causation". An r of −0.6 only shows that the two variables tend to move in opposite directions; it does not show that one causes the other.
(c) Corrected: "Higher vitamin C intake is associated with fewer cold days. However, this does not prove vitamin C causes fewer colds — people who take supplements may also wash hands more, sleep better or eat healthier diets (any of these is a confounding variable)."
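A confounding variable can produce a strong correlation on its own. The simulation below is a toy sketch, not real data: every number in it is invented, and the variable `health` (a hypothetical 0 to 10 "general health" score) is an assumption standing in for the confounder. In the code, vitamin C intake has no direct effect on cold days; both depend only on `health` plus random noise.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(12)  # fixed seed so the run is repeatable

health, vit_c, cold_days = [], [], []
for _ in range(2000):
    h = random.uniform(0, 10)                       # invented health score
    health.append(h)
    vit_c.append(40 + 6 * h + random.gauss(0, 5))   # healthier people take more vitamin C
    cold_days.append(14 - h + random.gauss(0, 1))   # healthier people have fewer cold days

# Correlation across everyone
r_all = pearson_r(vit_c, cold_days)

# Correlation among people with nearly the same health score
similar = [(v, c) for h, v, c in zip(health, vit_c, cold_days) if 4.5 <= h <= 5.5]
r_similar = pearson_r([v for v, _ in similar], [c for _, c in similar])

print(round(r_all, 2), round(r_similar, 2))
```

Across the whole group r is strongly negative, but among people with similar health scores it shrinks to near 0. The overall correlation comes from the shared cause, not from vitamin C itself, which is the point of the corrected Line 3.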
3 — Open-ended challenge (sample solution)
Research question: "Is there a relationship between hours of phone use per day and average sleep per night?"
Data (phone hr, sleep hr) for 8 Year 10 students: (1, 9), (2, 8.5), (3, 8), (4, 7.5), (5, 7), (6, 6.5), (7, 6), (8, 5.5).
Sketch: a straight downward trend from (1, 9) to (8, 5.5).
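Description: "Strong (in fact, perfect) negative linear correlation." Every point lies on the line y = 9.5 − 0.5x, so r = −1; real survey data would show more scatter.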
Causation discussion: Phone use at night likely DOES contribute to less sleep (blue light + late-night use), so there is a plausible mechanism. However, the correlation alone does not prove this — a confounding variable could be evening anxiety or homework load (students who study/worry late at night might also use their phone more). To establish causation, a controlled study would be needed.
Marking: 1 mark for a clear research question, 1 for a realistic data set, 1 for a correct correlation description, 1 for a thoughtful causation discussion naming a confounding variable.