Comparing Distributions
Statistics is rarely about a single data set in isolation. The real power comes from comparison — does this year's class outperform last year's? Is the new teaching method producing more consistent results? This lesson teaches the systematic approach: examine centre, spread, shape and outliers side by side. Master this framework and you can draw meaningful conclusions from any pair of data sets.
Class A: most scores between 70 and 80, a few at 90. Class B: scores spread evenly from 50 to 100. Both have mean = 75. Which class would you rather be in, and why?
Before reading on — write your gut feeling. We will revisit this at the end of the lesson.
Compare centre: Use medians or means — whichever group has a higher median/mean typically scores higher.
Compare spread: Use IQR or standard deviation — smaller spread means more consistent results.
Also compare shape (symmetric vs skewed, unimodal vs bimodal) and outliers (which group has more extreme values).
Key facts
- The four dimensions: centre, spread, shape, outliers
- Tools for each dimension
- Why consistent measures matter
Concepts
- Why context matters in comparison
- How different spreads affect interpretation
- When a lower mean might still be preferable
Skills
- Compare two distributions using summary statistics
- Read and compare parallel box plots
- Write a full comparison paragraph in context
The first step in comparing distributions is to compare the centre — the typical or middle value.
- Use medians when the data may be skewed or has outliers (median is resistant to extreme values).
- Use means when the data is roughly symmetric and you need to consider all values.
Example:
School A: median = 78. School B: median = 72.
"School A typically achieves higher results than School B, with a median mark 6 points higher."
What to write in your book
- Compare medians for skewed data; compare means for symmetric data.
- The group with the higher median/mean typically has higher values.
- Always state the direction and magnitude of the difference in context.
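The School A and School B comparison can be sketched in a few lines of Python. The mark lists below are invented so that the medians match the example (78 and 72); they are illustrative assumptions, not data from the lesson.

```python
from statistics import median

# Hypothetical marks, invented so the medians match the example (78 and 72).
school_a = [65, 70, 78, 84, 91]
school_b = [60, 68, 72, 75, 88]

median_a = median(school_a)
median_b = median(school_b)
difference = median_a - median_b  # direction and magnitude of the gap

print(f"School A median = {median_a}, School B median = {median_b}")
print(f"School A's median is {difference} marks higher than School B's.")
```

The printed sentence states both the direction (School A is higher) and the magnitude of the difference, which is what a written comparison of centre should do.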
Quick check: Group A has median = 82 and Group B has median = 75. Which statement correctly compares centre?
After comparing centre, compare the spread to assess consistency.
- Smaller IQR or SD = more consistent (values cluster near the centre).
- Larger IQR or SD = more variable (values are spread across a wider range).
Example:
Machine A: IQR = 2 mm. Machine B: IQR = 5 mm.
"Machine A produces more consistent parts than Machine B. Machine B's larger IQR indicates greater variability in screw length."
What to write in your book
- Smaller IQR or SD = more consistent; larger = more variable.
- Use IQR for robustness against outliers; use SD when data is symmetric.
- Interpret spread in context — what does consistency mean for this situation?
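As a rough illustration of the Machine A and Machine B example, the sketch below computes an IQR for each machine. The screw lengths are made up so that the IQRs come out as 2 mm and 5 mm, and the helper `iqr` is a hypothetical name. Quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method from Python's `statistics` module.

```python
from statistics import quantiles

def iqr(data):
    """Interquartile range, Q3 - Q1, using the 'inclusive' quartile method."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

# Hypothetical screw lengths in mm (not from the lesson).
machine_a = [48, 49, 49, 50, 50, 50, 51, 51, 52]
machine_b = [44, 46, 47, 49, 50, 51, 52, 54, 56]

print(f"Machine A IQR = {iqr(machine_a)} mm")
print(f"Machine B IQR = {iqr(machine_b)} mm")

# The smaller IQR identifies the more consistent machine.
more_consistent = "Machine A" if iqr(machine_a) < iqr(machine_b) else "Machine B"
print(f"{more_consistent} is more consistent.")
```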
True or false: Two distributions with identical means must also have identical spreads.
Shape:
- Is one distribution symmetric while the other is skewed?
- Does one have a single peak (unimodal) while the other is bimodal?
- Right (positive) skew: tail extends to the right; typically median < mean.
- Left (negative) skew: tail extends to the left; typically median > mean.
Outliers:
- Does one group have more extreme values?
- Are the outliers high or low? What effect do they have on the mean?
Full comparison framework:
"Group A has a higher median than Group B, indicating better typical performance. However, Group A also has a larger IQR, suggesting less consistency. Group B is more symmetric, while Group A is slightly right-skewed with one high outlier that inflates its mean."
What to write in your book
- Compare all four: centre, spread, shape, outliers.
- Right skew: long right tail, mean pulled above median.
- Left skew: long left tail, mean pulled below median.
- Always interpret each comparison in the context of the question.
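The shape and outlier checks can also be sketched in code. Two assumptions are made here that the lesson does not state: skew is only hinted at by comparing mean and median (a rule of thumb, not a test), and outliers are flagged with the common 1.5 × IQR fence convention. The data set and the function names `skew_hint` and `find_outliers` are hypothetical.

```python
from statistics import mean, median, quantiles

def skew_hint(data):
    """Rough skew direction from mean vs median (a heuristic only)."""
    m, med = mean(data), median(data)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "no clear skew"

def find_outliers(data):
    """Values outside the 1.5 x IQR fences (an assumed convention)."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [x for x in data if x < low or x > high]

# Hypothetical scores with one unusually high value.
group_a = [60, 62, 64, 65, 66, 68, 70, 72, 95]

print("Skew hint:", skew_hint(group_a))
print("Outliers:", find_outliers(group_a))
```

In this made-up data the single high value pulls the mean above the median, which is the effect described for a high outlier in the full comparison paragraph.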
Fill the gap: A box plot has Q1 = 55 and Q3 = 75. The interquartile range is ____.
Worked examples
Team A: mean = 80, SD = 5. Team B: mean = 75, SD = 12. Compare the two teams fully.
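One possible way to organise the Team A and Team B comparison is sketched below. The function `compare_teams` and its dictionary layout are hypothetical; it only compares centre and spread from the given summary statistics and leaves the interpretation in context to you.

```python
def compare_teams(a, b):
    """Return one sentence on centre and one on spread from mean and SD."""
    higher, lower = (a, b) if a["mean"] >= b["mean"] else (b, a)
    steadier, looser = (a, b) if a["sd"] <= b["sd"] else (b, a)
    centre = (f"{higher['name']} has the higher mean "
              f"({higher['mean']} vs {lower['mean']}), "
              f"a difference of {higher['mean'] - lower['mean']}.")
    spread = (f"{steadier['name']} is more consistent "
              f"(SD {steadier['sd']} vs {looser['sd']}).")
    return centre, spread

# Summary statistics taken from the worked example.
team_a = {"name": "Team A", "mean": 80, "sd": 5}
team_b = {"name": "Team B", "mean": 75, "sd": 12}

centre, spread = compare_teams(team_a, team_b)
print(centre)
print(spread)
```

Here Team A comes out ahead on both measures, so the written comparison does not need to weigh a trade-off between typical score and consistency.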
School X: min = 45, Q1 = 60, median = 72, Q3 = 80, max = 90, two outliers at 95. School Y: min = 50, Q1 = 65, median = 68, Q3 = 78, max = 95, no outliers. Compare the schools.
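A short sketch of the arithmetic behind this comparison, using the five-number summaries as given. It assumes that School X's stated max of 90 is the end of the upper whisker, since the outliers at 95 lie above it. The dictionary layout and the helper `iqr` are hypothetical.

```python
# Five-number summaries copied from the worked example.
school_x = {"min": 45, "q1": 60, "median": 72, "q3": 80, "max": 90}
school_y = {"min": 50, "q1": 65, "median": 68, "q3": 78, "max": 95}

def iqr(summary):
    """Interquartile range from a five-number summary."""
    return summary["q3"] - summary["q1"]

median_gap = school_x["median"] - school_y["median"]

print(f"Median: X = {school_x['median']}, Y = {school_y['median']} (gap {median_gap})")
print(f"IQR:    X = {iqr(school_x)}, Y = {iqr(school_y)}")
```

On these figures School X has the higher median but the larger IQR, so a full answer should mention both the stronger typical result and the lower consistency of the middle 50% of marks.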
Quick-fire practice · 2 activities
Group 1: mean = 65, SD = 8. Group 2: mean = 70, SD = 15. (a) Which group has higher typical scores? (b) Which group is more consistent? (c) If you wanted predictable results, which group is preferable?
Write a comparison paragraph for two sporting teams. Team Lions: median score = 88, IQR = 6, roughly symmetric. Team Tigers: median score = 82, IQR = 18, right-skewed with two outlier wins at 110.
Match each measure to what it tells you about a distribution:
Top 3 list: Name THREE real-world situations where comparing spread (consistency) is more important than comparing the average. Explain briefly for each.
Class A is more predictable — most students score 70–80, so if you are an average student you know what to expect. Class B has wide variation: some score 100 while others score 50. For a risk-averse student, Class A is preferable because of its small spread. For a high achiever who believes they can outperform the group, Class B offers the possibility of a higher mark. The key insight is that identical means can hide very different classroom experiences — this is why comparing spread is just as important as comparing centre.
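To see how two classes can share a mean yet differ in spread, here is a minimal Python sketch. The score lists are invented to fit the descriptions of Class A and Class B (they are assumptions, not data from the lesson), and the population standard deviation is used as the measure of spread.

```python
from statistics import mean, pstdev

# Hypothetical scores: Class A clusters in the low 70s with two scores of 90.
class_a = [70, 70, 70, 71, 71, 72, 73, 73, 90, 90]
# Hypothetical scores: Class B is spread evenly from 50 to 100.
class_b = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    print(f"{name}: mean = {mean(scores)}, SD = {pstdev(scores):.1f}")
```

Both lists have a mean of 75, but Class B's standard deviation is roughly double Class A's, which is the difference the mean alone hides.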
Q1. Class A has median = 75 and IQR = 8. Class B has median = 75 and IQR = 20. Which statement is correct?
Q2. When comparing two distributions with outliers, which measure of centre is most appropriate?
Q3. Factory A: mean = 50 mm, SD = 1 mm. Factory B: mean = 50.5 mm, SD = 3 mm. For precision engineering, which factory is preferable and why?
Q4. A distribution is right-skewed. Which relationship between mean and median is expected?
Q5. Two data sets have the same mean but different standard deviations. What does this tell you?
SA 1. Factory A produces screws: mean = 50 mm, SD = 1 mm. Factory B: mean = 50.5 mm, SD = 3 mm. (a) Compare the centres of the two distributions. (b) Compare the spreads. (c) Which factory would you choose for precision engineering? Justify your answer using both measures. (2 marks)
SA 2. Two schools' HSC results: School X — median = 82, IQR = 8, two outliers at 95. School Y — median = 80, IQR = 15, no outliers. (a) Compare typical performance. (b) Compare consistency. (c) Which school has better top-end performance? (d) Write a balanced comparison paragraph. (2 marks)
SA 3. A hospital compares two treatments for a condition. Treatment A: mean recovery = 10 days, SD = 2 days. Treatment B: mean recovery = 8 days, SD = 4 days. (a) A patient argues Treatment B is better because it has a shorter average recovery time. Evaluate this claim statistically. (b) A doctor prefers Treatment A. Explain why, using spread as the key reason. (c) Design a decision rule that helps patients choose between treatments based on their personal risk tolerance. (3 marks)
Comprehensive answers
MC 1 — B: Same median means same typical score; smaller IQR means Class A is more consistent.
MC 2 — C: Median is resistant to outliers; mean is pulled toward extreme values.
MC 3 — A: In precision engineering, consistency (small SD) matters more than the 0.5 mm difference in mean.
MC 4 — D: Right skew means a long tail to the right, pulling the mean above the median.
MC 5 — B: Same mean = same average level; different SD = different consistency.
SA 1 (2 marks): (a) Factory B mean 50.5 mm is slightly higher than Factory A mean 50 mm [0.5]. (b) Factory A SD = 1 mm is far smaller than Factory B SD = 3 mm, so Factory A is much more consistent [0.5]. (c) Factory A — in precision engineering, consistency (small SD) is far more important than the 0.5 mm difference in mean [1].
SA 2 (2 marks): (a) School X has higher typical performance, median 82 vs 80 [0.5]. (b) School X is more consistent, IQR 8 vs 15 [0.5]. (c) School X has outliers at 95, indicating exceptional high achievers; School Y has no outliers but a wider spread [0.5]. (d) School X generally achieves stronger and more consistent results with some exceptional high achievers. School Y has slightly lower typical performance but a wider range of outcomes [0.5].
SA 3 (3 marks): (a) Treatment B does have a shorter mean (8 vs 10 days), but its SD of 4 days means a recovery two standard deviations above the mean would take 16 days, so the claim ignores spread [1]. (b) Treatment A is more predictable: doctors can plan confidently around an 8–12 day recovery window (mean ± 1 SD), whereas Treatment B's high variability makes planning difficult [1]. (c) Risk-averse patients or those requiring predictable recovery (elderly, planned surgery follow-up): choose Treatment A. Risk-tolerant or young, healthy patients willing to accept the chance of a longer recovery for the possibility of a shorter one may prefer Treatment B [1].
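The day ranges quoted in SA 3 come from taking the mean plus or minus a number of standard deviations. The sketch below reproduces that arithmetic. The helper name `interval` is hypothetical, and reading these ranges as "typical" recovery windows assumes recovery times are roughly symmetric, which the question does not state.

```python
def interval(mean, sd, k):
    """Range from k standard deviations below the mean to k above it."""
    return (mean - k * sd, mean + k * sd)

# Means and SDs (in days) taken from SA 3.
for k in (1, 2):
    a_low, a_high = interval(10, 2, k)
    b_low, b_high = interval(8, 4, k)
    print(f"Within {k} SD: Treatment A {a_low}-{a_high} days, "
          f"Treatment B {b_low}-{b_high} days")
```

At one standard deviation both treatments share an upper limit of 12 days, but Treatment B's range is twice as wide. At two standard deviations Treatment B reaches 16 days while Treatment A reaches 14.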