Mathematics • Year 8 • Unit 4 • Lesson 7
Histograms in the Real World
Apply histograms, modal class, estimated mean, and shape description to real situations: athletics times, exam marks, household sizes, and reaction times.
1. Word problems
Each problem uses ideas from Lesson 7. Show your working — a single answer with no working only earns half marks.
1.1 — Year 8 exam marks. A teacher records exam marks for 40 students in a grouped table: 0–<20 (f=2), 20–<40 (f=6), 40–<60 (f=14), 60–<80 (f=12), 80–100 (f=6).
(a) State the modal class.
(b) Add a cumulative frequency column. How many students scored less than 60?
(c) Estimate the mean mark using class midpoints. Show working. 4 marks
1.2 — Reaction times. 25 students measure their reaction time (ms): 150–<200 (f=3), 200–<250 (f=8), 250–<300 (f=9), 300–<350 (f=4), 350–<400 (f=1).
(a) State the modal class.
(b) Describe the shape of the distribution.
(c) Estimate the mean reaction time using midpoints. 3 marks
1.3 — Household sizes. A census records household size in 50 households: 1 (f=8), 2 (f=14), 3 (f=12), 4 (f=10), 5 (f=4), 6 (f=2).
(a) Should this be drawn as a histogram or a bar chart? Justify.
(b) What is the mode (most common household size)?
(c) Calculate the exact mean (not estimated — household size is discrete). 3 marks
1.4 — Comparing two classes. Class A's exam histogram is bell-shaped, centred on 70%. Class B's histogram is right-skewed — most students below 50%, with a few above 85%.
(a) Describe each class's typical performance in one sentence.
(b) Which class likely needs more support? Justify.
(c) For Class B, would the mean overstate or understate the "typical" mark? Why? 3 marks
1.5 — Athletics times. 100 m sprint times (seconds) for 30 athletes: 11.0–<11.5 (f=6), 11.5–<12.0 (f=12), 12.0–<12.5 (f=8), 12.5–<13.0 (f=4).
(a) State the modal class.
(b) Find midpoints and estimate the mean time.
(c) Describe the shape. 3 marks
2. Explain your thinking
This question is about communication, not just answers. Use full sentences. 4 marks
2.1 A classmate calculates a mean from a grouped frequency table by using the lower class boundary (e.g. for 30–<40 they use 30) instead of the midpoint. In your own words, explain (i) why this gives the wrong estimate, (ii) what the correct value to use is, and (iii) what assumption we make about where data values lie within each class. Use the term midpoint in your answer.
How did this worksheet feel?
What I'll revisit before next class:
1.1 — Year 8 exam marks
(a) Modal class = 40–<60 (f = 14).
(b) Cumulative frequencies: 2, 8, 22, 34, 40. Less than 60 = 22 students (the running total at the end of the 40–<60 class).
(c) Midpoints 10, 30, 50, 70, 90. Σ(f×mid) = 2×10 + 6×30 + 14×50 + 12×70 + 6×90 = 20 + 180 + 700 + 840 + 540 = 2280. Mean ≈ 2280 ÷ 40 = 57.
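For anyone who wants to check parts (b) and (c) by machine, here is a minimal Python sketch. It assumes the table is typed in as (lower, upper, frequency) triples; the variable names are illustrative only and are not part of the worksheet.

```python
from itertools import accumulate

# Grouped table from question 1.1 as (lower, upper, frequency).
classes = [(0, 20, 2), (20, 40, 6), (40, 60, 14), (60, 80, 12), (80, 100, 6)]
frequencies = [f for _, _, f in classes]

# (b) Running totals: each entry counts every student up to the end of that class.
cumulative = list(accumulate(frequencies))
below_60 = cumulative[2]  # end of the third class, 40-<60

# (c) Estimated mean: sum of (frequency x midpoint) divided by total frequency.
total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
estimated_mean = total_fx / sum(frequencies)

print(cumulative)      # [2, 8, 22, 34, 40]
print(below_60)        # 22
print(estimated_mean)  # 57.0
```

The printed values match the hand working above: the last cumulative frequency equals the total number of students, which is a quick check that no class was missed.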
1.2 — Reaction times
(a) Modal class = 250–<300 ms (f = 9).
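(b) Single-peaked and roughly symmetric, with at most a slight right skew: frequencies fall away on both sides of the 250–<300 ms peak, and only one student is in the slowest class (350–<400 ms).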
(c) Midpoints 175, 225, 275, 325, 375. Σ(f×mid) = 3×175 + 8×225 + 9×275 + 4×325 + 1×375 = 525 + 1800 + 2475 + 1300 + 375 = 6475. Mean ≈ 6475 ÷ 25 = 259 ms.
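One way to judge the shape in part (b) by eye is to print the frequencies as rows of symbols, which gives a histogram turned on its side. The sketch below is only an illustration: the helper name `bar_lines` is made up for this example, and it relies on all classes having the same width (50 ms), so bar length can stand in for bar height.

```python
# Reaction-time table from question 1.2 as (class label, frequency).
table = [
    ("150-<200", 3),
    ("200-<250", 8),
    ("250-<300", 9),
    ("300-<350", 4),
    ("350-<400", 1),
]

def bar_lines(rows):
    """Return one text bar per class; one '#' per student (hypothetical helper)."""
    return [f"{label:>9} | {'#' * freq} ({freq})" for label, freq in rows]

lines = bar_lines(table)
print("\n".join(lines))
```

The longest bar is the 250-<300 row (the modal class), and the bars shorten in both directions from it.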
1.3 — Household sizes
(a) Bar chart — household size is discrete (whole numbers only); use a bar chart with gaps, not a histogram.
(b) Mode = 2 people (highest f = 14).
(c) Mean = Σ(f×value) ÷ Σf = (1×8 + 2×14 + 3×12 + 4×10 + 5×4 + 6×2) ÷ 50 = (8+28+36+40+20+12) ÷ 50 = 144 ÷ 50 = 2.88 people.
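Because household size is discrete, no midpoints are needed and the mean is exact. A short Python sketch of parts (b) and (c), assuming the table is stored as a dictionary from household size to frequency (the names are illustrative):

```python
# Household table from question 1.3: size -> number of households.
household_counts = {1: 8, 2: 14, 3: 12, 4: 10, 5: 4, 6: 2}

# (b) Mode: the size with the highest frequency.
most_common = max(household_counts, key=household_counts.get)

# (c) Exact mean: total people divided by total households.
total_households = sum(household_counts.values())
total_people = sum(size * f for size, f in household_counts.items())
exact_mean = total_people / total_households

print(most_common)       # 2
print(total_households)  # 50
print(total_people)      # 144
print(exact_mean)        # 2.88
```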
1.4 — Comparing two classes
(a) Class A: most students score around 70% with the spread balanced on both sides — consistent performance. Class B: most students score below 50% with a few high outliers — uneven performance.
(b) Class B. Most of its students scored below 50%, whereas Class A is centred on 70%; the few marks above 85% in Class B do not change the fact that the majority are struggling.
(c) The mean would overstate the typical mark, because the few high-scoring students pull the mean upward above the median, masking how many students are actually struggling.
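The effect in part (c) can be seen with a small made-up data set. The ten marks below are hypothetical, invented only to mimic Class B's description (most below 50%, a few above 85%); they are not from the question.

```python
from statistics import mean, median

# Hypothetical marks (%) shaped like Class B: eight below 50, two above 85.
marks = [30, 35, 38, 40, 42, 44, 45, 48, 88, 92]

class_mean = mean(marks)
class_median = median(marks)

print(class_mean)    # 50.2
print(class_median)  # 43.0
```

In this invented example the two high marks lift the mean above 50%, even though eight of the ten students scored below 50%. The median stays with the bulk of the class.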
1.5 — Athletics times
(a) Modal class = 11.5–<12.0 s (f = 12).
(b) Midpoints 11.25, 11.75, 12.25, 12.75. Σ(f×mid) = 6×11.25 + 12×11.75 + 8×12.25 + 4×12.75 = 67.5 + 141 + 98 + 51 = 357.5. Mean ≈ 357.5 ÷ 30 ≈ 11.92 s.
(c) Slightly skewed right — most athletes between 11.0 and 12.0 s, with a longer tail toward slower (higher) times.
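The working table for parts (a) and (b) can also be laid out in code. This is a sketch under the same assumption as the hand method (each class is represented by its midpoint); the tuple layout and names are illustrative.

```python
# Sprint-time table from question 1.5 as (lower, upper, frequency), in seconds.
classes = [(11.0, 11.5, 6), (11.5, 12.0, 12), (12.0, 12.5, 8), (12.5, 13.0, 4)]

# (a) Modal class: the class with the highest frequency.
modal = max(classes, key=lambda c: c[2])

# (b) Working columns: midpoint and frequency x midpoint for every class.
rows = [(lo, hi, f, (lo + hi) / 2, f * (lo + hi) / 2) for lo, hi, f in classes]
for lo, hi, f, mid, fx in rows:
    print(f"{lo:.1f}-<{hi:.1f}  f={f:2d}  mid={mid:.2f}  f*mid={fx:.2f}")

total_f = sum(r[2] for r in rows)
total_fx = sum(r[4] for r in rows)
mean_time = total_fx / total_f

print(modal)                # (11.5, 12.0, 12)
print(total_fx)             # 357.5
print(round(mean_time, 2))  # 11.92
```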
2.1 — Explain your thinking (sample response)
Using the lower boundary (30) instead of the midpoint systematically underestimates the mean — it assumes every value sits at the bottom of its class, which is unrealistic. The correct value to use is the midpoint, calculated as (lower + upper) ÷ 2, so for 30–<40 we use 35. The assumption is that values are spread evenly across each class, so the midpoint is the best single representative of "where the typical value in this class sits". This makes the estimated mean unbiased on average rather than tilted too low.
Marking: 1 mark for noting the boundary underestimates; 1 mark for naming the midpoint as correct; 1 mark for the (lower + upper) ÷ 2 formula; 1 mark for stating the "even spread within class" assumption.
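To show the size of the classmate's error, the sketch below reuses the exam-marks table from question 1.1 (chosen here purely as an illustration) and compares the lower-boundary method with the midpoint method.

```python
# Exam-marks table from question 1.1 as (lower, upper, frequency).
classes = [(0, 20, 2), (20, 40, 6), (40, 60, 14), (60, 80, 12), (80, 100, 6)]
n = sum(f for _, _, f in classes)

# Classmate's method: every value treated as sitting on the lower boundary.
lower_mean = sum(f * lo for lo, _, f in classes) / n

# Correct method: every value treated as sitting at the class midpoint.
mid_mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n

print(lower_mean)             # 47.0
print(mid_mean)               # 57.0
print(mid_mean - lower_mean)  # 10.0
```

With equal class widths of 20, the lower-boundary estimate comes out exactly half a class width (10 marks) too low, which is the systematic underestimate described in the sample response.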