Think First

Five friends compare their pocket money: $5, $8, $8, $10, $24. What is the "average" amount? Is it a fair representation of what most friends receive? Why or why not?

Mean: The Balancing Point

The mean is the value that would balance the data if each value were a weight on a number line. It spreads the total evenly across all values — but a single very large or small value can drag it far from the centre of the group.

[Figure: number line from $0 to $24 marking the values $5, $8, $8, $10 and $24. The balance point (mean) is at $11; the outlier $24 pulls the mean to the right.]
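One way to check the balancing idea is to add up each value's distance from the mean, counting values below the mean as negative. If the mean really is the balance point, those distances should cancel out. A minimal Python sketch using the pocket money values (the variable names are illustrative, not part of the lesson):

```python
# Pocket money values from the lesson, in dollars.
values = [5, 8, 8, 10, 24]

mean = sum(values) / len(values)

# Signed distance of each value from the mean (negative = below the mean).
deviations = [x - mean for x in values]

print(mean)             # 11.0
print(deviations)       # [-6.0, -3.0, -3.0, -1.0, 13.0]
print(sum(deviations))  # 0.0
```

The four values below the mean sit a combined $13 below it, which matches the $13 by which the $24 value sits above it.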

What You'll Master

  • Calculate the mean from a list of values using $\bar{x} = \dfrac{\sum x}{n}$
  • Explain how outliers affect the mean
  • Calculate the mean from a frequency table using $\bar{x} = \dfrac{\sum f \cdot x}{\sum f}$
  • Estimate the mean from grouped data using class midpoints
  • Identify when the mean is and is not representative

Words You Need

mean ($\bar{x}$): The arithmetic average: total sum divided by count. The balancing point of a data set.
$\sum$ (sigma): The sum of. $\sum x$ means "add up all the x values".
$n$: The number of values in the data set.
outlier: A value much larger or smaller than the rest. Outliers pull the mean towards them.
frequency table mean: When data is in a frequency table, multiply each value by its frequency, sum those products, then divide by the total frequency.
estimated mean: Mean calculated from grouped data using class midpoints. It is an approximation, not an exact value.

⚠ Spot the Trap

The mean can be misleading. If one friend earns $5 and another earns $500, the mean of $252.50 represents neither person well. Always check whether the mean is a fair summary — if the data has outliers or is very spread out, the median may be a better measure of the "typical" value.

Divide by n, not by the range. A common error: students add the values then divide by the largest value or the range instead of the count $n$.

The Mean Formula

$$\bar{x} = \frac{\sum x}{n} = \frac{\text{sum of all values}}{\text{number of values}}$$

Worked example: Pocket money values: $5, $8, $8, $10, $24.

Step 1 — Add: $5 + 8 + 8 + 10 + 24 = 55$

Step 2 — Count: $n = 5$

Step 3 — Divide: $\bar{x} = \dfrac{55}{5} = \mathbf{\$11}$

Notice that $\$11$ is higher than 4 of the 5 values. The outlier $\$24$ has pulled the mean upward.
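The three steps (add, count, divide) can be sketched in Python. This is only an illustration: the function name `mean` and the variable `pocket_money` are chosen here and are not part of the lesson.

```python
def mean(values):
    """Return the arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


pocket_money = [5, 8, 8, 10, 24]

print(sum(pocket_money))   # Step 1, add:    55
print(len(pocket_money))   # Step 2, count:  5
print(mean(pocket_money))  # Step 3, divide: 11.0
```

Python's standard library also provides `statistics.mean`, which performs the same calculation.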

Effect of Outliers on the Mean

An outlier is a data value that is much larger or smaller than the others. Because the mean uses every value in the calculation, a single extreme outlier can significantly distort it.

Example: Five test scores: 72, 75, 78, 80, 20.

With all 5: $\bar{x} = \dfrac{72+75+78+80+20}{5} = \dfrac{325}{5} = \mathbf{65}$

Without the outlier (20): $\bar{x} = \dfrac{72+75+78+80}{4} = \dfrac{305}{4} = \mathbf{76.25}$

The outlier drags the mean from 76.25 down to 65, a difference of 11.25 marks. Here the mean of 65 is lower than four of the five scores, so it under-represents most students' performance.
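The comparison can be reproduced with a short Python sketch. It assumes the outlier (20) has already been identified by eye, as in the example; the code does not detect outliers itself.

```python
def mean(values):
    return sum(values) / len(values)


scores = [72, 75, 78, 80, 20]
outlier = 20  # chosen by inspection, not detected automatically

# Keep every score except the outlier.
without_outlier = [s for s in scores if s != outlier]

print(mean(scores))                          # 65.0
print(mean(without_outlier))                 # 76.25
print(mean(without_outlier) - mean(scores))  # 11.25
```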

When to be careful: with house prices, incomes, and any other data containing a few very large values, the mean alone can give a misleading picture.

Mean from a Frequency Table

When data is given as a frequency table, use: $$\bar{x} = \frac{\sum f \cdot x}{\sum f}$$

Add an extra column for $f \times x$ (frequency × value), then divide the total by the total frequency.

Value (x)   Frequency (f)   f × x
1           2               2
2           4               8
3           6               18
4           5               20
5           3               15
Total       20              63

$$\bar{x} = \frac{63}{20} = \mathbf{3.15}$$
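The same calculation can be sketched in Python, storing each row of the table as a (value, frequency) pair. The names `table`, `sum_fx` and `sum_f` are illustrative choices, not part of the lesson.

```python
# Each pair is (value x, frequency f), taken from the table above.
table = [(1, 2), (2, 4), (3, 6), (4, 5), (5, 3)]

sum_fx = sum(f * x for x, f in table)  # total of the f × x column
sum_f = sum(f for _, f in table)       # total frequency

print(sum_fx)          # 63
print(sum_f)           # 20
print(sum_fx / sum_f)  # 3.15
```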

Estimated Mean from Grouped Data

When data is grouped into class intervals, we cannot know exact values. Instead, we assume every value in a class sits at its midpoint.

Midpoint of an interval = $\dfrac{\text{lower bound} + \text{upper bound}}{2}$

Example: interval 20–<30 → midpoint = $\dfrac{20+30}{2} = 25$

Then use: $\bar{x} \approx \dfrac{\sum f \times \text{midpoint}}{\sum f}$. This gives an estimate of the mean, not its exact value.

Important: always state this is an estimated mean (or "approximate mean") when working with grouped data.
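As an illustration of the midpoint method, here is a minimal Python sketch. The class intervals and frequencies are hypothetical, made up for this example only, and each class is assumed to be written as (lower bound, upper bound, frequency).

```python
# Hypothetical grouped data (not from the lesson):
# (lower bound, upper bound, frequency)
classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

sum_f_mid = 0  # running total of f × midpoint
sum_f = 0      # running total of frequencies

for lower, upper, f in classes:
    midpoint = (lower + upper) / 2  # e.g. (20 + 30) / 2 = 25
    sum_f_mid += f * midpoint
    sum_f += f

estimated_mean = sum_f_mid / sum_f

print(sum_f_mid)       # 160.0
print(sum_f)           # 10
print(estimated_mean)  # 16.0
```

The result is only an estimate, because the code treats every value in a class as if it sat exactly at the midpoint.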

Common Pitfalls

  • Dividing by the number of different values instead of the total number of data points. In a frequency table, divide by $\sum f$, not by the number of rows.
  • Forgetting to multiply frequency by value before summing ($f \times x$, not just $f$ or just $x$).
  • Using class boundaries instead of midpoints for grouped data estimates.
  • Not stating "estimated mean" when the result comes from grouped data.
  • Claiming the mean is always a good summary — check for outliers first.
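The first pitfall can be caught with a quick sanity check: a mean can never be smaller than the smallest value or larger than the largest. The Python sketch below reuses the frequency table from this lesson (values 1 to 5); the names `wrong` and `right` are illustrative only.

```python
# (value x, frequency f) pairs from the frequency table in this lesson.
table = [(1, 2), (2, 4), (3, 6), (4, 5), (5, 3)]

sum_fx = sum(f * x for x, f in table)

wrong = sum_fx / len(table)                # divides by the number of rows (5)
right = sum_fx / sum(f for _, f in table)  # divides by the total frequency (20)

print(wrong)  # 12.6
print(right)  # 3.15

# A mean must lie between the smallest and largest data values.
smallest = min(x for x, _ in table)
largest = max(x for x, _ in table)
print(smallest <= wrong <= largest)  # False
print(smallest <= right <= largest)  # True
```

Dividing by the row count gives 12.6, which is larger than every value in the table, so it cannot be the mean.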

Copy This Into Your Book

$$\bar{x} = \frac{\sum x}{n} \quad \text{(raw data)}$$

$$\bar{x} = \frac{\sum f \cdot x}{\sum f} \quad \text{(frequency table)}$$

$$\bar{x} \approx \frac{\sum f \times \text{midpoint}}{\sum f} \quad \text{(grouped data — estimate only)}$$

Outliers strongly affect the mean. If outliers are present, consider whether the mean is representative.

Q1. Calculate the mean of: 4, 7, 9, 6, 14, 8.

Q2. A frequency table shows: value 1 with frequency 3, value 2 with frequency 5, value 3 with frequency 4. What is the mean?

Q3. A data set is: 2, 5, 6, 8, 9. The mean is 6. Which value, when removed, increases the mean?

Q4. Grouped data: 10–<20: 4 students, 20–<30: 9 students, 30–<40: 7 students. What is the estimated mean?

Q5. A data set has a mean of 10. A new data value of 100 is added. What happens to the mean?

Q6. Eight test scores: 62, 71, 68, 74, 70, 69, 72, 20. (a) Calculate the mean. (b) Calculate the mean without the outlier. (c) Is the mean with or without the outlier more representative of most students? Explain.

Q7. Calculate the mean from this frequency table. Show the full f × x column and working.
Values and frequencies: (3, 4), (5, 6), (7, 8), (9, 5), (11, 2). (Format: value, frequency)

Q8. Estimate the mean from this grouped frequency table (total n = 40). Show all working including midpoints and f × midpoint column.
0–<10: 6, 10–<20: 12, 20–<30: 14, 30–<40: 8.

Answers

Q6

(a) Sum = 62+71+68+74+70+69+72+20 = 506. Mean = 506÷8 = 63.25.
(b) Without 20: sum = 506−20 = 486. n = 7. Mean = 486÷7 ≈ 69.4.
(c) The mean without the outlier (≈69.4) is more representative: it lies within the range of the seven remaining scores (62–74). The mean of 63.25 is pulled down by the single low score of 20 and is lower than six of those seven scores, so it understates how most students performed.

Q7

f × x: 3×4=12, 5×6=30, 7×8=56, 9×5=45, 11×2=22. Sum(fx) = 165. Sum(f) = 25.
$\bar{x} = \dfrac{165}{25} = \mathbf{6.6}$

Q8

Midpoints: 5, 15, 25, 35. f × midpoint: 6×5=30, 12×15=180, 14×25=350, 8×35=280. Sum = 840. n = 40.
Estimated mean = $\dfrac{840}{40} = \mathbf{21}$ (estimate — grouped data).

Stretch Challenge

A class of 25 students has a mean test score of 68. Another class of 15 students has a mean test score of 72.

(a) What is the total number of marks scored by each class?
(b) What is the combined mean test score for all 40 students?
(c) Why can't you simply average 68 and 72 to get the combined mean? Explain the difference.

Mean = sum ÷ count
Outliers pull the mean toward them
Frequency table: use f × x column
Grouped data: use midpoints (estimate)
Always divide by total n, not row count
State "estimated mean" for grouped data
