Think First

You score 40, 45, 48, 50, 52, 55, 96 on 7 tests. Which number best describes your "typical" performance — the mean (about 55) or the middle value (50)? Why might the middle value be more honest?

Median and Mode: Better Measures for Messy Data

When data has outliers or is skewed, the median (middle value) and mode (most common value) often describe the data more fairly than the mean.

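[Figure: 7 Test Scores, Finding the Median. The ordered scores 40, 45, 48, 50, 52, 55, 96 are shown in positions 1st to 7th. The 4th value, 50, is marked as the median, with 3 values below and 3 values above; 96 is marked as an outlier.]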

What You'll Master

  • Find the median of odd-count and even-count data sets
  • Explain why the median is resistant to outliers
  • Find the mode, including bimodal data sets
  • Choose the most appropriate measure of centre for a given context
  • Use mean, median, and mode together to describe data

Words You Need

median: The middle value when data is arranged in order. At most half the values are below it and at most half are above it.
mode: The value that appears most often. A data set can have no mode, one mode, or more than one mode (two modes makes it bimodal).
bimodal: A data set in which two values tie for the highest frequency.
ordered data: Data arranged from smallest to largest. You must order data before finding the median.
measure of centre: A single value that summarises the "middle" of a data set. Mean, median, and mode are all measures of centre.
resistant: Not strongly affected by outliers. The median is resistant; the mean is not.

⚠ Spot the Trap

Must order first: The median only works on data that has been sorted from smallest to largest. Finding the "middle" of unordered data gives the wrong answer.

Even n — average the two middle values: For 8 values, the median is the average of the 4th and 5th values, not just the 4th. Many students forget to average.

Mode is not always meaningful: If every value appears exactly once, there is no mode. Don't invent one.

Finding the Median

Step 1: Order the data from smallest to largest.

Step 2: Find the middle position.

  • Odd n: position = $\dfrac{n+1}{2}$. The value at this position is the median.
  • Even n: positions $\dfrac{n}{2}$ and $\dfrac{n}{2}+1$. Median = average of those two values.

Odd example (n = 5): Data: 3, 7, 9, 12, 15. Position = (5+1)/2 = 3rd. Median = 9.

Even example (n = 6): Data: 3, 5, 7, 9, 11, 14. Positions 3 and 4. Median = (7+9)/2 = 8.

The median does not have to be a value in the data set (as seen above — 8 is not in the list).
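
The two steps above can also be sketched in Python. This is only an illustration of the position rules, not a required method; the function name `median_by_position` is made up for this example.

```python
def median_by_position(values):
    """Return the median using the position rules from this lesson."""
    ordered = sorted(values)  # Step 1: order the data
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the median is at position (n + 1) / 2, counting from 1.
        position = (n + 1) // 2
        return ordered[position - 1]  # Python lists count from 0
    # Even n: average the values at positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median_by_position([3, 7, 9, 12, 15]))     # odd example: 9
print(median_by_position([3, 5, 7, 9, 11, 14]))  # even example: 8.0
```

Because the function sorts its input first, it gives the same answer even if the data is passed in unordered.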

Why Median Handles Outliers Better

Data: 40, 45, 48, 50, 52, 55, 96 (n = 7, ordered).

Mean: Sum = 386. $\bar{x} = 386 \div 7 \approx 55.1$

Median: Position = (7+1)/2 = 4th value = 50.

Without the outlier, the other six scores have a mean of 290 ÷ 6 ≈ 48.3. Including 96 raises the mean to about 55.1, a shift of nearly 7 marks. The median, however, stays at exactly 50 whether the last value is 96 or 9600. The median is resistant to outliers.

For these 7 test scores, 50 (median) is the more honest description of "typical" performance: 5 of the 7 scores lie between 45 and 55 inclusive, while the mean (about 55.1) sits above every score except the outlier.
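
To see the resistance for yourself, here is a small Python check using the standard `statistics` module. The second list is the "what if the last value were 9600" case mentioned above; the variable names are chosen just for this sketch.

```python
from statistics import mean, median

scores = [40, 45, 48, 50, 52, 55, 96]
extreme = [40, 45, 48, 50, 52, 55, 9600]  # same data, outlier made far larger

for label, data in [("with 96", scores), ("with 9600", extreme)]:
    # The mean changes dramatically; the median stays the same.
    print(label, "-> mean:", round(mean(data), 1), "median:", median(data))
```

The mean jumps from about 55.1 to about 1412.9, while the median is 50 in both cases.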

Finding the Mode

The mode is the value that appears most often. Count the frequency of each value — the one with the highest count is the mode.

One mode: 2, 3, 3, 4, 5, 5, 5, 7 → mode = 5 (appears 3 times).

Bimodal: 2, 2, 3, 3, 4 → modes = 2 and 3 (each appears twice, so the data set is bimodal).

No mode: 1, 2, 3, 4, 5 → each appears once, so there is no mode.

The mode is the only measure of centre that works for categorical data: the modal colour of cars in a car park could be "white" — you can't find the mean or median of colours.

In a frequency table: the mode is the value with the highest frequency.
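
Counting frequencies is easy to sketch in Python with `collections.Counter`, which builds the frequency table for you. The function name `modes` is made up for this example, and it follows this lesson's convention that data in which every value appears once has no mode. In the last call, only "white" comes from the car park example; the other colours are invented for illustration.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)          # frequency table: value -> count
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                     # every value appears once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 4, 5, 5, 5, 7]))  # one mode
print(modes([2, 2, 3, 3, 4]))           # two modes (bimodal)
print(modes([1, 2, 3, 4, 5]))           # no mode
print(modes(["white", "red", "white", "blue"]))  # categorical data
```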

Choosing the Right Measure of Centre

Measure | Best used when… | Weakness
Mean | Data is symmetric, no outliers | Pulled strongly by outliers
Median | Data is skewed or has outliers (house prices, incomes) | Ignores the size of extreme values
Mode | Categorical data, or "most popular" value needed | May not exist; ignores other values

Real-world example: Why do you think newspaper articles about house prices usually report the median sale price rather than the mean? (A few luxury mansions would inflate the mean, making typical houses seem more expensive than they are.)

Common Pitfalls

  • Forgetting to sort data before finding the median.
  • For even n, taking the lower of the two middle values instead of averaging both.
  • Reporting a mode when every value appears only once — say "no mode" instead.
  • Using mode for numerical data when it gives an unrepresentative answer.
  • Assuming the median is always a value in the original data set — it may be the average of two values and therefore not in the list.

Copy This Into Your Book

Median: Order data. Odd n → position $\frac{n+1}{2}$. Even n → average of positions $\frac{n}{2}$ and $\frac{n}{2}+1$.

Mode: Most frequent value. Can be none, one, or more than one (two modes = bimodal).

Resistant: Median barely moves when outliers are added. Mean is strongly affected.

Choose: mean for symmetric data; median for skewed/outlier data; mode for categorical or "most popular".

Q1. Find the median of: 11, 5, 9, 3, 15, 7, 13.

Q2. Find the median of: 14, 4, 10, 6, 12, 8.

Q3. Find the mode of: 3, 7, 5, 7, 2, 9, 7, 4, 5.

Q4. A real estate report says the average house price in a suburb is $1.4 million, but most houses sell for around $900 000. Which measure of centre is most likely being reported, and what should be used instead?

Q5. Which of the following data sets is bimodal?
A. 1, 2, 3, 4, 5
B. 2, 2, 3, 4, 4, 5
C. 1, 1, 1, 2, 3
D. 5, 6, 7, 8, 9

Q6. Find the median and mode of: 14, 9, 7, 14, 11, 8, 6, 14, 10, 9. Show all working including ordering the data.

Q7. A data set of 5 values has a mean of 6.2. Four of the values are: 4, 7, 8, 5. Find the missing fifth value. Show your working using the mean formula.

Q8. A data set of house sale prices (in $000s): 420, 450, 460, 470, 480, 490, 820. Calculate both the mean and the median. Which measure better represents the typical house price? Explain why, referring to the shape of the data.

Answers

Q6

Ordered: 6, 7, 8, 9, 9, 10, 11, 14, 14, 14.
n = 10 (even). Median = average of 5th and 6th values = (9 + 10) ÷ 2 = 9.5.
Mode = 14 (appears 3 times — most frequent).

Q7

Mean = 6.2, n = 5. Total sum = 6.2 × 5 = 31.
Known values: 4 + 7 + 8 + 5 = 24.
Missing value = 31 − 24 = 7.
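
The same working can be checked with a short Python sketch. The function name `missing_value` and its parameter names are made up for this example; it assumes exactly one value is missing.

```python
def missing_value(target_mean, n, known_values):
    """Find the one missing value given the mean and the other n - 1 values."""
    total = target_mean * n            # mean × n gives the total sum
    return total - sum(known_values)   # subtract the values already known


# Rounded to tidy up any tiny floating-point error from 6.2 × 5.
print(round(missing_value(6.2, 5, [4, 7, 8, 5]), 2))
```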

Q8

Sum = 420+450+460+470+480+490+820 = 3590. n = 7.
Mean = 3590 ÷ 7 ≈ 512.9 (in $000s), which is about $512 900.
Ordered (already ordered). Median = 4th value = $470 000.
The median ($470 000) is more representative. The data is skewed right — six of the seven prices are between $420k and $490k, but the $820k outlier pulls the mean up to $513k. The median is not affected by this extreme value and better represents what most buyers actually paid.

Stretch Challenge

A data set has 9 values. When a 10th value of 100 is added, the median does not change but the mean increases by 5.

(a) What was the original mean? (Hint: if adding 100 increases the mean by 5, you can set up an equation.)
(b) Where does the value 100 sit when the 10 values are ordered? Does this make sense given that the median didn't change?
(c) If the original median of the 9 values is the 5th value = 40, and the new median of 10 values is the average of the 5th and 6th values = 40, what can you say about the 6th value in the ordered set of 10?

Always sort before finding median
Odd n: median = middle value
Even n: median = average of two middle values
Mode = most frequent value (can be bimodal)
Median is resistant to outliers; mean is not
Use median for skewed/outlier data
