Normal Distribution Applications & Module Review

Apply everything — normal distribution, empirical rule, and z-scores — to real HSC-style problems.

MS-S5 Lesson 12 ~40 min

Think First

Without looking at your notes: list the 6 key ideas from MS-S4 (bivariate) and the 4 key ideas from MS-S5 (normal distribution). Which do you feel most confident about? Which needs more review?

See key ideas

MS-S4: Scatterplots, describing correlation (direction/strength), r value, regression line y = a + bx (interpret a and b), predictions (interpolation/extrapolation) and causation vs correlation.

MS-S5: Normal distribution features (bell curve, mean = median = mode), empirical rule (68–95–99.7), z-scores ($z = (x - \mu) \div \sigma$) and comparison across datasets.

Learning Intentions

Apply normal distribution concepts to real-world and HSC-style contexts
Combine the empirical rule and z-scores in multi-step problems
Identify which tool (empirical rule vs z-score) is needed for a given question
Consolidate all Module 5 skills for exam readiness

Key Terms

Quality control

Using the normal distribution and z-scores to determine whether manufactured items fall within acceptable bounds.

Percentile

The value below which a given percentage of the population lies. The mean of a normal distribution is the 50th percentile, because 50% of values lie below it.

Standardised score

Another name for a z-score — a value expressed in standard deviation units.

Module 5 framework

MS-S4: bivariate analysis (scatterplot → r → regression → causation). MS-S5: normal distribution (bell curve → empirical rule → z-scores).

Tool Selection: Empirical Rule vs z-score

The biggest challenge in Module 5 is knowing which tool to use. Here is the decision guide:

Situation	Use
The value is exactly 1, 2, or 3 standard deviations from the mean	Empirical rule (68/95/99.7)
The value is a non-integer number of SDs from the mean	z-score formula
Comparing results from two different datasets	z-scores
Finding percentage of data in a symmetric interval about μ	Empirical rule
Determining whether a value is unusual	Either (|z| > 2 rule)
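The first two rows of the decision table can be sketched in Python. This is only an illustration: the function names `z_score` and `suggest_tool` are made up for this sketch, and the test values are the bottle volumes from the quality control scenario in this lesson (μ = 750 mL, σ = 6 mL).

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def suggest_tool(x, mu, sigma, tol=1e-9):
    """Suggest a tool for a single value, following the decision table.

    Hypothetical helper: returns "empirical rule" when x is exactly
    1, 2 or 3 standard deviations from the mean, otherwise "z-score formula".
    """
    z = z_score(x, mu, sigma)
    if any(abs(abs(z) - k) < tol for k in (1, 2, 3)):
        return "empirical rule"
    return "z-score formula"


print(suggest_tool(762, 750, 6))  # 762 is exactly 2 SDs above the mean
print(suggest_tool(765, 750, 6))  # 765 is 2.5 SDs above the mean
```

Comparing results from two different datasets is not covered by this helper; that case calls for z-scores regardless of how many standard deviations are involved.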

Book Notes

Copy the decision table. Highlight: "comparing across datasets = always z-score".

Quick check: You need to compare a student's Biology result (μ=65, σ=9) with their Chemistry result (μ=71, σ=7). Which tool should you use?

Application: Quality Control

The normal distribution is widely used in manufacturing to set quality thresholds.

Scenario: A machine fills bottles with a target volume of 750 mL. Volumes are normally distributed with μ = 750 mL, σ = 6 mL. Bottles outside the range 738–762 mL are rejected.

738 = 750 − 12 = μ − 2σ and 762 = 750 + 12 = μ + 2σ.
By the empirical rule, 95% of bottles fall within this range → 5% are rejected.
A bottle containing 765 mL: z = (765 − 750) ÷ 6 = 2.5. Since |z| = 2.5 > 2, this bottle is unusual and would be rejected.
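The three steps above can be checked with a short Python sketch. The names `z_score` and `is_unusual` are illustrative only, and the threshold of 2 is the |z| > 2 rule used in this lesson.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def is_unusual(x, mu, sigma, threshold=2):
    """Apply the |z| > 2 rule for unusual values."""
    return abs(z_score(x, mu, sigma)) > threshold


mu, sigma = 750, 6                 # bottle volumes in mL
lower = mu - 2 * sigma             # 738 mL
upper = mu + 2 * sigma             # 762 mL

volume = 765
z = z_score(volume, mu, sigma)     # (765 - 750) / 6 = 2.5
accepted = lower <= volume <= upper

print(f"Acceptable range: {lower} to {upper} mL")
print(f"z = {z}, unusual: {is_unusual(volume, mu, sigma)}, accepted: {accepted}")
```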

HSC tip: Quality control questions almost always test whether a value falls within 2σ and whether it should be accepted or rejected. Always show the z-score calculation.

Book Notes

Note: "quality control = check if |z| ≤ 2 (accept) or |z| > 2 (unusual/reject)". Write a 3-line worked template.

Quick check: In a quality control scenario (μ = 100 g, σ = 4 g), a product weighs 108 g. Should it be rejected as unusual?

Application: Using z-scores to Find Counts

Combining percentages from the empirical rule or z-scores with the total population size gives you the expected count.

Worked example: A school of 800 students sits a maths test. Results are normally distributed with μ = 62, σ = 10. How many students scored above 82?

82 = 62 + 20 = μ + 2σ.
By empirical rule, 5% of data is outside 2σ, so 2.5% is above μ + 2σ.
Expected count = 2.5% × 800 = 0.025 × 800 = 20 students.
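As a minimal sketch, the same count can be computed in Python. The lookup table `UPPER_TAIL` and the function `expected_count_above` are names invented for this example; the percentages are the one-sided values from the empirical rule.

```python
# Proportion of data more than k standard deviations above the mean,
# taken from the empirical rule (one-sided values).
UPPER_TAIL = {1: 0.16, 2: 0.025, 3: 0.0015}


def expected_count_above(x, mu, sigma, n):
    """Expected number of values above x, where x is exactly 1, 2 or 3 SDs above the mean."""
    k = (x - mu) / sigma
    if k not in UPPER_TAIL:
        raise ValueError("x must be exactly 1, 2 or 3 standard deviations above the mean")
    return UPPER_TAIL[k] * n


students_above_82 = expected_count_above(82, 62, 10, 800)
print(round(students_above_82))  # 20
```

The function deliberately raises an error for boundaries such as 55, which is not a whole number of standard deviations from the mean, because the empirical rule does not apply there.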

Worked example 2 (z-score approach): How many scored between 55 and 62?

z = (55 − 62) ÷ 10 = −0.7 at the lower boundary and z = 0 at the upper boundary (the mean).
Because −0.7 is not a whole number of standard deviations, the empirical rule cannot give this percentage. Finding it requires a z-table, which is beyond the scope of MS-S5.

Scope note: Maths Standard only requires the empirical rule and z-scores for whole-number multiples of σ, or comparing relative performance. The full z-table is not required.
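Extension only (not required for Maths Standard): for readers curious what a z-table would give in worked example 2, Python's standard library class `statistics.NormalDist` can compute the area directly. This is a sketch for interest, assuming the marks follow the normal model exactly.

```python
from statistics import NormalDist

# Test marks modelled as normal with mean 62 and standard deviation 10.
marks = NormalDist(mu=62, sigma=10)

# Proportion of students scoring between 55 and 62.
proportion = marks.cdf(62) - marks.cdf(55)
expected = proportion * 800

print(round(proportion, 3))  # about 0.258
print(round(expected))       # about 206 students
```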

Book Notes

Write: Count = (percentage from empirical rule) × N. Example: 2.5% of 1200 = 30.

Quick check: In a group of 400 people, heights are N(170, 8²). How many people would you expect to be taller than 186 cm?

Module 5 Summary: MS-S4 Bivariate Data

Quick-reference checklist for exam preparation:

Scatterplot: plot (x, y) pairs; identify direction and form visually.
Correlation description: state direction (positive/negative/none), strength (strong/moderate/weak), and form (linear).
Pearson's r: range −1 to +1; sign = direction; magnitude = strength.
Regression line: y = a + bx; interpret a (y-intercept in context) and b (gradient in context).
Prediction: substitute x into equation; state interpolation/extrapolation; comment on reliability.
Causation: correlation ≠ causation; state this explicitly whenever r is strong.
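The prediction step in the checklist can be sketched as follows. The line y = 8 + 1.5x is the one used in this lesson's quick check; the data range of 0 to 10 study hours is a hypothetical assumption added only so the interpolation check has something to test against.

```python
def predict(a, b, x):
    """Predicted y from the regression line y = a + bx."""
    return a + b * x


def prediction_type(x, x_min, x_max):
    """Interpolation if x is inside the observed data range, otherwise extrapolation."""
    return "interpolation" if x_min <= x <= x_max else "extrapolation"


a, b = 8, 1.5          # regression line y = 8 + 1.5x
x_min, x_max = 0, 10   # hypothetical range of study hours in the data

for hours in (6, 15):
    marks = predict(a, b, hours)
    print(hours, marks, prediction_type(hours, x_min, x_max))
```

An extrapolated prediction is still computed the same way; the label is a reminder to comment on its reliability.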

Book Notes

Copy this 6-point checklist. Add a tick beside any you feel confident about and a circle beside any you need to review.

Quick check: In the regression line y = 8 + 1.5x (x = study hours, y = marks), what does the value 8 represent?

Module 5 Summary: MS-S5 Normal Distribution

Quick-reference checklist:

Normal distribution features: symmetric, bell-shaped; mean = median = mode; total area = 1; asymptotic tails.
Effect of μ and σ: μ shifts the curve; σ controls its spread.
Empirical rule: 68% within 1σ; 95% within 2σ; 99.7% within 3σ.
One-sided percentages: 16% below μ − σ; 2.5% below μ − 2σ; 0.15% below μ − 3σ (and symmetrically above).
z-score: $z = (x - \mu) \div \sigma$ and $x = \mu + z\sigma$.
Unusual values: |z| > 2.
Comparison: compare z-scores, not raw scores, across different distributions.
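As a rough check of this checklist, the sketch below implements both forms of the z-score formula and uses Python's standard library class `statistics.NormalDist` to confirm the empirical rule percentages. The function names `z_score` and `raw_score` are illustrative; the empirical rule values are rounded, so the computed figures differ slightly in the first decimal place.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """z = (x - mu) / sigma"""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """x = mu + z * sigma"""
    return mu + z * sigma


standard = NormalDist()  # standard normal: mean 0, standard deviation 1

# Percentage of data within k standard deviations of the mean.
within = {k: 100 * (standard.cdf(k) - standard.cdf(-k)) for k in (1, 2, 3)}

for k, pct in within.items():
    print(f"within {k} SD: {pct:.1f}%")  # 68.3%, 95.4%, 99.7%
```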

Book Notes

Write the MS-S5 checklist from memory. Check against this card. Circle anything you missed.

Quick check: Approximately what percentage of normally distributed data lies below μ − 2σ?

Activities

Activity 1 — Integrated Problem

A running club has 500 members. Their weekly training distances (km) are approximately normally distributed with μ = 35 km and σ = 6 km. Answer all parts.

What percentage of members train between 23 km and 47 km per week?
How many members train more than 41 km per week?
A member trains 26 km. Calculate their z-score and state whether this is unusual.
Another member has z = 1.8. What is their training distance?

See answers

23 = 35 − 12 = μ − 2σ and 47 = 35 + 12 = μ + 2σ. Within 2σ → 95%.
41 = 35 + 6 = μ + σ. Percentage above μ + σ = 32% ÷ 2 = 16%. Count = 0.16 × 500 = 80 members.
z = (26 − 35) ÷ 6 = −9 ÷ 6 = −1.5. |z| = 1.5 < 2, so this is not unusual.
x = 35 + 1.8 × 6 = 35 + 10.8 = 45.8 km.
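The four answers can be checked with a few lines of Python. This is a sketch for self-checking only; the variable names are arbitrary and the 16% figure is the one-sided empirical rule value used in the answer above.

```python
mu, sigma, n = 35, 6, 500  # weekly training distance in km, 500 members

# Part 1: how many SDs from the mean are 23 km and 47 km?
k_low = (23 - mu) / sigma    # -2.0
k_high = (47 - mu) / sigma   # 2.0, so the interval is within 2 SDs: 95%

# Part 2: 41 km is 1 SD above the mean, so 16% of members train more.
count_above_41 = round(0.16 * n)

# Part 3: z-score for 26 km and the |z| > 2 test.
z_26 = (26 - mu) / sigma
unusual = abs(z_26) > 2

# Part 4: convert z = 1.8 back to a distance.
distance = round(mu + 1.8 * sigma, 1)

print(k_low, k_high)
print(count_above_41)
print(z_26, unusual)
print(distance)
```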

Activity 2 — Module 5 Mixed Review

Answer these module-wide questions covering both MS-S4 and MS-S5.

A scatterplot shows a strong negative linear correlation. The regression line is y = 90 − 3.2x (x = hours of screen time, y = hours of sleep). Interpret the gradient and y-intercept, and predict sleep time for x = 5.
A dataset is approximately normally distributed. What would the histogram look like?
In a distribution with μ = 120 and σ = 15, what percentage of values lie between 105 and 135?

See answers

Gradient −3.2: for each additional hour of screen time, sleep time is predicted to decrease by 3.2 hours. y-intercept 90: with zero screen time, predicted sleep time is 90 hours (contextually meaningless — indicates this line is only valid near the data range). Prediction for x = 5: y = 90 − 3.2(5) = 90 − 16 = 74 hours (check reasonableness; if data range includes x = 5, this is interpolation).
The histogram would be approximately bell-shaped (symmetric), with most bars near the centre and smaller bars tapering off symmetrically on both sides.
105 = 120 − 15 = μ − σ and 135 = 120 + 15 = μ + σ. Within 1σ → 68%.

Multiple Choice

1. A factory produces components with length N(50, 4²). The acceptable range is 42–58 mm. Approximately what percentage of components will be rejected?

A. 32%
B. 5%
C. 0.3%
D. 2.5%

Answer

B. 42 = 50 − 8 = μ − 2σ and 58 = 50 + 8 = μ + 2σ. 95% are accepted, so 5% are rejected.

2. A class of 200 students sits a test where marks are N(68, 10²). Approximately how many students scored above 78?

Answer

A. 78 = 68 + 10 = μ + σ. Percentage above μ + σ = 16%. Count = 0.16 × 200 = 32 students.

3. The regression line is y = 20 + 3x. Which interpretation of the gradient is correct (x = hours exercise, y = calories burned)?

A. 20 calories are burned when no exercise occurs
B. For each calorie burned, exercise increases by 3 hours
C. For each additional hour of exercise, 3 more calories are predicted to be burned
D. The correlation coefficient is 3

Answer

C. The gradient b = 3 means that for each extra hour of exercise, predicted calories burned increase by 3. (Note: 20 is the y-intercept, not the gradient.)

4. Anya scores z = 1.6 in French and z = 1.9 in History. Which statement is correct?

A. Anya performed better in French because languages are harder
B. Anya performed better in History because she was further above average
C. Anya performed equally well in both subjects
D. Cannot compare without knowing the raw scores

Answer

B. A higher z-score means a better relative performance. z = 1.9 in History > z = 1.6 in French.

5. Which of the following statements about the normal distribution is FALSE?

A. The total area under the curve is 1
B. The distribution is symmetric about the mean
C. The mean and median are equal
D. The curve is highest at the standard deviation, not the mean

Answer

D. The curve is highest at the mean (μ), not at the standard deviation. D is false.

Short Answer

SAQ 1. A large study finds that systolic blood pressure in healthy adults is approximately normally distributed with μ = 120 mmHg and σ = 12 mmHg. (a) Between what values does the middle 95% of blood pressures lie? (b) What is the z-score for a blood pressure of 150 mmHg? Is this unusual? (c) A doctor says that blood pressures above 144 mmHg are "high". What percentage of healthy adults would exceed this threshold?

See answer

(a) μ − 2σ = 120 − 24 = 96 mmHg and μ + 2σ = 120 + 24 = 144 mmHg. Middle 95%: 96 to 144 mmHg.

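(b) z = (150 − 120) ÷ 12 = 30 ÷ 12 = 2.5. Since |z| = 2.5 > 2, this is unusual.

(c) 144 = 120 + 24 = μ + 2σ. By the empirical rule, 5% of values lie outside 2σ, so 2.5% lie above μ + 2σ. About 2.5% of healthy adults would exceed this threshold.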

SAQ 2. The table shows data on daily temperature (°C) and ice cream sales ($). The regression line is y = −500 + 80x, and r = 0.91. (a) Describe the correlation. (b) Interpret the gradient and y-intercept in context. (c) Predict sales on a 30°C day and comment on reliability. (d) A journalist writes: "Hot weather causes ice cream sales to soar." Comment on this claim using statistical terminology.

See answer

(a) Strong positive linear correlation (r = 0.91).

(b) Gradient 80: for each additional degree Celsius, daily ice cream sales are predicted to increase by $80. y-intercept −500: when temperature is 0°C, predicted sales are −$500, which is not meaningful in context.

(c) y = −500 + 80(30) = −500 + 2400 = $1900. If 30°C is within the data range, this is interpolation and is likely to be a reliable prediction. If outside the data range, it is extrapolation and may be unreliable.

(d) While there is a strong positive correlation (r = 0.91) between temperature and ice cream sales, correlation does not prove causation. A high r value only shows association. Other factors (e.g., school holidays, outdoor events) may be confounding variables.

Full Answers

MC 1: B | MC 2: A | MC 3: C | MC 4: B | MC 5: D

SAQ 1: (a) 96–144 mmHg; (b) z = 2.5, unusual; (c) 2.5%.

SAQ 2: (a) strong positive linear; (b) gradient $80/°C, intercept not meaningful; (c) $1900 — reliability depends on data range; (d) correlation ≠ causation, confounding variables possible.

Revisit

You have completed Module 5 Statistical Analysis. Can you write a full bivariate analysis AND solve a normal distribution problem involving the empirical rule and z-scores — from memory, under exam conditions?
