Lesson 20 ~30 min Unit 4 · Data & Probability +90 XP

Data and Chance Synthesis

Connect the data cycle with probability. Use relative frequency as experimental probability and expected frequency to make real predictions.

Today's hook: You've learned data collection, graphs, statistics, AND probability. Now it's time to see how they ALL connect. Data reveals patterns → probability predicts future outcomes → statistics measures certainty. This is what data scientists do every day.



Think First

warm-up

A school surveys 200 students. Exactly 80 say their favourite subject is Maths. Based on this data alone, if you picked a random student from the school, what do you think the probability that they prefer Maths would be? Write your estimate and explain your thinking.

Record your answer in your workbook.

The Big Idea

+5 XP

Statistics (data) and probability (chance) are two sides of the same coin. The relative frequency you calculate from data is the experimental probability. Together, data and probability power medicine, sport, finance, and science.

When you collect data, you are observing what happened. When you calculate a relative frequency, you are measuring how often something occurred. That proportion is exactly the experimental probability: a number between 0 and 1 that tells you how likely that outcome is, based on evidence. As you collect more data, your experimental probability tends to get closer to the true (theoretical) probability.

Relative frequency = Experimental probability

Data gives evidence

Relative frequency from data estimates the probability of future events.

More trials = better estimate

The law of large numbers says that relative frequency tends towards the true probability as the number of trials grows.

Both disciplines connect

Data scientists use both every day to make real decisions.

What You'll Master

objectives

Know

The five stages of the PPDAC data cycle
That relative frequency from data equals experimental probability
The expected frequency formula: $E = P(event) \times n$

Understand

Why data and probability are connected, not separate topics
How increasing sample size improves probability estimates
The difference between analysis (what the data shows) and conclusion (what it means)

Can Do

Plan a statistical investigation using the PPDAC cycle
Calculate experimental probability from survey or frequency data
Use $E = P \times n$ to predict expected frequencies in future trials

Words You Need

vocabulary

PPDAC cycle: The five-stage data investigation process (Problem, Plan, Data, Analysis, Conclusion).

Statistical inference: Using sample data to draw conclusions about a larger population.

Relative frequency: The proportion of times an outcome occurs; frequency divided by total trials.

Probability model: A mathematical description of a chance process using probabilities for each outcome.

Expected frequency: The number of times an event is predicted to occur: $E = P(event) \times n$.

Variation: Natural differences in outcomes that occur even when the true probability is fixed.

Spot the Trap

heads-up

Wrong: "Data and probability are totally different topics, you study them separately and never mix them." Many students treat these as unrelated chapters.

Right: Relative frequency from real data is experimental probability. Every time you write a fraction of outcomes from a table, you are calculating a probability.

Wrong: "If P(win) = 0.4 and I play 10 games, I will definitely win exactly 4." Expected frequency is a prediction, not a guarantee.

Right: $E = P \times n$ gives the expected average over many repetitions. In any single set of 10 games, variation means you might win 2, 3, 5, or 6.

The Data Cycle, PPDAC

+5 XP

Every statistical investigation follows the PPDAC cycle: five ordered stages that take you from a question to a conclusion. Skipping stages leads to poor or misleading results.

Problem: Identify the question you want to answer (e.g. "Do Year 7 students get enough sleep?").
Plan: Decide what data to collect, how, and from whom.
Data: Collect and record your data.
Analysis: Calculate statistics, draw graphs, find patterns.
Conclusion: Answer the original question with evidence, and be honest about limitations.

Problem → Plan → Data → Analysis → Conclusion

Start with the question

A sharp problem statement guides every decision that follows.

Analysis ≠ Conclusion

Analysis describes what the data shows. Conclusion answers your question.

Acknowledge limitations

A good conclusion states what you cannot claim from the data.

Relative Frequency as Probability

+5 XP

When you calculate a relative frequency from survey or experiment data, you are directly measuring experimental probability. The formula is identical: divide the count of a specific outcome by the total number of observations.

Suppose you survey 120 students and 36 prefer basketball. Relative frequency = 36 ÷ 120 = 0.3. This means the experimental probability that a randomly chosen student prefers basketball is 0.3 (or 30%). If we could ask every student in the population, the relative frequency would be the true probability. We rarely have that luxury, so relative frequency from a good sample is our best estimate.

P(event) = frequency ÷ total

Check totals add to 1

All relative frequencies in a table should sum to exactly 1 (or 100%).

Sample size matters

A larger survey gives a more reliable probability estimate.

Express as fraction or decimal

Either form is correct; choose whichever the question asks for.
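The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `relative_frequencies` and the "other" category (the remaining 84 of the 120 students) are assumptions added for the example, not part of the lesson.

```python
from fractions import Fraction

def relative_frequencies(counts):
    """Return each outcome's relative frequency (frequency / total) as an exact fraction."""
    total = sum(counts.values())
    return {outcome: Fraction(count, total) for outcome, count in counts.items()}

# From the text: 36 of 120 surveyed students prefer basketball.
# "other" is an assumed label for the remaining 120 - 36 = 84 students.
survey = {"basketball": 36, "other": 84}
rel_freq = relative_frequencies(survey)

print(rel_freq["basketball"])          # 3/10
print(float(rel_freq["basketball"]))   # 0.3
print(sum(rel_freq.values()) == 1)     # True: the relative frequencies sum to 1
```

Using fractions keeps the sum exactly 1; with rounded decimals the total can be slightly off.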

Using Probability to Predict, Expected Frequency

+5 XP

Once you know the probability of an event, you can predict how many times it will occur in a given number of trials. This is called the expected frequency.

The formula is simple: Expected frequency = P(event) × n, where n is the number of trials. If P(head) = 0.5 and you flip a coin 80 times, you expect $0.5 \times 80 = 40$ heads. This is not a guarantee: because of natural variation you might get 37 or 43, but 40 is your best prediction.

$E = P(event) \times n$

E is a prediction

Expected frequency will often not be a whole number, and that's fine.

Larger n, less relative variation

In 10 flips you might get 8 heads (80%); in 1000 flips, the count will very likely be close to 500 (50%).

Works both ways

Use theoretical or experimental probability in the formula; both give valid predictions.
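As a minimal sketch, the formula $E = P(event) \times n$ can be written as a small Python function. The name `expected_frequency` and the second example (P = 0.25 over 10 trials) are assumptions chosen for illustration.

```python
def expected_frequency(p, n):
    """Expected frequency E = P(event) * n for n trials."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must be between 0 and 1")
    return p * n

# Coin example from the text: P(head) = 0.5, 80 flips.
print(expected_frequency(0.5, 80))    # 40.0

# An assumed extra example: the result does not have to be a whole number.
print(expected_frequency(0.25, 10))   # 2.5
```

The value returned is a prediction of the long-run average, so a single experiment will usually land near it rather than exactly on it.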

Watch Me Solve It · 3 examples

Watch Me Solve It · Plan with PPDAC

+15 XP per step

PROBLEM

A student wants to investigate whether Year 7 students spend more than 3 hours per day on screens. Plan this investigation using the PPDAC cycle.

1

Problem: State the question clearly

Do Year 7 students at our school spend more than 3 hours per day on screens?

A clear question focuses your whole investigation. This one is specific (Year 7), measurable (hours per day), and has a comparison point (3 hours).
2

Plan: Decide how to collect data

Survey 30 Year 7 students at random. Ask: "How many hours did you spend on screens yesterday?" Record to the nearest half-hour.

Random selection helps avoid bias. Larger samples are better, but 30 is reasonable for a class project. Recording to the nearest 0.5 h keeps the data manageable without losing much accuracy.
3

Data, Analysis, Conclusion (outline)

Data: Record hours in a table. Analysis: Calculate mean and find proportion above 3 h. Conclusion: If mean > 3 or more than 50% exceed 3 h, answer "yes" with evidence.

The conclusion must directly answer the problem question. Acknowledge that "yesterday" may not represent typical screen use.

Answer: Problem: Do Y7 students screen-time > 3 h/day? Plan: Survey 30 random Y7s. Data: Record hours. Analysis: Find mean & proportion > 3 h. Conclusion: Answer question with evidence; acknowledge limitations.
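The Analysis stage of this plan could be sketched as below. The ten screen-time values are HYPOTHETICAL, invented purely to show the calculation (the plan itself calls for 30 real responses), and the variable names are assumptions.

```python
from statistics import mean

# HYPOTHETICAL survey responses (hours on screens yesterday, to the nearest 0.5 h).
# Invented for illustration only; a real investigation would use collected data.
hours = [2.0, 3.5, 4.0, 1.5, 5.0, 3.0, 2.5, 4.5, 3.5, 6.0]
threshold = 3.0  # the comparison point from the problem statement

# Analysis: the two statistics named in the plan.
mean_hours = mean(hours)
proportion_above = sum(h > threshold for h in hours) / len(hours)

print(f"mean = {mean_hours:.2f} h")
print(f"proportion above {threshold} h = {proportion_above:.2f}")

# Conclusion rule from the worked example: "yes" if the mean exceeds 3 h
# or more than 50% of students exceed 3 h.
answer_yes = mean_hours > threshold or proportion_above > 0.5
print("Evidence suggests more than 3 h/day:", answer_yes)
```

The printed numbers are only the analysis. The conclusion still has to be written in words, including limitations such as "yesterday" not being a typical day.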

Watch Me Solve It · Relative Frequency as Probability

+15 XP per step

PROBLEM

A survey of 150 people asked which type of movie they prefer. Results: Action 45, Comedy 60, Drama 30, Horror 15. Find P(Comedy) and P(not Comedy) from this data.

1

Confirm the total

45 + 60 + 30 + 15 = 150

Always verify the total before calculating. It should match the stated sample size of 150.
2

Calculate P(Comedy)

P(Comedy) = 60 ÷ 150 = 0.4 = 2/5

This is the relative frequency of Comedy. It equals the experimental probability that a randomly chosen person from this group prefers Comedy.
3

Calculate P(not Comedy) using the complement

P(not Comedy) = 1 − P(Comedy) = 1 − 0.4 = 0.6

Or directly: (45 + 30 + 15) ÷ 150 = 90 ÷ 150 = 0.6. Both methods agree, which confirms our calculation.

Answer: P(Comedy) = 0.4; P(not Comedy) = 0.6
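The three steps of this example can be checked with a short Python sketch. It uses the survey counts from the problem; the variable names are assumptions made for the illustration.

```python
from fractions import Fraction

# Survey results from the problem statement.
counts = {"Action": 45, "Comedy": 60, "Drama": 30, "Horror": 15}

# Step 1: confirm the total matches the stated sample size.
total = sum(counts.values())
print(total)  # 150

# Step 2: relative frequency of Comedy = experimental probability.
p_comedy = Fraction(counts["Comedy"], total)
print(p_comedy, float(p_comedy))  # 2/5 0.4

# Step 3: P(not Comedy) two ways: by the complement, and by counting directly.
p_not_comedy_complement = 1 - p_comedy
p_not_comedy_direct = Fraction(total - counts["Comedy"], total)
print(p_not_comedy_complement == p_not_comedy_direct)  # True
print(float(p_not_comedy_direct))  # 0.6
```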

Watch Me Solve It · Predicting with Expected Frequency

+15 XP per step

PROBLEM

A spinner has 8 equal sections: 3 are red and 5 are blue. If you spin it 240 times, how many times would you expect to get red?

1

Find the theoretical probability

P(red) = 3/8 = 0.375

There are 3 red sections out of 8 equal sections, so P(red) = 3/8. This is our theoretical probability based on the spinner's design.
2

Apply the expected frequency formula

$E = P(red) \times n = \frac{3}{8} \times 240$
3

Calculate and interpret

$E = 3 \times 30 = 90$

You would expect to get red approximately 90 times. Due to variation, the actual count might differ, but 90 is the best prediction. Note that 240 ÷ 8 = 30, so each section is expected 30 times.

Answer: Expected frequency of red = 90 times
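A rough sketch of the same spinner in Python shows both the prediction and the variation. The simulated spinner, the fixed seed, and the variable names are assumptions for illustration; the simulated count depends on the seed and will typically be near 90, not exactly 90.

```python
import random
from fractions import Fraction

# The spinner from the problem: 8 equal sections, 3 red and 5 blue.
sections = ["red"] * 3 + ["blue"] * 5
n_spins = 240

# Theoretical probability and expected frequency, E = P(red) * n.
p_red = Fraction(sections.count("red"), len(sections))
expected_red = p_red * n_spins
print(p_red, expected_red)  # 3/8 90

# One simulated run of 240 spins (seed fixed only so the run is repeatable).
rng = random.Random(1)
simulated_red = sum(rng.choice(sections) == "red" for _ in range(n_spins))
print("simulated reds:", simulated_red)
```

Changing the seed gives a different simulated count each time, which is the natural variation the worked example mentions.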

Common Pitfalls

heads-up

Confusing analysis with conclusion

Analysis describes what the data shows numerically (e.g. "The mean screen time was 4.2 hours"). A conclusion answers the original question: "This suggests Year 7 students do exceed 3 hours on average." Many students skip the conclusion or just restate the analysis.

Fix: After your analysis, always go back and directly answer the problem question using your results as evidence.

Treating expected frequency as a guarantee

$E = P \times n$ gives an average prediction, not a certain outcome. If P(six) = 1/6 and you roll a die 60 times, you expect 10 sixes, but you might roll 7 or 13 due to natural variation.

Fix: Always write "we would expect approximately..." rather than "we will get exactly..."

Drawing conclusions from too little data

Flipping a coin 4 times and getting 3 heads does not mean P(head) = 3/4 for that coin. With such a small sample, variation swamps the signal and your relative frequency is an unreliable probability estimate.

Fix: Always comment on sample size in your conclusion. Large samples give more reliable estimates of probability.
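To see why 3 heads in 4 flips is weak evidence, one can list every possible result of 4 flips of a fair coin and count how many contain exactly 3 heads. The sketch below assumes a fair coin, so all 16 sequences are equally likely; the variable names are illustrative.

```python
from itertools import product

# Every possible sequence of 4 flips, e.g. ('H', 'T', 'H', 'H').
sequences = list(product("HT", repeat=4))

# Sequences that contain exactly 3 heads.
three_heads = [s for s in sequences if s.count("H") == 3]

print(len(sequences))                      # 16
print(len(three_heads))                    # 4
print(len(three_heads) / len(sequences))   # 0.25
```

So even a perfectly fair coin gives 3 heads out of 4 about a quarter of the time, which is why such a small sample cannot tell you the coin is biased.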

Copy Into Your Books

PPDAC Cycle

Problem: What do you want to know?
Plan: How will you collect data?
Data: Collect and record
Analysis: Statistics, graphs, patterns
Conclusion: Answer the question

Relative Frequency = Exp. Probability

P(event) = frequency ÷ total
Tells you probability from real data
All relative frequencies sum to 1
Larger samples → better estimates

Expected Frequency

$E = P(event) \times n$
n = number of trials
E is a prediction, not a guarantee
Use "approximately" or "expect"

Key Connections

Data reveals patterns (past)
Probability predicts outcomes (future)
Statistics measures certainty
More data → closer to true probability


Brain Trainer · 4 problems

Brain Trainer · Data and Chance

4 problems

Four drill problems connecting data and probability. Work each, then reveal the answer.

1. P(win) = 0.4. A team plays 50 games. How many wins are expected?

Use E = P(win) × n = 0.4 × 50. Answer: E = 20 wins

2. In a survey of 200 students, 80 prefer Maths. What is the experimental probability that a random student prefers Maths?

P(Maths) = 80 ÷ 200 = 0.4. This is the relative frequency = experimental probability. Answer: P = 0.4 = 2/5

3. List the five stages of the PPDAC cycle in order.

The five stages are: Problem → Plan → Data → Analysis → Conclusion

4. A fair die is rolled 300 times. Theoretically, how many times would you expect to roll a 6? Why might the actual result differ?

P(6) = 1/6. E = 1/6 × 300 = 50. Actual results differ due to natural variation: probability predicts averages over many trials, not exact outcomes in any single run. Answer: E = 50; actual may differ due to variation

Complete in your workbook.

Quick Check · 5 questions

80 of 200 students prefer Maths. What is P(Maths)?

+10 XP

P(win) = 1/4. In 60 games, what is the expected number of wins?

+10 XP

In PPDAC, which stage directly answers the original question using evidence?

+10 XP

45 out of 150 surveyed people prefer Action movies. What is P(Action)?

+10 XP

If E = 30, which statement is most accurate?

+10 XP

Show Your Working · 4 questions

Show Your Working

12 marks total

Apply Medium 3 MARKS

Q6. A class surveys 40 students about their favourite fruit. Results: Apple 12, Banana 8, Mango 14, Other 6. Calculate P(Mango) and P(not Mango). Show all working.

Answer in your workbook.

Apply Medium 3 MARKS

Q7. A weather app says P(rain) = 0.35 on any given day. How many rainy days would you expect in a 60-day period? Show your calculation and explain what "expected" means in this context.

Answer in your workbook.

Reason Hard 3 MARKS

Q8. A student claims: "I flipped a coin 10 times and got 7 heads, so the probability of heads for this coin must be 0.7." Evaluate this claim. Is the student correct? What would make the estimate more reliable?

Answer in your workbook.

Reason Hard 3 MARKS

Q9. A student wants to investigate: "Do students in my year level prefer social media or gaming in their free time?" (a) In your own words, explain what the student needs to do during the Plan stage of the PPDAC cycle for this investigation. (b) In your own words, explain what could go wrong with the whole investigation if the student skips the Plan stage and jumps straight to collecting data.

Answer in your workbook.

Comprehensive Answers

Quick Check

1. B: 0.4. P(Maths) = 80 ÷ 200 = 0.4.

2. C: 15. E = 1/4 × 60 = 15.

3. A: Conclusion. This stage answers the original question with evidence from the analysis.

4. D: 0.3. P(Action) = 45 ÷ 150 = 0.3.

5. B: E is a prediction, not a guarantee. Actual results vary due to chance.

Show Your Working Model Answers

Q6 (3 marks): Total = 12 + 8 + 14 + 6 = 40 ✓ [1]. P(Mango) = 14 ÷ 40 = 7/20 = 0.35 [1]. P(not Mango) = 1 − 0.35 = 0.65, or (12 + 8 + 6) ÷ 40 = 26/40 = 0.65 [1].

Q7 (3 marks): E = P(rain) × n = 0.35 × 60 = 21 [1]. This means over 60 days, we would predict approximately 21 rainy days on average [1]. "Expected" does not mean exactly 21 will occur; the actual number of rainy days may vary due to natural variation in weather [1].

Q8 (3 marks): The student has correctly calculated the experimental probability: 7 ÷ 10 = 0.7 [1]. However, 10 flips is a very small sample, so this estimate is unreliable. Even a fair coin could easily give 7 heads in 10 flips just by chance variation [1]. To make the estimate more reliable, the student should flip the coin many more times (e.g. 100, 200, or 1000 times); the relative frequency will then tend towards the true probability [1].

Q9 (3 marks): (a) During the Plan stage, the student needs to decide exactly how they will collect the data, for example, writing a clear survey question, choosing who to ask (a fair sample of the year level, not just their friendship group), and deciding how many students to survey and how responses will be recorded [2]. (b) If the Plan stage is skipped, the student risks collecting data that cannot actually answer the Problem, for example an unclear question or a biased sample, wasting the Data and Analysis stages and leading to a misleading Conclusion [1].

Stretch Challenge · +25 XP, +10 coins

The Rainy Day Investigation

A meteorologist records rain data for 180 days and finds it rained on 54 of them. (a) Calculate the experimental probability of rain. (b) Using this probability, predict how many rainy days to expect in the next 90 days. (c) Design the key parts of a PPDAC investigation to determine whether your town gets more rain in summer or winter. Include your problem statement, plan, and what a good conclusion would need to include.

Reveal solution

(a) P(rain) = 54 ÷ 180 = 0.3. (b) E = 0.3 × 90 = 27 rainy days (approximately; the actual number may vary). (c) Problem: "Does our town receive more rain in summer than winter?" Plan: Record daily rainfall over a 90-day period (about 3 months) in each season, using the same rain gauge throughout. Conclusion must include: a comparison of the seasonal totals, a statement about whether the difference is meaningful or within natural variation, and an acknowledgement of whether one year is representative.
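Parts (a) and (b) chain the two ideas of the lesson: data gives an experimental probability, and that probability gives a prediction. A minimal sketch, with the hypothetical helper name `predict_from_data`:

```python
from fractions import Fraction

def predict_from_data(observed_count, observed_total, future_trials):
    """Estimate P(event) from past data, then predict E = P * n for future trials."""
    p = Fraction(observed_count, observed_total)   # relative frequency
    return p, p * future_trials                    # (probability, expected frequency)

# Meteorologist's data: rain on 54 of 180 days; predict the next 90 days.
p_rain, expected_rainy_days = predict_from_data(54, 180, 90)

print(p_rain, float(p_rain))    # 3/10 0.3
print(expected_rainy_days)      # 27
```

The prediction assumes the next 90 days behave like the 180 days already recorded, which is itself a limitation worth stating in a conclusion.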

Quick Review

PPDAC

Problem → Plan → Data → Analysis → Conclusion

Rel. Freq. = Exp. Prob.

P(event) = frequency ÷ total observations

Expected frequency

$E = P(event) \times n$

E is a prediction

Not a guarantee, variation always occurs

More data = better

Larger samples give more reliable probability estimates

Analysis ≠ Conclusion

Analysis: what data shows. Conclusion: answers the question.

Interactive: Probability Simulator

Explore how experimental probability approaches theoretical probability as sample size grows. Run the simulation many times and watch the relative frequency stabilise.
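If the interactive simulator is not available, a rough stand-in can be written in Python. This is a sketch under assumptions, not the lesson's own simulator: the function name `running_relative_frequency`, the checkpoints, and the fixed seed are all chosen for illustration.

```python
import random

def running_relative_frequency(p, checkpoints, seed=0):
    """Simulate trials with success probability p and report the
    relative frequency of successes at each checkpoint (number of trials)."""
    rng = random.Random(seed)   # fixed seed so the run can be repeated
    successes = 0
    trials_done = 0
    results = {}
    for target in sorted(checkpoints):
        while trials_done < target:
            if rng.random() < p:    # one trial: success with probability p
                successes += 1
            trials_done += 1
        results[target] = successes / target
    return results

# A fair coin: theoretical P(head) = 0.5.
results = running_relative_frequency(0.5, [10, 100, 1000, 10000])
for n, freq in results.items():
    print(f"{n:>6} flips: relative frequency of heads = {freq:.3f}")
```

With small numbers of flips the relative frequency can sit well away from 0.5; by 10000 flips it is typically very close. Try different seeds to see how much the early values jump around.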


DAILY CHALLENGE · TODAY

The Prediction Sprint

A six-sided die is rolled 120 times. How many times do you expect each number? Calculate all 6 expected frequencies in under 90 seconds for double coins!
