Lesson 20 ~30 min Unit 4 · Data & Probability +90 XP

Data and Chance Synthesis

Connect the data cycle with probability. Use relative frequency as experimental probability and expected frequency to make real predictions.

Today's hook: You've learned data collection, graphs, statistics, AND probability. Now it's time to see how they ALL connect. Data reveals patterns → probability predicts future outcomes → statistics measures certainty. This is what data scientists do every day.
Think First
warm-up

A school surveys 200 students. Exactly 80 say their favourite subject is Maths. Based on this data alone, if you picked a random student from the school, what do you think the probability that they prefer Maths would be? Write your estimate and explain your thinking.

Record your answer in your workbook.
1
The Big Idea
+5 XP

Statistics (data) and probability (chance) are two sides of the same coin. The relative frequency you calculate from data is the experimental probability. Together, data and probability power medicine, sport, finance, and science.

When you collect data, you are observing what happened. When you calculate a relative frequency, you are measuring how often something occurred. That proportion is the experimental probability: a number between 0 and 1 that estimates how likely the outcome is, based on evidence. As you collect more data, the experimental probability tends to get closer to the true (theoretical) probability.

[Figure: Data → relative frequency → experimental probability. P(event) = freq ÷ total. More data → closer to the true probability (law of large numbers). Statistics ↔ Probability.]
Relative frequency = Experimental probability
Data gives evidence
Relative frequency from data estimates the probability of future events.
More trials = better estimate
The law of large numbers says the estimate tends to settle near the true probability as the number of trials grows.
Both disciplines connect
Data scientists use both every day to make real decisions.
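
The idea that more trials give a better estimate can be explored with a short simulation. This is a minimal sketch, assuming a fair coin with a true probability of 0.5; the function name `relative_frequency_of_heads` and the seed value are illustrative choices, not part of the lesson.

```python
import random


def relative_frequency_of_heads(num_flips, seed=0):
    """Simulate fair coin flips and return the proportion that were heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so each run repeats
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


# Compare the experimental probability for small and large numbers of flips.
for n in (10, 100, 1_000, 100_000):
    estimate = relative_frequency_of_heads(n)
    print(f"{n:>7} flips: relative frequency of heads = {estimate:.4f}")
```

With only 10 flips the estimate can sit well away from 0.5, while the 100,000-flip estimate should land very close to it.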
2
What You'll Master
objectives

Know

  • The five stages of the PPDAC data cycle
  • That relative frequency from data equals experimental probability
  • The expected frequency formula: $E = P(event) \times n$

Understand

  • Why data and probability are connected, not separate topics
  • How increasing sample size improves probability estimates
  • The difference between analysis (what the data shows) and conclusion (what it means)

Can Do

  • Plan a statistical investigation using the PPDAC cycle
  • Calculate experimental probability from survey or frequency data
  • Use $E = P \times n$ to predict expected frequencies in future trials
3
Words You Need
vocabulary
PPDAC cycle: The five-stage data investigation process: Problem, Plan, Data, Analysis, Conclusion.
Statistical inference: Using sample data to draw conclusions about a larger population.
Relative frequency: The proportion of times an outcome occurs; frequency divided by total trials.
Probability model: A mathematical description of a chance process using probabilities for each outcome.
Expected frequency: The number of times an event is predicted to occur: $E = P(event) \times n$.
Variation: Natural differences in outcomes that occur even when the true probability is fixed.
4
Spot the Trap
heads-up

Wrong: "Data and probability are totally different topics — you study them separately and never mix them." Many students treat these as unrelated chapters.

Right: Relative frequency from real data is experimental probability. Every time you write a fraction of outcomes from a table, you are calculating a probability.

Wrong: "If P(win) = 0.4 and I play 10 games, I will definitely win exactly 4." Expected frequency is a prediction, not a guarantee.

Right: $E = P \times n$ gives the expected average over many repetitions. In any single set of 10 games, variation means you might win 2, 3, 5, or 6.

5
The Data Cycle — PPDAC
+5 XP

Every statistical investigation follows the PPDAC cycle: five ordered stages that take you from a question to a conclusion. Skipping stages leads to poor or misleading results.

Problem: Identify the question you want to answer (e.g. "Do Year 7 students get enough sleep?"). Plan: Decide what data to collect, how, and from whom. Data: Collect and record your data. Analysis: Calculate statistics, draw graphs, find patterns. Conclusion: Answer the original question with evidence — be honest about limitations.

[Figure: The PPDAC cycle.]
Problem → Plan → Data → Analysis → Conclusion
Start with the question
A sharp problem statement guides every decision that follows.
Analysis ≠ Conclusion
Analysis describes what the data shows. Conclusion answers your question.
Acknowledge limitations
A good conclusion states what you cannot claim from the data.
6
Relative Frequency as Probability
+5 XP

When you calculate a relative frequency from survey or experiment data, you are directly measuring experimental probability. The formula is identical: divide the count of a specific outcome by the total number of observations.

Suppose you survey 120 students and 36 prefer basketball. Relative frequency = 36 ÷ 120 = 0.3. This means the experimental probability that a randomly chosen student prefers basketball is 0.3 (or 30%). If we could ask every student in the school, we would know the true proportion exactly. We rarely have that luxury, so the relative frequency from a good sample is our best estimate.

Sport        Count   Rel. Freq.
Basketball   36      0.30
Soccer       48      0.40
Other        36      0.30
TOTAL        120     1.00

P(basketball) = 36/120 = 0.3
Relative frequency = Exp. probability
P(event) = frequency ÷ total
Check totals add to 1
All relative frequencies in a table should sum to 1 (or 100%), apart from small differences caused by rounding.
Sample size matters
A larger survey gives a more reliable probability estimate.
Express as fraction or decimal
Either form is correct — choose whichever the question asks for.
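
The sport survey table can be reproduced in a few lines of code. This is a minimal sketch using the counts from the table; the helper name `relative_frequencies` is an illustrative choice, and exact fractions are used so the "sum to 1" check is not affected by rounding.

```python
from fractions import Fraction

# Counts taken from the sport survey table (120 students in total).
counts = {"Basketball": 36, "Soccer": 48, "Other": 36}


def relative_frequencies(counts):
    """Return frequency ÷ total for each category as an exact fraction."""
    total = sum(counts.values())
    return {category: Fraction(count, total) for category, count in counts.items()}


rel_freq = relative_frequencies(counts)
for sport, value in rel_freq.items():
    print(f"{sport}: {value} = {float(value):.2f}")

# Check that the relative frequencies sum to 1.
print("Sum of relative frequencies:", sum(rel_freq.values()))
```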
7
Using Probability to Predict — Expected Frequency
+5 XP

Once you know the probability of an event, you can predict how many times it will occur in a given number of trials. This is called the expected frequency.

The formula is simple: Expected frequency = P(event) × n, where n is the number of trials. If P(head) = 0.5 and you flip a coin 80 times, you expect $0.5 \times 80 = 40$ heads. This is not a guarantee — due to natural variation you might get 37 or 43 — but 40 is your best prediction.

[Figure: Expected frequency formula, E = P(event) × n. Example: P(head) = 0.5 and n = 80 flips give 0.5 × 80 = 40. Actual results may vary (e.g. 37 to 43).]
$E = P(event) \times n$
E is a prediction
Expected frequency will often not be a whole number — that's fine.
Larger n, less relative variation
In 10 flips you might get 8 heads (80%); in 1000 flips, the proportion of heads will usually be very close to 50%.
Works both ways
Use theoretical or experimental probability in the formula — both give valid predictions.
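
The formula $E = P(event) \times n$ translates directly into a small function. This is a minimal sketch; the name `expected_frequency` and the input checks are illustrative additions rather than part of the lesson.

```python
def expected_frequency(probability, num_trials):
    """Return E = P(event) × n, the predicted number of occurrences."""
    if not 0 <= probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    if num_trials < 0:
        raise ValueError("number of trials cannot be negative")
    return probability * num_trials


# Coin example from the lesson: P(head) = 0.5 and n = 80 flips.
expected_heads = expected_frequency(0.5, 80)
print("Expected heads in 80 flips:", expected_heads)
```

The result is a prediction of the long-run average, so it does not need to be a whole number and an actual run of 80 flips may give a different count.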
Watch Me Solve It · Plan with PPDAC
+15 XP per step
Q1
PROBLEM
A student wants to investigate whether Year 7 students spend more than 3 hours per day on screens. Plan this investigation using the PPDAC cycle.
  1. Problem: State the question clearly
    Do Year 7 students at our school spend more than 3 hours per day on screens?
    A clear question focuses your whole investigation. This one is specific (Year 7), measurable (hours per day), and has a comparison point (3 hours).
  2. Plan: Decide how to collect data
    Survey 30 Year 7 students at random. Ask: "How many hours did you spend on screens yesterday?" Record to the nearest half-hour.
    Random selection avoids bias. Larger samples are better, but 30 is reasonable for a class project. Recording to 0.5 h keeps data manageable and accurate.
  3. Data, Analysis, Conclusion (outline)
    Data: Record hours in a table. Analysis: Calculate mean and find proportion above 3 h. Conclusion: If mean > 3 or more than 50% exceed 3 h, answer "yes" with evidence.
    The conclusion must directly answer the problem question. Acknowledge that "yesterday" may not represent typical screen use.
Answer: Problem: Do Year 7 students spend more than 3 h per day on screens? Plan: Survey 30 random Year 7 students. Data: Record hours. Analysis: Find the mean and the proportion above 3 h. Conclusion: Answer the question with evidence; acknowledge limitations.
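
The Analysis stage of this plan can be sketched in code. The 30 screen-time values below are invented purely for illustration (no real survey data appears in the lesson), and the variable names are hypothetical.

```python
from statistics import mean

# INVENTED example data: hours of screen time for 30 students,
# recorded to the nearest half-hour as described in the plan.
hours = [
    2.0, 3.5, 4.0, 1.5, 3.0, 5.0, 2.5, 4.5, 3.5, 2.0,
    6.0, 3.0, 4.0, 2.5, 3.5, 1.0, 4.5, 5.5, 3.0, 2.0,
    3.5, 4.0, 2.5, 3.0, 5.0, 4.5, 1.5, 3.5, 4.0, 2.0,
]

THRESHOLD_HOURS = 3  # comparison point from the problem statement

# Analysis: the mean, and the proportion of students above the threshold.
mean_hours = mean(hours)
count_above = sum(1 for h in hours if h > THRESHOLD_HOURS)
proportion_above = count_above / len(hours)

print(f"Mean screen time: {mean_hours:.2f} h")
print(f"Proportion above {THRESHOLD_HOURS} h: {proportion_above:.2f}")
```

These numbers are only the analysis. The conclusion still has to answer the original question in words and note limitations such as the sample size and the use of a single day.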
Watch Me Solve It · Relative Frequency as Probability
+15 XP per step
Q2
PROBLEM
A survey of 150 people asked which type of movie they prefer. Results: Action 45, Comedy 60, Drama 30, Horror 15. Find P(Comedy) and P(not Comedy) from this data.
  1. Confirm the total
    45 + 60 + 30 + 15 = 150
    Always verify the total before calculating. It should match the stated sample size of 150.
  2. Calculate P(Comedy)
    P(Comedy) = 60 ÷ 150 = 0.4 = 2/5
    This is the relative frequency of Comedy. It equals the experimental probability that a randomly chosen person from this group prefers Comedy.
  3. Calculate P(not Comedy) using the complement
    P(not Comedy) = 1 − P(Comedy) = 1 − 0.4 = 0.6
    Or directly: (45 + 30 + 15) ÷ 150 = 90 ÷ 150 = 0.6. Both methods agree, which confirms our calculation.
Answer: P(Comedy) = 0.4; P(not Comedy) = 0.6
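
The two routes to P(not Comedy) can be checked against each other in code. This is a minimal sketch using the survey counts from the problem; the variable names are illustrative, and exact fractions avoid any decimal rounding.

```python
from fractions import Fraction

# Survey results from the problem.
movie_counts = {"Action": 45, "Comedy": 60, "Drama": 30, "Horror": 15}

# Step 1: confirm the total matches the stated sample size.
total = sum(movie_counts.values())

# Step 2: relative frequency of Comedy.
p_comedy = Fraction(movie_counts["Comedy"], total)

# Step 3: P(not Comedy) by the complement, then by direct counting.
p_not_comedy_complement = 1 - p_comedy
p_not_comedy_direct = Fraction(total - movie_counts["Comedy"], total)

print("Total:", total)
print("P(Comedy):", p_comedy, "=", float(p_comedy))
print("P(not Comedy):", p_not_comedy_complement, "=", float(p_not_comedy_complement))
print("Methods agree:", p_not_comedy_complement == p_not_comedy_direct)
```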
Watch Me Solve It · Predicting with Expected Frequency
+15 XP per step
Q3
PROBLEM
A spinner has 8 equal sections: 3 are red and 5 are blue. If you spin it 240 times, how many times would you expect to get red?
  1. Find the theoretical probability
    P(red) = 3/8 = 0.375
    There are 3 red sections out of 8 equal sections, so P(red) = 3/8. This is our theoretical probability based on the spinner's design.
  2. Apply the expected frequency formula
    $E = P(red) \times n = \frac{3}{8} \times 240$
  3. Calculate and interpret
    $E = 3 \times 30 = 90$
    You would expect to get red approximately 90 times. Due to variation, the actual count might differ, but 90 is the best prediction. Note that 240 ÷ 8 = 30, so each section is expected 30 times.
Answer: Expected frequency of red = 90 times
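
To see how actual counts scatter around the expected 90, the spinner can be simulated. This is a rough sketch, assuming each of the 8 sections is equally likely; the seed, the number of repeated experiments, and the helper `count_red` are illustrative choices.

```python
import random
from fractions import Fraction

P_RED = Fraction(3, 8)   # 3 red sections out of 8 equal sections
NUM_SPINS = 240

# Expected frequency from E = P(red) × n.
expected_red = P_RED * NUM_SPINS


def count_red(num_spins, rng):
    """Spin the 8-section spinner num_spins times and count the reds."""
    sections = ["red"] * 3 + ["blue"] * 5
    return sum(1 for _ in range(num_spins) if rng.choice(sections) == "red")


rng = random.Random(1)  # fixed seed (an assumption) so the run is repeatable
simulated_counts = [count_red(NUM_SPINS, rng) for _ in range(5)]

print("Expected reds:", expected_red)
print("Reds in five simulated sets of 240 spins:", simulated_counts)
```

Each simulated set of 240 spins gives its own count, usually near 90 but rarely exactly 90, which is the variation described in step 3.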
9
Common Pitfalls
heads-up
Confusing analysis with conclusion
Analysis describes what the data shows numerically (e.g. "The mean screen time was 4.2 hours"). A conclusion answers the original question: "This suggests Year 7 students do exceed 3 hours on average." Many students skip the conclusion or just restate the analysis.
Fix: After your analysis, always go back and directly answer the problem question using your results as evidence.
Treating expected frequency as a guarantee
$E = P \times n$ gives an average prediction, not a certain outcome. If P(six) = 1/6 and you roll a die 60 times, you expect 10 sixes — but you might roll 7 or 13 due to natural variation.
Fix: Always write "we would expect approximately..." rather than "we will get exactly..."
Drawing conclusions from too little data
Flipping a coin 4 times and getting 3 heads does not mean P(head) = 3/4 for that coin. With such a small sample, variation swamps the signal and your relative frequency is an unreliable probability estimate.
Fix: Always comment on sample size in your conclusion. Large samples give more reliable estimates of probability.
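
One way to see why 3 heads in 4 flips is weak evidence is to list every possible result for a fair coin and count how many contain exactly 3 heads. This is a minimal sketch, assuming a fair coin so that all 16 sequences are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of 4 flips, e.g. ('H', 'T', 'H', 'H').
outcomes = list(product("HT", repeat=4))

# Keep only the sequences that contain exactly 3 heads.
three_heads = [seq for seq in outcomes if seq.count("H") == 3]

# For a fair coin each sequence is equally likely, so count ÷ total.
p_three_heads = Fraction(len(three_heads), len(outcomes))

print("Number of possible sequences:", len(outcomes))
print("Sequences with exactly 3 heads:", len(three_heads))
print("P(exactly 3 heads in 4 flips of a fair coin):", p_three_heads)
```

A fair coin produces this result in a sizeable share of 4-flip experiments, so seeing it once says little about whether the coin is biased.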
Copy Into Your Books

PPDAC Cycle

  • Problem — What do you want to know?
  • Plan — How will you collect data?
  • Data — Collect and record
  • Analysis — Statistics, graphs, patterns
  • Conclusion — Answer the question

Relative Frequency = Exp. Probability

  • P(event) = frequency ÷ total
  • Tells you probability from real data
  • All relative frequencies sum to 1
  • Larger samples → better estimates

Expected Frequency

  • $E = P(event) \times n$
  • n = number of trials
  • E is a prediction, not a guarantee
  • Use "approximately" or "expect"

Key Connections

  • Data reveals patterns (past)
  • Probability predicts outcomes (future)
  • Statistics measures certainty
  • More data → closer to true probability

Brain Trainer · Data and Chance
4 problems

Four drill problems connecting data and probability. Work each, then reveal the answer.

  1. P(win) = 0.4. A team plays 50 games. How many wins are expected?

    Use E = P(win) × n = 0.4 × 50.
    Answer: E = 20 wins
  2. In a survey of 200 students, 80 prefer Maths. What is the experimental probability that a random student prefers Maths?

    P(Maths) = 80 ÷ 200 = 0.4. This is the relative frequency = experimental probability.
    Answer: P = 0.4 = 2/5
  3. List the five stages of the PPDAC cycle in order.

    The five stages are:
    Answer: Problem → Plan → Data → Analysis → Conclusion
  4. A fair die is rolled 300 times. Theoretically, how many times would you expect to roll a 6? Why might the actual result differ?

    P(6) = 1/6. E = 1/6 × 300 = 50. Actual results differ due to natural variation: probability predicts averages over many trials, not exact outcomes in any single run.
    Answer: E = 50; actual may differ due to variation
Complete in your workbook.
1
80 of 200 students prefer Maths. What is P(Maths)?
+10 XP
2
P(win) = 1/4. In 60 games, what is the expected number of wins?
+10 XP
3
In PPDAC, which stage directly answers the original question using evidence?
+10 XP
4
45 out of 150 surveyed people prefer Action movies. What is P(Action)?
+10 XP
5
If E = 30, which statement is most accurate?
+10 XP
Show Your Working
9 marks total
Apply Medium 3 MARKS

Q6. A class surveys 40 students about their favourite fruit. Results: Apple 12, Banana 8, Mango 14, Other 6. Calculate P(Mango) and P(not Mango). Show all working.

Answer in your workbook.
Apply Medium 3 MARKS

Q7. A weather app says P(rain) = 0.35 on any given day. How many rainy days would you expect in a 60-day period? Show your calculation and explain what "expected" means in this context.

Answer in your workbook.
Reason Hard 3 MARKS

Q8. A student claims: "I flipped a coin 10 times and got 7 heads, so the probability of heads for this coin must be 0.7." Evaluate this claim. Is the student correct? What would make the estimate more reliable?

Answer in your workbook.
Comprehensive Answers

Quick Check

1. B — 0.4. P(Maths) = 80 ÷ 200 = 0.4.

2. C — 15. E = 1/4 × 60 = 15.

3. A — Conclusion. This stage answers the original question with evidence from the analysis.

4. D — 0.3. P(Action) = 45 ÷ 150 = 0.3.

5. B — E is a prediction, not a guarantee. Actual results vary due to chance.

Show Your Working Model Answers

Q6 (3 marks): Total = 12 + 8 + 14 + 6 = 40 ✓ [1]. P(Mango) = 14 ÷ 40 = 7/20 = 0.35 [1]. P(not Mango) = 1 − 0.35 = 0.65, or (12 + 8 + 6) ÷ 40 = 26/40 = 0.65 [1].

Q7 (3 marks): E = P(rain) × n = 0.35 × 60 = 21 [1]. This means over 60 days, we would predict approximately 21 rainy days on average [1]. "Expected" does not mean exactly 21 will occur — actual rainy days may vary due to natural variation in weather [1].

Q8 (3 marks): The student has correctly calculated the experimental probability: 7 ÷ 10 = 0.7 [1]. However, 10 flips is a very small sample, so this estimate is unreliable — it could easily be 7/10 just by chance variation even for a fair coin [1]. To make the estimate more reliable, the student should flip the coin many more times (e.g. 100, 200, or 1000 times); the relative frequency will then converge toward the true probability [1].

Stretch Challenge · +25 XP, +10 coins

The Rainy Day Investigation

A meteorologist records rain data for 180 days and finds it rained on 54 of them. (a) Calculate the experimental probability of rain. (b) Using this probability, predict how many rainy days to expect in the next 90 days. (c) Design the key parts of a PPDAC investigation to determine whether your town gets more rain in summer or winter. Include your problem statement, plan, and what a good conclusion would need to include.


(a) P(rain) = 54 ÷ 180 = 0.3.
(b) E = 0.3 × 90 = 27 rainy days.
(c) Problem: "Does our town receive more rain in summer than winter?" Plan: Record daily rainfall for 3 months each season, use same rain gauge, 90-day period each. Conclusion must include: comparison of seasonal totals, statement about whether difference is meaningful or within variation, acknowledgement of whether one year is representative.
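
Parts (a) and (b) chain the lesson's two formulas together: past data gives an experimental probability, which then feeds the expected frequency. This is a minimal sketch of that chain; the function names are hypothetical and exact fractions are used to keep the arithmetic clean.

```python
from fractions import Fraction


def experimental_probability(event_count, total_observations):
    """Relative frequency: event count ÷ total observations."""
    if total_observations <= 0:
        raise ValueError("total observations must be positive")
    return Fraction(event_count, total_observations)


def predict_count(event_count, total_observations, future_trials):
    """Expected frequency in future trials, using E = P × n."""
    return experimental_probability(event_count, total_observations) * future_trials


# (a) 54 rainy days recorded out of 180.
p_rain = experimental_probability(54, 180)

# (b) Prediction for the next 90 days.
predicted_rainy_days = predict_count(54, 180, 90)

print("P(rain):", p_rain, "=", float(p_rain))
print("Expected rainy days in 90 days:", predicted_rainy_days)
```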

Quick Review

PPDAC

Problem → Plan → Data → Analysis → Conclusion

Rel. Freq. = Exp. Prob.

P(event) = frequency ÷ total observations

Expected frequency

$E = P(event) \times n$

E is a prediction

Not a guarantee — variation always occurs

More data = better

Larger samples give more reliable probability estimates

Analysis ≠ Conclusion

Analysis: what data shows. Conclusion: answers the question.

Interactive: Probability Simulator

Explore how experimental probability approaches theoretical probability as sample size grows. Run the simulation many times and watch the relative frequency stabilise.
