Mathematics Advanced • Year 12 • Module 5 • Lesson 3

Conditional Probability

Past-paper style: conditional probability, tree diagrams and Bayes-style reasoning, including an extended response on a false-positive scenario.

Master · Past-Paper Style

1. Short-answer questions

1.1 Two events satisfy P(A) = 0.4, P(B) = 0.5 and P(A ∪ B) = 0.7. Find P(A ∩ B) and hence P(A | B). (2 marks, Band 3)

1.2 In a school, 40% of students play a musical instrument. Of those who play, 25% sing in the choir. Of those who do not play, 10% sing.
(a) Draw a tree diagram.
(b) Find P(sings in choir).
(c) Given a student sings in the choir, find P(plays an instrument). (3 marks, Band 3-4)

1.3 A bag contains 3 red and 5 blue counters. Two counters are drawn without replacement.
(a) Find P(both red).
(b) Given that the first counter drawn is blue, find P(second is red). (3 marks, Band 4)

Stuck on 1.3(b)? After removing one blue, 3 red and 4 blue remain; conditioning changes the denominator from 8 to 7.

2. Extended response

2.1 A rare disease has prevalence P(D) = 0.005 in a screening population. A new diagnostic test has:

Sensitivity: P(+ | D) = 0.98
Specificity: P(− | D′) = 0.97 (so the false-positive rate is 0.03)

(a) Draw a fully labelled tree diagram for the joint events (D / D′) and (+ / −).
(b) Find P(+), the unconditional probability of a positive test.
(c) Find P(D | +) and P(D | −) to 4 decimal places.
(d) A health policy advisor argues "the test is 97-98% accurate, so a positive result almost certainly means disease". Critique this claim using your value of P(D | +), then suggest two ways the screening programme can be improved (one statistical, one practical). (8 marks, Band 5-6)

Explicit marking criteria

Part (a) — 1 mark — fully labelled tree: branches D (0.005), D′ (0.995); each followed by + and − with correct conditionals.

Part (b) — 2 marks

• 1 mark — computes both joint probabilities: P(D ∩ +) = 0.005 × 0.98 = 0.0049; P(D′ ∩ +) = 0.995 × 0.03 = 0.02985.

• 1 mark — sums to give P(+) = 0.03475.

Part (c) — 2 marks

• 1 mark — P(D | +) = 0.0049 / 0.03475 ≈ 0.1410.

• 1 mark — P(D | −) = P(D ∩ −) / P(−) = (0.005 × 0.02) / (1 − 0.03475) = 0.0001 / 0.96525 ≈ 0.0001 (essentially 0). A negative result is very reassuring.

Part (d) — 3 marks

• 1 mark — identifies the base-rate fallacy: 97% specificity sounds high, but with prevalence only 0.5%, false positives outnumber true positives ~6:1.

• 1 mark — statistical improvement: e.g. "screen a higher-prevalence subgroup so the posterior rises; or follow every positive with a more specific confirmatory test (PCR / biopsy)."

• 1 mark — practical improvement: e.g. "communicate posterior probability clearly to patients/clinicians, not raw sensitivity/specificity, to reduce panic from false positives."

Your response:

Stuck on (d)? The key insight: at low prevalence, even a tiny false-positive rate dominates the positive-test pool.
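One way to see the hint is to count people instead of multiplying probabilities. The sketch below does this for a hypothetical cohort of 100 000 (the cohort size is an assumption, chosen only so that every count is a whole number); the prevalence, sensitivity and false-positive rate are the ones given in 2.1.

```python
# Natural-frequency view of question 2.1.
# The cohort size is hypothetical; it is not part of the question.
cohort = 100_000

diseased = cohort * 5 // 1000          # prevalence 0.005 -> 500 people
healthy = cohort - diseased            # 99 500 people

true_positives = diseased * 98 // 100  # sensitivity 0.98 -> 490
false_positives = healthy * 3 // 100   # false-positive rate 0.03 -> 2985

# How many false alarms per true positive, and what share of positives is real?
ratio = false_positives / true_positives
share_true = true_positives / (true_positives + false_positives)

print(true_positives, false_positives, round(ratio, 1), round(share_true, 3))
# 490 2985 6.1 0.141
```

The healthy group is so much larger than the diseased group that 3% of it still outnumbers 98% of the diseased group by roughly six to one.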

How did this worksheet feel?

Got it / Partly / Lost

What I'll revisit before next class:

Answers — sample responses + marking notes

1.1 — Find P(A ∩ B) then P(A | B) (2 marks)

Sample response. From P(A ∪ B) = P(A) + P(B) − P(A ∩ B), 0.7 = 0.4 + 0.5 − P(A ∩ B), so P(A ∩ B) = 0.2. Then P(A | B) = P(A ∩ B)/P(B) = 0.2/0.5 = 0.4.

Marking notes. 1 mark — uses addition rule to find P(A ∩ B). 1 mark — applies conditional formula correctly. Note: P(A | B) = P(A) = 0.4 here, so A and B are independent.
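As an optional check, the arithmetic in 1.1 can be reproduced with exact fractions. This is only a sketch; the variable names are illustrative and not part of the question.

```python
from fractions import Fraction

# Values given in question 1.1
p_a = Fraction(4, 10)
p_b = Fraction(5, 10)
p_a_or_b = Fraction(7, 10)

# Addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B)
p_a_and_b = p_a + p_b - p_a_or_b

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# Independence holds exactly when P(A and B) = P(A) * P(B)
independent = p_a_and_b == p_a * p_b

print(p_a_and_b, p_a_given_b, independent)  # 1/5 2/5 True
```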

1.2 — Music and choir (3 marks)

Sample response.

(a) Tree: instrument (0.4) → choir (0.25) or no choir (0.75); no instrument (0.6) → choir (0.10) or no choir (0.90).

(b) P(choir) = 0.4 × 0.25 + 0.6 × 0.10 = 0.10 + 0.06 = 0.16.

(c) P(instrument | choir) = P(instrument ∩ choir) / P(choir) = 0.10 / 0.16 = 5/8 = 0.625.

Marking notes. 1 mark — correct tree with all conditional probabilities labelled. 1 mark — P(choir) = 0.16 with total-probability working. 1 mark — Bayes/conditional gives 5/8. Common error: stating P(instrument | choir) = 0.25 (which is the wrong-direction conditional P(choir | instrument)).
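The tree-diagram working for 1.2 can be checked with a short script. The sketch below uses exact fractions and illustrative variable names; it multiplies along each branch, adds the two "sings" branches, then reverses the conditional.

```python
from fractions import Fraction

# Branch probabilities from question 1.2
p_plays = Fraction(40, 100)
p_sings_given_plays = Fraction(25, 100)
p_sings_given_not = Fraction(10, 100)

# Multiply along each branch that ends in "sings"
p_plays_and_sings = p_plays * p_sings_given_plays    # 1/10
p_not_and_sings = (1 - p_plays) * p_sings_given_not  # 3/50

# Total probability of singing, then the reversed conditional
p_sings = p_plays_and_sings + p_not_and_sings
p_plays_given_sings = p_plays_and_sings / p_sings

print(p_sings, p_plays_given_sings)  # 4/25 5/8
```

Note that 4/25 = 0.16 and 5/8 = 0.625, which is far from the 0.25 obtained by reading the conditional in the wrong direction.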

1.3 — Two counters without replacement (3 marks)

Sample response.

(a) P(both red) = (3/8) × (2/7) = 6/56 = 3/28.

(b) After one blue is removed, 3 red and 4 blue remain (7 counters). So P(second red | first blue) = 3/7.

Marking notes. (a) 1 mark — uses 3/8 then 2/7, recognising without-replacement reduces both numerator and denominator. (b) 1 mark — correct conditional denominator 7; 1 mark — correct numerator 3 (reds unchanged because the removed counter was blue).
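Both answers to 1.3 can also be confirmed by brute force. The sketch below lists every equally likely ordered pair of counters and counts outcomes; treating the counters as distinguishable by position is a modelling assumption made for the enumeration, not something stated in the question.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 3 + ["B"] * 5

# Every ordered pair of distinct counters (8 * 7 = 56 equally likely draws)
draws = list(permutations(range(len(bag)), 2))

# (a) Both counters red
both_red = [d for d in draws if bag[d[0]] == "R" and bag[d[1]] == "R"]
p_both_red = Fraction(len(both_red), len(draws))

# (b) Restrict the sample space to draws whose first counter is blue,
# then count how many of those have a red second counter
first_blue = [d for d in draws if bag[d[0]] == "B"]
second_red = [d for d in first_blue if bag[d[1]] == "R"]
p_second_red_given_first_blue = Fraction(len(second_red), len(first_blue))

print(p_both_red, p_second_red_given_first_blue)  # 3/28 3/7
```

Part (b) shows conditioning as a restriction of the sample space: the denominator is the 35 draws that start with blue, not all 56.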

2.1 — Rare-disease screening (8 marks): sample Band-6 response with annotations

Sample Band-6 response.

(a) Tree diagram. First-level branches: D with P = 0.005 and D′ with P = 0.995. From D: + with P = 0.98, − with P = 0.02. From D′: + with P = 0.03, − with P = 0.97. [1 mark — fully labelled tree.]

(b) P(+).

P(D ∩ +) = 0.005 × 0.98 = 0.0049

P(D′ ∩ +) = 0.995 × 0.03 = 0.02985 [1 mark]

P(+) = 0.0049 + 0.02985 = 0.03475 (≈ 3.5% of all tests are positive) [1 mark]

(c) P(D | +) and P(D | −).

P(D | +) = 0.0049 / 0.03475 ≈ 0.1410 (only 14.1% of positives are true positives) [1 mark]

P(D ∩ −) = 0.005 × 0.02 = 0.0001

P(−) = 1 − 0.03475 = 0.96525

P(D | −) = 0.0001 / 0.96525 ≈ 0.0001 (a negative test is highly reassuring) [1 mark]
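The calculations in parts (b) and (c) follow the same pattern for any prevalence, sensitivity and specificity. The sketch below wraps them in a small helper; the function name `screening_posteriors` is illustrative only and not part of the question.

```python
def screening_posteriors(prevalence, sensitivity, specificity):
    """Return (P(+), P(D | +), P(D | -)) for a two-outcome screening test."""
    p_d_and_pos = prevalence * sensitivity
    p_not_d_and_pos = (1 - prevalence) * (1 - specificity)
    p_pos = p_d_and_pos + p_not_d_and_pos          # total probability of +

    p_d_and_neg = prevalence * (1 - sensitivity)   # missed cases
    p_neg = 1 - p_pos

    return p_pos, p_d_and_pos / p_pos, p_d_and_neg / p_neg


p_pos, p_d_given_pos, p_d_given_neg = screening_posteriors(0.005, 0.98, 0.97)
print(f"{p_pos:.5f} {p_d_given_pos:.4f} {p_d_given_neg:.4f}")
# 0.03475 0.1410 0.0001
```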

(d) Critique and improvements.

The advisor commits the base-rate fallacy: 97% specificity sounds reliable, but at prevalence 0.5%, the 3% false-positive rate generates ~30 false positives per 1 000 tests against only ~5 true positives — so only 14% of positives are real. A positive does not "almost certainly mean disease"; it means disease is now 28 times more likely than the prior (0.141 vs 0.005), but most positives are still false alarms. [1 mark]

Statistical improvement: screen a higher-prevalence subpopulation (e.g. symptomatic patients or known high-risk groups) so the prior P(D) rises, lifting P(D | +). Or run a more specific confirmatory test on every positive: provided the two tests' errors are independent, a sufficiently specific second test can push P(D | + on both) above 95%. [1 mark]

Practical improvement: communicate the posterior P(D | +) ≈ 14% directly to patients, rather than "the test is 98% accurate", so that anxiety and clinical follow-up are calibrated to the true post-test probability. [1 mark]

Total: 8/8.
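The confirmatory-test idea in part (d) can be explored numerically by using the posterior after the first positive as the prior for a second test. The sketch below assumes the two tests' errors are independent given disease status. The confirmatory test's figures (sensitivity 0.98, specificity 0.999) are hypothetical values chosen for illustration, not data from the question.

```python
def posterior_after_positive(prior, sensitivity, specificity):
    """P(D | +) by Bayes' rule for a single positive result."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# First positive on the screening test from question 2.1
after_first = posterior_after_positive(0.005, 0.98, 0.97)

# Second positive on the same test, assuming independent errors
after_repeat = posterior_after_positive(after_first, 0.98, 0.97)

# Second positive on a hypothetical, more specific confirmatory test
after_confirm = posterior_after_positive(after_first, 0.98, 0.999)

print(f"{after_first:.3f} {after_repeat:.3f} {after_confirm:.3f}")
# 0.141 0.843 0.994
```

Under these assumptions, simply repeating the screening test lifts the posterior to about 84%, while a confirmatory test with a much lower false-positive rate takes it above 95%.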

Band descriptors for marker.

Band 3: Tree drawn but mislabels conditionals; finds P(+) but not P(D | +). ≈ 3-4 marks.

Band 4: Tree, P(+) and P(D | +) all correct; no critique of the advisor or only one improvement. ≈ 5-6 marks.

Band 5: All numerical parts; names the base-rate fallacy but improvements are generic. ≈ 6-7 marks.

Band 6: Numerical work complete to required precision; critique explicitly invokes the base rate; both statistical and practical improvements distinct and well-justified. 8/8.