Variance and Standard Deviation
A factory tests 100 batteries from a production line where each has a 5% chance of being defective. The mean number of defective batteries is easy: $np = 5$. But how much will that count vary from one batch to the next? Variance and standard deviation answer that question — they measure spread. For the binomial, both have famously clean formulas: $\text{Var}(X) = npq$ and $\text{SD}(X) = \sqrt{npq}$. This lesson trains the calculation and the interpretation.
You already know that for $X \sim B(n, p)$ the mean is $E(X) = np$. Without using any formula — what do you think happens to the spread of $X$ as $p$ gets close to $0$ or close to $1$? Is the distribution more or less spread out compared to when $p = 0.5$? Write your reasoning below.
Every variance/SD calculation rewards two habits: identify $n$, $p$, and $q = 1-p$ from the wording, then substitute into $npq$ before doing arithmetic. Skipping the $q$ step (for example, reporting $np$ as the variance), or mixing up $p$ and $q$ when the mean $np$ is also needed, is a common cause of wrong answers.
The identify-then-substitute strategy: (1) extract $n$ (number of trials) and $p$ (success probability) from the question, (2) compute $q = 1 - p$, (3) substitute into $\text{Var}(X) = npq$ and $\text{SD}(X) = \sqrt{npq}$ before simplifying.
Mean: $\mu = np$ · Variance: $\sigma^2 = npq$ · SD: $\sigma = \sqrt{npq}$
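The identify-then-substitute steps can be sketched in Python. The function name `binomial_spread` is a hypothetical helper made up for illustration; it is not part of any library.

```python
import math

def binomial_spread(n, p):
    """Return (mean, variance, sd) for X ~ B(n, p).

    Hypothetical helper for illustration only.
    """
    if n < 0 or not 0 <= p <= 1:
        raise ValueError("need n >= 0 and 0 <= p <= 1")
    q = 1 - p                 # step 2: compute q before anything else
    mean = n * p              # mu = np
    variance = n * p * q      # sigma^2 = npq
    sd = math.sqrt(variance)  # sigma = sqrt(npq)
    return mean, variance, sd

# The battery example: n = 100 trials, p = 0.05
mean, variance, sd = binomial_spread(100, 0.05)
print(round(mean, 3), round(variance, 3), round(sd, 3))
```

Computing `q` on its own line mirrors the written method: the substitution into $npq$ happens only after all three quantities are known.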
Key facts
- Variance: $\text{Var}(X) = npq$ where $q = 1-p$
- Standard deviation: $\text{SD}(X) = \sqrt{npq}$
- SD has the same units as $X$; variance has units$^2$
Concepts
- Why spread depends on both $n$ and the product $pq$
- Why $pq$ is largest at $p = 0.5$ and shrinks at the extremes
- How SD describes the typical deviation from the mean
Skills
- Calculate $\text{Var}(X)$ and $\text{SD}(X)$ given $n$ and $p$
- Interpret SD as the typical spread around the mean $np$
- Compare spreads of binomial distributions with different parameters
If $X \sim B(n, p)$ counts successes in $n$ independent trials, then
- Mean: $\mu = E(X) = np$ — already familiar from earlier lessons.
- Variance: $\sigma^2 = \text{Var}(X) = npq$ where $q = 1 - p$.
- Standard deviation: $\sigma = \text{SD}(X) = \sqrt{npq}$.
- Interpretation: as a rough guide for bell-shaped binomial distributions, about two-thirds of outcomes lie within $1$ standard deviation of the mean and about $95\%$ lie within $2$.
Worked through the hook: For $X \sim B(100, 0.05)$:
- $n = 100$, $p = 0.05$, so $q = 1 - 0.05 = 0.95$.
- Mean: $\mu = np = 100 \times 0.05 = 5$ defectives.
- Variance: $\sigma^2 = npq = 100 \times 0.05 \times 0.95 = 4.75$.
- Standard deviation: $\sigma = \sqrt{4.75} \approx 2.179$.
- So the typical count of defectives is about $5 \pm 2$ — closer to "give or take 2", not 5 or 10.
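As a sanity check on the formula, the mean and variance of $B(100, 0.05)$ can also be computed straight from the definition, by summing over the probabilities of every possible count. This is a minimal sketch; `binomial_pmf` is a hypothetical helper name, and the check is numerical rather than a proof.

```python
from math import comb, sqrt

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.05
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

# Definitions: E(X) = sum of k * P(X = k)
#              Var(X) = sum of (k - mu)^2 * P(X = k)
mu = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mu) ** 2 * pk for k, pk in enumerate(pmf))

# Compare with the shortcut formulas np and npq
print(round(mu, 6), n * p)
print(round(var, 6), n * p * (1 - p))
print(round(sqrt(var), 3))
```

Both routes should agree up to floating-point rounding, which is why $npq$ can be used without ever building the full table of probabilities.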
Binomial formulas: $E(X)=np$; $\text{Var}(X)=npq$; $\text{SD}(X)=\sqrt{npq}$ where $q=1-p$.
Pause — copy the formulas $\text{Var}(X)=npq$ and $\text{SD}(X)=\sqrt{npq}$ with a worked example for $B(100,0.05)$ into your book.
Quick check: If $X \sim B(80, 0.25)$, what is $\text{Var}(X)$?
We just saw that for $X\sim B(n,p)$: $\text{Var}(X)=npq$ and $\text{SD}(X)=\sqrt{npq}$ where $q=1-p$. That raises a question: what does $\sigma=\sqrt{npq}$ actually tell you about the spread of outcomes, and how do you use it to judge whether an observed frequency is typical or unusual? In short: most outcomes fall within $[\mu-\sigma,\mu+\sigma]$, and a gap of more than $2\sigma$ from $\mu$ is noteworthy.
SD tells you the typical deviation from the mean. If $X \sim B(n, p)$ has mean $\mu = np$ and standard deviation $\sigma = \sqrt{npq}$, then most observed values of $X$ fall in the interval $[\mu - \sigma, \mu + \sigma]$ — and almost all fall in $[\mu - 2\sigma, \mu + 2\sigma]$.
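To put numbers on "most" and "almost all", the exact probability of landing inside each interval can be summed from the binomial probabilities. The sketch below uses a hypothetical helper `coverage` and takes $B(100, 0.5)$ as an assumed example; the proportions will differ for other $n$ and $p$.

```python
from math import comb, sqrt, ceil, floor

def coverage(n, p, width):
    """P(mu - width*sigma <= X <= mu + width*sigma) for X ~ B(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # X only takes whole-number values, so round the endpoints inwards
    lo = max(0, ceil(mu - width * sigma))
    hi = min(n, floor(mu + width * sigma))
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

within_1_sd = coverage(100, 0.5, 1)  # counts 45 to 55
within_2_sd = coverage(100, 0.5, 2)  # counts 40 to 60
print(f"within 1 SD: {within_1_sd:.3f}")
print(f"within 2 SD: {within_2_sd:.3f}")
```

Because the endpoints are rounded inwards to whole counts, the result can shift noticeably when $\sigma$ is small, so treat the one- and two-SD rules as guides rather than exact percentages.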
Compare two scenarios:
- $X_1 \sim B(100, 0.5)$: $\mu = 50$, $\sigma = \sqrt{100 \times 0.5 \times 0.5} = \sqrt{25} = 5$. Typical range $\approx [45, 55]$.
- $X_2 \sim B(100, 0.05)$: $\mu = 5$, $\sigma = \sqrt{100 \times 0.05 \times 0.95} = \sqrt{4.75} \approx 2.18$. Typical range $\approx [2.8, 7.2]$.
Even though $X_1$ has a much bigger SD ($5$ vs $2.18$), $X_2$ has a larger SD relative to its mean ($2.18/5 \approx 44\%$ versus $5/50 = 10\%$). For comparing relative variability across different distributions, scale matters.
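The relative comparison above amounts to dividing each SD by its mean, a ratio often called the coefficient of variation (a term not used elsewhere in this lesson). A minimal sketch, with `relative_spread` as a hypothetical helper name:

```python
from math import sqrt

def relative_spread(n, p):
    """SD divided by mean for X ~ B(n, p); requires p > 0."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return sd / mean

# The two scenarios compared above
for n, p in [(100, 0.5), (100, 0.05)]:
    print(f"B({n}, {p}): SD/mean = {relative_spread(n, p):.0%}")
```

The ratio simplifies to $\sqrt{q/(np)}$, so for fixed $n$ it grows as $p$ shrinks even though the SD itself is getting smaller.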
Pause: copy this interpretation rule into your book: typical observed values fall within $\pm 2\sigma$ of the mean $\mu=np$; gaps larger than this warrant comment.
Did you get this? True or false: for $X \sim B(n, p)$ with $n$ fixed, the variance $npq$ is maximised when $p = 0.5$.
Worked examples · 3 in a row
A fair coin is tossed $60$ times. Let $X$ be the number of heads. Find (a) the variance of $X$ and (b) the standard deviation of $X$.
A machine produces light bulbs, $4\%$ of which are defective. A quality inspector tests a random sample of $200$ bulbs. Let $X$ be the number of defective bulbs found. Find $\text{Var}(X)$ and $\text{SD}(X)$, and interpret the SD.
Two binomial distributions are $X \sim B(50, 0.2)$ and $Y \sim B(50, 0.8)$. Compare their variances and standard deviations. Comment on what you notice.
Fill the gap: If $X \sim B(64, 0.25)$, then $\text{Var}(X) = 64 \times 0.25 \times 0.75 = \underline{\hspace{2em}}$, so $\text{SD}(X) = \sqrt{12} \approx 3.46$.
Misconceptions to fix · the 3 traps that cost marks
Did you get this? True or false: for $X \sim B(100, 0.3)$, the standard deviation is $\sqrt{30}$.
Activities · practice with the ideas
$X \sim B(50, 0.4)$. Find the variance and standard deviation.
A die is rolled $36$ times. Let $X$ be the number of sixes. Find $\text{Var}(X)$ and $\text{SD}(X)$.
$15\%$ of customers at a café order a flat white. In a sample of $80$ customers, find the standard deviation of the number who order a flat white.
For what value of $p$ is the variance of $B(40, p)$ greatest? What is that maximum variance?
$X \sim B(n, 0.25)$ has variance $\text{Var}(X) = 30$. Find $n$.
Odd one out: Three of these statements about $X \sim B(100, 0.5)$ are correct. Which one is NOT?
Earlier you predicted how the spread of $X$ changes as $p$ gets close to $0$ or $1$. The battery example, $X \sim B(100, 0.05)$, puts that prediction to the test.
The answer: variance $= npq = 100 \times 0.05 \times 0.95 = 4.75$, so SD $= \sqrt{4.75} \approx 2.18$. The typical spread is about $2$. Because $p$ is so close to $0$, the distribution is tight near its mean of $5$ — most batches will contain $3$–$7$ defectives. SD shrinks at the extremes of $p$ and is largest at $p = 0.5$.
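One way to see why the spread peaks at $p = 0.5$ is to complete the square in $pq$. This is a short algebraic sketch using only $q = 1 - p$:

```latex
\begin{align*}
pq &= p(1-p) = p - p^2 \\
   &= \tfrac{1}{4} - \left(p - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4},
\end{align*}
\text{so } \text{Var}(X) = npq \le \tfrac{n}{4},
\text{ with equality exactly when } p = \tfrac{1}{2}.
```

The squared term is zero only at $p = 0.5$ and grows as $p$ moves towards $0$ or $1$, which is why $pq$ (and with it the variance) shrinks at the extremes.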
Q1. $X \sim B(120, 0.25)$. Find the variance and standard deviation of $X$. (2 marks)
Q2. In a factory, $6\%$ of phones produced are defective. A sample of $250$ phones is tested. Find the standard deviation of the number of defective phones, and interpret what it means. (3 marks)
Q3. A binomial random variable $X \sim B(n, p)$ has mean $24$ and variance $19.2$. Find $n$ and $p$. (3 marks)
Comprehensive answers
Activity answers:
1. $q = 0.6$. $\text{Var}(X) = 50 \times 0.4 \times 0.6 = 12$. $\text{SD}(X) = \sqrt{12} \approx 3.464$.
2. $p = 1/6$, $q = 5/6$. $\text{Var}(X) = 36 \times \frac{1}{6} \times \frac{5}{6} = 5$. $\text{SD}(X) = \sqrt{5} \approx 2.236$.
3. $\text{Var}(X) = 80 \times 0.15 \times 0.85 = 10.2$. $\text{SD}(X) = \sqrt{10.2} \approx 3.194$.
4. $pq$ maximised at $p = 0.5$, giving max variance $= 40 \times 0.5 \times 0.5 = 10$.
5. $n \times 0.25 \times 0.75 = 30 \Rightarrow 0.1875 n = 30 \Rightarrow n = 160$.
Q1 (2 marks): $q = 0.75$; $\text{Var}(X) = 120 \times 0.25 \times 0.75 = 22.5$ [1]. $\text{SD}(X) = \sqrt{22.5} \approx 4.743$ [1].
Q2 (3 marks): $X \sim B(250, 0.06)$, $q = 0.94$ [1]. $\text{Var}(X) = 250 \times 0.06 \times 0.94 = 14.1$; $\text{SD}(X) = \sqrt{14.1} \approx 3.755$ phones [1]. Interpretation: mean defectives $= 15$, so most samples contain roughly $15 \pm 3.76$ defectives, i.e., between $11$ and $19$ [1].
Q3 (3 marks): $np = 24$ and $npq = 19.2$; dividing: $q = 19.2/24 = 0.8$, so $p = 0.2$ [1]. Then $n \times 0.2 = 24 \Rightarrow n = 120$ [1]. Verify: $\text{Var}(X) = 120 \times 0.2 \times 0.8 = 19.2$ ✓ [1].
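The divide-then-solve method used in Q3 can be sketched as a small function. The name `binomial_from_moments` is hypothetical, and the sketch assumes the given mean and variance really do come from a binomial distribution.

```python
def binomial_from_moments(mean, variance):
    """Recover (n, p) for X ~ B(n, p) from mean = np and variance = npq."""
    if not 0 < variance < mean:
        raise ValueError("need 0 < variance < mean for a binomial with 0 < p < 1")
    q = variance / mean   # npq / np = q
    p = 1 - q
    n = mean / p          # np = mean, so n = mean / p
    # n should be a whole number; rounding only removes floating-point noise
    return round(n), p

n, p = binomial_from_moments(24, 19.2)
print(n, round(p, 3))
```

The same division step handles questions like activity 5, where $p$ is given and only $n$ is unknown: there $n = \text{Var}(X) / (pq)$ directly.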