Comprehensive assessment covering all 15 lessons: probability foundations, probability rules, conditional probability, independence and mutual exclusivity, discrete probability distributions, measures of centre and spread, representing data, comparing data sets, bivariate data analysis, regression analysis, random variables, the normal distribution, normal applications, the binomial distribution, and module synthesis.
Assessment
Select the best answer for each question. 1 mark each.
If $P(A) = 0.6$ and $P(B) = 0.5$, what is the maximum possible value of $P(A \cup B)$?
Events $A$ and $B$ are independent with $P(A) = 0.4$ and $P(B) = 0.5$. What is $P(A \cap B)$?
A data set has mean 20 and standard deviation 5. If every value is multiplied by 2, what is the new standard deviation?
Pearson's $r = 0.6$ between study hours and test scores. What percentage of variation in test scores is explained by study hours?
For a continuous random variable, $P(X = 5)$ equals:
IQ scores are $N(100, 15^2)$. Approximately what percentage of the population has an IQ above 130?
A fair coin is flipped 5 times. What is $P(X = 2)$ where $X$ = number of heads?
The regression line $\hat{y} = 10 + 3x$ predicts test score $y$ from hours studied $x$. A student who studied 8 hours scored 40. What is the residual?
Which of the following correlations is strongest?
For $X \sim B(100, 0.45)$, is a normal approximation appropriate?
Short Answer
A bag contains 4 red and 6 blue marbles. Two marbles are drawn without replacement.
(a) Find $P(\text{red then blue})$. (b) Find $P(\text{at least one red})$. (c) Are the events "first is red" and "second is red" independent? Justify your answer. 3 MARKS
For a data set: 12, 15, 18, 20, 22, 25, 28, 30, 35, 95.
(a) Calculate the mean and median. (b) Identify any outliers using the 1.5 × IQR rule. (c) Explain which measure of centre best represents this data set and why. 3 MARKS
A machine fills bottles with $\mu = 500$ mL and $\sigma = 8$ mL. The volumes are normally distributed.
(a) What percentage of bottles contain less than 484 mL? (b) The company wants to set a minimum fill volume such that only 2.5% of bottles are below it. Find this minimum volume. (c) A quality inspector tests 50 bottles and finds 3 underfilled. Explain why this does not necessarily mean the machine is malfunctioning. 3 MARKS
A study collects data on advertising spend ($x$, in \$000s) and sales ($y$, in \$000s) for 20 stores: $\bar{x} = 10$, $s_x = 3$, $\bar{y} = 80$, $s_y = 15$, $r = 0.80$.
(a) Find the equation of the least-squares regression line. (b) Predict sales when advertising spend is \$15,000. Is this interpolation or extrapolation? (c) The residual for a store that spent \$15,000 is $-5$. What were its actual sales? 3 MARKS
Q1: B. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. Maximum occurs when $P(A \cap B)$ is minimised. Minimum is $0.1$ (since $0.6 + 0.5 = 1.1 > 1$), giving $P(A \cup B) = 1.0$.
Q2: B. For independent events, $P(A \cap B) = P(A) \times P(B) = 0.4 \times 0.5 = 0.2$.
Q3: C. Multiplying by a constant scales the SD by that constant: $5 \times 2 = 10$.
Q4: B. Coefficient of determination $r^2 = 0.6^2 = 0.36 = 36\%$.
Q5: B. For continuous variables, probability at any single point is zero.
Q6: A. $130 = 100 + 2(15) = \mu + 2\sigma$. Approximately $2.5\%$ lie above $\mu + 2\sigma$.
Q7: B. $P(X = 2) = \binom{5}{2}(0.5)^5 = 10/32 = 5/16$.
Q8: C. $\hat{y}(8) = 10 + 3(8) = 34$. Residual = $40 - 34 = 6$.
Q9: B. Strength is measured by $|r|$. Values: 0.5, 0.8, 0.7, 0.3. So $r = -0.8$ is strongest.
Q10: B. Both $np = 45 \geq 5$ and $n(1-p) = 55 \geq 5$ are satisfied.
Q11 (3 marks): (a) $P(\text{red}) = \frac{4}{10}$, $P(\text{blue} \mid \text{red}) = \frac{6}{9} = \frac{2}{3}$. So $P(\text{red then blue}) = \frac{4}{10} \times \frac{2}{3} = \frac{8}{30} = \frac{4}{15}$ [1]. (b) $P(\text{both blue}) = \frac{6}{10} \times \frac{5}{9} = \frac{30}{90} = \frac{1}{3}$. So $P(\text{at least one red}) = 1 - \frac{1}{3} = \frac{2}{3}$ [1]. (c) Not independent. $P(\text{second red} \mid \text{first red}) = \frac{3}{9} = \frac{1}{3}$, but $P(\text{second red}) = P(\text{red then red}) + P(\text{blue then red}) = \frac{4}{10} \times \frac{3}{9} + \frac{6}{10} \times \frac{4}{9} = \frac{12 + 24}{90} = \frac{36}{90} = 0.4$. Since $\frac{1}{3} \neq 0.4$, the events are dependent [1].
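The Q11 working can be checked by brute-force enumeration. The sketch below is only a verification aid, not part of the marking scheme: it lists every ordered pair of distinct marbles (all equally likely when drawing without replacement) and counts the favourable ones. The helper name `prob` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

# 4 red and 6 blue marbles, as stated in the question.
marbles = ["R"] * 4 + ["B"] * 6

# Each ordered pair of distinct marbles is one equally likely outcome.
draws = list(permutations(range(len(marbles)), 2))

def prob(event):
    """Exact probability of an event over all ordered two-marble draws."""
    favourable = sum(1 for i, j in draws if event(marbles[i], marbles[j]))
    return Fraction(favourable, len(draws))

p_red_then_blue = prob(lambda a, b: a == "R" and b == "B")
p_at_least_one_red = prob(lambda a, b: "R" in (a, b))
p_first_red = prob(lambda a, b: a == "R")
p_second_red = prob(lambda a, b: b == "R")
p_both_red = prob(lambda a, b: a == "R" and b == "R")

print(p_red_then_blue)      # 4/15
print(p_at_least_one_red)   # 2/3
# Independence would require P(both red) == P(first red) * P(second red).
print(p_both_red, p_first_red * p_second_red)  # 2/15 4/25
```

Because $\frac{2}{15} \neq \frac{4}{25}$, the product rule fails, which agrees with the conclusion in part (c).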
Q12 (3 marks): (a) Mean = $\frac{300}{10} = 30$ [0.5]. Median = $\frac{22 + 25}{2} = 23.5$ [0.5]. (b) Taking $Q_1$ and $Q_3$ as the medians of the lower and upper halves: $Q_1 = 18$, $Q_3 = 30$, IQR = 12 [0.5]. Lower fence = $18 - 18 = 0$; upper fence = $30 + 18 = 48$. Outlier: 95 [0.5]. (c) The median (23.5) is better because the mean (30) is inflated by the outlier (95). The median is robust to extreme values [1].
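The Q12 summary statistics and the 1.5 × IQR check can be reproduced with a short script. This is a sketch that assumes the median-of-halves convention for quartiles; other conventions (including those built into statistics libraries) give slightly different quartile values, though 95 remains the only outlier here. The function name `quartiles_median_of_halves` is illustrative.

```python
from statistics import mean, median

data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 95]

def quartiles_median_of_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves.

    Assumed convention: for an odd-sized data set the middle value
    is excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])

q1, q3 = quartiles_median_of_halves(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(mean(data), median(data))  # 30 23.5
print(outliers)                  # [95]
```

Removing 95 and recomputing the mean shows how strongly a single extreme value pulls it, which is the point of part (c).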
Q13 (3 marks): (a) $484 = 500 - 16 = 500 - 2(8) = \mu - 2\sigma$. By the empirical rule, approximately $2.5\%$ of bottles contain less than 484 mL [1]. (b) By the empirical rule, about $2.5\%$ of values lie below $\mu - 2\sigma$, so the minimum volume = $\mu - 2\sigma = 500 - 16 = 484$ mL [1]. (c) With 2.5% expected underfilled, in 50 bottles we expect $50 \times 0.025 = 1.25$ underfilled on average. Finding 3 underfilled is somewhat above expectation but not extremely unusual; it could be natural sampling variation. A larger sample or formal hypothesis test would be needed to conclude malfunction [1].
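The Q13 answer relies on the empirical rule, which rounds the tail areas. As a rough cross-check, the sketch below uses Python's standard-library `NormalDist` for the exact normal values and a binomial sum for part (c). Treating the 50 bottles as independent with a 2.5% underfill chance is an assumption made only for this illustration.

```python
from math import comb
from statistics import NormalDist

# Fill volumes: mean 500 mL, standard deviation 8 mL (from the question).
fill = NormalDist(mu=500, sigma=8)

p_below_484 = fill.cdf(484)        # exact area below mu - 2*sigma, about 0.0228
cutoff = fill.inv_cdf(0.025)       # volume with exactly 2.5% below it, about 484.3 mL

# Assumption: bottles are independent, each underfilled with probability 0.025.
n, p = 50, 0.025
p_at_most_2 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(3))
p_at_least_3 = 1 - p_at_most_2     # about 0.13

print(round(p_below_484, 4), round(cutoff, 1), round(p_at_least_3, 2))
```

Under these assumptions, three or more underfilled bottles would turn up in roughly one sample of 50 out of every eight, which supports the "not extremely unusual" reasoning in part (c).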
Q14 (3 marks): (a) $b = r \times \frac{s_y}{s_x} = 0.80 \times \frac{15}{3} = 4$ [0.5]. $a = \bar{y} - b\bar{x} = 80 - 4(10) = 40$ [0.5]. $\hat{y} = 40 + 4x$ [0.5]. (b) $\hat{y}(15) = 40 + 4(15) = 100$. Predicted sales = $\$100,000$ [0.5]. The data range is not given, so the classification depends on whether $x = 15$ lies within the observed values. With $\bar{x} = 10$ and $s_x = 3$, most values lie within about $\bar{x} \pm 2s_x$, that is, 4 to 16, so $x = 15$ is near the upper edge but plausibly within the data: arguably interpolation [0.5]. (c) Residual = actual $-$ predicted, so actual = predicted + residual = $100 + (-5) = 95$. Actual sales = $\$95,000$ [0.5].
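The Q14 arithmetic can be reproduced from the summary statistics alone. The sketch below uses the standard least-squares relations $b = r \, s_y / s_x$ and $a = \bar{y} - b\bar{x}$; the variable and function names are illustrative, and all values are in \$000s as in the question.

```python
import math

# Summary statistics from the question (values in $000s).
x_bar, s_x = 10, 3
y_bar, s_y = 80, 15
r = 0.80

slope = r * s_y / s_x               # b = r * s_y / s_x
intercept = y_bar - slope * x_bar   # a = y_bar - b * x_bar

def predict(x):
    """Predicted sales for advertising spend x on the fitted line."""
    return intercept + slope * x

predicted = predict(15)
residual = -5                       # given in part (c)
actual = predicted + residual       # residual = actual - predicted

print(f"y-hat = {intercept:.0f} + {slope:.0f}x")            # y-hat = 40 + 4x
print(f"predicted = {predicted:.0f}, actual = {actual:.0f}")  # predicted = 100, actual = 95
```

The sign convention matters in part (c): a negative residual means the store sold less than the line predicts.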