The Least Squares Regression Line
The AFL uses regression lines fitted to player performance data to build salary models. The least-squares method gives one objectively defined line for a data set, although unusual points (outliers) can still pull that line noticeably. In this lesson you will interpret the equation $y = a + bx$ and understand what the gradient $b$ and y-intercept $a$ mean in real-world contexts.
If you could minimise the total vertical distance between all data points and a line, what would that give you? How is this mathematically better than drawing a line of best fit by eye?
The least squares regression line is the unique line that minimises the sum of the squared vertical distances from each data point to the line. It is given as $y = a + bx$.
$b$ is the gradient: for each one-unit increase in x, the predicted y changes by $b$ units. It is the rate of change.
$a$ is the y-intercept: the predicted value of y when x = 0. It may or may not have a meaningful real-world interpretation.
$a$ = y-intercept (base value), $b$ = gradient (rate)
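In symbols, writing the data points as $(x_1, y_1), \dots, (x_n, y_n)$ (notation introduced here for illustration, not taken from the lesson), the quantity being minimised is the sum of the squared vertical distances:

$$S(a, b) = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2$$

The least squares line is the choice of $a$ and $b$ that makes $S$ as small as possible. Each bracket is the vertical gap between an actual value $y_i$ and the value $a + b x_i$ predicted by the line.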
Key facts
- The HSC equation form: $y = a + bx$
- $b$ = gradient (rate of change); $a$ = y-intercept
- How to substitute to find predicted values
Concepts
- What $b$ means in the context of the two variables
- When $a$ has a meaningful real-world interpretation
- Why the regression line is "best" mathematically
Skills
- Interpret $a$ and $b$ in context for a given equation
- Substitute to predict y for a given x
- Identify when the y-intercept is or is not meaningful
The regression equation $y = a + bx$ has two components that are always interpreted in the context of the variables being studied:
The y-intercept $a$: the value of y when $x = 0$. This is the "starting value" or "base value." For example, if $y = 42 + 3.5x$ (where x = hours studied, y = exam score), then $a = 42$ means "a student who studied 0 hours is predicted to score 42%." This may represent prior knowledge.
The gradient $b$: the rate of change. For each 1-unit increase in x, y changes by $b$ units. For $y = 42 + 3.5x$, $b = 3.5$ means "for each additional hour of study, the predicted score increases by 3.5 percentage points."
What to write in your book
- $y = a + bx$: $a$ = predicted y when x = 0 (y-intercept). $b$ = change in y per unit increase in x (gradient).
- Interpret $b$ as: "For each additional [unit of x], the predicted y [increases/decreases] by [|b|] [units of y]."
- Interpret $a$ as: "When x = 0, the predicted y is [a]." Check if this is meaningful in context.
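The two interpretations above can be checked numerically. The following is a minimal sketch using the lesson's example $y = 42 + 3.5x$; the function name `predict` is a hypothetical helper chosen for illustration.

```python
def predict(a, b, x):
    """Return the predicted y from the line y = a + b*x."""
    return a + b * x

# Example equation from this lesson: y = 42 + 3.5x
a, b = 42, 3.5

at_zero = predict(a, b, 0)   # x = 0 gives the y-intercept a
at_one = predict(a, b, 1)    # one extra unit of x

print(at_zero)               # equals a
print(at_one - at_zero)      # equals b, the change per one-unit increase in x
```

Setting x = 0 returns the y-intercept, and moving x up by one unit changes the prediction by exactly the gradient.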
Quick check: For the equation $y = 15 + 4x$ (where x = advertising spend in thousands of dollars, y = monthly sales in thousands of dollars), what does the value 4 represent?
The gradient $b$ tells you the rate of change — how much y changes for every one-unit increase in x.
Positive $b$: As x increases, y increases. Example: $b = 3.5$ with x = hours of study, y = score → "for each additional hour of study, the predicted score increases by 3.5 points."
Negative $b$: As x increases, y decreases. Example: $b = -2.4$ with x = temperature, y = hot coffee sales → "for each 1°C increase in temperature, predicted coffee sales decrease by 2.4 units."
Units: The gradient has units of $\frac{\text{units of y}}{\text{units of x}}$. For score (%) per hour (h), the gradient is in %/h.
What to write in your book
- $b > 0$: as x increases, y increases (positive relationship).
- $b < 0$: as x increases, y decreases (negative relationship).
- Template: "For each additional [unit of x], the predicted [y variable] [increases/decreases] by [|b|] [units of y]."
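The template can be turned into a small sentence builder. This is only a sketch: `interpret_gradient` and its parameter names are hypothetical, and the wording follows the template above rather than any official marking guideline.

```python
def interpret_gradient(b, x_unit, y_name, y_unit):
    """Build the interpretation sentence for a gradient b."""
    if b == 0:
        return f"The predicted {y_name} does not change as x increases."
    # The sign of b picks the verb; the size |b| goes in the sentence.
    direction = "increases" if b > 0 else "decreases"
    return (f"For each additional {x_unit}, the predicted {y_name} "
            f"{direction} by {abs(b)} {y_unit}.")

# Gradients taken from examples in this lesson
print(interpret_gradient(3.5, "hour of study", "score", "points"))
print(interpret_gradient(-3, "absence", "grade", "percentage points"))
```

The sign of $b$ only decides between "increases" and "decreases", which is why the template uses $|b|$ for the amount.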
Which does NOT belong? Correct interpretations of $b = -3$ in the equation $y = 50 - 3x$ (x = absences, y = grade %):
The y-intercept $a$ is the predicted value of y when $x = 0$. Whether this is meaningful depends on the context:
Meaningful: If x = 0 is a realistic or relevant value.
- $y = 42 + 3.5x$ (study hours vs score): $a = 42$ means a student who studies 0 hours is predicted to score 42%. This could represent prior knowledge — reasonable.
- $y = 1800 + 250x$ (years of experience vs salary): $a = 1800$ is the predicted starting salary with 0 years of experience. Realistic.
Not meaningful: If x = 0 is unrealistic or outside the data range.
- $y = -15 + 2x$ (height in cm vs weight in kg for children): $a = -15$ means a child of height 0 cm weighs −15 kg. Nonsensical — x = 0 is far outside the data range.
What to write in your book
- $a$ = predicted y when x = 0. Ask: "Is x = 0 realistic in this context?"
- If x = 0 is within or near the data range → $a$ is meaningful.
- If x = 0 is outside the data range or nonsensical → $a$ is not meaningfully interpreted.
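The range check in these notes can be sketched as a rule of thumb. The function name, the `margin` parameter, and the data ranges below are all assumptions made for illustration. Context still matters as well: x = 0 must also make sense for the variable.

```python
def zero_in_data_range(x_min, x_max, margin=0.0):
    """Return True if x = 0 lies within the observed x-range.

    margin widens the range slightly to allow for 'near the data range'.
    """
    return (x_min - margin) <= 0 <= (x_max + margin)

# Hypothetical data ranges, chosen only to illustrate the check
print(zero_in_data_range(0, 12))     # study hours recorded from 0 to 12
print(zero_in_data_range(90, 150))   # children's heights from 90 cm to 150 cm
```

With these assumed ranges the first call returns `True` and the second returns `False`, matching the study-hours and height examples.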
Complete: In the equation $y = a + bx$, the value $b$ represents the ______ and tells us the change in $y$ for each ______ in $x$.
Worked examples
The regression equation for hours studied (x) vs exam score % (y) is $y = 42 + 3.5x$. Interpret $a$ and $b$ in context.
Using $y = 42 + 3.5x$, predict the score for a student who studies 8 hours.
For a regression of years of experience (x) vs annual salary in thousands of dollars (y), the equation is $y = 38 + 2.8x$. Explain what $b = 2.8$ means in context.
What to write in your book
- Template for b: "For each additional [unit of x], the predicted [y variable] [increases/decreases] by [|b|] [unit of y]."
- Template for a: "When x = 0, the predicted [y variable] is [a] [unit of y]."
- Always state units in your interpretation.
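As a sketch of the working for the second worked example, substituting $x = 8$ into $y = 42 + 3.5x$ gives:

$$y = 42 + 3.5(8) = 42 + 28 = 70$$

So a student who studies 8 hours is predicted to score 70%.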
The regression equation for daily temperature °C (x) vs electricity demand in MWh (y) for a city is $y = 420 - 8.5x$.
- Interpret $a = 420$ in context.
- Interpret $b = -8.5$ in context.
- Predict electricity demand on a 25°C day.
- Does $b < 0$ make sense in this context? Explain.
The least squares regression line minimises the sum of the squared vertical distances from the data points to the line. Squaring stops positive and negative errors from cancelling and penalises large errors more heavily than small ones, so a single outlier can pull the line noticeably towards itself. For a given data set, no other straight line has a smaller sum of squared errors, which makes this line objective and reproducible in a way that a line drawn by eye is not.
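To see what "best" means here, the sketch below fits a line to a small made-up data set and compares its sum of squared errors with that of a line chosen by eye. The data, the function names, and the by-eye values $a = 44$, $b = 3$ are all assumptions for illustration. The closed-form formulas used for $a$ and $b$ are the standard ones and are not given in this lesson.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # gradient
    a = y_bar - b * x_bar    # line passes through (x_bar, y_bar)
    return a, b

def sum_squared_errors(a, b, xs, ys):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only: hours studied and exam scores
hours = [1, 2, 3, 4, 5]
scores = [46, 50, 51, 57, 59]

a, b = least_squares(hours, scores)
best = sum_squared_errors(a, b, hours, scores)
by_eye = sum_squared_errors(44, 3, hours, scores)  # a plausible guess by eye

print(round(a, 2), round(b, 2))
print(round(best, 2), round(by_eye, 2))
```

For this data the fitted line has the smaller sum of squared errors, and the same holds against any other straight line you try.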
Q1. The regression equation for weekly rainfall (mm) (x) vs monthly crop yield (tonnes) (y) is $y = 12 + 0.8x$. (a) Interpret $a = 12$ in context. (b) Interpret $b = 0.8$ in context. (c) Predict the crop yield in a week where 30 mm of rain falls. (3 marks)
Q2. For a regression of weight (kg) (x) vs height (cm) (y) for babies aged 0–12 months, the equation is $y = -12 + 8.5x$. Explain whether the y-intercept $a = -12$ has a meaningful interpretation. (2 marks)
Answers
Activity: (1) When the temperature is 0°C, the predicted electricity demand is 420 MWh (demand on a very cold day, when heating use is high). (2) For each 1°C increase in temperature, predicted electricity demand decreases by 8.5 MWh. (3) $y = 420 - 8.5(25) = 420 - 212.5 = 207.5$ MWh. (4) Yes, $b < 0$ makes sense in this model: warmer temperatures reduce heating demand, so electricity usage decreases.
Q1 (3 marks): (a) $a = 12$ means: when rainfall = 0 mm, predicted crop yield is 12 tonnes (base yield from irrigation/soil moisture) [1]. (b) $b = 0.8$ means: for each additional mm of rainfall, predicted crop yield increases by 0.8 tonnes [1]. (c) $y = 12 + 0.8(30) = 12 + 24 = 36$ tonnes [1].
Q2 (2 marks): $a = -12$ does not have a meaningful interpretation [1]. It would mean a baby weighing 0 kg has a predicted height of −12 cm, which is impossible. Since x = 0 is outside the realistic data range for babies, the y-intercept is a mathematical artefact of the equation, not a meaningful real-world value [1].