Pearson's Correlation Coefficient
CSIRO researchers use Pearson's r to confirm whether rainfall patterns and crop yields in regional NSW are related enough to build prediction models. A single number between −1 and +1 captures both direction and strength of a linear relationship.
What would a single number that measures the strength and direction of correlation look like? What range of values would it need to cover all possible situations — from perfect positive to perfect negative? Write your thoughts before reading on.
Pearson's correlation coefficient $r$ is a number that measures the strength and direction of a linear relationship between two variables.
The range: $r$ is always between $-1$ and $+1$ inclusive. $r = +1$ is perfect positive; $r = -1$ is perfect negative; $r = 0$ means no linear correlation.
Sign gives direction; magnitude gives strength. The further $r$ is from 0 (closer to ±1), the stronger the correlation.
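The lesson does not give a formula, but $r$ is usually computed as $r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$. The sketch below is one minimal way to compute it. The function name `pearson_r` and the data are made up for illustration and are not part of the lesson.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data, not from any real study
hours = [1, 2, 3, 4, 5]
rising = [10, 20, 30, 40, 50]   # points lie exactly on an upward line
falling = [50, 40, 30, 20, 10]  # points lie exactly on a downward line

print(pearson_r(hours, rising))   # perfect positive: r is 1 (up to rounding)
print(pearson_r(hours, falling))  # perfect negative: r is -1 (up to rounding)
```

Real data rarely sits exactly on a line, so in practice $r$ lands somewhere strictly between the two extremes.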
Key facts
- $r$ ranges from $-1$ to $+1$
- $r = +1$, $r = -1$, and $r = 0$ each have specific meanings
- Guidelines for classifying strength from $r$
Concepts
- How the sign of $r$ indicates direction
- How the magnitude of $r$ indicates strength
- Why $r$ only measures linear (not curved) relationships
Skills
- Interpret a given $r$ value in context
- Choose which $r$ value matches a described scatterplot
- State the limitations of $r$
Pearson's $r$ packages both direction and strength into one number:
- Sign (+/−): Positive $r$ → points slope upward. Negative $r$ → points slope downward.
- Magnitude (distance from zero): The closer $|r|$ is to 1, the stronger (tighter) the relationship.
Strength guidelines:
| Range of $\lvert r \rvert$ | Strength |
|---|---|
| $0.9$ to $1.0$ | Strong |
| $0.6$ to $0.9$ | Moderate |
| $0.3$ to $0.6$ | Weak |
| $0$ to $0.3$ | Very weak / no correlation |
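The strength guidelines can be sketched as a small lookup. The function name `classify_strength` is hypothetical, and the table does not say which band a boundary value such as $0.9$ or $0.6$ belongs to, so this sketch assumes the boundary goes to the stronger band.

```python
def classify_strength(r):
    """Label the strength of a correlation using the lesson's guideline bands."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    size = abs(r)  # strength depends on magnitude only, not on sign
    # Assumption: a boundary value (0.9, 0.6, 0.3) counts as the stronger band
    if size >= 0.9:
        return "strong"
    if size >= 0.6:
        return "moderate"
    if size >= 0.3:
        return "weak"
    return "very weak / no correlation"

for value in (0.95, -0.72, 0.41, -0.05):
    print(value, classify_strength(value))
```

Because the function takes the absolute value first, $r = -0.72$ and $r = 0.72$ get the same strength label; only the direction differs.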
What to write in your book
- $r$ sign: positive → positive correlation; negative → negative correlation.
- $|r|$ from 0.9 to 1 = strong; 0.6 to 0.9 = moderate; 0.3 to 0.6 = weak; 0 to 0.3 = very weak/none.
- $r = 0$ does NOT mean no relationship — only no linear relationship.
Quick check: Which $r$ value indicates the strongest correlation?
When interpreting $r$, always state: (1) the direction, (2) the strength, and (3) what it means in context of the two variables.
Examples:
- $r = 0.87$: moderate positive linear correlation, just below the strong band ($|r|$ from 0.9 to 1.0). As [x variable] increases, [y variable] tends to increase.
- $r = -0.92$: strong negative linear correlation — as [x variable] increases, [y variable] tends to decrease strongly.
- $r = 0.41$: weak positive linear correlation — as [x variable] increases, [y variable] shows a slight tendency to increase, but the relationship is not consistent.
- $r = -0.05$: essentially no linear correlation — knowing [x variable] tells us almost nothing about [y variable].
Real example: For a study of age (x) and resting heart rate (y), $r = -0.68$ is interpreted as: "There is a moderate negative linear correlation between age and resting heart rate ($r = -0.68$). As age increases, resting heart rate tends to decrease."
What to write in your book
- Template: "There is a [strength] [direction] linear correlation between [x] and [y] ($r = $ [value])."
- Mention both the number and what it means in context of the two variables.
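The template above can be filled in mechanically. The sketch below is only an illustration: the function name `interpret` is hypothetical, the cut-offs are the lesson's guideline bands, and boundary values are assumed to fall in the stronger band.

```python
def interpret(r, x_name, y_name):
    """Fill in the lesson's interpretation template for a given r."""
    # Guideline bands from the lesson; boundary handling is an assumption
    bands = [(0.9, "strong"), (0.6, "moderate"), (0.3, "weak")]
    size = abs(r)
    strength = next((label for cutoff, label in bands if size >= cutoff), None)
    if strength is None:
        return (f"There is essentially no linear correlation between "
                f"{x_name} and {y_name} (r = {r}).")
    direction = "positive" if r > 0 else "negative"
    return (f"There is a {strength} {direction} linear correlation between "
            f"{x_name} and {y_name} (r = {r}).")

print(interpret(-0.68, "age", "resting heart rate"))
print(interpret(0.02, "shoe size", "exam mark"))  # made-up variables
```

A generated sentence covers direction and strength only. The third part of a full answer, what the value means for these two variables in context, still has to be written by hand.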
Which does NOT belong? Things you can tell from Pearson's $r$ alone:
Pearson's $r$ has important limitations that examiners test:
- $r$ only measures linear relationships. Two variables can have a perfect curved (non-linear) relationship with $r \approx 0$. Low $r$ does not mean no relationship — just no linear one.
- $r$ does not imply causation. A high $r$ tells you the variables are strongly associated, but it does not prove that one causes the other. (We will explore this in Lesson 4.)
- Outliers can distort $r$. A single outlier can pull $r$ toward 0 or toward ±1, making the relationship look weaker or stronger than it is for the main cluster of data.
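Two of these limitations can be checked numerically. The sketch below uses made-up data and a hypothetical helper `pearson_r` that applies the standard formula for $r$. It is an illustration only, not a method prescribed by the lesson.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of deviations (standard definition)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Curved relationship: y is completely determined by x (y = x^2),
# yet there is no overall upward or downward linear trend.
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs_curve]
r_curve = pearson_r(xs_curve, ys_curve)

# Outlier: five points on a perfect line, then one extra point far below it.
xs_line = [1, 2, 3, 4, 5]
ys_line = [2, 4, 6, 8, 10]
r_clean = pearson_r(xs_line, ys_line)
r_outlier = pearson_r(xs_line + [6], ys_line + [0])

print(round(r_curve, 2), round(r_clean, 2), round(r_outlier, 2))
```

For this data the curved set gives $r$ of about 0 even though $y$ depends entirely on $x$, and a single outlier drops $r$ from about 1 to about 0.14. A scatterplot would reveal both situations at a glance.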
What to write in your book
- $r \approx 0$: no linear relationship — there could still be a curved one.
- $r$ does not prove causation — even $r = \pm 1$ does not mean one variable causes the other.
- Outliers can distort $r$. Check the scatterplot, not just the $r$ value.
Complete: A value of $r = -0.92$ indicates a ______ ______ linear correlation.
Worked examples
For a dataset of weekly exercise hours (x) and body mass index (y), $r = 0.72$. Interpret this value in context.
For daily screen time (x) and hours of sleep (y), $r = -0.95$. Interpret this value in context.
Three scatterplots are described: (A) tightly grouped upward, (B) widely scattered downward, (C) random scatter. Match each to the most likely r value: $r = 0.95$, $r = -0.45$, $r = 0.02$.
What to write in your book
- Steps: (1) Note sign (direction), (2) Note magnitude (strength), (3) Write full interpretation with variable names.
- $r = 0.95$ → strong positive. $r = -0.45$ → weak negative. $r = 0.02$ → no linear correlation.
For each $r$ value below, state the direction, classify the strength, and write a sentence of interpretation. Assume x = advertising spend ($000s) and y = monthly sales ($000s).
- $r = 0.88$
- $r = -0.31$
- $r = 0.05$
- $r = -0.97$
At the start you thought about what range a single correlation number would need. The answer is $-1 \le r \le +1$: negative values capture negative correlation, positive values capture positive correlation, and the size (magnitude) captures the strength. $r = 0$ sits in the middle, meaning no linear relationship. This elegant range makes $r$ easy to interpret consistently.
Q1. For a study of daily exercise (minutes) and resting heart rate (bpm), $r = -0.84$. (a) What is the direction of this correlation? (b) What is the strength? (c) Write a full interpretation in context. (3 marks)
Q2. A researcher finds $r = 0.03$ for the relationship between a person's favourite colour and their reaction time. A student concludes "there is no relationship between these variables." Is the student correct? Explain. (2 marks)
Answers
Activity: (1) $r=0.88$: positive; $|r|$ falls in the 0.6–0.9 band, so moderate (close to the strong band). As advertising spend increases, sales tend to increase. (2) $r=-0.31$: weak negative. There is a slight tendency for higher advertising spend to go with lower sales (unusual; it may reflect confounding or chance). (3) $r=0.05$: no linear correlation. Knowing advertising spend tells us almost nothing about sales. (4) $r=-0.97$: strong negative. As advertising spend increases, sales tend to decrease strongly.
Q1 (3 marks): (a) Negative: as exercise increases, resting heart rate tends to decrease [1]. (b) $|r|=0.84$ falls in the 0.6–0.9 band → moderate, towards the strong end [1]. (c) "There is a moderate negative linear correlation between daily exercise and resting heart rate ($r=-0.84$). As exercise time increases, resting heart rate tends to decrease." [1]
Q2 (2 marks): The student is not fully correct. $r = 0.03$ indicates no linear relationship [1], but there could still be a non-linear (curved) relationship between the variables that $r$ cannot detect [1].