HSCScience Biology · Y12 · M8
Year 12 Biology · Module 8 · IQ3 · ⏱ ~45 min · Lesson 12 of 21 · Practice bank: 3 Short Answer

Epidemiology — Incidence, Prevalence, Mortality and Study Design

Every claim about disease — "smoking causes lung cancer," "obesity increases heart disease risk," "this vaccine reduces infection by 95%" — comes from epidemiology. This lesson builds the tools to understand how those claims are generated, what makes them reliable, and how to critically evaluate them.

Today's hook: More people have diabetes today than ever before — but is that because we're getting sicker, or because we're living longer and diagnosing better? How do epidemiologists separate real trends from statistical illusions?

Cancer treatment — epidemiological evaluation context

Epidemiology underpins IQ3 — how we measure whether treatments and prevention strategies work

THINK FIRST · DATA LITERACY
Is "More People Have Diabetes" the Same as "Diabetes Is Getting Worse"?

Between 2000 and 2022, the total number of Australians diagnosed with Type 2 diabetes more than doubled. Headlines reported this as evidence that Australia's diabetes epidemic was worsening catastrophically.

But over that same period, Australia's population also grew substantially — and aged significantly. More people were also being screened and diagnosed than ever before. A researcher argues that when you adjust for population size and age structure, the age-standardised incidence rate of Type 2 diabetes has actually been relatively stable or even declining in some age groups.

Before reading on:

Q1: What is the difference between the total number of cases of a disease and the rate of disease in a population? Why does this distinction matter for public health decisions?

Q2: Why might improved screening and diagnosis make a disease appear to be increasing even if the underlying rate is unchanged?

Scan these before reading
Epidemiology: The study of the distribution and determinants of disease in populations; generates data used to identify risk factors and guide public health policy.
Incidence: The number of new cases of a disease occurring in a population within a defined time period.
Prevalence: The total proportion of a population with a disease at a given point in time; includes both new and existing cases.
Confounding variable: A variable associated with both the exposure and outcome that can create a false or distorted apparent association.
Age-standardised rate: A rate adjusted for differences in age distribution between populations, allowing valid comparisons across groups with different age profiles.
Correlation vs causation: A statistical association between two variables does not prove that one causes the other; additional criteria (e.g. Bradford Hill) are needed.
Learning Intentions

Know

  • The definitions and formulas for incidence, prevalence, and mortality rate
  • The key features of cohort, case-control, cross-sectional, and randomised controlled trial (RCT) study designs
  • What a confounding variable is and why it matters
  • The difference between correlation and causation in epidemiological data

Understand

  • Why incidence and prevalence give different pictures of disease burden
  • Why age-standardised rates are needed for valid population comparisons
  • Why observational studies can establish association but not proof of causation alone
  • Why RCTs are considered the gold standard and when they cannot be used

Can Do

  • Calculate or interpret incidence, prevalence, and mortality rates from data
  • Identify the most appropriate study design for a given research question
  • Identify confounding variables in epidemiological scenarios
  • Evaluate whether epidemiological data supports a causal or merely associative conclusion
Key Point
Epidemiology is the foundation of IQ3 — you cannot evaluate whether a treatment or prevention strategy works without measuring disease rates, identifying risk factors, designing studies, and critically evaluating evidence.
1
Measuring Disease — Incidence, Prevalence and Mortality

Three different measurements of disease burden, each answering a different question about population health

Before any analysis of disease patterns can be done, the disease must be measured consistently. Epidemiologists use three core measures — incidence, prevalence, and mortality — each capturing a different aspect of disease burden. Confusing these three measures is one of the most common errors in interpreting health data.

Incidence — New Cases Over Time

What it measures: The rate at which new cases of a disease arise in a population over a defined time period. Answers the question: "How fast is this disease spreading or developing?"

Incidence rate = (Number of NEW cases in time period ÷ Population at risk) × 100,000

Example: If 1,800 people are newly diagnosed with melanoma in a population of 10 million in one year, the incidence rate is 18 per 100,000 per year.

Best used for: Measuring the risk of developing a disease; assessing whether a disease is becoming more or less common; evaluating the impact of prevention programs.

Interpretation note: Rising incidence can reflect genuinely increased disease burden, OR improved screening/diagnosis detecting cases that previously went undetected.
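The incidence formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `incidence_rate` and its parameters are illustrative choices, not part of the lesson, and the numbers are the melanoma example from the text.

```python
def incidence_rate(new_cases: int, population_at_risk: int, per: int = 100_000) -> float:
    """New cases in the time period, expressed per `per` people at risk."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    # Multiply before dividing so round numbers stay exact.
    return new_cases * per / population_at_risk


# Melanoma example from the text: 1,800 new cases among 10 million people in one year.
melanoma = incidence_rate(1_800, 10_000_000)
print(f"{melanoma:.1f} per 100,000 per year")  # 18.0 per 100,000 per year
```

Only NEW cases go in the numerator. People who already had the disease at the start of the period are not counted.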

Types of bias in epidemiological studies


Prevalence — All Existing Cases at a Point in Time

What it measures: The total proportion of a population that has a condition at a specific time point (point prevalence) or during a specified period (period prevalence). Answers: "How much of this disease exists in the community right now?"

Prevalence = (Number of EXISTING cases ÷ Total population) × 100

Example: If 1.3 million of Australia's 26 million people have Type 2 diabetes at a given time, prevalence is 5%.

Best used for: Healthcare planning (how many people need treatment?); allocating health resources; understanding disease burden on the healthcare system.

Key relationship: Prevalence ≈ Incidence × Average duration of disease (an approximation that holds when incidence and duration are roughly stable). A disease with relatively low incidence but long duration (e.g. Type 2 diabetes, which is chronic and lifelong) has high prevalence. A disease with high incidence but short duration (e.g. influenza, which resolves or kills quickly) has low prevalence relative to its incidence.
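A short sketch of the prevalence formula and the incidence × duration relationship may help here. The function names are illustrative, and the "chronic" and "acute" figures are hypothetical numbers chosen only to show the pattern; they are not real data for diabetes or influenza.

```python
def prevalence_percent(existing_cases: int, total_population: int) -> float:
    """All existing cases as a percentage of the total population."""
    return existing_cases * 100 / total_population


def approx_prevalence_per_100k(incidence_per_100k_per_year: float,
                               mean_duration_years: float) -> float:
    """Rough prevalence per 100,000, assuming stable incidence and duration."""
    return incidence_per_100k_per_year * mean_duration_years


# Type 2 diabetes example from the text: 1.3 million of 26 million people.
print(f"{prevalence_percent(1_300_000, 26_000_000):.1f}%")  # 5.0%

# Hypothetical chronic disease: low incidence, 20-year average duration.
chronic = approx_prevalence_per_100k(300, 20)
# Hypothetical acute disease: high incidence, one-week average duration.
acute = approx_prevalence_per_100k(5_000, 7 / 365)

print(f"chronic: about {chronic:.0f} per 100,000")  # chronic: about 6000 per 100,000
print(f"acute:   about {acute:.0f} per 100,000")    # acute:   about 96 per 100,000
```

Even though the hypothetical acute disease has far more new cases each year, far fewer people have it at any one moment, because each case lasts only a short time.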

Mortality Rate — Deaths From Disease

What it measures: The number of deaths attributable to a specific disease per unit of population per unit of time. Distinct from case fatality rate (proportion of cases that die).

Mortality rate = (Number of deaths from disease ÷ Population) × 100,000 per year

Example: If 1,800 people die of coronary heart disease per year in a population of 10 million, mortality rate is 18 per 100,000 per year.

Best used for: Measuring the severity of a disease; assessing the impact of treatment advances; comparing the lethality of different diseases.

Important distinction: A disease can have high incidence but low mortality (e.g. most skin cancers — common but rarely fatal if caught early) or low incidence but high mortality (e.g. pancreatic cancer — rare but ~90% mortality within 5 years).
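The difference between mortality rate and case fatality rate comes down to the denominator, which the sketch below makes explicit. The function names are illustrative, and the two disease scenarios are hypothetical numbers (not real cancer statistics) chosen to mirror the "common but rarely fatal" and "rare but lethal" contrast.

```python
def mortality_rate(deaths: int, population: int, per: int = 100_000) -> float:
    """Deaths from the disease per `per` people in the WHOLE population."""
    return deaths * per / population


def case_fatality_percent(deaths: int, cases: int) -> float:
    """Deaths as a percentage of people who HAVE the disease."""
    return deaths * 100 / cases


# Coronary heart disease example from the text.
print(mortality_rate(1_800, 10_000_000))  # 18.0

POPULATION = 10_000_000  # hypothetical population for both scenarios below

# Hypothetical rare but lethal disease: 1,000 cases, 900 deaths.
rare_mortality = mortality_rate(900, POPULATION)
rare_cfr = case_fatality_percent(900, 1_000)

# Hypothetical common but rarely fatal disease: 100,000 cases, 500 deaths.
common_mortality = mortality_rate(500, POPULATION)
common_cfr = case_fatality_percent(500, 100_000)

print(rare_mortality, rare_cfr)      # 9.0 90.0
print(common_mortality, common_cfr)  # 5.0 0.5
```

In this made-up comparison the two diseases have similar population mortality rates, yet one kills 90% of the people who get it and the other 0.5%. The two measures answer different questions.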

Age-standardisation — making fair comparisons

Raw (crude) rates cannot always be fairly compared between populations with different age structures. An older population will tend to have higher crude rates of age-related diseases (cancer, cardiovascular disease, dementia) simply because it contains more older people, not necessarily because those diseases are more common at any given age. Age-standardisation applies a single standard age distribution to both populations, so the underlying disease rates can be compared on a level playing field.

This is why Australia's age-standardised cancer mortality has been falling for decades even as the total number of cancer deaths has risen — improved treatment has reduced the death rate per case, but the population is larger and older, producing more total deaths despite the improved rate.
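One common way to age-standardise is direct standardisation: calculate the rate within each age group, then weight those rates by a fixed standard age distribution. The sketch below assumes that method. Every number in it is hypothetical, including the three age bands and the standard weights, and real agencies use many more age groups and a published standard population.

```python
# Hypothetical share of the standard population in each age band (sums to 1).
AGE_GROUPS = ["0-39", "40-64", "65+"]
STANDARD_WEIGHTS = [0.5, 0.3, 0.2]


def crude_rate(deaths, populations, per=100_000):
    """Total deaths over total population, ignoring age structure."""
    return sum(deaths) * per / sum(populations)


def age_standardised_rate(deaths, populations, weights, per=100_000):
    """Weight each age-specific rate by the standard population share."""
    return sum(w * d * per / p for d, p, w in zip(deaths, populations, weights))


# Hypothetical older population: lower risk at every age, but many elderly people.
older_deaths, older_pops = [40, 300, 2_400], [400_000, 300_000, 300_000]
# Hypothetical younger population: higher risk at every age, but few elderly people.
younger_deaths, younger_pops = [72, 360, 1_000], [600_000, 300_000, 100_000]

older_crude = crude_rate(older_deaths, older_pops)
younger_crude = crude_rate(younger_deaths, younger_pops)
older_std = age_standardised_rate(older_deaths, older_pops, STANDARD_WEIGHTS)
younger_std = age_standardised_rate(younger_deaths, younger_pops, STANDARD_WEIGHTS)

print(f"crude:        older {older_crude:.1f}, younger {younger_crude:.1f}")
# crude:        older 274.0, younger 143.2
print(f"standardised: older {older_std:.1f}, younger {younger_std:.1f}")
# standardised: older 195.0, younger 242.0
```

The crude rates make the older population look worse, but once both are given the same age structure the ranking reverses, because the younger population has the higher rate within every age band.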

Common Error
Students confuse incidence and prevalence. Key distinction: incidence = NEW cases in a time period (a rate of new events); prevalence = ALL existing cases at a point in time (a snapshot). A disease with effective treatment that extends life will see rising prevalence even if incidence is stable or falling — because people live longer with the disease. HIV is the classic example: effective antiretroviral therapy means fewer people die, so prevalence rises even as incidence (new infections) falls in high-income countries.
What to write in your book
  • Incidence = NEW cases ÷ population at risk × 100,000/yr (risk of developing disease).
  • Prevalence = ALL existing cases ÷ total population × 100 (% snapshot); Prevalence ≈ Incidence × duration.
  • Mortality rate = deaths ÷ population × 100,000/yr.
  • Age-standardised rates adjust for age structure → fair comparison between populations.

A disease with effective treatment that extends life shows rising prevalence even though incidence is falling. Why?

2
Epidemiological Study Designs — From Observation to Experiment

Different questions require different study designs — each with characteristic strengths, limitations, and appropriate uses

Epidemiologists cannot randomly assign people to smoke cigarettes or eat unhealthy diets for decades to study the effect on health — most important questions about disease and exposure must be studied observationally. The choice of study design determines what questions can be answered and what conclusions can be drawn.

Cohort Study (Prospective)

Design: A group of disease-free people is followed over time. Exposed and unexposed subgroups are compared for disease development.

Strength: Establishes temporal sequence (exposure before disease); good for common outcomes; can study multiple outcomes from one exposure.

Limitation: Slow and expensive; loss to follow-up; not practical for rare diseases.

Example: British Doctors Study (Doll and Hill) — followed 40,000 doctors from 1951, comparing smoking status to lung cancer rates over decades.
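Because a cohort study follows disease-free people forward, it can measure how often disease develops in each group and compare the two. One standard summary is the relative risk. The sketch below uses hypothetical counts (not figures from the British Doctors Study), and the function names are illustrative.

```python
def cumulative_incidence(new_cases: int, people_at_risk: int) -> float:
    """Proportion of an initially disease-free group that develops the disease."""
    return new_cases / people_at_risk


def relative_risk(exposed_cases: int, exposed_total: int,
                  unexposed_cases: int, unexposed_total: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cumulative_incidence(exposed_cases, exposed_total)
            / cumulative_incidence(unexposed_cases, unexposed_total))


# Hypothetical cohort, all disease-free at baseline:
# 10,000 smokers (150 develop lung cancer) and 30,000 non-smokers (30 develop it).
rr = relative_risk(150, 10_000, 30, 30_000)
print(f"relative risk = {rr:.1f}")  # relative risk = 15.0
```

A relative risk of 1 means no association, above 1 means the exposed group has higher risk, and below 1 suggests a protective association.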

Case-Control Study (Retrospective)

Design: People with a disease (cases) are compared with disease-free people (controls). Past exposures are compared between groups.

Strength: Efficient for rare diseases; quick and inexpensive; can study multiple exposures simultaneously.

Limitation: Relies on recalled exposure (recall bias); cannot establish temporal sequence as clearly; cannot directly calculate incidence.

Example: Comparing asbestos exposure history in mesothelioma patients vs controls without mesothelioma.

Cross-Sectional Study

Design: Measures both exposure and disease at the same time point — a population snapshot.

Strength: Quick and cheap; good for measuring prevalence; generates hypotheses for further study.

Limitation: Cannot establish which came first (exposure or disease); susceptible to prevalence bias; cannot calculate incidence.

Example: National Health Survey measuring smoking status and cardiovascular disease in a sample of Australians at one time point.

Randomised Controlled Trial (RCT)

Design: Participants randomly allocated to intervention (treatment/exposure) or control (placebo/no treatment) groups. Outcomes compared after defined follow-up period.

Strength: Randomisation controls for confounders — the gold standard for establishing causation. Double-blinding reduces bias.

Limitation: Cannot be used for harmful exposures (unethical); expensive; may lack real-world generalisability.

Example: HPV vaccine trials — participants randomly assigned to vaccine or placebo, HPV infection and precancerous lesion rates compared.
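The claim that randomisation controls for confounders can be illustrated with a small simulation. This is a sketch under stated assumptions: the 10,000 participants, the 30% smoking rate, and the helper names `randomise` and `share_with` are all hypothetical, and a real trial would use a formal allocation procedure.

```python
import random


def randomise(participants, seed=0):
    """Shuffle participants and split them into two equal-sized arms."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def share_with(trait, group):
    """Proportion of a group for whom `trait` is True."""
    return sum(1 for person in group if person[trait]) / len(group)


# Hypothetical trial population: 10,000 people, 30% of whom smoke.
# Smoking stands in for any confounder, measured or unmeasured.
participants = [{"id": i, "smoker": i < 3_000} for i in range(10_000)]

treatment, control = randomise(participants)
print(f"smokers in treatment arm: {share_with('smoker', treatment):.3f}")
print(f"smokers in control arm:   {share_with('smoker', control):.3f}")
```

Both arms end up with close to 30% smokers without anyone having to measure smoking or match on it. The same balancing applies, on average, to confounders nobody thought to record, which is the property observational designs lack.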

HSC Tip
When asked to choose or evaluate a study design in an exam, always address: (1) whether it can establish temporal sequence (exposure before disease); (2) whether it controls for confounders; (3) whether it is ethical and practical. RCTs are gold standard but often impossible for disease risk questions. Cohort studies are the next best for establishing causation; case-control for rare diseases; cross-sectional for prevalence.
What to write in your book
  • Cohort (prospective): follow exposed/unexposed forward → establishes temporal sequence.
  • Case-control (retrospective): cases vs controls, look back at exposure → efficient for rare diseases (recall bias).
  • Cross-sectional: snapshot of exposure + disease at one time → measures prevalence, can't show which came first.
  • RCT: randomised, gold standard for causation; can't use for harmful exposures (unethical).

Why can't a randomised controlled trial be used to study whether smoking causes lung cancer?

3
Confounding Variables, Bias and the Limits of Epidemiological Evidence

Association ≠ causation — understanding what can go wrong in epidemiological studies

Epidemiology measures associations between exposures and diseases in real populations — which means it must contend with all the complexity of real life. Confounding variables, biases, and chance findings can all produce apparent associations that are not genuinely causal. Critical evaluation of epidemiological evidence requires recognising these limitations.

Confounding variables

A confounding variable is one that is associated with both the exposure being studied and the disease outcome, and whose presence can create a spurious or distorted apparent relationship. Classic example: a study finds that coffee drinking is associated with lung cancer. Apparent conclusion: coffee causes lung cancer. But coffee drinkers in the 1950s–1980s were also far more likely to smoke. Smoking is the confounder — it is associated with both coffee drinking (same social context) and lung cancer (causally). When you control for smoking status, the coffee-cancer association largely disappears.

Confounders can be controlled by: matching cases and controls on confounding variables; statistical adjustment; stratified analysis; or — best of all — randomisation (which distributes confounders equally between groups by chance).
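Stratified analysis can be sketched with the coffee and lung cancer example. All counts below are hypothetical and deliberately constructed so that coffee has no effect within either smoking group; the function names are illustrative.

```python
def risk(cases: int, total: int) -> float:
    return cases / total


def risk_ratio(exposed, unexposed) -> float:
    """Each argument is a (cases, total) pair."""
    return risk(*exposed) / risk(*unexposed)


# Hypothetical counts as (lung cancer cases, people), split by smoking status.
# Smokers are mostly coffee drinkers; non-smokers mostly are not.
strata = {
    "smokers":     {"coffee": (80, 800), "no_coffee": (20, 200)},
    "non_smokers": {"coffee": (2, 200),  "no_coffee": (8, 800)},
}


def pooled(exposure):
    """Combine both strata, as a study that ignored smoking would."""
    cases = sum(s[exposure][0] for s in strata.values())
    total = sum(s[exposure][1] for s in strata.values())
    return cases, total


crude_rr = risk_ratio(pooled("coffee"), pooled("no_coffee"))
stratum_rr = {name: risk_ratio(s["coffee"], s["no_coffee"]) for name, s in strata.items()}

print(f"crude risk ratio: {crude_rr:.2f}")  # crude risk ratio: 2.93
for name, rr in stratum_rr.items():
    print(f"{name}: {rr:.2f}")
# smokers: 1.00
# non_smokers: 1.00
```

Ignoring smoking, coffee drinkers appear to have nearly three times the risk. Within each smoking group the ratio is 1, so in this constructed example the whole crude association comes from the confounder.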

Types of bias

  • Selection bias: The sample does not represent the target population. Example: the healthy worker effect (employed people are, on average, healthier than the general population, so comparing an occupational group with the general population can understate the harm of a workplace exposure).
  • Recall bias: Cases (who have a disease) may remember past exposures differently from controls (who do not). People who have developed cancer may more carefully recall exposure to potential carcinogens than healthy controls.
  • Information bias: Systematic errors in measuring exposure or outcome. Example: misclassification of disease status or exposure level.
  • Reporting bias: Some results are more likely to be reported or published than others. Example: publication bias, where positive results are more publishable than null findings.

Correlation vs causation

Two variables can be correlated (statistically associated) without one causing the other. The classic examples: ice cream sales correlate with drowning rates (both rise in summer — confounded by hot weather). Countries with higher chocolate consumption have more Nobel Prize winners per capita (confounded by wealth and education). In epidemiology, establishing causation requires more than statistical association — it requires the Bradford Hill criteria (from L08): strength, consistency, specificity, temporality, dose-response, biological plausibility, coherence, experiment, and analogy.

Bradford Hill criteria, with tobacco and lung cancer as the example

  • Strength (large relative risk): smokers have 15–25× higher lung cancer risk than non-smokers.
  • Consistency (association replicated in multiple studies/populations): found in studies across dozens of countries and populations.
  • Temporality (exposure precedes disease): smoking precedes cancer by 20–40 years.
  • Dose-response (more exposure = more disease): more pack-years = higher risk; quitting reduces risk.
  • Biological plausibility (known mechanism): PAHs form DNA adducts → G→T mutations in TP53 (L08).
  • Specificity (exposure linked to specific disease(s)): tobacco specifically causes lung and other cancers, not all diseases equally.
Epidemiological measures showing incidence, prevalence and mortality rate definitions

The three core epidemiological measures and why age-standardised rates are essential for valid comparisons between populations.

IQ3 Framing
The IQ3 inquiry question asks you to "investigate the treatment of non-infectious diseases." Epidemiology is the foundation of that investigation — you cannot evaluate whether a treatment or prevention strategy works without measuring disease rates, identifying risk factors, designing studies to test interventions, and critically evaluating the evidence. The skills in this lesson apply to everything in IQ3 and IQ4.
What to write in your book
  • Confounder: a variable associated with BOTH exposure and outcome (e.g. smoking confounds coffee–lung cancer).
  • Control confounders: matching, statistical adjustment, stratification, or randomisation.
  • Biases: selection, recall, information, reporting/publication.
  • Correlation ≠ causation → need Bradford Hill criteria (strength, consistency, temporality, dose-response, plausibility…).

A variable associated with both the exposure and the disease outcome, which can create a false apparent association, is called a _____ variable.

4
Reading and Interpreting Epidemiological Data

The practical skills needed to interpret tables, graphs, and data from studies — tested directly in HSC exams

HSC Biology exams regularly include tables or graphs of epidemiological data and ask students to interpret, analyse, and evaluate them. These questions test whether you can read what the data shows (describe), identify patterns and relationships (analyse), and assess whether the data supports a conclusion (evaluate).

Worked example — interpreting a data table

The following table shows hypothetical data on Type 2 diabetes in Australia:

Year | Total diagnosed cases | Population (millions) | Crude prevalence (%) | Age-standardised prevalence (%)
2000 | 640,000 | 19.2 | 3.3 | 4.1
2010 | 970,000 | 22.3 | 4.4 | 4.3
2022 | 1,300,000 | 25.9 | 5.0 | 4.2

What you should notice and state:

  • Total cases increased by ~100% from 2000 to 2022 — but this partly reflects population growth.
  • Crude prevalence increased from 3.3% to 5.0% — but this partly reflects the ageing of the population (older people have higher T2D rates).
  • Age-standardised prevalence changed much less (4.1% → 4.2%) — suggesting the underlying disease rate in comparable age groups has been relatively stable, not dramatically increasing. Much of the apparent increase reflects demographic change rather than worsening epidemic.
  • This illustrates why age-standardised rates are essential for valid comparisons over time and between populations.
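The comparison in the points above can be reproduced by calculating the relative change in each column between 2000 and 2022. This sketch uses the hypothetical values exactly as printed in the table; `percent_change` is an illustrative helper name.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return (new - old) * 100 / old


# (year, diagnosed cases, crude prevalence %, age-standardised prevalence %)
rows = [
    (2000, 640_000, 3.3, 4.1),
    (2010, 970_000, 4.4, 4.3),
    (2022, 1_300_000, 5.0, 4.2),
]

first, last = rows[0], rows[-1]
cases_change = percent_change(first[1], last[1])
crude_change = percent_change(first[2], last[2])
standardised_change = percent_change(first[3], last[3])

print(f"total cases:                 +{cases_change:.1f}%")         # +103.1%
print(f"crude prevalence:            +{crude_change:.1f}%")         # +51.5%
print(f"age-standardised prevalence: +{standardised_change:.1f}%")  # +2.4%
```

Each adjustment shrinks the apparent increase: dividing by population size removes about half of it, and standardising for age removes most of the rest.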
Exam Technique
When asked to "analyse" epidemiological data in HSC exams: (1) Describe the overall trend; (2) Quote specific data values from the table/graph to support your description; (3) Identify any patterns, anomalies, or differences between groups; (4) If asked to evaluate, state what conclusions can and cannot be drawn — always note limitations (confounders, correlation vs causation, age-standardisation).
What to write in your book
  • Rising total cases ≠ rising rate (check population size).
  • Crude rate vs age-standardised rate (check age structure).
  • Rising prevalence ≠ rising incidence (check treatment/survival).
  • Always quote specific data values in "analyse" answers.

True or false: If the total number of diagnosed cases of a disease rises, the underlying disease rate must also be rising.

True or false: Incidence measures the number of new cases of a disease in a population over a specific time period.

True or false: Prevalence and incidence are the same measure and can be used interchangeably in epidemiological studies.

Activity 1
Apply · Band 4

Calculating and Comparing Disease Measures

Use the data provided to calculate and interpret epidemiological measures. Show your working.

1. In a population of 5 million people, 450 new cases of bowel cancer are diagnosed in one year. Of those already living with bowel cancer (total existing cases = 8,500), 90 die from the disease during that year. Calculate: (a) the incidence rate per 100,000 per year; (b) the prevalence (%); (c) the case fatality rate (% of existing cases who die).

2. The table below shows cardiovascular disease data for two countries in the same year. Interpret the data and explain what can and cannot be concluded from these crude vs age-standardised rates.

Country | CVD deaths | Population | Crude mortality (per 100k) | Age-standardised mortality (per 100k)
Country A | 48,000 | 24 million | 200 | 145
Country B | 18,000 | 12 million | 150 | 190
Activity 2
Analyse · Band 5

Choosing and Evaluating Study Designs

For each research question, identify the most appropriate study design, justify your choice, and identify one major limitation or potential confounding variable.

  1. Researchers want to test whether a new drug reduces Type 2 diabetes progression in patients with insulin resistance. The drug has been safety-tested in Phase 1 and 2 trials and is believed to be beneficial.
  2. Researchers want to investigate whether childhood sun exposure (before age 10) increases adult melanoma risk. Participants are adults aged 40–60 who either have or do not have melanoma.
  3. A study finds that people who drink red wine have lower rates of cardiovascular disease than non-drinkers. A journalist reports: "Red wine prevents heart disease." Identify at least two confounding variables that could explain this association, and explain why the study design cannot establish causation.
How a Cohort Study Changed Medicine and Public Policy

In 1951, Richard Doll and Austin Bradford Hill sent questionnaires to every doctor on the British Medical Register asking about their smoking habits. They then followed these ~40,000 doctors for decades, recording causes of death. This was one of the first large prospective cohort studies — and it produced the most compelling epidemiological evidence for the smoking-lung cancer causal link.

Within 4 years, the data were clear enough that Doll himself — a smoker — quit. After 50 years of follow-up, the study had quantified that smoking reduced life expectancy by approximately 10 years, established the dose-response relationship between pack-years and lung cancer mortality, and documented the survival benefit of quitting at different ages. Doctors who quit before age 35 had near-normal life expectancy; those who quit at 65 had reduced but still significant benefit.

The study design was crucial: by following people forward in time (prospective cohort), it established that smoking preceded lung cancer — ruling out reverse causation. By following a large, well-defined professional cohort with reliable death certification, it minimised selection bias and information bias. The results were consistent across subgroups, showed a clear dose-response, and had an identified biological mechanism (carcinogens in smoke). This is exactly how Bradford Hill's criteria for causation are applied in practice.

PRIORITY MISCONCEPTIONS — EPIDEMIOLOGY
✗ "Prevalence and incidence mean the same thing."
✓ Incidence is the rate of NEW cases arising per unit time. Prevalence is the total existing cases at a point in time. A disease with effective treatment that extends life (e.g. HIV in high-income countries) will have rising prevalence even if incidence is falling — because people live longer with the disease. Always specify which measure you are using.
✗ "Correlation means causation."
✓ Statistical association between an exposure and disease does not prove causation. A third variable (confounder) may explain the association. Causation requires temporal sequence, dose-response, biological plausibility, consistency, and ideally experimental confirmation — the Bradford Hill criteria.
✗ "RCTs can always be used to test hypotheses about disease causes."
✓ RCTs cannot ethically be used to study harmful exposures. You cannot randomly assign people to smoke for 20 years to study lung cancer. For questions about harmful exposures, observational studies (cohort, case-control) are the only ethical approach. RCTs are used for testing treatments and preventive interventions, not for studying harmful exposures.
✗ "A study with more participants is always better."
✓ Sample size matters, but study design matters more. A very large cross-sectional study cannot establish temporal sequence — it cannot determine whether the exposure preceded the disease. A large observational study with uncontrolled confounders will produce a large, precisely wrong answer. Design quality, control of bias, and appropriate methods are more important than size alone.
✗ "If the number of cases (or the crude rate) of a disease is rising, the disease is becoming more common."
✓ A rising case count or crude rate can reflect: a genuine increase in disease burden; population growth (more absolute cases at the same rate); population ageing (more people in high-risk age groups, which raises the crude rate); improved screening and diagnosis; or changes in diagnostic criteria. Always ask whether a figure is a count or a rate, and whether the rate is crude (unadjusted) or age-standardised, before interpreting a trend.

Three Disease Measures

  • Incidence = NEW cases ÷ population at risk × 100,000 per year
  • Prevalence = ALL existing cases ÷ total population × 100 (%)
  • Mortality rate = deaths ÷ population × 100,000 per year
  • Age-standardised: adjusts for age structure to allow fair comparison

Study Designs

  • Cohort (prospective): follows exposed/unexposed forward in time
  • Case-control (retrospective): cases vs controls, looks back at exposure
  • Cross-sectional: snapshot of exposure + disease at one time
  • RCT: randomised, gold standard for causation, can't use for harmful exposures

Confounding + Bias

  • Confounding variable: associated with both exposure AND outcome
  • Selection bias: sample not representative
  • Recall bias: cases remember exposure differently to controls
  • Correlation ≠ causation — need Bradford Hill criteria

Data Interpretation

  • Rising total cases ≠ rising rate (check population size)
  • Crude rate vs age-standardised rate (check age structure)
  • Rising prevalence ≠ rising incidence (check treatment/survival)
  • Always quote data values in exam answers
Short Answer — 14 marks

1. (Apply · Band 4 · 4 marks) Distinguish between incidence and prevalence, and explain why effective treatment for a disease can cause its prevalence to rise even if its incidence is falling. Use a specific example in your answer.

2. (Analyse · Band 4–5 · 5 marks) A researcher is investigating whether regular physical activity reduces the risk of Type 2 diabetes. Describe how you would design a cohort study to investigate this question. Identify the cohort, the exposure and outcome variables, how data would be collected, and what would constitute evidence of an association. Identify one confounding variable and explain how it would be controlled.

3. (Evaluate · Band 5–6 · 5 marks) Evaluate the following claim using your knowledge of epidemiological evidence and study design: "Because an RCT is the gold standard for medical evidence, we should require RCT evidence before accepting any claim that an environmental exposure causes disease."

Answers


Activity 1 — Calculations and Interpretation

1. Bowel cancer calculations. (a) Incidence rate = 450 ÷ 5,000,000 × 100,000 = 9 per 100,000 per year. (b) Prevalence = 8,500 ÷ 5,000,000 × 100 = 0.17%. (c) Case fatality rate = 90 ÷ 8,500 × 100 = 1.06% per year — about 1 in 100 existing patients dies of the disease each year, reflecting that many are diagnosed early and survive for years while a smaller advanced-disease group contributes most deaths.
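The three bowel cancer results can be verified with plain arithmetic. The sketch below uses the numbers given in the question; the variable names are illustrative.

```python
population = 5_000_000
new_cases = 450          # diagnosed during the year
existing_cases = 8_500   # everyone living with the disease
deaths = 90              # deaths from the disease during the year

incidence_per_100k = new_cases * 100_000 / population  # (a)
prevalence_pct = existing_cases * 100 / population      # (b)
case_fatality_pct = deaths * 100 / existing_cases       # (c)

print(f"(a) incidence: {incidence_per_100k:.0f} per 100,000 per year")  # (a) incidence: 9 per 100,000 per year
print(f"(b) prevalence: {prevalence_pct:.2f}%")                         # (b) prevalence: 0.17%
print(f"(c) case fatality: {case_fatality_pct:.2f}%")                   # (c) case fatality: 1.06%
```

Note the different denominators: (a) and (b) divide by the whole population, while (c) divides by the number of people who have the disease.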

2. CVD country comparison. From crude rates, Country A appears to have higher CVD mortality (200 vs 150 per 100,000). But age-standardised rates reverse this — Country B has the higher rate (190 vs 145). This reversal indicates Country A has an older population: the large elderly proportion inflates Country A's crude rate even though its underlying risk at each age is lower. Age-standardised rates are more valid for comparing underlying disease burden because they remove the confounding effect of different age structures. Country B has the greater underlying CVD risk despite fewer total deaths per 100,000 in the raw data.

Activity 2 — Study Design Evaluation

1. New drug for T2D. Best design: Randomised Controlled Trial. The drug is safety-tested and believed beneficial, so it is ethical to assign participants to drug vs placebo. Randomisation balances known and unknown confounders between the groups on average (groups have similar baseline characteristics by chance), so a difference in progression can be attributed to the drug with much greater confidence; double-blinding reduces assessment bias. Limitation: the trial population may not represent all T2D patients (often excludes very old, pregnant, or multi-morbid patients), limiting generalisability; trial duration may be too short for long-term effects.

2. Childhood sun exposure and melanoma. Best design: Case-control. We cannot follow children prospectively for 30–40 years, so we recruit adults who already have melanoma (cases) and adults without (controls) and compare recalled childhood sun exposure (retrospective). Limitation: recall bias — melanoma patients may recall childhood sun exposure more carefully than controls, artificially inflating the association. Partially controlled by objective measures (e.g. geographical sun-exposure records) rather than self-report.

3. Red wine and CVD. Confounder 1: socioeconomic status — moderate red-wine drinkers tend to have higher SES, which is independently associated with lower CVD risk (healthcare, diet, activity). Confounder 2: diet quality — wine drinkers often follow Mediterranean-style diets that independently reduce CVD risk. Why causation can't be established: the study is observational — it shows wine drinkers have lower CVD rates but cannot determine whether wine is causal or whether confounders explain the association. Without controlling these (statistical adjustment, matching, or an RCT), the headline claim is unjustified; existing RCT/polyphenol evidence does not support a strong protective effect.

Short Answer Model Answers

SA1 (4 marks): Incidence is the rate of new cases arising in a defined population over a specified time, (new cases ÷ population at risk) × 100,000 — it measures how fast disease develops. Prevalence is the total proportion with the disease at a given time, (existing cases ÷ total population) × 100 — it measures how much disease exists [2]. Why effective treatment raises prevalence despite falling incidence: prevalence ≈ incidence × duration. Effective treatment extends survival, so patients remain in the existing-cases pool for longer; even if incidence falls, the pool grows [1]. Example: HIV in high-income countries — antiretroviral therapy extended life, so prevalence rose through the 2000s while incidence (new infections) fell. The same pattern occurs for Type 2 diabetes (better treatment → longer survival → rising prevalence despite stable incidence) [1].

SA2 (5 marks): Cohort: recruit a large sample (50,000+) of adults aged 35–65 without T2D, willing to be followed 15–20 years [1]. Exposure: measure physical activity at baseline and every ~2 years (questionnaires or accelerometers) — type, duration, intensity, frequency; classify into activity categories [1]. Outcome: development of T2D (fasting glucose ≥7.0 mmol/L, HbA1c ≥48 mmol/mol, or diagnosis), measured at each follow-up [1]. Evidence of association: compare annual T2D incidence in high- vs low-activity groups; calculate relative risk (<1.0 supports protection); test dose-response [1]. Confounding variable: diet (healthier eaters exercise more AND have lower T2D risk). Control: collect dietary data and statistically adjust, or restrict analysis to similar dietary patterns [1].

SA3 (5 marks): RCTs are the gold standard because randomisation distributes known and unknown confounders equally by chance, and blinding reduces bias, which gives the strongest basis for inferring causation [1]. But RCTs cannot ethically be used for harmful exposures: you cannot assign people to smoke, inhale asbestos, or receive high UV exposure for decades; an ethics board would never approve it. Requiring RCT evidence would mean we could never establish causation for environmental carcinogens [2]. Observational evidence can establish causation via the Bradford Hill criteria: strength, consistency, temporality, dose-response, biological plausibility, specificity. The smoking–lung cancer link was established through observational studies (case-control studies and cohort studies such as Doll and Hill's British Doctors Study) plus mechanistic evidence, with no RCT [1]. Conclusion: the claim is partly valid (RCTs are ideal when ethical, e.g. for drugs, vaccines, and other interventions) but inappropriate as a universal standard for harmful exposures; the appropriate standard is convergent evidence from multiple study types satisfying the Bradford Hill criteria [1].


How did your thinking change?

Return to your Think First responses at the start of the lesson.

  • Q1 — total cases vs rate: Total case count is influenced by population size. Rate (cases per 100,000) controls for this — allowing valid comparison between different-sized populations and over time. Rate = cases ÷ population size.
  • Q2 — improved screening making disease appear to increase: Screening detects cases that previously existed but were undiagnosed. When screening uptake increases, the diagnosed (recorded) prevalence rises even if true prevalence is stable — this is ascertainment bias.
  • Write the formulas for incidence rate and prevalence from memory, and state in one sentence why age-standardised rates are more useful than crude rates for comparing populations.