Biology · Year 12 · Module 5 · Lesson 18
HSC Exam Practice
Large-Scale Population Genetics Data — Disease, Conservation, Human Evolution
Short answer
1. Short answer
Define the term "large-scale collaborative population-genetics project" and identify one named example.
Distinguish between a population bottleneck and a founder effect.
Outline two reasons why measuring genetic diversity matters for a threatened animal population.
Explain why a large-scale data set can identify population trends in inherited disorders but still cannot guarantee the outcome for any one individual.
Outline how shared and divergent genetic-marker patterns across human populations can be used to infer evolutionary relationships.
Data response
2. Data response: pairwise FST and the Out-of-Africa model
The figure below summarises pairwise genetic differentiation (FST) between the Yoruba population (Nigeria) and five other human populations, plotted against geographic distance from Addis Ababa. Each point is a population mean from genome-wide microsatellite or SNP data.
(a) Describe the trend shown by the data and quote one supporting value.
(b) Account for the trend using the Out-of-Africa model and the concept of a serial founder effect.
(c) Identify one limit of inference that still applies even to a data set this size.
Extended response
3. Extended response
Evaluate the claim that large-scale collaborative population-genetics data sets have transformed our understanding of conservation, disease inheritance and human evolution. In your response, refer to at least one named example from each of the three application contexts and to the limits of inference that still apply.
Answer Key & Marking Guidelines
Section 1 · Short answer · 2 marks · Band 3
Sample response. A large-scale collaborative population-genetics project pools genetic data from many laboratories, sites and populations so that allele-frequency patterns and rare variants can be detected with statistical power not available to a single small study. Named example: the 1000 Genomes Project, gnomAD, the UK Biobank, the Save the Tasmanian Devil Program, or the National Centre for Indigenous Genomics (any one acceptable).
Marking notes. 1 mark for the definition (pooled data, many sources, larger statistical power). 1 mark for one specific named project.
Section 1 · Short answer · 3 marks · Band 3
Sample response. A bottleneck is a sharp reduction in an existing population (e.g. by disease, hunting, habitat collapse) that removes most individuals and, with them, much of the population's allelic variation, especially rare alleles. A founder effect occurs when a small subset of individuals colonises a new area and starts a new population; the new population carries only the alleles present in that founding subset. Both reduce genetic diversity, but a bottleneck shrinks an existing population while a founder effect creates a new population from few founders.
Marking notes. 1 mark for bottleneck definition (sharp reduction in existing population, allele loss). 1 mark for founder-effect definition (new population from a small founding subset). 1 mark for the distinction (same outcome — reduced diversity — but different demographic mechanism).
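Worked illustration. The allele loss common to both mechanisms can be put into numbers. The sketch below assumes that the survivors of a bottleneck (or the founders of a new population) are a random sample of the source population and that each diploid individual carries two independent gene copies. The function name and the example frequency of 0.01 are hypothetical, chosen only for illustration.

```python
def prob_allele_absent(freq: float, n_individuals: int) -> float:
    """Probability that an allele at frequency `freq` is carried by none of
    `n_individuals` diploid survivors or founders (2N gene copies in total),
    assuming they are a random sample of the source population."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must not be negative")
    # Each of the 2N gene copies independently fails to be the allele
    # with probability (1 - freq).
    return (1.0 - freq) ** (2 * n_individuals)


if __name__ == "__main__":
    # Hypothetical rare allele at 1% frequency in the source population.
    for n in (10, 100, 1000):
        print(n, prob_allele_absent(0.01, n))
```

Under these assumptions, a 1% allele has roughly an 82% chance of being missing from a group of 10 individuals, which is why both bottlenecks and founder events strip out rare alleles first.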
Section 1 · Short answer · 2 marks · Band 3
Sample response. First, higher genetic diversity provides more raw material for natural selection, so the population has a better chance of including individuals with alleles suited to environmental change or new disease. Second, low diversity is correlated with reduced fitness through inbreeding depression and homozygosity for deleterious alleles, which lowers fertility and survival.
Marking notes. 1 mark for linking diversity to adaptive capacity / response to change. 1 mark for linking low diversity to reduced fitness / inbreeding depression.
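Worked illustration. One common way to express "genetic diversity" at a single locus is expected heterozygosity, one minus the sum of the squared allele frequencies. The sketch below is a minimal version of that calculation; the function name and the allele frequencies are hypothetical and are not taken from any named study.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i ** 2).

    `allele_freqs` is a list of allele frequencies that must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


if __name__ == "__main__":
    # Hypothetical locus before and after a loss of diversity.
    print(expected_heterozygosity([0.25, 0.25, 0.25, 0.25]))  # four equally common alleles
    print(expected_heterozygosity([0.9, 0.1]))                # two alleles, one now rare
    print(expected_heterozygosity([1.0]))                     # one allele fixed
```

A lower value means a randomly chosen individual is more likely to be homozygous at that locus, which is the link to inbreeding depression made in the sample response.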
Section 1 · Short answer · 3 marks · Band 4
Sample response. Large data sets sample many individuals, so allele-frequency estimates are precise and rare disease-linked variants become detectable, allowing comparison of carrier frequencies and risk patterns across groups. However, an individual's phenotype is influenced by other genes, environmental factors and chance, so a population trend cannot translate into certainty for one person. In addition, the sample itself carries assumptions — ancestry composition, ascertainment bias — that affect how confidently a trend applies to any single patient.
Marking notes. 1 mark for why size strengthens trend identification (precision / rare-variant detection). 1 mark for why phenotype is not determined by genotype alone (other genes, environment, chance). 1 mark for acknowledging sampling assumptions or method limits that prevent individual certainty.
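Worked illustration. The gap between a population trend and an individual outcome can be shown with a simple calculation. The sketch below assumes a single autosomal recessive disorder and a population in Hardy-Weinberg equilibrium; the allele frequency of 0.02 and the function names are hypothetical, not values from any data set.

```python
from math import comb


def genotype_frequencies(q: float) -> dict:
    """Hardy-Weinberg genotype frequencies for a recessive disease allele
    at frequency q (assumes random mating and no selection)."""
    p = 1.0 - q
    return {"non_carrier": p * p, "carrier": 2 * p * q, "affected": q * q}


def prob_affected_children(n_children: int, n_affected: int) -> float:
    """Probability that exactly `n_affected` of `n_children` born to two
    carriers are affected, with a 1-in-4 chance for each child."""
    return comb(n_children, n_affected) * 0.25 ** n_affected * 0.75 ** (n_children - n_affected)


if __name__ == "__main__":
    # Population level: a large data set can estimate these frequencies precisely.
    print(genotype_frequencies(0.02))
    # Individual level: the outcome for one carrier couple is still down to chance.
    for k in range(3):
        print(k, prob_affected_children(2, k))
```

Even with q known exactly, a carrier couple with two children could have zero, one or two affected children; the data set fixes the probabilities, not the outcome.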
Section 1 · Short answer · 2 marks · Band 3
Sample response. Populations that share more genetic markers are likely to share more recent common ancestry, because the markers were inherited from a shared ancestral population before divergence. Populations sharing fewer markers are likely to have been separated for longer, so allele-frequency differences have had more time to accumulate through drift, selection and possibly admixture with other groups.
Marking notes. 1 mark for "more shared markers → more recent shared ancestry". 1 mark for "fewer shared markers → more divergence / accumulated allele-frequency differences".
Section 2 · Data response · 7 marks · Band 4–5
Sample response (a). FST with the Yoruba population increases approximately linearly with geographic distance from Addis Ababa. The Bedouin (~5,000 km) sit near FST ≈ 0.04, while the Karitiana of Amazonia (~24,000 km) reach FST ≈ 0.19, close to a five-fold increase (0.19 ÷ 0.04 ≈ 4.8) across the geographic range. (Accept any correctly quoted value.)
Sample response (b). Under the Out-of-Africa model, modern humans expanded from East Africa beginning approximately 60–70 thousand years ago. Each migration wave founded a new population from a small subset of the previous one, losing rare alleles and shifting allele frequencies through drift — a serial founder effect. The further a population is from the East-African origin, the more founder steps separate it from the source, so allele-frequency differentiation (FST) accumulates with geographic distance, producing the linear pattern shown.
Sample response (c). FST patterns are also influenced by local selection, post-migration admixture and uneven population sampling, so geographic distance is a proxy rather than the only driver. The inference is strong and evidence-based but remains provisional: additional data such as ancient DNA, haplotype-length distributions or archaeological dates can revise it.
Marking notes. Part (a): 1 mark for stating the positive, near-linear trend; 1 mark for quoting at least one specific FST value from the figure. Part (b): 1 mark for invoking the Out-of-Africa expansion from East Africa; 1 mark for explaining the serial founder effect as repeated founding events, each drawing on a small subset of the previous population; 1 mark for explicitly linking accumulated founder steps to increasing FST with distance. Part (c): 1 mark for identifying a real confound (selection, admixture, sampling, methodological assumption); 1 mark for framing the result as an inference open to revision.
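Worked illustration. FST compares the heterozygosity expected within subpopulations (H_S) with that expected in the pooled total (H_T): FST = (H_T - H_S) / H_T. The sketch below is a simplified single-locus, two-allele, two-population version with equal population sizes; genome-wide studies use many loci and more careful estimators, so this shows the idea rather than the method behind the figure. The function name and the allele frequencies are hypothetical. The only figures taken from this answer key are the two values quoted in sample response (a).

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """FST at one biallelic locus for two equally sized populations,
    where p1 and p2 are the frequencies of the same allele in each."""
    # Mean expected heterozygosity within the two populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if both populations were pooled.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # same allele fixed in both populations: no differentiation
    return (h_t - h_s) / h_t


# Values quoted in sample response (a).
FST_BEDOUIN, KM_BEDOUIN = 0.04, 5_000
FST_KARITIANA, KM_KARITIANA = 0.19, 24_000

fold_increase = FST_KARITIANA / FST_BEDOUIN
# Slope between the two quoted points only, not a regression over all populations.
fst_per_km = (FST_KARITIANA - FST_BEDOUIN) / (KM_KARITIANA - KM_BEDOUIN)

if __name__ == "__main__":
    print(fst_two_populations(0.5, 0.5))  # identical frequencies
    print(fst_two_populations(0.6, 0.4))  # modest difference
    print(fst_two_populations(1.0, 0.0))  # different alleles fixed
    print(fold_increase, fst_per_km)
```

Identical allele frequencies give FST = 0 and complete fixation of different alleles gives FST = 1, so the quoted values of 0.04 to 0.19 indicate modest but steadily increasing differentiation.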
Section 3 · Extended response · 7 marks · Band 5–6
Sample response. Large-scale collaborative population-genetics data sets have transformed all three application contexts identified by the lesson, but they have not removed the underlying uncertainty of biological inference. In conservation, the Save the Tasmanian Devil Program has used genome-wide SNP genotyping of thousands of Sarcophilus harrisii to demonstrate that observed heterozygosity has fallen ~30% in north-eastern populations affected by Devil Facial Tumour Disease (DFTD) since 1996 (Hohenlohe et al. 2019), and to identify candidate resistance loci showing rapid allele-frequency change in survivors (Epstein et al. 2016). This data directly informs the design of the insurance population and supports the inference that the species is recoverable but vulnerable. In disease inheritance, the gnomAD consortium has aggregated >125,000 exomes (Karczewski et al. 2020), showing that pathogenic CFTR allele frequencies vary by an order of magnitude across ancestry groups and supporting population-specific carrier-screening policy. In human evolution, Ramachandran et al. (2005) demonstrated that pairwise FST rises approximately linearly with geographic distance from East Africa, consistent with a serial founder effect during the Out-of-Africa expansion, and Malaspinas et al. (2016) used whole-genome data from Aboriginal Australian groups to support continuous occupation of Sahul for >50,000 years. However, three classes of limit remain. First, sampling bias: gnomAD is dominated by European-ancestry samples (Sirugo et al. 2019), so variant interpretation is uneven across populations. Second, methodological assumptions about linkage, neutrality and population structure can shift conclusions when revised. Third, even perfectly clean data yields population-level statistics: it cannot predict whether any one carrier couple will have an affected child, whether any individual devil will survive DFTD, or how human migration narratives will read after the next wave of ancient-DNA discoveries. The claim is therefore largely defensible but requires the qualifier the lesson emphasises: large-scale collaborative data strengthens inference across conservation, disease inheritance and human evolution, but the conclusions remain inferences (strong and evidence-based, yet open to revision), not certainties.
Marking notes. 1 mark — defines or implicitly applies large-scale collaborative data (pooled, many sources, increased statistical power). 1 mark — conservation example with mechanism (e.g. Tasmanian devil bottleneck / resistance loci). 1 mark — disease-inheritance example with mechanism (e.g. gnomAD CFTR ancestry-specific carrier frequencies). 1 mark — human-evolution example with mechanism (e.g. FST–distance relationship, Aboriginal Australian deep ancestry). 1 mark — identifies sampling-bias / under-representation as a real limit. 1 mark — identifies methodological assumptions or individual-level uncertainty as a real limit. 1 mark — reaches an explicit evaluative judgement framing the conclusions as inferences rather than certainties and integrating all three contexts.