Biology • Year 12 • Module 5 • Lesson 16
Frequency Data and SNP Analysis
Lock in the vocabulary of frequency data, the arithmetic of simple trait/allele frequencies, and what a SNP actually is — before you start interpreting larger data sets.
1. Term–definition match
The ten definitions below are shuffled. In the right-hand column write the matching term from this list: frequency data, trend, sample size, bias, SNP, marker, allele frequency, genotype frequency, population, representative sample. 10 marks
| # | Definition (shuffled) | Matching term |
|---|---|---|
| 1.1 | Data showing how common a characteristic or allele is within a sample or a wider group. | |
| 1.2 | The number of individuals measured in a study. | |
| 1.3 | A single-base difference at a specific position in DNA used as a comparison marker. | |
| 1.4 | The proportion of one specific allele out of all alleles at a locus in a population. | |
| 1.5 | The proportion of individuals in a population that carry a particular combination of alleles (e.g. AA, Aa, aa). | |
| 1.6 | A general pattern visible in data — not a claim about every individual. | |
| 1.7 | A systematic problem in data collection that makes the sample unrepresentative. | |
| 1.8 | A DNA feature used to compare individuals, populations or species. | |
| 1.9 | A group of organisms of the same species that can interbreed in a defined area. | |
| 1.10 | A sample whose composition reflects the wider population it was drawn from. | |
2. Cloze — frequency data and SNPs
Fill the blanks using terms from the word bank. Each term is used once. 8 marks
Word bank: nucleotide · marker · sample · trend · bias · representative · allele · genome
A single _____________ (1) polymorphism (SNP) is a one-base difference at a specific DNA position. Because the same position can be checked across many individuals, a SNP acts as a useful _____________ (2) for comparing genetic similarity. Frequency data describes how common a trait or _____________ (3) is in a sample, and any pattern observed is best described as a _____________ (4) rather than a fixed rule. Conclusions are only as good as the _____________ (5) the data came from: a small or biased one will mislead. A _____________ (6) sample reflects the wider population reasonably well, while sampling only one location may introduce _____________ (7). Stronger conclusions about relatedness require comparing many SNPs spread across the _____________ (8), not a single position.
3. True or false — with correction
For each statement, circle T or F. If the statement is false, write the corrected version. 8 marks (1 for T/F, 1 for the correction where needed)
3.1 A SNP is a single-base difference at the same DNA position in comparable sequences. T / F
3.2 If one SNP differs between two populations, this proves the populations are completely unrelated species. T / F
3.3 A larger sample size always removes all bias automatically. T / F
3.4 A frequency of 60% in a sample of 100 individuals means every population will show that same 60% value. T / F
4. Calculate the frequency
Use the simple rule frequency = count ÷ total. Express each as a decimal (to 2 d.p.) and a percentage. Show your working in the table. 8 marks (1 per Decimal cell and 1 per Percentage cell)
| # | Scenario | Working | Decimal | Percentage |
|---|---|---|---|---|
| 4.1 | 20 individuals show trait X in a sample of 80. | | | |
| 4.2 | In a sample of 200 alleles, 130 are the dominant allele A. | | | |
| 4.3 | Out of 250 sampled people, 175 have free earlobes. | | | |
| 4.4 | At one SNP locus, 36 of 120 sampled chromosomes carry the G allele (the rest carry A). | | | |
5. Identify the SNP in each sequence pair
For each aligned DNA pair, circle the SNP position and write the position number (1 = first base) and the two alleles separated by a slash (e.g. position 4 — A/G). Assume only one SNP per pair. 5 marks
| # | Sequence 1 | Sequence 2 | Position & alleles |
|---|---|---|---|
| 5.1 | A T C G A T C C G A | A T C G G T C C G A | |
| 5.2 | C C A T G C T A C G | C C A T G C T A T G | |
| 5.3 | T A G C T A C C G T | T A G C C A C C G T | |
| 5.4 | G T C A A G T C T A | G T C A A G T T T A | |
| 5.5 | A A C G T T G A C C | A A C G T T G A C T | |
6. Function recall
Answer each in 1–2 sentences using precise lesson terms. 8 marks (2 each)
6.1 What is the function of using frequency data rather than reporting individual cases?
6.2 What is the function of reporting a sample size alongside a frequency?
6.3 What is the function of a SNP as a genetic marker in comparison studies?
6.4 What is the function of comparing many SNPs rather than relying on one?
7. Build a concept map
Draw labelled arrows between the five terms below to show how they connect. Each arrow must carry a linking phrase (e.g. "measures", "supports", "limits"). Aim for at least 5 labelled arrows. 5 marks
Supplied terms: sample · frequency data · trend · SNP marker · conclusion.
Q1 — Term–definition matches
1.1 frequency data • 1.2 sample size • 1.3 SNP • 1.4 allele frequency • 1.5 genotype frequency • 1.6 trend • 1.7 bias • 1.8 marker • 1.9 population • 1.10 representative sample.
Q2 — Cloze paragraph
(1) nucleotide • (2) marker • (3) allele • (4) trend • (5) sample • (6) representative • (7) bias • (8) genome.
Q3 — True / false with correction
3.1 True.
3.2 False. Correction: one SNP is only one position. A difference at a single SNP does not prove the populations are unrelated species — many populations within the same species differ at individual SNPs. Stronger conclusions need many markers.
3.3 False. Correction: larger samples reduce the influence of random error but do not automatically remove bias. A large sample collected only from one biased location is still biased.
3.4 False. Correction: 60% is the observed frequency in that sample. Other populations may have different values, and the next individual sampled does not have to show the trait.
Q4 — Frequency calculations
4.1 20 ÷ 80 = 0.25 = 25%.
4.2 130 ÷ 200 = 0.65 = 65% (frequency of A).
4.3 175 ÷ 250 = 0.70 = 70%.
4.4 36 ÷ 120 = 0.30 = 30% (frequency of G; A allele therefore has frequency 0.70).
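The rule frequency = count ÷ total behind answers 4.1 to 4.4 can be checked with a short Python sketch. The function name `frequency` and the `scenarios` dictionary are illustrative choices, not part of the lesson; the counts and totals are taken directly from the question table.

```python
def frequency(count, total):
    """Return (decimal to 2 d.p., percentage) for `count` out of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    decimal = round(count / total, 2)
    percent = round(100 * count / total, 1)
    return decimal, percent


# (count, total) pairs copied from scenarios 4.1 to 4.4
scenarios = {
    "4.1": (20, 80),
    "4.2": (130, 200),
    "4.3": (175, 250),
    "4.4": (36, 120),
}

for label, (count, total) in scenarios.items():
    decimal, percent = frequency(count, total)
    print(f"{label}: {count} / {total} = {decimal:.2f} = {percent:.0f}%")
```

For 4.4, the frequency of the other allele (A) follows as 1 minus the frequency of G, because only two alleles are present at that locus.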
Q5 — Identify the SNP
Read position-by-position (1 = first base):
- 5.1 — position 5 — A/G.
- 5.2 — position 9 — C/T.
- 5.3 — position 5 — T/C.
- 5.4 — position 8 — C/T.
- 5.5 — position 10 — C/T.
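The position-by-position reading above can be sketched in Python. This is a minimal illustration that assumes, as the question states, that the two sequences are already aligned, equal in length, and differ at exactly one position; the function name `find_snp` is hypothetical.

```python
def find_snp(seq1, seq2):
    """Return (position, allele1, allele2) for the single mismatch (1 = first base)."""
    a = seq1.replace(" ", "")
    b = seq2.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and equal in length")
    # Collect every position where the two bases differ, counting from 1.
    mismatches = [
        (i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y
    ]
    if len(mismatches) != 1:
        raise ValueError(f"expected exactly one SNP, found {len(mismatches)}")
    return mismatches[0]


# Sequence pairs copied from questions 5.1 to 5.5
pairs = {
    "5.1": ("A T C G A T C C G A", "A T C G G T C C G A"),
    "5.2": ("C C A T G C T A C G", "C C A T G C T A T G"),
    "5.3": ("T A G C T A C C G T", "T A G C C A C C G T"),
    "5.4": ("G T C A A G T C T A", "G T C A A G T T T A"),
    "5.5": ("A A C G T T G A C C", "A A C G T T G A C T"),
}

for label, (s1, s2) in pairs.items():
    position, allele1, allele2 = find_snp(s1, s2)
    print(f"{label}: position {position}, {allele1}/{allele2}")
```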
Q6.1 — Function of frequency data
Frequency data shifts the question from single individuals to patterns across groups. It lets us compare how common a trait or allele is between populations and identify trends rather than relying on anecdotes from a few individuals.
Q6.2 — Function of reporting sample size
Reporting sample size lets the reader judge how much confidence to put in the frequency. A frequency of 70% from 200 individuals carries more weight than the same frequency from 10 individuals because a small sample is more easily skewed by chance and is less likely to reflect the wider population.
Q6.3 — Function of a SNP as a marker
A SNP is a fixed, comparable position in the genome. Because the same position can be checked across many individuals, it provides a consistent reference point for comparing similarity and difference within and between populations or species.
Q6.4 — Function of comparing many SNPs
One SNP samples one tiny part of the genome and can differ by chance. Comparing many SNPs averages out single-locus chance variation and gives a much better estimate of overall genomic similarity or difference between groups.
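A rough Python sketch of why one position can mislead, using the sequence pair from question 5.1. It assumes, purely for illustration, that each of the ten aligned positions is treated as a compared site; real studies compare many SNPs spread across the genome. The function name `match_fraction` is hypothetical.

```python
def match_fraction(seq1, seq2, positions=None):
    """Fraction of the chosen positions (1 = first base) where the sequences agree."""
    a = seq1.replace(" ", "")
    b = seq2.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and equal in length")
    if positions is None:
        positions = range(1, len(a) + 1)  # default: compare every position
    positions = list(positions)
    if not positions:
        raise ValueError("at least one position is required")
    matches = sum(1 for p in positions if a[p - 1] == b[p - 1])
    return matches / len(positions)


seq1 = "A T C G A T C C G A"  # question 5.1, sequence 1
seq2 = "A T C G G T C C G A"  # question 5.1, sequence 2

single = match_fraction(seq1, seq2, positions=[5])  # only the differing site
overall = match_fraction(seq1, seq2)                # all ten sites

print(f"position 5 only: {single:.2f}")
print(f"all ten positions: {overall:.2f}")
```

Judged on position 5 alone the two sequences look completely different, while across all ten positions they agree at nine. The estimate of similarity depends heavily on how many positions are compared.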
Q7 — Sample concept map
A correct map should include arrows such as:
- sample — produces → frequency data
- frequency data — reveals → trend
- trend — supports → conclusion
- SNP marker — contributes data to → frequency data
- sample size / bias — limits the strength of → conclusion
- Optional: single SNP marker — cannot alone justify → conclusion (a relatedness claim).
Any biologically valid linking phrases are accepted. Award full marks for at least 5 correctly labelled arrows that respect causal direction.