Biology • Year 12 • Module 5 • Lesson 17

DNA Sequencing and DNA Profiling

Apply sequencing-versus-profiling reasoning to real data: an allele-frequency table, a comparison framework, a gel-electrophoresis profile of a family, and an Australian forensic case study.

Apply · Data & Reasoning

1. Interpret an allele-frequency table from two populations

Researchers used DNA sequencing to determine the frequency of two single-nucleotide variants — a disease-linked variant (D+) and a neutral marker variant (M+) — at the same loci in two human populations sampled in 2024. 7 marks

Variant                 | Population P (n = 480) | Population Q (n = 510) | Notes
D+ (disease-linked SNP) | 0.18                   | 0.04                   | Known to raise disease risk
D+ heterozygotes        | 29%                    | 7%                     | One copy of D+
D+ homozygotes          | 3.2%                   | 0.2%                   | Two copies of D+
M+ (neutral SNP)        | 0.41                   | 0.39                   | Selectively neutral marker

Frequencies are allele frequencies (0–1) except where shown as % of individuals.

1.1 Identify which technology — sequencing or profiling — was needed to generate the data above, and justify your answer in one sentence. 2 marks

1.2 Compare the two populations on the D+ variant: which population is at higher inherited disease risk, and by approximately what factor in allele frequency? 2 marks

1.3 The M+ allele frequencies in P and Q are almost identical. Explain why this does not mean the two populations have the same overall genetic structure. 3 marks

Stuck? Card 1 (what sequencing can identify directly) and Card 4 (population inheritance patterns).

2. Compare and contrast — sequencing vs profiling

Fill in the table below in your own words, using lesson terminology. 8 marks (1 per cell)

Feature                                                        | DNA sequencing | DNA profiling
2.1 What it determines / compares                              |                |
2.2 Output (what you see at the end)                           |                |
2.3 Can it directly identify a specific inherited mutation?    |                |
2.4 Most useful Module 5 application                           |                |

Stuck? Cards 1–3 give the contrast directly. Card 4 lists population-level uses.

3. Read a DNA profile — gel-electrophoresis bands

The figure below shows DNA profile band patterns produced by gel electrophoresis at a single highly variable marker locus, for a mother (M), a father (F), and three offspring (C1, C2, C3). Bands closer to the bottom travelled further and represent shorter DNA fragments. 8 marks

Stylised gel showing band positions only. In a real profile, multiple loci are run side-by-side to build statistical strength.

3.1 Each child inherits one allele (one band) from each parent. List the parental origin of each band for C1, C2 and C3. 3 marks

3.2 One of the three children's bands is not consistent with being the biological child of M and F. Identify which child and explain how you can tell. 3 marks

3.3 A real DNA profile compares many marker loci, not just one. Briefly explain why one locus alone is rarely enough to conclude relatedness. 2 marks

Stuck? Card 2 (profiling compares patterns at selected regions) and Card 4 (relatedness inference is statistical, not absolute).

4. Read a sequencing output — compare two sequences

The two short sequences below are aligned outputs from DNA sequencing of the same 24-base region of a gene in two individuals (X and Y). The reference (healthy) base order is also given. The disease-linked variant at this locus is a single base change A → G at position 11. 6 marks

Position:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Reference:  T  G  A  C  C  T  A  G  T  C  A  T  G  C  A  T  C  G  A  T  T  A  C  G
Sample X:   T  G  A  C  C  T  A  G  T  C  A  T  G  C  A  T  C  G  A  T  T  A  C  G
Sample Y:   T  G  A  C  C  T  A  G  T  C  G  T  G  C  A  T  C  G  A  T  T  A  C  G

4.1 Identify whether Sample X, Sample Y, both, or neither carries the disease-linked variant, and justify using the data. 2 marks

4.2 Explain why this conclusion could be drawn from sequencing data but could not generally be drawn from a profile of band patterns alone. 2 marks

4.3 If 1,000 individuals from a population were sequenced and 40 carried the A → G variant, calculate the variant's allele frequency (each individual contributes two alleles). State any assumption you make about how many copies each carrier has. 2 marks

Stuck? Card 1 (sequencing identifies specific variants directly) and the worked reading in Card 5.

5. Case study — DNA evidence in an Australian investigation

Read the short factual paragraph below and then answer the question. 6 marks

Stimulus. In July 2001 British tourist Peter Falconio disappeared on the Stuart Highway in the Northern Territory. Investigators recovered DNA from a small bloodstain on his partner's T-shirt, from the couple's vehicle, and from the cable ties used to restrain her. NT Police compared a DNA profile generated from these samples with a DNA profile taken from a suspect, Bradley John Murdoch, by examining patterns at multiple short tandem repeat (STR) marker regions. The two profiles matched at every tested marker. The court used this profile match, combined with non-DNA evidence, to convict Murdoch in 2005. The case is widely cited as an example of DNA profiling rather than whole-genome sequencing.

5.1 Explain why profiling, rather than full sequencing, was the appropriate technology for this investigation. Refer to what each technology produces and to what the investigators needed to show. 3 marks

5.2 A media report claimed "the DNA evidence proved Murdoch's full genetic code matched the crime-scene sample". Identify the scientific flaw in that claim, and rewrite it accurately. 3 marks

Stuck? Card 2 (profiling = patterns at selected regions, not full base order) and the lesson's "Trap" callout.

Answers: do not peek before attempting

Q1.1 — Technology used (2 marks)

Sequencing [1]. Identifying a specific single-nucleotide variant (D+ or M+) requires the exact base order to be read; profiling alone compares pattern similarity at selected markers and would not give the actual base change [1].

Q1.2 — Risk comparison (2 marks)

Population P is at higher inherited disease risk because the D+ allele frequency is 0.18 in P versus 0.04 in Q [1]. P's D+ frequency is approximately 4.5 times Q's (0.18 / 0.04 = 4.5) [1].
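
The figures in the table can be cross-checked numerically. The Python sketch below recomputes each population's D+ allele frequency from the genotype percentages (homozygote fraction plus half the heterozygote fraction) and then takes the ratio of the reported allele frequencies. The function and variable names are illustrative and are not part of the lesson.

```python
def allele_frequency(het_fraction, hom_fraction):
    """Allele frequency from genotype fractions: each homozygote carries
    two copies and each heterozygote one, out of two alleles per person."""
    return hom_fraction + het_fraction / 2

# D+ genotype fractions taken from the table (percent of individuals / 100).
p_freq = allele_frequency(het_fraction=0.29, hom_fraction=0.032)
q_freq = allele_frequency(het_fraction=0.07, hom_fraction=0.002)

# Ratio of the rounded allele frequencies reported in the table.
ratio = 0.18 / 0.04

print(round(p_freq, 3), round(q_freq, 3), round(ratio, 1))
```

The recomputed values (about 0.177 and 0.037) round to the 0.18 and 0.04 reported in the table, so the genotype rows and the allele-frequency row agree.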

Q1.3 — Why similar M+ does not equal same structure (3 marks)

M+ is a single neutral marker, and one marker captures only a tiny slice of total genetic variation [1]. P and Q differ substantially at D+ even though they look similar at M+, which shows that two populations can share frequencies at one locus while diverging at others [1]. Population structure depends on the distribution of many alleles across many loci, not on agreement at one marker [1].

Q2 — Compare-and-contrast table (8 marks)

2.1 Sequencing — determines the exact order of nucleotide bases in a region; Profiling — compares patterns at selected DNA marker regions between samples.
2.2 Sequencing — an ordered string of bases (A, T, C, G) for the region; Profiling — a pattern of bands or peaks at chosen marker loci.
2.3 Sequencing — yes, because it reveals the actual base change; Profiling — generally no, it does not directly read the base change, only the marker pattern.
2.4 Sequencing — identifying inherited disease variants / SNPs in a population; Profiling — comparing samples for relatedness, lineage or inheritance link in a population.

Marking: 1 mark per cell (8 cells), accept synonyms.

Q3.1 — Parental origin of each child's bands (3 marks)

C1 — band at position 1 from Father; band at position 4 from Mother [1].
C2 — band at position 2 from Mother; band at position 3 from Father [1].
C3 — band at position 1 from Father; band at position 2 from Mother [1].

Q3.2 — Inconsistent child (3 marks)

No child can be excluded: all three children's profiles are consistent with M and F at this single locus [1]. Each child's two bands can be accounted for as one allele shared with M and one allele shared with F [1]. A single-locus result like this fails to exclude any child, but it does not confirm parentage either, so the data neither confirm nor exclude any child as the biological offspring of M and F [1].

Marking note: the question is intentionally a trap — students who pattern-match a name without checking both parents lose marks. Accept full marks for any answer that correctly identifies that no child can be excluded from this single locus, with reasoning. Penalise blanket "C3 is excluded" answers.
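
The exclusion check can be sketched in Python. Because the gel figure is not reproduced here, the parental band positions below are an assumption inferred from the Q3.1 answer key (mother at positions 2 and 4, father at positions 1 and 3). The helper name is hypothetical, and the sketch assumes each child shows exactly two bands.

```python
def consistent_with_parents(child, mother, father):
    """True if the child's two bands can be split so that one comes
    from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (a in father and b in mother)

# Band positions inferred from the Q3.1 answer key (an assumption:
# the original gel figure is not available in this text).
mother = {2, 4}
father = {1, 3}
children = {"C1": (1, 4), "C2": (2, 3), "C3": (1, 2)}

# Collect any child whose bands cannot be explained by one band per parent.
excluded = [name for name, bands in children.items()
            if not consistent_with_parents(bands, mother, father)]
print(excluded)  # [] : no child is excluded at this locus
```

A child with, say, bands at positions 1 and 3 would be flagged, because both bands could only have come from the father.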

Q3.3 — Why one locus is rarely enough (2 marks)

At a single locus, many unrelated people in a population can share the same allele combinations purely by chance, so a single-locus match has low statistical power [1]. Comparing many loci multiplies the probabilities together, raising confidence that a match (or exclusion) is real rather than coincidental [1].
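
The effect of multiplying across loci can be illustrated with a short Python sketch. The per-locus match probabilities are hypothetical values chosen for illustration, not real population data, and the product rule used here assumes the loci are inherited independently.

```python
from math import prod

# Hypothetical chance that two unrelated people share a genotype at each
# of five marker loci (illustrative values only, not real population data).
per_locus_match = [0.10, 0.08, 0.12, 0.05, 0.09]

single_locus = per_locus_match[0]
combined = prod(per_locus_match)  # product rule assumes independent loci

print(f"one locus:  1 in {1 / single_locus:,.0f}")
print(f"five loci:  1 in {1 / combined:,.0f}")
```

With these assumed values, a coincidental match at one locus is common, while a coincidental match at all five is several orders of magnitude rarer.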

Q4.1 — Who carries the disease variant (2 marks)

Only Sample Y carries the disease-linked variant [1]. At position 11, the reference and Sample X read A, while Sample Y reads G — matching the disease-linked A → G change [1].
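
The same comparison can be done programmatically. This Python sketch uses the three sequences from the question and lists every position where a sample differs from the reference. The function name is illustrative.

```python
reference = "TGACCTAGTCATGCATCGATTACG"
samples = {
    "X": "TGACCTAGTCATGCATCGATTACG",
    "Y": "TGACCTAGTCGTGCATCGATTACG",
}

def differences(ref, seq):
    """Return (position, reference_base, sample_base) for each mismatch,
    using 1-based positions as in the question."""
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(ref, seq), start=1)
            if r != s]

for name, seq in samples.items():
    print(name, differences(reference, seq))
```

Sample X gives an empty list, and Sample Y gives a single difference at position 11 (A in the reference, G in the sample).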

Q4.2 — Why sequencing was needed (2 marks)

Identifying the specific A → G base change requires knowing the actual nucleotide at position 11, which is what sequencing produces directly [1]. A band-pattern profile shows fragment-size similarity at chosen markers and does not, on its own, reveal the actual base at any position — so the disease variant call could not be made from profiling alone [1].

Q4.3 — Allele frequency calculation (2 marks)

Each of the 1,000 individuals contributes 2 alleles, giving 2,000 alleles in total [1]. Assuming the 40 carriers are all heterozygous (one copy each), the variant appears 40 times, so the allele frequency is 40 / 2,000 = 0.02 (2%) [1]. If a student instead assumes all 40 carriers are homozygous (80 / 2,000 = 0.04), award full marks provided the assumption is stated.
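
As a quick check of the arithmetic, the Python sketch below computes the allele frequency under both assumptions. The function name and its parameters are illustrative.

```python
def variant_allele_frequency(carriers, copies_per_carrier, sample_size):
    """Variant copies divided by total alleles (two per diploid individual)."""
    total_alleles = 2 * sample_size
    return carriers * copies_per_carrier / total_alleles

# 40 carriers among 1,000 sequenced individuals, under each stated assumption.
if_heterozygous = variant_allele_frequency(40, 1, 1000)
if_homozygous = variant_allele_frequency(40, 2, 1000)
print(if_heterozygous, if_homozygous)  # 0.02 0.04
```

The true value lies between these two bounds if the 40 carriers are a mix of heterozygotes and homozygotes.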

Q5.1 — Why profiling was appropriate (3 marks)

The investigators needed to test whether the suspect's DNA matched the crime-scene sample — a comparison question, not a question about the suspect's exact base order [1]. DNA profiling compares patterns at selected marker regions (STRs) between samples and is the standard technology for this kind of identity comparison [1]. Full sequencing of the whole genome would have been slower, more expensive, and unnecessary for the question the court had to answer [1].

Q5.2 — Media-claim correction (3 marks)

The flaw: profiling does not read the full genetic code; it compares patterns at a small set of marker regions [1]. The media statement confuses profiling with whole-genome sequencing [1]. An accurate rewrite: "The DNA profile generated from Murdoch's sample matched the crime-scene sample at every tested STR marker; only those marker regions were compared, not the entire genome." [1]