Hardy-Weinberg Calculations for Multiple Alleles at a Single Locus
Newer technologies allow population geneticists to skip obvious phenotypes and go directly to genotype with DNA sequencing techniques. This has allowed far more sensitive detection of multiple (molecular) alleles at a particular gene locus.
Although these genetic differences may not be reflected in an organism's phenotype, they are important for the population geneticist following the microevolutionary events that may eventually lead to speciation.
Example of calculation for a multiple allele system
Let's say we have a gene locus with three distinct molecular alleles, "X", "@" and "&". The same shorthand can be used, although there is no distinct dominance/recessiveness implied.
- The frequency of X = p (which also can be written as "f(X)")
- The frequency of @ = q (which also can be written as "f(@)")
- The frequency of & = r (which also can be written as "f(&)")
To calculate your expected genotype frequencies (if the population is in
Hardy Weinberg equilibrium), use the trinomial equation (since you have
three alleles):
$(p + q + r)^2$
which expands to...
$p^2 + 2pq + q^2 + 2pr + 2qr + r^2 = 1.0$
In this population, the following genotypes can exist with respect to this
locus:
- XX ($p^2$)
- X@ ($2pq$)
- @@ ($q^2$)
- X& ($2pr$)
- @& ($2qr$)
- && ($r^2$)
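The same expansion works for any number of alleles, so it can be sketched as a short Python function. This is only an illustration: the function name `expected_genotype_frequencies` and the allele frequencies passed to it are placeholders chosen for the example, not values from this page.

```python
from itertools import combinations_with_replacement


def expected_genotype_frequencies(allele_freqs):
    """Return {genotype: expected frequency} under Hardy-Weinberg equilibrium.

    allele_freqs maps each allele symbol to its frequency (must sum to 1).
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    expected = {}
    # Every unordered pair of alleles is one genotype.
    for a, b in combinations_with_replacement(list(allele_freqs), 2):
        if a == b:
            # Homozygote: frequency squared (p^2, q^2, r^2).
            expected[a + b] = allele_freqs[a] ** 2
        else:
            # Heterozygote: twice the product (2pq, 2pr, 2qr).
            expected[a + b] = 2 * allele_freqs[a] * allele_freqs[b]
    return expected


# Placeholder allele frequencies, chosen only to illustrate the call.
freqs = {"X": 0.5, "@": 0.3, "&": 0.2}
genotypes = expected_genotype_frequencies(freqs)
for genotype, freq in genotypes.items():
    print(genotype, round(freq, 4))
```

Whatever frequencies you pass in, the six genotype frequencies should sum to 1.0, which is a quick check that no term of the trinomial was dropped.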
To calculate the expected genotype frequencies if the population is in HW equilibrium, choose two of the alleles in the homozygous condition (in this case, we'll choose @@ and &&) and determine their frequencies in the population with your DNA sequencing and assay. (Off to the lab!)
You found that of 1000 individuals sequenced, 200 were @@ (0.2, or 20%) and 50 were && (0.05, or 5% of the population).
Take the square root of each homozygote frequency to determine q and r:
$q = \sqrt{q^2} = \sqrt{0.20} \approx 0.45$
$r = \sqrt{r^2} = \sqrt{0.05} \approx 0.22$
And solve for p:
If p + q + r = 1.0, then p = 1.0 - 0.45 - 0.22 = 0.33
q = 0.45
r = 0.22
p = 0.33
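If you'd rather let the computer take the square roots, the step above might look like this in Python. The function name `allele_freqs_from_homozygotes` is made up for this sketch, and rounding to two decimal places is an assumption made only to mirror the hand calculation.

```python
import math


def allele_freqs_from_homozygotes(n_total, n_hom_q, n_hom_r, places=2):
    """Estimate p, q and r from the counts of two homozygous genotypes.

    Assumes the population is in Hardy-Weinberg equilibrium, so the
    frequency of each homozygote equals the allele frequency squared.
    """
    q = round(math.sqrt(n_hom_q / n_total), places)
    r = round(math.sqrt(n_hom_r / n_total), places)
    # The three allele frequencies must add up to 1.0.
    p = round(1.0 - q - r, places)
    return p, q, r


# 1000 individuals sequenced: 200 were @@ and 50 were &&.
p, q, r = allele_freqs_from_homozygotes(1000, 200, 50)
print(p, q, r)  # prints: 0.33 0.45 0.22
```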
Plug into the trinomial to calculate the expected relative genotype frequencies, based on the allele frequencies estimated from the homozygotes:
$p^2 + 2pq + q^2 + 2pr + 2qr + r^2 = 1.0$
$0.33^2 + 2(0.33)(0.45) + 0.45^2 + 2(0.33)(0.22) + 2(0.45)(0.22) + 0.22^2$
XX ($p^2$) = 0.11
X@ ($2pq$) = 0.30
@@ ($q^2$) = 0.20
X& ($2pr$) = 0.15
@& ($2qr$) = 0.20
&& ($r^2$) = 0.05
Which means that of your 1000 individuals (allowing for slight rounding errors):
110 should be XX
300 should be X@
200 should be @@
150 should be X&
200 should be @&
50 should be &&
...if the population is in HW equilibrium with respect to this locus.
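To see how much of this is rounding, here is a small Python sketch that multiplies out each term of the trinomial for the worked example and prints the unrounded expected counts, so you can compare them with the rounded figures listed above. The variable names are just for illustration.

```python
# Allele frequencies from the worked example (already rounded to 2 places).
p, q, r = 0.33, 0.45, 0.22
n = 1000  # individuals sequenced

# One term of the expanded trinomial per genotype.
expected = {
    "XX": p * p,
    "X@": 2 * p * q,
    "@@": q * q,
    "X&": 2 * p * r,
    "@&": 2 * q * r,
    "&&": r * r,
}

# Scale each expected frequency up to a head count.
counts = {genotype: freq * n for genotype, freq in expected.items()}

for genotype, count in counts.items():
    print(f"{genotype}: {count:.1f}")
print(f"total: {sum(counts.values()):.1f}")
```

Because p + q + r = 1.0, the unrounded counts add up to the full 1000. Any small mismatch in the rounded list comes from rounding each term separately.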
Does your population differ significantly from the predicted values?
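One common way to put a number on that question is a chi-square goodness-of-fit test, sketched below in Python. Treat it as a rough illustration under stated assumptions: this page only reports observed counts for @@ (200) and && (50), so the other four observed counts in the code are invented placeholders. The 7.815 cutoff is the standard chi-square critical value for 3 degrees of freedom at the 0.05 level (6 genotype classes, minus 1, minus 2 independently estimated allele frequencies).

```python
# Allele frequencies estimated in the worked example.
p, q, r = 0.33, 0.45, 0.22
n = 1000

# Expected counts under Hardy-Weinberg equilibrium.
expected = {
    "XX": p * p * n,
    "X@": 2 * p * q * n,
    "@@": q * q * n,
    "X&": 2 * p * r * n,
    "@&": 2 * q * r * n,
    "&&": r * r * n,
}

# HYPOTHETICAL observed counts: only @@ = 200 and && = 50 come from the
# example; the other four values are invented for illustration.
observed = {"XX": 120, "X@": 280, "@@": 200, "X&": 150, "@&": 200, "&&": 50}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in expected
)

# Critical value for df = 3 at the 0.05 significance level.
CRITICAL_0_05_DF3 = 7.815

differs = chi_square > CRITICAL_0_05_DF3
print(f"chi-square = {chi_square:.2f}; differs at the 0.05 level: {differs}")
```

Keep in mind that q and r were estimated from the very homozygote counts being tested, so those two classes will match their expected values almost by construction. The test is really telling you about the remaining genotypes.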
Note that your population of 1000 is essentially a sample size of ONE, and
it could be that your observed genotypic ratios are due to
sampling error.
But if your ratios are significantly different from those predicted, the next step is to determine why.