Let's recall our gerbils and their fur color genes...
And now let's add two more genes that also control aspects of coat color:
And so...
(Note that these are "hybrid" crosses because the parents are hybrids for the traits in question.)
Predicting the expected ratios of offspring phenotypes becomes more complicated as the number of different traits is increased.
Let's do a dihybrid cross and consider both hair color and color pattern expected in such a cross (BbPp x BbPp)
In this typical dihybrid cross, you expect to obtain a ratio of
9:3:3:1 in the offspring cohort. 9 agouti: 3 black: 3 agouti piebald: 1 black piebald.
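The 9:3:3:1 ratio can be checked by filling in the 4 x 4 Punnett square by brute force. The sketch below is only an illustration: the function names are invented here, and it assumes complete dominance at both genes, with B = agouti, b = black, P = solid and p = piebald.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """Return every gamete a parent can make, one allele per gene.

    `genotype` is a list of allele pairs, e.g. ["Bb", "Pp"].
    """
    return ["".join(alleles) for alleles in product(*genotype)]

def phenotype(offspring, traits):
    """Name the phenotype, assuming complete dominance at each gene."""
    labels = []
    for (a1, a2), (dominant_allele, dom_label, rec_label) in zip(offspring, traits):
        has_dominant = dominant_allele in (a1, a2)
        labels.append(dom_label if has_dominant else rec_label)
    return " ".join(labels)

# Assumed trait labels: B = agouti, b = black; P = solid, p = piebald.
traits = [("B", "agouti", "black"), ("P", "solid", "piebald")]
parent = ["Bb", "Pp"]

counts = Counter()
for egg, sperm in product(gametes(parent), repeat=2):
    # Pair up the two alleles inherited at each gene.
    offspring = list(zip(egg, sperm))
    counts[phenotype(offspring, traits)] += 1

for name, n in counts.most_common():
    print(f"{n:2d}/16  {name}")
```

Adding a third entry to `traits` and `parent` extends the same enumeration to a trihybrid cross.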
You can also figure out the expected phenotypic ratios for a trihybrid cross (BbPpMm x BbPpMm).
What are the possible gamete types either trihybrid parent can produce?
What are the expected phenotypic ratios? (You fill in the square!)
Once you start considering more than two or three traits at a time, it becomes too complicated to keep track of what you're doing in longhand.
Fortunately, a few simple formulas solve the problem. If n equals the number of traits/genes in question...
Number of F1 gamete types | 2^{n}
Proportion of F2 homozygous recessives | 1/(2^{n})^{2}
Number of different F2 phenotypes (complete dominance) | 2^{n}
Number of different F2 genotypes (or phenotypes, if no dominance) | 3^{n}
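As a rough check on these formulas, here is a minimal sketch that evaluates them for a few values of n and confirms the gamete count for the trihybrid BbPpMm by listing the gametes directly. The function name and dictionary keys are made up for illustration, and the genes are assumed to assort independently.

```python
from itertools import product

def cross_summary(n):
    """Apply the table's formulas for n independently assorting genes."""
    gamete_types = 2 ** n
    return {
        "F1 gamete types": gamete_types,
        "F2 homozygous recessive proportion": f"1/{gamete_types ** 2}",
        "F2 phenotypes (complete dominance)": 2 ** n,
        "F2 genotypes": 3 ** n,
    }

# Check the first formula by brute force for the trihybrid BbPpMm.
trihybrid_gametes = ["".join(g) for g in product("Bb", "Pp", "Mm")]
assert len(trihybrid_gametes) == cross_summary(3)["F1 gamete types"]

for n in (1, 2, 3, 10):
    print(n, cross_summary(n))
```

With ten genes there are already 1,024 gamete types and 59,049 genotypes, which is why nobody draws that Punnett square by hand.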
Using Statistics in Genetics Many testable questions can be posed with respect to inheritance in mono- and dihybrid crosses. For example:
To answer this type of question (and others), geneticists use the scientific method. And major tools used in the scientific method are the common laws of Probability and Statistics.
Data The appropriate sample distribution to which you compare your own experimental data depends on the type of data you collect:
(e.g., # of beetles/m^{2} in a forest habitat; # of smokers in a population of college students)
(e.g., brain size; snout-vent length; stature; blood volume, etc.)
The range (difference between highest and lowest values), standard deviation (measurement of how tightly data points are clustered around the mean) and variance (measure of the spread of a set of sample data) describe dispersion of data points around those middle values.
Probability in Genetics
Probability Calculations allow us to define the range of possible
results, often in the form of a bell-shaped curve representing the likelihood of
each result occurring, for a particular phenomenon. In other words, these calculations allow us to
determine expected results.
Examples: Sum Rule, Product Rule, Binomial Theorem
Statistical Tests define confidence limits, which tell us
whether or not an observed result is significantly different from what is
expected, as determined by probability calculations.
Examples: Chi Square, t-test, ANOVA, etc.
To be able to apply a statistical test with confidence, the investigator must...
1. Design a good experiment with
2. Understand general statistical values such as mean, standard deviation, variance.
3. Understand what is meant by probability
The probability (P) of an event occurring can be expressed as:
P = (number of times the event occurs) / (total number of opportunities for the event to occur)
This simplest type of probability calculation is based on past experience. For example...
If an airport kept records of the number of bird/airplane collisions and had recorded 10 collisions in the past three years (approximately 26,280 hours), then the probability of a bird hitting an airplane (or vice versa) would be calculated as
P = 10/26,280 = 0.00038
This means that there is a 0.038% chance that there will be a bird/airplane collision in any given hour of the airport's operation.
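The arithmetic for the airport example is short enough to check directly. This is just a sketch; the variable names are invented for illustration.

```python
collisions = 10            # recorded events
hours = 3 * 365 * 24       # three years of operation = 26,280 hours

p_per_hour = collisions / hours
print(f"P = {p_per_hour:.5f} per hour ({p_per_hour:.3%})")
```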
In the genetics of monohybrid and dihybrid crosses, the Punnett square (or the equations allowing calculations of expected numbers of genotypes and phenotypes, as shown in the previous lecture) can be used to establish probability. For example...
In a monohybrid cross Bb x Bb, the expected genotypic ratios of offspring are 25% BB, 50% Bb, and 25% bb.
Combined Probabilities
When dealing with more than one possible "event" in a trial, the investigator must use special
care to determine exactly how the two events are related before choosing
the appropriate calculation to determine their relative probabilities.
For example, if...
event #1: has c possible outcomes
event #2: has d possible outcomes
Then the number of different possible combinations of the two events occurring together is equal to c x d (cd).
Let's say...
event #1: "agouti or black fur"
event #2: "solid or piebald pattern fur"
Each event has two possible outcomes, so the total number of possible outcomes = 2 x 2, or 4.
A Punnett Square will give the same result: There are four possible gerbil color/pattern combinations that could result from these combined events:
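The four combinations can also be listed mechanically. A minimal sketch, using the outcome labels from the two events above (it counts the combinations only; it says nothing yet about how probable each one is):

```python
from itertools import product

colors = ["agouti", "black"]      # event #1: c = 2 outcomes
patterns = ["solid", "piebald"]   # event #2: d = 2 outcomes

combinations = list(product(colors, patterns))
for color, pattern in combinations:
    print(color, pattern)

# The count equals c x d.
assert len(combinations) == len(colors) * len(patterns)
```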
This allows us to understand three important rules in statistical testing in Genetics...
The Sum Rule This is used to calculate the probability that either of two events will occur when event #1 precludes event #2 (that is, the events are mutually exclusive). It's an "either/or" situation, such as the roll of a die. Once one face comes up on a roll, no other face can come up on the same roll.
(Or, to use a genetic example, a child must be either male or female--not both.)
Question: What is the probability, upon rolling a die, that the roll will
yield either a 1 or a 6?
Either event has a probability of 1/6. Summing these probabilities gives...
P = (1/6)_{1} + (1/6)_{6} = 2/6, or 1/3.
In prose: One in three rolls will yield either a "1" or a "6".
You can do the same thing for any genetic events
that preclude (i.e., prevent from happening) each other.
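Here is a minimal sketch of the Sum Rule using exact fractions, first for the die and then for a genetic case. The genetic example is an illustration chosen here, not part of the original notes: in a Bb x Bb cross a pup is agouti if it is either BB or Bb, and those two genotypes preclude each other.

```python
from fractions import Fraction

# One fair die: each face is a mutually exclusive outcome with P = 1/6.
p_face = {face: Fraction(1, 6) for face in range(1, 7)}
p_one_or_six = p_face[1] + p_face[6]
print(p_one_or_six)

# Genetic example (Bb x Bb): a pup's genotype is BB, Bb, or bb, never two at once.
p_genotype = {"BB": Fraction(1, 4), "Bb": Fraction(1, 2), "bb": Fraction(1, 4)}
p_agouti = p_genotype["BB"] + p_genotype["Bb"]
print(p_agouti)
```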
The Product Rule This is used to calculate the probability of two events occurring if event #1 and event #2 are independent. It's an "and" situation, such as rolling a die twice, back to back.
Each roll's result is independent of previous rolls and subsequent rolls. (Or, to use a genetic example, two siblings can be male/male, male/female or female/female)
Question: What is the probability that, upon rolling a die twice, the first roll will yield a 1 and the second roll will yield a 6?
In our example, either event has a probability of 1/6. Multiplying their probabilities yields...
P = (1/6)_{1} x (1/6)_{6} = 1/36.
In prose: On average, one in 36 two-roll sequences will be a "1" on the first roll followed by a "6" on the second roll.
You can do the same thing for any genetic events
that are independent.
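The Product Rule can be confirmed by listing all 36 ordered two-roll sequences and counting the one that matches. The sketch below does that, then applies the same rule to a genetic case picked for illustration: the chance of a black piebald pup from BbPp x BbPp, assuming the two genes assort independently.

```python
from fractions import Fraction
from itertools import product

# Enumerate all ordered two-roll sequences of a fair die.
rolls = list(product(range(1, 7), repeat=2))
hits = [r for r in rolls if r == (1, 6)]
p_enumerated = Fraction(len(hits), len(rolls))

# Product rule: multiply the probabilities of the independent events.
p_product = Fraction(1, 6) * Fraction(1, 6)
assert p_enumerated == p_product

# Genetic example (BbPp x BbPp), assuming the two genes assort independently:
# P(black) = 1/4 and P(piebald) = 1/4.
p_black_piebald = Fraction(1, 4) * Fraction(1, 4)
print(p_product, p_black_piebald)
```

The 1/16 for black piebald matches the "1" in the 9:3:3:1 dihybrid ratio.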
The Binomial Theorem In this case, two alternate events have independent probabilities.
Let's say we have two alternate events for a particular trial, X or Y.
The probability of X is p.
The probability of Y is q.
n = the number of trials in which either X or Y can occur.
s = the number of times event X occurs in your trials
t = the number of times event Y occurs in your trials
(Note that, by definition, p + q = 1.0 and s + t
= n)
The probability that X will occur
s times and Y will occur t times in n trials can be calculated by using an
expansion of the binomial equation...
P = [n!/(s!t!)] p^{s} q^{t}
(recall: ! is the symbol for "factorial": 10! = 1x2x3x4x5x6x7x8x9x10)
For example...
Gerbils have two alleles at the fur color locus, B (agouti) and b (black).
What is the probability that a monohybrid cross yielding a litter of four pups will produce three agouti and one black pup?
P = [n!/(s!t!)] p^{s} q^{t}
n = # trials (births) = 4
s = # agouti pups = 3 (p = 3/4 = .75)
t = # black pups = 1 (q = 1/4 = .25)
Therefore,
P = [4!/(3!1!)](.75)^{3}(.25)^{1} = 0.42
This means that 42% of the litters of four pups produced in such a monohybrid cross should consist of three agouti and one black pup. (Any other combinations have their own probability, which also can be calculated with the binomial theorem. Pick one and try it for practice.)
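For checking your practice answers, here is a minimal sketch of the binomial calculation for every possible four-pup litter. The function name is invented for illustration; `math.comb(n, s)` computes n!/(s!t!) with t = n - s.

```python
from math import comb

def binomial_probability(n, s, p):
    """P(event X occurs exactly s times in n trials), with P(X) = p."""
    t = n - s          # event Y occurs in the remaining trials
    q = 1 - p
    return comb(n, s) * p**s * q**t

# Litter of four pups from Bb x Bb; P(agouti) = 0.75.
for agouti in range(4, -1, -1):
    prob = binomial_probability(4, agouti, 0.75)
    print(f"{agouti} agouti, {4 - agouti} black: {prob:.4f}")
```

The five probabilities sum to 1, since every litter of four must fall into exactly one of these categories.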
If you were to breed monohybrid gerbils and obtain a sufficient sample size of four-pup litters, you could compare your actual number of 3 agouti/1 black litters to the predicted 42%.
But how do we know if the deviation from the expected is actually significant, or simply due to random sampling error? This is where statistical tests come into play.
Statistical Tests In some cases, results of monohybrid, dihybrid or other crosses will be close to the expected, but not exactly the same. How far can such results deviate from the expected before one suspects that the reason is not simply random sampling error (chance)? Subject the data to a statistical test.
The appropriate statistical test to use for any given data set depends on the type of data.
A parametric test is used to test the significance of continuous numerical
data relative to the expected.
(Examples: t-test, ANOVA, etc.)
A non-parametric test is used to test qualitative, discrete (attribute)
data relative to the expected.
(Example: Chi Square)
The Chi Square Test In genetics, the Chi Square (Χ^{2}) test can be used to test whether the distribution of data from experimental crosses is consistent with the expected result.
For example, if one were to perform a monohybrid cross of two agouti gerbils (Bb x Bb), the expected phenotypic result, as per Mendel's Law of Segregation, would be 75% agouti and 25% black gerbils.
If two gerbils of known genotype with respect to the B locus were crossed often enough to produce 100 pups, and of these pups 60 were agouti and 40 were black, is this deviation from the expected significant, or simply due to chance? This is a job for Chi Square.
Χ^{2} = Σ (O - E)^{2}/E
In which...
E = expected number in a phenotypic category
O = observed number in a phenotypic category
...summed for all possible categories.
For our example,
Χ^{2} = (60 - 75)^{2}/75 + (40 - 25)^{2}/25
(Use numbers, not percentages, as the Chi Square test's validity depends heavily upon actual sample size.)
Doing the math, we see that the Χ^{2} statistic is equal to 12.
The next step is to determine the degrees of freedom--a measure of the number of independent categories--in the system. For the Chi Square test, df = n - 1, in which n is the number of phenotypic categories. (Because agouti and black are not independent (a pup must be one or the other), df is not equal to n: once the count in one category is known, the count in the other is fixed.)
In our example, df = 2 - 1 = 1
Armed with a Chi Square statistic of 12 at one degree of freedom, we now turn to the well-established Table of Critical Values for the Chi Square.
As you may recall, any proportion of agouti and black offspring is possible, no matter how unlikely. The P value is simply a measure of how probable it is that chance alone would produce a deviation from the expected at least as large as the one in our case.
By convention, a null hypothesis is rejected if the probability value associated with the statistic is less than 0.05. In prose, this means that random sampling error alone would produce a deviation this large less than 5% of the time.
Our particular example suggests that the 60:40 ratio of agouti:black is not very likely to be due to random chance (P < 0.005, or less than 0.5%!), and that some other factor is responsible for the deviation.
It is now up to the clever scientist to try and determine what that other factor is, so it's back to the hypothesis factory.
In extremely rigorous experiments, a significance level of 0.01 is sometimes used.
In clinical trials (e.g. development of drugs, in the early stages) a more lax significance level of 0.1 is acceptable at the beginning of testing (you don't want to throw out a promising new drug, after so much expense has gone into developing it!).
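The gerbil example can be worked through in a few lines. This is a sketch only, with function names invented for illustration. Instead of a printed table of critical values, it uses the closed form for the upper-tail probability, erfc(sqrt(Χ^{2}/2)), which holds for one degree of freedom only; for other values of df a statistics library would be the usual route.

```python
from math import erfc, sqrt

def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all phenotypic categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(statistic):
    """Upper-tail probability of the Chi Square distribution, df = 1 only."""
    return erfc(sqrt(statistic / 2))

observed = [60, 40]   # agouti, black
expected = [75, 25]   # 3:1 ratio applied to 100 pups

statistic = chi_square(observed, expected)
df = len(observed) - 1
p = p_value_df1(statistic)
print(f"Chi Square = {statistic:.1f}, df = {df}, P = {p:.5f}")
print("reject null hypothesis" if p < 0.05 else "fail to reject null hypothesis")
```

Note that the inputs are counts, not percentages: the same 60:40 split in a sample of only 10 pups would give a much smaller statistic.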
Error in Statistical Tests A statistical test tells only the likelihood that a particular result is due to chance. Even though our example seemed very unlikely to be due to chance, it's still not impossible that this was the case.
If we have rejected or failed to reject a hypothesis in error, there are two ways it can happen:
Inheritance Beyond Mendel Mendel dealt with traits controlled by a single locus. It was only by concentrating on these simpler traits that he was able to divine the nature of inheritance. But the vast majority of the phenotype of complex eukaryotes is made up of traits affected by many genes.
Mendel also did not know about organelle genomes, nor how DNA itself functions. The Father of Modern Genetics paved the way for more complex understanding of inheritance.
Polygenic Inheritance To this point, we've considered discontinuous, qualitative character states. But many, if not most, phenotypic traits in natural populations are expressed on a continuum of states. Such quantitative characters are often controlled by more than one gene interacting.
In many cases, continuous variation of a trait is the result of environmental influence affecting the degree of gene expression. (e.g., genetically similar plants showing different phenotypes depending on environmental factors such as amount of water, fertilizer, sunlight, etc.). But perhaps as many cases of continuous variation can be genetically explained.
One interesting example is human skin color, which is controlled by at least eight different loci on different chromosomes...
The more genes involved in a particular trait's expression, the smoother and more continuous the distribution of the character traits...
The genes in a suite controlling the heredity of a particular trait showing continuous variation are collectively called polygenes or quantitative trait loci (QTLs).
(Recall that the locus of a gene is essentially its "physical address" on a DNA molecule. This term can be used synonymously with gene.)
Polygenes contributing to the level of pigmentation in a tissue can each be considered to be responsible for a certain "dose" of that pigmentation. The more "high pigment" genes an individual inherits, the more pigment that tissue will express.
Other polygenes work similarly, each contributing a "dose" of whatever form of expression that trait takes. It's just a bit easier to visualize with the pigment example.
Variation and assortment of polygenes in a population contribute to continuous variation of polygenic traits in that population.
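The "dose" idea can be sketched numerically. The model below is a deliberate simplification and an assumption of this example, not a description of any real trait: every locus has two alleles, both parents are heterozygous at every locus, loci assort independently, and each "high pigment" allele adds one equal dose. Under those assumptions the number of doses in an offspring follows a binomial distribution.

```python
from math import comb

def dose_distribution(loci):
    """Expected F2 frequencies of 0..2*loci "high pigment" alleles.

    Assumes every locus is heterozygous in both parents, assorts
    independently, and adds one equal dose per "high pigment" allele.
    """
    alleles = 2 * loci
    return [comb(alleles, k) / 2**alleles for k in range(alleles + 1)]

for loci in (1, 2, 4, 8):
    freqs = dose_distribution(loci)
    print(f"{loci} loci -> {len(freqs)} pigment classes")
    for dose, freq in enumerate(freqs):
        print(f"  {dose:2d} {'#' * round(freq * 60)}")
```

One locus gives three classes in a 1:2:1 ratio; eight loci give seventeen classes whose histogram already looks like a smooth bell curve.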
Mitochondria and Chloroplasts: Organelle Inheritance
The History of Organelle DNA's discovery
In 1909, Carl Correns first reported unusual inheritance patterns in variegated leaf colors in Japanese four o'clocks (Mirabilis jalapa).
Mirabilis exhibited maternal inheritance patterns.
The hypothetical process involved is known as CSAR: Cytoplasmic Segregation And Recombination. Or sometimes simply cytoplasmic segregation.
Organelles of similar genotype tend to segregate together during mitosis, until eventually the various areas of a mature organism may contain mitochondria (or chloroplasts) with genomes distinct from those of other areas of the body. In the variegated plants, the youngest leaves are green, and only as the plant grows older does it eventually produce variegated leaves after its cells have undergone such organelle segregation.
Further studies of cytoplasmic segregation done on a haploid organism, Neurospora crassa, were published in 1952 by Mary Mitchell. Her fungal colonies expressed two mutant traits:
2) extra cytochromes in the mitochondria
Both extranuclear mutant traits showed maternal inheritance patterns.
The "poky" mutation turned out to be a deficiency in the promoter region of the small ribosomal subunit's rRNA gene: faulty protein synthesis resulted in stunted growth of the colony.
(Yet another bit of evidence supporting the Endosymbiont Model.)
In 1965, Edward Tatum et al. transplanted mitochondria from mutant Neurospora into wild type recipient cells.
The team then plated them out for several generations and...voila. The mutant phenotype suddenly appeared. This showed the mitochondrial and/or chloroplast segregation first evidenced in the variegated Mirabilis leaves occurs in other species, and may be evolutionarily conserved.
The mitochondrial and c'plast genomes are
Most organelle-encoded polypeptides unite with nucleus-encoded polypeptides to produce active proteins that function in the organelle.
Reciprocal crosses involving the extranuclear genome result in unexpected phenotypic ratios typical of uniparental or maternal inheritance.
Maternally Inherited Mitochondrial Cytopathies
Plasmids in Eukaryotes: More Non-Mendelian Inheritance
No more laughing at Lamarck: Epigenetic Inheritance Recall the early contention of Jean Baptiste Lamarck, that organisms could pass on acquired traits to their offspring. While his examples were wrong, it turns out that some acquired genetic information can be passed on to offspring, usually in the form of "repackaged" DNA that behaves differently without having changed in nucleotide sequence.
Paramutation
Paramutation occurs when certain special--but seemingly normal--alleles, called paramutable alleles, are irreversibly changed after simply having been present in the same genome as another class of special alleles, called paramutagenic alleles. This phenomenon is known only in plants (so far), and has been studied most thoroughly in corn.
A gene known as B-I in corn encodes one of the enzymes in the production pathway of a blue-purple pigment, anthocyanin. Alternative, mutant alleles at this locus are recessive to B-I, and fail to allow pigment production.
A special paramutagenic allele, B', confers the ability to make a small amount of anthocyanin pigment. Corn that is homozygous for this allele is only weakly pigmented.
When true-breeding B-I corn plants are crossed with B' homozygotes, the resulting heterozygotes are weakly pigmented, and physically indistinguishable from the parental B' plants. This would suggest that B-I is recessive to B'. But it isn't that simple.
If B' were simply dominant to B-I, self-crosses of these heterozygous plants would yield an F2 generation in which 25% of the plants are fully pigmented B-I homozygotes. Instead, all F2 offspring are weakly pigmented, as are all subsequent generations of the plants: only B' alleles appear. The B-I allele is said to have paramutated. Somehow, by virtue of having been in the same genotype as the paramutagenic B' allele for but a single generation, the B-I allele has been permanently crippled in its activity.
Parental Imprinting
Genes that exhibit parental imprinting are methylated, bound to histones and hence, "turned off" in one sex during gametogenesis. This means that functional alleles of these genes are inherited from only one parent or the other, and in a sex-specific manner.
Some parentally imprinted genes are turned off in males, and the functional version of the gene is inherited only from the mother. Other imprinted genes are turned off in females, and the functional version is inherited only from the father.
Parental imprinting does not involve any change in DNA sequence; it is merely a "repackaging" of a DNA sequence, rendering it untranscribable.
We will return to examples of epigenetic inheritance in more detail later, when we discuss inheritance at the molecular level.