Probability is defined with respect to the event and the sample space:
In the following, the event is "the number 4" or "numbers greater than 2". The sample space is all possible numbers on a single die. (Dice is plural; die is singular.)

The probability is the proportion of times the event would occur if we repeated a random trial over and over, essentially an infinite number of times. In the Venn diagram above, the probability of an event is a ratio of areas. So, the probability, p(4), of obtaining a "4" is 1/6. The probability of obtaining a number of dots greater than 2, p(number of dots > 2), is 4/6 = 2/3.
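As a rough illustration of both views (ratio of outcomes and long-run proportion), the sketch below counts faces of a fair die exactly and also simulates many throws. The names `probability` and `simulated_frequency`, the seed, and the number of trials are assumptions made for this example, not part of the original notes.

```python
import random
from fractions import Fraction

# Sample space: the six faces of a single fair die.
SAMPLE_SPACE = [1, 2, 3, 4, 5, 6]

def probability(event):
    """Exact probability: outcomes in the event divided by all outcomes."""
    favorable = [face for face in SAMPLE_SPACE if event(face)]
    return Fraction(len(favorable), len(SAMPLE_SPACE))

def simulated_frequency(event, trials=100_000, seed=0):
    """Proportion of simulated throws in which the event occurs."""
    rng = random.Random(seed)
    hits = sum(event(rng.choice(SAMPLE_SPACE)) for _ in range(trials))
    return hits / trials

p_four = probability(lambda face: face == 4)            # 1/6
p_greater_than_2 = probability(lambda face: face > 2)   # 4/6, reduced to 2/3
freq_four = simulated_frequency(lambda face: face == 4)
print(p_four, p_greater_than_2, freq_four)
```

With more trials, the simulated frequency should settle closer to the exact value of 1/6.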
|
The probabilities of all possible (exhaustive sampling), mutually exclusive events add to one:
For a single throw of a die, p(1)=1/6, p(2)=1/6, etc. and:
p(1)+p(2)+p(3)+p(4)+p(5)+p(6) = 1
or, with p(even) = p(2)+p(4)+p(6) = 3/6 = 1/2, and p(odd) = p(1)+p(3)+p(5) = 3/6 = 1/2,
p(even)+p(odd) = 1
Finally, note we can use the fact that all probabilities must sum to 1 by computing
p(number of dots >2) = 1-p(1)-p(2) (This is a very useful idea)
Note that the events must be mutually exclusive if their probabilities are going to add up to 1. The properties of evenness and number of dots overlap in the Venn diagram: a die can show an even number of dots and 2 dots at the same time, so p(even) and p(2) cannot simply be added.
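The sum-to-one rule, the complement shortcut, and the overlap problem can be checked by treating events as sets of die faces. This is only a sketch: the helper `p` and the variable names are invented for this example.

```python
from fractions import Fraction

FACES = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event, given as a set of die faces."""
    return Fraction(len(event & FACES), len(FACES))

even, odd = {2, 4, 6}, {1, 3, 5}
greater_than_2 = {3, 4, 5, 6}

# Mutually exclusive and exhaustive: the probabilities add to one.
total_even_odd = p(even) + p(odd)

# Complement rule: p(number of dots > 2) = 1 - p(1) - p(2).
complement = 1 - p({1}) - p({2})

# Overlapping events: "even" and "2 dots" share the face 2,
# so adding their probabilities overcounts p(even OR 2).
naive_sum = p(even) + p({2})
true_union = p(even | {2})

print(total_even_odd, complement, naive_sum, true_union)
```

The naive sum (2/3) differs from the true probability of the union (1/2) because the face 2 is counted twice.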
|
"OR" means addition of probabilities (for mutually exclusive events).
We can ADD the probabilities to obtain the probability that one event OR another event occurs.
p(1 OR 2) = p(1)+p(2) = 1/6+1/6 = 1/3
(Note p(1 OR 2)+p(3 OR 4) + p(5 OR 6) = 1/3+1/3+1/3 = 1)
|
"AND" means multiplication of probabilities for independent events.
Suppose that we have two throws of a die or the simultaneous throw of two dice. If these throws are independent, then the probability of obtaining some outcome on the first throw AND some outcome on the second throw is the product of the two probabilities:
p(3 on first throw AND 3 on second throw) = (1/6)*(1/6) = 1/36
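One way to see why multiplication works is to list all 36 equally likely pairs of throws and count. The sketch below does this; `TWO_THROWS` and `p_joint` are illustrative names, and it assumes two fair, independent dice.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)
# All 36 equally likely (first throw, second throw) pairs.
TWO_THROWS = list(product(FACES, repeat=2))

def p_joint(event):
    """Probability of an event defined on a pair of throws."""
    hits = sum(1 for pair in TWO_THROWS if event(pair))
    return Fraction(hits, len(TWO_THROWS))

p_three_first = p_joint(lambda pair: pair[0] == 3)
p_three_second = p_joint(lambda pair: pair[1] == 3)
p_both_three = p_joint(lambda pair: pair == (3, 3))

# Counting directly and multiplying the two single-throw probabilities agree.
print(p_both_three, p_three_first * p_three_second)
```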
|
Mendel's peas:
Mendel inferred that if he crossed a test plant with itself, observed the pod color of 10 randomly chosen offspring, and found all of them green, then the test plant was homozygous for pod color. Look at the Punnett Square: it's really a Venn diagram. The probability of a sperm or egg being "G" or "g" is one half. The probability of a resulting embryo being a particular genotype is:
p(homozygous dominant) = p(GG) = p(G)*p(G) = (1/2)*(1/2) = 1/4
p(heterozygous) = p(G sperm and g egg) + p(g sperm and G egg)
= (1/2)*(1/2)+(1/2)*(1/2) = 1/2
p(homozygous recessive) = p(gg) = p(g)*p(g) = (1/2)*(1/2) = 1/4
Note that the probability of a particular sperm (G or g) from meiosis is independent of the probability of a particular egg (G or g) from meiosis. "Selection" of a sperm is a random event independent of "selection" of an egg.

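A heterozygous (Gg) test plant crossed with itself gives green-podded offspring with probability p(GG) + p(Gg) = 1/4 + 1/2 = 3/4. So, for a heterozygous test plant,
p(all ten offspring green) = p(first is green)*p(second is green)*...*p(tenth is green) = (3/4)^10 = 0.056.
So Mendel likely misidentified about 6% of heterozygous individuals as homozygous. However, he correctly identified heterozygotes about 94% of the time.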
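The arithmetic behind the 0.056 figure can be sketched as follows, assuming green (G) is dominant and the test plant is heterozygous, so each offspring is green with probability p(GG) + p(Gg) = 3/4. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Gametes of a heterozygous (Gg) parent: "G" or "g", each with probability 1/2.
GAMETES = {"G": Fraction(1, 2), "g": Fraction(1, 2)}

# Punnett square: every (sperm, egg) pair; independent gametes are multiplied.
genotype_prob = {}
for (sperm, p_sperm), (egg, p_egg) in product(GAMETES.items(), repeat=2):
    genotype = "".join(sorted(sperm + egg))  # "gG" and "Gg" are the same genotype
    genotype_prob[genotype] = genotype_prob.get(genotype, 0) + p_sperm * p_egg

# Green is dominant, so GG and Gg offspring both have green pods.
p_green = genotype_prob["GG"] + genotype_prob["Gg"]

# Ten independent offspring, all green, from a heterozygous parent.
p_all_ten_green = p_green ** 10

print(genotype_prob, p_green, float(p_all_ten_green))
```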
|
Let's redraw our Punnett Square / Venn diagram to make it interactive.
For the mating of single individuals, the probabilities of obtaining one of the haploid products of meiosis or the other are equal. In a population of individuals with different genotypes, random mating produces a "pool" of sperm and eggs for which p(G) (and p(g) = 1-p(G)) are not necessarily 1/2.
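A minimal sketch of the same Punnett square with an adjustable p(G) is shown below. The function name `genotype_probabilities` and the pool value 0.7 are hypothetical choices for illustration; the only assumption carried over from the text is that sperm and egg are drawn independently from the same pool.

```python
def genotype_probabilities(p_G):
    """Genotype probabilities when sperm and egg are drawn independently
    from a pool in which allele G has probability p_G."""
    if not 0 <= p_G <= 1:
        raise ValueError("p_G must lie between 0 and 1")
    p_g = 1 - p_G
    return {
        "GG": p_G * p_G,
        "Gg": p_G * p_g + p_g * p_G,  # G sperm with g egg, or g sperm with G egg
        "gg": p_g * p_g,
    }

single_cross = genotype_probabilities(0.5)  # heterozygote crossed with itself
skewed_pool = genotype_probabilities(0.7)   # hypothetical pool with p(G) = 0.7
print(single_cross, skewed_pool)
```

Whatever value of p(G) is chosen, the three genotype probabilities still sum to one, because the genotypes are mutually exclusive and exhaustive.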
|
Another example of obtaining the probability of two independent events occurring:
The following example, for which the probability of being a smoker is independent of having high blood pressure, shows how we can calculate the joint probability of an individual being BOTH a smoker and having high blood pressure. (Notice that the four numbers in black add up to one, as do the two red and the two blue numbers.)
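The joint table for two independent properties can be built by multiplying marginal probabilities. The marginal values below (0.2 for smoking, 0.3 for high blood pressure) are hypothetical placeholders, not the numbers from the original figure.

```python
# Hypothetical marginal probabilities (assumed for illustration only).
p_smoker = 0.2
p_high_bp = 0.3

marginal_smoking = {"smoker": p_smoker, "non-smoker": 1 - p_smoker}
marginal_bp = {"high bp": p_high_bp, "normal bp": 1 - p_high_bp}

# Independence: each joint probability is the product of the two marginals.
joint = {
    (s, b): ps * pb
    for s, ps in marginal_smoking.items()
    for b, pb in marginal_bp.items()
}

for cell, prob in joint.items():
    print(cell, round(prob, 4))
print("sum of joint probabilities:", round(sum(joint.values()), 4))
```

The four joint probabilities sum to one, as do each pair of marginal probabilities.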
|
Conditional probability:
If events are not independent, we must use conditional probabilities: p(A | B) is read as "the probability that event A has occurred, GIVEN that B has occurred."
For two INDEPENDENT events:
p(A AND B) = p(A)*p(B)
We saw that the probability of high blood pressure AND smoking was p(high bp)*p(smoking)
For two NON-INDEPENDENT events:
p(A AND B) = p(A | B)*p(B) = p(B | A)*p(A)
We obtain Bayes' theorem:
p(A | B) = p(B | A)*p(A)/p(B)
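To make the conditional-probability rules concrete, the sketch below returns to a single die and uses two events that are not independent: A, an even number of dots, and B, more than 3 dots. The events and the helper names `p` and `p_given` are chosen for this example. It uses the standard definition p(A | B) = p(A AND B)/p(B) and checks that Bayes' theorem, p(A | B) = p(B | A)*p(A)/p(B), gives the same value.

```python
from fractions import Fraction

FACES = {1, 2, 3, 4, 5, 6}

def p(event):
    """Probability of an event, given as a set of die faces."""
    return Fraction(len(event & FACES), len(FACES))

def p_given(a, b):
    """Conditional probability p(A | B) = p(A AND B) / p(B)."""
    return p(a & b) / p(b)

A = {2, 4, 6}  # even number of dots
B = {4, 5, 6}  # more than 3 dots

# A and B are not independent: p(A AND B) differs from p(A)*p(B).
independent = p(A & B) == p(A) * p(B)

# Product rule for non-independent events: p(A AND B) = p(A | B)*p(B).
product_rule = p_given(A, B) * p(B)

# Bayes' theorem: p(A | B) recovered from p(B | A).
bayes = p_given(B, A) * p(A) / p(B)

print(independent, p(A & B), product_rule, p_given(A, B), bayes)
```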
|
The "triple test" screens for Down's Syndrome without the risks of amniocentesis:
Down's Syndrome occurs in one in 1000 pregnancies, so
p(DS) = 0.001
Probability that a fetus with DS will be correctly scored as having DS is:
p(+test | DS) = 0.6
Probability that a test would incorrectly say that a normal fetus had DS is
p(+test | no DS) = 0.05
(Note that if the test results were independent of the presence of Down's Syndrome, these last two numbers would be the same!)
The three numbers above give the values marked in blue and red on the next graph.
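The four numbers in black are the joint probabilities, calculated from p(A AND B) = p(A | B)*p(B):
p(+test AND DS) = 0.6*0.001 = 0.0006
p(-test AND DS) = 0.4*0.001 = 0.0004
p(+test AND no DS) = 0.05*0.999 = 0.04995
p(-test AND no DS) = 0.95*0.999 = 0.94905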
|
There is only a one percent chance that a positive test indicates Down's Syndrome!!!
The law of total probability gives the probability of a positive test (the sum of the two purple areas):
p(+test) = p(+test | DS)*p(DS) + p(+test | no DS)*p(no DS) =
0.6*0.001 + 0.05*0.999 = 0.05055
Bayes' Theorem can be used to calculate the probability of having DS, given that the test gives a positive result:
p(DS | +test) = p(+test | DS)*p(DS)/p(+test) = 0.6*0.001 / 0.05055 = 0.012
Conclusion: there is about a one percent chance that a positive test indicates Down's Syndrome!!!
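The calculation above can be reproduced with a few lines of code. This is a sketch only: the function `posterior` and its parameter names are invented here, while the three input numbers are the ones given in the text.

```python
def posterior(prior, p_pos_given_condition, p_pos_given_no_condition):
    """Return (p(+test), p(condition | +test)) from the three input probabilities."""
    # Law of total probability over the two mutually exclusive cases.
    p_pos = (p_pos_given_condition * prior
             + p_pos_given_no_condition * (1 - prior))
    # Bayes' theorem.
    return p_pos, p_pos_given_condition * prior / p_pos

p_pos, p_ds_given_pos = posterior(prior=0.001,
                                  p_pos_given_condition=0.6,
                                  p_pos_given_no_condition=0.05)
print(round(p_pos, 5), round(p_ds_given_pos, 3))
```

Varying `prior` while holding the two test characteristics fixed shows how strongly the answer depends on how rare the condition is.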
|