Genomes An organism's genome is its full complement of genetic information. Examples of genomes include...

Genomics is the subdiscipline of genetics/molecular biology concerned with the structure, function, evolution, and mapping of genomes.

The C-value Paradox One might expect more complex organisms to contain more DNA and more genes than simpler organisms, but this is not the case. In fact, the amount of DNA is proportional to neither (1) the number of genes nor (2) the apparent complexity of the organism.

An organism's C-value is the amount of DNA it contains in a haploid nucleus. C-values are species-specific, and vary widely across species.

A typical frog has about seven times as much DNA as a human, though it is (arguably) less complex. A common lily has about 100 times more DNA than a human. This was known as the C-value paradox, and it made little sense until the discovery of non-coding DNA.

The human genome comprises approximately 3.5 billion base pairs, an amount that should translate to about two million genes. The actual number is still not known (and there's a betting pool), but estimates are settling somewhere between 20,000 and 50,000 protein-coding genes.

If this is correct, then only about 3% of the human genome codes for polypeptides, with 97% being non-coding DNA. What is this non-coding DNA, and where does it come from?
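The arithmetic behind these figures can be checked with a short sketch. The genome size, the two-million-gene expectation, and the 20,000 to 50,000 range are taken from the text; the average coding length per gene is not stated anywhere in these notes, so it is simply back-calculated here and should be read as an assumption.

```python
# Figures quoted in the notes.
GENOME_BP = 3_500_000_000      # approximate human genome size in base pairs
EXPECTED_GENES = 2_000_000     # gene count "expected" if all DNA were coding

# ASSUMPTION: the coding length per gene implied by the two numbers above.
IMPLIED_BP_PER_GENE = GENOME_BP // EXPECTED_GENES


def coding_fraction(n_genes, bp_per_gene=IMPLIED_BP_PER_GENE, genome_bp=GENOME_BP):
    """Fraction of the genome occupied by coding sequence under the assumption above."""
    return n_genes * bp_per_gene / genome_bp


low = coding_fraction(20_000)   # lower gene-count estimate
high = coding_fraction(50_000)  # upper gene-count estimate
print(f"{IMPLIED_BP_PER_GENE} bp per gene, coding fraction {low:.1%} to {high:.1%}")
```

Under that assumption the coding fraction lands between 1% and 2.5%, the same order of magnitude as the roughly 3% quoted above.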

Non-coding DNA Even non-coding DNA may have an important function. There are...

Studies in functional genomics and evolutionary/comparative genomics seek to determine how genomes are constructed, how they work, and how they have changed over evolutionary time.

Jumping Genes: Transposable Genetic Elements The term transposable genetic element is the most generic term used to describe a genetic element that can occasionally move (transpose) from one position on a chromosome to another position.

Specific types of these elements have other names, including

Such elements often cause abnormalities in gene function at the loci where they insert--most often by disrupting normal expression of the gene.

Transposable genetic elements were first discovered and described in corn by Barbara McClintock, who won the Nobel Prize in Physiology or Medicine in 1983 for her lifelong work.

Transposable elements were not isolated at the molecular level until they were studied in yeast and Drosophila.

  • Insertion sequences, transposons and phage µ are some of the transposable elements found in bacteria. These TGEs work at the level of the DNA molecule.

  • In eukaryotic cells, transposable elements have been found in corn (maize), yeast, Drosophila and some mammalian systems. In eukaryotes, transposable elements can be responsible for rearrangements of entire chromosomes by causing breakages. In some cases, an RNA intermediate is utilized during transposition.

    Discovery of Transposable Elements in Corn (Zea mays) In 1938, Marcus Rhoades reported unexpected (non-Mendelian) ratios in certain corn crosses.

    This would mean that the presence of a Dt allele allowed spots of pigment to form in a corn kernel that was genetically supposed to be colorless. Not very parsimonious.

    A second hypothesis was proposed:

    A lucky break...
    Rhoades found a male corn plant in which the anthers exhibited the dotted pigment pattern. He used pollen from these to test cross with a1a1 females. Some of the progeny were completely pigmented. This suggested that something in the dotted individuals' genes could somehow "reawaken" the ability to produce pigment in the dotted individuals' offspring--but not always. What was going on?

  • The a1 allele is the first known example of an unstable mutant allele--one in which reverse mutations occur at a very high rate.

    The Ds element In the 1940s, Barbara McClintock noted in her cytological studies of corn chromosomes that in one strain of corn, chromosome 9 readily broke at a specific site. She hypothesized that the break was due to the presence of two genetic factors she named Ds (for "Dissociation"--this one was located at the breakage site) and Ac (for "Activator"--because the Ds site would not break unless Ac was present).

    But when she tried to map them...they wouldn't hold still! From this, she predicted that the two elements were mobile, and could actually change places within the genome and

    She also found rare, unusual and unexpected corn kernel phenotypes in the offspring of her corn crosses:

    In this example, the presence of Ac causes Ds to break, and the acentric fragment is lost. The result is hemizygosity at all the loci carried away on the lost fragment, allowing recessive phenotypes to be expressed in the cells derived from the single cell in which this breakage occurred early in the corn kernel's development.

    In this example, Ds is inserted into C early in the kernel's development, suppressing pigment production. In a few progenitor cells, Ds later pops out. This allows the normal function of C to resume, and the areas where Ds has excised are now able to produce pigment. This is an example of how a transposable element (Ds) can produce an unstable phenotype: expression changes in different cell lines and at different times because you never know when Ds is going to pop out of the gene and allow it to resume function.

    Autonomous and Nonautonomous Elements
    In plants, there are two types of transposable elements:

    Insertion of either type of element into a gene causes that gene to be disrupted, producing a mutant phenotype.

    In Rhoades' early study, Dt was that separate element: it supplied the factors promoting the transposition of a gene segment, and insertion of that segment into the pigment gene (A) disrupted the wild type allele's (A1) function, causing the mutant, unpigmented a1 phenotype.

    Insertion of an autonomous element is unstable, because it can direct its own transposition over and over. The mutation can occur in each generation; the allele produced by the insertion is called a mutable allele because of its instability.

    Insertion of a nonautonomous element is stable, because it needs the products of the autonomous element in order to transpose and produce the mutant allele.

    Let's look:

  • Top row: Wild type pigmented kernel.
  • Second row: Ds is inserted into pigment gene (C) permanently, disabling it. By itself, it can't move. It's stuck. Ds is a non-autonomous element.
  • Third row: Ds and Ac both present, Ds can now excise from the C gene in some cells (i.e., it can transpose) during development, creating developmental fields that can produce pigment. This is because Ac has provided the elements needed for Ds to transpose.
  • Fourth row: Ac is inserted into pigment gene, but not permanently, as it can provide the elements that allow its removal from the gene. Ac is an autonomous element.
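The four rows above boil down to a small decision rule, sketched here. The function name and the phenotype labels are made up for illustration. The "spotted" outcome for an Ac insertion is an assumption: the notes only say that Ac is not stuck in the gene.

```python
def kernel_phenotype(element_in_C=None, ac_elsewhere=False):
    """Toy model of kernel pigmentation, following the four rows described above.

    element_in_C : None, "Ds" or "Ac" -- what (if anything) sits in pigment gene C
    ac_elsewhere : True if an Ac element is present somewhere else in the genome
    """
    if element_in_C is None:
        return "pigmented"            # row 1: wild type, C works normally
    if element_in_C == "Ac":
        # row 4: Ac is autonomous and can remove itself in some cells
        # (ASSUMPTION: this shows up as a spotted kernel)
        return "spotted"
    if element_in_C == "Ds":
        # rows 2 and 3: Ds is nonautonomous and only moves when Ac helps
        return "spotted" if ac_elsewhere else "colorless"
    raise ValueError("element_in_C must be None, 'Ds' or 'Ac'")


for args in [(None, False), ("Ds", False), ("Ds", True), ("Ac", False)]:
    print(args, "->", kernel_phenotype(*args))
```

The only case that depends on a second element is Ds: that dependence is what makes it nonautonomous.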

    And the kicker: Rarely, an Ac-type allele was found to change into a Ds-type allele, apparently because the Ac element had spontaneously turned into a Ds element. (This could mean that Ds is simply a mutant version of Ac that has lost the ability to encode the elements that allow it to jump around.)

    When McClintock first reported her findings around 1950, most people believed that this was something unique to corn. But later, as transposable elements were discovered in E. coli, yeast, and higher organisms, it became apparent that she had been the first to describe a phenomenon that was far more universal, suggesting that genomes were far more dynamic than first supposed. In 1983, she was awarded the Nobel Prize in Physiology or Medicine for her early work on corn transposons.

    At first, TGEs were thought to exist only in corn. Over the years, however, they have been discovered in many other organisms, including both prokaryotes and eukaryotes.

    Transposable Genetic Elements in Bacteria Two basic types of TGEs are known in bacteria

    IS elements Insertion sequences (IS) were first discovered in the gal operon of E. coli, and were physically located because viruses carrying the bacterial gene in both mutated and wild type forms could be separated in a centrifuge: the mutants had an extra piece of DNA inserted, making them denser.

    When an IS appears in any of the three genes of the gal operon (E for epimerase, T for transferase and K for kinase), the normal transcription of the gene is disrupted.

    Insertion of an IS affects only the transcription of the genes downstream from the insertion. For example, if the IS occurs late in the E gene, the T and K genes might be disrupted, but the E might not be, and epimerase is still manufactured.

    This phenomenon is known as a polar mutation, since there is directionality to the transcriptional effects.
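The directionality of a polar mutation can be sketched as a simple rule over the gene order. This is a toy model only: the E, T, K order is the order in which the text lists the gal genes, and the function name and the "late insertion" flag are illustrative assumptions.

```python
GAL_OPERON = ["E", "T", "K"]  # gene order as listed in the text


def polar_effect(operon, hit_gene, hit_is_late=False):
    """Split an operon into genes unaffected and disrupted by an IS insertion.

    Genes upstream of the insertion are transcribed normally; genes downstream
    are disrupted. If the IS lands late in the hit gene, that gene's product
    may still be made (as in the epimerase example above).
    """
    i = operon.index(hit_gene)
    cut = i + 1 if hit_is_late else i
    return {"unaffected": operon[:cut], "disrupted": operon[cut:]}


print(polar_effect(GAL_OPERON, "E", hit_is_late=True))
print(polar_effect(GAL_OPERON, "T"))
```

An insertion in K would leave E and T untouched, which is why the effect is described as polar rather than operon-wide.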

    IS elements are short pieces of DNA that move about in the genome. Where they insert, they can disrupt the function of a gene, including any operon genes downstream from the insertion point, because the IS interrupts transcription of the operon (i.e., polar mutation).

    IS elements may differ in exact sequence (several distinct IS elements have been identified), but they all encode an enzyme known as transposase, which facilitates movement of the IS element.

    All IS elements begin and end with inverted repeat sequences that facilitate their removal and insertion.

    Transposons Recall bacterial R factors, plasmids that carry genes encoding factors that make bacteria resistant to particular antibiotics. Like F-factors, they are rapidly replicated and shared among bacteria.

    In the 1950s, a strain of Shigella bacteria appeared in Japanese hospitals. The normal strains of this bacterium are sensitive to a wide spectrum of antibiotics. But a Shigella strain isolated from patients with severe dysentery was discovered to be resistant to most antibiotics.

    The multiple-drug resistance phenotype was apparently inherited as a single package--and not only by other Shigella. Other bacterial species could also obtain this resistance.

    The problem was a self-replicating episome, a bacterial genetic element capable of

    This episome was called an R factor (for "Resistance").

    The R factor is transferred rapidly between bacteria upon conjugation. In the cytoplasm, it exists as a plasmid.

    As you may recall, plasmids in bacteria often carry genes that confer resistance to antibiotics.

    If one denatures the DNA of these R Factors and allows them to slowly renature, portions of the plasmid form a stem loop.

    The genes conferring drug resistance are usually located on the LOOP of the stem loop. This is located between two inverted repeat (IR) sequences, which create the stem loop.

    The resistance genes in the loop, along with their flanking IR sequences, are known as a transposon. The region between the IR sections is known as the resistance transfer region (RTR), since that's what carries the antibiotic resistance genes.
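Why inverted repeats make a stem loop: in a single strand, the left repeat is the reverse complement of the right repeat, so the two ends can base-pair with each other while the DNA between them loops out. A minimal sketch, using a made-up sequence (the 7-base repeats and the poly-T "loop" are placeholders, not real transposon sequence):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq):
    """Number of bases at the left end that can pair with the right end.

    The first k bases of reverse_complement(seq) are the reverse complement of
    the last k bases of seq, so the stem length is the length of the common
    prefix of seq and its reverse complement (capped at half the sequence).
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = 0
    for a, b in zip(seq, rc):
        if a != b:
            break
        n += 1
    return min(n, len(seq) // 2)


left_ir = "GATTCAG"                     # hypothetical inverted repeat
loop = "TTTTTTTT"                       # placeholder for the resistance genes
toy_transposon = left_ir + loop + reverse_complement(left_ir)
print(toy_transposon, terminal_inverted_repeat_length(toy_transposon))
```

For the toy sequence the stem is 7 bases long and the 8-base placeholder is left unpaired as the loop.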

  • Composite transposons consist of a protein-coding region flanked by two Inverted Repeats (the IS elements).

  • Simple transposons consist of

    The above illustrates replicative transposition, in which a new copy of the transposon is made.
    Transposition may also occur in a conservative fashion, with the transposon simply moving without being copied. Transposons can jump from one plasmid to another, or directly into a bacterial chromosome.

    Both mechanisms generate a repeated sequence of the target DNA (i.e., the DNA in which the transposon is inserted). Several models were proposed for the mechanism of transposon insertion, and the one described by J. Shapiro (1978) is the currently accepted one.

    This explains the presence of direct and/or inverted repeats where transposons insert.
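One way to picture how a repeated target sequence arises is a staggered cut: the two strands of the target are cut a few bases apart, the transposon is joined to the overhanging ends, and the single-stranded gaps are filled in, so the short stretch between the cuts ends up on both sides of the insert. The sketch below is a simplified single-string model of that idea and is not meant as a rendering of Shapiro's model. The target sequence, the 5-base stagger and the "[TN]" marker are assumptions for illustration.

```python
def insert_with_target_duplication(target, pos, transposon, stagger=5):
    """Insert `transposon` into `target` with a staggered cut starting at `pos`.

    The `stagger` bases between the two cut sites are copied during gap repair,
    so they flank the inserted element as a direct repeat.
    """
    if pos < 0 or pos + stagger > len(target):
        raise ValueError("staggered cut must lie inside the target")
    left = target[:pos + stagger]    # ends with the target-site bases
    right = target[pos:]             # begins with the same target-site bases
    return left + transposon + right


target = "AAACCGTTAGGG"              # hypothetical target DNA
result = insert_with_target_duplication(target, pos=3, transposon="[TN]")
print(result)  # the site CCGTT now appears on both sides of [TN]
```

Removing the marker and one copy of the duplicated site gives back the original target, which is what a clean excision would have to restore.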

    Precise vs. Imprecise Excision Although transposons can excise without affecting surrounding DNA, they often generate a high incidence of deletions in their vicinity. These can consist of part of the element and part of the adjacent DNA.

  • precise excision occurs when the transposon is excised and any deleted portions of the adjacent DNA are restored by repair enzymes.

  • imprecise excision occurs when varying lengths of DNA adjacent to the transposon are excised along with it. (This is far more common than precise excision.)

    Phage µ

    This temperate virus (a bacteriophage) inserts into the genome of E. coli.

    If more than one µ is present and both excise at once, they can cause deletions, insertions and translocations in the host's chromosome.

    µ replicates with the host chromosome, and generally does not form a plasmid.

    Transposable Genetic Elements in Eukaryotes Transposable elements were first discovered in corn (maize), but are now known in numerous other eukaryotes, from yeast to mammals. These can be classified as either:

    Retrotransposons These were first discovered in a strain of mutant yeast (HIS4 mutants, which had a faulty enzyme in the histidine pathway, and were unable to make histidine) that was able to revert to wild type at a much faster rate (1000x!) than would be predicted by simple random mutation. The mutation turned out to be caused by an inserted DNA segment similar to a previously discovered segment of DNA common in yeast, called a Ty element.

    Retrotransposons consist of

    Characterization of the yeast elements yielded a surprise: Their characteristics were reminiscent of those of retroviruses.

    1. The inserted element's DNA is transcribed (by the host cell's enzymes) into mRNA.
    2. The mRNA is translated, producing the reverse transcriptase encoded by the element (via its pol gene).
    3. Reverse transcriptase reverse-transcribes the element's mRNA into DNA.
    4. Presto! New element is born.
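The cycle in the numbered steps is "copy and paste": the original element stays where it is and a DNA copy made from its RNA lands somewhere else. A toy sketch, treating the genome as a list of loci; the element sequence and locus names are invented, and transcription is reduced to swapping T for U:

```python
def transcribe(dna):
    """DNA -> RNA (toy version: same sequence with U in place of T)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna):
    """RNA -> DNA copy, the job done by the element's own reverse transcriptase."""
    return rna.replace("U", "T")


def retrotranspose(genome, source_index, target_index):
    """Return a new genome with a fresh copy of the element inserted.

    The element at `source_index` is NOT removed: only its RNA-derived DNA
    copy is inserted at `target_index`.
    """
    rna = transcribe(genome[source_index])
    dna_copy = reverse_transcribe(rna)
    new_genome = list(genome)
    new_genome.insert(target_index, dna_copy)
    return new_genome


TY = "TGTTGGAATA"                        # hypothetical element sequence
genome = ["gene_A", TY, "gene_B"]
after = retrotranspose(genome, source_index=1, target_index=3)
print(after.count(TY), after)
```

Each round adds a copy without removing one, so the number of elements can only go up unless something else deletes them.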

    The main thing lacking in the Ty element is the env gene, which encodes the retroviral envelope protein. Without it, there is no infectious viral packaging: just a naked bit of DNA that generally stays inside the same host cell, inserting and re-inserting itself.

    Other elements since discovered in other species (e.g., the copia-like elements found in Drosophila) also show these retroviral characteristics. Clues to their origin?

    Why the High Frequency of Reversions in HIS4 Mutant Yeast? Note that these LTR retrotransposons do not excise when they create a new copy of themselves. They stay put.
    So why did the HIS4 mutants have such a high rate of reversion?
    The answer lies in the nature and location of the insertion:

    The solo LTR, which isn't big enough to interfere with transcription, is the product of crossing over between the identical LTRs: only one is left behind.

    Solo LTRs are very common in almost all eukaryotes, and this could explain how they got there.
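The solo LTR outcome can be sketched as a string operation: crossing over between the two identical LTRs removes everything between them plus one LTR copy. The sequences below are invented placeholders (uppercase for the LTR, lowercase for everything else), and the function name is illustrative.

```python
def recombine_ltrs(locus, ltr):
    """Model crossing over between the first and last copy of `ltr` in `locus`.

    Everything from the start of the first LTR up to the start of the last LTR
    is lost, leaving a single (solo) LTR behind.
    """
    first = locus.find(ltr)
    last = locus.rfind(ltr)
    if first == -1 or first == last:
        return locus                 # fewer than two LTRs: nothing to recombine
    return locus[:first] + locus[last:]


LTR = "TGTTGGAATA"                   # hypothetical LTR sequence
locus = "ccc" + LTR + "gagpol" + LTR + "ggg"   # flank + full element + flank
print(recombine_ltrs(locus, LTR))
```

What remains is the flanking DNA with one short LTR in it, which fits the idea that the leftover is too small to interfere with transcription.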

    DNA transposons Class II transposable elements move about in a way similar to that seen in bacteria: either the element itself or a copy (i.e., DNA) is inserted into a gene, disrupting its function. The maize transposons were the first discovered (McClintock). But the first to be understood on a molecular basis were the P-elements found in Drosophila.

    The P Elements These are DNA transposons that:

  • are variable in length; anywhere from 0.5 to 2.9 kb long (variable length may be due to deletions of the P element, causing them to be defective)

    P elements have been useful in allowing geneticists to formulate models for the mechanisms of transposition.

    Hybrid Dysgenesis and P elements P Elements were first discovered due to a phenomenon--observed in controlled laboratory matings--known as hybrid dysgenesis in offspring produced in a cross of M (maternal) cytotype females (known in the lab only) and P (paternal/wild type) cytotype males.

    M female x P male --> dysgenesis

    M male x P female --> normal offspring

    This germline dysgenesis included

    HOWEVER... Hypothesis: the mutations are being caused by the insertion of foreign DNA--which could later excise, reversing the mutations.

    Experimental Results:

    In normal P-cytotype flies, the P-elements are suppressed, and cannot transpose.

  • Old Hypothesis: A cytoplasmic repressor protein is present in P-cytotype, but not M-cytotype.
  • New Hypothesis: Almost all lab Drosophila are descended from those used by Morgan, et al. Why do they have no P elements? No one knows for sure. But if it's the latter, it can show just how fast a new transposable element can be introduced into a genome and summarily silenced, likely by natural selection for the silencing mechanisms.

    P elements are useful in the lab Like many accidents of nature, P-elements have become a useful tool for studying genetics. They can be used to tag genes, insert transgenes, and who knows what else in the future?

    A World of Eukaryotic TGEs As investigators search across species, it is becoming apparent that large genomes have tremendous numbers of transposable elements, and may even be composed mostly of transposable elements.

    This may help explain the C-value paradox: There appears to be little correlation between the size of an organism's genome and its biological complexity.

    Possibly more than half of the human genome appears to consist of transposable elements, mostly long interspersed elements (LINEs) and short interspersed elements (SINEs). Most of these can no longer move about, but retain the vestiges of former mobility (e.g., inverted repeats). A vast number are also located within introns, and are spliced out of the transcript and never translated. They are evolutionary relics rendered harmless by the points of their insertion and by the host's regulatory mechanisms.

    The most abundant SINE in the human genome contains a target site for the Alu restriction enzyme (isolated from Arthrobacter luteus), and so is named for that restriction enzyme: The Alu element.
    These make up nearly 10% of our genome, with nearly a million copies of various fragments of the element.
    The elements resemble the RNA component of the universal signal recognition particle (7SL RNA), and may be the product of its (accidental?) reverse transcription.

    A few elements are still able to move around, and some are known to be responsible for causing human disorders by inserting into specific locations:

  • hemophilia A (three different insertions of a LINE into the clotting factor VIII gene)
  • hemophilia B (insertion into the clotting factor IX gene)
  • neurofibromatosis (Alu insertion into the NF1 gene)
  • one type of breast cancer (Alu insertion into the BRCA2 gene)

    More will be found, now that we know how to look.

    Large Genomes, Space for LTRs In grasses used by humans for grain production, differences in genome size can largely be attributed to different quantities of inserted LTR transposons. Except for the transposon regions, the different grasses show a great deal of synteny in their genomes.

    How is such a massive load of transposons tolerated?
    Like any good parasite, a smart transposon doesn't harm its host. The ones that persist are those that have landed in genetic safe havens: areas of the genome where there are few functional genes. The transposons just hang out and are replicated--the ultimate freeloading passengers.

    Transposons that happened to land in a sensitive spot were subject to negative selection, and weeded out of the genome.

  • some elements can be used as biotechnology tools for cloning and gene manipulation, facilitating insertion of genes into germ lines of recipient cells.

  • The properties of TGEs may allow their use in gene therapy: insertion of functional genes in individuals lacking a normal, functioning copy of an essential gene (e.g., those with "bubble boy syndrome", who lack the precursor cells necessary to manufacture important cells of the immune system).

    So maybe in the long run, we'll be glad of our little passengers, and they'll eventually be paying their way by means we can't yet foresee.

    Repressing Transposable Elements: Epigenetic Regulation How do host cells manage to silence transposition of TGEs that contain transposase genes?

    Model organism: C. elegans lab strain with a DNA transposon, Tc1, intentionally inserted into the unc-22 gene ("uncoordinated" worms move with a herky-jerky motion, unlike smoothly-gliding wild type).

    Ronald Plasterk, et al. subjected this known mutant strain to mutagens in an attempt to disrupt the function of genes that repress transposition. Success would mean finding rare mutant worms that could move normally.

    They generated new strains in which the worms could excise the Tc1 element in the germline, and give rise to normally-locomoting offspring. Over 25 genes, when mutated, restored normal movement.

    Years of research revealed that this eukaryotic host employed RNAi to repress the expression of active TGEs in their genomes.

    A single TGE that inserts near a gene is transcribed to produce dsRNA that can be processed by DICER and RISC, and then employed in the silencing of all copies of the element in the genome.

    This silencing mechanism can be compared to an early warning "radar" that detects incoming enemies (via new insertions), and responds by awakening silencing mechanisms to inactivate any subsequent insertions by preventing copies of the TGEs from being made.

    Some transposons have evolved mechanisms that allow them to escape being chopped up, though. Natural selection works its Cold War wonders in both directions.