
Junk DNA

Goals: Help develop this as a talkorigins.org FAQ on Junk DNA!



The genomes of almost all eukaryotic organisms contain a large proportion of DNA that does not code for proteins. Some of this DNA regulates genes, but much of it has no discernible purpose and is therefore popularly known as "junk DNA". Creationists are uneasy about a genome full of DNA that does nothing (why would God be so wasteful?) and claim that junk DNA is not really junk. They argue that scientists are too quick to call junk DNA purposeless, that there is no evidence for the claim, and that it is an example of "dogmatic Darwinism". This is a bit rich, since creationists have no evidence for their own claim.

What is Junk DNA?

There are a number of different types of DNA classified as junk DNA. Contrary to the creationists' claims, we know quite a lot about the different kinds, and EvoWiki has articles about them.

The scientific definition of "junk DNA" and that used by creationists are somewhat at variance. "Junk DNA", as used in the scientific literature, is restricted to sequences for which there is no known adaptive purpose. Creationists often cite non-coding sequences with known functions, such as introns and regulators, as examples of junk DNA.

Reasons to think that much non-coding DNA is "junk"

The major fact that led scientists to propose the junk DNA hypothesis was not one of the above reasons, but the simple observation that the amount of DNA per cell can vary widely between closely related organisms. An excellent example is provided in a post cited below, where one species of deer has ~20% more DNA than another very similar species of deer in the same genus. Virtually all of this variation is in noncoding sequences, such as the noncoding repetitive elements that make up a large percentage of mammalian genomes. Either one species lost a lot of DNA with only trivial resulting change, or the other species gained a lot of DNA with only trivial resulting change. Either way, this implies that this DNA is not very important. Even if the scattered scientific reports of functions found for bits and pieces of noncoding DNA were magnified a hundredfold, the direct observation of widely varying amounts of DNA between closely related organisms would still imply that most noncoding DNA is not very important.

As the 2002 textbook Molecular Cell Biology puts it,

The DNA content per cell also varies considerably among closely related species. All insects or all amphibians would appear to be similarly complex, but the amount of haploid DNA in species within each of these phyla varies by a factor of 100. The same variation in DNA content per cell is common within groups of plants that have similar structures and life cycles. For example, the broad bean contains about three to four times as much DNA per cell as the kidney bean.

These facts further suggest that much of the DNA in certain organisms is "extra" or expendable — that is, it does not encode RNA or have any regulatory or structural function. The total amount of DNA per haploid cell in an organism is referred to as the C value; the failure of C values to correspond to phylogenetic complexity is called the C-value paradox. This perplexing variation in genome size occurs mainly because eukaryotic chromosomes contain variable amounts of DNA with no demonstrable function, both between genes and within genes in introns. As discussed later, much of this apparently nonfunctional DNA is composed of repetitious DNA sequences, some of which are never transcribed and most all of which are likely dispensable. The different classes of eukaryotic DNA sequences discussed in the following sections are summarized in Table 9-1.

For some reason, no creationist or Intelligent Design advocate ever seems to even acknowledge the existence of this observation, let alone discuss the implications. It's as if they are claiming to play baseball without knowing about first base. Unless an antievolutionist discusses the basic observation at the root of the "junk DNA" hypothesis, they are not even in the right ballpark.

Here is a figure showing the variability in genome size among various organisms:

None of the above is meant to imply that the junk DNA hypothesis is the only possibility. Cavalier-Smith, in particular, strongly advocates the "skeletal DNA" hypothesis. This hypothesis is based on the observation that the C-value (the technical term for the amount of DNA per cell) in a eukaryote species correlates strongly with cell volume:


According to the skeletal DNA hypothesis, then, more noncoding DNA is necessary for larger cells -- perhaps by more widely spacing the protein and RNA-coding genes, transcription can occur more rapidly, allowing the production of more proteins necessary for the larger cell. This is a perfectly reasonable hypothesis giving the "junk DNA" a function, but it is a far cry from allowing the antievolutionists to claim any kind of "design" or "language" for the noncoding DNA. The function under this hypothesis is just spacing, which would be almost completely insensitive to sequence.

It is worth pointing out that the junk DNA hypothesis can accommodate the C-value-to-cell-volume correlation as well, if cells with higher volume are more tolerant of junk DNA. The energetic cost of replicating a cell's genome, relative to the energetic cost of other cell processes, decreases dramatically as cell volume increases, so this is also a reasonable hypothesis. The hypotheses are testable (see articles and a book by Cavalier-Smith), so time will tell the relative worth of the junk DNA and skeletal DNA hypotheses. But it seems very unlikely that the antievolutionist dream of finding a "highly specified" function for most noncoding DNA will ever come to pass. Certainly, avoiding the major observations that led scientists towards these hypotheses will not garner antievolutionists any support from the biologists who actually know something about the issue.
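The tolerance argument can be made concrete with a toy calculation. This is only a sketch under loudly stated assumptions: that replication cost is proportional to genome size, and that a cell's total energy budget is proportional to its volume. The function name, parameters and arbitrary units are hypothetical and are not taken from Cavalier-Smith or any other source.

```python
def relative_replication_cost(genome_mb, cell_volume,
                              cost_per_mb=1.0, budget_per_unit_volume=1.0):
    """Toy model: cost of copying the genome relative to the cell's energy budget.

    Assumptions (illustrative only):
      - replication cost grows linearly with genome size (in Mb)
      - the total energy budget grows linearly with cell volume
    Units are arbitrary, so only comparisons between cells are meaningful.
    """
    replication_cost = genome_mb * cost_per_mb
    total_budget = cell_volume * budget_per_unit_volume
    return replication_cost / total_budget


# The same 100 Mb of extra noncoding DNA carried by cells of increasing volume.
burden = {volume: relative_replication_cost(100, volume)
          for volume in (1_000, 10_000, 100_000)}

for volume, cost in burden.items():
    print(f"cell volume {volume:>7}: relative cost of 100 Mb = {cost}")
```

Under these assumptions the relative burden of a fixed amount of extra DNA falls in direct proportion to cell volume, which is why larger cells could plausibly tolerate more junk.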

For further discussion see this IIEC thread.

Elimination of Conserved Junk: In 2004, researchers reported the creation of mice lacking two large non-coding regions ("gene deserts") totalling about 2.3 megabases of their genome.<ref>Marcelo A. Nóbrega, Yiwen Zhu, Ingrid Plajzer-Frick, Veena Afzal and Edward M. Rubin. 2004. Megabase deletions of gene deserts result in viable mice. Nature 431:988-993, 21 October 2004.</ref> The deleted sequences were composed entirely of non-coding DNA, but included over 1000 regions that were highly conserved with equivalent sequences in humans for stretches of over 100 bases. The resulting mice were healthy in every way assayed by the researchers, including longevity and reproduction. Thus, a substantial amount of junk DNA, including sequences that have been retained during mammalian evolution, is either dispensable or redundant with other regions of junk.

Some functions of non-coding DNA

There are some known functions, and some hypothesised functions, for bits and pieces of non-coding DNA.

Non-coding gene sections

Stretches of DNA do not have to directly encode amino acids to be part of a gene. There are a number of different classes of non-coding DNA which form part of genes, notably regulatory elements and introns. Because their functions are well understood, regulators and introns are generally not considered "junk" by scientists, but creationists often cite them as examples of junk DNA.

Regulatory elements are stretches of DNA associated with genes that assist the molecular machinery of gene expression in determining which genes, and which exons within those genes, should be expressed. Regulatory elements are essential, as without them the enzymes of gene expression would either be unable to bind to the DNA molecule to initiate transcription or would mis-transcribe the gene.

Introns are sections of non-coding DNA located between the exons (coding sections) of a gene. Introns are transcribed into the precursor mRNA, but they are spliced out before translation into protein. Many genes, particularly in eukaryotes, are made up of subunits coded for by different exons, separated by introns. The final protein molecule can be compared to a sentence, with the exons representing the words of that sentence. Often a gene contains alternative versions of one or more of the exons, giving the option of creating two or more similar but non-identical proteins from the same gene. Sequences within introns assist the molecular machinery (the spliceosome) in splicing the exons together prior to translation into a protein.
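The exon and intron arrangement described here can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a model of the spliceosome: the sequences are invented, the names `GENE`, `EXON_SLOTS`, `splice` and `alternative_transcripts` are hypothetical, and the only biological detail borrowed is that introns typically begin with GU and end with AG.

```python
from itertools import product

# A transcribed gene as an ordered list of (kind, sequence) segments.
# The sequences are made up; the introns start with GU and end with AG.
GENE = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "GAUCCA"),
    ("intron", "GUGAGUCCCCAG"),
    ("exon", "UUCUAA"),
]


def splice(segments):
    """Join the exons in their original order and discard the introns."""
    return "".join(seq for kind, seq in segments if kind == "exon")


def alternative_transcripts(exon_slots):
    """List every mature mRNA made by picking one version for each exon slot."""
    return ["".join(choice) for choice in product(*exon_slots)]


# The middle exon has two alternative versions; the others have one each.
EXON_SLOTS = [["AUGGCU"], ["GAUCCA", "GAUGGA"], ["UUCUAA"]]

print(splice(GENE))
print(alternative_transcripts(EXON_SLOTS))
```

With one alternative exon, the same gene yields two similar but non-identical messages, which is the "words of a sentence" picture in code form.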

However, introns do not take up much more of the genome than coding DNA does, and so do not provide a general function for "junk DNA" or the C-value paradox.

Non-essential DNA

Earlier it was argued that DNA whose sequence is not conserved, or which can be removed altogether without causing fatality, probably lacks function. However, the genome is highly complex, with interactions between many genes and many regulatory elements, while molecular biology and biochemistry are even more complex, with interactions between many enzymes and their products. Many items in these long and complex relationships may be non-essential but still have an adaptive function, e.g. speeding up a reaction or producing a non-essential but nevertheless useful molecule. The removal of such "non-essential" sequences would not appear to have an effect in the short term, but over evolutionary time, with the aid of recombination, selection will maintain their presence in the genome.

There is much redundancy in both the genetic code and many gene products, e.g. enzymes. Only a small section of the enzyme, the 'active site', takes part in the catalysis of a reaction. The rest of the protein structure of the enzyme may only be important in making sure the active site is the correct shape, and many enzymes may have stretches where the amino acid (and thus gene) sequence does not matter.

Many known regulatory elements have a short section of specific nucleotides, around 10-20, which must be conserved, but also include sections where the number of nucleotides matters while the specific sequence is irrelevant; these sections can therefore be subject to mutation with no effect. This may be the case for other non-coding DNA.
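A pattern with fixed motifs and a length-only spacer is easy to express as a regular expression. The sketch below is an illustration rather than a real promoter finder: it borrows the textbook E. coli consensus boxes (TTGACA and TATAAT, about 17 bases apart) purely as an example, requires exact matches that real regulatory elements do not, and uses a hypothetical function name.

```python
import re

# Two conserved motifs separated by exactly 17 bases of ANY sequence.
# Only the motifs are sequence-constrained; the spacer is length-constrained.
ELEMENT = re.compile(r"TTGACA[ACGT]{17}TATAAT")


def has_element(dna):
    """Return True if the strand contains motif + 17-base spacer + motif."""
    return ELEMENT.search(dna) is not None


spacer_a = "ACGT" * 4 + "A"                    # 17 bases
spacer_b = "G" * 5 + "C" * 5 + "A" * 5 + "TT"  # 17 bases, different sequence
short_spacer = "ACGT" * 3                      # 12 bases

print(has_element("TTGACA" + spacer_a + "TATAAT"))      # spacer sequence A
print(has_element("TTGACA" + spacer_b + "TATAAT"))      # mutated spacer, same length
print(has_element("TTGACA" + short_spacer + "TATAAT"))  # spacer too short
print(has_element("TTGGCA" + spacer_a + "TATAAT"))      # mutated conserved motif
```

Changing every base of the spacer leaves the match intact, while shortening the spacer or altering a single base of a conserved motif breaks it. That is the sense in which such stretches can mutate freely without losing their role.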

DNA stability and size

There is evidence that larger DNA molecules are more stable than shorter ones. Heating DNA causes the two strands of the double helix to separate; on cooling they reassociate, and during this time the DNA could become damaged. Studies on mammalian chromosomes (>1000 Mb) and the E. coli chromosome (~4 Mb) show that the complementary DNA strands of human chromosomes came together and realigned much more quickly than those of the E. coli chromosome. Junk DNA may therefore have a role in reducing the rate of mutation.

Additionally, repetitive DNA may be important in giving the DNA molecule stability in this situation, E. coli having very little repetitive DNA.


Pseudogenes

Pseudogenes are defined as sequences showing similarity to genes, but containing mutations precluding the production of functional products. Pseudogenes are therefore, by definition, non-functional junk. However, some sequences initially believed to be pseudogenes have been found to be transcribed, and to have potentially functional RNA products. A nitric oxide synthase (NOS) homologue found in snails was initially thought to be non-functional, but now appears to be involved in antisense-mediated RNA interference, i.e. it regulates the production of NOS by hybridising with the messenger RNA from the functional NOS gene. However, only 5-20% of "pseudogenes" are transcribed at all, so the majority of these sequences remain apparently functionless.<ref>Deyou Zheng and Mark B. Gerstein. 2007. The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they? Trends in Genetics 23:219-224, May 2007.</ref>

Antievolutionist claims


External links

  1. PZ's blog on pseudogenes
  2. "Junk" DNA, Towards a FAQ on "Junk" DNA
  3. An excellent post by Charlie D at ARN: http://www.arn.org/boards/ubb-get_topic-f-14-t-000365.html
  4. t.o. discussion
  5. C-value paradox -- Lecture notes at UC Davis.
  6. Junk DNA articles on the Panda's Thumb: [1] [2] [3].
  7. Junk DNA Portal Collection of Junk DNA webpages.
  8. "Junk" DNA not junk? October Scientific American at IIEC
  9. The DNA content of a cell at IIEC
  10. The Panda's Thumb on Pseudogenes
  11. "Function, non-function, some function: a brief history of junk DNA", at Genomicron.

Further reading

  1. Salem, Abdel-Halim, et al. 2003. Alu elements and hominid phylogenetics. Proc Nat Acad Sci 100(22):12787–12791, 28 October 2003.
  2. Knight, Jonathan. 2002. Evolutionary genetics: All genomes great and small. Nature 417(6887):374-376, 23 May 2002.
  3. Zuckerkandl, Emile, and Cavalli, G. 2007. Combinatorial epigenetics, 'junk DNA', and the evolution of complex organisms. Gene 390(1-2):232-242, 1 April 2007.
  4. Zuckerkandl, Emile. 2002. Why so many noncoding nucleotides? The eukaryote genome as an epigenetic machine. Genetica 115(1):105-129, May 2002.



