Junk DNA

‘But not all of the DNA sequence in our genome is used to make protein (perhaps less than 10%). There is a lot of DNA that is never used to make protein: we know what some of this DNA does, but not all. The bits of DNA we don't understand are often called 'junk DNA'. Much of this DNA is repeated sequences - as if a printer had made a mistake and scattered lots of copies of one page of a book throughout the story…Quite simply, we don't know what this DNA does. It may be important as a 'spacer' in the genome to make sure the active parts work properly. It may be important in evolution of complex animals such as humans.’ Your Genome, 2003

‘Repeats are often described as 'junk' and dismissed as uninteresting. However, they actually represent an extraordinary trove of information about biological processes. The repeats constitute a rich palaeontological record, holding crucial clues about evolutionary events and forces. As passive markers, they provide assays for studying processes of mutation and selection. It is possible to recognize cohorts of repeats 'born' at the same time and to follow their fates in different regions of the genome or in different species. As active agents, repeats have reshaped the genome by causing ectopic rearrangements, creating entirely new genes, modifying and reshuffling existing genes, and modulating overall GC content. They also shed light on chromosome structure and dynamics, and provide tools for medical genetic and population genetic studies. The human is the first repeat-rich genome to be sequenced, and so we investigated what information could be gleaned from this majority component of the human genome. Although some of the general observations about repeats were suggested by previous studies, the draft genome sequence provides the first comprehensive view, allowing some questions to be resolved and new mysteries to emerge….The age distribution of the repeats in the human genome provides a rich 'fossil record' stretching over several hundred million years. The ancestry and approximate age of each fossil can be inferred by exploiting the fact that each copy is derived from, and therefore initially carried the sequence of, a then-active transposon and, being generally under no functional constraint, has accumulated mutations randomly and independently of other copies. 
We can infer the sequence of the ancestral active elements by clustering the modern derivatives into phylogenetic trees and building a consensus based on the multiple sequence alignment of a cluster of copies.’ International Human Genome Sequencing Consortium, Nature, 2001

‘There are, on average, around 12 genes per million bases of human DNA, compared with 117 in fruit flies, 197 in roundworms and 221 in Arabidopsis. Finding genuine genes amid the morass of meaningless DNA has proven a sore trial to current computer software. Another reason human genes are hard to detect is that, with other creatures' genes, they are highly fragmented.’ Henry Gee, Nature, 2001

‘But why not cut out the middle man? A viral genome could drop most of the virus’s genes and keep just the reverse transcriptase gene. Then this streamlined parasite could give up the laborious business of trying to jump from person to person in spit or during sex, and instead just hitchhike down the generations within its victims’ genomes. A true genetic parasite. Such ‘retrotransposons’ are far commoner even than retroviruses. The most common of all is a sequence of ‘letters’ known as LINE-1. This is a ‘paragraph’ of DNA between a thousand and six thousand ‘letters’ long, that includes a complete recipe for reverse transcriptase near the middle. LINE-1s are not only very common – there may be 100,000 copies of them in each copy of your genome – but are also very gregarious, so that the paragraph may be repeated several times in succession on the chromosome. They account for a staggering 14.6% of the entire genome…the implications of this are terrifying…Even commoner than LINE-1s are shorter paragraphs called Alus. Each Alu contains between 180 and 280 ‘letters’, and seems to be especially good at just using other people’s reverse transcriptase to get itself duplicated. The Alu text may be repeated a million times in the human genome – amounting to perhaps 10% of the whole book…. the typical Alu sequence bears a close resemblance to a real gene… This gene, unusually, has what is called an internal promoter, meaning that the message ‘READ ME’ is written in a sequence in the middle of the gene. It is thus an ideal candidate for proliferation…each Alu gene is probably a ‘pseudogene’…rusting wrecks of genes that have been holed below the waterline by a serious mutation and sunk. They now lie on the bottom of the genomic ocean, gradually growing rustier, (that is, accumulating more mutations) until they no longer even resemble the genes they once were.’ Matt Ridley, Genome: The Autobiography of a Species in 23 Chapters, Fourth Estate, 2000

‘Actual exons take up as little as 1 per cent of the genome, while introns account for 24 per cent. More than half of the euchromatic genome consists of repeat sequences, with the vast majority (45 per cent) accounted for by repeats derived from 'parasitic DNA', called transposable elements or transposons. These elements propagate by replicating and then inserting a new copy of themselves into another site in the genome. The sheer numbers of repeated elements is unprecedented in any other sequenced genome: repeats account for just 1.5 per cent of a typical bacterial genome and 3 per cent of fly euchromatin.’ Richard Gallagher and Carina Dennis, A Repetitious Genome, Wellcome Trust


97% of the three billion letters of the Human Genome is described as ‘junk DNA’; only 3% of our DNA appears to code for proteins

No wonder we talk rubbish, oxymoronic bollocks, gobbledygook -

splurge out non sequiturs, DVD/ IKEA instructions, a Jeffrey Archer book;

we’re mostly made of nonsense –

unsolvable even by hundreds of boffins in conference.

Bad enough to be the same as a jellyfish essentially -

slug, amoeba, virus - President George W Bush, a garden pea,

but stuffed with genetic parasites, meaningless repeats –

‘retrotransposon LINE-1’, thousands of letters long, logic defeats;

a hundred thousand copies doing nothing for us at all,

just multiplying like mad, having a ball -

another ten per cent, a million ‘Alus’ -

probable pseudogene that pays no dues;

a useless ‘READ ME’ written in code, just for the thrill,

nonsense gene like something dreamt up by Lewis Carroll…

We’re a bucket of junk, imperfect deletions -

irrelevant text, bungles, mistakes, inaccurate completions;

35% DNA classed as ‘selfish’ -

and worse, quite at home in a lab, in a petri dish -

ten per cent of our genes derived from bacteria! -

enough to cause the hygiene-conscious a bout of hysteria;

not to mention the remains of several thousand viruses -

came to live as part of us, instead of being our nemesis.

But, hmmmm, come to think of those that did me wrong -

for no other reason than being female, talented, not strong…       


yeah, when I think of those rotten mothers,

the bacteria thing is easier to understand in some cases, than others.


A typo that mutated

into a pleasing idea.

‘A collection of mystery DNA segments, which seem to be critical for the survival of many animals, are causing great interest among scientists. Researchers inspecting the genetic code of rats, mice and humans were surprised to find they shared many identical chunks of apparently "junk" DNA. This implies the code is so vital that even 75 million years of evolution in these mammals could not tinker with it. But what the DNA does, and how, is a puzzle, the journal Science reports. Before scientists began laboriously mapping several animal life-codes, they had a rather narrow opinion about which parts of the genome were important.’ BBC, 2004

Such a human response - naming something misunderstood

in Nature as ‘junk’, like the whiskery miracle of brother rat,

dismissed as ‘vermin’; those dazzling savages, urban seagulls,

as ‘flying rats’, just for cunningly adapting to our inland filth -

likewise the fox gorgeously hunting evening bins, rubbish tips -

sudden russet flash of him electrifying urban streets; his beauty

called ‘nuisance’, ‘parasite’. And despite stunning mechanisms,

their sampling, fingery feelers subtle connoisseurs of air, space,

environment, ant and woodlouse, spider, called ‘creepy-crawlies’,

‘bugs’ - stood on – a million years of Evolution crushed in each

crunched exoskeleton; or silly screamers at the tiny flickering black

brilliance of starry bats, singing invisibly higher than piccolo stars -

pouring poison on the weed that is the Mother of Flowers;

humble in the yard, yet still exhibiting her learning of light.

‘According to the traditional viewpoint, the really crucial things were genes, which code for proteins - the "building blocks of life". A few other sections that regulate gene function were also considered useful. The rest was thought to be excess baggage - or "junk" DNA. But the new findings suggest this interpretation was somewhat wanting. David Haussler of the University of California, Santa Cruz, US, and his team compared the genome sequences of man, mouse and rat. They found - to their astonishment - that several great stretches of DNA were identical across the three species. To guard against this happening by coincidence, they looked for sequences that were at least 200 base-pairs (the molecules that make up DNA) in length. Statistically, a sequence of this length would almost never appear in all three by chance. Not only did one sequence of this length appear in all three - 480 did. The regions largely matched up with chicken, dog and fish sequences, too; but are absent from sea squirt and fruit flies… He thinks the most likely scenario is that they control the activity of indispensable genes and embryo development. Nearly a quarter of the sequences overlap with genes and may help slice RNA - the chemical cousin of DNA involved in protein production - into different forms, Professor Haussler believes. The conserved elements that do not actually overlap with genes tend to cluster next to genes that play a role in embryonic development… The next step is to pin down a conclusive function for these chunks of genetic material. One method could be to produce genetically engineered mice that have bits of the sequences "knocked out". By comparing their development with that of normal mice, scientists might be able to work out the DNA's purpose. Despite all the questions that this research has raised, one thing is clear: scientists need to review their ideas about junk DNA.’ BBC, 2004

“It absolutely knocked me off my chair. It's extraordinarily exciting to think that there are these ultra-conserved elements that weren't noticed by the scientific community before. The really interesting thing is that many of these ‘ultra-conserved’ regions do not appear to code for protein. If it was not for the fact that they popped up in so many different species, they might have been dismissed as useless padding. But whatever their function is, it is clearly of great importance. We know this because ever since rodents, humans, chickens and fish shared an ancestor - about 400 million years ago - these sequences have resisted change. This strongly suggests that any alteration would have damaged the animals' ability to survive. These initial findings tell us quite a lot of the genome was doing something important other than coding for proteins. The fact that the conserved elements are hanging around the most important development genes, suggests they have some role in regulating the process of development and differentiation.” Professor David Haussler, University of California, Santa Cruz, US

"Amazingly, there were calls from some sections to only map the bits of genome that coded for protein - mapping the rest was thought to be a waste of time. It is very lucky that entire genomes were mapped, as this work is showing – I think other bits of 'junk' DNA will turn out not to be junk. I think this is the tip of the iceberg, and that there will be many more similar findings." Professor Chris Ponting, Functional Genetics Unit, Medical Research Council, UK

Revealing the acute artistry of Nature;

even she works blank canvas, frame -

shadows - surrounding air and light,

darkness; the molecular components

of pigment are significant - affecting

the sublime whole; evolving, ongoing

masterpiece hung in the world home,

where artistic thought, first principle -

exercising of the spirit of creativity,

will muster tools, expression, spells,

whose nature belongs to alchemy

of leaves, manufacture of the eye

from dust and light - the shining there -

who might yet understand such process.

‘HUMAN CHROMOSOME 18 COMPLETED - There are not many genes on human chromosome 18, especially in comparison to its 23 siblings. Yet chromosome 18 is far from a genomic backwater. Indeed, as described by a research team led by Broad scientists…evolution has worked hard to conserve a surprisingly large number of non-gene regions of the chromosome. This discovery offers tantalizing evidence that 'extra' sequence plays an important role in genome function. It was known that chromosome 18 has a low density of protein-coding genes, which may explain why fetuses carrying three copies of chromosome 18 can survive to birth (although with devastating consequences). Most protein-coding gene sequences are highly conserved among mammals, but since they represent only about 1.5 per cent of the genome, they are not sufficient for understanding how the genome works. The non-genic sequences are undoubtedly also important, but these are poorly understood. It is known that about 3 per cent of the human genome that is not in genes is highly conserved among mammals, but to what end is still mysterious.’ Wellcome Trust, 2005

Deeper yet we must go, into the unknown -

wild recesses, jungle, rainforest, mistaken

for wasteground; stillness of battlegrounds.

What strange blooms there, partial remains

still emanating something profound Nature

sees fit to retain, re-use in incalculable ways -

genomic earth, amalgam, compound, compost;

contributing elements, help, molecular fertiliser.

‘The purpose of the 97 per cent of 'junk' DNA is being discovered. We have got stronger hints than before that the repeat family called Alu may play some important function. We have always suspected that we couldn't simply divide the genome into 3 per cent of good stuff (genes) and 97 per cent of junk. Here we are beginning to see some of the functions of the 'junk'. Exactly as one would expect, the junk has a function – rather more diffuse than the hard information carried by the genes, but nevertheless functional in some way. It may help to move genes around.’ Wellcome Trust
