The Lurker: How A Virus Hid In Our Genome For Six Million Years

In the mid-2000s, David Markovitz, a scientist at the University of Michigan, and his colleagues took a look at the blood of people infected with HIV. Human immunodeficiency viruses kill their hosts by exhausting the immune system, allowing all sorts of pathogens to sweep into their host’s body. So it wasn’t a huge surprise for Markovitz and his colleagues to find other viruses in the blood of the HIV patients. What was surprising was where those other viruses had come from: from within the patients’ own DNA.

HIV belongs to a class of viruses called retroviruses. They all share three genes. One, called gag, gives rise to the inner shell where the virus’s genes are stored. Another, called env, makes knobs on the outer surface of the virus that allow it to latch onto cells and invade them. And a third, called pol, makes an enzyme that inserts the virus’s genes into its host cell’s DNA.

It turns out that the human genome contains segments of DNA that match pol, env, and gag. Lots of them. Scientists have identified 100,000 pieces of retrovirus DNA in our genes, making up eight percent of the human genome. That’s a huge portion of our DNA when you consider that protein-coding genes make up just over one percent of the genome.

Scientists have studied these so-called endogenous retroviruses both in humans and in other species, and the evidence all points to the same scenario for how they genetically merged with us. Our ancestors were infected with retroviruses on a regular basis. On rare occasion, a virus infected a sperm or egg and managed to end up in an embryo. Every new cell in the embryo inherited the retrovirus DNA implanted in its genome. And then the embryo grew up into an adult, which then had offspring of its own, and passed the virus DNA on as well.

At first, the virus still retained some of its old powers. Its DNA could sometimes still give rise to new viruses. Mutations arose in the viral genes, and they might prevent it from making shells. Yet the dying virus could still make a new copy of its genes and insert them back into its host genome. That would explain why it’s possible to classify our many endogenous retroviruses into different families. The families are made up of new copies of an ancestral virus.

Eventually, however, the endogenous retroviruses got so hobbled by mutations that they became nothing more than baggage. (In some cases, we’ve domesticated their genes, co-opting them for our own functions, such as building a placenta.) Given that many matching endogenous retroviruses can be found in other primates, this process has been going on for millions of years–even tens of millions.

The world of our inner viruses is a murky, mysterious one that scientists are still surveying. And Markovitz’s discovery enabled him to add considerably to our understanding of these shadowy creatures. He discovered new members of a particularly interesting class of endogenous retroviruses–ones that, even today, can still have life breathed into them.

Markovitz and his colleagues analyzed the sequence of the virus genes they found in the patients with HIV. The genes belonged to a family of endogenous retroviruses called HERV-K, but they were not quite like any known HERV-K virus previously found.

The Michigan scientists wondered if this new HERV-K virus was hidden in the human genome. They checked the most complete draft of the human genome and couldn’t find a match. They knew that the human genome sequence was only about 95% finished, so they turned instead to the chimpanzee genome, on the off chance that the virus had infected the common ancestor of humans and chimpanzees over six million years ago. Bingo: a single copy of the virus turned up in the chimp genome. They dubbed it K111.

Having found this match, the scientists decided to return to the human genome and search for K111. They isolated DNA from their HIV patients, as well as from healthy people. They then split apart the two strands of the DNA and added a short piece of DNA that would bind to K111, should it be lurking there. In all 189 of their subjects, the scientists found the virus’s DNA.

Remarkably, though, the scientists didn’t find just one copy of K111 in each of their subjects’ genomes, as is the case in chimps. The more the scientists looked, the more variants they found. Some K111 viruses were fairly intact, while others were vestiges. The scientists found over 100 copies of the virus in the human genome, scattered across fifteen chromosomes.

To figure out the origin of K111, the scientists looked back at other primates. They couldn’t find a version of K111 in any species other than chimpanzees. They concluded that the virus infected our ancestors not long before the split between humans and chimpanzees roughly six million years ago.

To find out what happened next, Markovitz and his colleagues turned to the genomes of extinct humans. Svante Paabo of the Max Planck Institute and his colleagues have sequenced the Neanderthal genome, as well as the genome of a lineage of mysterious cousins of Neanderthals, known as Denisovans. Our own ancestors diverged from those of Neanderthals and Denisovans about 800,000 years ago. Markovitz and his colleagues looked for K111 in their genomes, and there it was. The scientists found seven copies of K111 in Neanderthal DNA and four in the Denisovan genome.

This finding suggests that between 6 million and 800,000 years ago, K111 was duplicated a few times at a fairly slow pace. It’s possible that Markovitz and his colleagues missed some other copies because the reconstruction of those ancient genomes wasn’t quite accurate enough for their search. But even if we generously assumed that Neanderthals and Denisovans had twenty K111 viruses apiece, that’s still a small fraction of the 100 or more copies of K111 the scientists found in the human genome. It was only later, in the past 800,000 years, that K111 started proliferating at a faster pace.
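To make the difference in pace concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above: the six-million-year and 800,000-year dates, the generous ceiling of twenty ancient copies, and the lower bound of 100 modern copies. The starting point of a single copy (as seen in chimps) and the function name are illustrative assumptions, and the results are long-term averages, not measured rates.

```python
def copies_per_million_years(copies_gained, span_years):
    """Average number of new copies per million years over a time span."""
    return copies_gained * 1_000_000 / span_years


# Dates quoted in the article (approximate).
CHIMP_SPLIT_YEARS_AGO = 6_000_000
NEANDERTHAL_SPLIT_YEARS_AGO = 800_000

# Assumption: the lineage started with one copy, as in the chimp genome.
STARTING_COPIES = 1
# The article's deliberately generous ceiling for the ancient genomes.
GENEROUS_ANCIENT_COPIES = 20
# "Over 100" copies in living humans, so this is a lower bound.
MODERN_COPIES_LOWER_BOUND = 100

early_span = CHIMP_SPLIT_YEARS_AGO - NEANDERTHAL_SPLIT_YEARS_AGO
early_rate = copies_per_million_years(
    GENEROUS_ANCIENT_COPIES - STARTING_COPIES, early_span
)
late_rate = copies_per_million_years(
    MODERN_COPIES_LOWER_BOUND - GENEROUS_ANCIENT_COPIES,
    NEANDERTHAL_SPLIT_YEARS_AGO,
)

print(f"early: about {early_rate:.1f} new copies per million years")
print(f"late:  about {late_rate:.1f} new copies per million years")
print(f"speed-up: at least {late_rate / early_rate:.0f}-fold")
```

Even with the generous ancient count, the average pace in the last 800,000 years comes out more than twenty times faster than in the five million years before it.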

One reason that K111 has gone overlooked till now is that it found a good place to hide–the center of chromosomes. This region, called the centromere, is a genomic Bermuda Triangle. It’s loaded with lots of short, repetitive stretches of DNA. When scientists reconstruct the sequence of a genome, they break DNA down into many overlapping segments, which they then try to rebuild based on overlapping similarities. Centromere DNA is so similar to itself that it’s easy to line up fragments in many different arrangements. As a result, centromeres make up much of the last 5% of the human genome that has yet to be mapped.
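A toy sketch in Python shows why repeats cause trouble. The two sequences below are made up for illustration (real centromere repeats are far longer), and `placements` is a hypothetical helper, not part of any sequencing tool. It simply lists every position where a fragment fits.

```python
def placements(fragment, sequence):
    """Return every start position where fragment occurs in sequence."""
    positions = []
    start = sequence.find(fragment)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping occurrences are found too.
        start = sequence.find(fragment, start + 1)
    return positions


# Made-up stretch of ordinary, non-repetitive DNA.
ordinary_dna = "ACGTTGCAAGCTAGGCTTAACGGATC"
# Made-up repetitive stretch: the same five letters over and over.
repetitive_dna = "AATGG" * 6

ordinary_fragment = ordinary_dna[5:13]      # an 8-letter read
repetitive_fragment = repetitive_dna[3:11]  # an 8-letter read

print(placements(ordinary_fragment, ordinary_dna))      # [5]
print(placements(repetitive_fragment, repetitive_dna))  # [3, 8, 13, 18]
```

The fragment from ordinary DNA fits in exactly one place, so it can be positioned with confidence. The fragment from the repetitive stretch fits in four places, and nothing in the fragment itself says which one is right.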

Another reason K111 has been able to hide for so long is that it’s fairly feeble. It lacks genes to make shells, so it can’t escape from its host cells any more. In fact, it was our own centromeres that appear to have made all the extra copies of K111. The repeating DNA in centromeres is not just tricky for human gene sequencers. It’s also tricky for the enzymes in a cell that make new copies of our DNA. They can slip up and accidentally swap segments from two chromosomes. K111 was thus able to spread from the centromere of one chromosome to another. Our cells also stutter sometimes when they try to copy centromere DNA, making extra copies of segments there. Markovitz and his colleagues argue that this is how new copies of K111 proliferated within each centromere.

Ironically, it was the HIV in the patients Markovitz and his colleagues studied that brought K111 back to light. When people get infected with HIV, the virus makes a protein called Tat that uncoils tightly wound stretches of human DNA, allowing its host cell to make more HIV at a faster rate.

Markovitz and his colleagues wondered if the Tat in their HIV-infected patients was spurring cells to also make copies of K111. To find out, they injected Tat proteins into human cells that were free of HIV. As they predicted, out came new genes for K111.

It’s conceivable that K111 interacts with HIV to contribute to AIDS, but Markovitz and his colleagues found no evidence of that. It’s certainly worth investigating further. But there’s another reason to keep learning about K111. Now that scientists have discovered K111, they can look for more copies of it in centromeres. Markovitz suggests that their distinctive genes might serve as a kind of genetic barcode that could help genome mappers orient themselves in the hall of mirrors that is centromere DNA. Perhaps the human genome sequence will finally be completely mapped thanks to a virus that has been hiding in it for six million years.
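The barcode idea can be sketched with a toy example in Python. Everything here is made up for illustration: the repeat unit, the stand-in "viral" marker, and the helper name are assumptions, and real K111 sequences are thousands of letters long. The point is only that a fragment touching a distinctive insert can be placed in one spot, while a fragment of pure repeat cannot.

```python
def find_all(fragment, sequence):
    """List every start position where fragment occurs in sequence."""
    last_start = len(sequence) - len(fragment)
    return [i for i in range(last_start + 1) if sequence.startswith(fragment, i)]


REPEAT_UNIT = "AATGG"      # made-up repeat, not a real centromere sequence
VIRAL_MARKER = "CTCAGTTC"  # hypothetical stand-in for a distinctive K111 stretch

# A repetitive region with one distinctive insert in the middle.
toy_centromere = REPEAT_UNIT * 4 + VIRAL_MARKER + REPEAT_UNIT * 4

repeat_only_fragment = "GGAATGGA"        # lies entirely within the repeats
junction_fragment = toy_centromere[17:25]  # spans repeat and marker

repeat_only_hits = find_all(repeat_only_fragment, toy_centromere)
junction_hits = find_all(junction_fragment, toy_centromere)

print(len(repeat_only_hits), "possible placements for the repeat-only fragment")
print(len(junction_hits), "possible placement for the junction fragment")
```

In this sketch the fragment that overlaps the marker has a single possible position, which is the sense in which a distinctive viral sequence could act as a landmark.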

(For more information, see my book, A Planet of Viruses.)

28 thoughts on “The Lurker: How A Virus Hid In Our Genome For Six Million Years”

  1. the significance for people like me who don’t believe in evolution, is that this offers a mechanism, in initial understanding, of why random mutation, with its mathematical impossibilities, does not work, and how evolution works via adaptive mutation, just a start, but a definite start

  2. @jack: Way to miss the forest by looking at too many trees. K111 explains how micro changes to DNA cause point-in-time mutation but does not account for mass species extinction and environmental stressors that give rise to massive shifts in adaptation — i.e. the circumstances that give rise to the evolutionary process.

  3. Random mutations have no mathematical impossibility. It has been observed and well documented. Pseudoscience makes me sad.

  4. This discovery of K111 is only the beginning of understanding the way the human genome and its centromere DNA adapts, mutates and may be responsible for the misconception of mathematical impossibilities. Excellent article, keep the research moving Markovitz!

  5. Fascinating article. Thank you for sharing. No, it is not a mathematical impossibility, the correct term is statistical impossibility.

  6. This article begins a new era in the world of medical science. Thanks to all the researchers for this research. And we have to protect ourselves from these viruses. For this reason we should maintain our own religious activities.

  7. Excellent post. This is the best article written on the subject that I have seen anywhere, and I’ve been looking all around. When you combine the chronology of retrovirus duplication with the chronology of srGAP2 duplication, you end up with a timeline that matches human evolution almost perfectly. I’m convinced that viruses affecting srGAP2 are what made us the intelligent creatures we are today.

    It also makes me wonder who is really in charge of this body of mine — me or the viruses that inhabit it? Do I have free will or am I behaving the way my viruses want me to? It’s starting to feel like viruses are calling all the shots when you get down to the bottom of things. There’s a science fiction movie script here for someone.

  8. Judy, yes, that collection of organisms that is you is most likely guided by “cooperative viruses” rather than selfish genes. Free will – not a chance.

  9. Markovitz didn’t investigate further the K111 interaction with HIV and (?) contribution to AIDS, that’s the missing piece of work that would perhaps lead to the groundbreaking discovery that viruses don’t invade an HIV patient’s body from outside, but rather from the patient’s own genome. Kaposi sarcoma, Herpes viruses, and other viruses may be the dormant mummies that are awakened by genetic perturbations likewise in AIDS patients.

  10. Hi, Jack. I’m a Christian and believe God made the world through evolution. I don’t understand the distinction you’re trying to make between small adaptive mutations in response to small forces and large-scale adaptations in response to major environmental or geological forces. Adaptation/mutation in either case is still evolution, and neither proves nor disproves God’s involvement in the unfolding story of Creation. I encourage you to consider other theological positions that embrace a God whose authority and power aren’t dependent on our ability to distinguish between ‘micro-evolution’ and ‘macro-evolution.’

  11. This is amazing! We are carrying all those viruses in our body so if we would be able to get rid of all of them including everything what we acquired from ancestors, I’m sure we could live in health over 2-300 years! With no involvement of viruses, bacteria etc our body could be so strong,no mutations no kids born with disabilities!!! Perfect world!

  12. @aggy
    while viruses are not needed, without bacteria we can’t survive
    bacteria are the ones that keep us strong; they live in our stomach and help it digest, turning complex food into simple nutrients so that our body can absorb even complex food. In return, our body gives the bacteria the food we don’t want, and the immune system marks them as friendly

  13. So we have been carrying around a virus for 6 million years that has been mutating in our dna like gangbusters since we separated from chimps? But that same virus has failed to mutate in chimps in the last 6 million years. That doesn’t give scientists pause to think that perhaps this dormant code was not inherited because why? Darwinism is never scrutinized by its advocates. The Bible is not a textbook but at least most of its admirers accept it as an allegory for something they don’t understand. Perversely science which is supposed to question everything fails to make even the slightest inquiry when the evidence may alter their world view. Galileo would be appalled at this reversal.

  14. If k111 proliferation is characteristic to humans, maybe it helped make us what we are. It would be interesting to know what the eradication of k111 from our genome would do to us.

  15. The excitement I received from this article is immeasurable. The fact that we have discovered a dormant accomplice to HIV’s terror within our own genome makes me wary of the other terrors contained within myself. Also Judy, concerning your thoughts on viral dna and our behavior. Perhaps the lack of cooperation, compatibility, and interaction between our dormant viral dna is what gives us free will. Imagine if viruses really DID direct us. I imagine humans being emotionless and empty, more mechanical than worker drones in a bee hive. We would constantly be in close contact in order to spread and obtain new viruses, and would domesticate life that we could infect, and capture life we couldn’t. Being run by mindlessly driven and straightforward viruses, without the notion of culture, emotion, or even technology, we would be akin to the Borg of Star Trek, but without technology or cybernetic modifications, and genetic ones instead. Still retaining the cold, logical nature, but with no technology to supplement it. Of course, perhaps the minds inhabiting these vassals THEMSELVES are viruses whose various deviations in genetics, sequence, and number have resulted in the large cosmetic and mental diversity of the world today. Perhaps we were instinctively driven to commit racial acts of terror such as the crusades and wars of the past, sensing radically different viral populations within humans of other lands, and thus feeling the need to eradicate or mingle with them, justifying and hiding these decisions through politics, religions, prejudices, and other social constructs. Of course, I believe the possibility of viruses infecting and puppeting us to such a large extent to be highly unlikely, but perhaps viruses do play a part in our social interactions, at least on an instinctive level.

  16. Did artistic intuition precede this by a few years?

    “Doktor Kurt Unruh von Steinplatz has put forth an interesting theory as to the origins and history of this word virus. He postulates that the word was a virus of what he calls BIOLOGIC MUTATION effecting the biologic change in its host which was then genetically conveyed. One reason that apes cant talk is because the structure of their inner throats is simply not designed to formulate words.

    He postulates that alteration in inner throat structure were occasioned by virus illness … And not occasion … This illness may well have had a high rate of mortality but some female apes must have survived to give birth to the wunder kindern. The illness perhaps assumed a more malignant form in the male because of his more developed and rigid muscular structure causing death through strangulation and vertebral fracture. Since the virus in both male and female precipitates sexual frenzy through irritation of sex centers in the brain the males impregnated the females in their death spasms and the altered throat structured was genetically conveyed. Having effected alterations in the host’s structure that resulted in a new species specially designed to accomodate the virus the virus can now replicate without disturbing the metabolism and without being recognized a virus. A symbiotic relationship has now been established and the virus is now built into the host which sees the virus as a useful part of itself.”

  17. I have two questions:
    1. Will HIV eventually become as benign as K111?
    2. Do we know where in the Human Genome HIV inserts its genetic code? Is it in the same “Centromere” area?

  18. My main interest in this discovery is wondering what effect the K111 virus has in infected hosts. Does it produce symptoms? How might it have spread from one host to another?

    Along more speculative thoughts, I wonder how viruses influence human behavior. Could viruses actually make it possible or impossible for some people to reason in certain ways? Could viruses make some people prone to voting Democratic?

    @ Chris Steele 1. Perhaps. As I recall, chimps show evidence that they were massively infected with the simian version, that led to the near extinction of the species. However, a few chimps developed an immunity to the virus, and now do not show symptoms when infected.

  19. Just echoing Andrei – what would we get from removing all of K111 from our genome? Someone is going to have to test this and find out, somewhere, sometime.

    1. Interesting, BUT
      try to look at it the other way.

      The human genome assimilated the virus and not the other way around.

      This could be a way to recognise the invaders and develop an immunity to this virus. (Kind of know your enemies process)

      HIV may be messing with this process and letting the dormant virus reactivate itself.

  20. I think it all started with one virus. More virus joined the party over time. And we are just means of virus replication. Most viruses don’t need to make shell as we are the shell.

    I don’t know I am feeling lil depressed 🙂 where’s my Book.

  21. Read your book “A Planet of Viruses”. It’s a great article. I am looking forward to presenting a seminar at university level on this particular topic.

    thank you

  22. Over the next 20 or so years, humans will advance greatly in their understanding of every part of the human body right down to the genome, to know about how and why genes do what they do and how to successfully edit them. About half of the information will be very useful, but as for the other half, that will just scare the crap out of most of us, even the ones that understand this emerging science. I like this very informative and intriguing article, but I would rather be ignorant of this worrisome knowledge because ignorance is truly bliss. Not a thought in your head or a care in the world is preferred because too much knowledge is sometimes a very bad thing.

