A Blog by Carl Zimmer

Redrawing the Tree of Life

In 1837, Charles Darwin scribbled a simple tree in a notebook and scrawled above it, “I think.”

That little doodle represented a big idea: that species were descended from common ancestors. They looked different from each other today thanks to the differences that evolved after their lineages split.

It wasn’t until 1859 that Darwin presented this idea–buttressed by hundreds of pages of argument and evidence–to the public, in his book On the Origin of Species. He included a tree-like diagram in the book to illustrate his concept of how species evolved over time.

In neither of these two pictures did Darwin actually use the names of real species. But as biologist Theodore Pietsch explains in his wonderful new book, Trees of Life: A Visual History of Evolution, Darwin did try to map the kinship of some real species. In 1868, for example, he sketched a tree with humans on one branch and other primates on the others.

Generations of evolutionary biologists have continued to draw more extensive ones–trees that encompass not just primates, not just mammals, not just animals, but all living things.

A tree of life is a visual hypothesis. It’s a statement about how a scientist thinks species are related to one another, an arrangement that best explains the data the scientist can analyze. Those data grow over the years, as scientists find new species, as they find new methods for comparing more species at once, and as they find new things in those species to compare. And along the way, new hypotheses replace old ones.

Darwin could only compare humans to other primates based on their anatomy. In the mid-1900s, scientists began opening up a new lode of information to mine: DNA. There’s not much anatomy you can use to compare E. coli to a mountain lion, but both species share a number of genes, with each species carrying its own modified version.

In the 1970s, Carl Woese of the University of Illinois and his colleagues started comparing bits of genetic material across a vast span of species, and drew an entire tree of life. It displayed a stunning large-scale structure. All life, they found, belonged to three great branches. And the next couple decades only strengthened the hypothesis that life belonged to three great domains. This tree, from 1997, is the work of Norman Pace of the University of Colorado.

One branch was our own, the eukaryotes. This branch includes animals, plants, fungi, and protozoans. Eukaryotes share a lot in common. For example, our DNA is packed in a nucleus, where it’s coiled up into packages called chromatin.

Another branch was the bacteria. This familiar lineage includes E. coli and lots of microbes you haven’t heard of. They have no nucleus, and they copy their genes with enzymes not found in eukaryotes.

And the third branch came out of the blue: the archaea. The species that Woese put on this branch were thought to be no different from bacteria, except that they were odd methane-producing microbes that turned up in swamp bottoms and in other nasty places. When Woese and others compared these species, they discovered some unique traits, such as certain types of molecules in their membranes. (In later years, scientists have found archaea in lots of environments, from the open oceans to our bodies, so they don’t deserve their initial reputation for extreme living.)

The three-domain tree continued to gain support as the years passed, and as more species came to light. But things, as they often do, got complicated.

For one thing, it became clear that some genes were not staying on their branches. DNA from one species can sometimes be delivered to the genome of another. That’s how antibiotic resistance spreads from microbe to microbe in your gut, for example. Huge portions of the genomes of species like E. coli were acquired from other lineages.

Some scientists have argued that this transfer of genes obliterates the tree-like structure of evolution. But a lot of the scientists I’ve talked to don’t go so far. They think of those shuttling genes as cars taking side roads to move from one superhighway to another. The flow of traffic still runs recognizably down the interstates.

Meanwhile, a second complication emerged. Maybe there were not three big branches, but just two. James Lake of UCLA first proposed this idea in 1984. He examined the protein-making factories of cells, called ribosomes. The ribosomes of eukaryotes were more similar to those of some kinds of archaea than to those of others. That suggested to Lake a close kinship. In other words, we are just another branch of the archaea.

The latest test of this idea appears in the December issue of the Proceedings of the Royal Society (open access). It was carried out by Martin Embley of the University of Newcastle and his colleagues. They included some newly discovered archaea that are quite different from previously known species. They compared 41 protein sequences from all the species, as well as 64 genes among the archaea and the eukaryotes. Instead of just building a single tree, they constructed many trees from the genes and proteins and then compared them to each other to find the best fit to the data. They consistently found that eukaryotes fit best within the archaea, not on a separate branch.

Tom Williams, the lead author on the new paper, kindly made this figure to show the two alternatives they tested. We are on the green branch, the eukaryotes. The blue rectangle indicates the major branches of the archaea. “TACK” refers to four of those lineages. In the three-domain hypothesis, on the left, our ancestors split off from the ancestors of archaea. Williams, Embley, and their colleagues rejected that hypothesis and put eukaryotes within the Archaea, most closely related to the TACK microbes. It’s not clear which of them are our closest relatives, but the scientists are emphatic about one thing: there are not three domains of life.
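For readers who like to see the logic spelled out, here is a minimal Python sketch of the difference between the two hypotheses. It is not the authors' analysis, which works from real sequence data. The two trees below are toy versions I have assumed for illustration, with only four tips each and with "TACK" collapsed into a single tip. The helper names (`leaves`, `clades`, `is_monophyletic`) are hypothetical. The sketch simply asks whether the archaea form a complete branch of their own in each rooted tree.

```python
# Toy rooted trees written as nested tuples; a string is a tip.
# These simplified topologies are assumptions for illustration only.
THREE_DOMAINS = ("Bacteria", (("Euryarchaeota", "TACK"), "Eukaryotes"))
TWO_DOMAINS = ("Bacteria", ("Euryarchaeota", ("TACK", "Eukaryotes")))

ARCHAEA = {"Euryarchaeota", "TACK"}


def leaves(tree):
    """Return the set of tip names under this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node in the tree, including the root."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if some single branch contains exactly the tips in `group`."""
    return frozenset(group) in set(clades(tree))


if __name__ == "__main__":
    print(is_monophyletic(THREE_DOMAINS, ARCHAEA))                 # True
    print(is_monophyletic(TWO_DOMAINS, ARCHAEA))                   # False
    print(is_monophyletic(TWO_DOMAINS, ARCHAEA | {"Eukaryotes"}))  # True
```

In the toy three-domain tree, the archaea sit on a branch of their own. In the toy two-domain tree, no single branch holds the archaea unless it also holds the eukaryotes.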

Why does this matter? For many reasons. Our ancestry runs through the ancestry of archaea. And by looking at archaea, we may be able to discern some key steps on the path to our eukaryote cells. Our cells have skeletons, for example: molecular scaffolding that gives them structure and which they can reassemble to travel from one place to another. Recently, scientists have discovered archaea with two components of our skeleton: actin and tubulin. And in the new journal eLife, Corey Nislow and his colleagues report the discovery of the same coiled-up chromatin in archaea.

In other words, a lot of what we think of as the eukaryote cell may have already evolved over 2 billion years ago in our archaea ancestors–features that still survive in some archaea today. This proto-eukaryote cell may have been like a chassis, onto which genes from bacteria were plugged in. And at some point, an entire bacterium snuck into our ancestors. Today, we use it to generate energy with oxygen.


Postscript: In working on this piece, I contacted some experts on the tree of life. Most favored the two-domain tree Embley argues for. “I’m pretty sure that the three domain hypothesis has been falsified,” James McInerney of the National University of Ireland told me.

I also contacted Norman Pace, who published the three-domain tree I reprinted above, to get his thoughts. But he didn’t get back in touch until after I had published this post.

Pace is not convinced by Embley’s new work. Every scientist who uses molecular data to build a tree of life has to rely on a model for how DNA evolves–whether some sites are more prone to mutation than others, for example. Pace suspects that the models that scientists like Embley and Lake are using are joining together branches in misleading ways. He argues that the molecules found in the membranes of all archaea are so different from those found in all eukaryotes that they must be separate domains.

“In short, my faith in the three domain tree is not fazed,” says Pace.

Image: A detail of a tree of birds by Max Furbringer, 1888. Universitätsbibliothek Heidelberg, via Creative Commons

[Update 12/20 3 pm: Fixed date of Darwin’s sketch]

10 thoughts on “Redrawing the Tree of Life”

  1. I love those trees – they make it abundantly clear just how arbitrary and silly this argument is. The topology of both is identical, and yet the one on the left is supposedly “three domains” and the one on the right is “two domains”. The only reason we see them as fundamentally different is because they differ on where they happen to place ourselves.

    Life has two domains, and three, and four… or as many as you like, depending which level of the tree you want to draw your arbitrary line and say “Branches before this point separate domains, branches after it do not”. Scientifically that’s a meaningless choice though.

  2. Hi Peter,

    actually, the tree topologies are different — in the three domains tree, the Archaea form a clan (potentially a monophyletic group, depending on the position of the root), but in the eocyte tree they don’t — not without including the eukaryotes. So, however we choose to define “domains”, which I agree is not clear-cut, the two trees have different topologies and different biological implications. That might be clearer if you check out figure 1 of our paper (it’s open access) at Proc R Soc B (http://rspb.royalsocietypublishing.org/content/early/2012/10/18/rspb.2012.1795.full)

  3. Nice piece. Interesting findings. I hadn’t seen Darwin’s primate tree before.
    Lots more on old trees:

    This seemed misleading to me:
    “Instead of just building a single tree, they constructed many trees from the genes and proteins and then compared them to each other to find the best fit to the data. ”

    To be fair, this is of course standard procedure and has been for a long time. (Another source of debate is the best way to choose among alternative trees.)

  4. Thanks Carl, I start teaching Cell Structure and Function in a month and now I will be able to give my kids the most currently accepted view or at least let them know different ideas floating among the experts.

  5. Carl,

    Thanks for this posting on Tree of Life, and best wishes here at National Geographic.

    Saying that you are interested in viruses would be an understatement. As you well know, viruses are the most abundant life forms on Earth, and it is estimated that the repertoire of viral genes is greater than that of cellular genes. And, yet, viruses are not part of mainstream evolutionary paradigms, nor are they included in the Tree of Life.

    As I recently wrote in a paper entitled “The Origin and Evolution of Viruses as Molecular Organisms” (http://precedings.nature.com/documents/3886/version/1), this might be one of the major paradoxes in modern biology, and it would be great if you will elaborate on it. Thanks.

  6. I THINK that I shall never see.
    A blog post lovely as a tree.
    However this one closely be.
    — With apologies to the spirit of Joyce Kilmer.

    As a computer scientist, I prefer all trees of information, knowledge and its evolution to be binary. The simplest hypothesis is one evolutionary step at a time, from the least fit for changed environment (at the time). So I’m satisfied (for the moment) with the bifurcating explanation.

  7. I agree with ChasCPeterson (see above) that the current approach of constructing phylogenetic trees is misleading, although, technically, there might not be a better way of doing it at the moment. However, it is critical that the scientists working in this field realize the problems with the current approach of building trees based on nucleotide or amino-acid sequences, but unfortunately they don’t; instead, many of them even question the validity of Tree of Life (TOL).

    In the paper I mentioned above on the origin and evolution of viral lineages (http://precedings.nature.com/documents/3886/version/1), I also discuss the problems with the current approach and thinking on the TOL:

    “The intent of the TOL, however, is to establish the line of descent among groups of organisms or species, not necessarily the evolutionary relationships among their genes. Certainly, each of the millions of cellular and viral genes has an evolutionary history that can be revealed by a sequence-based phylogenic tree, but many of these gene-based trees do not represent a TOL that reflects the line of descent among the species.”

    I think there is a deep conceptual flaw in the current thinking, and I hope those working in this field, including Tom Williams and Martin Embley, as well as interested science writers such as our host Carl, will address it; of course, if they believe that this is not the case, then it would make sense to say it, in writing.

  8. @Peter: Don’t forget there is an implicit time dimension, which means that the trees are NOT just topology, but make statements about which groups split off earlier or later.

  9. Don’t forget there is an implicit time dimension, which means that the trees are NOT just topology, but make statements about which groups split off earlier or later.

    The time is relative, not absolute or at all to scale, and it just goes away from the root. However, even without this, and even without the root, the two trees aren’t topologically identical. You can’t distort one into the other; you have to pluck at least one branch off and reinsert it elsewhere.

    So I’m satisfied (for the moment) with the bifurcating explanation.

    Both trees are strictly bifurcating near their bases: they agree that eukaryotes and archaeans are more closely related to each other than to Bacteria. What they don’t agree on is whether there’s a monophyletic Archaea that includes Euryarchaeota but excludes Eukary(ot)a: the left one says yes, the right one says no.
