Tree of Life, c. 2006


Scientists are probably centuries away from drawing the full tree of life. For one thing, they have only discovered a small fraction of the species on Earth–perhaps only ten percent. They are also grappling with the relationships between the species they have discovered. Systematists (scientists who study the tree of life) rely mainly on DNA these days to figure out how species are related to one another. They compare the similarities and differences in a given gene in several different species to figure out which ones share the closest kinship. But they have actually sequenced DNA from relatively few species. And in many cases, that DNA may come from a single gene.
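
To make that comparison concrete, here is a minimal sketch in Python of the simplest version of the idea: count the fraction of positions at which two aligned copies of a gene differ, then pick the pair of species with the smallest difference. The sequences and species names are invented for illustration, and real analyses use far more sophisticated models than this raw count.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return diffs / len(seq_a)


def closest_pair(aligned):
    """Return the pair of species whose copies of the gene differ least."""
    return min(
        combinations(sorted(aligned), 2),
        key=lambda pair: p_distance(aligned[pair[0]], aligned[pair[1]]),
    )


# Invented toy data: one short "gene" from four hypothetical species.
toy_gene = {
    "species_A": "ATGCCGTAAC",
    "species_B": "ATGCCGTTAC",
    "species_C": "ATGACGTTGC",
    "species_D": "TTGACCTTGG",
}

print(closest_pair(toy_gene))
```

In this toy case species_A and species_B differ at only one of ten positions, so they come out as the closest relatives.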

Systematists have made good use of this scanty data. They’ve been able to sort out relationships of many big groups of species, from mammals to plants–groups that sparked debate among systematists for decades. But these are just small tufts on the complete tree of life. The big picture has proven harder to pull into sharp focus.

When systematists try to make sense of billions of years of evolution, they must struggle with many foes. The DNA they study, for example, may send them a misleading signal. Some of the most common mutations are known as point mutations, because they change DNA at one point–changing a single base of DNA to another. Since there are only four letters in the alphabet of DNA, it’s not surprising that over billions of years two lineages may acquire the same letter at the same position in the same gene. These two independent mutations may well give the illusion that two lineages share a close common ancestry.
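
A small Python sketch shows why a four-letter alphabet makes such coincidences likely. It assumes, purely for illustration, that each of the three possible replacement letters is equally likely; real DNA does not mutate that evenly. Under that assumption, it enumerates every way two lineages could each change the same ancestral letter once and counts how often they land on the same new letter.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGT"


def chance_of_matching_mutation(ancestral_base):
    """Chance that two independent single mutations at one site agree.

    Assumes (for illustration only) that all three replacement bases
    are equally likely in both lineages.
    """
    alternatives = [b for b in BASES if b != ancestral_base]
    # Every combination of (new base in lineage 1, new base in lineage 2).
    outcomes = list(product(alternatives, repeat=2))
    matches = sum(1 for x, y in outcomes if x == y)
    return Fraction(matches, len(outcomes))


print(chance_of_matching_mutation("A"))
```

Three of the nine possible outcomes match, so under this simplified model a site that mutated once in each lineage has a one-in-three chance of looking like shared ancestry.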

Genes create another challenge when they jump from one species to another. For decades scientists have known that microbes can swap genes, primarily with the help of viruses that sometimes move between species, carrying some host DNA with their own. At first this hopping seemed like a rare fluke. Then some systematists argued that gene swapping was so common that life’s history might be better represented by a web than a tree. Most experts disagree: they argue that this gene-swapping does not destroy the quest for the tree of life. It creates vines draped between the branches of the tree of life, but the branches of the tree are still visible.

It’s been hard to resolve this debate because until now most scientists have analyzed the tree of life by looking at just one gene in a number of species, or, in rare cases, a few genes. Fortunately, scientists now have entire genomes of a couple hundred species to analyze. In the new issue of Science, biologists at the European Molecular Biology Laboratory published the latest, most thorough glimpse at the tree of life.

It’s quite something to behold. I’ve posted a reduced version of the tree on this page, and you can get a closer look here, at a site dedicated to the project. To orient yourself, our species is at about two o’clock, next to the chimp, rat, and mouse.

The scientists took advantage of the fact that so many genomes have been sequenced over the past decade, and that it’s now possible to compare the DNA in different genomes relatively quickly (if you have a supercomputer, of course). Their strategy was to search for all the genes that could offer the clearest clues to the tree of life–genes that had not been swapped too much between species, for example. They searched the genomes of 191 species of animals, plants, fungi, protozoans, bacteria, and archaea (microbes that look superficially a lot like bacteria). They selected 36 universal genes, but then tossed out five of them because they appear to have been swapped.

This tree emerged from their analysis of the remaining 31 genes. The scientists kicked the tires, as it were, by running the tree through a series of statistical tests. Did the same pattern of branches emerge if they left out some species? What happened if they left out one gene or another from the analysis? Two-thirds of the branches turn out to have 100% support from these tests, and many of the others, while not so perfect, are still statistically robust. So this study suggests that gene-swapping does not end the quest for the true tree of life. (Other scientists came to a similar conclusion last year, which I wrote about here.)
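
The leave-one-gene-out idea can be sketched in a few lines of Python. This is only a toy illustration with invented sequences and a deliberately crude method (summing mismatches across genes and asking which two species are closest); it is not the statistical procedure the scientists used. The point is simply to show what it means for a grouping to survive, or not survive, when one gene is removed.

```python
from itertools import combinations


def mismatches(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def closest_pair(genes, species):
    """Pair of species with the fewest mismatches summed over all genes."""
    def total(pair):
        return sum(mismatches(g[pair[0]], g[pair[1]]) for g in genes.values())
    return min(combinations(sorted(species), 2), key=total)


def leave_one_gene_out_support(genes, species):
    """Fraction of leave-one-out runs that recover the full-data grouping."""
    reference = closest_pair(genes, species)
    agree = 0
    for left_out in genes:
        subset = {name: seqs for name, seqs in genes.items() if name != left_out}
        if closest_pair(subset, species) == reference:
            agree += 1
    return reference, agree / len(genes)


# Invented toy data: three short "genes" from three hypothetical species.
toy_genes = {
    "gene1": {"sp_A": "AAAA", "sp_B": "AAAT", "sp_C": "TTTT"},
    "gene2": {"sp_A": "CCCC", "sp_B": "CCCG", "sp_C": "GGCC"},
    "gene3": {"sp_A": "GGGG", "sp_B": "TTTG", "sp_C": "GGGT"},
}

pair, support = leave_one_gene_out_support(toy_genes, ["sp_A", "sp_B", "sp_C"])
print(pair, support)
```

Here the grouping of sp_A with sp_B holds up in two of the three runs. It falls apart when gene1 is dropped, because the invented gene3 tells a conflicting story, which is the kind of weakness these tests are meant to expose.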

Here’s a quick tour of the tree. Start at the middle of the circle. The central point represents the last common ancestor of all living things on Earth. The tree sprouts three deep branches, which between them contain all the species the scientists studied. These deep branches first came to light in the 1970s, and are known as domains. We belong to the red domain of Eukaryota, along with plants, fungi, and protozoans. Bacteria (blue) and Archaea (green) make up the other two domains.

These lineages probably split very early in the history of life. Fossils of bacteria that look much like living bacteria turn up at least 3.4 billion years ago. Just a few lineages became multicellular much later, with some algae getting macroscopic about two billion years ago.

The length of the branches on this tree represents so-called genetic distance. The longer the branch, the more substitutions have accumulated in its genes. Since these genomes all come from living species, the branches all span the same period of time. The fact that some branches are long and some are short means that some lineages have evolved more than others. Many forces can stretch out genetic distance. A species may reproduce fast, or it may have a life that makes it prone to acquiring more mutations. The slash in the Bacteria branch represents a segment that the scientists left out to make the full tree easier to see.

The long length of the Bacteria branch underscores one of the big messages of this tree: the diversity we can see with the naked eye reflects a pretty paltry snippet of life’s genetic diversity. Humans and mushrooms are tucked into a small part of the tree. Meanwhile, bacteria such as the ones that cause strep throat (Streptococcus) and the ones that cause food poisoning (Salmonella) are divided by vast evolutionary gulfs. The diversity of microbes did not stop evolving billions of years ago. Escherichia coli, for example, emerged relatively recently, specializing on the warm guts of mammals and birds.

As the scientists point out, this tree challenges the traditional way that biologists classify living things into species, genus, family, class, phylum, and kingdom. Scientists named many of these groups in the eighteenth and nineteenth centuries, when they could only sort species by what they could see with the naked eye or a crude microscope. But there’s a vast amount of hidden biochemical diversity in living things, and that diversity is reflected in this new tree. The scientists compared the genetic distance among different groups. Animals in different phyla are separated by much less genetic distance than bacteria that are in the same phylum. If scientists were classifying life from scratch today based on genetic distance, they’d probably downgrade animal phyla to classes.

This discovery does not sit well with claims from creationists that evolution cannot account for the emergence of animal phyla. It is certainly true that the earliest fossils of several animal phyla emerge over a span of perhaps thirty million years around the beginning of the Cambrian period, 540 million years ago. But animal phyla are, in a sense, overrated. This new tree of life supports a growing consensus that relatively small genetic changes in animal evolution led to big changes in their bodies. (If you want to read a whole book on this, check out Endless Forms Most Beautiful.)

This tree supports some findings from other recent studies. Mushrooms are more closely related to us than they are to plants, for example. But it will also make some scientists unhappy in other ways. There’s a big debate going on these days about the animal kingdom. Some researchers think that arthropods and nematodes belong to a “moulting” group. But this tree suggests that arthropods are more closely related to us vertebrates than they are to nematodes. The authors of the new study acknowledge that their tree may be unreliable in this respect. That’s because the animal genomes that have been sequenced may not belong to the best species to include in this sort of study. The fruit fly Drosophila melanogaster and the mouse Mus musculus did not get their genomes sequenced so they could be put in the tree of life. They were chosen because scientists had studied their genes and physiology for decades and would be able to make good use of the genomes in their research. It would help enormously if scientists could get the genomes of other animals that belong to different branches of the animal kingdom, such as ragworms and other obscure critters.

Fortunately, scientists should be able to add these species to this tree very quickly. In the past, scientists have had to do a lot of their tree-building by hand, lining up genes, identifying cases of gene-swapping, and so on. But as the European scientists built this new tree, they were able to set up an automated pipeline. As new genomes are published, it will be possible to let a computer automatically compare them to older sequences and generate a new tree that does a better job of explaining all the evidence. None of us may live to see the full tree of life emerge, but at least we may be able to savor a better sneak preview.

Update, 3/5 9:45 am: Rhasgobel has put together a useful list of translations of the Latin names on the tree.

Comments on “Tree of Life, c. 2006”

  1. Wow, that’s beautiful. I had heard that the diversity we can see doesn’t correlate well with biochemical diversity… but this is very vivid. It has changed my view of the world.

    By the way, not that you would, but please don’t ever be misled by the comment counts. I suspect the typical reader, like me, comes to sites like Pharyngula or The Loom for science we’d have trouble understanding in journals or finding in the popular press. Controversial topics are the side dishes. But in the comments, it’s easy to throw in your two cents’ worth on a report about outrageous stupidity; on a post like this, it’s hard to say much but “wow, that’s beautiful”.

  2. Genetic distance is certainly a better measure than the old taxonomy growing out of gross morphology. I suspect that this tree itself will change a lot as we continue to move from “beanbag genetics” toward understanding gene regulation and interaction. It will give more weight to a base-pair substitution in a high-level regulatory gene, which could alter a whole developmental cascade, than to one in Yet Another intron.

    (Of course, that view in itself betrays metazoan self-regard. Those who don’t have developmental cascades don’t think they’re such a big deal…)

  3. This gives one a buzz even better than the light coming thru a stained-glass window! And, based in reality – fantastic, thanks for the reporting.

  4. First (because the next bit is rather verbose): how long was the piece of line that the authors cut out to make the tree more readable?

    Now a comment: Gene Phylogeny can be substantially different from Organismal Phylogeny. (Read “The Ancestor’s Tale” by Dawkins)

    On one hand, species can diverge phenotypically while their genomes remain substantially the same — take humans and chimps, for example. Substantial chunks of genome (haplotypes) typically remain unmodified well beyond speciation events.

    On the other hand, given that evolution works on genetic variation, at least some of this variation must PRECEDE any speciation event.

    Since higher-level cladogenesis begins with speciation, the same can be said for the origin of higher clades.

    Further, different genes/haplotypes change at different rates over evolutionary time (and among lineages), so analysis can (will?) give different phylogenetic trees for different genes/haplotypes.

    The conclusion is that there is no single “genetic evolutionary tree”, but a cluster (more or less) of similar ones, one for each gene/haplotype.

    When dealing with divergence in this way, it becomes clear that the branching “points” are actually 2-dimensional: “smeared out” along the time axis, FOR EACH gene/haplotype, with the timing and duration of the smears likely being different for each. (This revelation shook my foundations!) There’s a clear analogy with the “Point estimate” and “Confidence interval” in statistical analysis. The representation of phylogeny as a tree with distinct (and unique) branch points is somewhat misleading.

  5. Monte Davis writes in anticipation of there being more weight given to changes in regulatory genes versus beanbag genetics. This touches on something that has concerned us systematists for many many years. If a weight, what weight? An integer weight? A weight that applies equally across all lineages or can vary from branch to branch? Weight all changes the same in the hundreds-bases-long regulatory gene, or just those for which there is evidence of positive selection? My point is not to disparage, but rather to note that systematists are, in fact, quite wary of such issues. The intractability, and tendency to subjectivity in many respects, is something that causes many of us to, in fact, avoid use of such genes and, rather, employ those that appear to be “neutral” so as to not bias the result or force us to make arbitrary decisions about what change is or is not more important than another.

  6. “This touches on something that has concerned us systematists for many many years. If a weight, what weight?”

    I realize with all due trepidation that weighting opens the door to subjectivity. The EMBL version is indispensable as a raw look at the domain: it is both true and important that, as Carl notes, strep and salmonella are more different from each other and from us than we are from fungi.

    But we routinely use many axes of “difference.” I anticipate that if we’re judicious and explicit about weighting, we’ll develop multiple views of a many-dimensional tree, each a different slice through the data. And some of those slices will reflect that eukaryotic genetic systems — and metazoan developmental systems — are intensely hierarchical and context-dependent in time and space, so that some substitutions have more far-reaching consequences than others.

  7. Wow, that’s awesome. I almost don’t know what to say, but it’s very cool. It’s also neat that as new species have their genomes sequenced they can be added to the tree, so that the tree itself can “evolve” as it were.

    I can’t tell you how much I like The Loom… I’m studying in Spain and have a bit of a hard time finding good, informative science news (it’s partially the language thing, but also just because there aren’t as many good sources, or at least, I haven’t found them yet…). As a result, I pop on here every day to see if there’s something new… it’s just amazing!!

  8. WOW!

    Please come to the Evangelical Outpost sometime and help fellow evolutionists argue our case against the smart IDer’s and the dumb-ass fundamentalists.

    If one were looking for God, this tree might be a good place to start.

  9. A tree of life …

    Carl Zimmer has linked to a beautiful phylogeny displaying the diversity of life … For those who are curious, here are the common names (and descriptions, where appropriate) of the eukaryotes included:

  10. I have a question about this tree. I notice that the chimpanzee branch is longer than the human branch. Does this mean that the chimp genome has changed more than the human genome since the split?

  11. Nice period for inspiring pictures, this… Actually, a rather nice T-shirt would have the two last images (this one and the Earth’s clock) on the two sides, with two arrows pointing at how small we humans are…

    I guess I’m going to do it, now… 🙂

  12. And by the way, I wonder how ‘skewed’ a view of life this tree gives, considering that most organisms whose genomes have been sequenced have received this treatment because, for one reason or another, they are involved with human/cattle/crop health. Does this give the wrong picture of unicellulars being mostly pathogens/parasites ‘itching’ a ride on our multi-cellular backs?

  13. Rafael: I think you’re misreading something. There aren’t any reptiles in the set of animals analyzed (presumably because no one has published a complete genome for a reptile yet), so the diagram doesn’t actually say anything about the relatedness of reptiles to birds, mammals, or fish.

  14. This to me is as powerful a piece of evidence for evolution as the geological column. If life had all been created at once then why the beautifully nested series that not only follows most of the predictions made on the basis of the fossil record and morphology but also clarifies and elucidates in places where the fossil record is unhelpful or morphology is ambiguous?

    The only regret I have is that the metazoa section seems so poorly resolved. I suppose that’s due to the fact that metazoan genomes are longer but I would love to see even a few lophotrochozoa plugged into this, or a chaetognath or two, or gnathostomulid, or echinoderm, or priapulid. Without this data it’s hard to tell if the position of vertebrates in relation to insects is just an artifact and will change when better data is plugged in, or if it reflects something more profound (such as a return to a coelomate clade and maybe a return to the view that vertebrates are “worms turned upside down” or vice versa, as chaetognaths and protoconodonts possibly suggest).
