Contaminomics: Why Some Microbiome Studies May Be Wrong

You’ve got a group of people with a mysterious disease, and you suspect that some microbe might be responsible. You collect blood and tissue samples, you extract the DNA from them using a commonly used kit of chemicals, and you sequence the lot. Eureka! You find that every patient has the same microbe—let’s say Bradyrhizobium, or Brady for short. Congratulations, you have discovered the cause of Disease X.

Don’t celebrate yet.

You run the exact same procedure on nothing more than a tube of sterile water and… you find Brady. The microbe wasn’t in your patients. It was in the chemical reagents you used in your experiments. It’s not the cause of Disease X; it’s a contaminant.

Versions of this story could be playing out in dozens of labs around the world. A team of scientists led by Susannah Salter and Alan Walker at the Wellcome Trust Sanger Institute has shown that DNA extraction kits, and other lab reagents commonly used in microbe studies, are almost always contaminated by low levels of microbial DNA.

Bradyrhizobium is a common culprit, but the team have identified a list of around 100 microbes whose DNA regularly turns up when sequencing supposedly “blank” tubes of water. Most of them live in soil and water. Some come from human skin. This cabal of contaminants, which I’m going to call “the Brady Bunch”, poses a problem for studies of microbe communities, or microbiomes. It raises the haunting possibility that many published results in the field are just wrong.

Conta-mi-nate, good times, come on. (It’s a contamination. It’s a contaminati-o-on.)

Salter and Walker first noticed the problem in one of their own studies. Their colleagues had been studying a group of 20 infants, and had swabbed the backs of their noses every month for two years. When Salter’s group sequenced these samples, they found the microbes from the infants’ first months of life were distinct from those that came later. But they soon learned that they had used different DNA extraction kits for the early samples and the later ones, so that both sets were contaminated by different Brady Bunch microbes. When the team excluded these contaminants from their results, the pattern they found also disappeared.

By coincidence, Nick Loman from the University of Birmingham and Michael Cox from Imperial College London had encountered similar troubles. The three groups compared notes and decided to work out the extent of the problem.

Working independently, they sequenced a pure culture containing a single species: Salmonella bongori. Sure enough, that was the only bacterium whose DNA they detected. But when they started diluting the culture, so that there were fewer and fewer microbes in each sample, the results changed dramatically. After five dilutions, S. bongori accounted for just 5 to 30 percent of the results. The rest were contaminants.
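A toy calculation shows why dilution has this effect. The sketch below assumes that the reagents add a roughly fixed amount of contaminant DNA to every tube, while the real sample shrinks with each dilution. The starting amount, the tenfold dilution step, and the contaminant load are all made-up numbers chosen for illustration; they are not values from the paper.

```python
def target_fraction(target_copies, contaminant_copies):
    """Share of the sequenced DNA that comes from the real sample."""
    return target_copies / (target_copies + contaminant_copies)


# Hypothetical values, for illustration only (not taken from Salter et al.).
START_COPIES = 1e8        # target DNA copies in the undiluted culture
CONTAMINANT_COPIES = 4e3  # contaminant copies added by the reagents to every tube
DILUTION_FACTOR = 10      # assumed dilution at each step

# Target share at dilution steps 0 (undiluted) through 5.
fractions = [
    target_fraction(START_COPIES / DILUTION_FACTOR ** step, CONTAMINANT_COPIES)
    for step in range(6)
]

for step, fraction in enumerate(fractions):
    print(f"dilution {step}: target makes up {fraction:.1%} of the DNA")
```

The contaminant load never changes in this model. It only becomes visible once the real signal has shrunk to a comparable size, which is why low-biomass samples are the ones most at risk.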

As samples become more dilute, the proportion of the target bacterium (black bars) falls, and contaminants dominate the results (all other colours). The three graphs represent results from three independent labs analysing the same samples. Credit: Salter et al, 2014.

These rogue microbes are inescapable. They vary between different reagents and different labs, but they’re always there to some degree. Loman says that whenever they used DNA extraction kits on tubes of pure water, they almost always got some microbial sequences.

No one knows how many existing studies have been affected by this problem, but the team have highlighted 20 papers that warrant a second look. All of these reported low levels of Brady Bunch microbes. Some were simply cataloguing the microbiomes of habitats like human eyes, earthworm kidneys, and mosquito guts. Others found suspicious species in very unexpected places, including extreme habitats like the upper atmosphere or lakes beneath Antarctica, or places that are thought to be sterile, like the brain or the surfaces of spacecraft.

And some of the highlighted papers found associations between unusual microbes and human diseases. One group reported that the soil bacterium Methylobacterium was more common in breast cancer tissues than in healthy samples, while Sphingomonas was less common. (Delphine Lee, who led the study, says her team used the same DNA kits to compare cancerous and healthy tissues from the same patients, so it’s unlikely that contaminants would be more common in one sample than the other.)

Another team found a new species of Brady in people with an enigmatic diarrhoeal disease called cord colitis, but not in healthy controls or people with other illnesses. That was certainly unexpected. Cord colitis affects people with blood disorders, who are treated with transplanted stem cells taken from umbilical cords. And Brady is a plant microbe—it colonises roots and provides plants with fertilising nitrogen. Could it really be responsible for a human disease?

“I’m alarmed by that paper,” says Loman. “If you screen a bunch of human sequences, you’ll find Bradyrhizobium popping up more often than not. If you find Brady in all of these cases, and no one has ever seen it before, and you haven’t cultured it, then it seems unlikely that you’ve found a novel pathogen.”

But Michael Meyerson, who led the study, counters that his team paid careful consideration to the risk of contamination, and couldn’t find Bradyrhizobium DNA in any of their reagents or control samples, using a variety of methods. “We cannot yet formally exclude the possibility that the finding of Bradyrhizobium in cord colitis specimens is due to contamination, but we believe that the preponderance of the evidence supports the presence of this bacterium,” he says.

These issues are unlikely to matter for, say, surveys of gut microbes, where hordes of bacteria in the actual samples will drown out any rogue contaminants. But researchers should be especially careful when hunting for a rare disease-causing microbe among a throng of other species, or when analysing samples with very few microbes in them at all. For example, scientists who study ancient DNA from fossils often work with minute amounts of DNA. Contaminants are such a huge issue that two researchers recently wrote a strident letter to Science exhorting the field to “do it right or not at all”.

Contamination can also waste valuable time and money and, when it applies to medical research, create false hope for patients. The most egregious recent example involved a virus called XMRV, which was touted as a possible cause of chronic fatigue syndrome (CFS), after a 2009 paper identified it in samples from CFS patients. After a long saga involving much follow-up work, allegations of misconduct, the retraction of the original paper, and much angst for patients, it is now clear that XMRV was a contaminant.

This is not a new problem, but Loman says that he still gets shocked reactions when he presents his team’s results at conferences. “I think many people are still surprised that if you put nothing into your sequencing pipeline, you come out with something that looks like a well-ordered microbiome,” he says. “If you talk about the old guard, they’ll say that contamination has been a problem since day one. But the generation of scientists who are now furiously engaged in microbiome research needs to relearn the lessons of the past. You’ve got to assume that your results might be explained by a technical factor unless you’ve ruled it out.”

“Ruling it out” involves putting everything, even negative controls like tubes of water, through the same process, involving the same reagents. Scientists also need to compare the species they identify against the Brady Bunch list, and take extra precautions if there’s a strong overlap. And perhaps the best test of all would be to grow the microbes from the samples they are supposedly hiding in.
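The first two checks can be sketched in a few lines of Python. This is a minimal illustration, not the team’s actual method: the function name `flag_suspects`, the three-genus stand-in for the published contaminant list, and all the read counts are hypothetical.

```python
# Tiny stand-in for the published list of roughly 100 contaminant genera.
KNOWN_CONTAMINANTS = {"Bradyrhizobium", "Methylobacterium", "Sphingomonas"}


def flag_suspects(sample_counts, blank_counts, known_contaminants):
    """Return {genus: [reasons]} for genera in a sample that deserve a second look."""
    suspects = {}
    for genus in sample_counts:
        reasons = []
        if genus in known_contaminants:
            reasons.append("on known-contaminant list")
        if blank_counts.get(genus, 0) > 0:
            reasons.append("present in blank control")
        if reasons:
            suspects[genus] = reasons
    return suspects


# Hypothetical read counts: a patient sample, and a tube of water that went
# through the same kit and the same sequencing run.
sample = {"Salmonella": 900, "Bradyrhizobium": 80, "Streptococcus": 20}
blank = {"Bradyrhizobium": 40, "Ralstonia": 10}

suspects = flag_suspects(sample, blank, KNOWN_CONTAMINANTS)
for genus, reasons in suspects.items():
    print(f"{genus}: {', '.join(reasons)}")
```

A flag is a prompt to investigate rather than a verdict. A genus on the list can still be a genuine resident of the sample, which is why culturing remains the stronger test.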

“The significance of this paper isn’t really about contamination, it’s about how we do science,” says Mick Watson from the University of Edinburgh. “Microbiome research is absolutely fascinating, everyone wants to do it, and [we have] the power to explore the microbiome in far more depth than we could have dreamt of before. But with great power comes great responsibility. You need to know the sources of bias and error in your experiment. If you don’t understand that, you’re going to get it wrong. A significant amount of published microbiome research is bunk because people didn’t understand what they were doing.”

Journalists like me, who cover microbiome studies, should also take note. The 20 papers that the team singled out include several that I’ve recently read while researching my book, and at least one that I’ve reported on before. Are these results valid, or are we telling the world about false alarms?

Reference: Salter, Cox, Turek, Calus, Cookson, Moffatt, Turner, Parkhill, Loman & Walker. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biology. http://www.biomedcentral.com/1741-7007/12/87

Note: Apologies in advance to Jonathan Eisen for “contaminomics”. Don’t hit me.

17 thoughts on “Contaminomics: Why Some Microbiome Studies May Be Wrong”

  1. “they had used different DNA extraction kits for the early samples and the later ones, so that both sets were contaminated by different Brady Bunch microbes.” – I’d say the experiment was a bit screwed up even if it hadn’t been contaminated. What if extraction Kit A is better at extracting DNA from one type of microbe, and Kit B for another type? You’d see a spurious difference in microbial community between the two groups. Also, surely contamination in the kits will get caught by anyone that does a decent negative control in studies like that – and it’s worrying that the article implies plenty of people skimp on the controls. Still, I often have to make the mistake before I catch it, and it’s great they wrote this paper to alert their microbial homies.

  2. Hello Luke,

    Just thought I’d clarify. We used the same type of DNA extraction kit for all samples in the study. The problem was that contamination differed between different batches of the same type of kit. Hope this clears this up.

  3. “Bradyrhizobium is a common culprit, but the team have identified a list of around 100 microbes whose DNA regularly turn up when sequencing supposedly “blank” tubes of water. Most of them live in soil and water.”
    Well, I’m working with both soil- and water-related samples, and when this article was published I wondered what to do next. I use negative controls during the PCR steps and during the sequencing, but if the bacteria that I find there are, for example, water-related, then it’s a difficult decision for me to “subtract” them from a bacterial community found in a water environment. So this paper is really good, especially for a starter in the field like me, at drawing attention to contamination and the difficulties with the technique. But I think, and that’s my personal opinion, that to do really good science, the scientific community should think about some standards.

  4. I think this paper and the one by Lusk point to a very important issue not only in metagenomic studies but in molecular biology in general. So far people have got away without including negative controls in metagenomics because of the cost of library prep and sequencing. I think it is very important to include negative controls, especially in low abundance/low diversity samples. The other big factor that is not discussed here is both recording things like what batch/aliquot of a certain kit you use (and making that information available) and accounting for batch effects statistically, for example here: http://www.nature.com/nrg/journal/v11/n10/abs/nrg2825.html and here: http://nar.oxfordjournals.org/content/early/2014/10/07/nar.gku864.short

  5. @Caroline
    There are standards; we are just finding they may need to be tighter, or a new protocol adopted, given how sensitive our testing methods have become. The point of publishing this type of information is to show how and where improvements may be needed.
    Also, if you see the sequences in your water negative, you know you can’t trust the same sequence found in your positives or your samples. It’s a cue for you to investigate it. If you are using specific primers you likely won’t have a problem unless they are binding non-specifically, but if you are doing high-throughput testing, shotgunning everything in a sample etc., you know you need to subtract the sequence, or at least subtract the ‘amount’ of contaminating sequence and see if your samples have amounts above the threshold set by your water negatives.
    Another thing to be aware of, which microbiome researchers have seen (along with other areas), is that because of the way the programs are designed to handle the large amount of info thrown at them, certain sequences might be lost as background, or may look similar to other sequences and be categorized wrongly. Peer review, replication, and constant investigation are how we find these kinks and iron them out. Since you seem so concerned with technique, maybe you should read more related papers and see if you can brainstorm improvements where you work or will work. It would certainly help you understand the nuances, and maybe as you learn more you will find this gives you a leg up in the lab.

  6. This story of test-tube contamination reminds me of a notable episode in Darwin’s day. Not long after Darwin published “The Origin of Species,” his ally Thomas Huxley (aka Darwin’s Bulldog) identified in deep-sea water a substance he called Bathybius — a “primordial ooze” of the sort that he and others had previously speculated might be a protoplasmic goo from which all life might have arisen. Its apparent existence seemed to bolster Darwin’s theory of evolution and its notion of constantly evolving life forms. Huxley trumpeted it loudly, as did others, in Nature and other venues.

    Seven years after Huxley found it, scientists on the oceanographic vessel the HMS Challenger found something else. They “poured a large quantity of alcohol into a bottle containing deep-sea ooze and the mixture almost instantly produced something remarkably like the mysterious Bathybius. [They] instantly realized that Huxley’s ancient slime was simply a new goo produced by the reaction between ordinary ooze made up of planktonic skeletons and alcohol. The stuff in Huxley’s tubes had apparently been formed more slowly by the traces of alcohol left after washing. [Challenger chief scientist] Thomson immediately wrote Huxley, breaking the news with remarkable tact, and Huxley promptly sent the letter to be published in Nature along with a graceful and funny letter confessing his error.”*

    Easy to see what we want to see. Good to catch ourselves doing it — and act constructively when we do.

    __

    *From my Reef Madness: Charles Darwin, Alexander Agassiz, and the Meaning of Coral, 2005, where I describe this and many other episodes of confirmation bias. http://www.amazon.com/Reef-Madness-Charles-Alexander-Agassiz/dp/0375421610

    A PDF of Huxley’s letter — a nice model of self-correction — is here: http://cl.ly/YV6r

  7. “Salter and Walker first noticed the problem in one of their own studies. They had been studying a group of 20 infants, and had swabbed the backs of their noses every month for two years.” – the PLOS One paper you link to has neither of them as a co-author: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0038271

    Paul Turner is the only author in common between that paper and this recent one. Sure, Salter et al. are using the swab archive generated from the Thailand study, but it’s misleading to attribute the original data-gathering and paper to Salter and Walker in the way that you do in the quote (even if it makes a cleaner story).

    [You’re totally right. I’ve made a couple of small tweaks to the text and moved the link, to better reflect who did what. Thanks for pointing that out. – Ed]

  8. “And perhaps the best test of all would be to grow the microbes from the samples they are supposedly hiding in.” Bingo! Do not underestimate the power of culturing. As for DNA methods, we still do not have a way to establish whether one DNA sequence = one (living) microorganism.

  9. I’m curious if there were any contaminants in their sequencing reagents? It appears that they did not run a sequencing blank, unless I’m mistaken? Could this lead to the false assumption that the extraction kits and other reagents are the source of the contaminants? With that being said, still a nice piece to highlight the importance of running negative controls in molecular biology.

    [Jake, as it says in the piece, the contaminants are indeed in the sequencing reagents. People usually run sequencing blanks, but if you run a blank through the same kits as the actual samples, they’ll produce sequences. – Ed]

  10. “You’ve got a group of people with a mysterious disease, and you suspect that some microbe might be responsible. You collect blood and tissue samples, you extract the DNA from them using a commonly used kit of chemicals, and you sequence the lot. Eureka! You find that every patient has the same microbe”

    Ed, surely they would be in the control samples too, and would therefore not be flagged as disease-specific?

  11. This is common in Physics too, where, to discover something like, say, the Higgs boson, we have to carefully subtract “contamination” from other processes that can mimic a genuine signal.
    On another note, I really hope “the Brady Bunch” becomes accepted technical terminology in the field.

  12. As usual, great coverage of an important new study Ed. I agree w/ Eduardo Castro’s comment that this #contaminomics issue applies yet is under-appreciated across many molecular biology fields. Hopefully the many new fields following in the wake of biomedical and microbial sequencing (including my own – macrobial eDNA) can get ahead of this issue rather than pretend it doesn’t apply.

    Solutions won’t be easy to come by though. Nhu Nguyen & colleagues from U. Minnesota and Stanford recently published an excellent and wonderfully transparent discussion of possible solutions to this same problem when studying fungi: http://onlinelibrary.wiley.com/doi/10.1111/nph.12923/full

    The last thing I’ll mention is the similarly underappreciated “carrier effect” that causes negative control samples to miss much of the contamination that remains detectable in normal samples: http://www.humpopgenfudan.cn/p/E/E6.pdf

    It appears that our standard negative controls (i.e., water instead of biota) underestimate contamination. So the problem described so clearly by Salter et al. is probably even worse.

  13. Many viruses and microbes are beneficial and fight pathogens that would harm their hosts.

    Just because they are there does not ipso facto prove they are a bad guy.

  14. As I was told again and again and AGAIN in chemistry, ‘garbage in, garbage out.’ Keep an eye on your reagents, clean your equipment and always, always run a negative control. Some of the stuff you run through may stick around for *weeks*.
