A Blog by

Jumping DNA and the Evolution of Pregnancy

About a decade ago, Vincent Lynch emailed Frank Grutzner to ask for a tissue sample from a pregnant platypus. He got a polite brush-off instead.

Then, around eight years later, Grutzner got back in touch. His team had collected tissues from a platypus that had been killed by someone’s dog. They had some uterus. Did Lynch still want some?

“Hell yes!”

The platypus was the final critical part of a project that Lynch, now at the University of Chicago, had longed to do since he was a graduate student. He wanted to study the evolution of pregnancy in mammals, and specifically the genetic changes that transformed egg-laying creatures (like platypuses) into those that give birth to live young (like us).

The platypus enjoys a short pregnancy. Its embryo sits in the uterus for just 2-3 weeks, surrounded by a thin eggshell, and nourished by a primitive placenta. It then emerges as an egg. Marsupials, like kangaroos and koalas, also have short pregnancies. But mothers give birth to live young, which live in a pouch until they’re big enough. Other mammals—the placentals, or eutherians—keep their babies in the uterus for as long as possible, nourishing them through a complex placenta. Their pregnancies can be marathons—up to two years in an elephant.

The move from egg-laying to live-bearing was huge. Mammals had to go from holding a shell-covered embryo for weeks to nourishing one for months. To understand how they made the leap, Lynch compared 13 different animals, including egg-layers like the platypus, marsupials like the short-tailed opossum, and eutherians like the dog, cow, and armadillo. He catalogued all the genes that each species switches on in its uterus during pregnancy. He then compared these different sets to work out when mammals started (or stopped) using those genes during reproduction.

He found thousands of differences, many more than he anticipated. For example, hundreds of genes are involved in making eggshell minerals; they’re active in the uterus of a platypus but silent in those of live-bearing mammals. Conversely, the marsupials and eutherians started activating hundreds of genes involved in suppressing the immune system, and in passing hormonal signals between the mother and foetus.

This all makes sense. A platypus embryo, during its brief stay in the uterus, is separated from its mother—and its mother’s immune system—by a shell. “It’s like the embryo has a cloak,” says Lynch. When mammals evolved live births, the cloak disappeared and a problem arose. Every foetus shares only half of its genes with its mother, so mum’s immune system should recognise this lump of growing tissue as a potential threat. To dispense with eggs, early marsupials and eutherians had to evolve ways of tamping down their immune responses, and only in the uterus. They also needed ways of exchanging signals with their embryos. “The foetus needs to say, Hey I’m here, and the mum needs to say, Oh, that’s okay,” says Lynch.

His study shows that they did so by repurposing a vast array of genes that already had roles in other organs, like the guts, brains, and bloodstream. But how? How does an animal deploy a gene—or thousands of genes—in a different organ?

The answer involves jumping DNA. Many bits of the genome can cut themselves away from the surrounding DNA and paste themselves in elsewhere. Others can copy themselves and insert the duplicates into new spots. These sequences are genomic parasites—they reproduce, often at the expense of their host. If they disrupt other genes when they land, they can cause cancer and other diseases. But sometimes, they settle somewhere useful.

Think of the jumping DNA as the infrared sensor in your television. The sensor recognises a stimulus—the signal from your remote control—and switches on the TV. Imagine that the sensor makes thousands of copies of itself, and somehow wires these into appliances all over your house. Now, when you press the remote control, your TV whirrs into life, but your lights also flicker on, your washing machine starts up, your computer boots, and your radio starts playing. By duplicating and spreading the sensor, you ensure that the same stimulus now turns on a multitude of things.

This is what happened during the evolution of pregnancy, except that there the stimulus isn’t an infrared signal but a hormone called progesterone. In the ancestor of eutherian mammals, jumping DNA littered the genome with sequences that progesterone can recognise. They allowed this one hormone to switch on a vast array of new genes in the uterus. And they did so in a very short span of time by evolutionary standards—just a million years or so, by Lynch’s reckoning.

Craig Lowe from Stanford University says that, for decades, scientists have theorised that jumping DNA could do something like this, but Lynch has shown that it actually has. Lowe also suspects that other scientists will use the same methods to study the evolution of traits like pregnancy, which seem overwhelmingly complicated at first pass.

Indeed, that’s what motivated Lynch originally. He’s interested in how evolution produces radically new structures. “We don’t have a good understanding of how you get something entirely new,” he says. “It’s easy to imagine how you select upon an existing structure to get a slightly different one, like a hand into a flipper or a bat wing. But how do you get the limb to begin with?”

The answer almost certainly involved using existing genes in new and innovative ways. “Does that happen slowly and step-wise, or can you have broad, genome-wide changes that reorganise things in larger jumps? Our work suggests that the larger jumps are possible.”

Reference: Lynch, Nnamani, Kapusta, Brayer, Plaza, Mazur, Emera, Shehzad, Sheikh, Grutzner, Bauersachs, Graf, Young, Lieb, DeMayo, Feschotte & Wagner. 2015. Ancient Transposable Elements Transformed the Uterine Regulatory Landscape and Transcriptome during the Evolution of Mammalian Pregnancy. Cell Reports. http://dx.doi.org/10.1016/j.celrep.2014.12.052

Now This Is How You Find Disease Genes

When you read stories about scientists identifying a new link between Gene X and Disease Y, the underlying studies vary a lot in quality. At one extreme, you get papers which show that a variant of Gene X is common in a small group of people with Disease Y and not in healthy controls… and that’s it. You don’t really know if X is really responsible for Y, or even if the result is genuine and not a false alarm produced by small numbers.

At the other extreme, you have this—a study that used a smorgasbord of experiments to identify 18 new genes behind hereditary spastic paraplegias (HSPs). This diverse group of genetic disorders all involve damage to the long neurons running between the brain and spinal cord, leading to stiffness and involuntary contractions in leg muscles.

Scientists have already linked 22 genes to HSPs, but these only explain around 20 to 30 percent of cases. “Many of the children with these conditions can’t receive a proper diagnosis and there’s no treatment available,” says Joseph Gleeson at the University of California, San Diego. “We wanted to understand more about the causes, and hopefully see some new treatments come out of that.”

To find more HSP genes, Gleeson’s team forged contacts with scientists in countries where HSP is more common and where genetic studies are rare, including Egypt, Pakistan and Iran. They found 55 families with the disorders and sequenced every gene in 93 of their members. They identified several genes that seemed to cause HSPs in these people, and they bred mutant fish to check that getting rid of these genes actually does produce relevant symptoms. They created a network to show what these genes do, and how they interact with each other. And they used that network to find even more HSP genes.

The scope of the work, led by team members Gaia Novarino, Ali Fenstermaker and Maha Zaki, is incredible. “We’ve been working on it for close to 10 years,” says Gleeson. “We just didn’t feel comfortable publishing it until it was all done. Hopefully, people will look to our paper as a roadmap for studying genetically diverse conditions.”

Between them, the 18 new genes and the 22 old ones explain around 70 percent of the HSP cases among the team’s recruits. “That is of enormous value, not only for biological understanding, but also for providing a definite diagnosis in families and for accelerating research into possible treatment of these progressive disorders,” says Joris Veltman, a geneticist at Radboud University Nijmegen Medical Centre.

Just finding the families was hard enough. “It’s not easy for an American to get into Iran,” says Gleeson. But it was worth it because in these parts of the world, the practice of marrying relatives means that family members share an unusually high proportion of their DNA. This makes it easier to find recessive genes that only cause HSP when people inherit two copies.

The team sequenced every volunteer’s complete exome—the 1 percent of their genome that codes for proteins. By comparing the exomes of family members with or without HSPs, they showed that a third of the cases were due to genes that had already been implicated in the disorders. But another 40 percent were possibly caused by mutations in 15 new genes.

Next, they verified this list by engineering baby zebrafish that lacked each of these candidate genes. None of the mutants could swim properly. Some, for example, had tails that were permanently curved to the side, much like the stiff limbs of children with HSP. “We felt compelled to do that,” says Gleeson. “For a lot of the genes, we only had a single family with the mutation.” Without the fish experiments, he wouldn’t have felt comfortable claiming that these genes were really related to HSP.

Exome sequencing is quickly becoming the frontline technique for gene detectives, who no longer have to narrow down their search to specific parts of the genome. They can just sequence every gene and see what jumps out. “The current study clearly takes this approach to the next level by applying it to a very large cohort and performing systematic functional follow-up studies,” says Veltman.

Even that wasn’t enough. “In some diseases, one is left with a hodgepodge of genes and no clear path forward,” says Gleeson. “We tried to weave commonalities between our genes and understand what they were telling us.” They did that by mapping all the interactions between their HSP genes and the proteins they make, creating a tangled network that they call the “HSPome”.

The genes clustered in different groups based on what they did. “It was like lifting the veil,” says Gleeson. “We could see how all the factors that were identified fit together.” Some are involved in folding proteins correctly, others help to make building blocks of DNA, and yet others help neurons to grow and move to the right places. These clusters tell us about “points of molecular vulnerability” in the brain-to-spine neurons that are damaged in HSPs, says John Fink from the University of Michigan.

The team then extended the network to look at other genes that interacted with the ones they identified—the “friends of friends”. By scanning this extended list, they found three more new HSP genes, which underlie the disorders in three more families. That brought the total up to 18.

The network also overlapped a lot with other sets of genes that have been implicated in Alzheimer’s disease, Parkinson’s disease and Lou Gehrig’s disease. This suggests that these disparate brain diseases may have some common ground, and that drugs which target these overlapping genes could help to treat several conditions.

“This is important, because drug development is very costly, and the larger the potential market, the more interested pharmaceutical companies will be to pursue these leads,” says Craig Blackstone from the National Institutes of Health. Indeed, by linking HSPs to better-studied (and better-funded) conditions, Gleeson hopes to spur interest in these often-overlooked conditions.

“These are exciting times for research not only into the causes and treatments for HSP but for other neurodegenerative disorders as well,” says Fink.

Reference: Novarino, Fenstermaker, Zaki, Hofree, Silhavy, Heiberg, Abdellateef, Rosti, Scott, Mansour, Masri, Kayserili, Al-Aama, Abdel-Salam, Karminejad, Kara, Kara, Bozorgmehri, Ben-Omran, Mojahedi, Gamal El Din Mahmoud, Bouslam, Bouhouche, Benomar, Hanein, Raymond, Forlani, Mascaro, Selim, Shehata, Al-Allawi, Bindu, Azam, Gunel, Caglayan, Bilguvar, Tolun, Issa, Schroth, Spencer, Rosti, Akizu, Vaux, Johansen, Koh, Megahed, Durr, Brice, Stevanin, Gabriel, Ideker & Gleeson. 2013. Exome Sequencing Links Corticospinal Motor Neuron Disease to Common Neurodegenerative Disorders. Science. http://dx.doi.org/10.1126/science.1247363

Humans Restrain Jumping DNA That Chimps Allow To Run Free

Our genome isn’t static; some of it can move about. We’re loaded with stretches of DNA that can copy themselves and paste their duplicates into new locations, increasing their numbers as they go. These sequences, known as retrotransposons, have become so abundant that they make up more than 40 percent of our genome. They’ve probably been a major force in our evolution. Depending on where they land, they could either disrupt genes in debilitating ways, or act as building material for new adaptations.

The majority of our retrotransposons can no longer jump. They’re genetic fossils, which have mutated so much that their days of wanderlust are behind them. But one group of sequences—the L1 or LINE-1 elements—includes a small number that are still on the move. They’re still copying and pasting themselves, still creating variation between people, still causing disease.

It seems that our genome takes a particularly conservative attitude to L1 elements. By comparing our cells to those of our closest relatives, chimpanzees and bonobos, Carol Marchetto and Inigo Narvaiza from the Salk Institute for Biological Studies have shown that we keep these slippery bits of DNA on a particularly tight leash. By contrast, the other two apes allow L1s to move with much greater abandon—a trait that might help to explain why their genetic diversity is far greater than ours.

Marchetto and Narvaiza began by reprogramming skin cells from four humans, two chimps and two bonobos into a state where they’re almost like stem cells. Rather than being stuck down the skin path, these stemmy cells (or iPSCs) can produce all the various types of cell in their host bodies.

The iPSCs from all three species were very similar in the genes that they switched on, but the differences were revealing. When the team looked at genes that were more strongly activated in the human cells than the chimp or bonobo ones, they saw that two of the top 50 were involved in restraining L1 elements.

If L1s were allowed to run roughshod across the genome, they could destabilise important genes and lead to disease. So, cells have evolved ways of keeping them in line. Two of these guardians—A3B and PIWIL2—are especially vigilant in our genomes. A3B is 30 times more active in human cells than chimp or bonobo cells, and PIWIL2 is 15 times more active.

It’s no surprise, then, that chimp and bonobo L1s jump about 8 to 10 times more frequently than those in humans. And the team managed to tweak that difference by changing the levels of A3B in their reprogrammed cells. By making it less active, they sent the human L1s into a jumping flurry. By making it more active, they forced the chimp and bonobo L1s to stay put.

This might help to explain why humans have such low genetic diversity. We might think that people from different corners of the world look very different, but our genomes tell a story of unusual uniformity. You can find more genetic differences between chimps living in the same troop than among all living humans.

It’s clear that humans passed through one or more genetic bottlenecks at some point in our evolutionary past. Something happened to whittle our population down to a small group, from whom everyone alive today descends. Perhaps that bottleneck was some sort of climatic change. Maybe it was a viral pandemic.

Fred Gage, who led the new study, quite likes the virus idea. Both PIWIL2 and A3B also help to suppress viral infections. He wonders if, in response to an ancient pandemic, these genes allowed certain groups of early humans to survive, and only later took up the task of suppressing the L1 elements. By holding down these jumping sequences, the genes exacerbated the loss of variation in the human genome even further.

And here’s another idea that Gage is pondering. Keeping a tight grip on L1 elements makes for less varied genomes. Less varied genomes mean that people (and children or neighbours in particular) become more similar, in both their physical traits and behaviour. In a population like that, “a cultural innovation like art or language might be more likely to persist,” says Gage. “If you have a unique event, like say a Picasso invents cubism, and you introduce it into the pack, it has a greater chance of being assimilated into the culture.”

This is all speculation for now. “The work is interesting, but I’m unclear at this point whether it says that L1 retrotransposition is important in speciation or in genome adaptation,” says Haig Kazazian, who studies mobile DNA at Johns Hopkins University School of Medicine. “I think that much more needs to be done to show that, and I’m sure the authors would agree.”

They do, and in producing iPSCs from chimps and bonobos, Gage’s team now have the tools to start answering more complicated questions about our evolutionary history. Kathleen Burns, who also studies mobile DNA, says that we have learned a lot about these invasive sequences by comparing the genomes of different animals. But with the iPSCs, scientists can now do a broader range of experiments to understand how L1s and other mobile elements are controlled.  “Understanding this has implications not only for normal human physiology, but also a wide variety of pathologies where mobile DNAs are de-repressed,” says Burns.

Reference: Marchetto, Narvaiza, Denil, Benner, Lazzirini, Nathanson, Paquola, Desai, Herai, Weitzman, Yeo, Muotri & Gage. 2013. Differential L1 regulation in pluripotent stem cells of humans and apes. Nature http://dx.doi.org/10.1038/nature12686

The End of Family Secrets?

I’ve been tied to the genealogy community ever since I can remember. My dad was always into it — for decades, he collected old documents and photos, and went on fact-finding trips to libraries and cemeteries, all to fill in the holes of his ever-expanding family tree. As I’ve written about before, I’ve had trouble wrapping my head around his obsession. Why spend so much time digging up the past?

But I may be in the minority. Genealogy is a booming business, with an estimated 84 million people worldwide spending serious money on the hobby.

As it turns out, the industry owes a big part of its recent success to a technology that I’m quite invested in: genetic testing. Several dozen companies now sell DNA tests that allow customers to trace their ancestry. This technology can show you, for example, how closely you’re related to Neanderthals, or whether you’re part Native American or an Ashkenazi Jew. But the technology can just as easily unearth private information—infidelities, sperm donations, adoptions—of more recent generations, including previously unknown behaviors of your grandparents, parents, and even spouses. Family secrets have never been so vulnerable.

My latest story is about the rise of this so-called “genetic genealogy” and how it has forced some people to confront painful questions about privacy, identity, and family. The story is out today in MATTER, a new publication for long-form narratives about big ideas in science and technology.

The star of my story is Cheryl Whittle, a 61-year-old from eastern Virginia who graciously invited me into her home and into her extended family. Cheryl took her first DNA test in 2009, and what happened after that is a story with lots of twists and turns, joys and sorrows. You’ll have to go read the story to see what I mean — Here you can read a teaser, or buy the whole 10,000 words for just $.99.

One of the things I loved about reporting this story was seeing how genetic technology is being integrated into the lives of people who aren’t all that interested in science. Here’s a quick video of Cheryl, for example, explaining — in fluent genetic lingo — how to compare her 23 pairs of chromosomes to someone else’s using the online service of 23andMe, a popular genetic testing company:

Thanks to genealogy hobbyists like Cheryl, genetic databases are growing larger every day. And this raises some important issues regarding privacy and ethics. It’s plausible that in the not-too-distant future, we’ll all be identifiable in genetic databases, whether through our personal contribution or that of our relatives. Is that a good thing? A bad thing?

I’ve heard a wide range of answers to these questions. A couple of months ago I asked my father’s first cousin, John Twist, who has been an avid genealogist for decades, whether he had bought any DNA tests to further his research. Genealogists tend to have a sharing mentality, so his response surprised me. He wrote:

I have NOT sent in my DNA.  I would have, perhaps, earlier, but now with the revelations of Big Brother, I just don’t want to. I read that they caught the BTK killer in Kansas City (?) through his daughters pap smear.  AND, in an article I read yesterday from MIT (?) a fellow said he’d rec’d an anonymous DNA sample and was able to identify the person who’d given it through Ancestry – well, something like that.

My own views tend to fall on the other side of the spectrum. I’m keen on the potential benefits of direct-to-consumer genetic testing, whether it’s used for estimating your medical risks or unearthing family secrets. That said, the full range of its legal and ethical implications has not yet come to light.

Dov Fox, an assistant professor of law at the University of San Diego who specializes in genetic and bioethical issues, told me that it’s only a matter of time before genetic genealogy leads to lawsuits regarding fidelity, paternity, and inheritance. But it’s unclear, for now, how the law will handle those cases.

Here in the U.S., there aren’t any federal privacy statutes that would apply, Fox says. The U.S. Genetic Information Nondiscrimination Act (GINA), passed in 2008, says that health insurers and employers cannot use an individual’s genetic information to deny medical coverage or to make employment decisions. But genetic genealogy doesn’t have anything to do with medical risks. That means lawyers will have to get creative in how they present their cases.

“What happens often with advances in science and technology is that we try to shoehorn new advances into ill-fitting existing statutes,” Fox says. So genetic genealogy cases might hinge upon laws originally written for blackmail, libel, or even peeping Tom violations.

Maybe it’s not all that surprising that genetic genealogy, a new technology, hasn’t ironed out its privacy standards yet.

“When telephones were first becoming widely adopted, you couldn’t just dial someone directly. An operator would put your call through and often listen to the call,” says J. Bradley Jansen, Director of the Center for Financial Privacy and Human Rights in Washington, D.C., and the founder of the Genealogical Privacy blog. When a technology is new, its novelty trumps any privacy worries. That was true for Facebook, too: At the beginning, everyone shared everything with abandon. “But as technologies mature, privacy, which had been a luxury, becomes an essential commodity,” Jansen says.

I hope he’s right.

And I hope you like the story, now up at MATTER.

Tiger, Tiger, Burning Bright, Just One Gene To Make It White

White tigers were first recorded in India in the 1500s, but the last wild one was shot in 1958. Still, this spectral animal thrives in captivity. Its captivating white coat and blue eyes have made it a popular mainstay of zoos, and a small number of individuals have been repeatedly bred with each other to boost captive numbers. There were just a few dozen in the 1970s. Now, there are hundreds.

The white tiger isn’t a species in its own right, or even a subspecies. Instead, it’s a mutant version of the Bengal tiger, whose orange coat has whitened thanks to an extremely rare recessive gene. If a tiger inherits two copies of this recessive variant, one from each parent, it’s white. If it has even one normal copy, it’s orange.
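The inheritance rule described above can be sketched in a few lines of Python. This is a minimal illustration that assumes simple one-gene Mendelian inheritance, as the post describes; the allele labels `W` (normal) and `w` (white variant) and the function names are invented here for the example.

```python
from collections import Counter
from itertools import product

def coat_colour(genotype):
    # A tiger is white only if both of its copies are the recessive variant "w".
    return "white" if all(allele == "w" for allele in genotype) else "orange"

def offspring_colours(parent_a, parent_b):
    """Chance of each coat colour among cubs of two parents.

    Each parent is a two-letter string of alleles, e.g. "Ww" for an
    orange carrier. Every cub gets one allele from each parent, and
    all four combinations are assumed to be equally likely.
    """
    counts = Counter(coat_colour(pair) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {colour: n / total for colour, n in counts.items()}

# Two orange carriers: three in four cubs orange, one in four white.
print(offspring_colours("Ww", "Ww"))
# Two white parents: every cub is white.
print(offspring_colours("ww", "ww"))
```

Under these assumptions, two white parents always produce white cubs, which is why breeding a small number of white individuals with each other reliably boosts captive numbers.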

Back in the 1970s, Roy Robinson suggested that the gene in question was tyrosinase (TYR). It’s involved in making melanin—a pigment responsible for black, brown, red and yellow colours. If individuals have faulty versions of TYR, they are born without melanin and have pale hair, skin and eyes—they’re called albinos.

The white tiger isn’t a true albino since it still has black pigment in its stripes and eyes. Instead, Robinson thought that it carries chinchilla—a version of the TYR gene that only removes the type of melanin behind yellow and red colours. Without this, the orange coat becomes white, but the black bits stay black. Mystery solved.

You’ll still find this explanation all over the internet, but Xiao Xu from Peking University showed that Robinson was wrong. White tigers have the same version of TYR as orange ones. They also carry identical variants of four other genes that affect the colour of mammal coats. These include MC1R, the gene responsible for the white coats of “snow coyotes” and “spirit bears”.

To find the real culprit behind the white coats, Xu’s team compared the DNA of 7 white tigers and 9 orange ones, living in China’s Chimelong Safari Park. They’re all related, and you can see their family tree below. The team sequenced the entire genomes of the three parents, identified more than 170,000 places where their DNA varied between individuals, and sequenced these locations in the rest of the animals.

Family tree of tigers involved in this study. Credit: Xu et al, 2013. Current Biology, Cell Press

Gradually, they homed in on seven genes that consistently differed in the white and orange animals. And by looking at these genes in 130 more tigers, from several unrelated sources, the team narrowed their list down to just one.

It has the tremendously catchy name of SLC45A2. It’s also involved in making melanin, although no one is entirely sure how. Variations in the gene have been linked to lighter skin or hair in mice, horses, chickens, medaka fish and humans. It’s associated with light skin colour in modern Europeans, as well as one type of albinism.

The SLC45A2 gene makes a protein of the same name, which consists of 560 amino acids. A single mutation in the gene—a change in just one DNA letter—switches one of those 560 amino acids from an alanine to a valine. This distorts the protein’s shape, and potentially prevents it from taking part in the creation of red-yellow melanin. Every white tiger has two copies of this mutated gene, and can only make the distorted protein. That’s all it takes to change their coats from orange to white.

Greg Barsh from the HudsonAlpha Institute of Biotechnology thinks that Xu’s team have found the right gene, and their results might eventually help to explain exactly what SLC45A2 does. In other species, mutations in the gene usually interfere with both the red-yellow and brown-black types of melanin. But in the tigers, they just disrupt the red-yellow pigments. Mutations in the TYR gene can sometimes do the same—remember chinchilla?—so even though Robinson was wrong about the gene behind the white coats, it’s still possible that SLC45A2 somehow interacts with TYR.

Some people have suggested that the genes behind the white coat also cause other defects, which have become more prominent because the captive animals are so inbred. These include club feet, crossed eyes, cleft palates, and hip or spine problems.

But Xu’s team argue that the white coat is the result of a pigmentation problem, and nothing more. After all, white tigers did once exist in the wild, and those that were captured or shot were often mature adults. This suggests that they’re capable of surviving in the wild despite their mutation—possibly because they hunt colour-blind prey.

Barsh disagrees. “Many humans and other animals with SLC45A2 mutations have severe visual problems,” he says, and he notes that previous studies have found abnormal visual connections between the tigers’ eyes and brains. This might explain the crossed eyes of the captive animals, and probably means that the mutation did affect the white tigers’ survival in the wild.

All of this feeds into a longstanding debate about the role that these white beasts should play in tiger conservation. Writing in Slate, Jackson Landers argues that white tigers should play no role in breeding programmes or reintroduction efforts, and should be allowed to “disappear into memory”. Every zoo enclosure that houses one is an enclosure that isn’t preserving one of the genuinely endangered tiger subspecies, whose numbers and genetic diversity are dwindling.

Xu’s team argues, based on their results, that the white tiger “should be considered a part of the genetic diversity of tigers that is worth conserving”. They argue that both white and orange tigers should be used to boost the Bengal population, and that reintroductions are possible.

It’s hard to see how their results address that issue, though. Given the past existence of wild white tigers, it’s clear that the white mutation was indeed a naturally occurring one—we just know which gene it affects now. Identifying SLC45A2 doesn’t change the fact that white tigers do suffer from several abnormalities, thanks to generations of inbreeding.

And with fewer than 3,200 tigers left in the wild, it’s perhaps a distraction to worry about conserving this one mutant gene. As John Seidensticker from the Smithsonian National Zoological Park bluntly puts it, “We have much more pressing tiger conservation problems.”

Reference: Xu, Dong, Hu, Miao, Zhang, Zhang, Yang, Zhang, Zou, Zhang, Zhuang, Bhak, Cho, Dai, Jiang, Xie, Li & Luo. 2013. The Genetic Basis of White Tigers. Current Biology. http://dx.doi.org/10.1016/j.cub.2013.04.054

Flesh-Eating Plant Cleaned Junk From Its Minimalist Genome

How much of your DNA actually does something useful? Of the 3 billion letters that make up your genome, we know that only 1.5 percent consists of genes, which carry the instructions for making proteins. Of the remaining 98.5 percent, some sequences affect how, when and where our genes are used, but the vast majority have no obvious roles. They contain the corpses of dead genes, parasitic strings of selfish DNA that have run amok, and other bits that seem to do nothing. You might call them junk.

So, what would happen if you got rid of it? If you stripped all these “non-coding” sequences from the human genome, would you still get a normal, living person? This experiment will always be a fantasy for us, for reasons of impossibility and ethics, but it’s one that some living things have unwittingly carried out.

Take the floating bladderwort. This flesh-eating water plant is a genetic minimalist, adrift in a world of hoarders. The onion, for example, has around five times more DNA than you do, with a genome that’s around 15 billion DNA letters long. The wheat genome is slightly bigger still. And even these genetic titans look positively svelte next to the record-breaking genome of the Japanese “canopy plant”—a pretty, white flower whose 150-billion-letter genome is the largest of any plant.

The bladderwort, however, has a paltry 82 million letters in its genome—40 times fewer than you, and 2,000 times fewer than the canopy plant. For comparison, the thale cress, a plant that geneticists chose to focus on for its “small” genome, has almost twice as much DNA.
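The comparisons above are simple ratios, and they can be checked in a few lines. This is only a back-of-the-envelope sketch: the sizes are the rounded figures quoted in this post, and the dictionary and function names are made up for illustration.

```python
# Approximate genome sizes in DNA letters, using the rounded figures from the post.
GENOME_SIZES = {
    "human": 3_000_000_000,
    "onion": 15_000_000_000,
    "canopy plant": 150_000_000_000,
    "floating bladderwort": 82_000_000,
}


def fold_difference(larger: str, smaller: str) -> float:
    """Return how many times bigger the first genome is than the second."""
    return GENOME_SIZES[larger] / GENOME_SIZES[smaller]


if __name__ == "__main__":
    for name in ("human", "onion", "canopy plant"):
        ratio = fold_difference(name, "floating bladderwort")
        print(f"{name}: about {ratio:,.0f} times the bladderwort genome")
```

With these inputs the human genome comes out at roughly 37 times the bladderwort's and the canopy plant's at roughly 1,800 times, which the post rounds to 40 and 2,000.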

By sequencing the bladderwort genome, Enrique Ibarra-Laclette from the CINVESTAV institute in Mexico City has shown that it’s largely junk-free. Its genes are squashed together and around three-quarters of its DNA codes for proteins. The extraneous sequences between them are few and far between. And yet, the plant survives. “Our results suggest that ‘junk’ DNA is not necessary for the function of complex organisms,” says Luis Herrera-Estrella, who led the study.

They say that you never know how important something is until it’s gone. The opposite is also true. Losing something can also make you realise just how dispensable it always was.

Credit: Bruce Salmon (Ecosphere publications, 2001)

Getting bigger to get smaller

There are over 200 species of bladderwort that are all united by their twin loves of water and flesh. The floating bladderwort (Utricularia gibba) grows in ponds and lakes, and produces yellow, orchid-like flowers. Below the surface, it captures prey with pressurised bladders that can rapidly open to suck in passing animals, including insects, tadpoles and even small fish.

When Herrera-Estrella first sequenced the bladderwort’s genome, he expected it to represent the “minimal plant genome”. He thought that the bladderwort must have pared back its genes until only the most essential ones were left. Spare copies, or genes that perform overlapping jobs, would have been ruthlessly culled.

He was wrong. The bladderwort actually has around 28,500 genes—slightly more than plants with much bigger genomes, like the grape. Stranger still, this number hides a turbulent history of expansion and contraction. The team noticed that some genes in relatives like the tomato or grape are found eight times over in the bladderwort. This means that the bladderwort duplicated its entire genome at least three times in its evolutionary history, before promptly losing many of the doubled genes. Its genome went through three cycles of getting bigger before getting smaller.
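The step from "eight copies" to "at least three duplications" is just repeated doubling. The sketch below spells it out; it is a toy calculation that assumes every duplication doubles the whole genome and ignores the gene losses that followed, and the function names are invented for this example.

```python
def copies_after_duplications(rounds: int, starting_copies: int = 1) -> int:
    """Copies of a gene after a number of whole-genome doublings (no losses)."""
    return starting_copies * 2 ** rounds


def minimum_rounds_for(copies: int) -> int:
    """Smallest number of whole-genome doublings that can yield `copies` copies."""
    rounds = 0
    while copies_after_duplications(rounds) < copies:
        rounds += 1
    return rounds


if __name__ == "__main__":
    # A gene found eight times over needs at least three doublings: 1 -> 2 -> 4 -> 8.
    print(minimum_rounds_for(8))
```

It is "at least" three because any later losses would hide extra rounds: a genome that doubled four times and then shed half of its copies would look the same.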

“There are these natural currents that underlie genome dynamics,” explains Victor Albert from University at Buffalo, who was also involved in the study. Expansion versus contraction. Duplication versus deletion. “The key to the evolution of genome size may be the extent to which these opposing forces can be tolerated by natural selection.”

On the one hand, junk or duplicated DNA could provide fuel for evolution, by allowing natural selection to tinker with sequences that aren’t already used for important roles. On the other hand, having lots of DNA is expensive in energy terms – you need to keep a leash on it, and duplicate it all whenever cells divide. If the benefits outweigh the costs, you might get a bloated onion. If it’s the opposite, you get a lean bladderwort.

Certainly, the bladderwort seems to have jettisoned most of its junk. Repetitive stretches of DNA make up just 3 percent of its genome, compared to between 10 and 60 percent in most other plants. Retrotransposons—a type of jumping gene that spreads through DNA by copying and pasting itself—dominate the genomes of most flowering plants, but have been relegated to just 2.5 percent of the bladderwort’s DNA. “The non-coding DNA must not have had a particular evolutionary benefit to the plant, or natural selection would have fought against the tendency to delete it,” says Albert.

The path to minimalism

Herrera-Estrella doesn’t know how the bladderwort ended up with its minimalist genome, but he certainly found no evidence that natural selection has specifically weeded out any non-essential DNA. Andrew Leitch, a plant geneticist at Queen Mary, University of London, finds that surprising. He notes that bladderworts live in environments that are poor in the essential element phosphorus. That’s why they eat meat—to harvest phosphorus from the bodies of animals. And since DNA’s backbone is loaded with phosphorus, it’s reasonable to think that the plant would have evolved to have less DNA, so it can make do with less of this element.

But not so. Instead, the team thinks that the answer probably involves recombination—a process where matching DNA strands swap genetic material with each other. Think of it like this: You’re filling out the missing pages of a book by using a pristine copy as a template. If those missing pages are all the same, you might accidentally skip a few while writing them out or start in the wrong place. Either way, you’d end up with a slightly shorter version of the same book. Do this enough times, and the flabby repetitive middle would dwindle away. Perhaps recombination is exceptionally sloppy in the bladderwort, leading to a natural tendency to delete its DNA.

The study puts an interesting spin on the results from the huge international ENCODE project. In a controversial paper published last year, ENCODE claimed that around 80 percent of the human genome has some “functional activity”. It either controls the activity of genes, sticks to proteins, or gets transcribed into a related molecule called RNA. Rather than a desolate junkyard, ENCODE portrayed the genome as a thriving jungle full of hidden regulators and “things doing stuff”.

But the project’s many critics argued that it had redefined “function” to the point of meaninglessness. To put it simply: just because something happens at/to a stretch of DNA doesn’t mean that stretch is important. A completely random genome would probably show a lot of such “activity”.

The bladderwort teaches this lesson well. “At least for a complex flowering plant, having a bunch of potential, hidden biological regulators in the non-coding part of the genome simply isn’t necessary,” says Albert. “If a plant can get rid of junk DNA, it is possible that the role of this junk, if any, can be achieved by other means,” says Herrera-Estrella. “Our study also generates some doubts as to whether junk DNA is as important for humans, as stated by the ENCODE initiative.”

“The study further challenges simplistic accounts of genome biology that assume functions for most or all DNA sequences, without addressing the enormous variability in genome size among plants and animals,” says T. Ryan Gregory, who studies the evolution of genome sizes at the University of Guelph.

In 2007, Gregory coined the “Onion Test” to challenge anyone who thinks that non-coding DNA isn’t junk. If that DNA is important, why is it that the onion needs so much more of it than a human, or even other closely related plants? “The Onion Test could just as easily have been called the Bladderwort Test,” he says. “If non-coding DNA is vital for gene regulation or some similar function, then how can a plant such as the bladderwort get by with so little of it?”

Reference: Ibarra-Laclette et al. Architecture and evolution of a minute plant genome. 2013. Nature http://dx.doi.org/10.1038/nature12132



“We Gained Hope.” The Story of Lilly Grossman’s Genome

One – The Twitch

It started with a slight twitch. Steve and Gay Grossman both noticed it in their daughter Lilly in 1998, when she was just one-and-a-half years old. By the time she was four, the twitches had grown into full-blown muscle tremors. They wracked her whole body at night and were painful enough to wake her up.

The family stopped sleeping properly. Lilly would wake up, shaking and crying, as often as 20 or 30 times a night. During the worst bouts, Steve and Gay took shifts to console her, one staying with her until two in the morning and the other taking over from there. “I can’t describe what it’s like to care for a baby, a young child, who’s crying and shaking all night,” says Steve.

The Grossmans have dealt with this for the last 13 years and, if anything, Lilly’s tremors became more frequent and more severe. They eventually started happening during the day. She developed muscle weakness, poor coordination and balance problems. She had to use a walker until middle-school and a motorised wheelchair thereafter. She was often very tired.

Then, in the summer of 2012, the tremors stopped. For 18 days, Lilly slept soundly through the night. So did Steve and Gay. “We had dreams again,” he says. “We had forgotten what that was like.”

This U-turn in Lilly’s fortunes was the result of a study called IDIOM, led by the father-and-daughter team of Eric and Sarah Topol at the Scripps Translational Science Institute in La Jolla, California. IDIOM stands for Idiopathic Diseases of Man—that is, “serious, rare and perplexing health conditions that defy a diagnosis or are unresponsive to standard treatments”. In other words, whatever Lilly had.

The Scripps team sequenced Lilly, Steve and Gay’s complete genomes. Amidst the morass of As, Gs, Cs and Ts, they identified the likely causes of Lilly’s mystery condition—three mutations in two different genes. One of these pointed the way to a potential treatment—a drug called Diamox that had helped another family with a fault in one of the same genes. When Lilly tried it, she gained a few weeks of sound tremor-free sleep.

“Whole-genome sequencing can change lives and maybe save some,” says Steve. “It changed ours.” It was no miracle—the tremors have returned to a lesser extent than before, and the team are pursuing new leads. But Steve and Gay never expected The Answer. They didn’t anticipate an easy cure. Genomics gave them something arguably more important—hope. It turned the nameless, unknowable ailment that had stolen years of sleep from their daughter into something tangible—a condition with a cause that can eventually be addressed. And it bought them time with Lilly.

Lilly Grossman, 3rd grade, courtesy of Steve & Gay Grossman.

Two – Not Knowing

Lilly’s life has been defined by both the condition that restricts her choices, and the smarts, tools and support that allow her to escape those restrictions. Gay recalls, “Ever since Lilly was really small, she’d be up most of the night and in the morning, I’d say, ‘Why don’t you stay with me and relax?’ And she would just cry and cry to go to school. She always wanted to be doing what the other kids were doing.”

Schools can make many children feel isolated or different, but they have always been great equalisers for Lilly. Her weak muscles and sensitivity to warm temperatures meant that, at home, she missed out when other kids played outside. At school, everyone sat and so did she. She got to use a brain that, tiredness aside, has stayed untouched by her physical symptoms. “She’s a regular teenager—smart, sarcastic, funny—and she has a grade point average of 3.5,” says Steve.

Lilly became a technophile out of necessity. Since pens and books are painful to hold, she has used laptops since kindergarten. Her voice tires easily and she hates it when people talk to her like she’s deaf or infirm; when she got her first cellphone and started sending texts, her social life blossomed. She was always good at maths but since drawing figures was taxing, she gravitated towards English and reading-heavy subjects. She now fancies herself a writer, penning pieces for her school newspaper, posts on her blog, and an online book about disability called Through My Eyes.

Through all of this, Steve and Gay have worked tirelessly to support her. Gay designs a stationery line called Letters from Lilly and spends her day “arguing with insurance companies and school bureaucracies”. Steve works at a software company with links to aerospace and defence and sources all the technology that allows his daughter to live as independently as possible. When it came time to tell Lilly’s story, Gay wrote three pages of text. Steve prepared a PowerPoint presentation.

For the longest time, the duo were bedevilled by uncertainty about Lilly’s condition and the belief that her time was slowly running out. Doctors initially diagnosed Lilly with cerebral palsy, but that wasn’t it. Next came a diagnosis of glutaric aciduria, leading to a modified diet and a lot of support groups. That wasn’t it either. The next guess was the heartbreaker: some kind of mitochondrial disease. These disorders affect the tiny bean-shaped batteries that power our cells. They vary a lot, but given the harsh and relentless nature of Lilly’s symptoms, Steve and Gay were worried. “The life expectancy for a teenager isn’t so great,” he recalls.

Printed out, Lilly’s medical records take up two four-inch binders. She’s had MRI scans, blood draws, spinal taps, skin biopsies, nerve biopsies, and a muscle biopsy. The tests hinted at a few depleted nutrients that could be fixed by supplements, but for the most part, they said the same thing: Lilly seemed normal. “We’ve been from world-class people to alternative quacks,” says Steve. No one could offer them the surety of a true diagnosis, much less a suitable treatment.

“Every birthday was a hard one—missed milestones and another reminder that we still didn’t know what’s wrong with her,” says Gay. When would the sand eventually run out? This year? The next one? “When you don’t know what you’re dealing with, and you’re up all night with your kid crying and shaking like crazy, you think: Does anyone even remember this is going on? Nobody knew what to do with us.”

Lilly, on her first Easter in La Jolla.

Three – The Study

In 2005, when Lilly was eight, the family moved from Cleveland, Ohio to the more stable climate of La Jolla, California. “If things weren’t going to get better, we thought: At least, let’s go somewhere nice, where we can be outside,” says Steve. San Diego was also a thriving hub of medical research, including many scientists who worked on mitochondrial disease. Steve and Gay dreamed of a lucky random encounter.

For years, they were stuck in a holding pattern. Then, on June 16, 2011, Gay saw the following headline while browsing through NPR: Genome Maps Solve Medical Mystery For Calif. Twins. The article told the story of Alexis and Noah Beery—twins of Lilly’s age who also had a long history of motor problems. They were also misdiagnosed with cerebral palsy before someone correctly worked out that they had a genetic disorder called dopa-responsive dystonia (DRD). It’s caused by low levels of dopamine—a chemical that carries signals between nerve cells. The twins had been taking a drug that boosts dopamine levels, which initially seemed to control their symptoms.

As they got older, Noah started getting hand tremors and his attention suffered, while Alexis developed breathing problems so severe that she needed daily adrenaline shots to stop herself from suffocating. Their parents, Retta and Joe, turned to scientists at Baylor College of Medicine, who sequenced Alexis and Noah’s full genomes. They identified a mutation in a gene called SPR, which depletes another brain chemical called serotonin. The twins started taking serotonin-boosting drugs too, and their health greatly improved.

It was the success story that the Grossmans had long anticipated. They knew that whole-genome sequencing was a possibility, and had been waiting for it to become readily available. When Gay read about the Beery twins, she thought, “Wow, this is really here.”

Good news: the Beerys lived in San Diego. Lilly met the twins, while Retta and Gay became good friends. Better news: the Beerys’ doctor was Jennifer Friedman from University of California, San Diego, who was also handling Lilly’s case. But Friedman had already talked to the Baylor researchers and found that they needed a sibling for their study. Alexis and Noah had each other. Lilly was an only child.

Then, Steve and Gay learned about the IDIOM study through a friend of Retta’s. It seemed perfect. Here was a world-class facility, practically on their doorstep, trying to solve medical mysteries of the kind that afflicted Lilly. And they wanted to sequence a trio: dad, mum and child.

Gay put together a bright pink binder, emblazoned with photos of Lilly, and full of her writing, test scores, and a DVD of all her medical records. She sent it to Sarah Topol. “I wanted to make sure that they would never forget her,” she says. She needn’t have worried; Lilly fit IDIOM’s criteria perfectly. “Her symptoms looked likely to have genetic underpinnings,” says Nicholas Schork from Scripps.

Lilly became the first child to be enrolled into IDIOM, but she kept measured expectations. On her blog, she wrote, “Scripps will keep my records for twenty years so that if they find out any new information, they will try again. And they will keep trying until they figure me out.”

The family has a mantra: It’s a marathon not a sprint. They were battle-hardened from a long road of possible fixes and disappointments. “We thought: This is great but it’s probably just going to be another data point that we add to the binder,” says Steve. “Lilly’s already had a lot of bad news in her life,” says Gay. “Her biggest fear was that we wouldn’t find anything. Not knowing would be the worst thing.”

Gay and Lilly Grossman volunteer with National Charity League, San Diego Chapter.

Four – Knowing

The IDIOM team took blood from the three Grossmans and sequenced their complete genomes, as well as their exomes—just the bits of DNA that code for proteins. (Lilly explains the process very clearly on her blog.) By comparing Lilly’s sequences against those of her parents, and cross-referencing any differences against what was known about the associated genes, the team identified just two of interest. “The list of candidates was already quite short, and none of the others made sense,” says Ali Torkamani, who led the analysis.

The first gene—ADCY5—influences dopamine’s ability to pass signals between neurons. It’s particularly active in a brain region called the striatum, which helps to plan and coordinate movements. The second—DOCK3—influences the movement of molecules within the neurons that control our movements. Mice that lack this gene entirely have uncoordinated movements and weak muscles.

Lilly had inherited a mutation in DOCK3 from Gay, but she also had one unique ‘point mutation’ in each of ADCY5 and DOCK3—a single altered DNA letter that was absent from both of her parents’ versions. The team had expected as much, since neither Steve nor Gay had any of Lilly’s symptoms. Her genetic quirks, whatever they were, were most likely unique to her, rather than family heirlooms.
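The logic of a trio comparison can be sketched with plain sets: anything in the child that appears in neither parent is a candidate fresh mutation. This is a toy illustration and not the Scripps team's pipeline. The positions and letters below are invented, only the gene names come from the story, and real analyses also have to deal with sequencing errors, two copies of every gene, and millions of harmless differences.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    position: int  # made-up coordinate, for illustration only
    letter: str    # the altered DNA letter


def split_by_origin(child: set, mother: set, father: set) -> tuple:
    """Split a child's variants into fresh (de novo) and inherited ones."""
    parental = mother | father
    de_novo = child - parental    # seen in the child but in neither parent
    inherited = child & parental  # shared with at least one parent
    return de_novo, inherited


# Hypothetical example data shaped like the result described in the post.
MOTHER = {Variant("DOCK3", 101, "T")}
FATHER = {Variant("OTHER_GENE", 5, "C")}
CHILD = {
    Variant("DOCK3", 101, "T"),  # inherited from the mother
    Variant("DOCK3", 250, "A"),  # new in the child
    Variant("ADCY5", 77, "G"),   # new in the child
}

if __name__ == "__main__":
    de_novo, inherited = split_by_origin(CHILD, MOTHER, FATHER)
    print("de novo:", sorted(de_novo))
    print("inherited:", sorted(inherited))
```

In this made-up trio, the two fresh variants fall in ADCY5 and DOCK3 and the third is shared with the mother, mirroring the three mutations in two genes that the team reported.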

The team suspects that ADCY5 accounts for Lilly’s shaking, while DOCK3 influences her balance and muscle weakness. It seems that she was born with extraordinary bad luck—a double-whammy of fresh mutations in two separate genes that conspired to produce her unique constellation of problems. “That doesn’t discount other genes having a role,” says Schork. “It’s just that these two seem to be the most logical candidates.”

They sent their results to Friedman.

By this time, a different team of geneticists at the University of Washington had identified another family with an ADCY5 mutation, where the affected members shared some of Lilly’s symptoms. They had tested a couple of different drugs and one—Diamox—had helped some of them but not others.

Diamox interferes with an enzyme called carbonic anhydrase, which helps to maintain the right pH balance in the blood. In cases where blood is too alkaline, the drug acidifies it. There’s no particular reason why it should help people with faults in ADCY5, but it has a history of being useful for movement disorders. That’s why the Seattle team tried it, and their success intrigued Friedman. She saw two options. The more “biologically pleasing” one would have been to design treatments that directly addressed the problems caused by Lilly’s defective genes. The other was to try what worked before, even without a clear rationale. Friedman considered both strategies and recommended the latter.

It was August 2012, and the Grossmans were about to go on holiday when Friedman emailed them about the results. (Steve thinks she called. Gay says she emailed. Steve taps out.) After years of nothing, they found themselves oddly unprepared for a test result that actually had a result. They went to see Friedman by themselves, and she spent two hours explaining everything. There were two genes. One suggested a possible treatment. She wanted to try it. They had options.

After the Grossmans left the meeting, they drove home in silence. Steve broke it.

“Did you hear her say ‘normal life expectancy’?”

Five – The Marathon

Lilly getting ready for homecoming dance. By Steve & Gay Grossman

“I am so happy that my genome didn’t come back all normal and say nothing is wrong,” Lilly wrote. “These next few months will be very interesting.”

She started on Diamox two weeks later. The first night was horrific. She shook and cried more than ever, but after she acclimatised to the drug, she slept soundly for 18 consecutive nights. She cut down on other medications and became more alert at school. The whole family got a boost. “When you have a long stint of all-nighters, you drop into a haze,” says Steve. “I feel smarter again.”

But this was no miracle cure. The tremors eventually returned. “The response waxes and wanes, and it’s not 100 percent clear whether the treatment is effective or not,” says Torkamani. But the Grossmans are adamant that Lilly’s dramatic improvement was no placebo effect. She still shakes herself awake, but less frequently or severely than before. The all-nighters are a thing of the past. She feels better during the day, and her platter of anti-seizure medicines and sedatives has been replaced by a small and benign set of supplements. “At least we know we found something that works,” says Steve. “Now, we just need to know how to make it work all the time.”

The knowledge is what matters, especially the fact that Lilly does not have mitochondrial disease. “We just celebrated her 16th birthday. That was the first one where we’ve known that Lilly will be here on her next one,” says Gay. “That alone was worth the sequencing. It bought us time. We always thought there wasn’t much time.” Before, they paid lip-service to the future. Now, they’re looking at colleges, jobs, and organisations that can help people with physical disabilities to transition towards independent adult life.

The Scripps team is now trying to better understand the ADCY5 variant that Lilly has, and to see if they can identify a treatment that directly addresses the problems caused by this faulty gene. Meanwhile, Steve and Gay are talking to Friedman about tweaking Lilly’s medications. Just last Friday, they decided to try a stronger dose of Diamox; Lilly only got up three times on Saturday night.

Steve and Gay are collecting as much data as possible. They use an iPad app designed to measure contractions in pregnant women to record the length and strength of Lilly’s shaking bouts. In the morning, Gay emails the data to the IDIOM team. Meanwhile, Steve’s checking out different accelerometers that Lilly could wear to collect the data automatically. “If we try a new drug and we get a, say, 2% drop in shaking one night, we can act on that,” he says.

Whole-genome sequencing doesn’t provide easy answers. For every prominent success story, like the Beery twins or Nicholas Volker, there will be tales where the path from genome to treatment meanders and backtracks. In another case, the IDIOM team identified a genetic variant in a different patient that suggested a potential treatment, but hit an impasse when the child’s physician disagreed.

Lilly had the benefit of well-educated and realistic parents, and a doctor who was savvy about genomics. Many scientists have debated how genetic results should be returned to patients but the Scripps team has a simple solution: They rely on “physician champions” like Friedman. It’s their call how to convey the results and factor them into any treatment plans. That takes time and genetics expertise, and many doctors are short on both.

“The public perception is that you send your genome for sequencing and you come back with an answer, like your cholesterol level,” says Friedman. “The reality is that there’s ambiguity in the results and their interpretation. It’s an iterative process that has to be re-examined year after year. It’s not a crystal ball; it’s a fuzzy vision of the future.”

Cinnamon Bloss, a clinical psychologist at Scripps who worked with the Grossmans, adds, “This story highlights how promising whole-genome sequencing is, but also the difficulties that have yet to be overcome.” Sequencing is getting cheaper and more powerful, but the social support that it relies upon is not easily scalable. Many stars must align. The Grossmans understand that. “Treatments aren’t going to be instantaneous or 100 percent, but they’re hope,” says Steve. “We gained hope. And the more data we have, the better position we’ll be in to figure this out.”

It’s a marathon, not a sprint.


Six – Epilogue

One Friday, last September, Steve and Gay took Lilly to meet the Scripps team. She had met Bloss and Sarah Topol, but the rest were faceless names looking out for her from afar. She brought homemade, individually-wrapped chocolates and a thank-you note for every member of the team. Here is what it said:

Dear Scientists,

Thank you for what you do every day. Without you, I would still be shaking every night and be exhausted during the day. Now that I’m sleeping, I no longer have to wear socks when I sleep for fear of scratching my legs with my toenails when I shake.

I look forward to being able to sleepover at my friends’ houses, instead of always having to invite them over because of my shaking.

School will be so much easier now that I’m sleeping as well. My family and I can’t thank you enough!



Lilly's letter


You Have 46 Chromosomes. This Pond Creature Has 15,600

Remember when encyclopaedias were books, and not just websites? You’d have a shelf full of information, packaged into entries, and then into separate volumes. Your genome is organised in a similar way. Your DNA is packaged into large volumes called chromosomes. There are 23 pairs of them, each of which contains a long string of genes. And just as encyclopaedia books are bound in sturdy covers to prevent the pages within from fraying, so too are your chromosomes capped by protective structures called telomeres.

That’s basically how it works in any animal or plant or fungus. The number of chromosomes might vary a lot—fruit flies have 8 while dogs have 78—but the basic organisation is the same.

But there’s a pond-dwelling creature called Oxytricha trifallax whose DNA is organised in a very… different… way. A team of US scientists has sequenced its genome for the first time and discovered genetic chaos. It’s like someone has taken the encyclopaedias, ripped out all the individual pages, torn some of them, photocopied everything dozens of times, and stuffed the whole lot in a gigantic messy drawer.

Oxytricha trifallax is neither animal nor plant, but a protist – part of the kingdoms of life that include amoebas and algae. Composed of just a single cell, it never gets bigger than a quarter of a millimetre in length. It swims around ponds and puddles in search of other microbes to consume, and moves by beating small hairs called cilia. These hairs give it and its relatives their group name—the ciliates.

Within its cell, Oxytricha contains two nuclei, which enclose its DNA. One of these—the micronucleus—contains the complete edition of Oxytricha’s genome, just like the single nucleus within our own cells. That’s the tidy encyclopaedia shelf. But while the material in our nucleus must be constantly decoded and transcribed so that we can live, Oxytricha’s micronucleus is largely inactive. The encyclopaedias are barely read.

Instead, it relies on a second structure called the macronucleus. That’s the messy drawer. All of the DNA in the micronucleus is copied thousands of times over, and shunted into the macronucleus. In the process, it is broken up at tens of thousands of places, rearranged, and pruned. What’s left is a collection of thousands of “nanochromosomes” that contain all the information Oxytricha needs to survive. This is the stuff that gets decoded and transcribed, used and reused while the originals gather dust.

Sequencing this almighty mess must have been a devilish task, but Etienne Swart from Princeton University rose to the challenge. Leading a team of US and Swiss scientists, he has sequenced Oxytricha’s complete macronuclear genome. Modern sequencing works by breaking genomes into small fragments, sequencing these, and assembling everything together. The DNA in Oxytricha’s macronucleus is already fragmented and extremely repetitive, making it hard to capture everything and assemble it into a coherent whole. Then again, almost three-quarters of the fragments were already complete chromosomes.

The team found around 15,600 of these nanochromosomes. On average, each is around 3,200 DNA ‘letters’ long, and around 80 percent of them contain just a single gene.

As if that wasn’t complicated enough, the genome is duplicated so extensively that there are around 2,000 copies of each nanochromosome. And around one in ten of them are broken up into even smaller fragments. So, different copies of the same nanochromosomes might just contain a small passage from the full page of information.

Our 46 chromosomes are capped by protective structures called telomeres that stop DNA from fraying, rather like the plastic tags on the end of shoelaces. All of Oxytricha’s nanochromosomes have their own telomeres, so each individual has tens of millions of these protective caps. It has, in Swart’s words, an “inordinate fondness for telomeres”. It’s like every page in its messy drawer is hard-bound.
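To see where "tens of millions" comes from, multiply the figures out. This is a rough sketch that uses only the rounded numbers quoted above. It assumes two telomeres per nanochromosome (one cap at each end) and ignores the copies that are broken into smaller fragments; the names are made up for the example.

```python
# Rounded figures from the post.
NANOCHROMOSOMES = 15_600   # distinct nanochromosomes
AVERAGE_LENGTH = 3_200     # DNA letters per nanochromosome, on average
COPIES_EACH = 2_000        # copies of each nanochromosome
TELOMERES_PER_CHROMOSOME = 2  # assumption: one cap at each end


def macronucleus_totals() -> tuple:
    """Return (distinct letters, total molecules, total telomeres)."""
    distinct_letters = NANOCHROMOSOMES * AVERAGE_LENGTH
    molecules = NANOCHROMOSOMES * COPIES_EACH
    telomeres = molecules * TELOMERES_PER_CHROMOSOME
    return distinct_letters, molecules, telomeres


if __name__ == "__main__":
    letters, molecules, telomeres = macronucleus_totals()
    print(f"distinct sequence: about {letters:,} letters")
    print(f"nanochromosome copies: about {molecules:,}")
    print(f"telomeres: about {telomeres:,}")
```

Under those assumptions there are around 31 million nanochromosome copies and around 62 million telomeres in a single cell, against 92 for our 46 chromosomes.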

As the contents of the micronucleus are copied into the macronucleus, anything that doesn’t contain instructions for making proteins—the so-called “non-coding DNA”—is ruthlessly pruned. Around 96 percent of the genome is jettisoned in this way. The remainder—the nanochromosomes—are a small fraction of the full genome, but they contain all the genes that Oxytricha needs for day-to-day existence. The only things missing are a smattering of genes that the creature needs to reproduce.

This isn’t just an academic exercise, targeted at an (admittedly cool) creature. Ciliates have a long history of teaching us about our own genomes. Another of them—Tetrahymena thermophila—taught us about the existence of telomeres in the first place, and these structures are now thought to play critical roles in ageing, cancer and other aspects of our lives. Tetrahymena also helped to show that RNA—a genetic molecule that’s related to DNA—can act as an enzyme. That’s crucial to modern theories about the origin of life itself. (And its genome was fully sequenced back in 2006, by the inimitable Jon Eisen.)

Meanwhile, Oxytricha, with its bonanza of telomeres, helped scientists to identify the proteins that stick to these caps and help to create, maintain and control them. Perhaps its bizarre genome will tell us even more about how DNA is rearranged and copied—something that happens in our genome to a less dramatic (but still important) extent.

Reference: Swart EC, Bracht JR, Magrini V, Minx P, Chen X, et al. (2013) The Oxytricha trifallax Macronuclear Genome: A Complex Eukaryotic Genome with 16,000 Tiny Chromosomes. PLoS Biol 11(1): e1001473. http://dx.doi.org/10.1371/journal.pbio.1001473


Will We Ever Fully Decipher Life’s Code?

Here’s the 12th piece from my BBC column

In 2001, the Human Genome Project gave us an almost complete draft of the 3 billion letters in our DNA. We joined an elite club of species with their genome sequences, one that is growing with every passing month.

These genomes contain the information necessary for building their respective owners, but it’s information that we still struggle to parse. To date, no one can take the code from an organism’s genes and predict all the details of its shape, behaviour, development, physiology – the collection of traits known as its phenotype. And yet, the basis of those details is there, all captured in stretches of As, Cs, Gs and Ts. “Cells know pretty reliably how to do this,” says Leonid Kruglyak from Princeton University. “Every time you start with a chicken genome, you get a chicken, and every time you start with an elephant genome, you get an elephant.”

As our technologies and understanding advance, will we eventually be able to look at a pile of raw DNA sequence and glean all the workings of the organism it belongs to? Just as physicists can use the laws of mechanics to predict the motion of an object, can biologists use fundamental ideas in genetics and molecular biology to predict the traits and flaws of a body based solely on its genes? Could we pop a genome into a black box, and print out the image of a human? Or a fly? Or a mouse?



ENCODE: the rough guide to the human genome

Back in 2001, the Human Genome Project gave us a nigh-complete readout of our DNA. Somehow, those As, Gs, Cs, and Ts contained the full instructions for making one of us, but they were hardly a simple blueprint or recipe book. The genome was there, but we had little idea about how it was used, controlled or organised, much less how it led to a living, breathing human.

That gap has just got a little smaller. A massive international project called ENCODE – the Encyclopedia Of DNA Elements – has moved us from “Here’s the genome” towards “Here’s what the genome does”. Over the last 10 years, an international team of 442 scientists have assailed 147 different types of cells with 24 types of experiments. Their goal: catalogue every letter (nucleotide) within the genome that does something. The results are published today in 30 papers across three different journals, and more.

For years, we’ve known that only 1.5 percent of the genome actually contains instructions for making proteins, the molecular workhorses of our cells. But ENCODE has shown that the rest of the genome – the non-coding majority – is still rife with “functional elements”. That is, it’s doing something.

It contains docking sites where proteins can stick and switch genes on or off. Or it is read and ‘transcribed’ into molecules of RNA. Or it controls whether nearby genes are transcribed (promoters; more than 70,000 of these). Or it influences the activity of other genes, sometimes across great distances (enhancers; more than 400,000 of these). Or it affects how DNA is folded and packaged. Something.

According to ENCODE’s analysis, 80 percent of the genome has a “biochemical function”. More on exactly what this means later, but the key point is: It’s not “junk”. Scientists have long recognised that some non-coding DNA has a function, and more and more solid examples have come to light [edited for clarity – Ed]. But many maintained that most of these sequences were, indeed, junk. ENCODE says otherwise. “Almost every nucleotide is associated with a function of some sort or another, and we now know where they are, what binds to them, what their associations are, and more,” says Tom Gingeras, one of the study’s many senior scientists.

And what’s in the remaining 20 percent? Possibly not junk either, according to Ewan Birney, the project’s Lead Analysis Coordinator and self-described “cat-herder-in-chief”. He explains that ENCODE only (!) looked at 147 types of cells, and the human body has a few thousand. A given part of the genome might control a gene in one cell type, but not others. If every cell is included, functions may emerge for the phantom proportion. “It’s likely that 80 percent will go to 100 percent,” says Birney. “We don’t really have any large chunks of redundant DNA. This metaphor of junk isn’t that useful.”

That the genome is complex will come as no surprise to scientists, but ENCODE does two fresh things: it catalogues the DNA elements for scientists to pore over; and it reveals just how many there are. “The genome is no longer an empty vastness – it is densely packed with peaks and wiggles of biochemical activity,” says Shyam Prabhakar from the Genome Institute of Singapore. “There are nuggets for everyone here. No matter which piece of the genome we happen to be studying in any particular project, we will benefit from looking up the corresponding ENCODE tracks.”

There are many implications, from redefining what a “gene” is, to providing new clues about diseases, to piecing together how the genome works in three dimensions. “It has fundamentally changed my view of our genome. It’s like a jungle in there. It’s full of things doing stuff,” says Birney. “You look at it and go: ‘What is going on? Does one really need to make all these pieces of RNA?’ It feels verdant with activity but one struggles to find the logic for it.”

Think of the human genome as a city. The basic layout, tallest buildings and most famous sights are visible from a distance. That’s where we got to in 2001. Now, we’ve zoomed in. We can see the players that make the city tick: the cleaners and security guards who maintain the buildings, the sewers and power lines connecting distant parts, the police and politicians who oversee the rest. That’s where we are now: a comprehensive 3-D portrait of a dynamic, changing entity, rather than a static, 2-D map.

And just as London is not New York, different types of cells rely on different DNA elements. For example, of the roughly 3 million locations where proteins stick to DNA, just 3,700 are commonly used in every cell examined. Liver cells, skin cells, neurons, embryonic stem cells… all of them use different suites of switches to control their lives. Again, we knew this would be so. Again, it’s the scale and the comprehensiveness that matter.

“This is an important milestone,” says George Church, a geneticist at the Harvard Medical School. His only gripe is that ENCODE’s cell lines came from different people, so it’s hard to say if differences between cells are consistent differences, or simply reflect the genetics of their owners. Birney explains that in other studies, the differences between cells were greater than the differences between people, but Church still wants to see ENCODE’s analyses repeated with several types of cell from a small group of people, healthy and diseased. That should be possible since “the cost of some of these [tests] has dropped a million-fold,” he says.

The next phase is to find out how these players interact with one another. What does the 80 percent do (if, genuinely, anything)? If it does something, does it do something important? Does it change something tangible, like a part of our body, or our risk of disease? If it changes, does evolution care?

[Update 07/09 23:00: Indeed, to many scientists, these are the questions that matter, and ones that ENCODE has dodged through a liberal definition of “functional”. That, say the critics, critically weakens its claims of having found a genome rife with activity. Most of ENCODE’s “functional elements” are little more than sequences being transcribed to RNA, with little heed to their physiological or evolutionary importance. These include repetitive remains of genetic parasites that have copied themselves ad infinitum, the corpses of dead and once-useful genes, and more.

To include all such sequences within the bracket of “functional” sets a very low bar. Michael Eisen from the Howard Hughes Medical Institute described ENCODE’s definition as a “meaningless measure of functional significance” and Leonid Kruglyak from Princeton University noted that it’s “barely more interesting” than saying that a sequence gets copied (which all of them are). To put it more simply: our genomic city’s got lots of new players in it, but they may largely be bums.

This debate is unlikely to quieten any time soon, although some of the heaviest critics of ENCODE’s “junk” DNA conclusions have still praised its nature as a genomic parts list. For example, T. Ryan Gregory from Guelph University contrasts their discussions on junk DNA to a classic paper from 1972, and concludes that they are “far less sophisticated than what was found in the literature decades ago.” But he also says that ENCODE provides “the most detailed overview of genome elements we’ve ever seen and will surely lead to a flood of interesting research for many years to come.” And Michael White from the Washington University in St. Louis said that the project had achieved “an impressive level of consistency and quality for such a large consortium.” He added, “Whatever else you might want to say about the idea of ENCODE, you cannot say that ENCODE was poorly executed.” ]

Where will it lead us? It’s easy to get carried away, and ENCODE’s scientists seem wary of the hype-and-backlash cycle that befell the Human Genome Project. Much was promised at its unveiling, by both the media and the scientists involved, including medical breakthroughs and a clearer understanding of our humanity. The ENCODE team is being more cautious. “This idea that it will lead to new treatments for cancer or provide answers that were previously unknown is at least partially true,” says Gingeras, “but the degree to which it will successfully address those issues is unknown.”

“We are the most complex things we know about. It’s not surprising that the manual is huge,” says Birney. “I think it’s going to take this century to fill in all the details. That full reconciliation is going to be this century’s science.”

Find out more about ENCODE:

So… how much is “functional” again?

So, that 80 percent figure… Let’s build up to it.

We know that 1.5 percent of the genome codes for proteins. That much is clearly functional and we’ve known that for a while. ENCODE also looked for places in the genome where proteins stick to DNA – sites where, most likely, the proteins are switching a gene on or off. They found 4 million such switches, which together account for 8.5 percent of the genome.* (Birney: “You can’t move for switches.”) That’s already higher than anyone was expecting, and it sets a pretty conservative lower bound for the part of the genome that definitively does something.

In fact, because ENCODE hasn’t looked at every possible type of cell or every possible protein that sticks to DNA, this figure is almost certainly too low. Birney’s estimate is that it’s out by half. This means that the total proportion of the genome that either creates a protein or sticks to one is around 20 percent.
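The arithmetic behind that estimate can be written out as a rough sketch. The variable names are illustrative, and reading “out by half” as “the measured figure is about half the true one” is an assumption about what Birney meant, not something ENCODE spells out.

```python
# Figures quoted in the text, as percentages of the genome.
protein_coding = 1.5   # codes for proteins
protein_binding = 8.5  # sites where proteins stick to DNA

# Lower bound: sequence that either makes a protein or sticks to one.
measured = protein_coding + protein_binding

# Assumption: "out by half" means the measured figure is roughly
# half of the true one, so the estimate doubles it.
estimated = measured * 2

print(f"measured: {measured:.1f}%, estimated: {estimated:.1f}%")
```

Doubling only the protein-binding share instead would give 18.5 percent, which still rounds to “around 20”.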

To get from 20 to 80 percent, we include all the other elements that ENCODE looked for – not just the sequences that have proteins latched onto them, but those that affect how DNA is packaged and those that are transcribed at all. Birney says, “[That figure] best conveys the difference between a genome made mostly of dead wood and one that is alive with activity.” [Update 5/9/12 23:00: For Birney’s own, very measured, take on this, check out his post. ]

That 80 percent covers many classes of sequence that were thought to be essentially functionless. These include introns – the parts of a gene that are cut out at the RNA stage, and don’t contribute to a protein’s manufacture. “The idea that introns are definitely deadweight isn’t true,” says Birney. The same could be said for our many repetitive sequences: small chunks of DNA that have the ability to copy themselves, and are found in large, recurring chains. These are typically viewed as parasites, which duplicate themselves at the expense of the rest of the genome. Or are they?

The youngest of these sequences – those that have copied themselves only recently in our history – still pose a problem for ENCODE. But many of the older ones, the genomic veterans, fall within the “functional” category. Some contain sequences where proteins can bind, and influence the activity of nearby genes. Perhaps their spread across the genome represents not the invasion of a parasite, but a way of spreading control. “These parasites can be subverted sometimes,” says Birney.

He expects that many skeptics will argue about the 80 percent figure, and the definition of “functional”. But he says, “No matter how you cut it, we’ve got to get used to the fact that there’s a lot more going on with the genome than we knew.”

[Update 07/09 23:00: Birney was right about the scepticism. Gregory says, “80 percent is the figure only if your definition is so loose as to be all but meaningless.” Larry Moran from the University of Toronto adds, “‘Functional’ simply means a little bit of DNA that’s been identified in an assay of some sort or another. That’s a remarkably silly definition of function and if you’re using it to discount junk DNA it’s downright disingenuous.”

This is the main criticism of ENCODE thus far, repeated across many blogs and touched on in the opening section of this post. There are other concerns. For example, White notes that many DNA-binding proteins recognise short sequences that crop up all over the genome just by chance. The upshot is that you’d expect many of the elements that ENCODE identified if you just wrote out a random string of As, Gs, Cs, and Ts. “I’ve spent the summer testing a lot of random DNA,” he tweeted. “It’s not hard to make it do something biochemically interesting.”

Gregory asks why, if ENCODE is right and our genome is full of functional elements, does an onion have around five times as much non-coding DNA as we do? Or why pufferfishes can get by with just a tenth as much? Birney says the onion test is silly. While many genomes have a tight grip upon their repetitive jumping DNA, many plants seem to have relaxed that control. Consequently, their genomes have bloated in size (bolstered by the occasional mass doubling). “It’s almost as if the genome throws in the towel and goes: Oh sod it, just replicate everywhere.” Conversely, the pufferfish has maintained an incredibly tight rein upon its jumping sequences. “Its genome management is pretty much perfect,” says Birney. Hence: the smaller genome.

But Gregory thinks that these answers are a dodge. “I would still like Birney to answer the question. How is it that humans ‘need’ 100% of their non-coding DNA, but a pufferfish does fine with 1/10 as much [and] a salamander has at least 4 times as much?” [I think Birney is writing a post on this, so expect more updates as they happen, and this post to balloon to onion proportions].]
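White’s point above, that short recognition sequences crop up all over the genome just by chance, can be made concrete with a little arithmetic. This is a back-of-envelope sketch, not anything from ENCODE: it assumes a random sequence with all four letters equally likely, counts one strand only, and `expected_matches` is a hypothetical helper.

```python
def expected_matches(genome_length, motif_length):
    """Expected count of one exact motif in random DNA with equal base frequencies."""
    positions = genome_length - motif_length + 1
    return positions / 4 ** motif_length

GENOME_LENGTH = 3_000_000_000  # roughly the 3 billion letters of the human genome

# Shorter motifs are expected to appear far more often by chance.
for k in (6, 8, 10, 12):
    print(k, round(expected_matches(GENOME_LENGTH, k)))
```

Under these assumptions a six-letter motif is expected hundreds of thousands of times in a genome-sized random string, which is why a chance match on its own says little about function.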

[Update 07/09/12 11:00: The ENCODE reactions have come thick and fast, and Brendan Maher has written the best summary of them. I’m not going to duplicate his sterling efforts. Head over to Nature’s blog for more.]

* (A cool aside: John Stamatoyannopoulos from the University of Washington mapped these protein-DNA contacts by looking for “footprints” where the presence of a protein shields the underlying DNA from a “DNase” enzyme that would otherwise slice through it. The resolution is incredible! Stamatoyannopoulos could “see” every nucleotide that’s touched by a protein – not just a footprint, but each of its toes too. Joe Ecker from the Salk Institute thinks we should be eventually able to “dynamically footprint a cellular response”. That is, expose a cell to something—maybe a hormone or a toxin—and check its footprints over time. You can cross-reference those sites to the ENCODE database, and reconstruct what’s going on in the cell just by “watching” the shadows of proteins as they descend and lift off.)


Redefining the gene

The simplistic view of a gene is that it’s a stretch of DNA that is transcribed to make a protein. But each gene can be transcribed in different ways, and the transcripts overlap with one another. They’re like choose-your-own-adventure books: you can read them in different orders, start and finish at different points, and leave out chunks altogether.

Fair enough: We can say that the “gene” starts at the start of the first transcript, and ends at the end of the final transcript. But ENCODE’s data complicates this definition. There are a lot of transcripts, probably more than anyone had realised, and some connect two previously unconnected genes. The boundaries for those genes widen, and the gaps between them shrink or disappear.

Gingeras says that this “intergenic” space has shrunk by a factor of four. “A region that was once called Gene X is now melded to Gene Y.” Imagine discovering that every book in the library has a secret appendix, that’s also the foreword of the book next to it.
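One way to picture these melding boundaries is as overlapping intervals. The sketch below is not ENCODE’s method: the coordinates are invented, and `merge_transcripts` is a hypothetical helper that treats any run of overlapping transcripts as a single gene-like region.

```python
def merge_transcripts(transcripts):
    """Merge overlapping (start, end) intervals into gene-like regions."""
    regions = []
    for start, end in sorted(transcripts):
        if regions and start <= regions[-1][1]:
            # Overlaps the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(r) for r in regions]

# Invented coordinates: transcripts from two separate "genes"...
known = [(100, 400), (150, 500), (900, 1300)]
# ...and a newly found transcript that bridges them.
bridging = (450, 950)

before = merge_transcripts(known)
after = merge_transcripts(known + [bridging])
print(before)
print(after)
```

With the bridging transcript added, the two regions collapse into one and the gap between them disappears, which is the effect Gingeras describes.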

These bleeding boundaries seem familiar. Bacteria have them: Their genes are cramped together in a miracle of effective organisation, packing in as much information as possible into a tiny genome. Viruses epitomise such genetic economy even better. I suggested that comparison to Gingeras. “Exactly!” he said. “Nature never relinquished that strategy.”

Bacteria and viruses can get away with smooshing their protein-encoding genes together. But not only do we have more proteins, we also need a vast array of sequences to control when, where and how they are deployed. Those elements need space too. Ignore them, and it looks like we have a flabby genome with sequence to spare. Understand them, and our own brand of economical packaging becomes clear. (However, Birney adds, “In bacteria and viruses, it’s all elegant and efficient. At the moment, our genome just seems really, really messy. There’s this much higher density of stuff, but for me, emotionally it doesn’t have that elegance that we see in a bacterial genome.”)

Given these blurred boundaries, Gingeras thinks that it no longer makes sense to think of a gene as a specific point in the genome, or as its basic unit. Instead, that honour falls to the transcript, made of RNA rather than DNA. “The atom of the genome is the transcript,” says Gingeras. “They are the basic unit that’s affected by mutation and selection.” A “gene” then becomes a collection of transcripts, united by some common factor.

There’s something poetic about this. Our view of the genome has long been focused on DNA. It’s the thing the genome project was deciphering. It is converted into RNA, giving it a more fundamental flavour. But out of those two molecules, RNA arrived on the planet first. It was copying itself and evolving long before DNA came on the scene. “These studies are pointing us back in that direction,” says Gingeras. They recognise RNA’s role, not as simply an intermediary between DNA and proteins, but something more primary.


What about diseases?

For the last decade, geneticists have run a seemingly endless stream of “genome-wide association studies” (GWAS), attempting to understand the genetic basis of disease. They have thrown up a long list of SNPs – variants at specific DNA letters – that correlate with the risk of different conditions.

The ENCODE team have mapped all of these to their data. They found that just 12 percent of the SNPs lie within protein-coding areas. They also showed that compared to random SNPs, the disease-associated ones are 60 percent more likely to lie within functional, non-coding regions, especially in promoters and enhancers. This suggests that many of these variants are controlling the activity of different genes, and provides many fresh leads for understanding how they affect our risk of disease. “It was one of those too good to be true moments,” says Birney. “Literally, I was in the room [when they got the result] and I went: Yes!”

Imagine a massive table. Down the left side are all the diseases that people have done GWAS studies for. Across the top are all the possible cell types and transcription factors (proteins that control how genes are activated) in the ENCODE study. Are there hotspots? Are there SNPs that correspond to both? Yes. Lots, and many of them are new.
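That table can be sketched in a few lines. Everything here is invented for illustration: the disease and factor labels are placeholders, and the three-SNP cut-off is an arbitrary assumption rather than the statistical test that the ENCODE team would have used.

```python
from collections import Counter

# Invented records: (disease, transcription factor whose binding
# site contains a disease-associated SNP).
snp_hits = [
    ("disease A", "factor 1"),
    ("disease A", "factor 1"),
    ("disease A", "factor 1"),
    ("disease A", "factor 2"),
    ("disease B", "factor 2"),
    ("disease B", "factor 3"),
]

# Each cell of the table counts the SNPs shared by one disease and one factor.
table = Counter(snp_hits)

# Assumption: call a cell a "hotspot" if it holds at least 3 SNPs.
# A real analysis would compare against randomly chosen SNPs instead.
MIN_SNPS = 3
hotspots = [cell for cell, n in table.items() if n >= MIN_SNPS]
print(hotspots)
```

In this toy table only one cell clears the bar; the real version has hundreds of diseases down the side and every ENCODE cell type and transcription factor across the top.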

Take Crohn’s disease, a type of bowel disorder. The team found five SNPs that increase the risk of Crohn’s, and that are recognised by a group of transcription factors called GATA2. “That wasn’t something that the Crohn’s disease biologists had on their radar,” says Birney. “Suddenly we’ve made an unbiased association between a disease and a piece of basic biology.” In other words, it’s a new lead to follow up on.

“We’re now working with lots of different disease biologists looking at their data sets,” says Birney. “In some sense, ENCODE is working from the genome out, while GWAS studies are working from disease in.” Where they meet, there is interest. So far, the team have identified 400 such hotspots that are worth looking into. Of these, between 50 and 100 were predictable. Some of the rest make intuitive sense. Others are head-scratchers.


The 3-D genome

Writing the genome out as a string of letters invites a common fallacy: that it’s a two-dimensional, linear entity. It’s anything but. DNA is wrapped around proteins called histones like beads on a string. These are then twisted, folded and looped in an intricate three-dimensional way. The upshot is that parts of the genome that look distant when you write the sequences out can actually be physical neighbours. And this means that some switches can affect the activity of far away genes.

Job Dekker from the University of Massachusetts Medical School has now used ENCODE data to map these long-range interactions across just 1 percent of the genome in three different types of cell. He discovered more than 1,000 of them, where switches in one part of the genome were physically reaching over and controlling the activity of a distant gene. “I like to say that nothing in the genome makes sense, except in 3D,” says Dekker. “It’s really a teaser for the future of genome science.”

Gingeras agrees. He thinks that understanding these 3-D interactions will add another layer of complexity to modern genetics, and extending this work to the rest of the genome, and other cell types, is a “next clear logical step”.


How will scientists actually make sense of all of this?

ENCODE is vast. The results of this second phase have been published in 30 central papers in Nature, Genome Biology and Genome Research, along with a slew of secondary articles in Science, Cell and others. And all of it is freely available to the public.

The pages of printed journals are a poor repository for such a vast trove of data, so the ENCODE team have devised a new publishing model. In the ENCODE portal site, readers can pick one of 13 topics of interest, and follow them in special “threads” that link all the papers. Say you want to know about enhancer sequences. The enhancer thread pulls out all the relevant paragraphs from the 30 papers across the three journals. “Rather than people having to skim read all 30 papers, and working out which ones they want to read, we pull out that thread for you,” says Birney.

And yes, there’s an app for that.

Transparency is a big issue too. “With these really intensive science projects, there has to be a huge amount of trust that data analysts have done things correctly,” says Birney. But you don’t have to trust. At least half the ENCODE figures are interactive, and the data behind them can be downloaded. The team have also built a “Virtual Machine” – a downloadable package of the almost-raw data and all the code in the ENCODE analyses. Think of it as the most complete Methods section ever. With the virtual machine, “you can absolutely replay step by step what we did to get to the figure,” says Birney. “I think it should be the standard for the future.”





A world of genetic diversity within a single tree

Words like “individual” are hard to use when it comes to the black cottonwood tree. Each tree can sprout a new one that’s a clone of the original, and still connected by the same root system. This “offspring” is arguably the same tree – the same “individual” – as the “parent”. This semantic difficulty gets even worse when you consider their genes. Even though the parent and offspring are clones, it turns out that they have stark genetic differences between them.

It gets worse: when Brett Olds sequenced tissues from different parts of the same black cottonwood, he found differences in thousands of genes between the topmost bud, the lowermost branch, and the roots. In fact, the variation within a single tree can be greater than that across different trees.

As Olds told me, “This could change the classic paradigm that evolution only happens in a population rather than at an individual level.” There are uncanny parallels here to a story about cancer that I wrote last year, in which British scientists showed that a single tumour can contain a world of diversity, with different parts evolving individually from one another.

I learned about Olds’ study at the Ecological Society of America Annual Meeting and wrote about it for Nature. Head over there for the details.

Photo by Born1945


Revising the polar bear’s evolutionary past… again

Earlier this year, I wrote about a new study showing that polar bears split off from brown bears around 600,000 years ago – already making them four times older than previously thought. Now, a new study pushes the date of that split back even further, to between 4 and 5 million years ago. The exact date is probably going to shift again in the future, and if anything, it’s the least interesting bit of the new paper.

Webb Miller, Stephan Schuster and Charlotte Lindqvist have taken a whirlwind look at the history of the polar bear. For a start, they sequenced its genome – that detail would be the centrepiece of other papers, but gets mentioned halfway through this one. They started looking at the genetic changes that have made polar bears lords of the Arctic, and they reconstructed the bears’ population history across the many climate upheavals they must have lived through. Finally, they found evidence that polar bears carry a lot of brown bear DNA in their genome (and vice versa) – a sure sign that the two species repeatedly bred with each other after diverging, in much the same way that our ancestors had sex with Neanderthals and other ancient humans.

I’ve written about the study for The Scientist. Head over there for the full story.

Photo by Alan Wilson


Stickleback genome reveals detail of evolution’s repeated experiment

Apathy, weary sighs, and fatigue: these are the symptoms of the psychological malaise that Carl Zimmer calls Yet Another Genome Syndrome. It is caused by the fast-flowing stream of publications, announcing the sequencing of another complete genome.

News reports about such publications tend to follow the same pattern. Scientists have deciphered the full genome of Animal X, which is known for Traits Y and Z, which could include commercial importance, social behaviour, being closely related to us, or just being exceptionally weird. By understanding X’s collection of As, Gs, Cs and Ts, we may gain insights into the genetic basis of Y and Z, which will be terribly important and there will be parties and cake.

Note the future tense. The value in sequencing yet another genome is almost never in the act itself, but in enabling an entire line of subsequent research. It’s the harbinger of news; it’s rarely news itself.

But there are exceptions. This week, there’s a paper about a new animal genome that goes the extra mile. It includes not just one full sequence, but twenty-one. It doesn’t just spell out the creature’s DNA, but also uses it to address some big questions in evolutionary biology. And its protagonist is a small, unassuming fish – the three-spined stickleback.



World within a tumour – study shows how complex cancer can be

When I used to work at a cancer charity, I would often hear people asking why there isn’t a cure yet. This frustration is understandable. Despite the billions of dollars and pounds that go into cancer research, and the decades since a war on cancer was declared, the “cure” remains elusive.

There is a good reason for that: cancer is really, really hard.

It is a puzzle of staggering complexity. Every move towards a solution seems to reveal yet another layer of mystery.

For a start, cancer isn’t a single disease, so we can dispense with the idea of a single “cure”. There are over 200 different types, each with their own individual quirks. Even for a single type – say, breast cancer – there can be many different sub-types that demand different treatments. Even within a single subtype, one patient’s tumour can be very different from another’s. They could both have very different sets of mutated genes, which can affect their prognosis and which drugs they should take.

Even in a single patient, a tumour can take on many guises. Cancer, after all, evolves. A tumour’s cells are not bound by the controls that keep the rest of our body in check. They grow and divide without restraint, picking up new genetic changes along the way. Just as animals and plants evolve new strategies to foil predators or produce more offspring, a tumour’s cells can evolve new ways of resisting drugs or growing even faster.

Now, we know that even a single tumour can be a hotbed of diversity. Charles Swanton from Cancer Research UK’s London Research Institute discovered this extra layer of complexity by studying four kidney cancers at an unprecedented level of detail. He showed that the cells from one end of the tumour can have very different genetic mutations to the cells at the other end.

These are not trivial differences. These mutations can indicate a patient’s prognosis, and they can affect which drugs a doctor decides to administer. The bottom line is that a tumour is not a single entity. It’s an entire world.



OpenLab: The Renaissance Man, and how to become a scientist over and over again

I originally wrote this feature about the amazing Erez Lieberman Aiden back in June. It’s been one of the most popular posts on Not Exactly Rocket Science over the past year, and it was recently nominated for inclusion in the latest edition of Open Lab, the anthology of the world’s best science blogging. For that reason, I’m giving it another airing.


Erez Lieberman Aiden is a talkative, witty fellow, who will bend your ear on any number of intellectual topics. Just don’t ask him what he does. “This is actually the most difficult question that I run into on a regular basis,” he says. “I really don’t have anything for that.”

It is easy to understand why. Aiden is a scientist, yes, but while most of his peers stay within a specific field – say, neuroscience or genetics – Aiden crosses them with almost casual abandon. His research has taken him across molecular biology, linguistics, physics, engineering and mathematics. He was the man behind last year’s “culturomics” study, where he looked at the evolution of human culture through the lens of four per cent of all the books ever published. Before that, he solved the three-dimensional structure of the human genome, studied the mathematics of verbs, and invented an insole called the iShoe that can diagnose balance problems in elderly people. “I guess I just view myself as a scientist,” he says.

His approach stands in stark contrast to the standard scientific career: find an area of interest and become increasingly knowledgeable about it. Instead of branching out from a central speciality, Aiden is interested in ‘interdisciplinary’ problems that cross the boundaries of different disciplines. His approach is nomadic. He moves about, searching for ideas that will pique his curiosity, extend his horizons, and hopefully make a big impact. “I don’t view myself as a practitioner of a particular skill or method,” he tells me. “I’m constantly looking at what’s the most interesting problem that I could possibly work on. I really try to figure out what sort of scientist I need to be in order to solve the problem I’m interested in solving.”

It’s a philosophy that has paid dividends. At just 31 years of age, Aiden has a joint lab at MIT and Harvard. In 2010, he won the prestigious $30,000 Lemelson-MIT prize, awarded to people who show “exceptional innovation and a portfolio of inventiveness”. He has seven publications to his name, six of which appeared in the world’s top two journals – Nature and Science. His friend and colleague Jean-Baptiste Michel says, “He’s truly one of a kind. I just wonder about what discipline he will get a Nobel Prize in!”