Welcome To The Era of Big Replication
Psychologists have been sailing through some pretty troubled waters of late. They’ve faced several cases of fraud, high-profile failures to repeat the results of classic experiments, and debates about commonly used methods that are recipes for sexy but misleading results. The critics, many of whom are psychologists themselves, say that these lines of evidence point towards a “replicability crisis”, where an unknown proportion of the field’s results simply aren’t true.
To address these concerns, a team of scientists from 36 different labs joined together, like some sort of verification Voltron, to replicate 13 experiments from past psychological studies. They chose experiments that were simple and quick to do, and merged them into a single online package that volunteers could finish in just 15 minutes.
They then delivered their experimental smorgasbord to 6,344 people from 36 different groups and 12 different countries.
This is Big Replication—scientific self-correction on a massive scale.
I’ve written about this “Many Labs Replication Project” over at Nature News, so head over there for more details and viewpoints from the psychological community. The project was coordinated by Richard Klein and Kate Ratliff from the University of Florida, Michelangelo Vianello from the University of Padua, and Brian Nosek from the Center for Open Science.
First, 10 of the 13 effects replicated. That’s certainly encouraging after months of battering.
One of the 13 was on the fence—the “imagined contact” effect, where imagining contact with people from other ethnic groups reduces prejudice towards them. It’s hard to say whether this is real or not.
And two of the 13 effects failed to replicate outright. Both are recent studies involving social priming, the field in which subtle and subconscious cues supposedly influence our later behaviour. In one, exposure to a US flag increases conservatism among Americans; in the other, exposure to money increases endorsement of the current social system.
For Nosek, personally, the results are a mixed bag. Two of his own effects were in the mix and they both checked out. Many classics in the field are robust. This is all good. But a lot of Nosek’s own work involves social priming, and the fact that this sub-field regularly (but not always) stumbles in the replication gauntlet is troubling to him. “This has been difficult for me personally because it’s an area that’s important for my research,” he says. “But I choose the red pill. That’s what doing science is.”
But he and others I spoke to also urge caution. This is neither a “te absolvo” for the field, nor a final damnation of social priming. The team chose the 13 effects arbitrarily, to represent a range of different psychological studies from different eras. It doesn’t mean that 10 out of every 13 effects will replicate, nor that 2 out of every 2 social priming ones will flunk. It’s not systematic. (Nosek, incidentally, is also leading a systematic check of reproducibility in psychology, in which more than 150 scientists are repeating every study published in four journals in 2008. The man is front and centre in this debate.)
To focus too much on the results would miss the point. The critical thing about the Many Labs Project is its approach.
Replications are really important, and there aren’t enough of them in psychology. But single, one-off replications can add more heat than light. If you can’t replicate an earlier study, the knee-jerk reaction is to say that the original was flawed. Alternatively, you could be incompetent. Or you could have changed the original experiment in important ways. Or your new study may be too small. Or you might have studied a completely different group of people. The authors of the original study can always hit back with these objections, and they would not be wrong to.
So, some scientists run meta-analyses: big mega-studies where they look at the results of past experiments and tease out the overall picture. If one replication attempt is inconclusive, what do all of them say together? But meta-analyses also have flaws. If researchers don’t publish their failed replications (and until recently, it was really hard to), the meta-analysis will be badly skewed. And if everyone used slightly different methods, the results will still be inconclusive.
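To see why unpublished failures skew a meta-analysis, here is a rough sketch in Python. It is a toy simulation, not anything the Many Labs team ran: the function names, the sample sizes, and the "publish only impressive positive results" rule are all assumptions made up for illustration. Every simulated study tests an effect whose true size is zero, and we compare the average across all studies with the average across only the "published" ones.

```python
import random
import statistics


def simulate_study(true_effect, n, rng):
    """Return the observed mean difference for one study with n people per group."""
    treatment = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return statistics.mean(treatment) - statistics.mean(control)


def run(num_studies=2000, n=20, true_effect=0.0, seed=1):
    """Simulate many small studies and apply a hypothetical publication filter."""
    rng = random.Random(seed)
    effects = [simulate_study(true_effect, n, rng) for _ in range(num_studies)]

    # Assumed filter: a study is "published" only if its effect is positive
    # and large enough to look significant (about 1.96 standard errors).
    threshold = 1.96 * (2.0 / n) ** 0.5
    published = [e for e in effects if e > threshold]

    all_mean = statistics.mean(effects)
    published_mean = statistics.mean(published) if published else None
    return all_mean, published_mean, len(published)


if __name__ == "__main__":
    all_mean, published_mean, count = run()
    print("average effect, all studies:      ", round(all_mean, 3))
    print("average effect, published studies:", published_mean)
    print("studies published:", count)
```

The average over all studies sits close to zero, as it should. The average over the published subset is necessarily above the cut-off, so a meta-analysis that only sees those studies would report a healthy effect where none exists.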
The Many Labs project has none of these problems because, as Daniel Simons told me, it is a planned meta-analysis. They did many checks all at once, and nothing was hidden away if it didn’t “work”. They consulted with the original authors where possible. They ran the exact same experiment on all of their different samples. They tested a far larger group of people than any of the original experiments (and replication attempts generally need to be bigger than the original studies that they’re checking). And they pre-registered their methods: everything was agreed before a single volunteer was recruited, leaving no room for sneaky data-massaging.
The result is a definitive assessment of the 13 effects. The priming ones didn’t check out. At the other extreme, Nobel laureate Daniel Kahneman comes out of this very well. His classic anchoring effect, in which the first piece of information we get can bias our later decisions, turns out to be much stronger than he estimated in his original experiments.
The Many Labs sample was also diverse, which tells us whether the effects being scrutinised are delicate flowers that only blossom in certain situations, or robust blooms that grow everywhere. This is important, because some psychologists like Joe Cesario from Michigan State University have argued that effects like social priming ought to vary in different contexts, or across different individuals.
I contacted Cesario, and he clarified: “At no point did I make the claim that all effects, or even all priming effects, will vary by laboratory, region, etc. The point was to appreciate the possibility that some priming effects might vary by underappreciated context variables… Absent cross-lab replication, priming researchers cannot make extreme claims about the widespread nature of priming effects.”
In the Many Labs Project, none of the 13 effects varied according to the nationality of the participants, or whether they did the experiments online or in a lab. Kahneman’s work checked out everywhere, and the priming studies failed everywhere. Cesario adds, “The ManyLabs project correctly tells us that [the two effects that did not replicate] aren’t really effects that we as a discipline should care about because they have no generalizability beyond that unique situation.”
It is very telling that everyone I spoke to praised the initiative, including the authors whose work did not replicate. There was none of the acrimony that has stained past debates. When something is done this well, it’s pretty churlish to not accept the results.
This is a harbinger of things to come.
Simons is coordinating a similar multi-lab replication attempt of Jonathan Schooler’s verbal overshadowing effect, in which verbally describing something like a face impairs our recognition of that thing. The effect has been famously tricky to repeat, and Simons says “Our goal is to measure the actual effect size as accurately as possible by conducting the same study in many laboratories.” The results will be published in the journal Perspectives on Psychological Science next spring. “This multi-lab paper provides a preview of what I hope will become a standard approach in psychology.”