In 2008, a team of psychologists from the University of Michigan apparently found a simple memory task that could boost intelligence. They asked volunteers to watch a sequence of symbols while listening to a series of letters. Holding both streams of information in their heads, they had to say if the current symbol or letter matched the one presented n steps earlier. This memory-based “dual n-back” task seemed to improve the volunteers’ fluid intelligence—a general ability to solve problems that goes well beyond mere memory. The team said that their study opened up “a wide range of applications”.
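The matching rule at the heart of the task is simple to state. Here is a minimal sketch in Python (the stimulus streams and n = 2 are invented for illustration, not taken from the Michigan study):

```python
def n_back_matches(stream, n=2):
    """For each position, report whether the stimulus
    matches the one presented n steps earlier."""
    return [i >= n and stream[i] == stream[i - n]
            for i in range(len(stream))]

symbols = ["▲", "●", "▲", "■", "▲"]   # visual stream
letters = ["C", "K", "C", "K", "T"]   # auditory stream

# Volunteers must track both streams at once and flag
# repeats in either one.
visual = n_back_matches(symbols)  # [False, False, True, False, True]
audio = n_back_matches(letters)   # [False, False, True, True, False]
```

The difficulty comes not from the rule itself but from holding two independent streams in working memory at the same time, which is why the task is thought to tax fluid abilities.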
Walter Boot from Florida State University and Daniel Simons from the University of Illinois at Urbana-Champaign disagree. They think the study had a critical weakness: it compared the people who did the n-back task with a control group who did nothing. Those who did the memory training may have expected to gain a temporary boost in intelligence, memory or mental abilities. Those who sat and waited wouldn’t have expected anything.
For decades, we’ve known that our expectations can wield a huge influence over our behaviour and our bodies. This is why new medicines are tested in double-blind randomised trials, where neither doctors nor patients know who’s getting a drug and who’s just getting a placebo. If the trials weren’t blinded, the patients who got the drug might get better simply because they expected to get better—the infamous placebo effect.
Running double-blind studies is easy enough when your medicine looks the same as a saline drip, but it’s usually impossible in psychology. “Psychology interventions aren’t like pills,” says Simons. “If you’re receiving an experimental treatment for depression, you know that you’re receiving treatment.” Or if you’re doing the dual n-back task, you know you’re doing a memory task.
That’s the problem: it’s hard to run these interventions without revealing your hand to volunteers. The even bigger problem, according to Boot and Simons, is that psychologists have largely dealt with this issue by sweeping it under the rug. “There’s a standard that’s really flawed,” says Simons. “Everyone knows the reason for double-blind designs. If you don’t have those, you have to control for the things they control for, and we’re not. We need to do better.”
Now, together with colleagues Cary Stothart and Cassie Stutts, Boot and Simons have published a paper—more a manifesto, really—that outlines their gripes. (You can read the full details of their project here, including an FAQ and a blog post.)
They first realised the scale of this problem when they looked at a long line of studies on the benefits of video games. Since 2003, many studies have shown that people who play action video games—mostly shooters like Unreal Tournament—have better attention and visual skills than people who play more sedate games like Tetris or The Sims. (This TED talk summarises the results.)
Boot and Simons failed to replicate a few of these classic results but, in doing so, they also noticed that none of these studies examined the volunteers’ expectations. Do the action gamers expect to do better in the tasks that they eventually do better in, compared to the slow-paced gamers? It’s a simple question, but no one was asking it, much less had answers.
The team collected some preliminary data of their own. They surveyed 400 people who watched a video of a game (Unreal Tournament or Tetris) and a second video of a mental test, of the sort used in earlier gaming studies. Did the volunteers think they would do better at the test if they had played the game? Yes, but selectively so. People who watched the action game expected to do better at vision and attention tasks, while those who watched Tetris expected to improve at mental rotation. And that’s exactly what previous studies found.
“We’ve only shown that expectations line up with the improvements people have seen, but not that they drive these effects,” says Simons. “But this shows the original results are inconclusive. The point is that we just don’t know.”
The team picked on video game studies not because they’re weak, but because they’re actually some of the best ones around. Their control groups did something comparable, unlike those in the n-back study who sat around and did nothing. But that’s still not good enough, says Simons. “People assume that all active control groups are placebo controls, which is nonsense.”
The same problems emerged when the team looked at a broader range of psychological studies, from psychotherapy to brain-training. “I’ve been collecting a list of every intervention done since the start of 2013, and have yet to find a single one in psychology or education that adequately controls for expectations,” says Simons. “The great irony of this is that psychologists are the ones responsible for demonstrating the power of expectancy effects!”
“This is a very important point that tends to be systematically shoved aside even in carefully designed studies,” says Axel Cleeremans from the Free University of Brussels. “Participants in any study will always attempt to consciously figure out what one wants from them and how they should behave.”
“It’s something the field desperately needed to be reminded of,” says Randall Engle from Georgia Institute of Technology. It’s not just subjects either. Without double-blind studies, experimenters can skew the results of experiments because they expect their subjects to behave in a certain way. “We’re all vulnerable to expectancy effects, because we have a strong vested interest in finding something interesting,” says Engle.
But Torsten Schubert from the Humboldt University of Berlin, who studies video games, thinks the problem is overrated. He also says the team’s views are inconsistent with their own past research. For example, in 2008, they showed that an action game (Medal of Honor: Allied Assault) doesn’t improve short-term memory or attention-switching compared to a puzzle game (Tetris). Fair enough, but they didn’t control for expectancy either. How could they find no effect “if expectancy is a strong factor influencing the results of training studies?” asks Schubert.
Aboard the brain train
Boot and Simons aren’t saying that studies are rubbish if they don’t account for expectations. What bothers them is the mismatch between the methods being used and the claims that are made on the back of those methods.
Consider the growing number of ‘brain-training’ companies, which purport to improve general mental abilities through simple tasks. Some of these cite studies that support their causal claims, including the Michigan dual n-back experiment. But Simons says, “None of these do an adequate job of backing up the claims. The control is either doing nothing or doing a crossword, which is inadequate. These studies are being published in top journals and affecting public discourse.”
Engle agrees. Customers for brain-training programmes range from schools to intelligence agencies, and he doubts that they will get any braininess for their buck. “If this was some ivory tower effect, I wouldn’t worry so much about it, but it’s something that has real societal importance,” he says.
Dealing with the problem is easier said than done, especially since the gold standard of a double-blind trial is unreachable. But the team says that some “silver-standard” options might do. Psychologists could actually measure expectations, as Boot and Simons did in their quick survey. Then, at least, they could adjust their final results. Better still, they could use expectation surveys to help design studies in the first place. For example, in the dual n-back study, the ideal control task would be something that people would also expect to improve general intelligence, but that doesn’t rely on memory in the same way.
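One way of reading “adjust their final results” is statistically: record each volunteer’s surveyed expectation of improvement and treat it as a covariate when estimating the training effect. A toy sketch of that idea, assuming numpy and entirely invented numbers (this is not the team’s analysis, just an illustration of covariate adjustment):

```python
import numpy as np

# Invented data: 100 volunteers, half trained and half controls.
# Test scores are driven purely by expectations (1-5 survey rating)
# plus noise; training itself does nothing.
rng = np.random.default_rng(0)
n = 100
trained = np.repeat([0, 1], n // 2)             # 0 = control, 1 = training
expectation = rng.integers(1, 6, size=n)        # surveyed expectation, 1-5
score = 50 + 2 * expectation + rng.normal(0, 5, size=n)

# Ordinary least squares: score ~ intercept + trained + expectation.
X = np.column_stack([np.ones(n), trained, expectation])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)

# coef[1] is the training effect after adjusting for expectations;
# in this simulation it should hover near zero, because the apparent
# improvement is carried entirely by the expectation covariate.
```

A study that skipped the survey and compared raw group means here could easily mistake an expectation difference for a real training benefit, which is exactly the confound Boot and Simons are worried about.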
Psychologists can also deliberately manipulate the expectations of their volunteers—something that doctors would struggle to do ethically. They could tell some volunteers (in both the intervention and control groups) that they’d expect to see a benefit, while telling the rest that nothing should happen. They could tell people that they’d only see benefits after a certain amount of training, and test them before and after this point.
“It is of no doubt that expectancy can play a role in training studies but every one of the methods proposed by [the team] has its minuses,” says Schubert. A mix of techniques might be best, but that would greatly increase the money and time needed for a study. Why go to such extremes when the field is still so young, and researchers are still arguing over whether the effects they’re seeing are real at all? “They’re hanging a heavy stone on a new promising research area, the potential of which is currently not yet known,” says Schubert.
The team is aware of the realities of cost, time, and the high bar that they’ve set. “I know we’ve struck a fairly negative tone because we want to alert people to this issue,” says Simons. “We want to make sure that reviewers and editors ask the right questions, and encourage people to take steps to remedy these issues. Psychology has always led the way in controlling these sorts of problems.”
Reference: Boot, Simons, Stothart & Stutts. 2013. The Pervasive Problem With Placebos in Psychology: Why Active Control Groups Are Not Sufficient to Rule Out Placebo Effects. Perspectives on Psychological Science http://dx.doi.org/10.1177/1745691613491271