An interview with Uri Simonsohn, the data sleuth behind the Smeesters psychology misconduct case

Last week, I wrote about the case of Dirk Smeesters, a social psychologist who had resigned from Erasmus University Rotterdam after an investigation uncovered problems with the data in two of his papers. His case follows the scandal of Diederik Stapel, another psychologist from a Dutch university who was found guilty of research fraud last year.

As I noted in my post, the Smeesters case is unique. While earlier cases of fraud in psychology, including those of Stapel, Karen Ruggiero and Marc Hauser, were uncovered by internal whistleblowers using inside information, Smeesters was found out by an external source using statistical sleuthing. At the time of writing, that source was anonymous, but on Thursday he was revealed to be Uri Simonsohn, another social psychologist, this one from the University of Pennsylvania.

I’ve now interviewed Simonsohn for Nature News about how he started his investigation, his motives, the fallout, and more. Go and read that first. What follows is bonus material, including a few specific points I wanted to highlight, and some quotes that were cut for length:

Not a witch hunt. Several people have speculated that this might have been a grudge match, and that the then-anonymous whistleblower was someone familiar with Smeesters who was out to hurt him. According to Simonsohn, that is not the case. His investigation started “by chance” because a colleague sent him a paper by Smeesters and he thought the data looked too good to be true.

Technique will be published soon. Simonsohn’s statistical tool is still undescribed, but he will be submitting it for publication soon. He says that it is simple to use, could be applied broadly across other sciences, and works particularly well for small sample sizes. He doesn’t want to comment on existing attempts to reverse-engineer his method from the Erasmus University committee report.

More misconduct afoot. There could be other cases of misconduct that we do not yet know about. Simonsohn has identified four. One was Stapel (after the fact). The second was Smeesters. A third has apparently been investigated but has not yet been made official. No one is doing anything about the fourth case: Simonsohn is convinced that data have been fabricated, but the suspect’s co-authors have not been willing to help him, and he doesn’t have the time to pursue it himself. There’s a fifth case that’s more ambiguous. “I wouldn’t bet money the papers are true, but I’m not sufficiently convinced to do something about it,” says Simonsohn.

What did Smeesters actually do? Smeesters claims that he was only doing things that are common practice among psychological researchers, including leaving out outliers or people who did not read the instructions carefully. Simonsohn says that uncovering what Smeesters did was a job for the university, but adds, “His data aren’t consistent with dropping outliers, or dropping people who don’t understand instructions. [And] when I contacted Smeesters, he never mentioned the possibility that he deleted outliers. He said that he might have incorrectly entered data and agreed to retract his papers and re-run the study.”

How common is this? Simonsohn says that “it’s really hard to have an informed opinion” of how common such practices are in psychology or any other science. However, he is concerned by how easily he came across his cases. “I wasn’t looking. They landed on my desk without any explicit plan to find them.”

Why bother? Simonsohn does worry about how these activities will be perceived. “Whenever someone gets notoriety, people make inferences about their motives. It’s not hard to come up with bad motives for what I’m doing,” he says. I asked him what his motive was. “Simply that it was wrong to look the other way,” he said.

Could this trap the innocent? Simonsohn is aware of, and worried about, the possibility of pointing the finger at an innocent party. If he finds one dodgy paper, he always looks for at least two more before contacting someone. “I also proceed with extreme caution. I first came across Smeesters’ paper 9 months ago, and I had cordial correspondence with him for months. The empirical analyses are only the first step.”
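
To see why that multiple-paper rule is such a strong safeguard, here is a quick back-of-envelope sketch in Python. The 5% false-flag rate is a number I have invented for illustration, and the papers are treated as independent, which is a simplification:

    # Why demanding several flagged papers protects the innocent.
    # The 5% false-flag rate is an assumed, illustrative number --
    # Simonsohn has not published one.
    false_flag_rate = 0.05  # assumed chance an honest paper looks dodgy
    threshold = 3           # papers flagged before anyone is contacted

    # Chance that one honest researcher trips the test on all three
    # papers, treating the papers as roughly independent:
    p_innocent_flagged = false_flag_rate ** threshold
    print(f"{p_innocent_flagged:.6f}")  # 0.000125, i.e. about 1 in 8,000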

And finally, for some context, here’s my feature for Nature about psychology’s problems with replication, and what people are trying to do about it.

Comments

  1. Dave Nussbaum
    July 3, 2012

    Thanks for covering this Ed!

    I wanted to comment on the whistleblower aspect of the story, because I think it’s very important. First, I think we place a large burden on people like Uri, which may be inevitable to some degree but is still not really fair, and we all owe him our gratitude for taking something like this on.

    More importantly still, as you note, there has to be a lot of concern about getting one’s accusations wrong, especially publicly. The steps that Uri takes to minimize that possibility are very important and should be applauded. Contacting the researcher before publication is obviously important, as is the fact that he contacted the relevant authorities instead of pursuing justice on his own. Also, repeating his analysis on other papers by the same author makes false-positive accusations far less likely.

    It necessarily also means that there will be more cases that are missed, or dismissed as too ambiguous to pursue. I think that’s a trade-off we have to make at one point or another, and I’m glad that we’re starting at a point that’s reasonably conservative.

    I’d also like to support Uri’s call that journals (or authors) publish data. There are certainly exceptions where this is a little trickier, but now that we can post data on the internet, there are many fewer excuses for not being transparent.

  2. Aaron Sheldon
    July 3, 2012

    The technique for detecting outlier exclusion appears to rely on catching the correlations introduced by order statistics.

    Wikipedia is helpful: http://en.wikipedia.org/wiki/Order_statistic

    It presents the joint distribution of order statistics but does not show the moments, which would be nice for understanding this fraud detection algorithm.
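
    As a rough illustration of that idea (purely my own speculation, since the actual algorithm is unpublished), here is a small NumPy simulation of the fingerprint that silently dropping extreme values leaves behind:

        # Speculative sketch: what undisclosed outlier-dropping does to
        # reported statistics. This only guesses at the flavour of the
        # method -- the real algorithm has not been published.
        import numpy as np

        rng = np.random.default_rng(0)

        def reported_variance(n, k):
            """Draw n standard-normal values, silently drop the k most
            extreme (relative to the sample mean), and report the sample
            variance of what is left."""
            x = rng.normal(size=n)
            keep = np.argsort(np.abs(x - x.mean()))[: n - k]
            return x[keep].var(ddof=1)

        n, reps = 20, 10_000
        honest = np.mean([reported_variance(n, 0) for _ in range(reps)])
        trimmed = np.mean([reported_variance(n, 2) for _ in range(reps)])
        print(f"honest : {honest:.3f}")   # close to 1.0
        print(f"trimmed: {trimmed:.3f}")  # well below 1.0

        # The dropped points are extreme order statistics, so the survivors
        # are too tightly clustered. Across many studies, variances that are
        # consistently too small for the reported design are exactly the
        # kind of red flag being discussed here.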

  3. Ed Yong
    July 3, 2012

    From our chat, I understand that Simonsohn intends to release the details of this once he submits his paper, which should be within the next few weeks, or even at the end of this one. So people who are itching to see the method shouldn’t have to wait long.

  4. Aaron Sheldon
    July 3, 2012

    arXiv?

  5. bsci
    July 3, 2012

    Simonsohn says the method is noisy, & he wouldn’t make an accusation unless he found problems in 3 papers from a person. He also implies that journals might want to use methods like his to detect fraud before publication. While I agree with better policing by journals, the idea of an immense number of false positive fraud investigations at the pre-publication stage is terrifying. I assume the fraud algorithms would get more accurate over time, but, if one knows the algorithms used by journals, it would also be easier to evade them.
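
    To put rough numbers on that worry (every figure below is an assumption for the sake of argument, not something from the article):

        # Illustrative base-rate arithmetic; all numbers are assumed.
        submissions = 10_000        # papers screened per year (assumed)
        fraud_rate = 0.01           # true prevalence of fabrication (assumed)
        sensitivity = 0.80          # share of real fraud the screen catches (assumed)
        false_positive_rate = 0.02  # share of honest papers flagged anyway (assumed)

        frauds = submissions * fraud_rate            # 100 fraudulent papers
        honest = submissions - frauds                # 9,900 honest papers
        true_flags = frauds * sensitivity            # 80 real cases caught
        false_flags = honest * false_positive_rate   # 198 honest papers flagged

        share_false = false_flags / (true_flags + false_flags)
        print(f"{share_false:.0%} of flagged papers would be honest")  # ~71%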

    I like the idea of pushing more raw data public, but the issues surrounding useful data curation & storage and volunteer privacy are non-trivial in many fields.

  6. Tim Smits
    July 6, 2012

    I am very intrigued by this whole story, having been a colleague of Dirk’s when we were both PhD students. So thanks for this “bonus interview material”.

    One thing I have been thinking about, in terms of how we should make progress from this point onward, is indeed putting some of the responsibility in the hands of the journals. Why not demand that top journals do two things with the very high subscription fees we have to pay them:
    – provide a mandatory data repository service for all data published in the journal.
    – give grants to junior researchers or prizes to master students that perform replication studies, written down in white papers that are also linked to the digital record of the published papers.

    From academia, we could then follow up on this. If replication papers were archived in an online repository linked to the original article, this would have two benefits:
    1. We would be more informed on the true value of a published study
    2. We could decide to have these replication papers count to some degree in junior researchers’ track records, so that it is actually good (and valued) training to start by replicating prior studies.
