“How We’re Duped by Data” and how we can do better

Richard Juster points us to this press release by Laura Counts from the business school of the University of California, promoting the work of Leif Nelson, one of the authors of the modern classic paper on “false-positive psychology” and “researcher degrees of freedom.”

It’s great to see this sort of work get positive publicity. I actually would’ve liked to see some skepticism about the skepticism in the press release, maybe a quote by someone saying that Nelson et al. have gone too far, but given that this sort of scientist-as-hero reporting is out there, I’m glad that it’s being wielded in the service of what I consider to be good science.

Just to be clear, I’m fully in support of Nelson’s work, and I suspect I’d strongly disagree with any criticism taking the this-science-reform-business-has-gone-too-far position. But on general principles, I’d advise reporters and publicists to present a more rounded view.

Just by analogy, if you were writing an article about someone who’s promoting evolution, the appropriate alternative perspective might be, not to interview a creationist, but to interview someone who works on evolution and can talk about open problems.

I’m not quite sure what’s the right alternative perspective to offer alongside Nelson here. I wouldn’t recommend they interview Brian Wansink or Robert Sternberg! Maybe step back and consider someone whose research ideas are orthogonal to the open-science movement, someone who’s not opposed to open science but thinks there are more important issues to consider. For example, someone like Nancy Cartwright or Angus Deaton, who have argued that experimental methods are overrated and that social science needs stronger theory. That could be a good countervailing perspective that Nelson could then react to, and it would make for a stronger article.

Don’t get me wrong: I like this press release, and I also know that people who write these things are busy. It’s just interesting to think about this going forward, about how to offer multiple perspectives when writing about a controversial topic.

“Data 101 for Leaders: Avoid Cherry-Picking”

In a sidebar, the article offers advice from business-school professors Leif Nelson and Don Moore, “questions when someone presents you with data they claim proves something”:

1. How did you decide how much data you would collect?

2. Did any other analyses yield different results?

3. Did you measure any other variables worth discussing?

4. Did these results surprise you or were they expected at the outset?

This is fine general advice, but I also have some concerns with which regular blog readers will be familiar.

First, my problem with the advice to “avoid cherry-picking” is that it’s possible to distort your results by following forking paths without ever consciously “cherry-picking” or “p-hacking.” The other problem with this advice is that it can instill a false sense of security, if a researcher thinks, “I didn’t do any cherry-picking, therefore I’m ok.” Remember, honesty and transparency are not enough. I fear that talk of “cherry-picking” can make people think the problem is with other, bad researchers.
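The forking-paths point is easy to demonstrate by simulation. Here is a minimal sketch (with made-up analysis choices: a full-sample comparison, two subgroup splits on an irrelevant covariate, and an outlier-trimmed comparison): on pure noise, a researcher who in good faith reports whichever of these reasonable-looking analyses comes out significant will exceed the nominal 5% false-positive rate, without ever consciously fishing.

```python
import random
import math

def t_pvalue(xs, ys):
    # Two-sample Welch test, p-value via the normal approximation
    # (adequate for the moderate sample sizes used below).
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    t = (mx - my) / math.sqrt(vx / nx + vy / ny)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

random.seed(1)
n, sims, alpha = 50, 2000, 0.05
naive_hits = fork_hits = 0
for _ in range(sims):
    # Two groups drawn from the SAME distribution: any "effect" is noise.
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(0, 1) for _ in range(n)]
    # A hypothetical binary covariate, unrelated to the outcome.
    c1 = [random.random() < 0.5 for _ in range(n)]
    c2 = [random.random() < 0.5 for _ in range(n)]
    forks = [
        (g1, g2),                                             # full sample
        ([x for x, c in zip(g1, c1) if c],
         [x for x, c in zip(g2, c2) if c]),                   # subgroup A
        ([x for x, c in zip(g1, c1) if not c],
         [x for x, c in zip(g2, c2) if not c]),               # subgroup B
        ([x for x in g1 if abs(x) < 2],
         [x for x in g2 if abs(x) < 2]),                      # outliers dropped
    ]
    if t_pvalue(*forks[0]) < alpha:
        naive_hits += 1
    # The forking-paths researcher reports whichever path "works."
    if any(t_pvalue(a, b) < alpha for a, b in forks):
        fork_hits += 1

print("one preregistered analysis:", naive_hits / sims)
print("any of four reasonable analyses:", fork_hits / sims)
```

The single-analysis rate stays near the nominal 5%, while the pick-any-path rate is substantially higher, even though each individual path is a perfectly defensible analysis.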

Going on to the other questions:

1. It’s ok to ask how you decided how much data you would collect. But I think in general there’s been way too much emphasis on sample size and significance levels, and not enough on design and data collection. So I’d like question #1 to be: How did you decide what you were going to measure, and how you would measure it?

2. Did any other analyses yield different results? Yes, definitely. Different analyses will always yield different results. We have to be careful (a) not to dichotomize results into “same” and “different” as this can be a way of adding huge amounts of noise to our results, and (b) not to expect that, if a result is real, that all analyses will show it, as that gets us into Armstrong territory, as has been discussed by Gregory Francis and others.
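On point (a), a quick numeric sketch (with made-up numbers) illustrates the familiar problem with dichotomizing: two analyses of the same effect can land on opposite sides of the 0.05 threshold even when the difference between them is itself nowhere near statistically significant.

```python
import math

def z_pvalue(est, se):
    # Two-sided p-value for a normal-theory z-test.
    z = est / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical estimates from two analyses of the same effect:
a_est, a_se = 0.25, 0.10   # analysis A: "significant"
b_est, b_se = 0.10, 0.10   # analysis B: "not significant"

# The difference between the two estimates
# (treating them as independent, for simplicity):
d_est = a_est - b_est
d_se = math.sqrt(a_se ** 2 + b_se ** 2)

p_a = z_pvalue(a_est, a_se)   # about 0.01
p_b = z_pvalue(b_est, b_se)   # about 0.32
p_d = z_pvalue(d_est, d_se)   # about 0.29
print(p_a, p_b, p_d)
```

Labeling A and B as “different results” because one clears the threshold and the other doesn’t is exactly the kind of noisy dichotomization to avoid: the comparison that matters, the difference between the estimates, is itself far from significant.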

3, 4. Overall, I think it’s a good idea for any research team to situate their work within the literature. The question is not just: Did you perform any other analyses of your data? or Did you measure other variables? or Did these results surprise you? It’s also: What sorts of analyses are performed in other studies in this literature?, What other variables have been studied?, What other claims have been made? Again, forking paths is not just what you did, it’s what you could’ve done.
