 


I originally gave this post the title, “New England Journal of Medicine makes the classic error of labeling a non-significant difference as zero,” but as I was writing it I thought of a more general point.

First I’ll give the story, then the general point.

1. Story

Dale Lehman writes:

Here are an article and editorial in this week’s New England Journal of Medicine about hydroxychloroquine. The study has many selection issues, but what I wanted to point out was the major conclusion. It was an RCT (sort of) and the main result was “After high-risk or moderate-risk exposure to Covid-19, hydroxychloroquine did not prevent illness compatible with Covid-19….” This was the conclusion when the result was “The incidence of new illness compatible with Covid-19 did not differ significantly between participants receiving hydroxychloroquine (49 of 414 [11.8%]) and those receiving placebo (58 of 407 [14.3%]); the absolute difference was -2.4 percentage points (95% confidence interval, -7.0 to 2.2; P=0.35).”

The editorial, based on the study, said it correctly: “The incidence of a new illness compatible with Covid-19 did not differ significantly between participants receiving hydroxychloroquine ….” The article had 25 authors, academics and medical researchers, doctors and PhDs – I did not check their backgrounds to see whether or how many statisticians were involved. But this is Stat 101 stuff: the absence of a significant difference should not be interpreted as evidence of no difference. I believe the authors, peer reviewers, and editors know this. Yet they published it with the glaring result ready for journalists to use.

To add to this, the study of course does not provide the data. And the editorial makes no mention of their recent publication (and retraction) of the Surgisphere paper. It would seem that that whole episode has had little effect on their processes and policies. I don’t know if you are up for another post on the subject, but I don’t think they should be let off the hook so easily.

Agreed. This reminds me of the stents story. It’s hard to avoid binary thinking: the effect is real or it’s not, the result is statistically significant or it’s not, etc.
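To make the numbers in the quoted result concrete, here is a minimal sketch that recomputes the difference and its interval from the reported counts. This is an illustration, not the paper's analysis: it assumes a simple normal-approximation (Wald) interval for a difference in proportions, and the function name `risk_difference_wald` is made up for the example. The paper may well have used a different method, and the sketch does not try to reproduce the reported p-value.

```python
import math

def risk_difference_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in proportions (a minus b) with a Wald interval.

    Assumption: normal approximation with unpooled standard error;
    z=1.96 gives an approximate 95% interval.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Counts from the quoted abstract: 49 of 414 (hydroxychloroquine), 58 of 407 (placebo)
diff, lo, hi = risk_difference_wald(49, 414, 58, 407)
print(f"difference: {100 * diff:.1f} points, "
      f"95% interval: {100 * lo:.1f} to {100 * hi:.1f}")
# prints: difference: -2.4 points, 95% interval: -7.0 to 2.2
```

Under that assumption the interval runs from a 7-point reduction in illness to a 2-point increase. Zero is inside it, but so are effects that would matter clinically, which is why “did not prevent illness” says more than the data do.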

2. The general point

Indeed, the standard way that statistical hypothesis testing is taught is a 2-way binary grid, where the underlying truth is “No Effect” or “Effect” (equivalently, Null or Alternative hypothesis) and the measured outcome is “Not statistically significant” or “Statistically significant.”

Both these dichotomies are inappropriate. First, the underlying reality is not a simple Yes or No; in real life, effects vary. Second, it’s a crime to take all the information from an experiment and compress it into a single bit of information.

Yes, I understand that sometimes in life you need to make binary decisions: you have to decide whether to get on the bus or not. But. This. Is. Not. One. Of. Those. Times. The results of a medical experiment get published and they can inform many decisions in different ways.
