Controversy regarding the effectiveness of Remdesivir

Steven Wood writes:

There is now some controversy regarding the effectiveness of Remdesivir for treatment of Covid, following the inadvertent posting of results on the WHO website.

One of the pillars of hope for this treatment is the monkey treatment trial (the paper is here).

As an experienced clinical trialist I was immediately skeptical of the results. On the face of it the N’s (3 arms in the trial, 6 monkeys in each group) seemed too small for the low p-values they were reporting. I am sure you are familiar with the common practice in bench studies of using as N the number of tests/assays etc. performed and not the number of animals/subjects.

When I looked at figure 3, I was convinced that that is what they had done. For the viral load data, only one of the 6 pairwise comparisons between the treatment group and the controls (one for each of the 6 groups of lung lobes) was even of borderline statistical significance. Yet in panel B of figure 3 the p-value for the difference in viral loads is presented as <.0001. This seemed impossible to me. I contacted the author of the paper who replied...the N they used was 36 per group (the number of lung lobes), not 6 per group (the number of monkeys) as they should have.

I can’t say anything about Remdesivir—I can’t even pronounce the word!—but, yes, it does seem like if you analyze different lung lobes from the same monkey as independent data points, that you’ll be overstating your certainty, and setting yourself up for future unpleasant surprises in the form of failed replications.
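A quick simulation shows how much this can matter. The sketch below is not a reanalysis of the paper's data. It assumes two groups of 6 monkeys with 6 lobe measurements each, no true treatment effect, and made-up standard deviations of 1.0 for both the between-monkey and within-monkey variation. The function names are hypothetical. It then counts how often a t-test rejects at the 5% level when the 36 lobes per group are treated as independent, compared to when each monkey is first averaged down to a single number.

```python
import random
import statistics

# Two-sided 5% critical values of Student's t (standard table values).
T_CRIT_DF70 = 1.994  # 36 + 36 - 2 degrees of freedom (lobes as units)
T_CRIT_DF10 = 2.228  # 6 + 6 - 2 degrees of freedom (monkeys as units)


def pooled_t(a, b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / (sp2 * (1 / na + 1 / nb)) ** 0.5


def simulate_group(rng, n_monkeys=6, n_lobes=6, sd_monkey=1.0, sd_lobe=1.0):
    """One list of lobe values per monkey; lobes share the monkey's effect.

    The standard deviations are assumptions chosen for illustration.
    """
    group = []
    for _ in range(n_monkeys):
        monkey_effect = rng.gauss(0, sd_monkey)
        group.append([monkey_effect + rng.gauss(0, sd_lobe)
                      for _ in range(n_lobes)])
    return group


def false_positive_rates(n_sims=2000, seed=1):
    """Rejection rates under the null for the two analyses."""
    rng = random.Random(seed)
    naive = per_animal = 0
    for _ in range(n_sims):
        treated = simulate_group(rng)
        control = simulate_group(rng)
        # Analysis 1: every lobe counted as an independent data point.
        lobes_t = [x for monkey in treated for x in monkey]
        lobes_c = [x for monkey in control for x in monkey]
        if abs(pooled_t(lobes_t, lobes_c)) > T_CRIT_DF70:
            naive += 1
        # Analysis 2: one averaged value per monkey.
        means_t = [statistics.fmean(monkey) for monkey in treated]
        means_c = [statistics.fmean(monkey) for monkey in control]
        if abs(pooled_t(means_t, means_c)) > T_CRIT_DF10:
            per_animal += 1
    return naive / n_sims, per_animal / n_sims


if __name__ == "__main__":
    naive, per_animal = false_positive_rates()
    print(f"lobes as N:   {naive:.3f}")
    print(f"monkeys as N: {per_animal:.3f}")
```

Under these assumptions the per-monkey analysis should reject close to the nominal 5% of the time, while the lobe-level analysis rejects far more often. How large the gap is depends on how strongly lobes within a monkey are correlated, which this sketch simply assumes rather than estimates.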

Or maybe not. Maybe the analysis is just fine. It might be better in this sort of paper to focus more on presenting the raw data in as many ways as possible, rather than on this sort of thing:

P.S. Zad sent along the above photo of two cats who are awaiting their next round of social distancing.


  1. Joseph Maher says:

    Academics get credit for writing papers, maybe they should also get credit for publishing interesting data sets? Is there some journal that publishes data sets not papers? Maybe you should start one…

  2. Bill Harris says:

    “I can’t even pronounce the word!”: See for the seriousness of the problem and for the way out. You were wise not to try at home, it seems.

  3. Ian Fellows says:

    “Gilead spokesperson Amy Flood said the company believes “the post included inappropriate characterization of the study.” Because the study was stopped early because it had too few patients, she said, it cannot “enable statistically meaningful conclusions.” However, she said, “trends in the data suggest a potential benefit for remdesivir, particularly among patients treated early in disease.””

    Ugh, that is the worst quote. Any bad trends aren’t meaningful, but good trends indicate it is totally beneficial! Just because a study was stopped early doesn’t mean it is in any way statistically invalid; the power is just lower, so you need to be careful about throwing in the towel. This is a good case where it is much more important to look at intervals than at a failure to reject the null.

    I’ll be interested to read this when it comes out. I had a lot of hope that this drug would be a game changer. If we wanted to get “back to normal,” we would have needed something that provided something like a >2X reduction in death, since COVID-19 mortality is something like 6 to 20 times higher than influenza. Running the numbers, this appears to be ruled out. Mortality was 1.09 times higher in the treated group, CI (0.54, 2.17), so a 2X reduction (a ratio of 0.5) falls outside the interval.

    Of course even a 10% reduction in mortality is badly needed, so research should continue, but our expectations are now more bounded than they were.

  4. Rahul says:

    Which cat got Remdesivir?

  5. Terry says:

    The cat on the right’s name is Cauchy … because it has such a long tail

  6. Dickens says:

    Are the cats sitting on a taped over fire escape?
