
A factor of 40 speed improvement . . . that’s not something that happens every day!

Charles Margossian and Ben Bales helped out with the Stan model for coronavirus from Riou et al. Riou reports:

You guys are amazing. We implemented your tricks and it reduced computation time from 3.5 days to … 2 hours (40 times less). This is way beyond what I expected, thank you so much!


  1. Dave says:

    That’s great. Would someone be able to share the nature of the advice so the rest of us can learn?

    • +1 I would really like to know what the difference was.

      • Julien Riou says:

        Sorry about the delay, I updated the github and added the effects of Ben’s intervention.

        You can compare model10.stan with model10_bales.stan. I then implemented these features in the next versions of the model (11 to 14 so far).

        • Ben says:

          I think the changes were:

          1. Switching from the RK45 integrator to the BDF integrator

          2. There was some model-specific integration-time stuff: some of the integration could be moved to post-processing (I’m assuming that happened, not totally sure).

          3. And then the more specific thing was changing how the initial conditions were parameterized.
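          (To see why point 1 matters, here is a hedged, minimal sketch in Python with SciPy, using a toy stiff ODE rather than the actual Riou et al. model: an explicit solver like RK45 must take tiny steps to stay stable on stiff systems, while an implicit solver like BDF does not.)

```python
# Toy stiff system, illustrative only (not the epidemic model from the post):
# one fast decaying mode (rate -1000) and one slow mode (rate -1).
from scipy.integrate import solve_ivp

def stiff_rhs(t, y):
    return [-1000.0 * y[0] + y[1], -y[1]]

y0 = [1.0, 1.0]
sol_rk45 = solve_ivp(stiff_rhs, (0.0, 10.0), y0, method="RK45")
sol_bdf = solve_ivp(stiff_rhs, (0.0, 10.0), y0, method="BDF")

# Stability forces the explicit solver to take many more steps,
# even though the solution is smooth after the fast initial transient.
print("RK45 steps:", sol_rk45.t.size)
print("BDF steps:", sol_bdf.t.size)
```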

          Stan does forward-mode continuous sensitivities, so if your ODE has N states and you need sensitivities with respect to P things, you solve an augmented ODE with N + N * P states.

          In this case, the initial conditions involved parameters that made P something like 58. There was a way to rewrite the initial conditions so that P = 5.
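          (A quick back-of-the-envelope for the N + N * P scaling above; the compartment count N = 6 is a made-up illustration, since the post doesn’t state the actual N.)

```python
# Size of the augmented ODE system solved for forward sensitivities:
# N base states plus N * P sensitivity states.
def augmented_states(n_states: int, n_params: int) -> int:
    return n_states + n_states * n_params

N = 6  # hypothetical compartment count, for illustration only
print(augmented_states(N, 58))  # P = 58: 6 + 6 * 58 = 354 states
print(augmented_states(N, 5))   # P = 5:  6 + 6 * 5  = 36 states
```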

          This is something we’d like to change in future versions of Stan. If we did some sort of adjoint sensitivity method we wouldn’t have this problem. It’s something we’re working on, because it’s not great that it’s possible to inadvertently write something that scales really badly here.

      • Well, speeding it up was the right idea, but I think the suggestions used were from someone else and didn’t involve changing the ODE solver… so it would be great to see what the changes were.

        • Ben says:

          Haaaahahaha, I didn’t see your original comment Daniel, but one of the suggestions was an approximation that resulted empirically in a maximum 3% relative error on a sample problem (over all samples in the posterior and all points in the ODE trajectory). This didn’t end up getting used (adding approximations like that is always annoying because you keep having to check them — and annoying turns to dangerous when you forget to check!), but it’s amusing you picked 3% anyway.

  2. Ron Wilson says:

    40 times less is negative 136.5 days. Now that’s impressive!

  3. Zhou Fang says:

    Incidentally, Andrew, in the context of “all maps of parameter estimates are misleading”, do you have any words of advice for people trying to map coronavirus?

    • Phil says:

      So far I’ve seen maps of case counts, which don’t have the artifacts that paper is concerned with. Things get misleading when you start mapping things like “cases per 100,000 population” or “transmission risk”.

      Of course, there are plenty of other ways the case counts are misleading too, perhaps the biggest of which is that all that can be mapped is _known_ cases (obviously), and I would guess the relationship between known cases and actual virus carriers is both highly variable and very uncertain (and surely varies with population density).

      In short, the case count maps may not be telling us what we really want to know, and indeed probably aren’t, but they are not subject to the artifacts that parameter estimates derived from them would be.
