Charles Margossian and Ben Bales helped out with the Stan model for coronavirus from Riou et al. Riou reports:
You guys are amazing. We implemented your tricks and it reduced computation time from 3.5 days to … 2 hours (40 times less). This is way beyond what I expected, thank you so much!
That’s great. Would someone be able to share the nature of the advice so the rest of us can learn?
+1 I would really like to know what the difference was.
Sorry about the delay. I updated the GitHub repo and added the effects of Ben’s intervention: https://github.com/jriou/covid_adjusted_cfr/blob/master/models/
You can compare model10.stan with model10_bales.stan. I then implemented these features in the next versions of the model (11 to 14 so far).
I think the things were:
1. Switching from the RK45 to the BDF integrator (see the sketch just after this list).
2. There was some model-specific integration-time stuff: some of the integration could be moved to post-processing (I’m assuming that happened, not totally sure).
3. And then the more specific thing was changing how the initial conditions were parameterized.
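For anyone who hasn’t touched Stan’s ODE solvers, here is a minimal sketch of what change #1 looks like in the pre-2.24 interface. This is a made-up toy SIR system, not model10.stan; the `sir` function, the `beta`/`gamma` parameters, and the data layout are all hypothetical, and the actual change is the one-line swap from `integrate_ode_rk45` to `integrate_ode_bdf`.

```stan
functions {
  // Toy SIR right-hand side (hypothetical; not the Riou et al. model).
  real[] sir(real t, real[] y, real[] theta, real[] x_r, int[] x_i) {
    real beta = theta[1];   // transmission rate
    real gamma = theta[2];  // recovery rate
    real dydt[3];
    dydt[1] = -beta * y[1] * y[2];                // dS/dt
    dydt[2] = beta * y[1] * y[2] - gamma * y[2];  // dI/dt
    dydt[3] = gamma * y[2];                       // dR/dt
    return dydt;
  }
}
data {
  int<lower=1> T;
  real t0;
  real ts[T];   // observation times, t0 < ts[1] < ... < ts[T]
  real y0[3];   // initial state passed as data, so it adds nothing to P
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  real<lower=0> theta[2];  // {beta, gamma}
}
transformed parameters {
  // Before: integrate_ode_rk45(sir, y0, t0, ts, theta, x_r, x_i);
  // After, using the stiff (BDF) solver:
  real y_hat[T, 3] = integrate_ode_bdf(sir, y0, t0, ts, theta, x_r, x_i);
}
model {
  theta ~ normal(0, 1);
  // ... likelihood on y_hat would go here ...
}
```

The BDF solver is built for stiff systems, so it can win big when RK45 is being forced into tiny steps; and note that keeping `y0` in `data` rather than building it from parameters is exactly the kind of thing that keeps P small in point 3.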
Stan does forward-mode continuous sensitivities, so if your ODE has N states and there are P things you need sensitivities with respect to, you end up solving an ODE with N + N * P states.
In this case, the way the initial conditions involved parameters made P something like 58. There was a way to rewrite the initial conditions so that P = 5.
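To spell out the scaling, here is the standard forward-sensitivity system, written generically rather than for this particular model:

```latex
% For dy/dt = f(t, y, theta) with y in R^N, theta in R^P, and initial state
% y(t_0) = y_0(theta), the sensitivities S = dy/dtheta (an N x P matrix) are
% integrated alongside the states:
\begin{align*}
  \frac{dy}{dt} &= f(t, y, \theta), &
  \frac{dS}{dt} &= \frac{\partial f}{\partial y}\, S + \frac{\partial f}{\partial \theta}, &
  S(t_0) &= \frac{\partial y_0}{\partial \theta},
\end{align*}
% which gives N + N*P = N(1 + P) coupled states.
% With P = 58 that is 59N states; with P = 5 it is 6N, i.e. roughly a 10x
% smaller system to integrate at every gradient evaluation.
```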
This is something we’d like to change in future versions of Stan. If we did some sort of adjoint sensitivity approach we wouldn’t have this problem. It’s something we’re working on, because it’s not great that it’s possible to inadvertently write something that scales really badly here.
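For completeness, here is the textbook adjoint-sensitivity formulation (generic math, not a description of any particular Stan implementation), which shows why the blow-up goes away:

```latex
% For a scalar objective J that depends on y(T), solve the original ODE
% forward (N states), then solve the adjoint ODE backward in time:
\begin{align*}
  \frac{d\lambda}{dt} &= -\left(\frac{\partial f}{\partial y}\right)^{\!\top} \lambda,
  & \lambda(T) &= \frac{\partial J}{\partial y(T)},
\end{align*}
% and accumulate the gradient with P quadratures along the way:
\begin{align*}
  \frac{dJ}{d\theta} &= \lambda(t_0)^{\top} \frac{\partial y_0}{\partial \theta}
    + \int_{t_0}^{T} \lambda(t)^{\top} \frac{\partial f}{\partial \theta}\, dt .
\end{align*}
% The work scales like 2N states plus P quadratures instead of N(1 + P)
% states, so adding parameters no longer multiplies the size of the ODE system.
```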
https://gelmanstatdev.wpengine.com/2020/03/09/coronavirus-model-update-background-assumptions-and-room-for-improvement/#comment-1259012
It seems like Daniel Lakeland, with the first comment in the thread, had the right idea.
Well, speeding it up was the right idea, but I think the suggestions used were from someone else and didn’t involve changing the ODE solver… so it would be great to see what the changes were.
Haaaahahaha, I didn’t see your original comment, Daniel, but one of the suggestions was an approximation that empirically resulted in a maximum 3% relative error on a sample problem (over all samples in the posterior and all points in the ODE trajectory). This didn’t end up getting used (adding approximations like that is always annoying because you keep having to check them, and annoying turns to dangerous when you forget to check!), but it’s amusing that you picked 3% anyway.
40 times less is negative 136.5 days. Now that’s impressive!
+1 for pedantry
ROTFL
In my experience, inaccurate is 40 times less useful than accurate when it comes to math.
The only issue with negative run-times is that your prior effectively becomes your posterior.
Incidentally, Andrew, in the context of “all maps of parameter estimates are misleading”, do you have any words of advice for people trying to map coronavirus?
So far I’ve seen maps of case counts, which don’t have the artifacts that paper is concerned with. Things get misleading when you start mapping things like “cases per 100,000 population” or “transmission risk”.
Of course, there are plenty of other ways the case counts are misleading too. Perhaps the biggest is that all that can be mapped is _known_ cases (obviously), and I would guess the relationship between known cases and actual virus carriers is both highly variable and very uncertain (and surely varies with population density).
In short, the case count maps may not be telling us what we really want to know (indeed they probably aren’t), but they are not subject to the artifacts that parameter estimates derived from them would be.