Skip to content
Archive of posts filed under the Statistical computing category.

Making differential equation models in Stan more computationally efficient via some analytic integration

We were having a conversation about differential equation models in pharmacometrics, in particular how to do efficient computation when fitting models for dosing, and Sebastian Weber pointed to this Stancon presentation that included a single-dose model. Sebastian wrote: Multiple doses lead to a quick explosion of the Stan codes – so things get a bit […]

How good is the Bayes posterior for prediction really?

It might not be common courtesy of this blog to make comments on a very-recently-arxiv-ed paper. But I have seen two copies of this paper entitled “how good is the Bayes posterior in deep neural networks really” left on the tray of the department printer during the past weekend, so I cannot underestimate the popularity of […]

Monte Carlo and winning the lottery

Suppose I want to estimate my chances of winning the lottery by buying a ticket every day. That is, I want to do a pure Monte Carlo estimate of my probability of winning. How long will it take before I have an estimate that’s within 10% of the true value? It’ll take… There’s a big […]

MRP Conference at Columbia April 3rd – April 4th 2020

The Departments of Statistics and Political Science and Institute for Social and Economic Research and Policy at Columbia University are delighted to invite you to our Spring conference on Multilevel Regression and Poststratification. Featuring Andrew Gelman, Beth Tipton, Jon Zelner, Shira Mitchell, Qixuan Chen and Leontine Alkema, the conference will combine a mix of cutting […]

Rao-Blackwellization and discrete parameters in Stan

I’m reading a really dense and beautifully written survey of Monte Carlo gradient estimation for machine learning by Shakir Mohamed, Mihaela Rosca, Michael Figurnov, and Andriy Mnih. There are great explanations of everything including variance reduction techniques like coupling, control variates, and Rao-Blackwellization. The latter’s the topic of today’s post, as it relates directly to […]

Correctness

A computer program can be completely correct, it can be correct except in some edge cases, it can be approximately correct, or it can be flat-out wrong. A statistical model can be kind of ok but a little wrong, or it can be a lot wrong. Except in some rare cases, it can’t be correct. […]

Exciting postdoc opening in spatial statistics at Michigan: Coccidioides is coming, and only you can stop it!

Jon Zelner is an collaborator who does great work on epidemiology using Bayesian methods, Stan, Mister P, etc. He’s hiring a postdoc, and it looks like a great opportunity: Epidemiological, ecological and environmental approaches to understand and predict Coccidioides emergence in California. One postdoctoral fellow is sought in the research group of Dr. Jon Zelner […]

How to “cut” using Stan, if you must

Frederic Bois writes: We had talked at some point about cutting inference in Stan (that is, for example, calibrating PK parameters in a PK/PD [pharmacokinetic/pharmacodynamic] model with PK data, then calibrating the PD parameters, with fixed, non updated, distributions for the PK parameters). Has that been implemented? (PK is pharmacokinetic and PD is pharmacodynamic.) I […]

Fitting big multilevel regressions in Stan?

Joe Hoover writes: I am a social psychology PhD student, and I have some questions about applying MrP to estimation problems involving very large datasets or many sub-national units. I use MrP to obtain sub-national estimates for low-level geographic units (e.g. counties) derived from large data (e.g. 300k-1 million+). In addition to being large, my […]

The opposite of “black box” is not “white box,” it’s . . .

In disagreement with X, I think the opposite of black box should be clear box, not white box. A black box is called a black box because you can’t see inside of it; really it would be better to call it an opaque box. But in any case the opposite is clear box.

Beautiful paper on HMMs and derivatives

I’ve been talking to Michael Betancourt and Charles Margossian about implementing analytic derivatives for HMMs in Stan to reduce memory overhead and increase speed. For now, one has to implement the forward algorithm in the Stan program and let Stan autodiff through it. I worked out the adjoint method (aka reverse-mode autodiff) derivatives of the […]

The default prior for logistic regression coefficients in Scikit-learn

Someone pointed me to this post by W. D., reporting that, in Python’s popular Scikit-learn package, the default prior for logistic regression coefficients is normal(0,1)—or, as W. D. puts it, L2 penalization with a lambda of 1. In the post, W. D. makes three arguments. I agree with two of them. 1. I agree with […]

Econometrics postdoc and computational statistics postdoc openings here in the Stan group at Columbia

Andrew and I are looking to hire two postdocs to join the Stan group at Columbia starting January 2020. I want to emphasize that these are postdoc positions, not programmer positions. So while each position has a practical focus, our broader goal is to carry out high-impact, practical research that pushes the frontier of what’s […]

Zombie semantics spread in the hope of keeping most on the same low road you are comfortable with now: Delaying the hardship of learning better methodology.

This post is by Keith O’Rourke and as with all posts and comments on this blog, is just a deliberation on dealing with uncertainties in scientific inquiry and should not to be attributed to any entity other than the author. As with any critically-thinking inquirer, the views behind these deliberations are always subject to rethinking […]

Many Ways to Lasso

Jared writes: I gave a talk at the Washington DC Conference about six different tools for fitting lasso models. You’ll be happy to see that rstanarm outperformed the rest of the methods. That’s about 60 more slides than I would’ve used. But it’s good to see all that code.

A heart full of hatred: 8 schools edition

No; I was all horns and thorns Sprung out fully formed, knock-kneed and upright — Joanna Newsom Far be it for me to be accused of liking things. Let me, instead, present a corner of my hateful heart. (That is to say that I’m supposed to be doing a really complicated thing right now and […]

Dan’s Paper Corner: Yes! It does work!

Only share my research With sick lab rats like me Trapped behind the beakers And the Erlenmeyer flasks Cut off from the world, I may not ever get free But I may One day Trying to find An antidote for strychnine — The Mountain Goats Hi everyone! Hope you’re enjoying Peak Libra Season! I’m bringing […]

Stan contract jobs!

Sean writes: We are starting to get money and time to manage paid contracting jobs to try to get a handle on some of our technical debt. Any or all of the skills could be valuable: C++ software engineering C++ build tools, compilers, and toolchains Creating installers or packages of any kind (especially cross-platform) Windows […]

Jeff Leek: “Data science education as an economic and public health intervention – how statisticians can lead change in the world”

Jeff Leek from Johns Hopkins University is speaking in our statistics department seminar next week: Data science education as an economic and public health intervention – how statisticians can lead change in the world Time: 4:10pm Monday, October 7 Location: 903 School of Social Work Abstract: The data science revolution has led to massive new […]

“Troubling Trends in Machine Learning Scholarship”

Garuav Sood writes: You had expressed slight frustration with some ML/CS papers that read more like advertisements than anything else. The attached paper by Zachary Lipton and Jacob Steinhardt flags four reasonable concerns in modern ML papers: Recent progress in machine learning comes despite frequent departures from these ideals. In this paper, we focus on […]