Bayesian Workflow (my talk this Wed at Criteo)

Wed 26 Aug 5pm Paris time (11am NY time):

The workflow of applied Bayesian statistics includes not just inference but also model building, model checking, confidence-building using fake data, troubleshooting problems with computation, model understanding, and model comparison. We move toward codifying these steps in the realistic scenario in which we are fitting many models for a given problem.
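To make the "confidence-building using fake data" step concrete, here is a minimal sketch of the idea. It is not taken from the talk: the model (a normal mean with known data standard deviation and a normal prior), the function names, and all numerical settings are assumptions chosen only so the posterior has a closed form. The check draws a true parameter from the prior, simulates data from it, computes the posterior, and records how often the central 50% posterior interval contains the true value. If the inference is coded correctly, that should happen about half the time.

```python
import numpy as np


def posterior_normal_mean(y, sigma, prior_mean, prior_sd):
    """Conjugate posterior (mean, sd) for a normal mean with known data sd `sigma`."""
    n = len(y)
    prior_prec = 1.0 / prior_sd**2
    data_prec = n / sigma**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * np.mean(y))
    return post_mean, np.sqrt(post_var)


def fake_data_coverage(n_sims=2000, n=20, sigma=1.0,
                       prior_mean=0.0, prior_sd=2.0, seed=0):
    """Fraction of simulations whose central 50% posterior interval covers the truth."""
    rng = np.random.default_rng(seed)
    z = 0.6744897501960817  # 75th percentile of the standard normal
    hits = 0
    for _ in range(n_sims):
        theta = rng.normal(prior_mean, prior_sd)   # "true" parameter drawn from the prior
        y = rng.normal(theta, sigma, size=n)       # fake data simulated from the model
        m, s = posterior_normal_mean(y, sigma, prior_mean, prior_sd)
        hits += int(abs(theta - m) <= z * s)       # did the 50% interval cover theta?
    return hits / n_sims


coverage = fake_data_coverage()
print(f"50% interval coverage: {coverage:.3f}")
```

With a real model the closed-form posterior would be replaced by whatever fitting procedure is actually in use, and a coverage far from the nominal level would point to a problem in the model code or the computation.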

We’ve been talking about this for a long time!

Here’s a talk on the topic from 2011, and here’s a post from 2017 with some comments from others, and here’s an article from 2019 with Gabry et al. We even have some youtube videos on the topic. Let’s hope I don’t repeat too much of the material from 2011, 2017, 2018, etc.

We’re in the midst of writing an article on the topic, trying to separate the computational workflow involved in successfully fitting a single model from the statistical workflow involved in understanding a problem through a series of fitted models.

P.S. Zad sends in the above photo with caption, “When you order a black box algorithm from Amazon but forgot to read the reviews.”

P.P.S. Here’s the video.


  1. Jai says:

    Thanks for letting us know. I attended and enjoyed it a lot.

  2. Ron Kenett says:

    Andrew – it was interesting to see the “workflow” in your response to my question yesterday on generalisation of simple and complex models. It started by focusing on the learning path that you use by considering simple models. You then moved from using the reference role of simple models to examples where they can actually be better than complex models. Eventually I believe you did consider my point about generalisation properties of models. My point is that generalisation is an overriding concept and that asking yourself the question “how good is your generalisation ability?” helps you in the data analysis and modeling workflow. Outliers obviously affect it, as typically you do not want singled-out events to affect your generalisability. Generalisability is different from model checking.

    One historical perspective on all this. There was strong animosity between Box and Tukey on the role of models. I spent years both in Madison and Bell Labs and this was quite apparent to anyone there. Box was a modeler and aimed at “useful” models. Tukey launched the discipline of data analysis. In this sense, your proposal to use simple and complex models for improved data analysis is a bridge between them.

    • > Generalisability is different from model checking.
      Because it’s _out of_ model checking or perhaps just speculation that can’t be checked?

      Your comment on Box and Tukey is interesting. Was Tukey’s Configural Polysampling a partial bridge?

    • Andrew says:


      I’ve thought a lot about this, and I think that Tukey’s anti-modeling position was subtle. On one hand, the EDA book had no models. On the other, something like the rootogram seems pretty clearly model-inspired, as it comes right out of the Poisson distribution. So my take on it is that he did support the use of models to develop statistical procedures, but then he wanted the procedures to stand on their own, independent of the models that were used to develop them. My Boxian take on this is that as models get more complicated, we can’t and shouldn’t go full Tukey in this way, because the models continue to assist our understanding of the methods, even in cases where the models are flawed.

      • Ron Kenett says:


        The rootogram is indeed using a model as a reference. This is related to your comment on choosing a baseline for indices by referring to domain knowledge expertise.

        The use of a baseline, or reference, is a general principle in data analysis. I used it in a proposal for comparing a given Pareto chart to a reference based on baseline data.
        Even though Pareto charts are used in large numbers, the above paper is one of the few that has looked at their statistical analysis.

        The even wider context, related to the workflow idea, is that all this is an implementation of Deming’s (or Shewhart’s) Plan Do (or Study) Check Act improvement cycle.

        The Plan is identifying the reference, the Do is collecting data, the Check is comparing to the reference, and the Act is the actionable part we refer to as action operationalisation in the information quality framework.

        The bottom line is that we are dealing with statistical strategy, a topic implicit in the writings of many…

      • Ron Kenett says:

        Andrew – the main difference between the Tukey and Box perspectives was that Tukey pushed for robust statistics (the famous Princeton studies) and Box focused on modeling. I believe Box’s view is that robust statistics hide outliers, which are useful to identify when you are a modeler. You mention this in your talk when you compared the simple model to the complex model where outliers can have high influence (leverage). On the other hand, Tukey’s data analysis perspective anticipated big data analysis. If you focus on the data, you want your statistics to be driven by the data without undue effect of singled-out observations. I clearly remember Box despising robust statistics. One such heated debate followed a seminar given by Brian Joiner on his Technometrics paper on influence functions. Brian headed the consulting lab and brought a very practical perspective. On that occasion, Box was more in the position of a theoretical modeler. On the underground floor in Madison was a room, the “pollution room”, where time series of smog data in Los Angeles filled the walls. This was the kingdom of Box and Tiao. It resulted in the development of intervention effects in time series models. See and I guess Tukey would have handled this differently, or perhaps not….

  3. David Rohde says:

    Thanks again Andrew!

    For those who missed it, it is here:

  4. Ron Kenett says:

    One more (late) comment on the simple versus complex models. The ultimate simple model is a synthetic variable generated by simulation, and so known to have no effect. See

    This has been used very effectively in mixture models, but not only…
