
Rethinking Rob Kass’ recent talk on science in a less statistics-centric way.

Reflection on a recent post on a talk by Rob Kass has led me to write this post. I liked the talk very much and found it informative, perhaps especially for its call to clearly distinguish abstract models from brute-force reality. I believe that is a very important point that has often been lost sight of by many statisticians in the past. As evidence of that, I would point to how many cite Box's quote "all models are wrong, but some are useful" as insightful, rather than as something already at the top of most statisticians' minds.

However, the reflection has led me to think Kass's talk is too statistics-centric. Now, Kass's talk was only about 25 minutes long while being on a subtle topic. It is very hard to be both concise and fully balanced, but I believe we have a different perspective and I would like to bring that out here. For instance, I think this statement, "I [Kass] conclude by saying that science and the world as a whole would function better if scientific narratives were informed consistently by statistical thinking," would be better put as saying that statistics and the statistical discipline as a whole would function better if statistical methods and practice were informed consistently by purposeful experimental thinking (AKA scientific thinking).

Additionally, this statement, "the essential flaw in the ways we talk about science is that they neglect the fundamental process of reasoning from data," seems somewhat dismissive of science being even more fundamentally about the process of reasoning from data, with statistics being a specialization for when data are noisy or vary haphazardly. In fact, Stephen Stigler has argued that statistics arose as a result of astronomers trying to make sense of observations that varied when they believed what was being observed did not.

Finally, this statement, "the aim of science is to explain how things work," I would rework into: the aim (logic) of science is to understand how experiments can bring out how things work in this world, by using abstractions that are themselves understood by using experiments. So experiments all the way up.

As usual, I am drawing heavily on my grasp of the writings of CS Peirce. He seemed to think that everything should be thought of as an experiment, including mathematics, which he defined as experiments performed on diagrams or symbols rather than on chemicals or physical objects. Some quotes from his 1905 paper "What Pragmatism Is": "Whenever a man acts purposively, he acts under a belief in some experimental phenomenon. … some unchanging idea may come to influence a man more than it had done; but only because some experience equivalent to an experiment has brought its truth home to him more intimately than before…"

I do find the thinking of anything one can as an experiment as being helpful. For instance, in this previous post, discussion led to a comment by Andrew that "Mathematics is simulation by other means." One way to unpack this, thinking of mathematics as experiments on diagrams or symbols, would be to claim that calculus is one design of an experiment while simulation is just another design. Different costs and advantages, that's all. It's the commitment to be experimental, and to experiment as appropriately as one can, that is fundamental. Sorting out "most appropriately" would then point to economy of research as the other fundamental piece.
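The calculus-versus-simulation point can be made concrete with a minimal sketch (the target quantity and sample size here are my own illustrative choices, not anything from Kass or Peirce): both "designs" answer the same question, the mean of X² for X uniform on [0, 1], one by symbol manipulation and one by random trials.

```python
import random

# "Calculus design": the integral of x^2 over [0, 1] is exactly 1/3.
analytic = 1.0 / 3.0

# "Simulation design": estimate the same quantity by random sampling.
random.seed(1)
n = 100_000
estimate = sum(random.random() ** 2 for _ in range(n)) / n

print(analytic)   # 0.3333...
print(estimate)   # close to 1/3, within Monte Carlo error
```

Same question, two experimental designs: the first is cheap and exact but only works when the symbols cooperate; the second costs computation and carries sampling error but works almost anywhere.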

  1. John Williams says:

    I think I’m missing something. Think about the geologists trying to figure out continental drift. There were various ways they could test ideas, such as looking for similar rocks or fossils in parts of continents supposedly separated by drift, like the west coast of Africa and the east coast of South America, but how do you make this into an experiment?

    • Olav says:

      Indeed, it’s common to distinguish between the experimental and the historical sciences. But it sounds like Keith has a very expansive definition of “experiment” in mind that includes things like playing around with ideas or scribbling on paper.

    • Instant Noodles says:

I agree, making experiments is only part of the process and doesn't explain how hypotheses are abducted or how to assess the deduced experiments. Abduction itself is highly non-trivial. Let's say we know about the rift in Iceland. It's reasonable to abduct a theory of tectonic movement; it is almost indexical. However, for other rifts it might be very hard even to recognize them as being part of the same class.

      Also, people tend to overstate the value of statistics. Statistics is just one way of structuring examples when making an argument. Statistics might be the best way of reasoning for some types of data, but that doesn’t mean that there is no meaningful empirical research besides statistics. I can’t find a way to view a qualitative case study as an experiment.

I often find people adding some junk statistics to a chain of arguments when it doesn't seem appropriate (i.e., they are forgetting the huge variance in human behavior). I find this OK, since the studies mostly weren't essential to their argument anyway. The opposing position is much worse. If one is interested in really bad reasoning, I always recommend Chomsky's work in linguistics. He makes a distinction between linguistic performance and competence. Chomsky claims that his context-free or even context-sensitive grammars have physical reality in speakers' minds, are part of the speaker's competence. During performance the speaker somehow is too incompetent and makes performance errors (which are anything the theory can't account for). Chomsky is only interested in this competence and argued against empiricism. One can read interesting claims throughout Chomsky's work; for instance, he claims that speakers of Russian internalize a version of Old Church Slavonic to synchronically explain the complicated accent patterns.

      • Keith O’Rourke says:

Agree that abduction is important but not well understood. But I would not want to rule out (thought) experimentation in how abductions come about.

Previously, I tried to work some of this through for Bayesian inference.

        abduction -> deduction -> induction

        always three intermingled aspects.

First (speculative inference): (1) choosing a (probabilistic) representation of how unknown quantities were set or came about (aka a prior), (2) a (probabilistic) representation of how the data in hand came about or were generated (aka a data-generating model or likelihood), and (3) how these two representations connect into a joint representation of an empirical happening to be interpreted more generally.

Second (quantitative inference): (1) revising the first representation (the prior) in light of the data to get the implied representation given by the joint representation _and_ the data (aka the posterior, via Bayes' theorem), (2) (conceptually) generating what data could result from such an implied representation (aka the posterior predictive).

Third (evaluative inference): (1) choosing/guessing how the first and second steps might have been wrong and, in light of this, (2a) assessing the reasonableness of the data-generating model to have generated the data in hand (checking for model-data conflict) and then (2b) the reasonableness of the prior to have generated the parameters now most supported in the posterior (aka checking for prior-data conflict), (3a) choosing what aspects of the joint representation might be made less wrong and how, (3b) working through the implications and how those fit with the data in hand and possibly past experience, and finally (3c) deciding whether to settle on the joint representation as is and its implications for now, or starting again at the first step.

        All in all,

speculative inference -> quantitative inference -> evaluative inference, or

abduction -> deduction -> induction, or

        First -> Second -> Third

        Over and over again, endlessly.
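The three intermingled aspects above can be sketched with a toy conjugate example (the prior, the data, and the particular check are all hypothetical choices of mine for illustration, not a prescription from the comment):

```python
import random

random.seed(0)

# First (speculative inference): choose the representations.
# Prior: theta ~ Beta(2, 2); data model: y ~ Binomial(n, theta).
a, b = 2.0, 2.0        # hypothetical prior
n, y = 20, 14          # hypothetical data in hand

# Second (quantitative inference): the conjugate update gives the
# posterior Beta(a + y, b + n - y); then (conceptually) generate
# what data could result from it (posterior predictive replicates).
a_post, b_post = a + y, b + n - y
post_mean = a_post / (a_post + b_post)

y_rep = []
for _ in range(1000):
    theta = random.betavariate(a_post, b_post)   # draw from the posterior
    y_rep.append(sum(random.random() < theta for _ in range(n)))

# Third (evaluative inference): a crude check for model-data conflict --
# is the observed y typical of the posterior predictive replicates?
tail_prob = sum(yr >= y for yr in y_rep) / len(y_rep)

print(post_mean)   # posterior mean, exactly 2/3 here
print(tail_prob)   # not extreme here, so no obvious conflict
```

If the tail probability were extreme, the third aspect would send us back to the first: guess which representation is most wrong, revise it, and go around again.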

      • Curious says:

        Instant Noodles:

I am more familiar with Chomsky's politics than with his research, and so I will not comment on the specifics of his research, but I will comment on the issue you are taking with his separation of competence and performance. These are two clearly different aspects of a multidimensional phenomenon. To recognize this reality, one need only consider that performance on certain types of cognitive tasks is causally impacted by a lack of sleep. Performance is not perfectly consistent across time and across administrations, even for the most competent, on tasks that do not suffer from a ceiling effect. This is foundational to thinking about human performance, and I would argue that it is a tendency to not fully comprehend this reality that results in an enormous amount of bad research grounded in a false belief that competence = performance.

    • jim says:

      “but how do you make this [tectonics] into an experiment?”

      We can think of the geographic and stratigraphic dispositions of rocks and fossils as the result of an “experiment”. In this case, the “experiment” wasn’t designed, it just happened. We were not able to control the parameters as in many experiments, but we do have access to the results.

The geologic features of South America's east coast and Africa's west coast are the results of a single experiment. Normally – at least in the physical/biological sciences – many experiments are needed to build and test a hypothesis. Other "tectonic experiments" would be the distribution of accreted terranes in the western US; the geology of the Alps-Himalayan belt; etc.

In a sense, we did use statistics to discover plate tectonics: we found similar geologic features and relationships in many regions around the world that were consistent with the tectonic hypothesis. When we found exceptions, we were able to modify the hypothesis to explain the previously misunderstood relationships without creating new problems with the hypothesis – or at least the new problems were notably smaller than the old ones.

  2. Ron Kenett says:

My takeaway from Kass's video is different. He talks about the narrative of science and, from this, the starting point should be how you present findings. Somehow this seems obvious, is assumed, or is ignored, depending on how you look at it. However, if this is not accounted for, much of the statistics discussion is moot, to say the least. One would want to identify findings or claims that can be reproduced. Reproducibility is not replicability or repeatability. Reproducibility also requires some ability to generalise a claim. If an animal test is carried out in Basel and then someone in Kyoto tries to reproduce it, some generalisation has happened. See

Very few consider the verbal representation of findings. Even fewer map a boundary of meaning (BOM) distinguishing between alternative representations with meaning equivalence and alternatives with mere surface similarity. The BOM is what should be considered worth reproducing.

    My take on this, with examples, is presented in

    • Keith O’Rourke says:

      Only so much can be addressed in a short blog post.

This is one of the ways I explain the issue you raise in a previous hour-long presentation.

      “Learning from shadows as a metaphor for statistics.
Think of learning about an object – just from the shadows it casts – while being unable to look directly at the object. We see those shadows but really are only interested in what is actually casting them. In statistics, observed samples are shadows, but we are really only interested in how they would repeatedly occur in the future (to get some sense of what casts them and will generalize). That is, we want to discern some sense of how the observations were generated and ended up in our possession, and to make that as explicit as possible in order to better understand the reality behind it."

      • Ron Kenett says:

OK – let's go with shadows. How do you express/represent them? To do that we can use alternative representation languages. These can be verbal, graphical, or outputs from Stan or other stats packages. Kass is referring to a narrative of science, Yarkoni discussed verbal representations of claims, and Gelman and Carlin mention sign (Type S) errors, which are semi-verbal. My suggestion is to use a boundary of meaning (BOM). Kahneman and Tversky address cognitive bias in the representation of findings.

        Somehow, statisticians do not realise that this is an important issue. Have they been brainwashed???

        A related note is

  3. jim says:

    “statistics and the statistical discipline as a whole would function better if statistical methods and practice were informed consistently by purposeful experimental thinking (AKA scientific thinking)”

Ba. Da. You know the rest. Well said, Keith.

  4. Anonymous says:


    ‘I do find the thinking of anything one can as an experiment as being helpful’
    It’s helpful to me to frame it as an experiment and experimental.

  5. RE:

    ‘I do find the thinking of anything one can as an experiment as being helpful’.

    This is how I have framed my approaches. Not sure whether Peirce was instrumental in my thinking.
