John Cook writes, “statistics is all about reasoning under uncertainty.”

I agree, and I think this is a good way to put it.

Statistics textbooks sometimes describe statistics as “decision making under uncertainty,” but that always bothered me, because there’s very little about decision making in statistics textbooks. “Reasoning” captures it much better than “decision making.”

My approach to describing what statistics is about is to start from a different definition of “statistic.” The technical definition, a functional that summarizes a distribution function, is too technical. So my definition of a statistic is that

A statistic is an operator which summarizes a data set (sample or population).

Statistics is then the study of such operators.

Reasoning about their uncertainty (standard errors) is a big part of that, but not all.

This definition seems to exclude Bayesian statistics entirely.

I don’t like the definition above, but it certainly does not exclude the Bayesian approach. The “operator” to which he refers can be chosen in many ways. The Bayesian approach is one of various dispositions or attitudes toward choosing such summaries.

Sure, but I don’t think the posterior distribution “summarizes a data set”. For one thing, a data set is a finite number of data points, whereas a posterior distribution is infinite-dimensional.

In any case, I don’t like that definition either.

A posterior distribution is a summary of a data set in the sense that it is a description. The information content in a description — if the description is to say anything pertinent at all — must be greater than the information content in the data itself (setting aside for another day the precise stipulation as to what constitutes ‘information’).

So, for example, when someone ‘summarizes’ a data set by reporting its mean and standard deviation, this particular summary description implicitly (or explicitly, depending upon how the two numbers are applied) carries along extra information: a normal distribution, whose parameter estimates are what the summary statistics report. The attachment of a normal distribution to a data set as a “summary instrument” (whether by explicit mention, or implicitly through mention of its parameter estimates alone) amounts to augmenting the original data (mere numbers) with additional information (a theoretical distribution).

To put the matter in more general terms: if I record the weather today by recording the temperature, wind direction, wind speed, and relative humidity, and then comment that “it looks like hurricane weather,” my comment is a data summary, but it brings to bear a great deal more information from elsewhere (from past experience); and the information content of the data plus the data summary is surely greater than that of the data itself.

I’m not convinced… If I have a set of, say, 120 people’s weights and I publish the mean and standard deviation, I’ve summarized the data set: when I send you those two pieces of information you have FAR LESS info than if I sent you the whole data set, but you can still answer some questions, such as whether 12 of them selected at random would likely overload your elevator.
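The elevator question really can be answered from the two published numbers alone, at the price of an implicit distributional assumption, which is exactly the “extra information” a summary smuggles in. A minimal sketch, with invented values for the mean and standard deviation:

```python
import math

# Hypothetical published summary of 120 people's weights
mean_kg = 80.0  # sample mean
sd_kg = 15.0    # sample standard deviation

# Would 12 randomly selected people overload a 1000 kg elevator?
# Assume individual weights are roughly normal and independent, so the
# total of n = 12 weights is approximately Normal(n*mean, n*sd^2).
n = 12
capacity_kg = 1000.0

total_mean = n * mean_kg           # 960 kg
total_sd = math.sqrt(n) * sd_kg    # about 52 kg

# P(total > capacity) from the normal CDF
z = (capacity_kg - total_mean) / total_sd
p_overload = 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

print(f"P(overload) is roughly {p_overload:.2f}")
```

Note that the answer depends on the normality and independence assumptions just as much as on the two published numbers; the summary statistics alone don’t license the calculation.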

A posterior distribution is not a summary of a data set but rather a summary of a model of the world. For example, you may model the world as a mixture model, with different weight distributions by age, sex, and state of residence. The data set may be only vaguely informative; much of the information may come from background, perhaps previous surveys or similar. As such, it’s really as much about that background as it is about the data set.

“summary of a model of the world” that is compatible with (conditional on) the data set.

So it is, or could be, more informative if the modelling were well informed.

For instance, summarizing with a mean and variance as a thoughtless habit would be less informative. However, if it is based on an examination of the residuals, with informed judgement that Normal assumptions are helpful, it could be more informative. But you wouldn’t know until further, larger studies are done to better discern whether Normal assumptions were actually helpful.

This is what I wrote:

“The information content in a description — if the description is to say anything pertinent at all — must be greater than the information content in the data itself (setting aside for another day the precise stipulation as to what constitutes ‘information’)”

Let me elaborate:

If someone “describes” a data set by a “summary” of some sort, they are augmenting the data set itself by a “description”. A description may be provided in various ways. We call such a description useful if it brings with it information not contained in the mere set of data. Such extra information — if it is relevant — must come from the realm of prior experience. Prior experience may be summarized in distributional form; it may be summarized in various other forms as well.

I think at this point it’s just differences of opinions about the interpretation of words. I think we both agree what a posterior distribution is mathematically. I’m not sure if that makes it a “summary of a data set” or not.

In the end I like my definition of statistics as essentially “logical reasoning, using data” better than any definition that tries to reify and modify the formal mathematical jargon of mathematical statistics.

A definition I much prefer is something like “Statistics is the process of making reasoned logical arguments about the world supported by measurement data.”

“Decision making under uncertainty” would fall under Decision Theory, right? Drawing inferences from samples should be separate from making decisions based on those inferences.

An inference is just a decision about what you think is true, right?

Daniel said,

An inference is just a decision about what you think is true, right?

I think it’s often a decision about what you think has a high probability of being true. Or it may be a decision about how best to proceed, based on the available evidence, the possible consequences of various options, and your values.

I guess in the end what I meant was that decision making encompasses inference and other things as well.

In a Bayesian context, inference means deciding how much weight to give to certain possibilities given some initial weights, a model, and some data.
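That reading of inference (initial weights, a model, data, updated weights) can be made concrete in a few lines. A toy sketch with invented numbers, using two point hypotheses about a coin’s heads probability:

```python
import math

# Two hypotheses about a coin, with equal initial weights (priors)
priors = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}

data = "HHHTHHHH"  # invented data: 7 heads, 1 tail

def likelihood(h):
    """Probability of the observed flips under hypothesis h."""
    p = p_heads[h]
    return math.prod(p if flip == "H" else 1.0 - p for flip in data)

# Bayes' rule: posterior weight is proportional to prior times likelihood
unnorm = {h: priors[h] * likelihood(h) for h in priors}
total = sum(unnorm.values())
posterior = {h: w / total for h, w in unnorm.items()}

print(posterior)  # most of the weight moves to "biased"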

If someone wants to make a point estimate, we can cast that process as a decision process: choose a value to best represent a given physical quantity… where “best” is basically just an expression of utility.

By “decisions,” I meant, “what action to take” in the real world context that the researcher is modeling. In the Bayesian context, that is separate from inference. But if you expand the definition of “decisions” to include all decisions related to inference, then sure, there are decisions in modeling too. But why stop there? Decisions could include deciding to use statistics at all, and speaking English, etc… Too broad to be useful.

Point estimation in Bayesian stats is legitimately a utility based decision analysis. You take the value that minimizes the expected loss, or maximizes expected utility or whatever.
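That decision-theoretic view of point estimation is easy to demonstrate numerically: given draws from a posterior, the value minimizing expected squared-error loss is the posterior mean, while absolute-error loss picks out the median. A sketch using a simulated stand-in for posterior draws (any skewed sample would do):

```python
import random
import statistics

random.seed(1)

# Stand-in for posterior draws (an arbitrary skewed made-up sample)
samples = [random.gauss(10.0, 2.0) ** 2 / 10.0 for _ in range(2000)]

def expected_loss(estimate, loss):
    return statistics.fmean(loss(estimate, s) for s in samples)

squared = lambda a, s: (a - s) ** 2   # minimized by the mean
absolute = lambda a, s: abs(a - s)    # minimized by the median

# Grid search over candidate point estimates, 5.0 to 15.0
candidates = [i / 20.0 for i in range(100, 301)]
best_sq = min(candidates, key=lambda a: expected_loss(a, squared))
best_ab = min(candidates, key=lambda a: expected_loss(a, absolute))

print(best_sq, statistics.fmean(samples))   # close to each other
print(best_ab, statistics.median(samples))  # close to each other
```

Changing the loss function changes the “best” estimate, which is the sense in which the choice is an expression of utility rather than something the posterior dictates on its own.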

I’m ok with saying that “computing the posterior” is not a decision problem, nor is “deciding to use statistics instead of voodoo” or whatever.

But if you have multiple models and create a mixture model and find that the posterior mixture for model A is nearly zero… you could “decide that model was probably false” and drop it from any further analysis… I don’t think that’s too broad to be useful, and I think that’s a real thing we should be doing more often.
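That “decide to drop a model” step can be sketched directly: compute posterior model weights for a small mixture and discard anything whose weight is essentially zero. A toy version with invented data and two point models (all names and values hypothetical):

```python
import math
import random

random.seed(0)

# Invented data, actually generated near zero
data = [random.gauss(0.0, 1.0) for _ in range(50)]

def log_lik(mu):
    """Log-likelihood of the data under a Normal(mu, 1) model."""
    return sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mu) ** 2
               for x in data)

# Two candidate models, mixed with equal prior weight
models = {"A: mu=0": 0.0, "B: mu=3": 3.0}
log_w = {name: math.log(0.5) + log_lik(mu) for name, mu in models.items()}

# Normalize in log space to avoid underflow
m = max(log_w.values())
w = {name: math.exp(lw - m) for name, lw in log_w.items()}
total = sum(w.values())
post = {name: v / total for name, v in w.items()}

# Decision: drop any model whose posterior weight is essentially zero
kept = [name for name, p in post.items() if p > 1e-6]
print(post, kept)
```

With data generated near zero, model B’s posterior weight collapses to essentially nothing, and the “drop it” decision falls out of the inference plus a threshold, which is itself a (small) utility judgement.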

There is really very little reason to do a Bayesian analysis and stop at the step where you have a posterior, and do nothing with it… I mean, usually people don’t do a good job of the next step, but they take some next step. We should be teaching them the decision stuff and not just the inference part.

This seems something like Keynes’s approach—probability not only as “the hypothesis on which it is rational for us to act,” but as a term that encompasses inductive belief under conditions of partial knowledge. To one historian, his “probability was concerned with logical relations between propositions, the typical case being that of an argument in which the premises lend only partial support to the conclusion. Keynes referred to such relations of partial support or entailment between propositions as probability relations… the degrees of rational belief that individuals were warranted in placing in the conclusions of such arguments.”

I like Keynes’s discussion of how one thing can seem more probable than some other, unrelated thing, without our being able to put a meaningful magnitude on either probability. I understand that his attempt to formalize that logic never caught on, though.

I would like to add an emphasis, which sounds like or should sound like a mere truism: the belief in question must be conditioned upon *something* other than the mere grammatical expression of it (as a sentence or proposition). That “degree” must — in other words — be conditioned in some way upon the relationship between [1] the belief and [2] the extent to which the belief is consistent with events or conditions obtaining in the world itself. It is necessary, for a belief to be rational, that it be conditioned upon some set of relevant evidence.

If you assign cost/utility function, then it becomes decision making.

Very few articles in statistics actually do that!

I would love to see some good published examples of this in practice.

One other observation: For IRB approval (for psych-style human experimental work anyway), isn’t it necessary to do some sort of cost-benefit analysis for the risk assessment?

The names of two nice Bayesian books are variations around the uncertainty theme:

Measuring Uncertainty – Schmitt (Addison-Wesley, 1969)

Understanding Uncertainty – Lindley (Wiley, 2006, revised edition 2013)

From the introduction of Measuring Uncertainty – An Elementary Introduction to Bayesian Statistics:

“The task of formal statistics is to set up sound procedures for evaluating certain kinds of information. Its methods apply when we can regard what we see as being random variation superimposed upon a more solid structure. Statistics then tells us how our opinion should change on the basis of data; it tells us how to estimate values of interest; it tells us what uncertainty is attached to these values; it tells us how to improve the efficiency of our investigations; it tells us how our imperfect information should be used to make decisions for action.”

There is also a chapter “Making Decisions” which accounts for 10% of the book.

“When I look at statistics today, I am astonished at the almost complete failure to use utility…. Probability is there but not utility. This failure has to be my major criticism of current statistics; we are abandoning our task half-way, producing the inference but declining to explain to others how to act on that inference. The lack of papers that provide discussions on utility is another omission from our publications.”

Lindley, D.V. (2004). Some reflections on the current state of statistics. In Applied Bayesian Statistical Studies in Biology and Medicine, ed. M. di Bacco, G. d’Amore, and F. Scalfari. Springer.

The point is that it is not just about “reasoning” but should involve operationalisation of findings.

We propose a big picture focused on information quality, not just decisions, not just reasoning.

For an intro to information quality see https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3591808

Information quality is affected by the data (X), the data analysis (f), and the analysis goal (g), as well as by the relationships between them. Utility is measured using specific metric(s) (U). Given that, information quality is U(f(X|g)).

I don’t disagree with this, but I think we must be very careful here. Some of the worst abuses appear to be when people tell others how to operationalize their statistical analysis. Given that all good analyses will be less than conclusive (elucidating the uncertainties), I do think guidance about how to frame those findings in ways that help think about decisions is critical. But this should not mean applying a decision model and then telling people what conclusion they should reach. One concern I have about utility-based decision models is that they often obfuscate the critical decision factors. The mathematics can overwhelm the kind of judgements that must be made. I am not suggesting that decision analysis be abandoned, only that it be done carefully so as to avoid replacing such judgement.

Agreed.

I like Ron’s framing of info quality because it doesn’t pretend that it is agnostic to goals and utilities. People often speak of “information” as if it were something out there in the world, but really “information” is what something means for a specific purpose. You can’t have information without a codebook.

But I also agree with Dale that the collapse of many factors onto a low-dimensional “utility” measure is often where things go awry and where a lot can be obscured. Many forking paths are possible there, particularly given that it can depend on subjective evaluations of the relative importance of different outcomes. It can be hard for an outsider to unpack those steps and develop their own utility measure better suited to their own goals.

Even in basic research, I think the utility-construction stage is one at which researchers begin to fool themselves about the broader implications of their results. For example, a priming manipulation on a sample of 40 undergrads might well tell us about effective advertising, but it probably shouldn’t on its own change our evaluation of nationwide public health programs.

Which statistics textbooks say “decision making under uncertainty”?

Alan:

I looked at a few intro statistics textbooks and they all define statistics in terms of learning from data. For example, Lock et al.: “Statistics is the science of collecting, describing, and analyzing data.” I didn’t actually see any definitions that mentioned decision making. So maybe I’m misremembering.

I did well in statistics because I was already quite good at identifying logical fallacies. But to be honest, I really didn’t understand what it was designed to do, except in the most elementary of cases. And I was puzzled by the use of NHST.

I thought that statistics was the science of using fancy math to draw conclusions from inconclusive experiments.

Okay, I don’t expect anyone here to agree with that, but I think this is a common view held by some non-statisticians.

Well, that is what it is – using fancy math to draw conclusions from inconclusive experiments – but there are good ways and less good ways.

There’s really no alternative, so you need to tell the good from the bad – but also the ugly.

There are no alternatives.

I agree with Roger that many non-statisticians believe that statistics is “the science of using fancy math to draw conclusions from inconclusive experiments.” The implication of this for the statistical community is that we need to work hard at educating non-statisticians about the limitations, misuses, and abuses of statistical inference.

Martha – the statistical community should mostly show that it is able to generate added value, rather than sticking to defensive stuff. The generation of information quality is what I see as the primary goal of statistics, or at least applied statistics.

It’s by no means easy to reason under “certainty” either. E.g., I’m a lousy chess player. I’m safer when uncertainty prevails; then I can [a] toss a coin, or [b] refuse to take any decision at all until there’s more information. However, with respect to these two alternatives: I don’t like [a] because I have no appetite for gambling; and [b] forces me back toward the role of the bad chess player: I don’t play well, and on top of that, I don’t like to lose! This is a psychological catch-22 of the most devious kind. The finishing touch, to make the predicament even sillier, is that in the end psychological counseling proves to be of very little utility with respect to such dilemmas.