George Orwell meets statistical significance: “Politics and the English Language” applied to science

1. Political writing: imprecision as a tool for obscuring the indefensible

In his classic essay, “Politics and the English Language,” the political journalist George Orwell drew a connection between cloudy writing and cloudy content.

The basic idea was: if you don’t know what you’re saying, or if you’re trying to say something you don’t really want to say, then one strategy is to write unclearly. Conversely, consistently cloudy writing can be an indication that the writer ultimately doesn’t want to be understood.

In Orwell’s words:

[The English language] becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts.

He quotes some ugly examples and then continues:

Each of these passages has faults of its own, but, quite apart from avoidable ugliness, two qualities are common to all of them. The first is staleness of imagery; the other is lack of precision. The writer either has a meaning and cannot express it, or he inadvertently says something else, or he is almost indifferent as to whether his words mean anything or not. This mixture of vagueness and sheer incompetence is the most marked characteristic of modern English prose, and especially of any kind of political writing. As soon as certain topics are raised, the concrete melts into the abstract and no one seems able to think of turns of speech that are not hackneyed: prose consists less and less of words chosen for the sake of their meaning, and more and more of phrases tacked together like the sections of a prefabricated hen-house. . . .

As I have tried to show, modern writing at its worst does not consist in picking out words for the sake of their meaning and inventing images in order to make the meaning clearer. It consists in gumming together long strips of words which have already been set in order by someone else, and making the results presentable by sheer humbug. The attraction of this way of writing is that it is easy. . . . By using stale metaphors, similes, and idioms, you save much mental effort, at the cost of leaving your meaning vague, not only for your reader but for yourself. . . .

In our time it is broadly true that political writing is bad writing. Where it is not true, it will generally be found that the writer is some kind of rebel, expressing his private opinions and not a ‘party line’. Orthodoxy, of whatever colour, seems to demand a lifeless, imitative style. . . .

A speaker who uses that kind of phraseology has gone some distance toward turning himself into a machine. The appropriate noises are coming out of his larynx, but his brain is not involved, as it would be if he were choosing his words for himself.

OK, so far, this is all on the level of aesthetics, or the craft of writing, or obstacles to clear communication.

But then Orwell draws the political connection more strongly:

In our time, political speech and writing are largely the defence of the indefensible. Things like the continuance of British rule in India, the Russian purges and deportations, the dropping of the atom bombs on Japan, can indeed be defended, but only by arguments which are too brutal for most people to face, and which do not square with the professed aims of the political parties. Thus political language has to consist largely of euphemism, question-begging and sheer cloudy vagueness. Defenceless villages are bombarded from the air, the inhabitants driven out into the countryside, the cattle machine-gunned, the huts set on fire with incendiary bullets: this is called pacification. Millions of peasants are robbed of their farms and sent trudging along the roads with no more than they can carry: this is called transfer of population or rectification of frontiers. People are imprisoned for years without trial, or shot in the back of the neck or sent to die of scurvy in Arctic lumber camps: this is called elimination of unreliable elements. Such phraseology is needed if one wants to name things without calling up mental pictures of them. . . .

Orwell goes on to argue that clearer writing has the potential to improve political thinking, by making it more difficult to avoid confronting what you’re really trying to say.

2. Scientific writing: imprecision as a tool for drawing indefensible conclusions from data

I was thinking about Orwell after reading this letter (forwarded to me by David Allison) from the journal Pediatric Obesity:

We conducted a three‐arm parallel randomized controlled trial (RCT). Participants were randomly assigned to one of three groups: a control group or one of two behavioural interventions, towards either both parents and children or parents only. Our primary outcome was change in BMI‐SDS after 3 months of intervention and after 24 months of follow‐up. The pair‐wise results (using paired‐samples t‐test as well as in the mixed model regression adjusted for age, gender and baseline BMI‐SDS) showed significant decrease in BMI‐SDS in the parents–child group both after 3 and 24 months, which indicate that this group of children improved their BMI status (were less overweight/obese) and that this intervention was indeed effective.

However, as we wrote in the results and the discussion, the between group differences in the change in BMI‐SDS were not significant, indicating that there was no difference in change in our outcome in either of the interventions. We discussed, in length, the lack of between‐group difference in the discussion section. We assume that the main reason for the non‐significant difference in the change in BMI‐SDS between the intervention groups (parents–child and parents only) as compared to the control group can be explained by the fact that the control group had also a marginal positive effect on BMI‐SDS, as our control group did have follow‐up visits by a paediatric endocrinologist, who discussed with them their weight issue and gave them some advices for controlling their weight. It is important to note that other cohort studies, in which children with over‐weight and obesity were followed, showed increase in BMI‐SDS over time when there was no intervention.

After reading the letter from Dawson et al., we understand that we should have emphasized in the abstract and in the final conclusion the lack of significant between‐group difference in the change in BMI‐SDS. We would add to the abstract results section the following statement: ‘However, the between‐group differences in the change in BMI‐SDS at 3 months (P = .440) and 2 years (P = .208) were not significant’.

In addition, the conclusion in the abstract would have been better stated as ‘However, the between‐group differences in the change in BMI‐SDS at 3 months and 2 years of follow‐up were not significant, which indicate no advantage in family intervention with parents only or parents and child compared with follow‐up visits with a paediatric endocrinologist’.

Trying to follow this gave me a headache! I read this about four times before I gave up. It’s like a tongue-twister.

A key problem, I think, is that the authors are trying with their statistical analysis and writing to do something they can’t do with their science, which is to draw near-certain conclusions from noisy data that can’t support strong conclusions. The writing and the statistics have to be cloudy, because if they were clear, the emptiness of the conclusions would be apparent.

So maybe it’s not a coincidence that scientific reasoning based on sifting through noisy data looking for statistical significance (the garden of forking paths) goes along with impenetrable writing and analysis. (Recall that it’s often impossible to figure out from a published empirical paper what the data are or how exactly they were analyzed.)

Just to be clear: I’m not saying that writers of academic papers in general, or the writers of the above-linked letter in particular, are writing badly in order to confuse people, to hide where the bodies are buried, or whatever. What I think is that certain popular statistical methods, such as selection on statistical significance, are inherently confusing (they typically follow the illogical reasoning of taking the rejection of null hypothesis B as evidence in favor of some favored alternative hypothesis A), even to experts, hence it’s hard to write about them without your prose ending up as a muddle.
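The specific muddle in the letter can be reproduced with a little arithmetic: a within-group change can easily clear the conventional |t| > 2 bar while the between-group difference (the comparison the trial was actually designed to make) does not. Here’s a minimal sketch using made-up numbers chosen only to illustrate the pattern (none are taken from the actual study):

```python
import math

def one_sample_t(mean_change, sd, n):
    """t statistic testing whether a group's mean change differs from zero
    (the paired-samples test on change scores)."""
    return mean_change / (sd / math.sqrt(n))

def two_sample_t(mean1, mean2, sd, n):
    """t statistic comparing two groups' mean changes,
    assuming (for simplicity) equal sd and equal n in both arms."""
    se = sd * math.sqrt(2.0 / n)
    return (mean1 - mean2) / se

# Hypothetical numbers: both arms improve, the intervention arm a bit more.
n, sd = 30, 0.30              # children per arm; sd of change in BMI-SDS
change_intervention = -0.15   # mean change, parents-child arm
change_control = -0.08        # mean change, control arm

t_within = one_sample_t(change_intervention, sd, n)
t_between = two_sample_t(change_intervention, change_control, sd, n)

print(f"within-group t  = {t_within:.2f}")   # about -2.74: "significant"
print(f"between-group t = {t_between:.2f}")  # about -0.90: not significant
```

The within-group test looks impressive partly because the control group also improved; the only comparison protected by the randomization is the between-group one, and there the signal is buried in the noise. Reporting the first result as the headline is exactly the "difference between significant and not significant" trap.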

(The flip side is that, if you can express bad scientific methods using superficially clear prose, you could probably do a lot of damage.)

Anyway, I think we can learn a bit from the unreadability of the above-linked letter, not so much about this particular case because who really cares about it, but regarding the more general Orwellian point.


  1. Brent Hutto says:

    Let’s use plain (non-scientific, non-statistical) language to describe a commonly encountered type of dataset:

    It appears that weight decreased in all three treatment groups.
    It looks like the largest decrease was in the parent-child group.
    This is the pattern we’d expect to see, because all three treatments have shown past weight-loss results.

    That’s the sort of thing a layperson might say upon seeing the group-by-time results on a graph. The tricky part comes when we switch to scientific/statistical thinking about the meaning of “decreased” and “largest”. Depending on what exact tests we planned to do beforehand and which particular “forking paths” we followed in the analysis, we are often painted into a corner where the only clear and rigorous statement possible is something like:

    Differences in weight loss among the treatment groups were not statistically significant.

    Man, one sentence to sum up a study that cost hundreds of thousands of dollars, what a waste. And if you look at the results graphed out, they look almost exactly like the graphs you had in mind before starting the study. But no, there is no strictly stated finding that can be made.

    That’s the circumstance researchers find themselves in every day and that’s when the trouble begins. Because now we have to start obfuscating (in Orwell’s sense) and using weasel words to SAY things that we think are interesting but aren’t allowed to be said in a scientific/statistical manner. Not to even mention the manuscript reviewers who will ask for that sort of weaselly half-claim to be given if the authors try to stick to only bona fide scientific/statistical claims.

  2. “A key problem, I think, is that the authors are trying with their statistical analysis and writing to do something they can’t do with their science, which is to draw near-certain conclusions from noisy data that can’t support strong conclusions.”

    This is an excellent sentence. I see this constantly; perhaps the saddest place is at journal clubs, in which new students assume (reasonably) that the authors’ convoluted arguments imply some deep understanding that the student feels inadequate about not grasping, when in fact there’s no substance to the authors’ claims.

    • Brent Hutto says:

      It can just as often be that the authors are trying their hardest to keep from implying “near-certain” anything while pointing out the extremely uncertain but still evident nature of their data.

      It’s one kind of fallacy to imply that a noisy estimate of an apparently modestly sized effect is “near-certain”. But it’s also a fallacy (to my thinking) to avoid even commenting, in a non-certain manner, on the apparent direction and approximate magnitude of what was observed in this particular data.

      The struggle is to find acceptable, non-Orwellian ways of saying: our estimate is noisy, and we can’t be certain about any conclusions, but nonetheless here’s how our noisy estimates for each group at each time turned out.

    • Martha (Smith) says:

      “This is an excellent sentence.”


  3. jim says:

    Orwell again, sparing no one’s feelings from the truth.
