## What are the most important statistical ideas of the past 50 years?

Aki and I wrote this article, doing our best to present a broad perspective. We argue that the most important statistical ideas of the past half century are: counterfactual causal inference, bootstrapping and simulation-based inference, overparameterized models and regularization, multilevel models, generic computation algorithms, adaptive decision analysis, robust inference, and exploratory data analysis. These eight […]

## Basbøll’s Audenesque paragraph on science writing, followed by a resurrection of a 10-year-old debate on Gladwell

I pointed Thomas Basbøll to my recent post, “Science is science writing; science writing is science,” and he in turn pointed me to his post from a few years ago, “Scientific Writing and ‘Science Writing,’” which stirringly begins: For me, 2015 will be the year that I [Basbøll] finally lost all respect for “science writing”. […]

## Is causality as explicit in fake data simulation as it should be?

Sander Greenland recently published a paper with a very clear and thoughtful exposition on why causality, logic and context need full consideration in any statistical analysis, even strictly descriptive or predictive analysis. For instance, in the concluding section – “Statistical science (as opposed to mathematical statistics) involves far more than data – it requires realistic […]

## Nonparametric Bayes webinar

This post is by Eric. A few months ago we started running monthly webinars focusing on Bayes and uncertainty. Next week, we will be hosting Arman Oganisian, a 5th-year biostatistics PhD candidate at the University of Pennsylvania and Associate Fellow at the Leonard Davis Institute for Health Economics. His research focuses on developing Bayesian nonparametric […]

## “In the world of educational technology, the future actually is what it used to be”

Following up on this post from Audrey Watters, Mark Palko writes: I [Palko] have been arguing for a while that the broad outlines of our concept of the future were mostly established in the late 19th/early 20th Centuries and put in its current form in the Postwar Period. Here are a few more data points […]

## Lying with statistics

As Deb Nolan and I wrote in our book, Teaching Statistics: A Bag of Tricks, the most basic form of lying with statistics is simply to make up a number. We gave the example of Senator McCarthy’s proclaimed (but nonexistent) list of 205 Communists, but we have a more recent example: One of the supposed […]

## My scheduled talks this week

Department of Biostatistics, Harvard University: Today, Tues 10 Nov 2020, 1pm; Department of Marketing, Arison School of Business, Israel: Thurs 12 Nov 2020, 10am (US eastern time); St. Louis Chapter of the American Statistical Association: Thurs 12 Nov 2020, 5pm (US eastern time). The listed topic for the first two events is election forecasting and for […]

## Why is this graph actually ok? It’s the journey, not just the destination.

Josh Miller was in my office and started flipping through Kieran Healy’s book on data visualization, a book that I like a lot. I even use it in my class, replacing Cleveland’s Elements of Graphing Data, which is wonderful, but things have changed in 35 years, so it’s time for a new book. Josh noticed Figure 8.17 […]

## Reverse-engineering the problematic tail behavior of the Fivethirtyeight presidential election forecast

We’ve been writing a bit about some odd tail behavior in the Fivethirtyeight election forecast, for example that it was giving Joe Biden a 3% chance of winning Alabama (which seemed high), it was displaying Trump winning California as in “the range of scenarios our model thinks is possible” (which didn’t seem right), and it […]

## We are stat professors with the American Statistical Association, and we’re thrilled to talk to you about the statistics behind voting. Ask us anything!

It’s happening at 11am today on Reddit. It’s a real privilege to do this with Mary Gray, who was so nice to me back when I took a class at American University several decades ago.

## Misrepresenting data from a published source . . . it happens all the time!

Following up on yesterday’s post on an example of misrepresentation of data from a graph, I wanted to share a much more extreme example that I wrote about a while ago, about some data misrepresentation in an old statistics textbook: About fifteen years ago, when preparing to teach an introductory statistics class, I recalled an enthusiastic […]

## Some wrong lessons people will learn from the president’s illness, hospitalization, and expected recovery

Jonathan Falk writes about the president’s illness: I [Falk] would think it provides a focused opportunity to make a few salient statistical education points. First, a 6 percent mortality rate (among old people with comorbidities) is really bad, but any single selected person is really quite unlikely to die, or even be really sick. Same […]

## It’s kinda like phrenology but worse. Not so good for the “Nature” brand name, huh? Measurement, baby, measurement.

Federico Mattiello writes: I thought you might find this thread interesting, it’s about a machine learning paper building a “trustworthiness score” from faces databases and historical (mainly British) portraits. It checks many bias boxes I believe, but my biggest complaint (I know it shouldn’t be) is the linear regression of basically spherical clouds of points: […]

## A question of experimental design (more precisely, design of data collection)

An economist colleague writes in with a question: What is your instinct on the following. Consider at each time t, 1999 through 2019, there is a probability P_t for some event (e.g., it rains on a given day that year). Assume that P_t = P_1999 + (t-1999)A. So P_t has a linear time trend. What […]
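The setup in the colleague's question can be sketched as a small fake-data simulation. This is only an illustration of the stated model, P_t = P_1999 + (t-1999)A for t from 1999 through 2019; the starting probability 0.20, the slope 0.005, the 365 observations per year, and the function names are all assumptions made for the example and do not come from the question.

```python
import random

FIRST_YEAR, LAST_YEAR = 1999, 2019


def linear_trend_probs(p_1999, a):
    """Return {t: P_t} with P_t = P_1999 + (t - 1999) * A for each year t."""
    probs = {t: p_1999 + (t - FIRST_YEAR) * a
             for t in range(FIRST_YEAR, LAST_YEAR + 1)}
    # A linear trend can wander outside [0, 1]; reject such inputs.
    if any(not 0.0 <= p <= 1.0 for p in probs.values()):
        raise ValueError("trend leaves the [0, 1] range; choose a smaller |A|")
    return probs


def simulate_event_counts(probs, n_days=365, seed=0):
    """Simulate, for each year, how many of n_days independent days see the event."""
    rng = random.Random(seed)
    return {t: sum(rng.random() < p for _ in range(n_days))
            for t, p in probs.items()}


# Hypothetical values chosen only for illustration.
probs = linear_trend_probs(p_1999=0.20, a=0.005)
counts = simulate_event_counts(probs)

for t in (1999, 2009, 2019):
    print(t, round(probs[t], 3), counts[t])
```

Simulating from the assumed trend like this makes it possible to try out different data-collection plans (which years to sample, how many days per year) before committing to one.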

## The challenge of fitting “good advice” into a coherent course on statistics

From an article I published in 2008: Let’s also not forget the benefit of the occasional dumb but fun example. For example, I came across the following passage in a New York Times article: “By the early 2000s, Whitestone was again filling up with young families eager to make homes for themselves on its quiet, […]

## Everything that can be said can be said clearly.

The title, as many may know, is a quote from Wittgenstein. It is one that has haunted me for many years. As a first-year undergrad, I had mistakenly enrolled in a second-year course that was almost entirely based on Wittgenstein’s Tractatus. Alarmingly, the drop date had passed before I grasped I was supposed […]

## Why we kept the trig in golf: Mathematical simplicity is not always the same as conceptual simplicity

Someone read the golf example and asked: You define the threshold angle as arcsin((R – r)/x), but shouldn’t it be arctan((R – r)/x) instead? Is it just that it does not matter with these small angles, where sine and tangent are about the same, or am I missing something? My reply: This sin vs tan […]
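The reader's point that sine and tangent nearly coincide at small angles can be checked numerically. The sketch below compares arcsin((R - r)/x) with arctan((R - r)/x) at a few distances x. The radii (a 4.25-inch hole and a 1.68-inch ball, converted to feet) and the distances are assumptions chosen for illustration, not values taken from the excerpt.

```python
import math

# Assumed dimensions, in feet: hole diameter 4.25 in, ball diameter 1.68 in.
R = (4.25 / 2) / 12  # hole radius
r = (1.68 / 2) / 12  # ball radius


def threshold_angles(x):
    """Return (arcsin, arctan) of (R - r) / x, in radians.

    Requires x > R - r so that the arcsin argument stays below 1.
    """
    u = (R - r) / x
    return math.asin(u), math.atan(u)


# Hypothetical putt distances in feet.
for x in (2, 5, 10, 20):
    a_sin, a_tan = threshold_angles(x)
    rel_diff = (a_sin - a_tan) / a_sin
    print(f"x={x:>2} ft  arcsin={a_sin:.6f}  arctan={a_tan:.6f}  "
          f"relative difference={rel_diff:.2e}")
```

For small u, arcsin(u) is approximately u + u^3/6 and arctan(u) is approximately u - u^3/3, so the gap between the two shrinks like u^3/2 as the distance grows.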

## An example of a parallel dot plot: a great way to display many properties of a list of items

I often see articles that are full of long tables of numbers and it’s hard to see what’s going on, so then I’ll suggest parallel dot plots. But people don’t always know what I’m talking about, so here I’m sharing an example. Next time when I suggest a parallel dot plot, I can point people […]

## “I just wanted to say that for the first time in three (4!?) years of efforts, I have a way to estimate my model. . . .”

After attending a Stan workshop given by Charles Margossian at McGill University, Chris Barrington-Leigh wrote: I just wanted to say that for the first time in three (4!?) years of efforts, I have a way to estimate my model. Your workshop helped me and pushed me to be persistent enough to code up my model. […]