Jessica Hullman, Christopher Wlezien, Elliott Morris, and I write:

Presidential elections can be forecast using information from political and economic conditions, polls, and a statistical model of changes in public opinion over time. However, these “knowns” about how to make a good presidential election forecast come with many unknowns due to the challenges of evaluating forecast calibration and communication. We highlight how incentives may shape forecasts, and particularly forecast uncertainty, in light of calibration challenges. We illustrate these challenges in creating, communicating, and evaluating election predictions, using the Economist and Fivethirtyeight forecasts of the 2020 election as examples, and we offer recommendations for forecasters and scholars.

Here are the contents of the article:

1 What we know about forecasting presidential elections

1.1 Political and economic fundamentals

1.2 Pre-election surveys and poll aggregation

1.3 State and national predictions

1.4 Replacement candidates, vote-counting disputes, and other possibilities not included in the forecasting model

1.5 Putting together an electoral college forecast

1.6 Martingale property

2 Why evaluating presidential election forecasts is difficult

2.1 The difficulty of calibration

2.2 Win probabilities

2.3 Using anomalous predictions to improve a model

2.4 Visualizing uncertainty

2.5 Other ways to communicate uncertainty

2.6 Prediction markets

3 The role of incentives

3.1 Incentives for overconfidence

3.2 Incentives for underconfidence

3.3 Incentives in competing forecasts

3.4 Novelty and stability

4 Discussion

It’s published in the journal Judgment and Decision Making; here’s the whole issue.

The most political sciencey part of the article is section 1, where we discuss the data and background knowledge that go into a national election forecast, and how we put that information together.

From a psychology point of view, the most interesting aspect of the article is section 3, where we talk about incentives for over- and under-confidence. This is relevant to our discussion from a few months ago on some of the outlandish predictions of the Fivethirtyeight election forecast.

And the part that is most interesting from the perspective of statistical workflow is section 2, especially section 2.4 on graphical communication of uncertainty and section 2.3 where we argue that the ability of forecasts to make implausible predictions (again, see the above link for a couple of examples) is a feature, not a bug, in that it gives forecasters a chance to find flaws and go back and fix their methods.

We posted an earlier version of our article a few weeks ago; this is a new and improved version.

I don’t fully understand the Martingale property as it relates to election forecasts. Maybe I am not fully understanding the Martingale property itself. Is it not reasonable to have a forecast where Biden has probability X of his win probability improving to A and probability Y of it decreasing to B, such that the current win probability Z = XA + YB is the best estimate, but we also expect (X > 0.5) the forecast to move in Biden’s favor?
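The martingale property only requires that the expected value of tomorrow’s forecast equals today’s; it does not rule out a forecast that is more likely to move up than down. A minimal sketch with made-up numbers (not from either model):

```python
# Martingale check with invented numbers: a frequent small move up
# balanced by a rare large move down.
Z = 0.6            # current win probability
X, A = 0.8, 0.65   # probability 0.8 of moving up to 0.65
Y, B = 0.2, 0.40   # probability 0.2 of moving down to 0.40

expected_next = X * A + Y * B          # 0.8*0.65 + 0.2*0.40 = 0.6
assert abs(expected_next - Z) < 1e-12  # martingale: E[next] = current
assert X > 0.5                         # yet an upward move is more likely
```

So the answer to the question is yes: X > 0.5 is compatible with the martingale property, as long as the rarer downward moves are correspondingly larger.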

Nice read!

This part clarifies something I recall some folks here wondering about:

“The forecasts from Fivethirtyeight and the Economist are not fully Bayesian — the Fivethirtyeight procedure is not Bayesian at all, and the Economist forecast does not include a generative model for time changes in the predictors of the fundamentals model — that is, the prediction at time C is based on the fundamentals at time C, not on the forecasts of the values these predictors will be at election day — and thus we would not expect these predictions to satisfy the martingale property. This represents a flaw of these prediction forecasting procedures…”

Yes, it’s nice to see that in there. I think that in addition to the fundamentals values themselves not being projected forward in time, the mapping between fundamentals and outcomes doesn’t vary in time either. In the current election it might not matter so much what the fundamentals are: for example, if there’s a big temporary economic recovery, people might not count that in favor of the incumbent, particularly if they feel the recession was in large part caused by the incumbent, or that the recovery has more to do with the pandemic situation, which the incumbent was worsening all along, etc.

I think the highly transient nature of the underlying fundamentals leads to different assessments than if they’d been more of a typical recession.

Quoting from the paper:

“In contrast, the California conditional prediction made by Fivethirtyeight seems too pessimistic on Trump’s chances: if the president really were to win that state, this would almost certainly happen in a Republican landslide (only Hawaii and Washington D.C. lean more toward the Democrats), in which case it’s hard to imagine him losing in the country as a whole. Both the extremely wide Florida interval and the inappropriately equivocal prediction conditional on a Trump victory in California that we observe seem to reveal that the Fivethirtyeight forecast has a too-low correlation among state-level uncertainties. Their joint prediction doesn’t appear to account for the fact that either event — Biden receiving only 42% in Florida or Trump winning California — would in all probability represent a huge national swing.”
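The correlation argument in that passage can be illustrated with a toy simulation (entirely made-up numbers, not either forecast’s actual model): when state results share a common national swing, an extreme outcome in one state implies a big shift everywhere.

```python
# Toy model: each vote share = baseline + shared national swing + state noise.
# All baselines and standard deviations here are invented for illustration.
import random

random.seed(0)
n = 200_000
wins_ca = 0    # simulations where Trump carries California
wins_both = 0  # ...and also wins the national popular vote
for _ in range(n):
    national = random.gauss(0, 0.05)               # shared national swing
    ca = 0.65 + national + random.gauss(0, 0.02)   # Biden's CA vote share
    us = 0.54 + national + random.gauss(0, 0.01)   # Biden's national share
    if ca < 0.5:          # Trump carries California...
        wins_ca += 1
        if us < 0.5:      # ...and wins nationally too
            wins_both += 1

# Conditional on Trump carrying CA, he almost always wins nationally,
# because carrying CA requires a huge shared swing.
p = wins_both / wins_ca if wins_ca else float("nan")
```

With a large enough shared-swing component, the conditional probability p comes out close to 1, which is the paper’s point: a Trump win in California would almost certainly come with a national landslide.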

I’ve been following your statements about your model and Fivethirtyeight off and on for the last few weeks, so I’ve seen this particular claim before, and it seems like a fair criticism to me (though on a gut instinct level, not something I can back up with statistics). But on the other hand, there are some outcomes of your model that strike me almost as perverse (on the same gut level).

For example, your latest simulation contains 13,789 outcomes where Biden wins Ohio. In *all* of those cases, he also wins the electoral college. This is pretty astonishing, I would think. Is it really true that, given Biden winning Ohio, Trump has basically no chance (<0.01%) of winning Pennsylvania and Wisconsin? (This never happened in 40,000 simulations.) Granted, these states are highly correlated, but my gut says that your model has them too highly correlated. In the latest Fivethirtyeight model, Biden wins Ohio just over half the time (he's now favored to win the state, a pretty striking difference from your model), yet he goes on to lose the election in 64 simulations. So on their model it's extremely unlikely, but not unthinkable, that Biden wins Ohio but loses the election.

Assuming effective odds of 1/80000 on your model for this outcome, this means Fivethirtyeight have the scenario as 128x as likely as you do.
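The arithmetic checks out, taking the 1-in-80,000 effective odds as a stated assumption (the scenario never occurred in 40,000 simulations, so it is treated as roughly half a count):

```python
# Commenter's arithmetic; the 1-in-80,000 figure is an assumption, not model output.
fte = 64 / 40_000      # Fivethirtyeight: 64 of 40,000 simulations
econ = 1 / 80_000      # Economist: 0 of 40,000, treated as ~1 in 80,000
ratio = fte / econ     # how many times likelier under Fivethirtyeight
assert abs(ratio - 128) < 1e-9
```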

Again this is just a gut response, not a real criticism, but I'm curious what you think about this.

Adam:

Interesting point. You could well be right here. We’ll look into this.

As someone who grew up in Michigan, I know a thing or two about Ohio ;) You make a good point that such a low probability is likely overconfident. But if you account for the demographics of Ohio and *who* would have to be swinging Democratic in a big way that hasn’t happened much in recent decades, it would be very very difficult to imagine a Biden win in Ohio and a loss in any part of the Midwestern Blue Firewall.

>>it would be very very difficult to imagine a Biden win in Ohio and a loss in any part of the Midwestern Blue Firewall.

Unless the loss was due to something specific to 2020 like tons of first time mail voters (combined with mail-in ballot procedures being different between states) – eg PA ends up with a much higher rate of flawed/rejected ballots than other demographically-similar states.

Although if Biden wins OH but not PA he still wins the election. The states Clinton won + WI + MI + OH come to 276 electoral votes.

So yeah I think Biden winning OH but not the election is very implausible, but I’m not sure I’d say the same thing about winning OH but not *any one* of the traditionally-blue Midwest states.

Well, put it this way. If Biden wins Ohio, he wins MI in a near landslide, so we are down to WI and PA here. I think the factors you mention are critical strategically – and I’m hoping that the DNC/Biden campaign are more on top of this kind of thing than they consistently appear to be – but I doubt that these are within the “small world” of the models at hand. IOW, I’m guessing the model predictions are conditional on the actual mechanics of the election being reasonably “normal”.

You’re probably right that that sort of thing isn’t modeled. I just think that there’s enough oddity about this year that at least one state doing something bizarre (compared to other states that should be well-correlated) isn’t that implausible.

True – but if you give Trump Nevada, New Hampshire, and Pennsylvania, that also gets him over the 270 hump. So if you think Trump winning PA but losing OH is possible, jumping from there to him winning the election isn’t an impossible stretch. That seems less likely to me than him winning PA + WI, but the point is, there are multiple pathways.